How do I concatenate two lists in Python?
Example:
listone = [1, 2, 3]
listtwo = [4, 5, 6]
Expected outcome:
>>> joinedlist
[1, 2, 3, 4, 5, 6]
How can I merge two Python dictionaries in a single expression?
For dictionaries x and y, z becomes a shallowly-merged dictionary with values from y replacing those from x.
In Python 3.9.0 or greater (released 17 October 2020), PEP 584 was implemented and provides the simplest method:
z = x | y # NOTE: 3.9+ ONLY
In Python 3.5 or greater:
z = {**x, **y}
In Python 2 (or 3.4 or lower), write a function:
def merge_two_dicts(x, y):
    z = x.copy()   # start with keys and values of x
    z.update(y)    # modifies z with keys and values of y
    return z
and now:
z = merge_two_dicts(x, y)
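As a quick interactive sketch of the 3.9+ operator (not part of the original examples): the in-place variant |= also exists and behaves like update; values from the right-hand operand win.
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> x | y
{'a': 1, 'b': 3, 'c': 4}
>>> x |= y   # in-place union, equivalent to x.update(y)
>>> x
{'a': 1, 'b': 3, 'c': 4}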
Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
The desired result is to get a new dictionary (z) with the values merged, and the second dictionary's values overwriting those from the first.
>>> z
{'a': 1, 'b': 3, 'c': 4}
A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is
z = {**x, **y}
And it is indeed a single expression.
Note that we can merge in with literal notation as well:
z = {**x, 'foo': 1, 'bar': 2, **y}
and now:
>>> z
{'a': 1, 'b': 3, 'foo': 1, 'bar': 2, 'c': 4}
It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What's New in Python 3.5 document.
However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:
z = x.copy()
z.update(y) # which returns None since it mutates z
In both approaches, y will come second and its values will replace x's values, thus b will point to 3 in our final result.
If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant approach that is also correct is to put it in a function:
def merge_two_dicts(x, y):
    """Given two dictionaries, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z
and then you have a single expression:
z = merge_two_dicts(x, y)
You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:
def merge_dicts(*dict_args):
    """
    Given any number of dictionaries, shallow copy and merge into a new dict,
    precedence goes to key-value pairs in latter dictionaries.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result
This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a to g:
z = merge_dicts(a, b, c, d, e, f, g)
and key-value pairs in g will take precedence over dictionaries a to f, and so on.
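For a concrete check (small dictionaries invented here purely for illustration), later arguments do indeed win:
>>> a = {'k': 1}
>>> b = {'k': 2, 'm': 3}
>>> c = {'m': 4}
>>> merge_dicts(a, b, c)
{'k': 2, 'm': 4}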
Don't use what you see in the formerly accepted answer:
z = dict(x.items() + y.items())
In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you're adding two dict_items objects together, not two lists -
>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'
and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power.
Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don't do this:
>>> c = dict(a.items() | b.items())
This example demonstrates what happens when values are unhashable:
>>> x = {'a': []}
>>> y = {'b': []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
Here's an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets:
>>> x = {'a': 2}
>>> y = {'a': 1}
>>> dict(x.items() | y.items())
{'a': 2}
Another hack you should not use:
z = dict(x, **y)
This uses the dict constructor and is very fast and memory-efficient (even slightly more so than our two-step process), but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it's difficult to read, it's not the intended usage, and so it is not Pythonic.
Here's an example of the usage being remediated in Django.
Dictionaries are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings.
>>> c = dict(a, **b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings
From the mailing list, Guido van Rossum, the creator of the language, wrote:
I am fine with declaring dict({}, **{1:3}) illegal, since after all it is abuse of the ** mechanism.
and
Apparently dict(x, **y) is going around as "cool hack" for "call x.update(y) and return x". Personally, I find it more despicable than cool.
It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dictionaries for readability purposes, e.g.:
dict(a=1, b=10, c=11)
instead of
{'a': 1, 'b': 10, 'c': 11}
Despite what Guido says, dict(x, **y) is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords.
Again, it doesn't work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict broke this consistency in Python 2:
>>> foo(**{('a', 'b'): None})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{('a', 'b'): None})
{('a', 'b'): None}
This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.
I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.
More comments:
dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts.
My response: merge_two_dicts(x, y) actually seems much clearer to me, if we're actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.
{**x, **y} does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.
Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first's values being overwritten by the second's - in a single expression.
Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:
from copy import deepcopy

def dict_of_dicts_merge(x, y):
    z = {}
    overlapping_keys = x.keys() & y.keys()
    for key in overlapping_keys:
        z[key] = dict_of_dicts_merge(x[key], y[key])
    for key in x.keys() - overlapping_keys:
        z[key] = deepcopy(x[key])
    for key in y.keys() - overlapping_keys:
        z[key] = deepcopy(y[key])
    return z
Usage:
>>> x = {'a':{1:{}}, 'b': {2:{}}}
>>> y = {'b':{10:{}}, 'c': {11:{}}}
>>> dict_of_dicts_merge(x, y)
{'b': {2: {}, 10: {}}, 'a': {1: {}}, 'c': {11: {}}}
Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".
These approaches are less performant, but they will provide correct behavior. They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence).
You can also chain the dictionaries manually inside a dict comprehension:
{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7
or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):
dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2
itertools.chain will chain the iterators over the key-value pairs in the correct order:
from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2
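A quick interactive check (reusing the example dictionaries) that pairs from y still override those from x:
>>> from itertools import chain
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> dict(chain(x.items(), y.items()))
{'a': 1, 'b': 3, 'c': 4}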
I'm only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)
from timeit import repeat
from itertools import chain
x = dict.fromkeys('abcdefg')
y = dict.fromkeys('efghijk')
def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z
min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
In Python 3.8.1, NixOS:
>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux
Before understanding metaclasses, you need to master classes in Python. And Python has a very peculiar idea of what classes are, borrowed from the Smalltalk language.
In most languages, classes are just pieces of code that describe how to produce an object. That's kinda true in Python too:
>>> class ObjectCreator(object):
...     pass
...
>>> my_object = ObjectCreator()
>>> print(my_object)
<__main__.ObjectCreator object at 0x8974f2c>
But classes are more than that in Python. Classes are objects too.
Yes, objects.
As soon as you use the keyword class, Python executes it and creates an object. The instruction
>>> class ObjectCreator(object):
...     pass
...
creates in memory an object with the name ObjectCreator.
This object (the class) is itself capable of creating objects (the instances), and this is why it's a class.
But still, it's an object, and therefore:
- you can assign it to a variable
- you can copy it
- you can add attributes to it
- you can pass it as a function parameter
e.g.:
>>> print(ObjectCreator) # you can print a class because it's an object
<class '__main__.ObjectCreator'>
>>> def echo(o):
...     print(o)
...
>>> echo(ObjectCreator) # you can pass a class as a parameter
<class '__main__.ObjectCreator'>
>>> print(hasattr(ObjectCreator, 'new_attribute'))
False
>>> ObjectCreator.new_attribute = 'foo' # you can add attributes to a class
>>> print(hasattr(ObjectCreator, 'new_attribute'))
True
>>> print(ObjectCreator.new_attribute)
foo
>>> ObjectCreatorMirror = ObjectCreator # you can assign a class to a variable
>>> print(ObjectCreatorMirror.new_attribute)
foo
>>> print(ObjectCreatorMirror())
<__main__.ObjectCreator object at 0x8997b4c>
Since classes are objects, you can create them on the fly, like any object.
First, you can create a class in a function using the class keyword:
>>> def choose_class(name):
...     if name == 'foo':
...         class Foo(object):
...             pass
...         return Foo     # return the class, not an instance
...     else:
...         class Bar(object):
...             pass
...         return Bar
...
>>> MyClass = choose_class('foo')
>>> print(MyClass) # the function returns a class, not an instance
<class '__main__.Foo'>
>>> print(MyClass()) # you can create an object from this class
<__main__.Foo object at 0x89c6d4c>
But it's not so dynamic, since you still have to write the whole class yourself.
Since classes are objects, they must be generated by something.
When you use the class keyword, Python creates this object automatically. But as with most things in Python, it gives you a way to do it manually.
Remember the function type? The good old function that lets you know what type an object is:
>>> print(type(1))
<type 'int'>
>>> print(type("1"))
<type 'str'>
>>> print(type(ObjectCreator))
<type 'type'>
>>> print(type(ObjectCreator()))
<class '__main__.ObjectCreator'>
Well, type has a completely different ability, it can also create classes on the fly. type can take the description of a class as parameters, and return a class.
(I know, it's silly that the same function can have two completely different uses according to the parameters you pass to it. It's an issue due to backward compatibility in Python)
type works this way:
type(name, bases, attrs)
Where:
- name: name of the class
- bases: tuple of the parent class (for inheritance, can be empty)
- attrs: dictionary containing attribute names and values
e.g.:
>>> class MyShinyClass(object):
...     pass
can be created manually this way:
>>> MyShinyClass = type('MyShinyClass', (), {}) # returns a class object
>>> print(MyShinyClass)
<class '__main__.MyShinyClass'>
>>> print(MyShinyClass()) # create an instance with the class
<__main__.MyShinyClass object at 0x8997cec>
You'll notice that we use MyShinyClass as the name of the class and as the variable to hold the class reference. They can be different, but there is no reason to complicate things.
type accepts a dictionary to define the attributes of the class. So:
>>> class Foo(object):
...     bar = True
Can be translated to:
>>> Foo = type('Foo', (), {'bar':True})
And used as a normal class:
>>> print(Foo)
<class '__main__.Foo'>
>>> print(Foo.bar)
True
>>> f = Foo()
>>> print(f)
<__main__.Foo object at 0x8a9b84c>
>>> print(f.bar)
True
And of course, you can inherit from it, so:
>>> class FooChild(Foo):
...     pass
would be:
>>> FooChild = type('FooChild', (Foo,), {})
>>> print(FooChild)
<class '__main__.FooChild'>
>>> print(FooChild.bar) # bar is inherited from Foo
True
Eventually, you'll want to add methods to your class. Just define a function with the proper signature and assign it as an attribute.
>>> def echo_bar(self):
...     print(self.bar)
...
>>> FooChild = type('FooChild', (Foo,), {'echo_bar': echo_bar})
>>> hasattr(Foo, 'echo_bar')
False
>>> hasattr(FooChild, 'echo_bar')
True
>>> my_foo = FooChild()
>>> my_foo.echo_bar()
True
And you can add even more methods after you dynamically create the class, just like adding methods to a normally created class object.
>>> def echo_bar_more(self):
...     print('yet another method')
...
>>> FooChild.echo_bar_more = echo_bar_more
>>> hasattr(FooChild, 'echo_bar_more')
True
You see where we are going: in Python, classes are objects, and you can create a class on the fly, dynamically.
This is what Python does when you use the keyword class, and it does so by using a metaclass.
Metaclasses are the 'stuff' that creates classes.
You define classes in order to create objects, right?
But we learned that Python classes are objects.
Well, metaclasses are what create these objects. They are the classes' classes, you can picture them this way:
MyClass = MetaClass()
my_object = MyClass()
You've seen that type lets you do something like this:
MyClass = type('MyClass', (), {})
It's because the function type is in fact a metaclass. type is the metaclass Python uses to create all classes behind the scenes.
Now you wonder "why the heck is it written in lowercase, and not Type?"
Well, I guess it's a matter of consistency with str, the class that creates string objects, and int, the class that creates integer objects. type is just the class that creates class objects.
You see that by checking the __class__ attribute.
Everything, and I mean everything, is an object in Python. That includes integers, strings, functions and classes. All of them are objects. And all of them have been created from a class:
>>> age = 35
>>> age.__class__
<type 'int'>
>>> name = 'bob'
>>> name.__class__
<type 'str'>
>>> def foo(): pass
>>> foo.__class__
<type 'function'>
>>> class Bar(object): pass
>>> b = Bar()
>>> b.__class__
<class '__main__.Bar'>
Now, what is the __class__ of any __class__?
>>> age.__class__.__class__
<type 'type'>
>>> name.__class__.__class__
<type 'type'>
>>> foo.__class__.__class__
<type 'type'>
>>> b.__class__.__class__
<type 'type'>
So, a metaclass is just the stuff that creates class objects.
You can call it a 'class factory' if you wish.
type is the built-in metaclass Python uses, but of course, you can create your own metaclass.
The __metaclass__ attribute
In Python 2, you can add a __metaclass__ attribute when you write a class (see next section for the Python 3 syntax):
class Foo(object):
    __metaclass__ = something...
    [...]
If you do so, Python will use the metaclass to create the class Foo.
Careful, it's tricky.
You write class Foo(object) first, but the class object Foo is not created in memory yet.
Python will look for __metaclass__ in the class definition. If it finds it, it will use it to create the object class Foo. If it doesn't, it will use type to create the class.
Read that several times.
When you do:
class Foo(Bar):
pass
Python does the following:
Is there a __metaclass__ attribute in Foo?
If yes, create in-memory a class object (I said a class object, stay with me here), with the name Foo, by using what is in __metaclass__.
If Python can't find __metaclass__, it will look for a __metaclass__ at the MODULE level, and try to do the same (but only for classes that don't inherit anything, basically old-style classes).
Then if it can't find any __metaclass__ at all, it will use Bar's (the first parent) own metaclass (which might be the default type) to create the class object.
Be careful here that the __metaclass__ attribute will not be inherited, the metaclass of the parent (Bar.__class__) will be. If Bar used a __metaclass__ attribute that created Bar with type() (and not type.__new__()), the subclasses will not inherit that behavior.
Now the big question is, what can you put in __metaclass__?
The answer is something that can create a class.
And what can create a class? type, or anything that subclasses or uses it.
The syntax to set the metaclass has been changed in Python 3:
class Foo(object, metaclass=something):
    ...
i.e. the __metaclass__ attribute is no longer used, in favor of a keyword argument in the list of base classes.
The behavior of metaclasses however stays largely the same.
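As a minimal runnable sketch of the Python 3 form (the names Meta and Foo are made up here for illustration):
class Meta(type):
    pass

class Foo(metaclass=Meta):
    pass

print(type(Foo))   # <class '__main__.Meta'>, so Meta created the class Foo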
One thing added to metaclasses in Python 3 is that you can also pass attributes as keyword-arguments into a metaclass, like so:
class Foo(object, metaclass=something, kwarg1=value1, kwarg2=value2):
    ...
Read the section below for how Python handles this.
The main purpose of a metaclass is to change the class automatically, when it's created.
You usually do this for APIs, where you want to create classes matching the current context.
Imagine a stupid example, where you decide that all classes in your module should have their attributes written in uppercase. There are several ways to do this, but one way is to set __metaclass__ at the module level.
This way, all classes of this module will be created using this metaclass, and we just have to tell the metaclass to turn all attributes to uppercase.
Luckily, __metaclass__ can actually be any callable, it doesn't need to be a formal class (I know, something with 'class' in its name doesn't need to be a class, go figure... but it's helpful).
So we will start with a simple example, by using a function.
# the metaclass will automatically get passed the same argument
# that you usually pass to `type`
def upper_attr(future_class_name, future_class_parents, future_class_attrs):
    """
    Return a class object, with the list of its attribute turned
    into uppercase.
    """
    # pick up any attribute that doesn't start with '__' and uppercase it
    uppercase_attrs = {
        attr if attr.startswith("__") else attr.upper(): v
        for attr, v in future_class_attrs.items()
    }
    # let `type` do the class creation
    return type(future_class_name, future_class_parents, uppercase_attrs)

__metaclass__ = upper_attr  # this will affect all classes in the module

class Foo():  # global __metaclass__ won't work with "object" though
    # but we can define __metaclass__ here instead to affect only this class
    # and this will work with "object" children
    bar = 'bip'
Let's check:
>>> hasattr(Foo, 'bar')
False
>>> hasattr(Foo, 'BAR')
True
>>> Foo.BAR
'bip'
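Since module-level __metaclass__ only exists in Python 2, the Python 3 equivalent of this function-based approach would presumably use the metaclass keyword instead (a sketch reusing the upper_attr function above):
class Foo(metaclass=upper_attr):
    bar = 'bip'

print(hasattr(Foo, 'bar'))   # False
print(Foo.BAR)               # bip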
Now, let's do exactly the same, but using a real class for a metaclass:
# remember that `type` is actually a class like `str` and `int`
# so you can inherit from it
class UpperAttrMetaclass(type):
    # __new__ is the method called before __init__
    # it's the method that creates the object and returns it
    # while __init__ just initializes the object passed as parameter
    # you rarely use __new__, except when you want to control how the object
    # is created.
    # here the created object is the class, and we want to customize it
    # so we override __new__
    # you can do some stuff in __init__ too if you wish
    # some advanced use involves overriding __call__ as well, but we won't
    # see this
    def __new__(upperattr_metaclass, future_class_name,
                future_class_parents, future_class_attrs):
        uppercase_attrs = {
            attr if attr.startswith("__") else attr.upper(): v
            for attr, v in future_class_attrs.items()
        }
        return type(future_class_name, future_class_parents, uppercase_attrs)
Let's rewrite the above, but with shorter and more realistic variable names now that we know what they mean:
class UpperAttrMetaclass(type):
    def __new__(cls, clsname, bases, attrs):
        uppercase_attrs = {
            attr if attr.startswith("__") else attr.upper(): v
            for attr, v in attrs.items()
        }
        return type(clsname, bases, uppercase_attrs)
You may have noticed the extra argument cls. There is nothing special about it: __new__ always receives the class it's defined in, as the first parameter. Just like you have self for ordinary methods which receive the instance as the first parameter, or the defining class for class methods.
But this is not proper OOP. We are calling type directly and we aren't overriding or calling the parent's __new__. Let's do that instead:
class UpperAttrMetaclass(type):
    def __new__(cls, clsname, bases, attrs):
        uppercase_attrs = {
            attr if attr.startswith("__") else attr.upper(): v
            for attr, v in attrs.items()
        }
        return type.__new__(cls, clsname, bases, uppercase_attrs)
We can make it even cleaner by using super, which will ease inheritance (because yes, you can have metaclasses, inheriting from metaclasses, inheriting from type):
class UpperAttrMetaclass(type):
    def __new__(cls, clsname, bases, attrs):
        uppercase_attrs = {
            attr if attr.startswith("__") else attr.upper(): v
            for attr, v in attrs.items()
        }
        return super(UpperAttrMetaclass, cls).__new__(
            cls, clsname, bases, uppercase_attrs)
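To tie it together, here is a usage sketch in Python 3 (the class name Trick is invented for illustration). Because __new__ is now called through super with cls, the resulting class is an instance of UpperAttrMetaclass, which is not the case for the earlier variant that calls type(...) directly:
class Trick(metaclass=UpperAttrMetaclass):
    bar = 'bip'

print(hasattr(Trick, 'bar'))   # False
print(Trick.BAR)               # bip
print(type(Trick))             # <class '__main__.UpperAttrMetaclass'>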
Oh, and in Python 3 if you do this call with keyword arguments, like this:
class Foo(object, metaclass=MyMetaclass, kwarg1=value1):
    ...
It translates to this in the metaclass to use it:
class MyMetaclass(type):
    def __new__(cls, clsname, bases, dct, kwarg1=default):
        ...
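As a hedged end-to-end sketch (the names MyMetaclass, Foo and the greeting keyword are invented for illustration): a common pattern is to accept the extra keyword in both __new__ and __init__, since the class-creation machinery passes it to both calls.
class MyMetaclass(type):
    def __new__(cls, clsname, bases, dct, greeting="hello"):
        new_class = super().__new__(cls, clsname, bases, dct)
        new_class.greeting = greeting   # stash the keyword argument on the class
        return new_class

    def __init__(cls, clsname, bases, dct, greeting="hello"):
        super().__init__(clsname, bases, dct)

class Foo(metaclass=MyMetaclass, greeting="bonjour"):
    pass

print(Foo.greeting)   # bonjour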
That's it. There is really nothing more about metaclasses.
The reason behind the complexity of the code using metaclasses is not because of metaclasses, it's because you usually use metaclasses to do twisted stuff relying on introspection, manipulating inheritance, vars such as __dict__, etc.
Indeed, metaclasses are especially useful to do black magic, and therefore complicated stuff. But by themselves, they are simple:
- intercept a class creation
- modify the class
- return the modified class
Since __metaclass__ can accept any callable, why would you use a class, since it's obviously more complicated?
There are several reasons to do so:
- The intention is clear: when you read UpperAttrMetaclass(type), you know what's going to follow.
- You can hook on __new__, __init__ and __call__, which will allow you to do different stuff. Even if usually you can do it all in __new__, some people are just more comfortable using __init__.
Now the big question. Why would you use some obscure error-prone feature?
Well, usually you don't:
Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don't (the people who actually need them know with certainty that they need them, and don't need an explanation about why).
Python Guru Tim Peters
The main use case for a metaclass is creating an API. A typical example of this is the Django ORM. It allows you to define something like this:
class Person(models.Model):
    name = models.CharField(max_length=30)
    age = models.IntegerField()
But if you do this:
person = Person(name='bob', age='35')
print(person.age)
It won't return an IntegerField object. It will return an int, and can even take it directly from the database.
This is possible because models.Model defines __metaclass__ and it uses some magic that will turn the Person you just defined with simple statements into a complex hook to a database field.
Django makes something complex look simple by exposing a simple API and using metaclasses, recreating code from this API to do the real job behind the scenes.
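To make the idea concrete, here is a toy sketch of the mechanism (this is not Django's code; Field, IntegerField, ModelMetaclass and _fields are all invented for illustration). The class body is written declaratively, and the metaclass rewrites it into conversion machinery behind the scenes:
class Field:
    """Toy stand-in for a database field descriptor."""
    def to_python(self, raw):
        return raw

class IntegerField(Field):
    def to_python(self, raw):
        return int(raw)

class ModelMetaclass(type):
    def __new__(cls, clsname, bases, attrs):
        # pull the Field instances out of the class body and remember them
        fields = {k: v for k, v in attrs.items() if isinstance(v, Field)}
        for k in fields:
            del attrs[k]
        new_class = super().__new__(cls, clsname, bases, attrs)
        new_class._fields = fields
        return new_class

class Model(metaclass=ModelMetaclass):
    def __init__(self, **kwargs):
        # convert raw values through the declared fields
        for name, raw in kwargs.items():
            setattr(self, name, self._fields[name].to_python(raw))

class Person(Model):
    age = IntegerField()

person = Person(age='35')
print(person.age)   # 35 -- a plain int, not an IntegerField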
First, you know that classes are objects that can create instances.
Well, in fact, classes are themselves instances. Of metaclasses.
>>> class Foo(object): pass
>>> id(Foo)
142630324
Everything is an object in Python, and they are all either instances of classes or instances of metaclasses.
Except for type.
type is actually its own metaclass. This is not something you could reproduce in pure Python, and is done by cheating a little bit at the implementation level.
Secondly, metaclasses are complicated. You may not want to use them for very simple class alterations. You can change classes by using two different techniques:
- monkey patching
- class decorators
99% of the time you need class alteration, you are better off using these.
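For instance, a class decorator (a hedged sketch reusing the uppercase idea from earlier; the name uppercase_attributes is made up) achieves a similar effect with far less machinery:
def uppercase_attributes(cls):
    """Class decorator: rename non-dunder attributes to uppercase, in place."""
    for attr, value in list(vars(cls).items()):
        if not attr.startswith("__"):
            delattr(cls, attr)
            setattr(cls, attr.upper(), value)
    return cls

@uppercase_attributes
class Foo:
    bar = 'bip'

print(hasattr(Foo, 'bar'))   # False
print(Foo.BAR)               # bip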
But 98% of the time, you don't need class alteration at all.
Best Answer
You can use the + operator to combine them:
listone = [1, 2, 3]
listtwo = [4, 5, 6]
joinedlist = listone + listtwo
Output:
>>> joinedlist
[1, 2, 3, 4, 5, 6]
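A couple of other common options, sketched here for completeness (not part of the answer above): extend merges in place, and Python 3.5+ iterable unpacking builds a new list:
listone = [1, 2, 3]
listtwo = [4, 5, 6]

# in place: listone itself grows, no new list is created
listone.extend(listtwo)                  # listone is now [1, 2, 3, 4, 5, 6]

# unpacking builds a new list and leaves the originals untouched
joinedlist = [*[1, 2, 3], *[4, 5, 6]]    # [1, 2, 3, 4, 5, 6]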