Python – List changes unexpectedly after assignment. Why is this and how can I prevent it

clonecopylistpythonreference

While using new_list = my_list, any modifications to new_list changes my_list every time. Why is this, and how can I clone or copy the list to prevent it?

Best Answer

With new_list = my_list, you don't actually have two lists. The assignment just copies the reference to the list, not the actual list, so both new_list and my_list refer to the same list after the assignment.

To actually copy the list, you have various possibilities:

You can use the builtin list.copy() method (available since Python 3.3):
```
new_list = old_list.copy()
```
You can slice it:
```
new_list = old_list[:]
```
Alex Martelli's opinion (at least back in 2007) about this is, that it is a weird syntax and it does not make sense to use it ever. ;) (In his opinion, the next one is more readable).
You can use the built in list() function:
```
new_list = list(old_list)
```
You can use generic copy.copy():
```
import copy
new_list = copy.copy(old_list)
```
This is a little slower than list() because it has to find out the datatype of old_list first.
If the list contains objects and you want to copy them as well, use generic copy.deepcopy():
```
import copy
new_list = copy.deepcopy(old_list)
```
Obviously the slowest and most memory-needing method, but sometimes unavoidable.

Example:

import copy

class Foo(object):
    def __init__(self, val):
         self.val = val

    def __repr__(self):
        return 'Foo({!r})'.format(self.val)

foo = Foo(1)

a = ['foo', foo]
b = a.copy()
c = a[:]
d = list(a)
e = copy.copy(a)
f = copy.deepcopy(a)

# edit orignal list and instance 
a.append('baz')
foo.val = 5

print('original: %r\nlist.copy(): %r\nslice: %r\nlist(): %r\ncopy: %r\ndeepcopy: %r'
      % (a, b, c, d, e, f))

Result:

original: ['foo', Foo(5), 'baz']
list.copy(): ['foo', Foo(5)]
slice: ['foo', Foo(5)]
list(): ['foo', Foo(5)]
copy: ['foo', Foo(5)]
deepcopy: ['foo', Foo(1)]

Related Solutions

C# – Deep cloning objects

Whereas one approach is to implement the ICloneable interface (described here, so I won't regurgitate), here's a nice deep clone object copier I found on The Code Project a while ago and incorporated it into our code. As mentioned elsewhere, it requires your objects to be serializable.

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

/// <summary>
/// Reference Article http://www.codeproject.com/KB/tips/SerializedObjectCloner.aspx
/// Provides a method for performing a deep copy of an object.
/// Binary Serialization is used to perform the copy.
/// </summary>
public static class ObjectCopier
{
    /// <summary>
    /// Perform a deep copy of the object via serialization.
    /// </summary>
    /// <typeparam name="T">The type of object being copied.</typeparam>
    /// <param name="source">The object instance to copy.</param>
    /// <returns>A deep copy of the object.</returns>
    public static T Clone<T>(T source)
    {
        if (!typeof(T).IsSerializable)
        {
            throw new ArgumentException("The type must be serializable.", nameof(source));
        }

        // Don't serialize a null object, simply return the default for that object
        if (ReferenceEquals(source, null)) return default;

        using var Stream stream = new MemoryStream();
        IFormatter formatter = new BinaryFormatter();
        formatter.Serialize(stream, source);
        stream.Seek(0, SeekOrigin.Begin);
        return (T)formatter.Deserialize(stream);
    }
}

The idea is that it serializes your object and then deserializes it into a fresh object. The benefit is that you don't have to concern yourself about cloning everything when an object gets too complex.

In case of you prefer to use the new extension methods of C# 3.0, change the method to have the following signature:

public static T Clone<T>(this T source)
{
   // ...
}

Now the method call simply becomes objectBeingCloned.Clone();.

EDIT (January 10 2015) Thought I'd revisit this, to mention I recently started using (Newtonsoft) Json to do this, it should be lighter, and avoids the overhead of [Serializable] tags. (NB @atconway has pointed out in the comments that private members are not cloned using the JSON method)

/// <summary>
/// Perform a deep Copy of the object, using Json as a serialization method. NOTE: Private members are not cloned using this method.
/// </summary>
/// <typeparam name="T">The type of object being copied.</typeparam>
/// <param name="source">The object instance to copy.</param>
/// <returns>The copied object.</returns>
public static T CloneJson<T>(this T source)
{            
    // Don't serialize a null object, simply return the default for that object
    if (ReferenceEquals(source, null)) return default;

    // initialize inner objects individually
    // for example in default constructor some list property initialized with some values,
    // but in 'source' these items are cleaned -
    // without ObjectCreationHandling.Replace default constructor values will be added to result
    var deserializeSettings = new JsonSerializerSettings {ObjectCreationHandling = ObjectCreationHandling.Replace};

    return JsonConvert.DeserializeObject<T>(JsonConvert.SerializeObject(source), deserializeSettings);
}

Python – List of lists changes reflected across sublists unexpectedly

When you write [x]*3 you get, essentially, the list [x, x, x]. That is, a list with 3 references to the same x. When you then modify this single x it is visible via all three references to it:

x = [1] * 4
l = [x] * 3
print(f"id(x): {id(x)}")
# id(x): 140560897920048
print(
    f"id(l[0]): {id(l[0])}\n"
    f"id(l[1]): {id(l[1])}\n"
    f"id(l[2]): {id(l[2])}"
)
# id(l[0]): 140560897920048
# id(l[1]): 140560897920048
# id(l[2]): 140560897920048

x[0] = 42
print(f"x: {x}")
# x: [42, 1, 1, 1]
print(f"l: {l}")
# l: [[42, 1, 1, 1], [42, 1, 1, 1], [42, 1, 1, 1]]

To fix it, you need to make sure that you create a new list at each position. One way to do it is

[[1]*4 for _ in range(3)]

which will reevaluate [1]*4 each time instead of evaluating it once and making 3 references to 1 list.

You might wonder why * can't make independent objects the way the list comprehension does. That's because the multiplication operator * operates on objects, without seeing expressions. When you use * to multiply [[1] * 4] by 3, * only sees the 1-element list [[1] * 4] evaluates to, not the [[1] * 4 expression text. * has no idea how to make copies of that element, no idea how to reevaluate [[1] * 4], and no idea you even want copies, and in general, there might not even be a way to copy the element.

The only option * has is to make new references to the existing sublist instead of trying to make new sublists. Anything else would be inconsistent or require major redesigning of fundamental language design decisions.

In contrast, a list comprehension reevaluates the element expression on every iteration. [[1] * 4 for n in range(3)] reevaluates [1] * 4 every time for the same reason [x**2 for x in range(3)] reevaluates x**2 every time. Every evaluation of [1] * 4 generates a new list, so the list comprehension does what you wanted.

Incidentally, [1] * 4 also doesn't copy the elements of [1], but that doesn't matter, since integers are immutable. You can't do something like 1.value = 2 and turn a 1 into a 2.

Best Answer

Related Solutions

C# – Deep cloning objects

Python – List of lists changes reflected across sublists unexpectedly

Related Topic