Credit: David Ascher
You want to create a multidimensional list, but the apparently simplest solution is fraught with surprises.
Use list comprehensions (also known as list displays) to avoid implicit reference sharing:
multilist = [[0 for col in range(5)] for row in range(10)]
When a newcomer to Python is shown the power of the multiplication operation on lists, he often gets quite excited about it, since it is such an elegant notation. For example:
>>> [0] * 5
[0, 0, 0, 0, 0]
The problem is that one-dimensional problems often grow a second dimension, so there is a natural progression to:
>>> multi = [[0] * 5] * 3
>>> print multi
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
This appears to have worked, but the same newcomer is then often puzzled by bugs, which typically can be boiled down to the following test:
>>> multi[0][0] = 'Changed!'
>>> print multi
[['Changed!', 0, 0, 0, 0], ['Changed!', 0, 0, 0, 0], ['Changed!', 0, 0, 0, 0]]
This problem definitely confuses most programmers at least once, if not a few times (see the FAQ entry at http://www.python.org/doc/FAQ.html#4.50). To understand it, it helps to decompose the creation of the multidimensional list into two steps:
>>> row = [0] * 5 # a list with five references to 0 >>> multi = [row] * 3 # a list with three references to the row object
The problem still exists in this version (Python is not that
magical). The comments are key to understanding the source of the
confusion. The process of multiplying a sequence by a number creates
a new sequence with the specified number of new references to the
original contents. In the case of the creation of
row
, it doesn’t matter whether
references are being duplicated or not, since the referent (the
object being referred to) is immutable. In other words, there is no
difference between an object and a reference to an object if that
object is immutable. In the second line, however, what is created is
a new list containing three references to the contents of the
[row]
list, which is a single reference to a list.
Thus, multi
contains three references to a single
object. So when the first element of the first element of
multi
is changed, you are actually modifying the
first element of the shared list. Hence the surprise.
List comprehensions, added in Python 2.2, provide a nice syntax that avoids the problem, as illustrated in the solution. With list comprehensions, there is no sharing of references—it’s a truly nested computation. Note that the performance characteristics of the solution are O(M x N), meaning that it will scale with each dimension. The list-multiplication idiom, however, is an O(M) computation, as it doesn’t really do duplications.