Using zip and izip

The built-in Python function zip takes two collections of equal length and merges them together in pairs.

Getting ready

Let's demonstrate zip using a very simple example.

How to do it…

Let's pass two sequences to the zip function and print the output:

print zip(range(1,5),range(1,5))

How it works…

The two parameters to our zip function are two lists, both with values ranging from 1 to 4.

The range function takes three parameters: the starting value of the list, the ending value of the list, and a step value. The default step value is one. In our case, we passed 1 and 5 as the starting and ending values of the list. Remember that the ending value is excluded from the result (the interval is open on the right), so range(1, 5) will return a list as follows:

[1,2,3,4]
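
As a small illustration of the three parameters, here is a minimal sketch. The variable names are our own, and the list() and print() calls are used only so that the snippet behaves the same on Python 2 and Python 3 (where range returns a lazy object rather than a list):

```python
# range(start, stop, step): the stop value is never part of the result.
default_step = list(range(1, 5))      # step defaults to 1
step_of_two = list(range(1, 10, 2))   # every second value from 1, stopping before 10

print(default_step)   # [1, 2, 3, 4]
print(step_of_two)    # [1, 3, 5, 7, 9]
```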

We pass the two sequences to the zip function and the result is as follows:

[(1, 1), (2, 2), (3, 3), (4, 4)]

Keep in mind that both collections should be of the same size; if they are not, the output is truncated to the size of the shortest collection.
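
The truncation can be seen with two collections of different lengths. This is a minimal sketch with made-up input lists; list() is wrapped around zip only so that the result can be printed on Python 3 as well:

```python
short = [1, 2, 3]
longer = ['a', 'b', 'c', 'd', 'e']

# zip stops as soon as the shorter input is exhausted;
# 'd' and 'e' are silently dropped.
pairs = list(zip(short, longer))

print(pairs)   # [(1, 'a'), (2, 'b'), (3, 'c')]
```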

There's more…

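Now, look at the following code, where out holds the list of pairs that zip produced in the previous section:

out = zip(range(1,5),range(1,5))
x,y = zip(*out)
print x,y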

Can you guess what the output is?

Let's see what the * operator does. The * operator unpacks a collection into positional arguments:

a =(2,3)
print pow(*a)

The pow function takes two arguments. Here, a is a tuple, and the * operator unpacks it into two separate arguments, 2 and 3. They are passed as parameters, as in pow(2,3), and we get the output 8.
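
With this in mind, we can return to the zip(*out) question. The following sketch rebuilds out as the list of pairs from our earlier zip call and then unpacks it; the list() wrapper is our addition so that the snippet also runs on Python 3:

```python
out = list(zip(range(1, 5), range(1, 5)))   # [(1, 1), (2, 2), (3, 3), (4, 4)]

# zip(*out) is the same call as zip((1, 1), (2, 2), (3, 3), (4, 4)):
# the first elements of every pair are grouped together, then the second ones.
x, y = zip(*out)

print(x)   # (1, 2, 3, 4)
print(y)   # (1, 2, 3, 4)
```

In other words, applying zip to an unpacked list of pairs reverses the original zip and gives back the two sequences, this time as tuples.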

The ** operator can be used to unpack a dictionary. Look at the following snippet:

a_dict = {"x":10,"y":10,"z":10,"x1":10,"y1":10,"z1":10} 

The ** operator unpacks a dictionary as a set of named arguments. Our dictionary has six keys, so applying the ** operator to it yields six named arguments. Look at the following function, which takes six arguments:

def dist(x,y,z,x1,y1,z1):
    return abs((x-x1)+(y-y1)+(z-z1))

print dist(**a_dict) 

The output of the print statement is zero.

Armed with these two operators, we can write a function without any restrictions on the number of variables that it can ingest:

def any_sum(*args):
    tot = 0
    for arg in args:
        tot+=arg
    return tot

print any_sum(1,2)
print any_sum(1,2,3)

As you can see in the preceding code snippet, the any_sum function can now work on any number of variables. A curious reader may ask why we don't simply pass a list of values as a single argument to the any_sum function. That would work in this case, but we will soon encounter cases where we really don't know what kind of arguments will be passed.
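
The two operators can also be combined in a single function definition. The following is only a sketch: describe_call is a hypothetical helper of our own, not a function used elsewhere in this recipe. It shows that positional arguments are collected into a tuple and named arguments into a dictionary:

```python
def describe_call(*args, **kwargs):
    """Return the positional and named arguments this call received."""
    return args, kwargs

positional, named = describe_call(1, 2, x=10, y=20)

print(positional)                   # (1, 2)
print(named == {"x": 10, "y": 20})  # True
```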

Back to the zip utility. One drawback with zip is that it computes the entire list at once. This may be an issue when we have two very large lists. The izip function comes to the rescue in these cases: it computes the elements only when they are requested. The izip function is a part of itertools. Please refer to the itertools recipe for more details.
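
The lazy behaviour can be sketched as follows. The try/except fallback is our own addition, on the assumption that the snippet may also be run on Python 3, where itertools.izip no longer exists because the built-in zip is already lazy:

```python
try:
    from itertools import izip   # Python 2
except ImportError:
    izip = zip                   # Python 3: the built-in zip is lazy

pairs = izip(range(1, 5), range(1, 5))

first = next(pairs)   # only the first pair has been computed so far
rest = list(pairs)    # the remaining pairs are produced here, on request

print(first)   # (1, 1)
print(rest)    # [(2, 2), (3, 3), (4, 4)]
```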

See also

  • Working with Itertools recipe in Chapter 1, Using Python for Data Science