CHAPTER 6

image

Data Structures

Data structures are important for most any serious (and some not so serious) programs. They allow us to store groups of related data under a single variable name and access them quickly and logically. There are many types of data structures available under Python and each will be explained in the following sections.

Data Structure Example

Let’s say that we need to keep a list of color names that are available to the end user and that list has the following values:

Red, Orange, Yellow, Green, Blue, Purple

We could simply create a number of distinct and separate variables to hold each value. On the other hand, we could use a list data structure to keep track of them under a single variable.

ColorList = ['Red','Orange','Yellow','Green','Blue','Purple']

By doing it this way, we can easliy access whichever color name we want by simply using an index into the ColorList variable.

print ColorList[2]

Will return “Yellow”. Remember that the indexes are zero-based.

Digging Deeper

If we need to keep data, such as registration information, for the program that we are creating, we would need information like:

First Name, Last Name, Address, City, State, Postal Code

The first thought that springs to mind is to use a database to store this information. However, a quicker way to do it would be to use a dictionary structure. A dictionary (as you will see below), allows us to store data associated with a “key”. By using the dictionary, we don’t have the overhead of dealing with databases. We’ll examine this later on when we discuss dictionaries later on in this chapter.

Lists

In other languages, there is a data structure called an Array. Going back to the shoe box analogy that I used back in Chapter 2, an array is simply a bunch of shoe boxes “glued” together that holds like data together under a single variable name. Python does not provide a native Array type. Instead, we have the List object.

A list is simply a collection of items that are accessible by an index number, similar to arrays. Unlike Arrays, Lists can contain any value like strings, integers, floating point numbers or other objects like dictionaries or even other lists. You can also mix types of data within a list. Lists are dynamic, so may be modified at any time.

To create a list by hand, you would use the square bracket characters “[“ and “]”. Each item in a list is separated by a comma.

MyList = ['This','is','a','list']
NumberList = [0,1,2,3,4,5,6]
MyEmptyList = []
SillyList = [3,'A String',42,'42',5,'The End']

To access a single item within a list, you would do it by accessing a list by its index value. Lists have zero-based indexes, so the first item in a list is index 0, the second item is index 1, and so on. Using the above example MyList:

>>> print MyList[2]
a
>>> print MyList[3]
list
>>> print MyList[0]
This

If you attempt to access the index of a list that does not exist (index position 4 in the MyList list for example), you will get an error.

To walk (or iterate) through the entire list from beginning to end, you can use a simple for loop:

for i in range(0,len(MyList)):
    print MyList[i]

This
is
a
list

An alternative way to do this is to something like the following code, which some programmers find simpler and more “pythonic” and at the same time produces the same output:

for elem in MyList:
    print elem

You can also convert other types of data structures to a list. In the example below, the variable t is a tuple.

>>> t = (1,2,3)
>>> l = list(t)
>>> l
[1, 2, 3]

List Functions

The following built-in operators are available to the List object.

len(L)

Returns the number of items in a list.

>>> l = [1,2,3,4,5,6,7]
>>> len(l)
7

min(L)

Returns the minimum value in a list.

>>> l = [1,2,3,4,5,6,7]
>>> min(l)
1

max(L) function

Returns the maximum value in a list.

>>> l = [1,2,3,4,5,6,7]
>>> max(l)
7

x in L

Returns True if x is in the list L.

>>> l = [1,2,3,4,5,6,7]
>>> 42 in l
False
>>> 3 in l
True

x not in L

Returns True if x is NOT in the list L.

>>> l = [1,2,3,4,5,6,7]
>>> 42 not in l
True

L1 + L2

Concatenate L2 to the end of L1.

>>> l = [1,2,3,4,5,6,7]
>>> l2 = [9,10,11,12]
>>> l+l2
[1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12]

L[x]

Retrieve item in list at index position x (zero-based). This is pretty much the same thing as using an array in another language. If you need to have something that acts like a multidimensional array, which is not available in Python, you can use a list of lists.

>>> l = [1,2,3,4,5,6,7]
>>>l[3]
4

L[x1:x2]

Slice of list L from index position x1 to x2 (zero-based).

>>> l = [1,2,3,4,5,6,7]
>>> l[2:4]
[3, 4]

del(L[x])

Removes the item from list L at index position x (zero-based).

>>> l = ['F', 'E', 'D', 'C', 'B', 'A']
>>> del(l[2])
>>> l
['F', 'E', 'C', 'B', 'A']

List Methods

The following methods are available to lists.

.append(x)

Append the value in x to a list.

>>> l = [0,1,2,3,4]
>>> l.append(5)
>>> l
[0, 1, 2, 3, 4, 5]

.extend(L)

Append a list to another list. In the following example, l is modified, l2 is not.

>>> l = [0,1,2,3,4]
>>> l2 = [5,6,7]
>>> l.extend(l2)
>>> l
[0, 1, 2, 3, 4, 5, 6, 7]

.insert(i,x)

Insert a value x into list at index I. The following example inserts the value 5 at position 2 in the list.

>>> l = [0,1,2,3,4]
>>> l.insert(2,5)
>>> l
[0, 1, 5, 2, 3, 4]

If carefully used, lists with the .insert() and .pop() methods can be a quick and easy way to implement a LIFO (Last in, First Out) queue or stack.

.remove(x)

Removes the first item in the list that matches ‘x’. An error occurs if the item does not exist. The following example removes the value 2 from the list. The second example tries to do it again but gets an error.

>>> l = [0,1,2,3,4,5]
>>> l.remove(2)
>>> l
[0, 1, 3, 4, 5]
>>> l.remove(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list

.pop([i])

Returns and removes the last item in the list if the optional index number is not included. If it is, it removes the item at that index (zero-based). The following example uses pop() to remove the last item in the list, then removes the item at index position 2.

>>> l = [0,1,2,3,4,5]
>>> l.pop()
5
>>> l.pop()
4
>>> l.pop(2)
2
>>> l
[0, 1, 3]

If carefully used, lists with the .insert() and .pop() methods can be a quick and easy way to implement a LIFO (Last in, First Out) queue or stack.

.index(x)

Returns the position of the item in the list.

The following example first shows the index position of the value 3 in the example list (which is 3). The second example shows the index position of the item “Oranges” in the list.

>>> l = [0,1,2,3,4,5]
>>> l.index(3)
3
>>> l1 = ['Apples','Oranges','Kiwi','Peach']
>>> l1
['Apples', 'Oranges', 'Kiwi', 'Peach']
>>> l1.index('Oranges')
1
  

.count(x)

Returns the count of the matching items in the list. If item is not in the list, it returns 0.

>>> l = [3,1,3,4,3,6,7,8]
>>> l.count(3)
3
>>> l.count(2)
0

.sort( )

Sorts the list from low to high.

>>> l2 = [0,1,2,3,2,5,7,3,1,2,5]
>>> l2.sort()
>>> l2
[0, 1, 1, 2, 2, 2, 3, 3, 5, 5, 7]

.reverse( )

Reverses the list.

>>> l = [0,1,2,3,4,5,6,7,8]
>>> l.reverse()
>>> l
[8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> l = ['A','B','C','D','E','F']
>>> l.reverse()
>>> l
['F', 'E', 'D', 'C', 'B', 'A']

Dictionaries

Dictionaries are a very valuable tool in our Python arsenal. A dictionary is like a list, but it allows you to store data with a keyword attached as a matched pair with the data. Earlier in this book, I talked about the need for registration information in an imaginary program. The information that is needed would be:

First Name, Last Name, Address, City, State, Postal Code

A dictionary allows us to keep that information very easily. For each piece of data we have a key that is associated with it. The structure of the key/value pair is:

{Key:Value}

Each key within a dictionary must be unique, but the value may be repeated if needed. Curly brackets are used to set up the dictionary. Keys may be strings or numbers.

dict = {"Fname":"Jack","LName":"Sprat"}

You can create a blank dictionary by simply assiging a variable to an empty set of curly brackets.

Names = {}

You can add new key/value pairs to a dictionary.

>>> names = {'fname':'Fred','lname':'Frackel','city':'Aurora','state':'CO'}>>> names['phone'] = '222-222-2222'
>>> names
{'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222'}

To iterate (walk through) a dictionary you can use the .iteritems() built in function. Notice that dictionaries do not store the key/value pairs in any predefined order, so items might not appear in the order that it is entered:

>>> names = {'fname':'Fred','lname':'Frackel','city':'Aurora','state':'CO'}
>>> for key,value in names.iteritems():
...    print key,value
...
lname Frackel
city Aurora
state CO
fname Fred
>>>

Dictionary Functions

Dictionaries have the following built-in functions.

len(dictionary)

Returns the number of items in a dictionary.

>>> d = {'lname':'Frackel','fname':'Fred','city':'Aurora','state':'CO'}
>>> len(d)
4

dict(list)

Creates a dictionary from a list provided and the list must contain at least one two-element tuple. The first element in the tuple is the key and the second is the value.

>>> d2 = dict([('one',1),('two',2),('three',3)])
>>> d2
{'three': 3, 'two': 2, 'one': 1}

Dictionary Methods

Dictionaries have the following built-in methods:

.clear( )

Removes all items from a dictionary.

>>> test = {'one':'1','two':'2','three':'3'}
>>> test
{'three': '3', 'two': '2', 'one': '1'}
>>> test.clear()
>>> test
{}

.copy( )

To make a copy of a dictionary, use the .copy() method. This is a shallow copy, which means that the content of the dictionary is not copied directly by value, but by reference and points to the actual original dictionary.

>>> first = {'a':1,'b':2,'c':3}
>>> clone = first.copy()
>>> first
{'a': 1, 'b': 2, 'c': 3}
>>> clone

{'a': 1, 'b': 2, 'c': 3}

.get(key[,default])

Returns a single value by key from a dictionary. Unlike the .pop() method, this does not remove the key/value pair from the dictionary. If key does not exist in dictionary, value from the optional default parameter is returned. If default is not given, returns None.

>>> names = {'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222'}
>>> names
{'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222'}
>>> names.get('lname')
'Frackel'

.has_key(key)

Returns True if the key exists in the dictionary. This method has been depreceated and the suggested way to check is to use key in d.

>>> names = {'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222'}
>>> names.has_key('city')
True
>>> names.has_key('address')
False

.items( )

Returns a list of all key/value pairs in a dictionary. Notice that this is a nonsorted list and is not in the order that the data was entered.

>>> names = {'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222'}
>>> names.items()
[('lname', 'Frackel'), ('city', 'Aurora'), ('state', 'CO'), ('fname', 'Fred'), ('phone', '222-222-2222')]

.keys( )

To get a list of keys from a dictionary, use the built in function .keys(). Notice that this is a nonsorted list and is not in the order that the keys were entered.

>>> names = {'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred'}
>>> names.keys()
['lname', 'city', 'state', 'fname']

To get a list of keys from a dictionary that is sorted, assign the return values from the .keys() function to a list then apply the .sort() function to that variable.

>>> names={'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred'}
>>> namekeys = names.keys()
>>> namekeys.sort()
>>> namekeys
['city', 'fname', 'lname', 'state']

.pop(key[,default])

Removes and returns the value of an item in the dictionary based on the key provided. If default is not given and the key does not exist, a KeyError is raised.

>>> names = {'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred','address':'123 Main Street'}
>>> names.pop('address')
'123 Main Street'
>>> names
{'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222'}

.setdefault(key[,default])

Returns a value from the supplied key, if it exists. If not, it enters the key as a new item with the supplied default value.

>>> d1
{'a': 1, 'c': 3, 'b': 2, 'e': 0, 'd': 4}
>>> d1.setdefault('c',6)
3
>>> d1.setdefault('f',6)
6
>>> d1
{'a': 1, 'c': 3, 'b': 2, 'e': 0, 'd': 4, 'f': 6}

.update(other)

Updates the dictionary with the key/value pair provided in other. This will overwrite existing keys. Returns None. The other parameter can be either a tuple or list providing the key/value pair(s), or another dictionary.

>>> names = {'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred'}
>>> names.update({'address':'123 Main Street'})
>>> names
{'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222', 'address': '123 Main Street'}

.values( )

Returns a list of all values in a dictionary. The list returned is unsorted and may not be in the order that the data was entered. Using the list after the above update:

>>> names
{'lname': 'Frackel', 'city': 'Aurora', 'state': 'CO', 'fname': 'Fred', 'phone':'222-222-2222'}
>>> names.values()

['Frackel', 'Aurora', 'CO', 'Fred', '222-222-2222']

Tuples

A tuple is another kind of sequence data type. Tuples are a number of values seperated by commas. The data within a tuple may consist of numbers, strings and even other objects.

>>> t = 3,42,'The time has come for all good men'
>>> t
(3, 42, 'The time has come for all good men')
>>> t[0]
3
>>> t[2]
'The time has come for all good men'

Tuples are immutable objects, which mean that once it has been created, it can not be changed.

>>> t[0] = 73
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

Although a tuple is immutable, it can contain mutable objects like lists.

>>> t1 = [1,2,3],['a','b','c']
>>> t1
([1, 2, 3], ['a', 'b', 'c'])
>>> t1[0]
[1, 2, 3]
>>> t1[0][1] = 4
>>> t1
([1, 4, 3], ['a', 'b', 'c'])

You can also assign the values within a tuple to mutable variables.

>>> x,y = t1
>>> x
[1, 4, 3]
>>> y
['a', 'b', 'c']

Sets

Sets are an unordered collection with no duplicate elements. Sets are mutable (may be changed).

In the snippet below, you will see that we use the string ‘This is a test’ as the data for the set. When we get the actual data used in the set, there are only eight values. All other values are discarded because they are duplicates. Notice also that when the set is displayed, it is actually a list.

>>> settest = set('This is a test')
>>> settest
set(['a', ' ', 'e', 'i', 'h', 's', 'T', 't'])

>>> 'a' in settest
True
>>> 'b' in settest
False

Set Functions

The following functions are available for sets.

len(set)

Returns the length or count if items within the set.

>>> c
set([2, 3, 4, 5, 6, 7, 8, 9, 11])
>>> len(c)
9

min(set)

Returns the minimum value in the set.

>>> c
set([2, 3, 4, 5, 6, 7, 8, 9, 11])
>>> min(c)
2

max(set)

Returns the maximum value in the set.

>>> c
set([2, 3, 4, 5, 6, 7, 8, 9, 11])
>>> max(c)
11

Set Methods

The following methods are available for sets.

.clear( )

Removes all data from the set.

>>> b = set([1,2,3,4,5])
>>> b
set([1, 2, 3, 4, 5])
>>> b.clear()
>>> b
set([])

.copy( )

Creates a new set by making a shallow copy.

>>> b
set([3, 4, 5, 6])
>>> c = b.copy()
>>> c
set([3, 4, 5, 6])

.pop( )

Removes an arbitrary item from the set. If the set is empty, a KeyError exception is raised.

>>> b = set([1,2,3,4,5])
>>> b
set([1, 2, 3, 4, 5])
>>> b.pop()
1
>>> b.pop()
2
>>> b.pop()
3
>>> b
set([4, 5])

.add(item)

Adds an item to the set. Since sets can not hold duplicates, if the item already exists, nothing will be done.

>>> b
set([4, 5])
>>> b.add(3)
>>> b
set([3, 4, 5])
>>> b.add(4)
>>> b
set([3, 4, 5])

.remove(item)

Deletes an item from the set. If the item does not exist, a KeyError exception will be raised.

>>> b
set([3, 4, 5])
>>> b.remove(4)
>>> b
set([3, 5])
>>> b.remove(4)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
KeyError: 4

.discard(item)

Removes an item from the set. If the item is not in the set, no error will be raised.

>>> b
set([3, 5])
>>> b.discard(4)
>>> b.discard(5)
>>> b
set([3])

.update(set) or alternately x|=y

Merges values from the new set into the old set. If a value exists, it is ignored.

>>> b
set([3])
>>> b.update([3,2,1,4,5])
>>> b
set([1, 2, 3, 4, 5])

.intersection_update(set) or alternately x&=y

Updates the set x, discarading any elements that are not in both set x and y.

>>> a = set([1,2,3,4,5])
>>> b = set([2,3,4])
>>> a.intersection_update(b)
>>> a
set([2, 3, 4])

.difference_update(set) or alternately x-=y

Updates set x to a new set having only the values NOT in both set x and y.

>>> a = set([1,2,3,4,5])
>>> b = set([2,3,4])
>>> a.difference_update(b)
>>> a
set([1, 5])

.symmetric_difference_update(set) or alternately x^=y

Updates set x to contain only those values that are not in both set x and y.

>>> a = set([1,2,3])
>>> b = set([3,4,5])
>>> a.symmetric_difference_update(b)
>>> a
set([1, 2, 4, 5])

.issubset(set) or alternately x<=y

Returns True if set y is a subset of set x; otherwise, it returns False.

>>> a = set([1,2,3])
>>> b = set([3,4,5])
>>> c = set([2,3])
>>> c.issubset(a)
True

.issuperset(set) or alternately x>=y

Returns True if set x is a superset of set y; otherwise, it returns False.

>>> a = set([1,2,3])
>>> c = set([2,3])
>>> a.issubset(c)
True

.union(set) or alternately x|y

Returns a set that conains all unique values in sets x and y.

>>> a = set([1,2,3])
>>> c = set([5,6,7])
>>> a.union(c)
set([1, 2, 3, 5, 6, 7])

.intersection(set) or alternately x&y

Returns a new set that contains the values that are in both sets x and y.

>>> a
set([1, 2, 3])
>>> b
set([2, 3])
>>> a.intersection(b)
set([2, 3])

.difference(set) or alternately x-y

Returns a new set that contains the values that are not in both sets x and y.

>>> a
set([1, 2, 3])
>>> b
set([2, 3])
>>> a.difference(b)
set([1])

.symmetric_difference(set) or alternately x^y

Returns a new set that contains the values that are not in both sets x and y, but does not update set x.

>>> a
set([1, 2, 3])
>>> b = set([3,4,5])
>>> a.symmetric_difference(b)
set([1, 2, 4, 5])

Frozensets

Frozensets are, for the most part, the same as sets, except they are immutable (they can not be changed). This means that the .add and .update methods will return an error.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset