By the end of this chapter, you will be able to:
This lesson describes dictionaries and sets. We cover creating, reading, and writing data to these data structures.
You have already seen lists that hold values that you can access by using indexes. However, what if you wanted to name each value, instead of using an index? For example, suppose that you want to access a particular item in a list of cake ingredients, but you do not know its position in the list. In that case, a dictionary would come in handy.
Dictionaries, sometimes referred to as associative arrays in other languages, are data structures that hold data or information as key-value pairs. Dictionaries allow you to access whatever value you want by using its key, which is much easier to remember than a position.
Dictionaries, unlike lists, are indexed using keys, which are usually strings. There are two kinds of dictionaries that you can use in Python: the built-in dict and a specialized subclass called OrderedDict. Before Python 3.7, the built-in dict made no guarantee about the order of its keys, whereas an OrderedDict always stored key-value pairs in the order of insertion. Since Python 3.7, the built-in dict also preserves insertion order; OrderedDict remains useful for its order-aware features, such as order-sensitive equality and the move_to_end() method.
A set is a collection of data items that are unordered and unique; that is, items cannot be repeated. For example, [1, 1] is a valid list, but a set built from it contains the value 1 only once. Because sets hold no duplicates, they lend themselves to mathematical operations such as union and intersection. You can store any hashable value in a set (for example, a string or an integer), so long as it is unique; mutable objects such as lists and dictionaries cannot be members of a set.
For background information, sets are very important objects in mathematics, as well. A whole section of mathematics, called Set Theory, has been dedicated to the study of sets and their properties. We will not get into a lot of the mathematics around sets, but we will discuss some basic set operations.
Suppose that we have a set, A = {1,2,3}, and another set, B = {3,4,5}.
The union of sets A and B, mathematically denoted as A ∪ B, will be {1,2,3,4,5}. The union is simply the set of everything in A and B. Remember, there are no duplicates, so 3 will appear only once.
On the other hand, the intersection of A and B, denoted as A ∩ B, will be the set of everything that is common to both A and B, which, in our case, is just {3}.
You can create a dictionary in two ways. The first way is to simply assign an empty dictionary to a variable by using curly brackets, like so:
dictionary = {}
The second way is to use the dict() function to return a new dictionary:
dictionary = dict()
In both cases, a dictionary object will be created. We can inspect the attributes and properties defined on the dictionary object by using the built-in dir() function:
>>> dir(dictionary)
['__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
We can also confirm that we have an actual dictionary by using the built-in isinstance() function to check that dictionary is an instance of the dict class:
>>> isinstance(dictionary, dict)
True
The isinstance() function is used to check the type of an object. It takes two arguments, the first one being the object being inspected, and the second being the class that we want to type-check against; for example, int, str, dict, list, and so on.
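As a small sketch of how isinstance() behaves (the variable names here are illustrative and do not come from the text), note that the second argument can also be a tuple of classes, in which case the check passes if the object is an instance of any of them:

```python
# Illustrative values only; these names do not come from the chapter.
inventory = {}
count = 3

print(isinstance(inventory, dict))       # True
print(isinstance(count, dict))           # False
# A tuple of classes matches if the object is an instance of any of them.
print(isinstance(count, (int, float)))   # True
```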
In this activity, we will create a dictionary and verify its type. The steps are as follows:
Solution for this activity can be found at page 286.
A typical example of a dictionary is as follows:
d = dict(
state="NY",
city="New York"
)
In this example, state and city are keys, while NY and New York are the respective values assigned to them.
In this exercise, we will see how to add data to a dictionary:
dictionary1 = dict(
state="NY",
city="New York"
)
print(dictionary1)
dictionary2 = {
"state": "Maryland",
"city": "Baltimore"
}
print(dictionary2)
Notice that when we use the dict() function, we assign values to keys using the = operator. When using {}, we separate the keys (which are strings) from the values by using :. Running this code will print the following:
{'state': 'NY', 'city': 'New York'}
{'state': 'Maryland', 'city': 'Baltimore'}
dictionary2['bird'] = 'Baltimore oriole'
This will add a new key to dictionary2, with the name bird and the value Baltimore oriole.
>>> print(dictionary2)
{'state': 'Maryland', 'city': 'Baltimore', 'bird': 'Baltimore oriole'}
dictionary1['state'] = 'New York'
This will change the value of state from NY to New York, like so:
>>> print(dictionary1)
{'state': 'New York', 'city': 'New York'}
In this exercise, we will look at how to read data from a dictionary:
print(dictionary1['state'])
NY
Using this format, if the key does not exist, we will get a KeyError. For example:
print(dictionary1['age'])
Traceback (most recent call last):
File "python", line 13, in <module>
KeyError: 'age'
The get() function returns None if an item does not exist. You can also use the get() function to specify what should be returned when no value exists. Use the get() function, as shown here:
print(dictionary1.get('state'))
print(dictionary1.get('age'))
print(dictionary1.get('age', 'Key age is not defined'))
This will output the following:
NY
None
Key age is not defined
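One common use of the default argument is supplying fallback values. The following is a minimal sketch that assumes a hypothetical settings dictionary; it also shows that get() only reads from the dictionary and never inserts the missing key:

```python
# Hypothetical settings dictionary, used only for illustration.
settings = {"theme": "dark"}

theme = settings.get("theme", "light")      # key exists, so the default is ignored
font_size = settings.get("font_size", 12)   # key is missing, so the default is used

print(theme)                     # dark
print(font_size)                 # 12
# get() does not add the missing key to the dictionary.
print("font_size" in settings)   # False
```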
In this exercise, we will look at how to iterate through dictionary values by using various methods:
dictionary1 = dict(
state="NY",
city="New York"
)
for item in dictionary1:
print(item)
This code, by default, iterates through the dictionary's keys, and will print the following:
state
city
dictionary1 = dict(
state="NY",
city="New York"
)
for item in dictionary1.keys():
print(item)
This will also output the following:
state
city
dictionary1 = dict(
state="NY",
city="New York"
)
for item in dictionary1.values():
print(item)
The output for this will be as follows:
NY
New York
dictionary1 = dict(
state="NY",
city="New York"
)
for key, value in dictionary1.items():
print(key, value)
This will output the following:
state NY
city New York
You can use the in keyword to check whether a particular key exists in a dictionary, without iterating through it. This works the same way as it does in lists, and you will get back a Boolean value of True if the key exists, and False if it doesn't:
a = {
"size": "10 feet",
"weight": "16 pounds"
}
print("size" in a)
print("length" in a)
This example outputs the following:
True
False
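It may help to see that in only inspects the keys of a dictionary. The following sketch reuses the dictionary from the example above to contrast a key check with a value check, and then guards a lookup so that a missing key does not raise a KeyError:

```python
a = {"size": "10 feet", "weight": "16 pounds"}

print("10 feet" in a)            # False: "in" checks keys, not values
print("10 feet" in a.values())   # True: search the values explicitly

# Guard a lookup so that a missing key does not raise a KeyError.
if "length" in a:
    print(a["length"])
else:
    print("length is not recorded")   # this branch runs
```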
If you run the dir() function on a dictionary, you will see a few more attributes defined that we have not yet touched upon. The following is a sample output:
['__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy',
'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']
Let's go through some of these attributes and see what they can do.
The .update() method on dictionaries is used to insert new key-value pairs into a dictionary, or to update the value of an existing key.
For example, if we have an empty dictionary, calling .update() will insert a new entry:
>>> a = {}
>>> a.update({"name": "Dan Brown"})
>>> a
{'name': 'Dan Brown'}
What if we update the name key again? Consider the following code:
>>> a.update({"name": "Dan Brown Xavier"})
>>> a
{'name': 'Dan Brown Xavier'}
As you can see, calling .update() with an existing key replaces the value of that key. Also, note that the .update() function takes a dictionary with the key-value pairs defined, in order to update the existing dictionary. This means that the .update() function would come in handy if you had two dictionaries with different keys that you wanted to combine into one.
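As a rough sketch of combining two dictionaries, the following example uses two made-up dictionaries, defaults and overrides. It copies the first one so that neither original is modified, and it shows that when both dictionaries share a key, the value from the dictionary passed to .update() wins. The printed key order assumes Python 3.7 or later:

```python
# Both dictionaries are invented for this illustration.
defaults = {"color": "blue", "size": "medium"}
overrides = {"size": "large", "price": 20}

merged = defaults.copy()    # leave the original dictionaries untouched
merged.update(overrides)    # keys in overrides win on conflict
print(merged)     # {'color': 'blue', 'size': 'large', 'price': 20}
print(defaults)   # {'color': 'blue', 'size': 'medium'}

# update() also accepts keyword arguments or an iterable of key-value pairs.
merged.update(price=25)
merged.update([("stock", 3)])
print(merged)     # {'color': 'blue', 'size': 'large', 'price': 25, 'stock': 3}
```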
The clear method is used to remove all keys from a dictionary. For example:
>>> a
{'name': 'Dan Brown Xavier'}
>>> a.clear()
>>> a
{}
If you only want to remove one key-value pair, you can use the del keyword, like so:
>>> a = {"name": "Skandar Keynes", "age": "24"}
>>> del a["name"]
>>> a
{'age': '24'}
What if you want to remove a key-value pair from the dictionary and do something with the value? In that case, you can use pop(), which will delete the entry from the dictionary and return the value:
>>> a = {"name": "Skandar Keynes", "age": "24"}
>>> b = a.pop("name")
>>> a
{'age': '24'}
>>> b
'Skandar Keynes'
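Like indexing with square brackets, pop() raises a KeyError when the key is missing. A short sketch of the two behaviors follows; passing a second argument to pop() supplies a value to return instead of raising the error:

```python
a = {"name": "Skandar Keynes", "age": "24"}

# With a default, a missing key is not an error.
missing = a.pop("height", None)
print(missing)   # None

# Without a default, a missing key raises KeyError.
try:
    a.pop("height")
except KeyError as error:
    print("KeyError:", error)   # KeyError: 'height'

print(a)   # {'name': 'Skandar Keynes', 'age': '24'}
```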
The copy method is used to create shallow copies of dictionaries. For example:
>>> a = {"name": "Skandar Keynes", "age": "24"}
>>> b = a.copy()
>>> b
{'name': 'Skandar Keynes', 'age': '24'}
>>> a["name"] = "Janet Jackson"
>>> a
{'name': 'Janet Jackson', 'age': '24'}
>>> b
{'name': 'Skandar Keynes', 'age': '24'}
In this example, you can see that b is a shallow copy of a, and has exactly the key-value pairs found in a. However, assigning a new value to a key in a does not affect b. They exist as two different dictionary objects.
This is different from using the = operator, which does not copy anything: a and b simply become two names for the same object, so updating one updates the other:
>>> a = {"name": "Skandar Keynes", "age": "24"}
>>> b = a
>>> a["name"] = "Janet Jackson"
>>> b["age"] = 16
>>> a
{'name': 'Janet Jackson', 'age': 16}
>>> b
{'name': 'Janet Jackson', 'age': 16}
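The word shallow matters when a value is itself mutable. The following sketch, which uses a made-up hobbies list, shows that a shallow copy shares nested objects with the original, while copy.deepcopy() from the standard library copies them as well:

```python
import copy

# The "hobbies" entry is invented for this illustration.
a = {"name": "Skandar Keynes", "hobbies": ["chess"]}
shallow = a.copy()
deep = copy.deepcopy(a)

a["hobbies"].append("tennis")   # mutate the nested list in place

print(shallow["hobbies"])   # ['chess', 'tennis']: the list is shared with a
print(deep["hobbies"])      # ['chess']: the deep copy has its own list
print(shallow["hobbies"] is a["hobbies"])   # True
```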
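The popitem() method removes a key-value pair from the dictionary and returns it as a tuple. In Python 3.7 and later, pairs are removed in last-in, first-out order, so the most recently inserted pair is returned first; in earlier versions, an arbitrary pair was returned. That item will no longer exist in the dictionary after that. For example: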
>>> a = {"name": "Skandar Keynes", "age": "24", "sex": "male"}
>>> a.popitem()
('sex', 'male')
>>> a.popitem()
('age', '24')
>>> a
{'name': 'Skandar Keynes'}
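The following sketch assumes Python 3.7 or later, where popitem() removes the most recently inserted pair first. It also shows that calling popitem() on an empty dictionary raises a KeyError:

```python
a = {}
a["first"] = 1
a["second"] = 2
a["third"] = 3

newest = a.popitem()
next_newest = a.popitem()
print(newest)        # ('third', 3)
print(next_newest)   # ('second', 2)
print(a)             # {'first': 1}

a.popitem()          # removes the last remaining pair
try:
    a.popitem()
except KeyError:
    print("popitem() on an empty dictionary raises KeyError")
```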
The setdefault() method takes two arguments: a key to be searched for in the dictionary, and a value. If the key exists in the dictionary, its value will be returned. If the key does not exist, it will be inserted with the value provided in the second argument. If no second argument was passed, any insertion will be done with the value None.
Let's look at an example where the key exists in the dictionary:
>>> a = {"name": "Skandar Keynes", "age": "24", "sex": "male"}
>>> b = a.setdefault("name")
>>> a
{'name': 'Skandar Keynes', 'age': '24', 'sex': 'male'}
>>> b
'Skandar Keynes'
In this case, the value is returned as-is, and the dictionary is left untouched. Passing the second argument in this case will have no effect, since a value already exists.
Let's look at another example, where the key does not exist in the dictionary, and a value was passed. In this case, the key-value pair will be added to the dictionary, and the value will be returned, as well:
>>> a = {"name": "Skandar Keynes", "age": "24", "sex": "male"}
>>> b = a.setdefault("planet", "Earth")
>>> a
{'name': 'Skandar Keynes', 'age': '24', 'sex': 'male', 'planet': 'Earth'}
>>> b
'Earth'
Now, let's look at a final example, where the key does not exist in the dictionary, and no value was passed. In this case, the key will be added with a value of None, and None is also what setdefault() returns, which is why the interactive prompt displays nothing for b:
>>> a = {"name": "Skandar Keynes", "age": "24", "sex": "male"}
>>> b = a.setdefault("planet")
>>> a
{'name': 'Skandar Keynes', 'age': '24', 'sex': 'male', 'planet': None}
>>> b
>>>
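Because setdefault() returns the stored value, it is often used to build up collections inside a dictionary. The following is a minimal sketch, using an invented word list, that groups words by their first letter:

```python
# An invented word list, used only for illustration.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = {}
for word in words:
    # Insert an empty list the first time a letter is seen, then append to it.
    groups.setdefault(word[0], []).append(word)

print(groups)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```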
The dict.fromkeys() method is used to create a dictionary from an iterable of keys, with whatever value is provided by the user. An iterable is anything that you can iterate over (for example, using a for loop).
Let's look at an example of this:
>>> a = dict.fromkeys(["name", "age"], "Nothing here yet")
>>> a
{'name': 'Nothing here yet', 'age': 'Nothing here yet'}
Note that if you do not provide a second argument, the values will be auto-set to None:
>>> a = dict.fromkeys(["name", "age"])
>>> a
{'name': None, 'age': None}
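One thing to watch for is that fromkeys() assigns the very same value object to every key. With a mutable value such as a list, a change made through one key is visible through all of them. The sketch below contrasts this with a dictionary comprehension (a construct not covered in this chapter), which creates a separate list per key:

```python
# Every key refers to the same list object.
shared = dict.fromkeys(["a", "b"], [])
shared["a"].append(1)
print(shared)     # {'a': [1], 'b': [1]}

# A dictionary comprehension evaluates [] once per key instead.
separate = {key: [] for key in ["a", "b"]}
separate["a"].append(1)
print(separate)   # {'a': [1], 'b': []}
```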
Due to its key-value pair format, dictionaries can be used to present some forms of structured data in an easily readable way. This activity will help you write code to present frequency data for characters in strings.
Dictionaries are very good data structures for organizing data in a readable format. Your aim is to write a function called sentence_analyzer, which will take in a line of text and output a dictionary with each letter as a key, the value being the frequency of appearance of the letter.
The following is an example of what the output would look like:
sentence_analyzer("Pythonn")
>>> {
"P": 1,
"y": 1,
"t": 1,
"h": 1,
"o": 1,
"n": 2
}
The steps are as follows:
The following are some hints to help you out:
Solution for this activity can be found at page 287.
In Python versions before 3.7, the dictionaries that we have created so far are not guaranteed to maintain the insertion order of the key-value pairs that are added; since Python 3.7, the built-in dict does preserve it. Ordered dictionaries are dictionaries that maintain the insertion order of keys in every version and treat that order as significant. This means that when you are iterating through them, you will always access the keys in the order in which they were inserted.
The OrderedDict class is a dict subclass defined in the collections module of the standard library. We will use ordered dictionaries when it is vitally important to store and retrieve data in a predictable order; for example, when reading database entries.
The following section will describe how to work with them.
Creating an ordered dictionary is as easy as creating an instance of the OrderedDict class and passing in key-value pairs:
>>> from collections import OrderedDict
>>> a = OrderedDict(name="Zeus", role="god")
>>> a
OrderedDict([('name', 'Zeus'), ('role', 'god')])
Everything about OrderedDict, except for it maintaining an internal key order, is similar to normal dictionaries. However, if you inspect the attributes defined on it using dir(), you may see some new ones that were not in the normal dict() class that we looked at previously:
['__class__', '__contains__', '__delattr__', '__delitem__',
'__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
'__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__',
'__setattr__', '__setitem__', '__sizeof__', '__str__',
'__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'move_to_end', 'pop', 'popitem', 'setdefault', 'update', 'values']
One of the new attributes is move_to_end, which moves a key contained in the OrderedDict from its current position to the very end of the dictionary.
You can look up more information about these attributes at https://docs.python.org/3/library/collections.html#collections.
Note that when you are checking whether two OrderedDict objects are equal, the order of keys is also considered. For a normal dict, having the same key-value pairs is enough to declare equality, but two OrderedDict objects that hold the same pairs in a different order are not equal.
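The following sketch illustrates both points: order-sensitive equality and move_to_end(). It builds two OrderedDict objects with the same pairs in a different order, compares them, and then moves a key to the end of the first one so that the orders match:

```python
from collections import OrderedDict

a = OrderedDict(name="Zeus", role="god")
b = OrderedDict(role="god", name="Zeus")

equal_as_ordered = (a == b)
equal_as_plain = (dict(a) == dict(b))
print(equal_as_ordered)   # False: same pairs, different order
print(equal_as_plain)     # True: plain dicts ignore order when comparing

a.move_to_end("name")     # "name" becomes the last key in a
print(list(a.keys()))     # ['role', 'name']
print(a == b)             # True: the orders now match
```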
Suppose that you have data in two different places and you need to work with both at the same time. Python gives you the ability to extend dictionaries with data from each other. This activity will help you learn how to combine dictionaries, or extend one dictionary with the contents of another.
Write a function called dictionary_masher that will take two dictionaries and return a single dictionary with non-duplicated keys.
The following shows an example output:
>>> dictionary_masher({"name": "Amos"}, {"age": 100})
>>>
{
"name": "Amos",
"age": 100
}
Solution for this activity can be found at page 287.
In this section, we are going to cover sets, which are unique data structures with interesting properties.
Let's begin our journey into sets by looking at how to create sets, how to read data from them, and how to remove data from them.
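In this exercise, we will create a set. We can do so by using the set() function or the curly bracket notation. Note that empty curly brackets, {}, create an empty dictionary, so an empty set has to be created with set():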
>>> a = set([1,2,3])
>>> a
{1, 2, 3}
>>> b = set((1,2,2,3,4))
>>> b
{1, 2, 3, 4}
Note that in set b, all duplicated values in the original tuple were dropped.
>>> c = {'a', 'b', 'c'}
>>> c
{'c', 'b', 'a'}
>>> a = set("A random string")
>>> a
{'A', 'n', 'm', ' ', 's', 'a', 'r', 'o', 'g', 'd', 't', 'i'}
Passing a dictionary to the set() method will create a set of its keys.
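As a brief illustration, the sketch below reuses a dictionary from earlier in the chapter. To collect the values rather than the keys, you can pass the result of values() to set(), provided that every value is hashable:

```python
d = {"state": "NY", "city": "New York"}

keys = set(d)               # iterating over a dictionary yields its keys
values = set(d.values())    # works here because the values are strings

print(keys == {"state", "city"})      # True
print(values == {"NY", "New York"})   # True
```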
['__and__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__iand__', '__init__', '__init_subclass__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']
We will look at some of the attributes defined on the set object later on in this chapter.
In this exercise, we will look at how to add data to a set by using the set.add() and update() methods:
>>> a = {1,2,3}
>>> a
{1, 2, 3}
>>> a.add(4)
>>> a
{1, 2, 3, 4}
>>> a
{1, 2, 3, 4}
>>> a.add(4)
>>> a
{1, 2, 3, 4}
>>> a = {1,2,3}
>>> a
{1, 2, 3}
>>> a.update([3,4,5,6])
>>> a
{1, 2, 3, 4, 5, 6}
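The difference between the two methods is easiest to see with a string. In the sketch below, add() inserts its argument as a single element, whereas update() iterates over its argument and inserts each item separately:

```python
a = {"x"}
a.add("yz")        # the whole string becomes one element
print(a == {"x", "yz"})        # True

b = {"x"}
b.update("yz")     # each character becomes its own element
print(b == {"x", "y", "z"})    # True
```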
Set objects do not support indexes, so you cannot access values from them using indexes, like you can in a list or a tuple. In this exercise, we will look at what other methods we can use to read data from a set:
a = {1,2,3,4}
for num in a:
print(num)
This will output the following:
1
2
3
4
>>> a
{1, 2, 3, 4}
>>> a.pop()
1
>>> a
{2, 3, 4}
>>> a.pop()
2
>>> a
{3, 4}
Sets have more utility in the actions that we can perform on them than as a store of data. This means that finding things like the union and intersection of items in sets gives us more insight into the data they hold.
Imagine that you have some data in a list that contains a lot of duplicates that you want to remove. However, you think that writing a for loop and building a new list without duplicates is overkill. Can you find a more efficient way to remove the duplicated values from the list?
The aim of this activity is to build a set out of a random set of items. Write a function called set_maker that takes a list and turns it into a set. Do not use a for loop or while loop to do so.
The following is an example output:
>>> set_maker([1, 1, 2, 2, 2, 3, 4, 6, 5, 5])
{1, 2, 3, 4, 5, 6}
Solution for this activity can be found at page 287.
There are other ways to remove data from sets without using pop(), especially if you just want to remove the data and not return it. In this exercise, we will look at how we can do this:
>>> a = {1,2,3}
>>> a.remove(3)
>>> a
{1, 2}
>>> a
{1, 2}
>>> a.remove(3)
Traceback (most recent call last):
File "<input>", line 1, in <module>
a.remove(3)
KeyError: 3
>>> a = {1,2,3}
>>> a.discard(2)
>>> a
{1, 3}
>>> a.discard("nonexistent item")
>>> a
{1, 3}
>>> a = {1,2,3,4,5,6}
>>> a
{1, 2, 3, 4, 5, 6}
>>> a.clear()
>>> a
set()
Mathematics has a section called Set Theory, which is dedicated to the study of sets and their properties. You can read more about the mathematical aspects of sets at http://www.math-only-math.com/sets.html.
In this section, we will look at all of the different operations we can perform on a set. Let's begin.
As we stated earlier, a union between sets is the set of all items/elements in both sets.
A union can be represented by the following Venn diagram:
For example, if set A = {1,2,3,4,5,6} and set B = {1,2,3,7,8,9,10}, then A ∪ B will be {1,2,3,4,5,6,7,8,9,10}.
To achieve union between sets in Python, we can use the union method, which is defined on set objects:
>>> a = {1,2,3,4,5,6}
>>> b = {1,2,3,7,8,9,10}
>>> a.union(b)
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
>>> b.union(a)
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Another way to achieve union between sets in Python is to use the | operator:
>>> a = {1,2,3,4,5,6}
>>> b = {1,2,3,7,8,9,10}
>>> a | b
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
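The method and the operator are not fully interchangeable. As the sketch below suggests, union() accepts any number of iterables of any kind, while the | operator requires a set on both sides and raises a TypeError otherwise:

```python
a = {1, 2, 3}

# union() accepts lists, tuples, and other iterables, several at a time.
combined = a.union([3, 4], (5, 6))
print(combined == {1, 2, 3, 4, 5, 6})   # True
print(a == {1, 2, 3})                   # True: a itself is unchanged

# The | operator only works between sets.
try:
    a | [3, 4]
except TypeError:
    print("the | operator needs a set on both sides")
```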
An intersection of sets is the set of all items that appear in all of the sets, that is, what they have in common. An intersection can be represented by the following Venn diagram, with the purple area being the intersection:
As in our previous example, if set A = {1,2,3,4,5,6} and set B = {1,2,3,7,8,9,10}, then A ∩ B will be {1,2,3}.
To find the intersection between sets, we can use the intersection method, which is defined on set objects:
>>> a = {1,2,3,4,5,6}
>>> b = {1,2,3,7,8,9,10}
>>> a.intersection(b)
{1, 2, 3}
>>> b.intersection(a)
{1, 2, 3}
To find the intersection between sets, you can also use the & operator:
>>> a = {1,2,3,4,5,6}
>>> b = {1,2,3,7,8,9,10}
>>> a & b
{1, 2, 3}
The difference between two sets is basically what is in one set and not in the other.
If set A = {1,2,3,4,5,6} and set B = {1,2,3,7,8,9,10}, A – B will be the set of items that are only in A, that is, {4,5,6}. Similarly, B – A will be the set {7,8,9,10}.
The following diagram illustrates A - B:
Programmatically, in Python, we can use the - operator or the difference() method:
>>> a = {1,2,3,4,5,6}
>>> b = {1,2,3,7,8,9,10}
>>> a - b
{4, 5, 6}
>>> b - a
{8, 9, 10, 7}
>>> a.difference(b)
{4, 5, 6}
>>> b.difference(a)
{8, 9, 10, 7}
You can also get the symmetric difference between sets, which is the set of items that appear in exactly one of the sets; that is, everything in the union that is not in the intersection:
>>> a = {1,2,3,4,5,6}
>>> b = {1,2,3,7,8,9,10}
>>> a.symmetric_difference(b)
{4, 5, 6, 7, 8, 9, 10}
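The symmetric difference also has an operator form, ^. The sketch below checks that it matches the method, and that it can be expressed in terms of the union, intersection, and difference operations covered earlier:

```python
a = {1, 2, 3, 4, 5, 6}
b = {1, 2, 3, 7, 8, 9, 10}

sym = a ^ b   # operator form of symmetric_difference()

print(sym == a.symmetric_difference(b))   # True
print(sym == (a | b) - (a & b))           # True: union minus intersection
print(sym == (a - b) | (b - a))           # True: both one-sided differences
```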
The issubset() method can be used to check whether all of one set's elements exist in another set (that is, whether the set is a subset of another):
>>> a = {1,2,3,4,5,6,7,8,9,10}
>>> b = {5,2,10}
>>> a.issubset(b)
False
>>> b.issubset(a)
True
In this example, every element of b is also an element of a. Therefore, b is a subset of a, and we call a a superset of b:
>>> a.issuperset(b)
True
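These checks also have operator forms. The following sketch uses the same two sets to show <= and >= (subset and superset), < (proper subset, which excludes equal sets), and the related isdisjoint() method, which reports whether two sets share no elements:

```python
a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
b = {5, 2, 10}

print(b <= a)   # True: b is a subset of a
print(a >= b)   # True: a is a superset of b
print(b < a)    # True: b is a proper subset (a has extra elements)
print(a <= a)   # True: every set is a subset of itself
print(a < a)    # False: but never a proper subset of itself

print(b.isdisjoint({11, 12}))   # True: no elements in common
print(b.isdisjoint(a))          # False
```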
You can check whether the two sets are equivalent by using the == operator, and whether they are not equivalent by using the != operator.
The following is an example:
>>> a = {1,2,3}
>>> b = a.copy()
>>> c = {"money", "fame"}
>>> a == b
True
>>> a == c
False
>>> c != a
True
The copy() method, as used on sets, produces a shallow copy of a set, much like the dictionary's copy method. A shallow copy means that only references to values are copied, not the values themselves.
You can update a set with values from the results of set operations by using the special update operations defined on the set.
These methods are as follows:
>>> a = {1,2,3}
>>> b = {3,4,5}
>>> a - b
{1, 2}
>>> a.difference_update(b)
>>> a
{1, 2}
>>> a = {1,2,3}
>>> b = {3,4,5}
>>> a.intersection(b)
{3}
>>> a.intersection_update(b)
>>> a
{3}
>>> a = {1,2,3}
>>> b = {3,4,5}
>>> a.symmetric_difference(b)
{1, 2, 4, 5}
>>> a.symmetric_difference_update(b)
>>> a
{1, 2, 4, 5}
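Each of these update methods has an augmented assignment operator that modifies the set on the left in place. The sketch below walks through them; sorted() is used only to print the elements in a predictable order:

```python
b = {3, 4, 5}

a = {1, 2, 3}
a -= b              # same effect as a.difference_update(b)
print(sorted(a))    # [1, 2]

a = {1, 2, 3}
a &= b              # same effect as a.intersection_update(b)
print(sorted(a))    # [3]

a = {1, 2, 3}
a ^= b              # same effect as a.symmetric_difference_update(b)
print(sorted(a))    # [1, 2, 4, 5]

a = {1, 2, 3}
a |= b              # same effect as a.update(b)
print(sorted(a))    # [1, 2, 3, 4, 5]
```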
Frozen sets are just like sets, and they support all other set operations. However, they are immutable, and they do not support adding or removing items. Frozen sets are useful for holding items that do not need to change; for example, a set containing the names of states in the United States.
To create a frozen set, you can call the built-in frozenset() function with an iterable:
>>> a = frozenset([1,2,3])
>>> a
frozenset({1, 2, 3})
>>> dir(a)
['__and__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'copy', 'difference', 'intersection', 'isdisjoint', 'issubset', 'issuperset', 'symmetric_difference', 'union']
As you can see from the output of dir(), methods that modify a set, such as add, update, pop, and discard, are not defined on the frozen set.
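Because a frozen set cannot change, it is hashable, which means that it can serve as a dictionary key or as a member of another set; a regular set cannot. The following sketch uses a made-up distance table keyed by unordered pairs of place names:

```python
# Hypothetical distance table keyed by unordered pairs of place names.
distances = {frozenset({"A", "B"}): 5}
print(distances[frozenset({"B", "A"})])   # 5: element order does not matter

# Frozen sets can be members of a set; equal ones are stored only once.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))   # 2

# Methods that would modify the frozen set do not exist.
try:
    frozenset([1, 2, 3]).add(4)
except AttributeError:
    print("frozenset has no add() method")
```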
The aim of this activity is to implement an algorithm that returns the union of elements in a collection.
We will create a function called find_union(), which takes two lists and returns a list of all of the elements in both lists, with no duplicates. Do not use the built-in set function.
The steps are as follows:
The following shows an example output:
>>> find_union([1, 2, 3, 4], [3, 4, 5, 6])
[1, 2, 3, 4, 5, 6]
Solution for this activity can be found at page 288.
In this chapter, we covered dictionaries and their types (the built-in dict and the specialized OrderedDict). We also looked at attributes defined on dictionary objects and their use cases; for example, update and setdefault. Using these attributes, we learned how to iterate through dictionaries and modify them to achieve particular goals.
We also covered sets in this chapter, which are collections of unique and unordered items. We covered operations that you can perform on sets, such as finding unions and intersections, and other specialized operations, such as finding the difference and symmetric difference. We also looked at what frozen sets are, and the potential uses for them.
In the next chapter, we will begin our journey into object-oriented programming with Python, and we will look at how Python implements OOP concepts, such as classes and inheritance.