Chapter 4
In This Chapter
Using numbers and logic
Interacting with strings
Delving into dates
Developing modular code
Making decisions and performing tasks repetitively
Organizing information into sets, lists, and tuples
Iterating through data
Making data easier to find using dictionaries
Chapter 3 helped you create a Python installation that’s specific to data science. However, you can use this installation to perform common Python tasks as well, and that’s actually the best way to test your setup to know that it works as intended. If you already know Python, you might be able to skip this chapter and move on to the next chapter; however, it’s probably best to skim the material as a minimum and test some of the examples, just to be sure you have a good installation.
The focus of this chapter is to provide you with a good overview of how Python works as a language. Of course, part of that focus is how you use Python to solve data science problems. However, you can’t use this book to learn Python from scratch. To learn Python from scratch, you need a book such as my book Beginning Programming with Python For Dummies (published by John Wiley & Sons, Inc.) or a tutorial such as the one at https://docs.python.org/2/tutorial/. The chapter assumes that you’ve worked with other programming languages and have at least an idea of how Python works. This limitation aside, the chapter gives you a good reminder of how things work in Python, which is all that many people really need.
Data science involves working with data of various sorts, but much of the work involves numbers. In addition, you use logical values to make decisions about the data you use. For example, you might need to know whether two values are equal or whether one value is greater than another value. Python supports these number and logic value types:
Integer: Python stores whole numbers in the int data type. On most platforms, you can store numbers between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807 within an int (which is the range of values that fits in a 64-bit variable).
Floating-point value: Python stores numbers that include a decimal portion in the float data type. The maximum value that a floating-point variable can contain is ±1.7976931348623157 × 10^308 and the minimum value that a floating-point variable can contain is ±2.2250738585072014 × 10^-308 on most platforms.
Complex number: A complex number pairs a real part with an imaginary part. For a refresher on complex numbers, see http://www.mathsisfun.com/numbers/complex-numbers.html. The imaginary part of a complex number always appears with a j after it. So, if you want to create a complex number with 3 as the real part and 4 as the imaginary part, you make an assignment like this: myComplex = 3 + 4j.
Boolean value: Python stores logical values in the bool type. A variable of this type can contain only two values: True or False. You can assign a value by using the True or False keywords, or you can create an expression that defines a logical idea that equates to true or false. For example, you could say myBool = 1 > 2, which would equate to False because 1 is most definitely not greater than 2.
Now that you have the basics down, it’s time to see the data types in action. The following paragraphs provide a quick overview of how you can work with both numeric and logical data in Python.
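As a quick check of these types, here is a minimal sketch you can type at the prompt. The variable names are illustrative only, and sys.float_info is used here simply to display the floating-point limits on your platform.

```python
import sys

my_int = 9223372036854775807   # largest value that fits in a signed 64-bit integer
my_float = 1.5                 # a number with a decimal portion is a float
my_complex = 3 + 4j            # real part 3, imaginary part 4
my_bool = 1 > 2                # a comparison produces a bool

# Show the type Python chose for each value.
print(type(my_int), type(my_float), type(my_complex), type(my_bool))

# A complex number exposes its two parts as floats.
print(my_complex.real, my_complex.imag)

# Largest and smallest (normalized) positive float on this platform.
print(sys.float_info.max, sys.float_info.min)
```

The last line should report limits close to the ones quoted above, although the exact values depend on your platform.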
When working with applications, you store information in variables. A variable is a kind of storage box: whenever you want to work with the information, you access it using the variable, and if you have new information you want to store, you put it in a variable. Changing information means accessing the variable first and then storing the new value in the variable. To store data in a variable, you assign the data to it using any of a number of assignment operators (special symbols that tell how to store the data). Table 4-1 shows the assignment operators that Python supports.
Table 4-1 Python Assignment Operators
Operator | Description | Example (MyVar starts at 5)
= | Assigns the value found in the right operand to the left operand | MyVar = 2 results in MyVar containing 2
+= | Adds the value found in the right operand to the value found in the left operand and places the result in the left operand | MyVar += 2 results in MyVar containing 7
-= | Subtracts the value found in the right operand from the value found in the left operand and places the result in the left operand | MyVar -= 2 results in MyVar containing 3
*= | Multiplies the value found in the right operand by the value found in the left operand and places the result in the left operand | MyVar *= 2 results in MyVar containing 10
/= | Divides the value found in the left operand by the value found in the right operand and places the result in the left operand | MyVar /= 2 results in MyVar containing 2.5
%= | Divides the value found in the left operand by the value found in the right operand and places the remainder in the left operand | MyVar %= 2 results in MyVar containing 1
**= | Raises the value found in the left operand to the power of the value found in the right operand and places the result in the left operand | MyVar **= 2 results in MyVar containing 25
//= | Divides the value found in the left operand by the value found in the right operand and places the integer (whole number) result in the left operand | MyVar //= 2 results in MyVar containing 2
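The following short sketch tries a few of these operators. It resets an illustrative variable, my_var, to 5 before each operation so that every result can be read on its own.

```python
my_var = 5
my_var += 2        # same as my_var = my_var + 2
print(my_var)      # 7

my_var = 5
my_var **= 2       # same as my_var = my_var ** 2
print(my_var)      # 25

my_var = 5
my_var //= 2       # floor division keeps only the whole number
print(my_var)      # 2

my_var = 5
my_var %= 2        # keeps only the remainder
print(my_var)      # 1
```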
Storing information in variables makes it easily accessible. However, in order to perform any useful work with the variable, you usually perform some type of arithmetic operation on it. Python supports the common arithmetic operators you use to perform tasks by hand. They appear in Table 4-2.
Table 4-2 Python Arithmetic Operators
Operator | Description | Example
+ | Adds two values together | 5 + 2 = 7
- | Subtracts the right-hand operand from the left-hand operand | 5 - 2 = 3
* | Multiplies the right-hand operand by the left-hand operand | 5 * 2 = 10
/ | Divides the left-hand operand by the right-hand operand | 5 / 2 = 2.5
% | Divides the left-hand operand by the right-hand operand and returns the remainder | 5 % 2 = 1
** | Raises the left-hand operand to the power of the right-hand operand | 5 ** 2 = 25
// | Performs integer division, in which the left operand is divided by the right operand and only the whole number is returned (also called floor division) | 5 // 2 = 2
Sometimes you need to interact with just one variable. Python supports a number of unary operators, those that work with just one variable, as shown in Table 4-3.
Table 4-3 Python Unary Operators
Operator | Description | Example
~ | Inverts the bits in a number so that all of the 0 bits become 1 bits and vice versa. | ~4 results in a value of -5
- | Negates the original value so that positive becomes negative and vice versa. | -(-4) results in 4 and -4 results in -4
+ | Is provided purely for the sake of completeness. This operator returns the same value that you provide as input. | +4 results in a value of 4
Computers can perform other sorts of math tasks because of the way in which the processor works. It’s important to remember that computers store data as a series of individual bits. Python makes it possible to access these individual bits using bitwise operators, as shown in Table 4-4.
Table 4-4 Python Bitwise Operators
Operator | Description | Example
& (And) | Determines whether both individual bits within two operands are true and sets the resulting bit to true when they are. | 0b1100 & 0b0110 = 0b0100
| (Or) | Determines whether either of the individual bits within two operands is true and sets the resulting bit to true when one is. | 0b1100 | 0b0110 = 0b1110
^ (Exclusive or) | Determines whether just one of the individual bits within two operands is true and sets the resulting bit to true when one is. When both bits are true or both bits are false, the result is false. | 0b1100 ^ 0b0110 = 0b1010
~ (One’s complement) | Calculates the one’s complement value of a number. | ~0b1100 = -0b1101 and ~0b0110 = -0b0111
<< (Left shift) | Shifts the bits in the left operand left by the value of the right operand. All new bits are set to 0. Python integers grow as needed, so no bits are lost off the left end. | 0b00110011 << 2 = 0b11001100
>> (Right shift) | Shifts the bits in the left operand right by the value of the right operand. For positive numbers, all new bits are set to 0, and all bits that flow off the right end are lost. | 0b00110011 >> 2 = 0b00001100
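You can confirm the table entries at the prompt. This sketch uses the built-in bin() function to show each result in binary; note that bin() drops leading zeros, so 0b0100 appears as 0b100. The names a and b are illustrative.

```python
a = 0b1100   # decimal 12
b = 0b0110   # decimal 6

print(bin(a & b))            # 0b100   (bits set in both)
print(bin(a | b))            # 0b1110  (bits set in either)
print(bin(a ^ b))            # 0b1010  (bits set in exactly one)
print(~a)                    # -13     (same value as -0b1101)
print(bin(0b00110011 << 2))  # 0b11001100
print(bin(0b00110011 >> 2))  # 0b1100
```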
Using arithmetic to modify the content of variables is a kind of data manipulation. To determine the effect of data manipulation, a computer must compare the current state of the variable against its original state or the state of a known value. In some cases, detecting the status of one input against another is also necessary. All these operations check the relationship between two variables, so the resulting operators are relational operators, as shown in Table 4-5.
Table 4-5 Python Relational Operators
Operator | Description | Example
== | Determines whether two values are equal. Notice that the relational operator uses two equals signs. A mistake many developers make is using just one equals sign, which is the assignment operator rather than a comparison. | 1 == 2 is False
!= | Determines whether two values are not equal. Some older versions of Python allowed you to use the <> operator in place of the != operator. Using the <> operator results in an error in current versions of Python. | 1 != 2 is True
> | Verifies that the left operand value is greater than the right operand value. | 1 > 2 is False
< | Verifies that the left operand value is less than the right operand value. | 1 < 2 is True
>= | Verifies that the left operand value is greater than or equal to the right operand value. | 1 >= 2 is False
<= | Verifies that the left operand value is less than or equal to the right operand value. | 1 <= 2 is True
Sometimes a relational operator can’t tell the whole story of the comparison of two values. For example, you might need to check a condition in which two separate comparisons are needed, such as MyAge > 40 and MyHeight < 74. The need to add conditions to the comparison requires a logical operator of the sort shown in Table 4-6.
Table 4-6 Python Logical Operators
Operator | Description | Example
and | Determines whether both operands are true. | True and True is True; True and False is False; False and True is False; False and False is False
or | Determines whether at least one of two operands is true. | True or True is True; True or False is True; False or True is True; False or False is False
not | Negates the truth value of a single operand. A true value becomes false and a false value becomes true. | not True is False; not False is True
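Here is a small sketch that combines relational and logical operators, using the MyAge and MyHeight comparison mentioned earlier. The values 45 and 70 are made up for the example.

```python
MyAge = 45
MyHeight = 70

print(MyAge > 40)                      # True
print(MyHeight < 74)                   # True
print(MyAge > 40 and MyHeight < 74)    # True: both comparisons are true
print(MyAge > 50 or MyHeight < 74)     # True: the second comparison is true
print(not MyAge > 40)                  # False: negates a true comparison
```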
Computers provide order to comparisons by making some operators more significant than others. The ordering of operators is operator precedence. Table 4-7 shows the operator precedence of all the common Python operators, including a few you haven’t seen as part of a discussion yet. When making comparisons, always consider operator precedence because otherwise, the assumptions you make about a comparison outcome will likely be wrong.
Table 4-7 Python Operator Precedence
Operator (highest precedence first) | Description
() | You use parentheses to group expressions and to override the default precedence so that you can force an operation of lower precedence (such as addition) to take precedence over an operation of higher precedence (such as multiplication).
** | Exponentiation raises the value of the left operand to the power of the right operand.
~ + - | Unary operators interact with a single variable or expression.
* / % // | Multiply, divide, modulo, and floor division.
+ - | Addition and subtraction.
>> << | Right and left bitwise shift.
& | Bitwise AND.
^ | Bitwise exclusive OR.
| | Bitwise OR.
<= < > >= == != is is not in not in | Comparison, equality, identity, and membership operators. These all share one precedence level.
not | Logical NOT.
and | Logical AND.
or | Logical OR.
= %= /= //= -= += *= **= | Assignment. The expression on the right side is evaluated completely before the assignment takes place.
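A few one-line experiments make precedence easier to remember. The expressions in this sketch were chosen purely for illustration; each comment shows how Python groups the expression.

```python
print(2 + 3 * 4)                 # 14: multiplication happens before addition
print((2 + 3) * 4)               # 20: parentheses override the default order
print(-2 ** 2)                   # -4: exponentiation happens before unary minus
print(1 + 1 == 2)                # True: arithmetic happens before comparison
print(not True or True)          # True: grouped as (not True) or True
print(True or False and False)   # True: grouped as True or (False and False)
```

When an expression is hard to read, adding parentheses costs nothing and removes the guesswork.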
Of all the data types, strings are the most easily understood by humans and not understood at all by computers. A string is simply any grouping of characters you place within single or double quotation marks. For example, myString = "Python is a great language." assigns a string of characters to myString.
The computer doesn’t see letters at all. Every letter you use is represented by a number in memory. For example, the letter A is actually the number 65. To see this for yourself, type ord("A") at the Python prompt and press Enter. You see 65 as output. It’s possible to convert any single letter to its numeric equivalent using the ord() command.
Because the computer doesn’t really understand strings, but strings are so useful in writing applications, you sometimes need to convert a string to a number. You can use the int() and float() commands to perform this conversion. For example, if you type myInt = int("123") and press Enter at the Python prompt, you create an int named myInt that contains the value 123.
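The following sketch gathers these conversions in one place. The chr() and str() functions don’t appear in the text above; they are the built-in reverse conversions (number to character and number to string) and are included here only for comparison.

```python
print(ord("A"))             # 65: the number behind the letter A
print(chr(65))              # A: the letter behind the number 65

myInt = int("123")          # string to whole number
myFloat = float("1.5")      # string to floating-point number

print(myInt + 1)            # 124: myInt now behaves as a number
print(myFloat * 2)          # 3.0
print(str(myInt) + "4")     # 1234: converted back to a string, + joins text
```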
As with numbers, you can use some special operators with strings (and many objects). The membership operators make it possible to determine when a string contains specific content. Table 4-8 shows these operators.
Table 4-8 Python Membership Operators
Operator | Description | Example
in | Determines whether the value in the left operand appears in the sequence found in the right operand | "Hello" in "Hello Goodbye" is True
not in | Determines whether the value in the left operand is missing from the sequence found in the right operand | "Hello" not in "Hello Goodbye" is False
The discussion in this section also makes it obvious that you need to know the kind of data that variables contain. You use the identity operators to perform this task, as shown in Table 4-9.
Table 4-9 Python Identity Operators
Operator | Description | Example
is | Evaluates to true when the left and right operands refer to the same object. In the example, the type that type(2) returns is the very same object as int. | type(2) is int is True
is not | Evaluates to true when the left and right operands refer to different objects | type(2) is not int is False
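This sketch exercises both the membership and identity operators. The two list variables are illustrative additions that show the difference between equal values (==) and the same object (is).

```python
Greeting = "Hello Goodbye"

print("Hello" in Greeting)        # True
print("Hello" not in Greeting)    # False

print(type(2) is int)             # True
print(type(2) is not int)         # False

ListA = [1, 2]
ListB = [1, 2]
print(ListA == ListB)             # True: the contents are equal
print(ListA is ListB)             # False: they are two separate objects
```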
Dates and times are items that most people work with quite a bit. Society bases almost everything on the date and time that a task needs to be or was completed. We make appointments and plan events for specific dates and times. Most of our day revolves around the clock. Because of the time-oriented nature of humans, it’s a good idea to look at how Python deals with interacting with date and time (especially storing these values for later use). As with everything else, computers understand only numbers — date and time don’t really exist.
Computers do have clocks inside them, but the clocks are for the humans using the computer. Yes, some software also depends on the clock, but again, the emphasis is on human needs rather than anything the computer might require. To get the current time, first type import datetime and press Enter to make the date and time features available. Then type datetime.datetime.now() and press Enter. You see the full date and time information as found on your computer’s clock (see Figure 4-1).
You may have noticed that the date and time are a little hard to read in the existing format. Say that you want to get just the current date, and in a readable format. To accomplish this task, you access just the date portion of the output and convert it into a string. Type str(datetime.datetime.now().date()) and press Enter. Figure 4-2 shows that you now have something a little more usable.
Interestingly enough, Python also has a time() method, which you can call in place of date() to obtain just the current time. You can obtain separate values for each of the components that make up date and time using the day, month, year, hour, minute, second, and microsecond values. Later chapters help you understand how to use these various date and time features to make working with data science applications easier.
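The following sketch pulls these pieces together. The variable name Now is illustrative, and the output depends on your computer’s clock, so no sample output is shown.

```python
import datetime

Now = datetime.datetime.now()     # full date and time

print(str(Now.date()))            # just the date, as YYYY-MM-DD
print(str(Now.time()))            # just the time

# Individual components are available as attributes.
print(Now.year, Now.month, Now.day)
print(Now.hour, Now.minute, Now.second, Now.microsecond)
```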
To manage information properly, you need to organize the tools used to perform the required tasks. Each line of code that you create performs a specific task, and you combine these lines of code to achieve a desired result. Sometimes you need to repeat the instructions with different data, and in some cases your code becomes so long that it’s hard to keep track of what each part does. Functions serve as organization tools that keep your code neat and tidy. In addition, functions make it easy to reuse the instructions you’ve created as needed with different data. This section of the chapter tells you all about functions. More important, in this section you start creating your first serious applications in the same way that professional developers do.
You go to your closet, take out pants and shirt, remove the labels, and put them on. At the end of the day, you take everything off and throw it in the trash. Hmmm … that really isn’t what most people do. Most people take the clothes off, wash them, and then put them back into the closet for reuse. Functions are reusable, too. No one wants to keep repeating the same task; it becomes monotonous and boring. When you create a function, you define a package of code that you can use over and over to perform the same task. All you need to do is tell the computer to perform a specific task by telling it which function to use. The computer faithfully executes each instruction in the function absolutely every time you ask it to do so.
At one time, computer programs didn’t include the concept of code reusability. As a result, developers had to keep reinventing the same code. It didn’t take long for someone to come up with the idea of functions, though, and the concept has evolved over the years until functions have become quite flexible. You can make functions do anything you want. Code reusability is a necessary part of applications to
In fact, functions do a whole list of things for applications in the form of reusability. As you work through the examples in this book, you see how reusability makes your life significantly easier. If not for reusability, you’d still be programming by plugging 0s and 1s into the computer by hand.
Creating a function doesn’t require much work. To see how functions work, open a copy of IPython and type in the following code (pressing Enter at the end of each line):
def SayHello():
    print('Hello There!')
To end the function, you press Enter a second time after the last line. A function begins with the keyword def (for define). You provide a function name, parentheses that can contain function arguments (data used in the function), and a colon. The editor automatically indents the next line for you. Python relies on whitespace to define code blocks (statements that are associated with each other in a function).
You can now use the function. Simply type SayHello() and press Enter. The parentheses after the function name are important because they tell Python to execute the function. Without them, Python treats the name as a reference to the function object and simply tells you what it is. Figure 4-3 shows the output from this function.
Functions can accept arguments (additional bits of data) and return values. The ability to exchange data makes functions far more useful than they otherwise might be. The following sections describe how to call functions in a variety of ways to both send and receive data.
A function can require the caller to provide arguments to it. A required argument is a variable that must contain data for the function to work. Open a copy of IPython and type the following code:
def DoSum(Value1, Value2):
    return Value1 + Value2
You have a new function, DoSum(). This function requires that you provide two arguments to use it. At least, that’s what you’ve heard so far. Type DoSum() and press Enter. You see an error message, as shown in Figure 4-4, telling you that DoSum requires two arguments.
Trying DoSum() with just one argument would result in another error message. In order to use DoSum() you must provide two arguments. To see how this works, type DoSum(1, 2) and press Enter. You see the result in Figure 4-5.
Notice that DoSum() provides an output value of 3 when you supply 1 and 2 as inputs. The return statement provides the output value. Whenever you see return in a function, you know the function provides an output value.
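If you want to see the missing-argument error without stopping your session, you can catch it. This sketch repeats DoSum() and wraps the faulty call in try...except, a technique the text above doesn’t cover; the exact wording of the error message varies between Python versions.

```python
def DoSum(Value1, Value2):
    return Value1 + Value2

print(DoSum(1, 2))    # 3

try:
    DoSum(1)          # one argument is missing
except TypeError as Err:
    # Python raises TypeError when a required argument isn't supplied.
    print('Error:', Err)
```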
As your functions become more complex and the methods to use them do as well, you may want to provide a little more control over precisely how you call the function and provide arguments to it. Until now, you have used positional arguments, which means that you have supplied values in the order in which they appear in the argument list for the function definition. However, Python also has a method for sending arguments by keyword. In this case, you supply the name of the argument followed by an equals sign (=) and the argument value. To see how this works, open a copy of IPython and type the following code:
def DisplaySum(Value1, Value2):
    print(str(Value1) + ' + ' + str(Value2) + ' = ' +
          str((Value1 + Value2)))
Notice that the print() function argument includes a list of items to print and that those items are separated by plus signs (+). In addition, the arguments are of different types, so you must convert them using the str() function. Python makes it easy to mix and match arguments in this manner. This function also introduces the concept of automatic line continuation. The print() function actually appears on two lines, and Python automatically continues the function from the first line to the second because the opening parenthesis hasn’t been closed yet.
Next, it’s time to test DisplaySum(). Of course, you want to try the function using positional arguments first, so type DisplaySum(2, 3) and press Enter. You see the expected output of 2 + 3 = 5. Now type DisplaySum(Value2 = 3, Value1 = 2) and press Enter. Again, you receive the output 2 + 3 = 5 even though the position of the arguments has been reversed.
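The sketch below shows the same idea with a function that returns its text instead of printing it, which makes the result easier to reuse. FormatSum() is a hypothetical variation on DisplaySum(), not a function from the chapter. The last call mixes the two styles, which Python allows as long as positional arguments come first.

```python
def FormatSum(Value1, Value2):
    # Build the same text that DisplaySum() prints, but return it.
    return (str(Value1) + ' + ' + str(Value2) + ' = ' +
            str(Value1 + Value2))

print(FormatSum(2, 3))                    # positional arguments
print(FormatSum(Value2=3, Value1=2))      # keyword arguments, any order
print(FormatSum(2, Value2=3))             # positional first, then keyword
```

All three calls produce 2 + 3 = 5.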
Whether you make the call using positional arguments or keyword arguments, the functions to this point have required that you supply a value. Sometimes a function can use default values when a common value is available. Default values make the function easier to use and less likely to cause errors when a developer doesn’t provide an input. To create a default value, you simply follow the argument name with an equals sign and the default value. To see how this works, open a copy of IPython and type the following code:
def SayHello(Greeting = "No Value Supplied"):
    print(Greeting)
The SayHello() function provides an automatic value for Greeting when a caller doesn’t provide one. When someone tries to call SayHello() without an argument, it doesn’t raise an error. Type SayHello() and press Enter to see for yourself; you see the default message. Type SayHello("Howdy!") to see a normal response.
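Default values can also be mixed with required arguments, provided that the required ones come first in the definition. Greet() in this sketch is a hypothetical function used only to illustrate that rule.

```python
def Greet(Name, Greeting="Hello"):
    # Name is required; Greeting falls back to "Hello" when omitted.
    return Greeting + ', ' + Name + '!'

print(Greet('Ann'))                    # Hello, Ann!
print(Greet('Ann', 'Howdy'))           # Howdy, Ann!
print(Greet('Ann', Greeting='Hi'))     # Hi, Ann!
```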
In most cases, you know precisely how many arguments to provide with your function. It pays to work toward this goal whenever you can because functions with a fixed number of arguments are easier to troubleshoot later. However, sometimes you simply can’t determine how many arguments the function will receive at the outset. For example, when you create a Python application that works at the command line, the user might provide no arguments, the maximum number of arguments (assuming there is one), or any number of arguments in between.
Fortunately, Python provides a technique for sending a variable number of arguments to a function. You simply create an argument that has an asterisk in front of it, such as *VarArgs. The usual technique is to provide an additional argument that contains the number of arguments passed as an input. To see how this works, open a copy of IPython and type the following code:
def DisplayMulti(ArgCount = 0, *VarArgs):
    print('You passed ' + str(ArgCount) + ' arguments.',
          VarArgs)
Notice that the print() function displays a string and then the list of arguments. Because of the way this function is designed, you can type DisplayMulti() and press Enter to see that it’s possible to pass zero arguments. To see multiple arguments at work, type DisplayMulti(3, 'Hello', 1, True) and press Enter. The output of ('You passed 3 arguments.', ('Hello', 1, True)) shows that you need not pass values of any particular type.
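Because the extra arguments arrive as a tuple, you can also let Python count them with len() rather than passing the count yourself. DescribeArgs() is a hypothetical alternative shown here only as a sketch of that approach.

```python
def DescribeArgs(*VarArgs):
    # VarArgs is a tuple holding every argument the caller supplied.
    return ('You passed ' + str(len(VarArgs)) + ' arguments: ' +
            str(VarArgs))

print(DescribeArgs())                     # You passed 0 arguments: ()
print(DescribeArgs('Hello', 1, True))     # You passed 3 arguments: ('Hello', 1, True)
```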
Computer applications aren’t very useful if they perform precisely the same tasks the same number of times every time you run them. Yes, they can perform useful work, but life seldom offers situations in which conditions remain the same. In order to accommodate changing conditions, applications must make decisions and perform tasks a variable number of times. Conditional and loop statements make it possible to perform this task as described in the sections that follow.
You use “if” statements regularly in everyday life. For example, you may say to yourself, “If it’s Wednesday, I’ll eat tuna salad for lunch.” The Python if statement is a little less verbose, but it follows precisely the same pattern. To see how this works, open a copy of IPython and type the following code:
def TestValue(Value):
    if Value == 5:
        print('Value equals 5!')
    elif Value == 6:
        print('Value equals 6!')
    else:
        print('Value is something else.')
        print('It equals ' + str(Value))
Every Python if statement begins, oddly enough, with the word if. When Python sees if, it knows that you want it to make a decision. After the word if comes a condition. A condition simply states what sort of comparison you want Python to make. In this case, you want Python to determine whether Value contains the value 5.
The condition always ends with a colon (:). If you don’t provide a colon, Python can’t tell where the condition ends and reports a syntax error. After the colon come any tasks you want Python to perform.
You may need to perform multiple tasks using a single if statement. The elif clause makes it possible to add an additional condition and associated tasks. A clause is an addendum to a previous condition, which is an if statement in this case. The elif clause always provides a condition, just like the if statement, and it has its own associated set of tasks to perform.
Sometimes you need to do something when none of the conditions is met. In this case, you add the else clause. The else clause tells Python to do something in particular when the conditions of the if statement and any elif clauses aren’t met.
To see this function in action, type TestValue(1) and press Enter. You see the output from the else clause. Type TestValue(5) and press Enter. The output now reflects the if statement output. Type TestValue(6) and press Enter. The output now shows the results of the elif clause. The result is that this function is more flexible than previous functions in the chapter because it can make decisions.
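Here is the same decision structure as a sketch that returns its message instead of printing it, so that you can check all three paths in one short loop. DescribeValue() is a hypothetical variation on TestValue().

```python
def DescribeValue(Value):
    if Value == 5:
        return 'Value equals 5!'
    elif Value == 6:
        return 'Value equals 6!'
    else:
        # Reached only when neither condition above is true.
        return 'Value is something else. It equals ' + str(Value)

for Candidate in (1, 5, 6):
    print(DescribeValue(Candidate))
```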
Nesting is the process of placing a subordinate statement within another statement. You can nest any statement within any other statement in most cases. To see how this works, open a copy of IPython and type the following code:
def SecretNumber():
    One = int(input("Type a number between 1 and 10: "))
    Two = int(input("Type a number between 1 and 10: "))
    if (One >= 1) and (One <= 10):
        if (Two >= 1) and (Two <= 10):
            print('Your secret number is: ' + str(One * Two))
        else:
            print("Incorrect second value!")
    else:
        print("Incorrect first value!")
In this case, SecretNumber() asks you to provide two inputs. Yes, you can get inputs from a user when needed by using the input() function. The int() function converts the inputs to a number.
There are two levels of if statement this time. The first level checks for the validity of the number in One. The second level checks for the validity of the number in Two. When both One and Two have values between 1 and 10, then SecretNumber() outputs a secret number for the user.
To see SecretNumber() in action, type SecretNumber() and press Enter. Type 20 and press Enter when asked for the first input value, and type 10 and press Enter when asked for the second. You see an error message telling you that the first value is incorrect. Type SecretNumber() and press Enter again. This time, use values of 10 and 20. The function will tell you that the second input is incorrect. Try the same sequence again using input values of 10 and 10.
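To try the nested logic without typing at a prompt each time, you can pass the two numbers as arguments. CheckSecretNumber() is a hypothetical variation used only for this sketch; it also uses Python’s chained comparison (1 <= One <= 10), which means the same thing as (One >= 1) and (One <= 10).

```python
def CheckSecretNumber(One, Two):
    if 1 <= One <= 10:
        # The inner test runs only when the first value is valid.
        if 1 <= Two <= 10:
            return 'Your secret number is: ' + str(One * Two)
        else:
            return 'Incorrect second value!'
    else:
        return 'Incorrect first value!'

print(CheckSecretNumber(20, 10))    # Incorrect first value!
print(CheckSecretNumber(10, 20))    # Incorrect second value!
print(CheckSecretNumber(10, 10))    # Your secret number is: 100
```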
Sometimes you need to perform a task more than one time. You use the for loop statement when you need to perform a task a specific number of times. The for loop has a definite beginning and a definite end. The number of times that the loop executes depends on the number of elements in the variable you provide. To see how this works, open a copy of IPython and type the following code:
def DisplayMulti(*VarArgs):
    for Arg in VarArgs:
        if Arg.upper() == 'CONT':
            continue
            print('Continue Argument: ' + Arg)
        elif Arg.upper() == 'BREAK':
            break
            print('Break Argument: ' + Arg)
        print('Good Argument: ' + Arg)
In this case, the for loop attempts to process each element in VarArgs. Notice that there is a nested if statement in the loop and it tests for two ending conditions. In most cases, the code skips the if statement and simply prints the argument. However, when the if statement finds the words CONT or BREAK in the input values, it performs one of these two tasks:
continue: Forces the loop to continue from the current point of execution with the next entry in VarArgs.
break: Stops the loop from executing.
The while loop statement continues to perform tasks until such time that a condition is no longer true. As with the for statement, the while statement supports both the continue and break keywords for ending the loop prematurely. To see how this works, open a copy of IPython and type the following code:
def SecretNumber():
    GotIt = False
    while GotIt == False:
        One = int(input("Type a number between 1 and 10: "))
        Two = int(input("Type a number between 1 and 10: "))
        if (One >= 1) and (One <= 10):
            if (Two >= 1) and (Two <= 10):
                print('Secret number is: ' + str(One * Two))
                GotIt = True
                continue
            else:
                print("Incorrect second value!")
        else:
            print("Incorrect first value!")
        print("Try again!")
This is an expansion of the SecretNumber() function first described in the “Choosing between multiple options using nested decisions” section, earlier in this chapter. However, in this case, the addition of a while loop statement means that the function continues to ask for input until it receives a valid response.
To see how the while statement works, type SecretNumber() and press Enter. Type 20 and press Enter for the first prompt. Type 10 and press Enter for the second prompt. The example tells you that the first number is wrong and then tells you to try again. Try a second time using values of 10 and 20. This time the second number is wrong and you still need to try again. On the third try, use values of 10 and 10. This time you get a secret number. Notice that the use of a continue clause means that the application doesn’t tell you to try again.
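The following sketch shows continue and break in both kinds of loop without requiring keyboard input. CollectGoodArgs() and the Count variable are hypothetical names introduced for illustration.

```python
def CollectGoodArgs(*VarArgs):
    Good = []
    for Arg in VarArgs:
        if Arg.upper() == 'CONT':
            continue          # skip this entry and move to the next one
        elif Arg.upper() == 'BREAK':
            break             # leave the loop; later entries are ignored
        Good.append(Arg)
    return Good

print(CollectGoodArgs('Hello', 'Cont', 'Goodbye', 'Break', 'Last'))
# ['Hello', 'Goodbye']

Count = 0
while True:                   # loops until break is reached
    Count += 1
    if Count >= 3:
        break
print(Count)                  # 3
```

Notice that 'Last' never reaches the output because break ends the for loop before that entry is processed.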
Python provides a host of methods for storing data in memory. Each method has advantages and disadvantages. Choosing the most appropriate method for your particular need is important. The following sections discuss three common techniques used for storing data for data science needs.
Most people have used sets at one time or another in school to create lists of items that belong together. These lists then became the topic of manipulation using math operations such as intersection, union, difference, and symmetric difference. Sets are the best option to choose when you need to perform membership testing and remove duplicates from a list. You can’t perform sequence-related tasks using sets, such as indexing or slicing. To see how you can work with sets, start a copy of IPython and type the following code:
from sets import Set
SetA = Set(['Red', 'Blue', 'Green', 'Black'])
SetB = Set(['Black', 'Green', 'Yellow', 'Orange'])
SetX = SetA.union(SetB)
SetY = SetA.intersection(SetB)
SetZ = SetA.difference(SetB)
You now have five different sets to play with, each of which has some common elements. To see the results of each math operation, type print '{0}\n{1}\n{2}'.format(SetX, SetY, SetZ) and press Enter. You see one set printed on each line, like this:
Set(['Blue', 'Yellow', 'Green', 'Orange', 'Black', 'Red'])
Set(['Green', 'Black'])
Set(['Blue', 'Red'])
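Python also has a built-in set type that needs no import. The sets module is deprecated as of Python 2.6 and doesn’t exist in Python 3. The following sketch uses the built-in type to show symmetric difference, which the text mentions but the listing doesn’t demonstrate, along with duplicate removal and membership testing. The names SetW and Unique are made up for this illustration, and the code uses print() with a single argument so that it also runs on Python 3.

```python
# Built-in set type; no import is required.
SetA = {'Red', 'Blue', 'Green', 'Black'}
SetB = {'Black', 'Green', 'Yellow', 'Orange'}

# Symmetric difference: elements found in exactly one of the two sets.
SetW = SetA.symmetric_difference(SetB)

# Converting a list to a set discards the duplicate entries.
Unique = set(['Red', 'Blue', 'Red', 'Green', 'Blue'])

# Sets have no defined order, so sort them for a stable display.
print(sorted(SetW))
print(sorted(Unique))

# Membership testing uses the in operator.
print('Red' in SetA)
```

Sorting before printing is a convenience only; the set itself never stores its elements in a particular order.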
You can also test relationships between the various sets. For example, type SetA.issuperset(SetY) and press Enter. The output value of True tells you that SetA is a superset of SetY. Likewise, if you type SetA.issubset(SetX) and press Enter, you find that SetA is a subset of SetX.
It’s important to understand that sets are either mutable or immutable. All the sets in this example are mutable, which means that you can add or remove elements from them. For example, if you type SetA.add('Purple') and press Enter, SetA receives a new element. If you type SetA.issubset(SetX) and press Enter now, you find that SetA is no longer a subset of SetX because SetA has the 'Purple' element in it.
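To see the difference between a mutable and an immutable set, you can compare Python’s built-in set and frozenset types. This is a minimal sketch that uses the built-in types rather than the Set class from the listing above; the variable names Mutable, Frozen, Changed, and Lookup are assumptions chosen for the example.

```python
Mutable = set(['Red', 'Blue'])
Frozen = frozenset(['Red', 'Blue'])

# A mutable set can change in place.
Mutable.add('Purple')

# A frozenset has no add() method, so the attempt raises AttributeError.
try:
    Frozen.add('Purple')
except AttributeError:
    Changed = False
else:
    Changed = True

# Because it can never change, a frozenset can act as a dictionary key.
Lookup = {Frozen: 'primary pair'}

print(len(Mutable))
print(Changed)
print(Lookup[frozenset(['Blue', 'Red'])])
```

An immutable set is the one to choose when the set itself must serve as a dictionary key or as a member of another set.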
The Python specification defines a list as a kind of sequence. Sequences simply provide some means of allowing multiple data items to exist together in a single storage unit, but as separate entities. Think about one of those large mail holders you see in apartment buildings. A single mail holder contains a number of small mailboxes, each of which can contain mail. Python supports other kinds of sequences as well:
Of all the sequences, lists are the easiest to understand and are the most directly related to a real-world object. Working with lists helps you become better able to work with other kinds of sequences that provide greater functionality and improved flexibility. The point is that the data is stored in a list much as you would write it on a piece of paper — one item comes after another. The list has a beginning, a middle, and an end. As shown in the figure, the items are numbered. (Even if you might not normally number them in real life, Python always numbers the items for you.) To see how you can work with lists, start a copy of IPython and type the following code:
ListA = [0, 1, 2, 3]
ListB = [4, 5, 6, 7]
ListA.extend(ListB)
ListA
When you type the last line of code, you see the output of [0, 1, 2, 3, 4, 5, 6, 7]. The extend() function adds the members of ListB to ListA. Besides extending lists, you can also add to them using the append() function. Type ListA.append(-5) and press Enter. When you type ListA and press Enter again, you see that Python has added –5 to the end of the list. When you need to remove an item, you use the remove() function. For example, type ListA.remove(-5) and press Enter. When you list ListA again, you see that the added entry is gone.
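The difference between extend() and append() is easy to miss when the item you add is itself a list. The following short sketch illustrates it; the names Extended and Appended, and the two-item ListB, are made up for this example.

```python
ListA = [0, 1, 2, 3]
ListB = [4, 5]

# extend() adds each member of ListB as a separate item.
Extended = list(ListA)      # work on a copy so that ListA stays unchanged
Extended.extend(ListB)

# append() adds ListB itself as a single, nested item.
Appended = list(ListA)
Appended.append(ListB)

print(Extended)
print(Appended)

# remove() deletes the first item equal to the value you supply
# and raises ValueError when no such item exists.
Extended.remove(4)
print(Extended)
```

After extend() the list holds six numbers, whereas after append() it holds four numbers followed by one nested list.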
A tuple is a collection used to create complex lists, in which you can embed one tuple within another. Unlike a list, a tuple is immutable, so you can’t change its contents after you create it. Embedding lets you create hierarchies with tuples. A hierarchy could be something as simple as the directory listing of your hard drive or an organizational chart for your company. The idea is that you can create complex data structures using a tuple.
MyTuple = (1, 2, 3, (4, 5, 6, (7, 8, 9)))
MyTuple is nested three levels deep. The first level consists of the values 1, 2, and 3, and a tuple. The second level consists of the values 4, 5, and 6, and yet another tuple. The third level consists of the values 7, 8, and 9. To see how this works, type the following code into IPython:
for Value1 in MyTuple:
    if type(Value1) == int:
        print Value1
    else:
        # Value1 is the second-level tuple.
        for Value2 in Value1:
            if type(Value2) == int:
                print "\t", Value2
            else:
                # Value2 is the third-level tuple.
                for Value3 in Value2:
                    print "\t\t", Value3
When you run this code, you find that the values really are at three different levels. You can see the indentations showing the level:
1
2
3
    4
    5
    6
        7
        8
        9
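Nested loops like the ones above work only when you know in advance how deep the nesting goes. One common alternative is a recursive function, sketched here under the assumption that every item is either a tuple or a plain value. The function name walk and the variable Levels are made up for this example, and the code uses print() with a single argument so that it also runs on Python 3.

```python
MyTuple = (1, 2, 3, (4, 5, 6, (7, 8, 9)))


def walk(items, depth=0):
    """Return a list of (depth, value) pairs for every non-tuple value."""
    found = []
    for item in items:
        if isinstance(item, tuple):
            # Descend one level and collect whatever the inner tuple holds.
            found.extend(walk(item, depth + 1))
        else:
            found.append((depth, item))
    return found


Levels = walk(MyTuple)
for depth, value in Levels:
    # One tab per nesting level, followed by the value.
    print('{0}{1}'.format('\t' * depth, value))
```

Because the function calls itself for each embedded tuple, it handles any nesting depth up to Python’s recursion limit.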
The chapters that follow use all kinds of techniques to access individual values in various types of data structures. For this section, you use two simple lists, defined as the following:
ListA = ['Orange', 'Yellow', 'Green', 'Brown']
ListB = [1, 2, 3, 4]
The simplest method of accessing a particular value is to use an index. For example, if you type ListA[1] and press Enter, you see 'Yellow' as the output. All indexes in Python are zero-based, which means that the first entry is 0, not 1.
Ranges present another simple method of accessing values. For example, if you type ListB[1:3] and press Enter, the output is [2, 3]. You could use the range as input to a for loop, such as
for Value in ListB[1:3]:
    print Value
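Instead of the entire list, you see just 2 and 3 as outputs, printed on separate lines. The range has two values separated by a colon: the index where the range starts and the index where it stops. The item at the stop index isn’t included, which is why ListB[1:3] returns the items at indexes 1 and 2. Both values are optional. For example, ListB[:3] would output [1, 2, 3]. When you leave out the first value, the range starts at the beginning of the list; when you leave out the second value, the range runs to the end of the list.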
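The following sketch gathers the range forms in one place and adds two that the text doesn’t cover: negative indexes and a step value. The variable names are assumptions chosen for this illustration.

```python
ListB = [1, 2, 3, 4]

Middle = ListB[1:3]       # items at indexes 1 and 2; index 3 is excluded
Head = ListB[:3]          # start omitted: begin at the first item
Tail = ListB[2:]          # stop omitted: run through the last item
Last = ListB[-1]          # a negative index counts back from the end
EveryOther = ListB[::2]   # a third value sets the step between items

print(Middle)
print(Head)
print(Tail)
print(Last)
print(EveryOther)
```

A range always produces a new list, so changing Middle or Head leaves ListB untouched.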
Sometimes you need to process two lists in parallel. The simplest method of doing this is to use the zip() function. Here’s an example of the zip() function in action:
for Value1, Value2 in zip(ListA, ListB):
    print Value1, ' ', Value2
This code processes both ListA and ListB at the same time. The processing ends when the for loop reaches the end of the shorter of the two lists. In this case, you see the following:
Orange 1
Yellow 2
Green 3
Brown 4
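You can watch zip() stop early by pairing lists of different lengths. In this minimal sketch, the list Short and the names Pairs and Numbered are made up for illustration. The call to list() is there because zip() returns an iterator rather than a list in Python 3. The example also shows enumerate(), a related built-in function that pairs each item with its index.

```python
ListA = ['Orange', 'Yellow', 'Green', 'Brown']
Short = [1, 2]    # hypothetical list that is shorter than ListA

# zip() stops as soon as the shorter input runs out of items.
Pairs = list(zip(ListA, Short))
print(Pairs)

# enumerate() supplies the zero-based index alongside each item.
Numbered = list(enumerate(ListA))
for Index, Name in Numbered:
    print('{0} {1}'.format(Index, Name))
```

The entries 'Green' and 'Brown' never appear in Pairs because Short has no partners for them.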
A dictionary is a special kind of collection that stores name and value pairs. (Strictly speaking, Python classifies a dictionary as a mapping rather than a sequence, because you look up values by name instead of by position.) The use of a name makes it easy to access particular values with something other than a numeric index. To create a dictionary, you enclose name and value pairs in curly brackets. Create a test dictionary by typing MyDict = {'Orange':1, 'Blue':2, 'Pink':3} and pressing Enter.
To access a particular value, you use the name as an index. For example, type MyDict['Pink'] and press Enter to see the output value of 3. The use of dictionaries as data structures makes it easy to access incredibly complex data sets using terms that everyone can understand. In many other respects, working with a dictionary is similar to working with the sequences described earlier.
Dictionaries do have some special features. For example, type MyDict.keys() and press Enter to see a list of the keys. You can use the values() function to see the list of values in the dictionary.
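A few other dictionary operations come up often enough to be worth trying. The following is a minimal sketch rather than a complete tour; the variable names HasPink, Missing, and Lines, and the 'Purple' entry, are assumptions chosen for this illustration. The code uses print() with a single argument so that it also runs on Python 3.

```python
MyDict = {'Orange': 1, 'Blue': 2, 'Pink': 3}

# The in operator tests whether a name (key) exists.
HasPink = 'Pink' in MyDict

# get() returns a fallback value instead of raising KeyError.
Missing = MyDict.get('Purple', 0)

# Looping over a dictionary yields its names; sorting them
# gives a predictable order for display.
Lines = []
for Name in sorted(MyDict):
    Lines.append('{0} -> {1}'.format(Name, MyDict[Name]))
print('\n'.join(Lines))

# Assigning to a new name adds an entry.
MyDict['Purple'] = 4
print(len(MyDict))
```

Using get() is handy when a missing name is an expected case rather than an error.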