“Expressions are formed from operators and operands.”
Brian W. Kernighan, Dennis M. Ritchie
One of the basic building blocks of programming languages is the expression, which serves to describe a logical or mathematical relation. In this chapter, you will learn how an expression denotes relations between objects, how expressions can be used in statements, and how objects can be labeled by names.
As an example, take the calculation of the price of an order. The formula is as follows: the price of the product multiplied by the quantity plus the shipping cost. Let’s look at how it is calculated in Python if the product’s price is 500 USD, two pieces are ordered, and the shipping cost is 200 USD.
A Simple Expression
Do not insert spaces before the expressions and statements shown in this chapter, and do not break them into multiple lines. This matters because spaces and tabs at the beginning of a line, as well as line breaks, have a special meaning in Python. The relevant rules are detailed later in the chapter.
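A minimal sketch of the calculation described above, using the values from the text (a price of 500, a quantity of 2, and a shipping cost of 200). The name order_total is an assumption made for illustration; the result is bound to it only so that it can be inspected.

```python
# Price of the product times the quantity, plus the shipping cost.
# order_total is a hypothetical name used only for this sketch.
order_total = 500 * 2 + 200
print(order_total)  # 1200
```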
The computer processes the expression in Listing 1-1 in multiple steps. The integers are objects, in this case objects of the integer type. The structure shown in Figure 1-1a, referred to as the expression tree, is constructed from integer objects and operators. In the figure, the rectangles are objects or operators; the objects that an operator works on are linked to that operator with lines. You can find a bit more about this standard notation at the end of the chapter. The expression tree is constructed by assigning to each operator the objects on which its operation is to be performed. The result of another operator, not just an object, can also be assigned to an operator. This is indicated by connecting the two operators, meaning that the connected operator is performed first and its resulting object is then used. The uppermost element of the expression tree is the operator to be performed last.
Fully Parenthesized Expression
The expression without brackets and the fully bracketed expression have the same meaning for the computer. Notably, the expressions can be of any complexity; however, there is a certain level of complexity where they become difficult to understand.
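As a small illustration of this equivalence, the sketch below evaluates the order calculation with and without brackets. The variable names are assumptions introduced only for the comparison.

```python
# Multiplication is performed before addition in both forms,
# so the brackets only make the order explicit.
without_brackets = 500 * 2 + 200
fully_bracketed = ((500 * 2) + 200)
print(without_brackets == fully_bracketed)  # True
```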
Expressions with Additional Types
In Listing 1-1, we have seen integers as the objects on which operators are performed. Expressions, however, can be used in a more general sense: operators can be applied to objects of various types. In addition to integer type objects, other types of objects will also be covered, e.g., Boolean types (true/false values) and strings. You will learn how to create your own type in Chapter 3.
The type of a particular object is also important because certain operators can be executed only by objects of a specified type. If, for example, an attempt is made to add an integer and a string, an error is raised. When the operation has been successfully executed, the resulting object always has a type (determined by the operation, the object performing the operation, and the type of the other object participating in the operation).
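The following sketch illustrates both points under simple assumptions: adding an integer to a string raises a TypeError, and the type of a successful result depends on the operand types. The try and except construct and all names are used here only for illustration.

```python
# Adding an integer and a string is not defined, so Python raises TypeError.
try:
    result = 500 + "200"
except TypeError as error:
    result = None
    message = str(error)

# The type of the result depends on the operation and the operand types.
int_sum = 500 + 200        # int + int gives an int
float_sum = 500 + 200.0    # int + float gives a float
```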
Boolean Type in an Expression
Boolean Expression
String Type in an Expression
Operation Between Strings
Expressions with Conditional Operators
Conditional Expression
Floating-Point Number Type in an Expression
Floating-Point Numbers
Complex Number Type in an Expression
Complex Numbers
During the design of Python, it was a vital decision that the language should contain a small number of built-in types and that, at first sight, those types should behave the same as types that are not built in. Only five data types are examined more thoroughly here; the rest will be discussed in Chapters 5 and 6.
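To make the types covered in the preceding sections concrete, here is a brief sketch with one expression per type. All names and values are assumptions chosen for illustration and are not taken from the original listings.

```python
# One illustrative expression for each kind of object discussed above.
is_in_stock = 2 <= 5                      # Boolean result: True
greeting = "Hello" + ", " + "world"       # operation between strings
label = "bulk" if 2 >= 10 else "retail"   # conditional expression
doubled_price = 499.5 * 2                 # floating-point number: 999.0
rotated = (1 + 2j) * 1j                   # complex number: (-2+1j)
```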
In Python, comparisons can be chained, meaning that the expression 0 < a and a < 100 can be written in the form 0 < a < 100. The two expressions are equivalent, except that in the second case the expression standing in the place of a is computed only once. This notation can also be used with more than two comparisons.
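The difference in the number of evaluations can be observed with a sketch like the one below. The measure function is a hypothetical helper that records each time it is evaluated; function definitions are assumed here only for the demonstration.

```python
evaluations = []

def measure():
    """Hypothetical helper that records every evaluation."""
    evaluations.append("called")
    return 42

# Chained form: measure() is evaluated once.
chained = 0 < measure() < 100
count_after_chained = len(evaluations)

# Separate comparisons joined with 'and': measure() is evaluated twice.
separate = 0 < measure() and measure() < 100
count_for_separate = len(evaluations) - count_after_chained

# More than two comparisons can be chained as well.
longer_chain = 0 < 5 <= 5 < 100
```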
Variable Names
Assignment Statements
To express the intent that the object assigned to a variable name will not need to change, the variable name is written in capital letters. This notation is no more than a convention, and the Python language does not prevent such names from being reassigned.
The resulting expression is thus much easier to understand. Importantly, it is not the formula itself that is assigned to the variable name, but the object created when the expression is computed. An object must always be assigned to a variable name before the name is used (the name must appear on the left side of an assignment before it can appear on the right side).
Variable names are also used to break down complicated expressions into simpler ones. This can be accomplished by assigning some part of the expression to the variable names and then using the variable names in place of the extracted expressions. Furthermore, if the variable names assigned to the particular expressions are chosen well, the meaning of the expression will be more explicit.
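A minimal sketch of this idea, reusing the order example. PRICE and quantity are names mentioned in the text; SHIPPING_COST, subtotal, and total are assumptions introduced for illustration.

```python
PRICE = 500            # capitals signal that the value is not meant to change
quantity = 2
SHIPPING_COST = 200    # hypothetical name

# The complicated expression is split into named, simpler parts.
subtotal = PRICE * quantity
total = subtotal + SHIPPING_COST
```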
When do you have to insert space between characters? The short answer is: when the characters without space would evidently have a different meaning. For example, the meanings of a + b and a+b are the same; thus, the space can be left out. But the cases of a b and ab are different; the two names (identifiers) would become a single name. Table 1-1 shows some examples. If spaces do not affect the meaning, they can be inserted or omitted, depending on readability. The detailed answer will be given in the section “Advanced Details.”
Effects of Spaces Between Characters
Without Space | With Space | Interpretation |
---|---|---|
a+b | a + b | Identical. |
a=2 | a = 2 | Identical. |
ab | a b | Adding a space between alphabetic characters turns the single name into two separate names, which are in most cases syntactically incorrect (an important exception is when one of the names is a keyword). |
12 | 1 2 | Adding a space between numeric characters turns a single literal into two separate literals, which are in most cases syntactically incorrect. |
a1 | a 1 | Adding a space between an alphabetic and a numeric character turns the single name into a name and a literal, which are in most cases syntactically incorrect (an important exception is when the name is a keyword). |
1a | 1 a | Adding a space between a numeric and an alphabetic character turns the syntactically incorrect character sequence into a literal and a name, which are in most cases also syntactically incorrect (an important exception is when the name is a keyword, as in 1 or). In earlier versions of Python, the character sequence 1or was syntactically correct, but now it is deprecated and will cause a syntax error in the future. |
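Some rows of the table can be checked with the standard ast module, as in the sketch below. The is_valid helper is a hypothetical function written for this illustration; it only reports whether a piece of source text parses.

```python
import ast

# With or without spaces around the operator, the same tree is produced.
same_tree = ast.dump(ast.parse("a+b")) == ast.dump(ast.parse("a + b"))

def is_valid(source):
    """Hypothetical helper: True if the source text parses as Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

two_names = is_valid("a b")        # two plain names side by side: invalid
keyword_case = is_valid("not a")   # one of the names is a keyword: valid
two_literals = is_valid("1 2")     # two literals side by side: invalid
```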
Statements
Statements
In the previous example, integer objects are assigned to variable names. If, for example, the statements in the second and third lines were swapped, an error would occur, since no object would yet be assigned to the quantity name. It is also a consequence of stepwise statement execution that, if one more PRICE = 450 line were inserted at the end of the program, no object assigned to any other variable name would change; only the unit price would. For example, the d_available variable name would still point to an object with a true value. To change the assignments, the statements from line 3 to line 5 would have to be re-executed.
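The stepwise execution described above can be sketched as follows. This is not the original listing; the names total, undefined_quantity, and early are assumptions, and the try and except construct is used only to show the error safely.

```python
PRICE = 500
quantity = 2
total = PRICE * quantity    # computed at this point: 1000
PRICE = 450                 # rebinding PRICE afterward...
print(total)                # ...does not change total: 1000

# Using a name before any object is assigned to it raises NameError.
try:
    early = undefined_quantity * 2
except NameError:
    early = None
```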
The assignment statement consists of a name, an assignment operator, and an expression. In Python, the equal sign means that the result of the right-side expression is assigned to the left-side variable name. (The double equal sign is used to test the equality of two expressions.) The assignment statement is not an expression; therefore, it does not have a return value. To obtain the object assigned to the variable name as an expression, the := walrus operator can be used, but only in restricted cases (details of the walrus operator will be explained in Chapter 5).
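A short sketch contrasting the three notations. The names is_pair, order_size, and is_large are assumptions for illustration; the walrus expression is parenthesized, which is one of the contexts in which it is allowed.

```python
quantity = 2                  # assignment statement: produces no value
is_pair = quantity == 2       # == tests equality and yields a Boolean

# The walrus operator assigns an object and also yields it as a value.
is_large = (order_size := 12) > 10
```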
Deletion of a Variable Name
Deletion of a Variable Name
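As a hedged illustration of deleting a variable name, the sketch below uses the del statement and then shows that the name can no longer be used. The name_exists variable and the try construct are assumptions added for the demonstration.

```python
quantity = 2
del quantity    # the name quantity no longer refers to any object

# Using the deleted name now raises NameError.
try:
    quantity
except NameError:
    name_exists = False
else:
    name_exists = True
```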
Expressions, statements, and the concepts touched upon later are the elements of source code. The name source code (code readable by humans) originates from the fact that it is the source of the executable code run by the computer. Files containing source code are referred to as source files. We speak about a program when we want to emphasize that the source code realizes an independent and meaningful task. For completeness, programs that can be run directly from source code are usually called scripts. Unfortunately, these terms are often used loosely in the literature, and the correct interpretation is left to the reader.
Additional Language Constructs
In the program fragments described so far, objects are expressed by specific integer, string, and Boolean (True and False) values. Their collective term is literals. Operators are used between them. You have seen the if and else names in conditional expressions. Names with such special meaning are called keywords. The pass keyword represents the no-operation statement. This statement is used to indicate that a statement would follow, but our deliberate intention is that nothing happens at that point of the program.
The text located after the character # up to the end of the line is called a comment that is used to place any textual information. A comment is a message to the reader of the source code and can help in understanding the program. If variable names in the program are chosen well, a minimum amount of commenting is necessary.
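A minimal sketch of comments and the pass statement. The if statement is assumed here only to give pass a place where a statement is required; the values are illustrative.

```python
# A comment on its own line: everything after '#' is ignored by Python.
quantity = 2  # a comment may also follow a statement

if quantity > 100:
    pass  # deliberately do nothing for large orders at this point
```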
Pass Statement and Comments
Statements and Expressions in Practice
Assignments with Type Definitions
It is recommended to insert a space on both sides of an equal sign and to put a space only after a colon.
Extra spaces should not be inserted immediately inside parentheses; outside the parentheses, spaces follow the rules of the surrounding context.
By default, spaces are inserted around operators. An exception is when the order of operations in the expression is not simply from left to right, either because there are parentheses or because certain operations are executed sooner. In this case, spaces are removed around the operations executed sooner, as in (a+2) * (b+4) or a*2 + b*4.
If required, the aim or use of a statement can be documented by a comment placed after it (beginning with the character #, as discussed earlier). Lengthier comments can be placed among the statement lines, and each comment line should begin with the character #. It is recommended that the information recorded here be such that it could not be figured out from the variable names and the type annotations.
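The recommendations above can be sketched as follows. The annotations use the name: type = value form; all names except PRICE and quantity, and all values, are assumptions made for illustration.

```python
PRICE: int = 500             # space after the colon, spaces around '='
quantity: int = 2
SHIPPING_COST: int = 200     # hypothetical name
customer_name: str = "Alice" # hypothetical value

# Tighter spacing marks the operation that is executed sooner.
total: int = PRICE*quantity + SHIPPING_COST
# With parentheses, the additions are executed before the multiplication.
scaled: int = (quantity+2) * (PRICE+4)
```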
An expression can be included in the so-called formatted string literals (or f-strings). In this case, there is a letter f (as shown in Listing 1-13) before the string’s opening quotation mark, and the expressions are within the quotation marks, between the braces. The result of these expressions will be inserted after they are converted to a string.
In f-strings, format specifiers can also be added after the expression, separated by a colon (:). Listing 1-14 formats the variable PRICE1 as a decimal fraction with two decimal places and writes out the variable PRICE2 as an integer padded with zeros to five digits.
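A brief sketch of both uses. PRICE1 and PRICE2 are the names mentioned in the text, but the values assigned to them here are assumptions, as are the names on the left side.

```python
PRICE1 = 1234.5   # assumed value
PRICE2 = 42       # assumed value

# The expression in braces is computed and converted to a string.
plain = f"Total: {PRICE1 + PRICE2}"

# Format specifiers follow the expression after a colon.
two_decimals = f"{PRICE1:.2f}"   # two decimal places
zero_padded = f"{PRICE2:05d}"    # integer padded with zeros to five digits
```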
Formatted String Literals
Formatted String Literals with Format Specifiers
Advanced Details
This section describes technical details in reference manual style and advanced concepts that may need more technical background.
Names
You saw that names (also called identifiers) can be given to objects. A name begins with a letter or an underscore and continues with letter-like characters, underscores, or digits. Letter-like means that other categories are added to the characters considered letters in the Unicode standard, namely, the “nonspacing mark,” the “spacing combining mark,” and the “connector punctuation” categories. Within names, Python distinguishes between lowercase and uppercase letters, but certain character combinations are regarded as the same name after normalization according to the NFKC standard. It is recommended to use only letters of the English alphabet in names.
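These rules can be explored with the built-in str.isidentifier method and the standard unicodedata module, as in the sketch below. The sample names are assumptions chosen for illustration.

```python
import unicodedata

valid = "unit_price1".isidentifier()          # letter first, then letters, underscore, digit
starts_with_digit = "1price".isidentifier()   # a name cannot begin with a digit

# NFKC normalization maps the ligature U+FB01 to the two letters 'fi'.
normalized = unicodedata.normalize("NFKC", "\ufb01le")

# Lowercase and uppercase letters are distinguished: two different names.
price = 1
Price = 2
```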
Keywords and Special Names
Keywords
and | as | assert | async | await |
---|---|---|---|---|
break | class | continue | def | del |
elif | else | except | False | finally |
for | from | global | if | import |
in | is | lambda | None | nonlocal |
not | or | pass | raise | return |
True | try | while | with | yield |
Names both beginning and ending with a double underscore may have a special meaning in the Python language. Therefore, your own variables should not be named in this way. A single or double underscore only at the beginning of a name is not subject to this restriction; the function of such names is detailed in Chapter 3. A single underscore after a variable name is used when the name would otherwise be a keyword. Finally, a single underscore character as a variable name signals that the value of the variable won’t be used, but that defining a variable name is required by syntactic constraints.
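A small sketch of these conventions, using the standard keyword module to test whether a name is a keyword. The names class_ and remainder, and the use of divmod with unpacking, are assumptions for illustration.

```python
import keyword

pass_is_keyword = keyword.iskeyword("pass")     # listed in the keyword table
price_is_keyword = keyword.iskeyword("price")   # an ordinary name

# A trailing underscore avoids a clash with the keyword 'class'.
class_ = "premium"

# A single underscore marks a value that is intentionally unused.
_, remainder = divmod(17, 5)
```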
Literals
The value of a Boolean can be true or false. The true value is denominated by True, while false is signified by False.
There are also rules for integers. Simple decimal integers begin with a nonzero digit and continue with any digits. If the number begins with a zero, any number of zeros can follow, but no other digits. If an integer is written in a system other than decimal, the letter b, o, or x after a zero digit indicates that the following number is in the binary, octal, or hexadecimal number system; the number itself is then written according to the rules of that system. (For example, in the binary number system, digits can be only 0 or 1, while in the hexadecimal system, besides the digits 0 to 9, the letters A to F are allowed. Both lowercase and uppercase are allowed for the letters denoting the number system and for the hexadecimal digits.) Digits can be separated by underscores. Listing 1-15 shows examples of the previously mentioned syntax.
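The integer forms described above can be sketched as follows; each of the first four literals denotes the same value. The names are assumptions for illustration.

```python
decimal_value = 255
binary_value = 0b1111_1111   # binary, with an underscore separating the digits
octal_value = 0o377          # octal
hex_value = 0xFF             # hexadecimal; 0xff is equivalent
million = 1_000_000          # underscores improve readability
zero = 000                   # a leading zero may be followed only by zeros
```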
Integers in Various Number Systems
Floats or, more precisely, floating-point numbers are always written in the decimal system: a dot is placed within the digits, before them, or after them, or the integer exponent of the power of 10 that multiplies the number is written after the number, separated by a letter e. Floating-point numbers in Python follow the IEEE 754 standard.
Complex numbers, in turn, can be represented by two floats. One of them is the imaginary part, which is denoted by a letter j written after a float; in this notation, 1j corresponds to the square root of -1.
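A short sketch of these literal forms; the names are assumptions for illustration.

```python
half = .5                 # dot before the digits
five = 5.                 # dot after the digits
thousand = 1e3            # 1 multiplied by 10 to the power of 3
small = 2.5e-3            # 2.5 multiplied by 10 to the power of -3

z = 3 + 4j                # real part 3.0, imaginary part 4.0
unit_squared = 1j * 1j    # 1j is the square root of -1
```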
Strings are written between pairs of quotation marks, ' or ". The meanings of the two kinds of quotation marks are the same, but it is recommended to select one and use it consistently. An exception is when the original text itself contains one kind of quotation mark. In this case, it is recommended to use the other kind at both ends of the string. If you want to write a multiline string, it has to be placed between ''' or """. Any character except the quotation mark can appear within a string. If you would like to write the same character as the opening and closing quotation marks, you have to put a backslash character before it. After the backslash, other characters can denote characters that are otherwise not representable (according to Table 1-3).
The character r can be present before strings, with the effect that the backslash character works as a simple character. This is useful in describing access paths (for example, the r'C:\Users' notation is equivalent to 'C:\\Users') and regular expressions. The letter f can also be present before strings, meaning that if an expression is written between the characters { and }, it is computed, and the result is inserted as a string (see the earlier paragraph on formatted string literals). The letter b may also occur; you can find a more detailed description of it in Appendix B.
Escape Character Table
Character | Meaning |
---|---|
\ followed by a new line | The backslash and the new line will be ignored. |
\\ | The backslash itself (\). |
\a | ASCII bell character (BEL). |
\b | ASCII backspace (BS). |
\f | ASCII form feed (FF). |
\n | ASCII linefeed (LF). |
\r | ASCII carriage return (CR). |
\t | ASCII horizontal tabulation (TAB). |
\v | ASCII vertical tabulation (VT). |
\ooo | Character code with a value of ooo in an octal number system. |
\xhh | Character code with a value of hh in a hexadecimal number system. |
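The string notations and a few of the escape sequences can be sketched as follows. The sample texts and names are assumptions for illustration.

```python
single = 'price'
double = "price"                          # same meaning as the single-quoted form
with_apostrophe = "the product's price"   # the other quote kind avoids escaping
escaped_quote = 'the product\'s price'    # a backslash before the quotation mark

multiline = """first line
second line"""

raw_path = r'C:\Users'        # raw string: the backslash is a simple character
escaped_path = 'C:\\Users'    # the same text with an escaped backslash

tab_separated = "a\tb"        # horizontal tabulation between a and b
hex_escape = "\x41"           # character code 41 in hexadecimal: 'A'
octal_escape = "\101"         # character code 101 in octal: 'A'
```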
Characters with Special Meaning
Characters with Special Meaning
Category | Characters |
---|---|
Operators | + - * ** / // % @ << >> & \| ^ ~ := < > <= >= == != |
Delimiters | ( ) [ ] { } , : . ; @ = -> += -= *= /= //= %= @= &= \|= ^= <<= >>= **= |
Other special characters | ' " # \ |
Can be used only in strings | $ ? ` |
The dot can appear within a decimal fraction. Three consecutive dots form what is called an ellipsis literal. It is rarely used in core Python, but some extensions (e.g., NumPy) rely on it.
Precedence of Operators
Operators | Meaning |
---|---|
x := y | Assignment expression |
x if y else z | Conditional expression |
x or y | Logical or |
x and y | Logical and |
not x | Logical negation |
x in y, x not in y, x is y, x is not y, x < y, x <= y, x > y, x >= y, x != y, x == y | Membership tests, identity tests, and comparisons |
x | y | Bitwise or |
x ^ y | Bitwise exclusive or |
x & y | Bitwise and |
x << y, x >> y | Bitwise shift |
x + y, x - y | Addition, subtraction |
x * y, x / y, x // y, x % y | Multiplication, division, integer division, remainder |
+x, -x, ~x | Positive, negative, bitwise negation |
x**y | Raising to the power |
(x) | Expression in parentheses |
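A few expressions illustrate how the table is read: operators in lower rows are executed sooner than those in higher rows. The values and names are assumptions chosen so that the two possible groupings would give different results.

```python
a = 2 + 3 * 4            # multiplication first: 2 + 12
b = -2 ** 2              # power before negation: -(2 ** 2)
c = 1 + 2 << 3           # addition before shift: 3 << 3
d = not 1 > 2 or False   # comparison, then not, then or
e = 6 | 3 & 5            # bitwise and before bitwise or: 6 | 1
f = (2 + 3) * 4          # parentheses are evaluated first
```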
Python Standards
The Python language is defined in The Python Language Reference. This book essentially covers the content of that document. In addition, several language-related standards are described in the Python Enhancement Proposals; they are usually referenced as PEP plus a number. An often-mentioned PEP document is PEP 8, which contains recommendations for formatting Python source code.
Object Diagram Notation
Figures used in the chapter are drawn in the object diagram notation of the Unified Modeling Language (UML). These diagrams represent objects and their connections at a particular moment. Rectangles represent objects in the diagrams. The name of an object appears as text written into the rectangle. The name is always underlined and is optionally followed by the type of the object, preceded by a colon. Lines between the rectangles denote connections. Objects in the figures are denoted according to their value, and the value is not represented by a separate instance variable.
Key Takeaways
In this chapter, you learned about the concept of an expression, which is one of the most important building blocks of programming languages. An expression describes operations between objects. Usually, the goal of using expressions is to construct a new object needed for the next step of processing (e.g., calculating the sum of the prices of products).
The statements describe a step of a program and make changes in the execution context. The assignment statement, which assigns an object to a name, is a good example of this. An expression can serve as a statement, but statements cannot stand where expressions are expected.
The variable name is the first tool to organize your program. For example, a complex expression can be broken into several simpler expressions whose results are assigned to variable names and then combined in a final step.