str – an immutable sequence of Unicode code points

Strings in Python have the data type str, and we've been using them extensively already. A string is a sequence of Unicode code points, and for the most part you can think of code points as being like characters, although they aren't strictly equivalent. The sequence of code points in a Python string is immutable, so once you've constructed a string, you can't modify its contents.
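As a minimal sketch of what immutability means in practice, the snippet below tries to assign to one position of a string and then builds a new string instead. The variable names (s, t, mutated) are ours, chosen purely for illustration.

```python
s = "parrot"

# Item assignment is not supported on str, so this raises TypeError.
try:
    s[0] = "c"
    mutated = True
except TypeError:
    mutated = False

# To get a "changed" string, build a new one; the original is untouched.
t = "c" + s[1:]

print(mutated)  # False
print(s)        # parrot
print(t)        # carrot
```

Operations that appear to modify a string, such as concatenation here, always produce a new str object and leave the original as it was.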

The difference between code points, letters, characters, and glyphs can be confusing. Let's try to clarify with an example. The Greek capital letter Σ (sigma), which is of course used widely in the writing of Greek text, is also used by mathematicians to signify summation of a series. These two uses of the letter sigma are represented by distinct Unicode characters called GREEK CAPITAL LETTER SIGMA and N-ARY SUMMATION respectively. Typically, where the same letter is used to convey different information, a different Unicode character is used. Another example would be GREEK CAPITAL LETTER OMEGA and OHM SIGN, the symbol for the unit of electrical resistance.

A code point is any one member of the set of numerical values which make up the code space. Each character is assigned to a single code point, so GREEK CAPITAL LETTER SIGMA is assigned to U+03A3 and N-ARY SUMMATION is assigned to U+2211. As we have done here, code points are often written in U+nnnn form, where nnnn is a four-, five-, or six-digit hexadecimal number. Not all code points have yet been allocated to characters. For example, U+0378 is an unassigned code point, and there's nothing to stop you including this code point in a Python str using the \u0378 escape sequence; hence, str really is a sequence of code points and not a sequence of characters.

Although the term is not used in the context of Python, for completeness we should point out that a glyph is the visual representation of a character. Different characters, such as GREEK CAPITAL LETTER SIGMA and N-ARY SUMMATION, may be rendered using the same glyph, or indeed different glyphs, depending on the font in use.
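The distinctions above can be explored from Python itself. The following is a small sketch using the standard library unicodedata module. It assumes the Unicode database bundled with current Python releases, in which U+0378 is still unassigned, and the variable names are ours.

```python
import unicodedata

greek_sigma = "\u03a3"  # GREEK CAPITAL LETTER SIGMA
summation = "\u2211"    # N-ARY SUMMATION

# Two similar-looking symbols are distinct characters with distinct code points.
names = [unicodedata.name(ch) for ch in (greek_sigma, summation)]
code_points = [f"U+{ord(ch):04X}" for ch in (greek_sigma, summation)]
print(names)        # ['GREEK CAPITAL LETTER SIGMA', 'N-ARY SUMMATION']
print(code_points)  # ['U+03A3', 'U+2211']

# An unassigned code point can still be stored in a str.
unassigned = "\u0378"
length = len(unassigned)
# Assumption: U+0378 is unassigned in the bundled Unicode database,
# so it has no name and falls in the general category 'Cn' (not assigned).
name = unicodedata.name(unassigned, None)
category = unicodedata.category(unassigned)
print(length, name, category)
```

Passing a default as the second argument to unicodedata.name() avoids the ValueError it would otherwise raise for a code point that has no name.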
