13

The Evolution of Python – Discovering New Python Features

Overview

By the end of this chapter, you will understand how Python continues to evolve and how to track that evolution to stay up to date with the latest developments. The chapter will introduce you to the Python Enhancement Proposals (PEPs) and show you the most significant enhancements of the language, from Python 3.7 to Python 3.11, allowing you to leverage the new features of the language.

Introduction

Across this book, we have seen how to use Python effectively, and the different tools and APIs that the language offers us. However, Python is not a language set in stone; it continues to evolve with every new release.

The Python development team cuts a release of the interpreter every year and provides a window of support, where bug fixes are backported, as well as a long-term support window for critical security fixes only.

In this chapter, we will see how to keep up to date with the development of Python, how enhancements are made, and the changes that the latest versions of Python have introduced.

We will be covering the following topics:

  • Python Enhancement Proposals
  • New features released in Python from version 3.7 to version 3.11

Python Enhancement Proposals

The Python language evolves as its reference implementation (CPython) changes. Changes to the reference implementation, and therefore to the language, follow the process described in the Python developer’s guide (https://devguide.python.org/). An important part of the evolution of the language is the Python Enhancement Proposal (PEP), a step required for any major change in the language. The process starts with a core developer (a person with the commit bit in python/cpython) who sponsors or directly submits a draft PEP in the python/peps repository. Those proposals are usually first discussed in the Python ideas forum to gather quick feedback from both developers and users on how useful they are and what issues they might face.

Tip

A great way to be involved in the evolution of the language is to subscribe to the forum and participate in those conversations.

After a core developer submits a PEP for review, the steering council, the governing body of the Python language, discusses it. The steering council takes input from the rest of the core developers, and if they find the change valid, it can be marked as final and accepted for the language. PEPs are used to propose changes to the language implementation or any process related to the development of Python. PEP 1 documents the process of submitting and reviewing PEPs.

Tip

Subscribe to the PEP discussions at https://discuss.python.org/c/peps to follow major changes in the language.

A PEP usually includes the following sections:

  • A metadata header.
  • An abstract with a short description and motivation of the PEP, and why it is needed.
  • A rationale explaining the decisions taken when writing the PEP.
  • A detailed PEP specification, how it can impact existing Python code, whether it has any security implications, and how you expect trainers to teach the new feature.
  • A reference implementation, if available, rejected ideas when exploring the PEP, and a copyright note. You can find a template of a PEP in PEP 12 (https://peps.python.org/pep-0012/).

If you plan to work professionally using Python, I recommend you read some of the following PEPs:

  • PEP 8: A style guide on Python code. This is a must-read if you are writing Python professionally. It allows Python developers across the globe to write and read Python with a common style, making it easier to read for everyone.
  • PEP 1: A PEP that documents the purpose and instructions on how to submit a PEP. It explains what a PEP is, the workflow for submitting one, and detailed instructions on how to create a good PEP.
  • PEP 11: The different levels of support that CPython offers on different platforms.
  • PEP 602: The new Python annual release schedule.

In the next sections, we will be looking at new features available in each Python version, starting with Python 3.7.

Python 3.7

Python 3.7 was released in June 2018, received bug fixes until June 2020, and will receive security patches until June 2023.

Built-in breakpoint

The new built-in breakpoint() function allows you to quickly drop into a debugger by just writing it anywhere in your code. Rather than having to call the common idiom of import pdb;pdb.set_trace(), in this version of Python, you can just use the built-in breakpoint(), which not only works with the default Python debugger (pdb) but any other debugger that you might use.
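As a minimal sketch of why breakpoint() works with any debugger: the built-in simply calls sys.breakpointhook(), which defaults to pdb.set_trace() and can be replaced with any callable. The safe_divide function and recording_hook below are hypothetical names used only for illustration.

```python
import sys

def safe_divide(a, b):
    if b == 0:
        # Calls sys.breakpointhook(); by default that opens pdb.
        breakpoint()
        return None
    return a / b

calls = []

def recording_hook(*args, **kwargs):
    # Stand-in for a debugger: it only records that it was invoked.
    calls.append("breakpoint hit")

sys.breakpointhook = recording_hook
result = safe_divide(1, 0)
sys.breakpointhook = sys.__breakpointhook__  # restore the default hook

print(calls)
```

The default hook also honors the PYTHONBREAKPOINT environment variable; setting it to 0 turns every breakpoint() call into a no-op.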

Module dynamic attributes

Python is an object-oriented programming language. Everything is an object in Python, and with PEP 562, modules can behave more like classes! With the addition of the work done by PEP 562, you can now add a __getattr__ function to your module that allows you to dynamically evaluate the querying of an attribute.

This is useful when you need to deprecate an attribute of your module, if you need to perform some caching, or if something that you initially declared as an attribute now needs to do some runtime evaluation.

Additionally, you can combine __getattr__ with lru_cache to lazily evaluate the attributes of your module. This is useful when you have a module with constants that are expensive to compute. That allows you to move from the following:


# constants_package.py
constant1 = expensive_call()
constant2 = expensive_call2()

To:


# constants_package.py
import functools
# Map each public constant name to the function that computes it.
_constant_resolution = {
    "constant1": expensive_call,
    "constant2": expensive_call2,
}
@functools.lru_cache(maxsize=None)
def __getattr__(name):
    try:
        return _constant_resolution[name]()
    except KeyError:
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}") from None

The second version of the code will allow you to get the same results without eagerly evaluating those constants at import time. In addition, by using lru_cache, no matter how many times users query the attribute, Python will execute each function only once.
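The deprecation use case mentioned earlier can be sketched in a similar way. To keep the example self-contained, the module is built in memory with types.ModuleType instead of living in its own file; demo_module, old_name, and new_name are hypothetical names.

```python
import types
import warnings

# Source of a hypothetical module that renamed `old_name` to `new_name`.
source = '''
import warnings

new_name = 42

def __getattr__(name):
    # Only called when normal attribute lookup on the module fails.
    if name == "old_name":
        warnings.warn(
            "old_name is deprecated; use new_name",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_name
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

demo = types.ModuleType("demo_module")
exec(source, demo.__dict__)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = demo.old_name  # still works, but emits a warning

print(value, caught[0].category.__name__)
```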

Nanosecond support in the time module

As we saw in Chapter 6, The Standard Library, we can use the Python time APIs to get the current time in multiple ways via functions such as time.time and time.monotonic. Those APIs return the time as a float, which is sufficient in most scenarios but loses precision when we need results that are accurate to the nanosecond. This resulted in PEP 564, which adds new functions to the time module that return the time as an integer number of nanoseconds. The new functions end with the _ns suffix (for example, time.time_ns and time.monotonic_ns) and can be used in situations where we care about getting the precise time. Because this API works with integers, computations on its results always preserve nanosecond precision.
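The following sketch illustrates the precision argument. The timestamp value is an arbitrary example, not a measurement.

```python
import time

# A float has 53 bits of mantissa, so it cannot represent every
# nanosecond of a present-day timestamp exactly.
timestamp_ns = 1_700_000_000_123_456_789  # arbitrary example value
round_trip = int(float(timestamp_ns))
print(round_trip == timestamp_ns)  # False: precision was lost

# The _ns functions return plain integers, so differences stay exact.
start = time.perf_counter_ns()
total = sum(range(100_000))
elapsed_ns = time.perf_counter_ns() - start
print(type(time.time_ns()).__name__, elapsed_ns >= 0)
```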

The dict insertion order is preserved

Since Python 3.7, we can rely on Python dictionaries preserving their insertion order, allowing us to iterate them with a deterministic result. You can see the effect by running the following code in an interpreter older than 3.6 and in a newer one. Although CPython 3.6 already behaved this way as an implementation detail, it was not until 3.7 that the language specification guaranteed it:


x = {}
x["a"] = 1
x["b"] = 2
x[0] = 3
print(list(x))

In Python 2.7, the result will be ['a', 0, 'b'], and you should not rely on the order of the keys, as there are no guarantees. However, if you are using Python 3.7+, you can be sure that the order of the keys is always going to be ['a', 'b', 0]. That is fantastic, as it makes the dictionary an ordered container (which is different from a sorted one). Note that the guarantee applies to dictionaries only; sets remain unordered. This is a property that few languages provide.
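One practical consequence, shown here as a small sketch with made-up data, is that a dictionary can remove duplicates while keeping the first-seen order, and that a re-inserted key moves to the end.

```python
readings = ["b", "a", "b", "c", "a"]

# dict.fromkeys keeps one entry per key, in first-seen order.
unique_in_order = list(dict.fromkeys(readings))
print(unique_in_order)  # ['b', 'a', 'c']

# Deleting a key and inserting it again places it last.
inventory = {"apples": 1, "pears": 2}
del inventory["apples"]
inventory["apples"] = 3
print(list(inventory))  # ['pears', 'apples']
```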

Dataclasses

PEP 557 brought dataclasses to Python 3.7. Before this version, users relied on the third-party attrs package, which was continuously growing in popularity. To know more about how to use dataclasses, refer to Chapter 6, The Standard Library, and Exercise 86 – using the dataclass module.

importlib.resources

This new module in the standard library allows developers to load and read resources within modules. Using importlib.resources allows us to tell an interpreter to read a resource without having to provide paths, making it resilient to package managers that might relocate files. This module also allows us to model packages that might not have a disk representation.

Loading data that is part of a package could not be easier with this module. There are two APIs that you will usually rely on: importlib.resources.open_text/open_binary(package, resource) to load a file and, since Python 3.9, importlib.resources.files(package) to get a traversable object for navigating the files that a Python package provides.
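A minimal sketch of reading a packaged resource follows. It assumes Python 3.9 or later for files(), and it creates a throwaway package (demo_pkg, a hypothetical name) in a temporary directory so that the example is self-contained.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny package that ships a text file next to its code.
    pkg_dir = Path(tmp) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "greeting.txt").write_text("hello from a package resource\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # No file system path is spelled out: the package name is enough.
        resource = importlib.resources.files("demo_pkg").joinpath("greeting.txt")
        greeting = resource.read_text(encoding="utf-8")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_pkg", None)

print(greeting.strip())
```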

Python 3.8

Python 3.8 was released in October 2019, received bug fixes until May 2021, and will receive security patches until October 2024.

Assignment expression

One of the best-known additions to Python 3.8 is the assignment expression, also known as the walrus operator. It was quite a controversial addition, and many people attribute Guido van Rossum’s decision to step down as the final decision-maker for the evolution of CPython to that controversy.

This new syntax allows developers to write an assignment in the place of an expression. This allows for shorter code by combining what otherwise needs to be multiple lines of code. This is quite useful in control flow operations when combined with reading data or using regular expressions. See the following examples.

This is without PEP 572:


while True:
    data = get_more_data()
    if not data:
        break
    business_logic(data)

This is with PEP 572:


while data := get_more_data():
    business_logic(data)

In the example, you can see how by using the := operator, we save multiple lines of code, making the code shorter and arguably easier to read. Because an assignment expression is itself an expression, you can use its result inside a larger expression, allowing you to write the following code:


while len(data := get_more_data()) >= 1:
    business_logic(data)
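The regular-expression use case mentioned above can be sketched as follows. The log lines and the pattern are made up for illustration.

```python
import re

lines = ["error: disk full", "ok", "error: timeout"]
errors = []

for line in lines:
    # Match and test in one step; `found` is only used when it is truthy.
    if (found := re.match(r"error: (.+)", line)):
        errors.append(found.group(1))

print(errors)  # ['disk full', 'timeout']
```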

functools.cached_property

This new and very handy decorator lets you replace, in one line, a common Python idiom for caching an attribute that is expensive to compute. Before Python 3.8, you would commonly find code like the following:


class MyClass:
    def __init__(self):
        self._myvar = None
    @property
    def myvar(self):
        if self._myvar is None:
            self._myvar = expensive_operation()
        return self._myvar

With the addition of cached_property, you can now simplify that to the following:


import functools
class MyClass:
    @functools.cached_property
    def myvar(self):
        return expensive_operation()
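A runnable sketch makes the caching visible. The Report class and its computations counter are hypothetical and exist only to count how often the property body runs.

```python
import functools

class Report:
    def __init__(self, values):
        self.values = values
        self.computations = 0  # counts how often the property body runs

    @functools.cached_property
    def total(self):
        self.computations += 1
        return sum(self.values)

report = Report([1, 2, 3])
first = report.total
second = report.total          # served from the instance, not recomputed
runs_before_reset = report.computations

del report.total               # deleting the attribute clears the cache
third = report.total
print(first, second, third, runs_before_reset, report.computations)
```

Unlike the manual idiom, the cached value is stored under the property's own name in the instance dictionary, which is why del can be used to invalidate it.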

importlib.metadata

A new module was added in Python 3.8 that lets us read metadata about third-party packages that we have installed in our system. importlib.metadata can be used to replace usage of less efficient and third-party dependent code that relies on pkg_resources. See the following examples of how this new module is useful on a Python installation with pytest installed:


import importlib.metadata
importlib.metadata.version("pytest")

You get the following result:

Figure 13.1 – The pytest version

You can read any other piece of metadata by invoking the metadata function, which returns a dictionary-like object:


import importlib.metadata
importlib.metadata.metadata("pytest")["License"]

Here is the output:

Figure 13.2 – The pytest license

typing.TypedDict, typing.Final, and typing.Literal

If you like to type your Python code, 3.8 brings three new classes to the typing module, which are quite useful to better qualify the types you use in your code.

Using typing.Literal allows you to type your code to specify what concrete values it can get beyond just documenting the type. This is specifically useful in situations where strings can be passed but there is only a known list of values. See the following example:


from typing import Literal
MODE = Literal['r', 'rb', 'w', 'wb']
def open_helper(file: str, mode: MODE) -> str:
    ...

Without typing.Literal, you would need to type mode as str, allowing other strings that are not valid values. In 3.8, you can also use typing.Final, which allows you to mark a variable as a constant, so that the type checker flags an error if you try to reassign it.

Finally, we have typing.TypedDict, a great way to type your dictionaries when you know they need to have a specific set of keys. If you create a type with Point2D = TypedDict('Point2D', {'x': int, 'y': int}), the type checker will flag errors when you create dictionaries with a key that is neither x nor y.
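The three constructs can be combined as in the following sketch, which uses the class-based TypedDict syntax. MAX_RETRIES and describe are hypothetical names. Keep in mind that these annotations are checked by a static type checker, not enforced at runtime.

```python
from typing import Final, Literal, TypedDict

Mode = Literal["r", "rb", "w", "wb"]

MAX_RETRIES: Final = 3  # a type checker rejects any later reassignment

class Point2D(TypedDict):
    x: int
    y: int

def describe(point: Point2D, mode: Mode) -> str:
    return f"({point['x']}, {point['y']}) opened with mode {mode!r}"

origin: Point2D = {"x": 0, "y": 0}
message = describe(origin, "r")
print(message)  # (0, 0) opened with mode 'r'
```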

f-string debug support via =

How many times have you written the name of a variable followed by its value? With Python 3.8, this just became a lot easier with debug support in f-strings using =. With this addition, you can now write code as follows to quickly debug your variables:


import datetime
name = "Python"
birthday = datetime.date(1991, 2, 20)
print(f'{name=} {birthday=}')

This will produce the following output:

Figure 13.3 – An f-string example

Positional-only parameters

If you are an API provider, you will definitely like this new addition to Python. With PEP 570, you can now mark parameters as positional-only, making the name of the function parameter private and allowing you to change it in the future if so desired. Before Python 3.8, if you were creating an API with a signature such as def convert_to_int(variable: float):, users could call your function with the convert_to_int(variable=3.14) syntax. That could be an issue if you wanted to rename the parameter in the future or wanted to move to varargs. With the addition of positional-only parameters to the language, you can now use the new / syntax to prevent arguments from being passed by keyword, as in def convert_to_int(variable: float, /):. When / is specified, all parameters before it are positional-only, similar to how * can be used to mark all parameters after it as keyword-only.
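A short sketch of the behavior described above, reusing the convert_to_int signature from the text:

```python
def convert_to_int(variable: float, /) -> int:
    # `variable` can only be supplied positionally.
    return int(variable)

converted = convert_to_int(3.14)

try:
    convert_to_int(variable=3.14)
except TypeError:
    keyword_call_rejected = True
else:
    keyword_call_rejected = False

print(converted, keyword_call_rejected)  # 3 True
```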

Python 3.9

Python 3.9 was released in October 2020, received bug fixes until May 2022, and will receive security patches until October 2025.

PEG parser

One of the most significant changes in Python 3.9 is the rewrite of the parser that sits at the core of the interpreter. After 30 years of using an LL(1) parser, which served Python well, the core development team decided to move to a more modern and powerful PEG parser, which enabled many enhancements to the language, from new syntax to better error messages. While this did not result in any big change directly for developers, it has helped the language to continue evolving. Read https://peps.python.org/pep-0617/ to understand the work that was done and how it is helping Python evolve.

Support for the IANA database

If you are working with time zones, you have probably used the IANA database (https://www.iana.org/time-zones) before. This database maps time zone names to the rules that determine the UTC offset of that time zone at a given date and time. Before Python 3.9, two third-party packages, dateutil and pytz, provided this data to developers. With the implementation of PEP 615, developers can now fetch time zone information from their OS without the need to rely on a third-party package.

See the following example that converts a date time from the New York time zone to Los Angeles, all with the standard library:


import datetime
from zoneinfo import ZoneInfo
nyc_tz = ZoneInfo("America/New_York")
la_tz = ZoneInfo("America/Los_Angeles")
dt = datetime.datetime(2022, 5, 21, hour=12, tzinfo=nyc_tz)
print(dt.isoformat())

You will get the following result:

Figure 13.4 – The datetime iso formatted

We can see how both the time and the offset change when we convert the datetime instance to a different time zone using astimezone:


print(dt.astimezone(la_tz).isoformat())

Now, the output will be the following:

Figure 13.5 – The datetime iso formatted after the time zone change

Merge (|) and update (|=) syntax for dicts

Sets and dictionaries are getting closer and closer functionally. In this version of Python, dicts got support for the | union operator. This allows you to combine dictionaries with the following syntax:


d1 = dict(key1="d1", key3="d1")
d2 = dict(key2="d2", key3="d2")
print(d1 | d2)

This is the output:

Figure 13.6 – The dict merge output

Something to note is that if a key is present in both dictionaries, it will take the value from the last seen dictionary. Additionally, you can use the |= operator to merge an existing dictionary with another:


d1 = dict(key1="d1", key3="d1")
d1 |= dict(key2="d2", key3="d2")
print(d1)

The output observed is as follows:

Figure 13.7 – The dict merge operator output

str.removeprefix and str.removesuffix

With these two methods, we can remove the prefix or suffix of a string, something that many developers mistakenly used to do with strip. The strip family of methods takes an optional string whose characters are all stripped, and developers got confused, thinking that it was the exact substring that would be removed. See the following example:


print("filepy.py".rstrip(".py"))

This gives the output as the following:

Figure 13.8 – The rstrip output

Users might have expected filepy as the result, but instead, just file is returned, as rstrip has been instructed to delete all p, y, and . characters from the end of the string. If you want to remove the suffix of a string, you can now use str.removesuffix instead:


print("filepy.py".removesuffix(".py"))

We will now get the expected output:

Figure 13.9 – The removesuffix output
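The prefix counterpart works the same way. The strings below are made up for illustration; note that both methods return the string unchanged when the prefix or suffix is absent.

```python
print("test_login".removeprefix("test_"))  # login
print("login".removeprefix("test_"))       # login (nothing to remove)
print("test_login".lstrip("test_"))        # login (works here only by luck)
print("test_settings".lstrip("test_"))     # ings (lstrip removes characters)

stripped = "test_settings".lstrip("test_")
removed = "test_settings".removeprefix("test_")
```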

Type hints with standard collections

Before Python 3.9, type-hinting collections required importing their generic types from the typing module. With the addition of PEP 585, developers can now use the standard collection types directly when type-hinting their code. This transforms the existing code from the following:


from typing import Dict, List
def myfunc(values: Dict[str, List[int]]) -> None:
    ...

To the following:


def myfunc(values: dict[str, list[int]]) -> None:
    ...

Python 3.10

Python 3.10 was released in October 2021, will receive bug fixes until May 2023, and will receive security patches until October 2026.

Pattern matching – PEP 634

By far the most controversial addition in Python 3.10 was structural pattern matching, which brought match and case to the language. This addition consists of three different PEPs: PEP 634, PEP 635, and PEP 636. The new syntax allows you to mirror the switch structures that you might have seen in other languages:


match code:
    case 1:
        print("Working as expected")
    case -1 | -2 | -3:
        print("Internal Error")
    case _:
        print("Unknown code")

Note that to specify one of multiple values, you need to use the | operator and not a comma. Using a comma will try to match a sequence. A dictionary lookup would arguably suit the previous example better; the real power of pattern matching comes from matching a value whose type, or length in the case of containers, is more dynamic. Pattern matching allows you to check specific properties of an object and bind parts of it to variables when a pattern matches. See the following example:


match x:
    case {"warning": value}:
        print("warning passed with value:", value)
    case ["error", value]:
        print("Error array passed with value:", value)

Pattern matching is also useful when interacting with data in the form of containers and having to take different actions or create different objects based on their values. See the following example from the Python standard library:


match json_pet:
    case {"type": "cat", "name": name, "pattern": pattern}:
        return Cat(name, pattern)
    case {"type": "dog", "name": name, "breed": breed}:
        return Dog(name, breed)
    case _:
        raise ValueError("Not a suitable pet")

Note how pattern matching not only routes the code through one branch or another based on the attributes that we are matching but also captures others with specific variables. If you want to know more about pattern matching and understand how it works, we recommend you read https://peps.python.org/pep-0636/, which is a tutorial on how to use structural pattern matching.

Parenthesized context managers

Thanks to the introduction of the new PEG parser in Python 3.9, 3.10 was able to address a long-standing issue in Python grammar – allowing the use of parentheses in context managers.

If you have written multiple context managers in Python, you are probably aware of how hard it is to nicely format that code. This change allows you to move from having to write code such as the following:


with CtxManager1(
) as example1, CtxManager2(
) as example2, CtxManager3(
) as example3:
    ...

To being able to write code such as the following:


with (
    CtxManager1() as example1,
    CtxManager2() as example2,
    CtxManager3() as example3,
):
    ...
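As a runnable sketch of the parenthesized form, assuming Python 3.10 or later, the hypothetical managed context manager below records when each manager is entered and exited.

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed(name):
    events.append(f"enter {name}")
    try:
        yield name
    finally:
        events.append(f"exit {name}")

with (
    managed("first") as a,
    managed("second") as b,
):
    events.append(f"body {a} {b}")

print(events)
```

The managers are entered from top to bottom and exited in reverse order, exactly as with the comma-separated form on a single line.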

Better error messages

Another advantage of the new parser is that it lets the interpreter report syntax errors much more helpfully. While Python errors are usually quite informative compared to other languages, an error raised at parsing time was often quite cryptic.

Let’s take the following code, which is missing a closing bracket in the first line:


d = {"key": "value", "key2": ["value"]
def func(): pass

Running it in a Python interpreter before Python 3.10 will give us the following error, which does not reference the first line at all and, therefore, is quite hard to debug:

Figure 13.10 – A previous error output

In Python 3.10, the error message will be the following:

Figure 13.11 – The improved error output

This nicely points developers to the root cause of the issue.

Beyond missing brackets, many other syntax errors have received similar improvements, which saves developers time by pointing them to the source of the issue.

Type union operator (|) – PEP 604

Python 3.10 brings some additional syntax sugar for typing. A common situation when type-hinting your code is that a parameter might have one of many types. This used to be handled by using the typing.Union type, but since Python 3.10, you can use the | operator to represent the same.

That allows you to move from writing code as the following:


def parse_number(text: str, pattern: typing.Union[str, re.Pattern]) -> typing.Union[int, float]: ...

To the following instead:


def parse_number(text: str, pattern: str | re.Pattern) -> int | float: ...

Statistics – covariance, correlation, and linear_regression

The Python 3.10 release adds functions to compute the covariance, the correlation, and the linear regression given two inputs:


>>> x = range(9)
>>> y = [*range(3)] * 3
>>> import statistics
>>> statistics.covariance(x, y)
0.75
>>> statistics.correlation(x, y)
0.31622776601683794
>>> statistics.linear_regression(x, y)
LinearRegression(slope=0.1, intercept=0.6)

Python 3.11

Python 3.11 was released in October 2022, will receive bug fixes until May 2024, and will receive security patches until October 2027.

Faster runtime

Python 3.11 is on average about 25% faster than 3.10 when measured with the Python performance benchmark suite. The result depends a lot on your application and will usually range between 10% and 60%. A series of optimizations in how code is executed, together with startup improvements, have made this possible as part of a project branded as Faster CPython, which focuses on making the interpreter faster.

Enhanced errors in tracebacks

Building on the success achieved with the improvement of error messages in Python 3.10, 3.11 has done substantial work to facilitate the debugging of errors in tracebacks through PEP 657. The interpreter will now point to the exact expression that caused the exception, allowing a developer to quickly figure out the root issue without using a debugger.

This is quite useful when navigating dictionaries, given the following code:


d = dict(key1=dict(key2=None, key3=None))
print(d["key1"]["key2"]["key3"])

Before Python 3.11, we would get the following error:

Figure 13.12 – The previous dict error output

With Python 3.11, we get the following:

Figure 13.13 – The enhanced dict error output

Note how the interpreter is now pointing us to the lookup that caused the error. Without this information, it would be hard to know where that NoneType was coming from. Here, the developer can easily realize that the exception was triggered when querying key3, meaning that the result of looking up key2 was None.

This is also quite useful when doing math operations. See the following code example:


x = 1
y = 2
str_num = "2"
print((x + y) * int(str_num) + y + str_num)

Before Python 3.11, we would get the following error:

Figure 13.14 – The previous addition error output

In Python 3.11, we get the following instead:

Figure 13.15 – The enhanced addition error output

The new tomllib package

Given the standardization and rising popularity of pyproject.toml, Python 3.11 has added a new module to facilitate reading TOML files. The tomllib module can be used to easily read your project configuration in files such as pyproject.toml. As an example, let’s take the following .toml file:


[build-system]
requires = ["setuptools", "setuptools-scm"]
build-backend = "setuptools.build_meta"
[project]
name = "packt_package"
description = "An example package"
dependencies = [
        "flask",
        "python-dateutil",
]
[project.scripts]
example-script = "packt_package._main:main"

We can now read it in Python with the standard library with the following code:


import tomllib
import pprint
with open("pyproject.toml", "rb") as f:
    data = tomllib.load(f)
pprint.pprint(data)

This generates the following output:

Figure 13.16 – The tomllib output

This allows us to handle TOML similar to how we can handle JSON with stdlib. The main difference is that the tomllib module does not come with a method to generate TOML, for which developers have to rely on third-party packages, which have different ways of customization and formatting.

Required keys in dicts

If you have been type-hinting your code, this addition lets you be stricter about your Python dictionaries. Earlier, we saw how we could use TypedDict to declare what keys a dictionary could take; now, with PEP 655, there is a way to mark whether keys are required or not. Using our previous example of a point, we can add an optional map key as TypedDict('Point2D', {'x': int, 'y': int, 'map': NotRequired[str]}). That will result in the type checker allowing dict(x=1, y=2) and dict(x=1, y=2, map="new_york") but not a dictionary that misses either the x or y keys, such as dict(y=2, map="new_york").

The new LiteralString type

Another addition to type-hinting is the new LiteralString type. This is useful when we are passing strings that are going to be used in SQL statements or shell commands, as a type checker will require that only static strings be passed. That helps developers protect their code from SQL injection and other similar attacks that take advantage of string interpolation. See the following example that defines an API for a database:


def get_all_tables(schema_name: str) -> list[str]:
    sql = "SELECT table_name FROM tables WHERE schema=" + schema_name
    …

The developer of this API intended that function to allow other developers to call it as a quick way to get all tables given a schema. The developer considered it safe code as long as the schema_name argument was under the control of the developer, but there was nothing to prevent that. A user of this API could write the following code:


schema = input()
print(get_all_tables(schema))

This allows the user to perform a SQL injection attack by passing a string such as X; DROP TABLES as input. With PEP 675, the library developer can now mark schema_name as LiteralString, which will make the type checker report an error if the string is not static and part of the application code.

Exception notes – PEP 678

PEP 678 adds a new method, add_note, to all exceptions, allowing developers to enrich an exception without the need of having to raise a new one. Before this addition, it was quite common to find the following code, as developers wanted to enrich an exception with some additional information:


def func(x, y):
    return x / y
def secret_function(number):
    try:
        return func(10_000, number)
    except ArithmeticError as e:
        raise ArithmeticError(f"Failed secret function: {e}") from e

With exception notes, we can now write the following:


def func(x, y):
    return x / y
def secret_function(number):
    try:
        return func(10_000, number)
    except ArithmeticError as e:
        e.add_note("A note to help with debugging")
        raise

This allows the exception to keep all its original information. Let’s now run the following code:


secret_function(0)

We see the following traceback:

Figure 13.17 – An exceptions notes example

With this, we conclude our review of the new Python features.

Summary

In this final chapter, you have taken your Python knowledge one step further by learning how to continue your journey of improving your Python skills. We have seen the process to enhance Python and the enhancements that the language has accommodated in the most recent releases. You are all set up to continue your Python learning and even ready to submit a proposal for enhancements if you have any good ideas on how to improve the language itself!
