For I/O bound code (meaning code that spends a lot of time doing input/output), the best-performing architectures are usually asynchronous (AKA async) ones—as long as the I/O operations can be made nonblocking, meaning that your code can initiate an operation, go on doing other things while that operation is in progress, and then find out when the operation is finished.
Async architectures are also sometimes known as event-driven ones, since the completion of I/O operations (new data becoming available on an input channel, or an output channel becoming ready to accept new data) can be modeled as “events,” external to your code, to which your code responds appropriately.
Asynchronous architectures can be classified into three broad categories:
In what is sometimes known as a multiplexed async architecture, your code keeps track of the I/O channels on which operations may be pending; when you can do no more until one or more of the pending I/O operations completes, the thread running your code goes into a blocking wait (this situation is usually referred to as “your code blocks”), specifically waiting for any completion on the relevant set of channels. When a completion wakes up the blocking wait, your code deals with the specifics of that completion (such “dealing with” may include initiating more I/O operations), then, usually, goes back to the blocking wait. Python offers several low-level modules supporting multiplexed async architectures, but the best one to use is the higher-level selectors module, covered in “The selectors Module”.
In a callback-based async architecture, your code associates with each expected event a callback: a function or other callable that the asynchronous framework calls when the event occurs. The association can be explicit (passing a function in order to request that it get called back when appropriate), or implicit in various ways (for example, extending a base class and overriding appropriate methods). The Python standard library’s direct support for callback-based async architectures is neither particularly strong nor up-to-date, and we neither recommend it nor cover it further in this book; if you’re keen to use such an architecture, we recommend the third-party framework Twisted (or, alternatively, Tornado, which is more specialized, especially for web tasks, but also supports coroutines, as mentioned in the next section).
The third, most modern alternative category is coroutine-based async architectures, covered in the next section.
In this chapter, after explaining the general concepts of coroutine-based async architectures, we cover the v3-only module asyncio, which lets you implement such architectures as well as callback-based ones, and the lower-level module selectors (in the standard library in v3, but also available as a third-party backport download for v2), which lets you implement multiplexed architectures.
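As a rough illustration of the multiplexed style described earlier, here is a minimal sketch that uses selectors with a local socket pair standing in for real network channels. The names on_readable and received are ours, chosen only for this illustration, and a real program would normally call sel.select() in a loop rather than just once.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()      # two connected sockets, for illustration only
b.setblocking(False)
received = []                   # hypothetical sink for incoming data

def on_readable(sock):
    # Called once the selector reports that sock has data to read.
    received.append(sock.recv(1024))

# Associate the callback with the channel via the data argument.
sel.register(b, selectors.EVENT_READ, data=on_readable)
a.send(b'ping')                 # initiate some I/O on the other end

# Blocking wait: returns when a watched channel is ready (or on timeout).
for key, mask in sel.select(timeout=1):
    key.data(key.fileobj)       # dispatch to the registered callback

sel.unregister(b)
sel.close()
a.close()
b.close()
```

Storing the callback in the data argument of register is just one convenient way to connect a channel with the code that handles it.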
A frequent problem with both multiplexed and callback-based architectures is that code that would, with blocking I/O, be a single, rather linear function gets fragmented into pieces without an immediately clear connection to each other; this can make the code harder to develop and debug.
To avoid this problem, we need functions able to suspend their execution, hand control over to a framework while preserving their internal state, and later resume, from right after the suspension point, when the framework gives them the go-ahead. Such functions are known as coroutines.
The term coroutine is used to describe two different but related entities: the function defining the coroutine (e.g., in 3.5, one starting with async def), more precisely known as a coroutine function when disambiguation is needed; and the object returned by a coroutine function, representing some combination of I/O and computation that eventually terminates. The latter is more precisely known as a coroutine object when disambiguation is needed.
You can think of a coroutine function as a factory for coroutine objects; more directly, remember that calling a coroutine function does not cause any user-written code to execute, but rather just builds and returns a coroutine object. Coroutine objects are not callable: rather, you schedule them for execution via appropriate calls in a framework such as asyncio, or await them in the body of another coroutine object.
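The distinction between the two entities can be seen in a small sketch. We assume Python 3.5 or later; the names greet, log, and ran_before are hypothetical, and we create the event loop explicitly with asyncio.new_event_loop() so the snippet does not depend on any default loop.

```python
import asyncio

log = []

async def greet(name):
    # This body runs only once the coroutine object is driven by a loop.
    log.append('running')
    return 'hello ' + name

coro = greet('world')            # builds a coroutine object; no body code runs
ran_before = list(log)           # still empty at this point

loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(coro)   # now the body executes
finally:
    loop.close()
```

After this runs, ran_before is still an empty list, while log records that the body executed during run_until_complete.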
v2’s stdlib offers no coroutine-based async architecture support; v3, on the other hand, has strong support for coroutine-based async architectures, through the asyncio module.
To write coroutine-based async libraries that work with v2 as well as v3, specifically for web uses, consider the third-party framework Tornado, an async web server with a bundled web application framework.
When you write an application, rather than a library, always pick a specific Python version (we strongly recommend v3), thus freeing yourself of any worry about cross-compatibility concerns, and leaving you free to write the best code for the specific task at hand.
To make writing and using coroutines clearer and faster, the v3 release that’s most recent at the time of this writing (Python 3.5) introduces two new keywords, async and await, covered in “async and await”. You should use them instead of making coroutines via yield from as mentioned in “yield from (v3-only)”, “The asyncio Module (v3 Only)” and “asyncio coroutines”, as you had to in 3.4 and earlier: using async and await keeps full interoperability with yield from-based coroutines, but makes your code more explicit and readable, avoids certain kinds of errors, and also helps with performance. tornado now also supports async/await-based coroutines.
One thing all async architectures have is an event loop—code (either within the framework or explicit at the application level) that loops waiting for events (changes of state on I/O channels being watched or—depending on the nature of the application—other external occurrences such as user interactions on a user interface). Coroutine objects can only run when the event loop is running.
Lastly, all kinds of async architectures can also usefully support objects known as futures (or sometimes as promises or deferreds), analogous to the ones covered in Table 14-1 (although the latter rely on a pool of threads or processes, while async futures reflect the completion of some nonblocking I/O operation). Futures wrapping a coroutine object can only run when the event loop is running.
The asyncio module is at the core of v3’s modern approach to asynchronous programming, especially for network-related tasks. asyncio as supplied by the standard library focuses on providing infrastructure and components at a relatively low level of abstraction. However, you can find higher-abstraction modules at this wiki page: building blocks such as aiohttp (supporting HTTP servers and clients, including WebSocket), web frameworks (some working on top of aiohttp, some standalone), AsyncSSH (supporting SSH servers and clients), and so forth.
If you find yourself coding a higher-abstraction module for some task or protocol not yet well covered in modules listed at the wiki page, but maybe of general interest, we strongly recommend you make your module high-quality via tests and docs (that serves you well in any case!), package it up and upload it to PyPI as covered in Chapter 25, and then edit the wiki page to add a pointer to it. Fame as an open source code author awaits, and so does help from other programmers in improving your module!
When you write Python code based on asyncio (ideally also using add-on modules from asyncio.org), you’ll usually be writing coroutine functions. Up to Python 3.4 included, such functions are generators using the yield from statement covered in “yield from (v3-only)”, decorated with @asyncio.coroutine, covered in “asyncio coroutines”; since Python 3.5, although you could still use the yield from (+ decorator) approach for backward compatibility with previous releases, it’s better to use the new async def statement covered in “async and await”. In the rest of this chapter, we tag as coroutine those functions or methods supplied by asyncio that are, indeed, coroutine functions, thus nonblocking and returning coroutine objects (they internally use async def in 3.5 and later, and, compatibly, @asyncio.coroutine and yield from in the body in 3.4 and earlier).
For backward compatibility with Python 3.4, v3 (by which, as always, we mean Python 3.5) lets you implement coroutines in either of two different ways: as appropriately decorated generators using the yield from statement; or as native coroutines, using the new keywords async and await.
As mentioned in “Coroutine-Based Async Architectures”, the core concept used in asyncio is that of a coroutine: a function able to suspend its execution, preserving its internal state; explicitly hand control back to the framework that’s orchestrating it (in asyncio’s case, that’s the event loop, covered in “asyncio’s Event Loop”); and resume execution from the point at which it had suspended, when the framework tells it to resume.
Just as a generator function immediately executes, runs no user code, and returns a generator object, a coroutine function immediately executes, runs no user code, and returns a coroutine object. In fact, up to Python 3.4, a coroutine function was just a specific kind of generator function: a generator function is any function containing one or more occurrences of the yield keyword, and a coroutine function was one specifically containing one or more occurrences of the expression yield from iterable…although such a statement could also occur in a generator function not intended for use as a coroutine function, as covered in “yield from (v3-only)” (the resulting ambiguity, although marginal, was suboptimal, whence the introduction in Python 3.5 of async and await, which remove any ambiguity). To mark such a yield from-based coroutine function explicitly as being a coroutine, and ensure it was usable in asyncio, you decorated it with @asyncio.coroutine. For example, here’s a coroutine that sleeps for delay seconds, then returns a result:

    @asyncio.coroutine
    def delayed_result(delay, result):
        yield from asyncio.sleep(delay)
        return result

Calling, for example, delayed_result(1.5, 23), gives you a generator object (specifically a coroutine object) poised to wait a second and a half, then return 23. Note that the generator object in question won’t execute until it’s properly connected to an event loop, covered in “asyncio’s Event Loop”, and the event loop runs; for example,

    loop = asyncio.get_event_loop()
    x = loop.run_until_complete(delayed_result(1.5, 23))

sets x to 23 after a delay of a second and a half.
Mostly for debugging purposes, you can check whether some object f is a coroutine function with asyncio.iscoroutinefunction(f); you can check whether some object x is a coroutine object with asyncio.iscoroutine(x).
To allow more explicit and readable code, Python 3.5 introduces the keywords async and await.1 Use async to create coroutines (with async def), to create asynchronous context managers (with async with), and for asynchronous iteration (with async for).
When you use async def instead of def, and await instead of yield from,2 the function becomes an explicit coroutine function, with no need for decoration (and without being a generator function as a side effect). For example:

    async def new_way(delay, result):
        await asyncio.sleep(delay)
        return result

gives you a coroutine function new_way that’s fully equivalent to the delayed_result in the previous example (you could decorate new_way with @asyncio.coroutine, but this is not required nor recommended usage). In this way, you can also write a coroutine function that never awaits anything. (It may seem peculiar to need that, but it comes in handy in prototyping, testing, and refactoring.) An async def coroutine function cannot contain any yield; conversely, you can use await only within an async def coroutine function.
Within a native coroutine function (meaning one defined with async def) only, you can also use two useful new constructs: async with and async for.
The async with statement is just like the with statement, except that the context manager (the object that’s the result of the expression just after the keyword with) must be an asynchronous context manager, implementing the special methods __aenter__ and __aexit__, corresponding to __enter__ and __exit__ in a plain context manager (a class might conceivably choose to implement all four special methods, making its instances usable in both plain with and async with statements). __aenter__ and __aexit__ must return awaitable objects (each is typically a coroutine function defined with async def, thus intrinsically returning a coroutine object, which is of course awaitable); async with x: implicitly uses await x.__aenter__() at entry and await x.__aexit__(type, value, tb) at exit, elegantly achieving seamless async context-manager behavior.
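A minimal sketch of an asynchronous context manager may help make this concrete. The class AsyncResource and the events list are hypothetical names used only for illustration, and asyncio.sleep(0) stands in for whatever nonblocking work a real manager would await on entry and exit.

```python
import asyncio

events = []

class AsyncResource:
    """Hypothetical resource whose setup and teardown are asynchronous."""

    async def __aenter__(self):
        await asyncio.sleep(0)       # placeholder for async acquisition
        events.append('enter')
        return self

    async def __aexit__(self, exc_type, exc_value, tb):
        await asyncio.sleep(0)       # placeholder for async release
        events.append('exit')
        return False                 # do not suppress exceptions

async def use_resource():
    async with AsyncResource():
        events.append('body')

loop = asyncio.new_event_loop()
try:
    loop.run_until_complete(use_resource())
finally:
    loop.close()
```

The events list ends up recording entry, body, and exit in that order, mirroring what a plain with statement does for __enter__ and __exit__.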
The async for statement is just like the for statement, except that the iterable (the object that’s the result of the expression just after the keyword in) must be an asynchronous iterable, implementing the special method __aiter__,3 corresponding to __iter__ in a plain iterable (a class might conceivably choose to implement both special methods, making its instances usable in both plain for and async for statements). __aiter__ must return an asynchronous iterator (implementing the special method __anext__, corresponding to __next__ in a plain iterator, but returning an awaitable object); async for item in x: implicitly uses x.__aiter__() at entry and await y.__anext__() on the eventually returned asynchronous iterator y, elegantly achieving seamless async looping behavior.
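As a sketch of the protocol just described, here is a hypothetical asynchronous iterator, Countdown, consumed with async for. We assume an interpreter in which __aiter__ is a plain method that returns the iterator directly, and, by analogy with StopIteration in plain iterators, the iterator signals exhaustion by raising StopAsyncIteration.

```python
import asyncio

class Countdown:
    """Hypothetical async iterable that is also its own async iterator."""

    def __init__(self, start):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current <= 0:
            raise StopAsyncIteration   # ends the async for loop
        await asyncio.sleep(0)         # placeholder for nonblocking I/O
        self.current -= 1
        return self.current + 1

async def collect():
    items = []
    async for n in Countdown(3):
        items.append(n)
    return items

loop = asyncio.new_event_loop()
try:
    collected = loop.run_until_complete(collect())
finally:
    loop.close()
```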
asyncio supplies an explicit event loop interface, and some basic implementations of that interface, as by far the largest component of the framework.
The event loop interface is a very broad and rich interface supplying several categories of methods; the interface is embodied in the asyncio.BaseEventLoop class. asyncio offers a few alternative implementations of event loops, depending on your platform, and third-party add-on packages let you add still more, for example, to integrate with specific user interface frameworks such as Qt. The core idea is that all event loop implementations implement the same interface.
In theory, your application can have multiple event loops and manage them through multiple, explicit event loop policies; in practice, such complexity is rarely needed, and you can get away with a single event loop, normally in your main thread, managed by a single, implicit global policy. The main exception to this state of things occurs when you want to run asyncio code on multiple threads of your program; in that case, while you may still get away with a single policy, you need multiple event loops: a separate one for each thread calling event-loop methods other than call_soon_threadsafe, covered in Table 18-1. Each thread can only call methods on a loop instantiated and running in that thread.4
In the rest of this chapter, we assume (except when we explicitly state otherwise) that you’re using a single event loop, usually the default one obtained by calling loop = asyncio.get_event_loop() near the start of your code. Sometimes, you force a specific implementation, by first instantiating loop explicitly as your specific platform or third-party framework may require, then immediately calling asyncio.set_event_loop(loop); ignoring platform-specific or framework-specific semantic peculiarities, which we don’t cover in this book, the material in the rest of this chapter applies equally well in this second, somewhat-rarer use case.
The following sections cover nine categories of methods supplied by an asyncio.BaseEventLoop instance loop, and a few closely related functions and classes supplied by asyncio.
loop can be in one of three states: stopped (that’s the state loop is in when just created: nothing yet runs on loop), running (all functionality runs on loop), or closed (loop is irreversibly terminated, and cannot be started again). Independently, loop can be in debug mode (checking sanity of operations, and giving ample information to help you develop code) or not (faster and quieter operation; that’s the normal mode to use “in production,” as opposed to development, debugging, and testing), as covered in “asyncio developing and debugging”. Regarding state and mode, loop supplies the following methods:
close |
Sets |
get_debug |
Returns |
is_closed |
Returns |
is_closing |
Returns |
is_running |
Returns |
run_forever |
Runs |
run_until_complete |
Runs until |
set_debug |
Sets |
stop |
If called when If called when |
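The three states can be observed with a short sketch. The dictionary observed and the coroutine probe are hypothetical names of ours; we create the loop with asyncio.new_event_loop() so that closing it does not affect any other loop in the program.

```python
import asyncio

loop = asyncio.new_event_loop()
observed = {}

# Just created: stopped, not closed.
observed['new'] = (loop.is_running(), loop.is_closed())

async def probe():
    # While this coroutine runs, the loop driving it is running.
    return loop.is_running()

observed['during'] = loop.run_until_complete(probe())

# Back to stopped once run_until_complete returns.
observed['after_run'] = (loop.is_running(), loop.is_closed())

loop.close()                         # irreversible
observed['closed'] = loop.is_closed()
```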
When you develop code using asyncio, sanity checking and logging help a lot.
Besides calling loop.set_debug(True) (or setting the environment variable PYTHONASYNCIODEBUG to a nonempty string), set logging to DEBUG level: for example, call logging.basicConfig(level=logging.DEBUG) at startup.
In debug mode, you get useful ResourceWarning warnings when transports and event loops are not explicitly closed (frequently a symptom of a bug in your code): enable warnings, as covered in “The warnings Module”; for example, use the command-line option -Wdefault (with no space between -W and default) when you start Python, as mentioned in Table 2-1.
For more tips and advice on developing and debugging with asyncio, see the appropriate section in the online docs.
A key functionality of the asyncio event loop is to schedule calls to functions (see Table 18-1), either “as soon as convenient” or with specified delays. For the latter purpose, loop maintains its own internal clock (in seconds and fractions), not necessarily coincident with the system clock covered in Chapter 12.
loop’s methods to schedule calls don’t directly support passing named arguments. If you do need to pass named arguments, wrap the function to be called in functools.partial, covered in Table 7-4. This is the best way to achieve this goal (superior to alternatives such as using lambda or closures), because debuggers (including asyncio’s debug mode) introspect such wrapped functions to supply more and clearer information than they could with alternative approaches.
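The following sketch shows the functools.partial approach next to a plain positional call. The function report, its keyword-only prefix parameter, and the calls list are hypothetical; the final call_later just stops the loop after a short delay so the example terminates.

```python
import asyncio
import functools

calls = []

def report(value, *, prefix=''):
    # prefix can only be passed by name, so call_soon cannot supply it directly.
    calls.append(prefix + value)

loop = asyncio.new_event_loop()
loop.call_soon(report, 'positional only')
loop.call_soon(functools.partial(report, 'wrapped', prefix='>> '))
loop.call_later(0.01, loop.stop)     # stop the loop shortly afterward
loop.run_forever()
loop.close()
```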
call_at |
Schedules |
call_later |
Schedules |
call_soon |
Schedules |
call_soon_threadsafe |
Like |
time |
Returns a |
Besides loop’s own methods, the current event loop’s internal time is also used by one module-level function (i.e., a function supplied directly by the module asyncio):
sleep |
A coroutine function to build and return a coroutine object that completes after |
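To show how the scheduling methods relate to the loop’s internal clock, here is a small sketch that schedules one call at an absolute loop time and another after a relative delay. The list order and the specific delays are arbitrary choices of ours for the illustration.

```python
import asyncio

order = []
loop = asyncio.new_event_loop()

now = loop.time()                               # the loop's own clock
loop.call_at(now + 0.02, order.append, 'second')  # absolute loop time
loop.call_later(0.01, order.append, 'first')      # relative delay
loop.call_later(0.05, loop.stop)                  # end the demonstration
loop.run_forever()
loop.close()
```

The call scheduled with the shorter delay runs first, regardless of the order in which the two calls were scheduled.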
loop can have open communication channels of several kinds: connections that reach out to other systems’ listening sockets (stream or datagram or, on Unix, unix sockets), ones that listen for incoming connections (grouped in instances of the Server class), and ones built on pipes to/from subprocesses. Here are the methods loop supplies to create the various kinds of connections:
create_connection |
When you have an already-connected stream socket that you just want to wrap into a transport and protocol, pass it as the named argument Optional arguments must be passed as named arguments, if at all.
All other optional named-only arguments can be passed only if |
create_datagram_endpoint |
Much like |
create_server |
Like
|
create_unix_connection |
Like |
create_unix_server |
Same as |
Besides sockets, event loops can connect subprocess pipes, using these methods:
connect_read_pipe |
Returns a |
connect_write_pipe |
Returns a |
An asyncio task (asyncio.Task, covered in “Tasks”, is a subclass of asyncio.Future, covered in “Futures”) wraps a coroutine object and orchestrates its execution. loop offers a method to create a task:
create_task |
Creates and returns a |
You can also customize the factory loop uses to create tasks, but that is rarely needed except to write custom implementations of event loops, so we don’t cover it.
Another roughly equivalent way to create a task is to call asyncio.ensure_future with a single argument that is a coroutine object or other awaitable; the function in this case creates and returns a Task instance wrapping the coroutine object. (If you call asyncio.ensure_future with an argument that’s a Future instance, it returns the argument unchanged.)
We recommend using more explicit and readable loop.create_task instead of the roughly equivalent asyncio.ensure_future.
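A brief sketch comparing the two ways of creating a task follows. The coroutine functions double and main are hypothetical, and we pass loop= explicitly to asyncio.ensure_future so the example does not rely on a default event loop.

```python
import asyncio

async def double(n):
    await asyncio.sleep(0)           # placeholder for real nonblocking work
    return n * 2

async def main(loop):
    task_a = loop.create_task(double(1))                  # recommended form
    task_b = asyncio.ensure_future(double(2), loop=loop)  # roughly equivalent
    fut = loop.create_future()
    # A Future passed to ensure_future comes back unchanged.
    unchanged = asyncio.ensure_future(fut, loop=loop) is fut
    fut.set_result(None)
    a = await task_a
    b = await task_b
    return a, b, isinstance(task_b, asyncio.Task), unchanged

loop = asyncio.new_event_loop()
try:
    outcome = loop.run_until_complete(main(loop))
finally:
    loop.close()
```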
You can choose to use loop at a somewhat-low abstraction level, watching for file descriptors to become ready for reading or writing, and calling callback functions when they do. (On Windows, with the default SelectorEventLoop implementation of loop, you can use these methods only on file descriptors representing sockets; with the alternative ProactorEventLoop implementation that you can choose to explicitly instantiate, you cannot use these methods at all.) loop supplies the following methods related to watching file descriptors:
add_reader |
When |
add_writer |
When |
remove_reader |
Stop watching for |
remove_writer |
Stop watching for |
Also at a low level of abstraction, loop supplies four coroutine-function methods corresponding to methods on socket objects covered in Chapter 17:
sock_accept |
|
sock_connect |
|
sock_recv |
|
sock_sendall |
|
It’s often necessary to perform DNS lookups. loop supplies two coroutine-function methods that work like the same-name functions covered in Table 17-1, but in an async, nonblocking way:
getaddrinfo |
Returns a coroutine object that, when done, returns a five-items tuple |
getnameinfo |
Returns a coroutine object that, when done, returns a pair |
On Unix-like platforms, loop (when run on the main thread) supplies two methods to add and remove handlers for signals (a Unix-specific, limited form of inter-process communication, well covered on Wikipedia) the process may receive:
add_signal_handler |
Sets the handler for signal number |
remove_signal_handler |
Removes the handler for signal number |
loop can arrange for a function to run in an executor, a pool of threads or processes as covered in “The concurrent.futures Module”: that’s useful when you must do some blocking I/O, or CPU-intensive operations (in the latter case, use as executor an instance of concurrent.futures.ProcessPoolExecutor). The two relevant methods are:
run_in_executor |
Returns a coroutine object that runs |
set_default_executor |
Sets |
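As an illustration of handing blocking work to an executor, the sketch below awaits the result of loop.run_in_executor. The function blocking_call is a hypothetical stand-in for real blocking I/O, and passing None as the first argument selects the loop’s default executor.

```python
import asyncio
import time

def blocking_call(seconds):
    # Ordinary blocking code: it would stall the event loop if called directly.
    time.sleep(seconds)
    return 'finished'

async def main(loop):
    # None means "use the loop's default executor".
    return await loop.run_in_executor(None, blocking_call, 0.01)

loop = asyncio.new_event_loop()
try:
    answer = loop.run_until_complete(main(loop))
finally:
    loop.close()
```

For CPU-intensive work, you would pass an explicit executor instance instead of None, as the text above suggests.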
You can customize exception handling in the event loop; loop supplies four methods for this purpose:
call_exception_handler |
Call |
default_exception_handler |
The exception handler supplied by |
get_exception_handler |
Gets and returns |
set_exception_handler |
Sets |
context is a dict with the following contents (more keys may be added in future releases). All keys, except message, are optional; use context.get(key) to avoid a KeyError on accessing some key in the context:
exception: Exception instance
future: asyncio.Future instance
handle: asyncio.Handle instance
message: str instance, the error message
protocol: asyncio.Protocol instance
socket: socket.socket instance
transport: asyncio.Transport instance
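Here is a minimal sketch of a custom exception handler. The handler my_handler and the list seen are hypothetical names; the handler is invoked directly through loop.call_exception_handler, with hand-built context dicts, purely to show the calling convention.

```python
import asyncio

seen = []

def my_handler(loop, context):
    # message is always present; other keys are optional, so use get().
    seen.append((context['message'], context.get('exception')))

loop = asyncio.new_event_loop()
loop.set_exception_handler(my_handler)

error = ValueError('boom')
loop.call_exception_handler({'message': 'demo failure', 'exception': error})
loop.call_exception_handler({'message': 'no exception attached'})
loop.close()
```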
The following sections cover three more concepts you need in order to use asyncio, and functionality supplied by asyncio for each of these concepts.
The asyncio.Future class is almost compatible with the Future class supplied by the module concurrent.futures and covered in Table 14-1 (in some future version of Python, the intention is to fully unify the two Future interfaces, but this goal cannot be guaranteed). The main differences between an instance af of asyncio.Future and an instance cf of concurrent.futures.Future are:
af can’t be passed to functions wait and as_completed of module concurrent.futures
Methods af.result and af.exception don’t take a timeout argument, and can only be called when af.done() is True
There is no method af.running()
For thread-safety, callbacks added with af.add_done_callback get scheduled, once af is done, via loop.call_soon_threadsafe.
af also supplies three extra methods over and above those of cf:
remove_done_callback |
Removes all instances of |
set_exception |
Marks |
set_result |
Marks |
(In fact, cf has methods set_exception and set_result, too, but in cf’s case they’re meant to be called strictly and exclusively by unit tests and Executor implementations; af’s identical methods do not have such constraints.)
The best way to create a Future in asyncio is with loop’s create_future method, which takes no arguments; at worst, loop.create_future() just performs exactly the same as return futures.Future(loop), but, this way, alternative loop implementations get a chance to override the method and provide a better implementation of futures.
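A short sketch of a future’s life cycle follows: we create it with loop.create_future(), attach a done callback, and arrange for its result to be set once the loop runs. The names fut and seen, and the value 42, are arbitrary choices for the illustration.

```python
import asyncio

loop = asyncio.new_event_loop()
fut = loop.create_future()

seen = []
# The callback receives the future itself once the future is done.
fut.add_done_callback(lambda f: seen.append(f.result()))

# Set the result from within the running loop.
loop.call_soon(fut.set_result, 42)
value = loop.run_until_complete(fut)
loop.close()
```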
asyncio.Task is a subclass of asyncio.Future: an instance at of asyncio.Task wraps a coroutine object and schedules its execution in loop.
The class defines two class methods: all_tasks(), which returns the set of all tasks defined on loop; and current_task(), which returns the task currently executing (None if no task is executing).
at.cancel() has slightly different semantics from the cancel method of other futures: it does not guarantee the cancellation of the task, but rather raises a CancelledError inside the wrapped coroutine; the latter may intercept the exception (intended to enable clean-up work, but also makes it possible for the coroutine to refuse cancellation). at.cancelled() returns True only when the wrapped coroutine has propagated (or spontaneously raised) CancelledError.
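The cancellation semantics can be sketched as follows. The coroutine worker intercepts CancelledError to do some clean-up and then re-raises it, so the task does end up cancelled; the names worker, main, and log are hypothetical.

```python
import asyncio

log = []

async def worker():
    try:
        await asyncio.sleep(10)      # stands in for a long-running operation
    except asyncio.CancelledError:
        log.append('cleaning up')    # clean-up work on cancellation
        raise                        # propagate, so the task ends up cancelled

async def main(loop):
    task = loop.create_task(worker())
    await asyncio.sleep(0)           # give worker a chance to start
    task.cancel()                    # request cancellation
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled()

loop = asyncio.new_event_loop()
try:
    was_cancelled = loop.run_until_complete(main(loop))
finally:
    loop.close()
```

Had worker swallowed the exception instead of re-raising it, the task would have completed normally and cancelled() would have returned False.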
asyncio supplies several functions to ease working with tasks and other futures. All accept an optional (named-only) argument loop=None to use an event loop different from the current default one (None means to use the current default event loop); for all, arguments that are futures can also be coroutine objects (which get automatically wrapped into instances of asyncio.Task). The functions are:
as_completed |
Returns an iterator whose values are |
gather |
Returns a single future
|
run_coroutine_threadsafe |
Submits coroutine object |
shield |
Waits for |
timeout |
Returns a context manager that raises an
This snippet prints |
wait |
This coroutine function returns a coroutine object that waits for the futures in nonempty iterable |
wait_for |
This coroutine function returns a coroutine object that waits for future |
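As a sketch of two of these functions in use, the following example gathers the results of two coroutines and then shows wait_for timing out. The coroutine function delayed and the delays chosen are hypothetical; we omit the optional loop argument, relying on the loop that is running the calling coroutine.

```python
import asyncio

async def delayed(delay, value):
    await asyncio.sleep(delay)
    return value

async def main():
    # gather returns results in argument order, not completion order.
    results = await asyncio.gather(delayed(0.02, 'a'), delayed(0.01, 'b'))
    try:
        await asyncio.wait_for(delayed(1, 'slow'), timeout=0.01)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return results, timed_out

loop = asyncio.new_event_loop()
try:
    results, timed_out = loop.run_until_complete(main())
finally:
    loop.close()
```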
For details about transports and protocols, see the section about them in the online docs. In this section, we’re offering just the conceptual basis, some core details about working with them, and two examples. The core idea is that a transport does all that’s needed to ensure that a stream (or datagram) of “raw,” uninterpreted bytes is pushed to an external system, or pulled from an external system; a protocol translates those bytes to and from semantically meaningful messages.
A transport class is a class supplied by asyncio to abstract any one of various kinds of communication channels (TCP, UDP, SSL, pipes, etc.). You don’t directly instantiate a transport class: rather, you call loop methods that create the transport instance and the underlying channel, and provide the transport instance when done.
A protocol class is one supplied by asyncio to abstract various kinds of protocols (streaming, datagram-based, subprocess-pipes). Extend the appropriate one of those base classes, overriding the callback methods in which you want to perform some action (the base classes supply empty default implementations for such methods, so just don’t override methods for events you don’t care about). Then, pass your class as the protocol_factory argument to loop methods.
A protocol instance p always has an associated transport instance t, in 1-to-1 correspondence. As soon as the connection is established, loop calls p.connection_made(t): p must save t as an attribute of self, and may perform some initialization-setting method calls on t.
When the connection is lost or closed, loop calls p.connection_lost(exc), where exc is None to indicate a regular closing (typically via end-of-file, EOF), or else an Exception instance recording what error caused the connection to be lost.
Each of connection_made and connection_lost gets called exactly once on each protocol instance p. All other callbacks to p’s methods happen between those two calls; during such other callbacks, p gets informed by t about data or EOF being received, and/or asks t to send data out. All interactions between p and t occur via callbacks by each other on the other one’s methods.
Here is a protocol-based implementation of a client for the same simple echo protocol shown in “A Connection-Oriented Socket Client”. (Since asyncio exists only in v3, we have not bothered maintaining any compatibility with v2 in this example’s code.)

    import asyncio

    data = """A few lines of text
    including non-ASCII characters: €£
    to test the operation
    of both server
    and client."""

    class EchoClient(asyncio.Protocol):

        def __init__(self):
            self.data_iter = iter(data.splitlines())

        def write_one(self):
            # Send the next line, or signal EOF when no lines are left.
            chunk = next(self.data_iter, None)
            if chunk is None:
                self.transport.write_eof()
            else:
                line = chunk.encode()
                self.transport.write(line)
                print('Sent:', chunk)

        def connection_made(self, transport):
            # Save the transport, then start the conversation.
            self.transport = transport
            print('Connected to server')
            self.write_one()

        def connection_lost(self, exc):
            loop.stop()
            print('Disconnected from server')

        def data_received(self, data):
            # Each reply from the server triggers sending the next line.
            print('Recv:', data.decode())
            self.write_one()

    loop = asyncio.get_event_loop()
    echo = loop.create_connection(EchoClient, 'localhost', 8881)
    transport, protocol = loop.run_until_complete(echo)
    loop.run_forever()
    loop.close()

You wouldn’t normally bother using asyncio for such a simplistic client, one that is doing nothing beyond sending data to the server, receiving replies, and using print to show what’s happening. However, the purpose of the example is to show how to use asyncio (and, specifically, asyncio’s protocols) in a client (which would be handy if a client had to communicate with multiple servers and/or perform other nonblocking I/O operations simultaneously).
Nevertheless, this example, for conciseness, takes shortcuts (such as calling loop.stop when connection is lost) that would not be acceptable in high-quality production code. For a critique of simplistic echo examples, and a thoroughly productionized counterexample, see Łukasz Langa’s aioecho.
Similarly, here is a v3-only protocol-based server for the same (deliberately simplistic) echo functionality:

    import asyncio

    class EchoServer(asyncio.Protocol):

        def connection_made(self, transport):
            # Save the transport and remember who connected.
            self.transport = transport
            self.peer = transport.get_extra_info('peername')
            print('Connected from', self.peer)

        def connection_lost(self, exc):
            print('Disconnected from', self.peer)

        def data_received(self, data):
            # Echo back whatever bytes arrived.
            print('Recv:', data.decode())
            self.transport.write(data)
            print('Echo:', data.decode())

    loop = asyncio.get_event_loop()
    echo = loop.create_server(EchoServer, 'localhost', 8881)
    server = loop.run_until_complete(echo)
    print('Serving at', server.sockets[0].getsockname())
    loop.run_forever()

This server code has no intrinsic limits on how many clients at a time it can be serving, and the transport deals with any network fragmentation.
Fundamental operations of transports and protocols, as outlined in the previous section, rely on a callback paradigm. As mentioned in “Coroutine-Based Async Architectures”, this may make development harder when you need to fragment what could be a linear stream of code into multiple functions and methods; with asyncio
, you may partly finesse that by using coroutines, futures, and tasks for implementation—but there is no intrinsic connection between protocol instances and such tools, so you’d essentially be building your own. It’s reasonable to wish for a higher level of abstraction, focused directly on coroutines, for the same networking purposes you could use transports, protocols, and their callbacks for.
asyncio comes to the rescue by supplying coroutine-based streams, as documented online. Four convenience coroutine functions based on streams are directly supplied by asyncio: open_connection, open_unix_connection, start_server, and start_unix_server. While usable on their own, they’re mostly supplied as examples of how to best create streams, and the docs explicitly invite you to copy them into your code and edit them as necessary. You can easily find the source code, for example on GitHub. The same logic applies, and is explicitly documented in source code comments, to the stream classes themselves, StreamReader and StreamWriter; the classes wrap transports and protocols and supply coroutine methods where appropriate.
Here is a streams-based implementation of a client for the same simple echo protocol shown in “A Connection-Oriented Socket Client”, coded using the legacy (pre–Python 3.5) approach to coroutines:
import asyncio

data = """A few lines of data
including non-ASCII characters: €£
to test the operation
of both server
and client."""

@asyncio.coroutine
def echo_client(data):
    reader, writer = yield from asyncio.open_connection('localhost', 8881)
    print('Connected to server')
    for line in data:
        writer.write(line.encode())
        print('Sent:', line)
        response = yield from reader.read(1024)
        print('Recv:', response.decode())
    writer.close()
    print('Disconnected from server')

loop = asyncio.get_event_loop()
loop.run_until_complete(echo_client(data.splitlines()))
loop.close()
asyncio.open_connection eventually (via its coroutine-object immediate result, which yield from waits for) returns a pair of streams, reader and writer, and the rest is easy (with another yield from to get the eventual result of reader.read). The modern native (Python 3.5) kind of coroutine is no harder:
import asyncio

data = """A few lines of data
including non-ASCII characters: €£
to test the operation
of both server
and client."""

async def echo_client(data):
    reader, writer = await asyncio.open_connection('localhost', 8881)
    print('Connected to server')
    for line in data:
        writer.write(line.encode())
        print('Sent:', line)
        response = await reader.read(1024)
        print('Recv:', response.decode())
    writer.close()
    print('Disconnected from server')

loop = asyncio.get_event_loop()
loop.run_until_complete(echo_client(data.splitlines()))
loop.close()
The transformation needed in this case is purely mechanical: remove the decorator, change def to async def, and change yield from to await.
Similarly for streams-based servers—here’s one that uses legacy coroutines:
import asyncio

@asyncio.coroutine
def handle(reader, writer):
    address = writer.get_extra_info('peername')
    print('Connected from', address)
    while True:
        data = yield from reader.read(1024)
        if not data:
            break
        s = data.decode()
        print('Recv:', s)
        writer.write(data)
        yield from writer.drain()
        print('Echo:', s)
    writer.close()
    print('Disconnected from', address)

loop = asyncio.get_event_loop()
echo = asyncio.start_server(handle, 'localhost', 8881)
server = loop.run_until_complete(echo)
print('Serving on {}'.format(server.sockets[0].getsockname()))
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass
server.close()
loop.run_until_complete(server.wait_closed())
loop.close()
And here’s the equivalent using modern native coroutines, again with the same mechanical transformation:
import asyncio

async def handle(reader, writer):
    address = writer.get_extra_info('peername')
    print('Connected from', address)
    while True:
        data = await reader.read(1024)
        if not data:
            break
        s = data.decode()
        print('Recv:', s)
        writer.write(data)
        await writer.drain()
        print('Echo:', s)
    writer.close()
    print('Disconnected from', address)

loop = asyncio.get_event_loop()
echo = asyncio.start_server(handle, 'localhost', 8881)
server = loop.run_until_complete(echo)
print('Serving on {}'.format(server.sockets[0].getsockname()))
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass
server.close()
loop.run_until_complete(server.wait_closed())
loop.close()
The same considerations as in the previous section apply to these client and server examples: each client, as it stands, may be considered “overkill” for the very simple task it performs (but is useful to suggest how asyncio-based clients for more complex tasks would proceed); each server is intrinsically unbounded in the number of clients it can serve, and immune to any data fragmentation that might have occurred during network transmission.
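The streams-based client and server can also be exercised together in one process. The following is a minimal sketch under assumptions, not the book's own code: echo_lines is a hypothetical driver, port 0 asks the OS for a free port, asyncio.run and writer.wait_closed need Python 3.7 or later, and the client uses reader.readexactly (rather than read(1024)) so that each reply is complete even if the network delivers it in pieces.

```python
import asyncio

async def handle(reader, writer):
    # Echo every chunk back until the client closes its side.
    while True:
        data = await reader.read(1024)
        if not data:
            break
        writer.write(data)
        await writer.drain()
    writer.close()

async def echo_lines(lines):
    # Hypothetical driver: one server, one client, same event loop.
    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    replies = []
    for line in lines:
        payload = line.encode()
        writer.write(payload)
        await writer.drain()
        # Wait for exactly as many bytes as were sent, however fragmented.
        reply = await reader.readexactly(len(payload))
        replies.append(reply.decode())
    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return replies

replies = asyncio.run(echo_lines(['A few lines', 'non-ASCII: €£']))
print(replies)
```

Counting bytes with readexactly works here only because the protocol is a pure echo; a protocol with real messages would need its own framing.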
asyncio offers classes (to synchronize coroutines) that are very similar to the ones that the threading module offers (to synchronize threads), as covered in “Thread Synchronization Objects”, and queues similar to the thread-safe queues covered in “The queue Module”. This section documents the small differences between the synchronization and queue classes supplied by asyncio, and the classes with the same names covered in Chapter 14.
asyncio supplies the classes BoundedSemaphore, Condition, Event, Lock, and Semaphore, very similar to the classes of the same names covered in “Thread Synchronization Objects”, except of course that the classes in asyncio synchronize coroutines, not threads. The asyncio synchronization classes and their methods do not accept a timeout argument (use asyncio.wait_for instead to apply a timeout to a coroutine or asyncio future).
Several methods of asyncio synchronization classes (all that deal with acquiring or waiting for a synchronization class’s instance, and therefore could possibly block) are coroutine methods returning coroutine objects. Specifically, this applies to BoundedSemaphore.acquire, Condition.acquire, Condition.wait, Condition.wait_for, Event.wait, Lock.acquire, and Semaphore.acquire.
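As an illustration of these two differences (no timeout argument, and coroutine methods that must be awaited), here is a minimal sketch using Event and Lock. The names demo, setter, and log are hypothetical, the timeout values are arbitrary, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio

async def demo():
    event = asyncio.Event()
    lock = asyncio.Lock()
    log = []

    # Event.wait() accepts no timeout: wrap it in asyncio.wait_for instead.
    try:
        await asyncio.wait_for(event.wait(), timeout=0.05)
    except asyncio.TimeoutError:
        log.append('timed out')

    async def setter():
        await asyncio.sleep(0.01)
        event.set()

    # This time another coroutine sets the event before the timeout expires.
    task = asyncio.ensure_future(setter())
    await asyncio.wait_for(event.wait(), timeout=1.0)
    log.append('event set')
    await task

    # Lock.acquire() is a coroutine method; 'async with' awaits it for us.
    async with lock:
        log.append('lock held: {}'.format(lock.locked()))
    log.append('lock held: {}'.format(lock.locked()))
    return log

log = asyncio.run(demo())
print(log)
```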
asyncio supplies the class Queue, and its subclasses LifoQueue and PriorityQueue, very similar to the classes of the same names covered in “The queue Module”, except of course that the classes in asyncio queue data between coroutines, not between threads. The asyncio queue classes and their methods do not accept a timeout argument (use asyncio.wait_for instead to apply a timeout to a coroutine or asyncio future).
Several methods of asyncio queue classes (all that could possibly block) are coroutine methods, returning coroutine objects. Specifically, this applies to the methods get, join, and put, of the class Queue and of both subclasses.
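A minimal producer/consumer sketch may make the coroutine methods get, put, and join more concrete. The names producer, consumer, and demo are hypothetical, the queue size and timeout are arbitrary, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio

async def producer(queue, items):
    for item in items:
        await queue.put(item)      # suspends while the queue is full

async def consumer(queue, results):
    while True:
        item = await queue.get()   # suspends while the queue is empty
        results.append(item * 2)
        queue.task_done()

async def demo():
    queue = asyncio.Queue(maxsize=2)
    results = []
    worker = asyncio.ensure_future(consumer(queue, results))
    await producer(queue, [1, 2, 3, 4, 5])
    await queue.join()             # wait until every item has been processed
    worker.cancel()
    # get() accepts no timeout argument: apply one with asyncio.wait_for.
    try:
        await asyncio.wait_for(queue.get(), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return results, timed_out

results, timed_out = asyncio.run(demo())
print(results, timed_out)
```

Because the queue holds at most two items, the producer and the consumer alternate: each put that finds the queue full suspends until the consumer has taken an item.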
selectors is in Python’s standard library only in v3. However, you can install a v2 backport via pip2 install selectors34, then, for portability, code your import as:
try:
    import selectors
except ImportError:
    import selectors34 as selectors
selectors supports I/O multiplexing by picking whatever mechanism is best on your platform from a lower-level module, select; you do not need to know the low-level details of select, and therefore we do not cover select in this book.
The kinds of files that selectors can handle depend on your platform. Sockets (covered in Chapter 17) are supported on all platforms; on Windows, no other kinds of files are supported. On Unix-like systems, other kinds may be supported (to open a file in nonblocking mode, use the low-level call os.open, with the flag os.O_NONBLOCK, covered in Table 10-5).
Methods of selector classes return instances of the class SelectorKey, a named tuple that supplies four attributes:
data
Data associated with the file object when you registered it with the selector
events
Either of the constants covered in “selectors Events”, or their “bitwise or”
fd
The file descriptor on which the event occurred
fileobj
The file object registered with the selector class: can either be an int (in which case it’s equal to fd) or an object with a fileno() method (in which case fd equals fileobj.fileno()).
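To see the four attributes in practice, the following minimal sketch registers one end of a socket pair and inspects the resulting SelectorKey. The variable names and the 'right end' data item are illustrative assumptions; socket.socketpair is used only to get two connected sockets without any network setup.

```python
import selectors
import socket

left, right = socket.socketpair()
with selectors.DefaultSelector() as sel:
    # register returns the SelectorKey describing this registration
    key = sel.register(right, selectors.EVENT_READ, data='right end')
    same_object = key.fileobj is right
    fd_matches = key.fd == right.fileno()

    # make the registered socket readable, then ask the selector about it
    left.sendall(b'ping')
    ready_key, ready_events = sel.select(timeout=1.0)[0]

left.close()
right.close()
print(key.data, key.events, same_object, fd_matches)
```

The key paired with the ready events is equal to the one that register returned, so whatever was passed as data at registration time is available again when the file object becomes ready.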
Call selectors.DefaultSelector() to get an instance of the concrete subclass of the BaseSelector abstract class that’s optimal on your platform. (Don’t worry about what specific concrete subclass you’re getting; they all implement the same functionality.) Each such class is a context manager, and therefore suitable for use in a with statement to ensure that the selector is closed when you’re done with it:
with selectors.DefaultSelector() as sel:
    # use sel within this block, or in functions called within it
    ...
# here, after the block's done, sel is properly closed
A selector instance s supplies the following methods:
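close |
close() Frees all of s’s resources. Do not use s after calling s.close(). |
get_key |
get_key(fileobj) Returns the SelectorKey associated with fileobj. Raises KeyError when fileobj is not registered. |
get_map |
get_map() Returns a mapping with files as keys, and SelectorKey instances as values. |
modify |
modify(fileobj, events, data=None) Like s.unregister(fileobj) followed by s.register(fileobj, events, data), but possibly faster. Returns a new SelectorKey instance. Raises KeyError when fileobj is not registered. |
register |
register(fileobj, events, data=None) Registers a file object for selection, monitoring the requested events, and associating with it an arbitrary item of data. Returns a new SelectorKey instance. Raises KeyError when fileobj is already registered, and ValueError when fileobj or events is invalid. |
select |
select(timeout=None) Returns a list of pairs (sk, events), where sk is the SelectorKey of a file object that is ready and events is a bitmask of the events ready on it. When timeout is greater than 0, select waits at most that many seconds (possibly returning an empty list); when timeout is 0 or negative, select returns at once without blocking; when timeout is None, select blocks until at least one registered file object is ready. |
unregister |
unregister(fileobj) Deregisters file object fileobj from s and returns the SelectorKey associated with it. Raises KeyError when fileobj is not registered. |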
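The methods listed above can be tried on a socket pair, with no server needed. This is a minimal sketch under assumptions: the variable names and data items are illustrative, and it relies on a freshly connected socket being immediately writable (its send buffer is empty), which is why a nonblocking select reports a write event.

```python
import selectors
import socket

a, b = socket.socketpair()
sel = selectors.DefaultSelector()
try:
    sel.register(a, selectors.EVENT_READ, data='reader')
    # change the monitored events and the data in a single call
    sel.modify(a, selectors.EVENT_READ | selectors.EVENT_WRITE, data='both')
    events_after_modify = sel.get_key(a).events

    # timeout=0 polls without blocking; nothing was sent, so only
    # writability is reported for a
    ready_events = [events for _, events in sel.select(timeout=0)]
    n_registered = len(sel.get_map())

    # unregister returns the SelectorKey that was associated with a
    removed = sel.unregister(a)
    try:
        sel.get_key(a)
        still_registered = True
    except KeyError:
        still_registered = False
finally:
    sel.close()
    a.close()
    b.close()

print(ready_events, n_registered, removed.data, still_registered)
```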
selectors is quite low-level, but can sometimes come in handy when performance is key, particularly when the task to perform on the sockets is simple.
Here is a selectors-based example server for the simplistic echo protocol shown in “A Connection-Oriented Socket Client”. This example is fully coded to run equally well in v2 (as long as you’ve installed the selectors34 backport; see “The concurrent.futures Module”) and v3; it provides more informational prints than example “A Connection-Oriented Socket Server”, to better show the low-level mechanisms involved in communication with multiple clients “at once,” in byte chunks of whatever size optimizes all I/O operations:
from __future__ import print_function
import socket
try:
    import selectors
except ImportError:
    import selectors34 as selectors

ALL = selectors.EVENT_READ | selectors.EVENT_WRITE
# mapping socket -> bytes to send on that socket when feasible
data = {}
# set of sockets to be closed and unregistered when feasible
socks_to_close = set()

def accept(sel, data, servsock, unused_events):
    # accept a new connection, register it with the selector
    sock, address = servsock.accept()
    print('Connected from', address)
    sel.register(sock, ALL, handle)

def finish(sel, data, sock):
    # sock needs to be closed & unregistered if it's in the selector
    if sock in sel.get_map():
        ad = sock.getpeername()
        print('Disconnected from {}'.format(ad))
        sel.unregister(sock)
        sock.close()
    data.pop(sock, None)

def handle(sel, data, sock, events):
    # a client socket just became ready to write and/or read
    ad = sock.getpeername()
    if events & selectors.EVENT_WRITE:
        tosend = data.get(sock)
        if tosend:
            # send as much as the client's ready to receive
            nsent = sock.send(tosend)
            s = tosend.decode('utf-8', errors='replace')
            print(u'{} bytes sent to {} ({})'.format(nsent, ad, s))
            # bytes to be sent later, if any
            tosend = tosend[nsent:]
        if tosend:
            # more bytes to send later, keep track of them
            print('{} bytes remain for {}'.format(len(tosend), ad))
            data[sock] = tosend
        else:
            # no more bytes to send -> ignore EVENT_WRITE for now
            print('No bytes remain for {}'.format(ad))
            data.pop(sock, None)
            sel.modify(sock, selectors.EVENT_READ, handle)
    if events & selectors.EVENT_READ:
        # receive up to 1024 bytes at a time
        newdata = sock.recv(1024)
        if newdata:
            # add new data received as needing to be sent back
            s = newdata.decode('utf-8', errors='replace')
            print(u'Got {} bytes from {} ({})'.format(len(newdata), ad, s))
            data[sock] = data.get(sock, b'') + newdata
            sel.modify(sock, ALL, handle)
        else:
            # recv empty string means client disconnected
            socks_to_close.add(sock)

try:
    servsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    servsock.bind(('', 8881))
    servsock.listen(5)
    print('Serving at', servsock.getsockname())
    with selectors.DefaultSelector() as sel:
        sel.register(servsock, selectors.EVENT_READ, accept)
        while True:
            for sk, events in sel.select():
                sk.data(sel, data, sk.fileobj, events)
            while socks_to_close:
                sock = socks_to_close.pop()
                finish(sel, data, sock)
except KeyboardInterrupt:
    pass
finally:
    servsock.close()
selectors does not supply an event loop; the event loop needed in async programming, in the preceding example, is the simple loop using while True:.
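The same hand-written event loop pattern (store a callback as the data item at registration, then call it for each ready key) can be reduced to a few lines. The following minimal sketch is an illustration under assumptions: on_readable and received are hypothetical names, a socket pair stands in for a real network connection, and the loop ends once nothing is left registered instead of running forever.

```python
import selectors
import socket

received = []

def on_readable(sel, sock):
    # callback stored as the key's data: runs when sock is ready to read
    chunk = sock.recv(1024)
    if chunk:
        received.append(chunk)
    else:
        # empty read means the peer closed: stop monitoring this socket
        sel.unregister(sock)
        sock.close()

producer_end, consumer_end = socket.socketpair()
consumer_end.setblocking(False)

with selectors.DefaultSelector() as sel:
    sel.register(consumer_end, selectors.EVENT_READ, on_readable)
    producer_end.sendall(b'one')
    producer_end.close()
    # the "event loop": block in select, dispatch each ready key's callback
    while sel.get_map():
        for key, events in sel.select(timeout=1.0):
            key.data(sel, key.fileobj)

message = b''.join(received)
print(message)
```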
The low-level approach, as shown in this example, makes us write a lot of code compared to the simplicity of the task, a task that is made especially simple by the fact that the server does not need to examine or process incoming bytes (for most network jobs, incoming streams or chunks of bytes do need to be parsed into semantically meaningful messages). The payback comes in the high performance one can attain, with such a no-overhead approach, consuming minimal resources.
Moreover, it’s worth noticing (as is emphasized by the detailed informational prints) that this example does not share the limitations of “A Connection-Oriented Socket Server”: the code, as presented, can handle any number of clients, up to the limits of available memory, as well as arbitrary network fragmentation of data along the arbitrarily high number of simultaneous connections.
1 Pseudokeywords, currently: for backward compatibility, you can also still use async and await as normal identifiers, but don’t do that, as it makes your code less readable and in some future version it will become a syntax error.
2 However, you can’t await an arbitrary iterable object: only an awaitable object, such as a coroutine object or an asyncio.Future instance.
3 Note that __aiter__ must not be a coroutine, while all the other dunder methods in these sections with names starting with __a must be coroutines.
4 For some subtleties regarding event loops on Windows and event-loop backward compatibility with old releases of macOS, see the event loops page of the online docs; there are no such subtleties on other platforms.