Python’s distutils is part of the standard library, underlying the tools used for packaging and distributing Python programs and extensions. However, we don’t recommend that you use distutils directly: use newer third-party tools instead. In most cases, you’ll want to use setuptools and wheel to create wheels, then use twine to upload them to your favorite repository (usually PyPI, the Python Package Index, aka The Cheeseshop) and pip to download and install them. If you do not have pip installed (it comes with v2 and v3), see pip’s installation page. Be sure to use the latest version of pip, by running pip install --upgrade pip. On Windows, run py -m pip install -U pip setuptools and then make sure pip is in your system PATH, similar to ensuring Python is on your PATH.
In this chapter, we cover the simplest uses of setuptools and twine for the most common packaging needs. For more in-depth or advanced explanation, see the Python Packaging User Guide. At the time of writing, the new PEP 518 specifies protocols for distributing extensions and programs with other build tools, but these are not yet supported, so we do not cover them in this book.
If you are looking to create and distribute complicated cross-platform or cloud-based apps, you may wish to explore Docker containers; if you want a complete package manager, especially if you’re doing data science/engineering, consider conda (the base of both Miniconda and Anaconda); if you’d like a script installer that also manages your virtual environments, consider pipsi.
setuptools is a rich and flexible set of tools to package Python programs and extensions for distribution; we recommend that you use it in preference to the standard library’s distutils.
A distribution is the set of files to package into a single archive file for distribution purposes. A distribution may include Python packages and/or other Python modules (as covered in Chapter 6), as well as, optionally, Python scripts, C-coded (and other) extensions, data files, and auxiliary files with metadata. A distribution is said to be pure when all code it includes is Python, and nonpure when it includes non-Python code (usually, C-coded extensions). A distribution is universal when it is pure and can be used with either v2 or v3 without 2to3 conversion.
You usually place all the files of a distribution in a directory, known as the distribution root, and in subdirectories of the distribution root. Mostly, you can arrange the subtree of files and directories rooted at the distribution root to suit your needs. However, as covered in “Packages”, a Python package must reside in its own directory (unless you are creating namespace packages, covered in “Namespace Packages (v3 Only)”), and a package’s directory must contain a file named __init__.py (and subdirectories with __init__.py files for the package’s subpackages, if any) as well as other modules that belong to that package.
A distribution must include a setup.py script, and should include a README file (preferably in reStructuredText [.rst] format); it may also contain a requirements.txt, a MANIFEST.in, and a setup.cfg, covered in the following sections.
If you wish to test your package while still developing it, you may install it locally with pip install -e; the -e flag stands for “editable,” and all details are well explained in the online docs.
The distribution root directory must contain a Python script that by convention is named setup.py. The setup.py script can, in theory, contain arbitrary Python code. However, in practice, setup.py always boils down to some variation of this:
from setuptools import setup, find_packages
setup( many named arguments go here )
You should also import Extension if your setup.py deals with a nonpure distribution. All the action is in the parameters you supply in the call to setup. It is fine, of course, to have a few statements before the call to setup in order to arrange setup’s arguments in clearer and more readable ways than could be managed by having everything inline as part of the setup call. For example, a long_description string may be provided from its own separate file:
with open('./README.rst') as f:
    long_desc = f.read()
The setup function accepts only named arguments, and there are a large number of such arguments that you could potentially supply. Named arguments to setup fall primarily into three groups: metadata about the distribution, information about which files are in the distribution, and information about dependencies. A simple example is the setup.py for Flask (covered in “Flask”), showing some of this metadata:
""" __doc__ for long_description goes here; omitted. """
import re
import ast
from setuptools import setup
# Capture whatever follows "__version__ =" in the package's __init__.py
_version_re = re.compile(r'__version__\s+=\s+(.*)')
with open('flask/__init__.py', 'rb') as f:
    version = str(ast.literal_eval(_version_re.search(
        f.read().decode('utf-8')).group(1)))
setup(
    name='Flask',
    version=version,
    url='http://github.com/pallets/flask/',
    license='BSD',
    author='Armin Ronacher',
    author_email='[email protected]',
    description='A microframework based on Werkzeug, Jinja2 '
                'and good intentions',
    long_description=__doc__,
    packages=['flask', 'flask.ext'],
    include_package_data=True,
    zip_safe=False,
    platforms='any',
    install_requires=[
        'Werkzeug>=0.7',
        'Jinja2>=2.4',
        'itsdangerous>=0.21',
        'click>=2.0',
    ],
    classifiers=[
        'Development Status :: 4 - Beta',
        'Environment :: Web Environment',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: BSD License',
        'Operating System :: OS Independent',
        'Programming Language :: Python',
        'Programming Language :: Python :: 2',
        'Programming Language :: Python :: 2.6',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.3',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Topic :: Internet :: WWW/HTTP :: Dynamic Content',
        'Topic :: Software Development :: Libraries :: Python Modules'
    ],
    entry_points='''
        [console_scripts]
        flask=flask.cli:main
    '''
)
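The version-extraction idiom used in this setup.py can be tried on its own. The following is a minimal sketch, assuming the version is assigned as a plain string literal; the helper name extract_version and the sample text are hypothetical, chosen only for illustration.

```python
import ast
import re

# Same idea as in the Flask setup.py: capture whatever follows "__version__ =".
_version_re = re.compile(r'__version__\s+=\s+(.*)')


def extract_version(source_text):
    """Return the __version__ value found in source_text as a string."""
    match = _version_re.search(source_text)
    if match is None:
        raise ValueError('no __version__ assignment found')
    # literal_eval turns the quoted literal into a str without executing code.
    return str(ast.literal_eval(match.group(1)))


# Hypothetical contents of a package's __init__.py, for illustration only.
sample = "import os\n__version__ = '0.12.dev0'\n"
print(extract_version(sample))
```

Reading the version this way avoids importing the package from setup.py, which could fail when the package's own dependencies are not installed yet.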
Provide metadata about the distribution, by supplying some of the following named arguments when you call the setup function. The value you associate with each argument name you supply is a string, intended mostly to be human-readable; the specifications about the string’s format are mostly advisory. The explanations and recommendations about the metadata fields in the following list are also non-normative and correspond only to common, but not universal, conventions. Whenever the following explanations refer to “this distribution,” the phrase can be taken to refer to the material included in the distribution rather than to the packaging of the distribution. The following metadata arguments are required:
author
The name(s) of the author(s) of material included in this distribution. You should always provide this information: authors deserve credit for their work.
author_email
Email address(es) of the author(s) named in the argument author. You should usually provide this information; however, you may optionally direct people toward a maintainer and provide maintainer_email instead.
classifiers
A list of Trove strings to classify your package; each string must be one of those listed at List Classifiers on PyPI.
description
A concise description of this distribution, preferably fitting within one line of 80 characters or less.
long_description
A long description of this distribution, typically as provided in the README file, preferably in .rst format.
license
The licensing terms of this distribution, in a concise form that typically references the full license text included as another distributed file or available at a URL.
name
The name of this distribution as a valid Python identifier (see PEP 426 #name for criteria). If you plan to upload your project to PyPI, this name must not conflict with any project already in the PyPI database.
url
A URL at which more information can be found about this distribution, or None if no such URL exists.
version
The version of this distribution, normally structured as major.minor or even more finely. See PEP 440 for recommended versioning schemes.
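One reason to keep versions in a structured major.minor form is that tools compare them numerically, not as text. The sketch below assumes purely numeric, dot-separated versions; the helper release_tuple is hypothetical and does not handle the pre-release, post-release, or development suffixes that PEP 440 also defines.

```python
def release_tuple(version):
    """Convert a simple 'major.minor[.micro]' string into a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


versions = ['0.9', '0.10', '0.10.1', '1.0']

# A plain string sort puts '0.10' before '0.9', which is not what you want.
print(sorted(versions))

# Sorting on the numeric tuples gives the intended release order.
print(sorted(versions, key=release_tuple))
```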
The following optional arguments may also be provided, if appropriate:
keywords
A list of strings that would likely be searched for by somebody looking for the functionality provided by this distribution. You should provide this information so users can find your package on PyPI or other search engines.
maintainer
The name(s) of the current maintainer(s) of this distribution. You should provide this information when the maintainer is different from the author.
maintainer_email
Email address(es) of the maintainer(s) named in argument maintainer. You should provide this information only when you supply the maintainer argument and the maintainer is willing to receive email about this work.
platforms
A list of platforms on which this distribution is known to work. You should provide this information when you have reason to believe this distribution may not work everywhere. This information should be reasonably concise, so the field often references information at a URL or in another distributed file.
A distribution can contain a mix of Python source files, C-coded extensions, and data files. setup accepts optional named arguments that detail which files to put in the distribution. Whenever you specify file paths, the paths must be relative to the distribution root directory and use / as the path separator. setuptools adapts location and separator appropriately when it installs the distribution. Wheels, in particular, do not support absolute paths: all paths are relative to the top-level directory of your package.
The named arguments packages and py_modules do not list file paths, but rather Python packages and modules, respectively. Therefore, in the values of these named arguments, don’t use path separators or file extensions. If you list subpackage names in argument packages, use Python dot syntax instead (e.g., top_package.sub_package).
By default, setup looks for Python modules (listed in the value of the named argument py_modules) in the distribution root directory, and for Python packages (listed in the value of the named argument packages) as subdirectories of the distribution root directory.
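To make the mapping from directories to dotted package names concrete, here is a minimal sketch that walks a distribution root and reports every directory containing an __init__.py file. It is a simplified stand-in written for this illustration, not the implementation of find_packages; the function names and the demo directory layout are hypothetical.

```python
import os
import tempfile


def discover_packages(root):
    """Return sorted dotted names of packages found under root."""
    names = []
    for dirpath, dirnames, filenames in os.walk(root):
        if os.path.abspath(dirpath) == os.path.abspath(root):
            continue  # the distribution root itself is not a package
        if '__init__.py' not in filenames:
            dirnames[:] = []  # not a package, so do not descend further
            continue
        relative = os.path.relpath(dirpath, root)
        # Package names use dots, never path separators or file extensions.
        names.append(relative.replace(os.sep, '.'))
    return sorted(names)


def make_demo_tree(root):
    """Create a hypothetical layout: top_package/sub_package plus docs/."""
    for parts in (('top_package',), ('top_package', 'sub_package')):
        path = os.path.join(root, *parts)
        os.makedirs(path)
        open(os.path.join(path, '__init__.py'), 'w').close()
    os.makedirs(os.path.join(root, 'docs'))  # no __init__.py: not a package


demo_root = tempfile.mkdtemp()
make_demo_tree(demo_root)
print(discover_packages(demo_root))
```

The resulting names are in the form expected by the packages argument, while docs is skipped because it has no __init__.py.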
Here are the setup named arguments you will most frequently use to detail which Python source files are part of the distribution:
entry_points
The [...]
packages
You can import and use [...]
py_modules
For each module name string [...]
entry_points are a way to tell the installer (usually pip) to register plug-ins, services, or scripts with the OS and, if appropriate, to create a platform-specific executable. The primary entry_points group arguments used are console_scripts (replaces the named argument scripts, which is deprecated) and gui_scripts. Other plug-ins and services (e.g., parsers) are also supported, but we do not cover them further in this book; see the Python Packaging User Guide for more detailed information.
When pip installs a package, it registers each entry point name with the OS and creates an appropriate executable (including an .exe launcher on Windows), which you can then run by simply entering name at the terminal prompt, rather than, for example, having to type python -m mymodule.
Scripts are Python source files that are meant to be run as main programs (see “The Main Program”), generally from the command line. Each script file should have as its first line a shebang line (that is, a line starting with #! and containing the substring python). In addition, each script should end with the following code block:
if __name__ == '__main__':
    mainfunc()
To have pip install your script as an executable, list the script in entry_points under console_scripts (or gui_scripts, as appropriate). In addition to, or instead of, the main function of your script, you can use entry_points to register other functions as script interfaces. Here’s what entry_points with both console_scripts and gui_scripts defined might look like:
entry_points={
    'console_scripts': [
        'example=example:mainfunc',
        'otherfunc=example:anotherfunc',
    ],
    'gui_scripts': [
        'mygui=mygui.gui_main:run',
    ],
},
After installation, type example at the terminal prompt to execute mainfunc in the module example. If you type otherfunc, the system executes anotherfunc, also in the module example.
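Each console_scripts string has the form name=module:function. The following sketch shows one way such a string could be resolved to a callable; it is a simplified illustration, not the code pip or setuptools actually runs. The function resolve_entry_point and the spec 'tojson=json:dumps' are made up for the example, pointing at the standard library's json module so that the sketch runs anywhere.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name=module:attr' and return (name, the object it refers to)."""
    name, separator, target = spec.partition('=')
    if not separator:
        raise ValueError('expected "name=module:function", got %r' % spec)
    module_name, _, attribute_path = target.strip().partition(':')
    resolved = importlib.import_module(module_name.strip())
    if attribute_path:
        # Follow dotted attribute paths such as "SomeClass.method".
        for attribute in attribute_path.strip().split('.'):
            resolved = getattr(resolved, attribute)
    return name.strip(), resolved


# Hypothetical entry point: the command "tojson" would call json.dumps.
command, function = resolve_entry_point('tojson=json:dumps')
print(command, function([1, 2, 3]))
```

The generated executable does essentially this lookup at startup and then calls the function, which is why the target must be importable after installation.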
To put files of any kind in the distribution, supply the following named arguments. In most cases, you’ll want to use package_data to list your data files. The named argument data_files is used for listing files that you want to install to directories outside your package; however, we do not recommend you use it, due to complicated and inconsistent behavior, as described here:
data_files
The value of named argument [...] At installation time, installing from a wheel places each target directory as a subdirectory of Python’s [...]
package_data
The value of named argument [...]
To put C-coded extensions in the distribution, supply the following named argument:
ext_modules
All the details about each extension are supplied as arguments when instantiating the setuptools.Extension class. Extension’s constructor accepts two mandatory arguments and many optional named arguments. The simplest possible example looks something like this:
ext_modules=[Extension('x', sources=['x.c'])]
The Extension class constructor is:
Extension(name, sources, **kwds)
The Extension class also supports other file extensions besides .c, indicating other languages you may use to code Python extensions. On platforms having a C++ compiler, the file extension .cpp indicates C++ source files. Other file extensions that may be supported, depending on the platform and on various add-ons to setuptools, include .f for Fortran, .i for SWIG, and .pyx for Cython files. See “Extending Python Without Python’s C API” for information about using different languages to extend Python.
In most cases, your extension needs no further information besides mandatory arguments name and sources. Note that you need to list any .h headers in your MANIFEST.in file. setuptools performs all that is necessary to make the Python headers directory and the Python library available for your extension’s compilation and linking, and provides whatever compiler or linker flags or options are needed to build extensions on a given platform.
When additional information is required to compile and link your extension correctly, you can supply such information via the named arguments of the class Extension. Such arguments may potentially interfere with the cross-platform portability of your distribution. In particular, whenever you specify file or directory paths as the values of such arguments, the paths should be relative to the distribution root directory. However, when you plan to distribute your extensions to other platforms, you should examine whether you really need to provide build information via named arguments to Extension. It is sometimes possible to bypass such needs by careful coding at the C level.
Here are the named arguments that you may pass when calling Extension:
define_macros = [(macro_name, macro_value) ...]
Each of the items macro_name and macro_value is a string, respectively the name and value of a C preprocessor macro definition, equivalent in effect to the C preprocessor directive: #define macro_name macro_value. macro_value can also be None, to get the same effect as the C preprocessor directive: #define macro_name.
extra_compile_args = [list of compile_arg strings]
Each of the strings listed as the value of extra_compile_args is placed among the command-line arguments for each invocation of the C compiler.
extra_link_args = [list of link_arg strings]
Each of the strings listed as the value of extra_link_args is placed among the command-line arguments for the linker.
extra_objects = [list of object_name strings]
Each of the strings listed as the value of extra_objects names an object file to link in. Do not specify the file extension as part of the object name: distutils adds the platform-appropriate file extension (such as .o on Unix-like platforms and .obj on Windows) to help you keep cross-platform portability.
include_dirs = [list of directory_path strings]
Each of the strings listed as the value of include_dirs identifies a directory to supply to the compiler as one where header files are found.
libraries = [list of library_name strings]
Each of the strings listed as the value of libraries names a library to link in. Do not specify the file extension or any prefix as part of the library name: distutils, in cooperation with the linker, adds the platform-appropriate file extension and prefix (such as .a, and a prefix lib, on Unix-like platforms, and .lib on Windows) to help you keep cross-platform portability.
library_dirs = [list of directory_path strings]
Each of the strings listed as the value of library_dirs identifies a directory to supply to the linker as one where library files are found.
runtime_library_dirs = [list of directory_path strings]
Each of the strings listed as the value of runtime_library_dirs identifies a directory where dynamically loaded libraries are found at runtime.
undef_macros = [list of macro_name strings]
Each of the strings macro_name listed as the value of undef_macros is the name for a C preprocessor macro definition, equivalent in effect to the C preprocessor directive: #undef macro_name.
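To see how define_macros and undef_macros relate to #define and #undef, the sketch below renders them as the -D and -U options that Unix-style C compilers conventionally accept. It is only an illustration of the equivalence, not the code the build tools run, and the helper name and macro names (NDEBUG, BUFFER_SIZE, DEBUG_TRACE) are hypothetical.

```python
def preprocessor_flags(define_macros=(), undef_macros=()):
    """Render macro settings as Unix-style compiler command-line options."""
    flags = []
    for name, value in define_macros:
        if value is None:
            flags.append('-D' + name)                    # like: #define name
        else:
            flags.append('-D{}={}'.format(name, value))  # like: #define name value
    for name in undef_macros:
        flags.append('-U' + name)                        # like: #undef name
    return flags


# Hypothetical values you might pass as named arguments to Extension.
extension_options = {
    'define_macros': [('NDEBUG', None), ('BUFFER_SIZE', '4096')],
    'undef_macros': ['DEBUG_TRACE'],
}
print(preprocessor_flags(**extension_options))
```

Keeping such settings in define_macros and undef_macros, rather than hardcoding compiler options in extra_compile_args, leaves the choice of flag syntax to the build tools on each platform.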
You may optionally list dependencies with named arguments in setup.py or in a requirements file (see “The requirements.txt File”):
install_requires
extras_require
You may optionally provide a requirements.txt file. If provided, it lists requirement specifiers (the same arguments you would pass to pip install), one per line. This file is particularly useful for re-creating a particular environment for installation, or forcing the use of certain versions of dependencies.
When the user enters the following at a command prompt, pip installs all items listed in the file; however, installation is not guaranteed to be in any particular order:
pip install -r requirements.txt
pip only automatically discovers and installs dependencies listed in install_requires. extras_require entries must be manually opted in to at installation time, or be used in the install_requires of another package’s setup.py (e.g., your package is being used as a library by others). requirements.txt must be manually run by the user. For detailed usage, see the pip docs.
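As a rough illustration of what a requirements.txt file holds, the following sketch pulls the requirement specifiers out of such a file's text, skipping blank lines, comments, and option lines. It handles only these simple cases, not the full format pip accepts; the function name, package pins, and file contents are hypothetical.

```python
def read_requirements(text):
    """Return the requirement specifiers found in requirements-style text."""
    requirements = []
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()   # drop comments (simplified)
        if not line or line.startswith('-'):   # skip blanks and pip options
            continue
        requirements.append(line)
    return requirements


# Hypothetical file contents, for illustration only.
sample_requirements = """\
# pinned versions for a reproducible environment
Werkzeug==0.11.11
Jinja2>=2.4   # templating

-r other-requirements.txt
click>=2.0
"""
print(read_requirements(sample_requirements))
```

Each remaining string has the same form as an entry in install_requires, which is why the two are easy to confuse; they serve different purposes, as described above.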
When you package your source distribution, setuptools by default inserts the following files in the distribution:
All Python (.py) and C source files explicitly listed in packages or found by find_packages in setup.py
Files listed in package_data and data_files in setup.py
Scripts or plug-ins defined in entry_points in setup.py
Test files, located at test/test*.py under the distribution root directory, unless excluded in find_packages
Files README.rst or README.txt (if any), setup.cfg (if any), and setup.py
To add yet more files in the source distribution, place in the distribution root directory a manifest template file named MANIFEST.in, whose lines are rules, applied sequentially, about files to add (include) or subtract (prune) from the list of files to place in the distribution. See the Python docs for more info. If you have any C extensions in your project (listed in setup.py named argument ext_modules), the path to any .h header files must be listed in MANIFEST.in to ensure the headers are included, with a line like include dir/*.h, where dir is a relative path to the headers.
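Because manifest rules are applied sequentially, their order matters. The sketch below is a simplified model of that behavior, assuming just two actions (include and exclude) and shell-style patterns matched with the standard library's fnmatch module; it is not the setuptools implementation, and the file names are hypothetical. Note that, unlike real manifest patterns, fnmatch lets * match path separators too.

```python
import fnmatch


def apply_rules(candidates, rules, initial=()):
    """Apply (action, pattern) rules in order to the initial file list."""
    selected = list(initial)
    for action, pattern in rules:
        if action == 'include':
            for path in candidates:
                if fnmatch.fnmatchcase(path, pattern) and path not in selected:
                    selected.append(path)
        elif action == 'exclude':
            selected = [path for path in selected
                        if not fnmatch.fnmatchcase(path, pattern)]
        else:
            raise ValueError('unknown action: %r' % (action,))
    return selected


# Hypothetical project files; the first two are included by default.
project_files = ['setup.py', 'src/x.c', 'src/x.h', 'src/private.h']
defaults = ['setup.py', 'src/x.c']

rules = [('include', 'src/*.h'), ('exclude', 'src/private*')]
print(apply_rules(project_files, rules, defaults))

# Reversing the rules changes the outcome: the exclusion runs too early.
print(apply_rules(project_files, list(reversed(rules)), defaults))
```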
setup.cfg supplies appropriate defaults for options to build-time commands. For example, you can add the following lines to setup.cfg to always build universal wheels (see “Creating wheels”):
[bdist_wheel]
universal=1
Once you have your setup.py (and other files) in order, distributing your package requires the following steps:
1. Create (“package up”) the distribution into a wheel or other archive format.
2. Register your package, if necessary, to a repository.
3. Upload your package to a repository.
In the past, a packaged “source” distribution (sdist) made with python setup.py sdist was the most useful file you could produce with distutils. When you are distributing packages with C extensions for flavors of Linux, you still want to create an sdist. And when you absolutely require absolute paths (rather than relative paths) for installation of certain files (listed in the data_files argument to setup.py), you need to use an sdist. (See the discussion on data_files in the Python Packaging User Guide.) But when you are packaging pure Python, or platform-dependent C extensions for macOS or Windows, you can make life much easier for most users by also creating “built” wheels of your distribution.
Wheels are the new, improved way to package up your Python modules and packages for distribution, replacing the previously favorite packaging form known as eggs. Wheels are considered “built” distributions, meaning your users don’t have to go through a “build” step in order to install them. Wheels provide for faster updates than eggs and may be multiplatform compatible because they do not include .pyc files. In addition, your users don’t need a compiler for C extensions on Mac and Windows machines. Finally, wheels are well-supported by PyPI and preferred by pip.
To build wheels, install the wheel package by running the following command on a terminal: pip install wheel.
Wheels are considered pure if they only contain Python code, and nonpure if they contain extensions coded in C (or other programming languages). Pure wheels can be universal if the code can be run with either v2 or v3 without needing 2to3 conversion, as covered in “v2 source with conversion to v3”. Universal wheels are cross-platform (but beware: not all Python built-ins work in exactly the same way on all platforms). Version-specific, pure Python wheels are also cross-platform but require a particular version of Python. When your pure distribution requires 2to3 conversion (or has different v2 and v3 requirements, e.g., setup.py arguments specified in install_requires), you need to create two wheels, one for v2 and one for v3.
For a pure distribution, supplying wheels is just a matter of convenience for the users. For a nonpure distribution, making built forms available may be more than just an issue of convenience. A nonpure distribution, by definition, includes code that is not pure Python—generally, C code. Unless you supply a built form, users need to have the appropriate C compiler installed in order to build and install your distribution. In addition, installing a source distribution may be rather intricate, particularly for end users who may not be experienced programmers. It is therefore recommended to provide both an sdist, in .tar.gz format, and a wheel (or several) for nonpure packages. For that, you need to have the necessary C compiler installed. Nonpure wheels work only on other computers with the same platform (e.g., macOS, Windows) and architecture (e.g., 32-bit, 64-bit) as they were built on.
In order to create a wheel, in many cases, all you need to run is a single line. For a pure (Python only), universal (works on both v2 and v3 without conversion) distribution, just type the following at your distribution’s top-level directory:
python setup.py bdist_wheel --universal
This creates a wheel that can be installed on any platform. When you are creating a pure package that works on both v2 and v3 but requires 2to3 conversion (or has different arguments in setup.py to, for example, install_requires for v2 and v3), create two wheels, one for v2 and one for v3, as follows:
python2 setup.py bdist_wheel
python3 setup.py bdist_wheel
For a pure package that can only work on v2 or only on v3, use only the appropriate Python version to create the single wheel. Version-specific pure wheels work on any platform with the appropriate version of Python installed.
bdist_wheel --universal doesn’t detect whether your package is nonpure or version-specific. In particular, if your package contains C-extensions, you shouldn’t create a universal wheel: pip would override your platform wheel or sdist to install the universal wheel instead.
Nonpure wheels can be created for macOS or for Windows, but only for the platform being used to create them. Running python setup.py bdist_wheel (without the --universal flag) automatically detects the extension(s) and creates the appropriate wheel. One benefit of creating a wheel is that your users are able to pip install the package, regardless of whether they have a C compiler themselves, as long as they’re running on the same platform as you used to create the wheel.
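Since a nonpure wheel is tied to the platform and architecture it was built on, it can help to check what the building interpreter reports. The following minimal sketch uses only standard library calls; the helper name is hypothetical, and the exact platform tag written into a wheel is computed by the wheel tooling, so it may not match this string character for character.

```python
import struct
import sysconfig


def build_platform_summary():
    """Describe the platform and pointer size of the running interpreter."""
    return {
        'platform': sysconfig.get_platform(),      # OS and architecture string
        'pointer_bits': struct.calcsize('P') * 8,  # 32 or 64 on common builds
    }


print(build_platform_summary())
```

Running this on the build machine and on a target machine gives a quick first check of whether a nonpure wheel built on one is likely to be usable on the other.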
Unfortunately, distributing nonpure Linux wheels isn’t quite as simple as distributing pure ones, due to variations among Linux distributions, as described in PEP 513. PyPI does not accept nonpure Linux wheels unless they are tagged manylinux. The complex process to create these cross-distribution wheels currently involves Docker images and auditwheel; check out the manylinux docs for more information.
Once you run python setup.py bdist_wheel, you will have a wheel named something like mypkg-0.1-py2.py3-none-any.whl in a (new, if you’ve run setup.py for the first time) directory called dist/. For more information on wheel naming and tagging conventions, see PEP 425. To check which files have been inserted after building the wheel, without installing or uncompressing it, you can use:
unzip -l dist/mypkg-0.1-py2.py3-none-any.whl
because a wheel file is just a zip archive with appropriate metadata.
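The same inspection can be done from Python, which also works where unzip is not available. The sketch below reads the tags from a wheel's filename and lists its contents with the standard library's zipfile module. The helper names are hypothetical, and the archive built here is a stand-in created only so the example runs without a real wheel; it is not a valid installable wheel.

```python
import os
import tempfile
import zipfile


def wheel_tags(filename):
    """Split a wheel filename into its python, ABI, and platform tags."""
    if not filename.endswith('.whl'):
        raise ValueError('not a wheel filename: %r' % filename)
    parts = filename[:-len('.whl')].split('-')
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {'python': python_tag, 'abi': abi_tag, 'platform': platform_tag}


def is_universal(filename):
    """True for pure wheels tagged for both v2 and v3 on any platform."""
    tags = wheel_tags(filename)
    pythons = tags['python'].split('.')
    return ('py2' in pythons and 'py3' in pythons
            and tags['abi'] == 'none' and tags['platform'] == 'any')


def list_wheel(path):
    """Return the names of the files stored in the wheel at path."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Build a stand-in archive with hypothetical contents.
demo_dir = tempfile.mkdtemp()
demo_wheel = os.path.join(demo_dir, 'mypkg-0.1-py2.py3-none-any.whl')
with zipfile.ZipFile(demo_wheel, 'w') as archive:
    archive.writestr('mypkg/__init__.py', "__version__ = '0.1'\n")
    archive.writestr('mypkg-0.1.dist-info/METADATA', 'Name: mypkg\nVersion: 0.1\n')

print(list_wheel(demo_wheel))
print(is_universal(os.path.basename(demo_wheel)))
```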
To create an sdist (source distribution) for your project, type the following in the top level of your package directory:
python setup.py sdist
This creates a .tar.gz file (if it creates a .zip on Windows, you may need to use --formats to specify the archive format).
Do not attempt to upload both a .zip and a .tar.gz file to PyPI; you’ll get an error. Instead, stick with .tar.gz for most use cases.
Your .tar.gz file may then be uploaded to PyPI or otherwise distributed. Your users will have to unpack and install with, typically, python setup.py install. More information on source distributions is available in the online docs.
Once you’ve created a wheel or an sdist, you may choose to upload it to a repository for easy distribution to your users. You can upload to a local repository, such as a company’s private repository, or you may upload to a public repository such as PyPI. In the past, setup.py was used to build and immediately upload; however, due to issues with security, this is no longer recommended. Instead, you should use a third-party module such as twine (or Flit for extremely simple packages, as covered in the Flit docs). There are plans to eventually merge twine into pip: check the Python Packaging User Guide for updated information.
Using twine is fairly straightforward. Run pip install twine to download it. You’ll need to create a ~/.pypirc file (which provides repository information), and then it’s a simple command to register or upload your package.
Twine recommends that you have a ~/.pypirc file (that is, a file named .pypirc, residing in your home directory) that lists information about the repositories you want to upload to, including your username and password. Since .pypirc is typically stored “in clear,” you should set permissions to 600 (on Unix-like systems), or leave the password off and be prompted for it each and every time you run twine. The file should look something like this:
[distutils]
index-servers=
    testpypi
    pypi
    warehouse
    myrepo
[testpypi]
repository=https://testpypi.python.org/pypi
username=yourusername
password=yourpassword
[pypi]
repository=https://pypi.python.org/pypi
username=yourusername
password=yourpassword
[warehouse]
repository=https://upload.pypi.org/legacy/
username=yourusername
password=yourpassword
[myrepo]
repository=https://otherurl/myrepo
username=yourusername
password=yourpassword
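Since .pypirc uses the familiar INI layout, you can check it with the standard library's configparser module before handing it to twine. The sketch below, which assumes Python 3, parses a trimmed-down sample and maps each listed server name to its repository URL. The helper name and sample are hypothetical; note that the names under index-servers must be indented so that they are read as continuation lines of that setting.

```python
import configparser

# Trimmed-down, hypothetical sample; passwords deliberately left out.
SAMPLE_PYPIRC = """\
[distutils]
index-servers=
    testpypi
    pypi

[testpypi]
repository=https://testpypi.python.org/pypi
username=yourusername

[pypi]
repository=https://pypi.python.org/pypi
username=yourusername
"""


def read_index_servers(text):
    """Map each server named in [distutils] to its repository URL."""
    # RawConfigParser avoids '%' interpolation, which could trip on passwords.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    names = parser.get('distutils', 'index-servers').split()
    return {name: parser.get(name, 'repository') for name in names}


for name, url in sorted(read_index_servers(SAMPLE_PYPIRC).items()):
    print(name, url)
```

A server listed under index-servers without its own section raises configparser.NoSectionError here, which is a cheap way to catch typos in the file.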
If you’ve never used PyPI, create a user account with a username and password. (You may do this during the register step, but it’s best to do it online at PyPI.) You should also create a user account on testpypi, to practice uploading packages to a temporary repository before uploading them to the public repository.
If you are uploading your package to PyPI, you may need to register the package before you upload it the first time. Use the appropriate command to specify the wheel or sdist you’re registering (simply using dist/* does not work):
twine register -r repo dist/mypkg.whl # for a wheel
or:
twine register -r repo dist/mydist.tar.gz # for an sdist
repo is the particular repository where you’re registering your package, as listed in your ~/.pypirc file. Alternatively, you may provide the repository URL on the command line with the flag --repository-url.
Warehouse is the new backend to PyPI. Registration will no longer be required. You can add it to your ~/.pypirc now, as shown in the preceding example. Running twine without -r should upload to the correct version of PyPI by default.
Once your package is registered, you may upload it with the following command at the top level of your package; twine finds and uploads the latest version of your distribution to the repo specified in your .pypirc (or alternatively, you may provide the repository URL on the command line with flag --repository-url):
twine upload -r repo dist/*
At this point, your users will be able to use pip install yourpkg --user (or just pip install into a virtual environment; it’s wiser for them to avoid a nonvenv pip install, which affects the whole Python installation for all users) to install your package.
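Users who are unsure whether a plain pip install would land in a virtual environment can check from Python first. This is a minimal sketch relying on attributes of the sys module; the function name is hypothetical, and the check covers environments made with the standard venv module and with older releases of virtualenv, not every possible isolation tool.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    if hasattr(sys, 'real_prefix'):
        return True  # attribute set by older virtualenv releases
    # venv makes sys.prefix differ from sys.base_prefix (Python 3.3 and later).
    return getattr(sys, 'base_prefix', sys.prefix) != sys.prefix


if in_virtual_environment():
    print('pip install will affect only this environment')
else:
    print('consider pip install --user, or create a virtual environment first')
```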
The Python Packaging Authority (PyPA) continues to improve Python packaging and distribution. Please join in the effort by contributing to the codebase or documentation, posting bug reports, or joining in the discussions on packaging.