The Python 3 Standard Library by Exa. Chapter 19
Modules and Packages
Python’s primary extension mechanism uses source code saved to modules and incorporated
into a program through the import statement. The features that most developers think of
as “Python” are actually implemented as the collection of modules called the Standard
Library, the subject of this book. Although the import feature is built into the interpreter
itself, the library also includes several modules related to the import process.
The importlib (page 1329) module exposes the underlying implementation of the import mechanism used by the interpreter. It can be used to import modules dynamically at
runtime, instead of using the import statement to load them during start-up. Dynamically
loading modules is useful when the name of a module that needs to be imported is not
known in advance, such as for plug-ins or extensions to an application.
Python packages can include supporting resource files such as templates, default configuration files, images, and other data, along with source code. The interface for accessing
resource files in a portable way is implemented in the pkgutil (page 1334) module. It also
includes support for modifying the import path for a package, so that the contents can be
installed into multiple directories but appear as part of the same package.
zipimport (page 1344) provides a custom importer for modules and packages saved to
ZIP archives. It is used to load Python EGG files, for example, and can also be used as a
convenient way to package and distribute an application.
19.1 importlib: Python’s Import Mechanism
The importlib module includes functions that implement Python’s import mechanism for
loading code in packages and modules. It is one access point to importing modules dynamically, and is useful in some cases where the name of the module that needs to be
imported is unknown when the code is written (for example, for plug-ins or extensions to
an application).
19.1.1 Example Package
The examples in this section use a package called example with __init__.py.
Listing 19.1: example/__init__.py
print('Importing example package')
This package also contains submodule.py.
Listing 19.2: example/submodule.py
print('Importing submodule')
Watch for the text from the print() calls in the sample output when the package or
module is imported.
19.1.2 Module Types
Python supports several styles of modules. Each requires its own handling when opening the
module and adding it to the namespace, and support for the formats varies by platform.
For example, under Microsoft Windows, shared libraries are loaded from files with the
extensions .dll and .pyd, instead of .so. The extensions for C modules may also change
when using a debug build of the interpreter instead of a normal release build, since they
can be compiled with debug information included as well. If a C extension library or other
module is not loading as expected, use the constants defined in importlib.machinery to find
the supported types for the current platform, as well as the parameters for loading them.
Listing 19.3: importlib_suffixes.py
import importlib.machinery

SUFFIXES = [
    ('Source:', importlib.machinery.SOURCE_SUFFIXES),
    ('Debug:',
     importlib.machinery.DEBUG_BYTECODE_SUFFIXES),
    ('Optimized:',
     importlib.machinery.OPTIMIZED_BYTECODE_SUFFIXES),
    ('Bytecode:', importlib.machinery.BYTECODE_SUFFIXES),
    ('Extension:', importlib.machinery.EXTENSION_SUFFIXES),
]


def main():
    tmpl = '{:<10} {}'
    for name, value in SUFFIXES:
        print(tmpl.format(name, value))


if __name__ == '__main__':
    main()
Each constant is a list of the filename suffix strings recognized for one kind of
module. The following output is incomplete, because some of the importable module or
package types do not correspond to single files.
$ python3 importlib_suffixes.py
Source: ['.py']
Debug: ['.pyc']
Optimized: ['.pyc']
Bytecode: ['.pyc']
Extension: ['.cpython-35m-darwin.so', '.abi3.so', '.so']
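The separate constants can also be read as one combined list. The following is a minimal sketch, not part of the original listings, that uses importlib.machinery.all_suffixes() to decide whether a filename looks like an importable module; the helper name module_name_from_filename() is hypothetical.

```python
import importlib.machinery

# all_suffixes() returns every suffix the standard import machinery
# recognizes on this interpreter: source, bytecode, and extension.
suffixes = importlib.machinery.all_suffixes()


def module_name_from_filename(filename):
    """Return the module name for filename, or None if no suffix matches."""
    # Try longer suffixes first so a platform-tagged suffix wins over '.so'.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if filename.endswith(suffix):
            return filename[:-len(suffix)]
    return None


print(suffixes)
print(module_name_from_filename('submodule.py'))
print(module_name_from_filename('notes.txt'))
```

The exact contents of the list depend on the platform and interpreter build, just as the output above does.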
19.1.3 Importing Modules
The high-level API in importlib simplifies the process of importing a module given an absolute or relative name. When using a relative module name, specify the package containing
the module as a separate argument.
Listing 19.4: importlib_import_module.py
import importlib
m1 = importlib.import_module('example.submodule')
print(m1)
m2 = importlib.import_module('.submodule', package='example')
print(m2)
print(m1 is m2)
The return value from import_module() is the module object that was created by the import.
$ python3 importlib_import_module.py
Importing example package
Importing submodule
<module 'example.submodule' from '.../example/submodule.py'>
<module 'example.submodule' from '.../example/submodule.py'>
True
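To see how a relative name and its package argument combine, without importing anything, importlib.util.resolve_name() can be used. This is a small sketch; the package names are only strings here (example.sub and other are hypothetical), so the packages do not need to exist.

```python
import importlib.util

# One leading dot means "inside the given package".
absolute = importlib.util.resolve_name('.submodule', 'example')
print(absolute)

# Each extra leading dot climbs one package level before resolving.
sibling = importlib.util.resolve_name('..other', 'example.sub')
print(sibling)

# An absolute name is returned unchanged.
print(importlib.util.resolve_name('example.submodule', None))
```

The resolved string is the same name that import_module() ends up importing when it is given the relative form.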
If the module cannot be imported, import_module() raises ImportError.
Listing 19.5: importlib_import_module_error.py
import importlib
try:
    importlib.import_module('example.nosuchmodule')
except ImportError as err:
    print('Error:', err)
The error message includes the name of the missing module.
$ python3 importlib_import_module_error.py
Importing example package
Error: No module named 'example.nosuchmodule'
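Catching ImportError is what makes it practical to load optional plug-ins by name. The sketch below assumes the plug-in names arrive as a list of strings, for instance from a configuration file; standard library modules stand in for real plug-ins, and load_plugins() and hypothetical_missing_plugin are made-up names.

```python
import importlib


def load_plugins(names):
    """Import each named module, returning (loaded, failed) dictionaries."""
    loaded, failed = {}, {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError as err:
            # Keep the message so the caller can report what went wrong.
            failed[name] = str(err)
    return loaded, failed


loaded, failed = load_plugins(['json', 'csv', 'hypothetical_missing_plugin'])
print('Loaded:', sorted(loaded))
print('Failed:', failed)
```

Collecting the failures instead of stopping at the first one lets an application start with whichever plug-ins are available.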
To reload an existing module, use reload().
Listing 19.6: importlib_reload.py
import importlib
m1 = importlib.import_module('example.submodule')
print(m1)
m2 = importlib.reload(m1)
print(m1 is m2)
The return value from reload() is the new module. Depending on which type of loader was
used, it may be the same module instance.
$ python3 importlib_reload.py
Importing example package
Importing submodule
<module 'example.submodule' from '.../example/submodule.py'>
Importing submodule
True
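The effect of reload() on existing references is easier to see when the source actually changes between the import and the reload. The following sketch writes a throwaway module to a temporary directory (the module name hypo_reload_demo is hypothetical), edits it, and reloads it.

```python
import importlib
import pathlib
import sys
import tempfile

# Write a throwaway module to a temporary directory and make it importable.
tmp_dir = tempfile.mkdtemp()
source = pathlib.Path(tmp_dir) / 'hypo_reload_demo.py'
source.write_text('VALUE = 1\n')
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

mod = importlib.import_module('hypo_reload_demo')
alias = mod  # a second reference to the same module object
first_value = mod.VALUE

# Change the source. The new text has a different length, so bytecode
# cached during the first import is not treated as up to date.
source.write_text('VALUE = 2000\n')
reloaded = importlib.reload(mod)

print(first_value, reloaded.VALUE, alias.VALUE, reloaded is mod)
```

With the source file loader the module object is reused, so every existing reference to the module sees the new value. Names that were copied out of the module earlier, such as first_value here, keep their old values.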
19.1.4 Loaders
The lower-level API in importlib provides access to the loader objects, as described in
Section 17.2.6, “Modules and Imports” (page 1200) in the section on the sys module. To
get a loader for a module, use find_loader(). Then, to retrieve the module, use the loader’s
load_module() method.
Listing 19.7: importlib_find_loader.py
import importlib
loader = importlib.find_loader('example')
print('Loader:', loader)
m = loader.load_module()
print('Module:', m)
This example loads the top level of the example package.
$ python3 importlib_find_loader.py
Loader: <_frozen_importlib_external.SourceFileLoader object at 0x101be0da0>
Importing example package
Module: <module 'example' from '.../example/__init__.py'>
Submodules within packages need to be loaded separately using the path from the
package. In the following example, the package is loaded first, and then its path is passed
to find_loader() to create a loader capable of loading the submodule.
Listing 19.8: importlib_submodule.py
import importlib
pkg_loader = importlib.find_loader('example')
pkg = pkg_loader.load_module()
loader = importlib.find_loader('submodule', pkg.__path__)
print('Loader:', loader)
m = loader.load_module()
print('Module:', m)
Unlike with import_module(), the name of the submodule should be given without any
relative path prefix, since the loader will already be constrained by the package’s path.
$ python3 importlib_submodule.py
Importing example package
Loader: <_frozen_importlib_external.SourceFileLoader object at 0x1012e5390>
Importing submodule
Module: <module 'submodule' from '.../example/submodule.py'>
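Both find_loader() and load_module() have been deprecated since Python 3.4, and find_loader() was removed in Python 3.12, so the two listings above may not run on newer interpreters. A rough equivalent built on module specs is sketched below. It creates its own throwaway package (hypo_spec_pkg is a hypothetical name) so that it is self-contained, and load_from_spec() is an illustrative helper, not a library function.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

# Build a throwaway package so the sketch runs anywhere.
tmp_dir = tempfile.mkdtemp()
pkg_dir = pathlib.Path(tmp_dir) / 'hypo_spec_pkg'
pkg_dir.mkdir()
(pkg_dir / '__init__.py').write_text("print('Importing hypo_spec_pkg')\n")
(pkg_dir / 'submodule.py').write_text("print('Importing submodule')\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()


def load_from_spec(name):
    """Find the spec for name, then create and execute the module."""
    spec = importlib.util.find_spec(name)
    print('Loader:', spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before executing the module body
    spec.loader.exec_module(module)
    return module


pkg = load_from_spec('hypo_spec_pkg')
sub = load_from_spec('hypo_spec_pkg.submodule')
print('Module:', sub)
```

Unlike the find_loader() version, find_spec() takes the full dotted name of the submodule and locates it through the parent package, so the package path does not need to be passed separately.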
TIP
Related Reading
• Standard library documentation for importlib: https://docs.python.org/3.5/library/importlib.html
• Section 17.2.6, “Modules and Imports” (page 1200): Import hooks, the module search path, and
other related machinery in the sys module.
• inspect (page 1311): Load information from a module programmatically.
• PEP 302 (www.python.org/dev/peps/pep-0302): New import hooks.
• PEP 369 (www.python.org/dev/peps/pep-0369): Post-import hooks.
• PEP 488 (www.python.org/dev/peps/pep-0488): Elimination of PYO files.
19.2 pkgutil: Package Utilities
The pkgutil module includes functions for changing the import rules for Python packages
and for loading non-code resources from files distributed within a package.
19.2.1 Package Import Paths
The extend_path() function is used to modify the search path and change the way submodules are imported from within a package so that several different directories can be
combined as though they were one. This function can be used to override installed versions
of packages with development versions, or to combine platform-specific and shared modules
into a single package namespace.
The most common way to call extend_path() is by adding two lines to the __init__.py
inside the package.
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
extend_path() scans sys.path for directories that include a subdirectory whose name is
based on the package given as the second argument. The list of directories is combined with
the path value passed as the first argument and returned as a single list, suitable for use as
the package import path.
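The scan can also be observed by calling extend_path() directly, outside of any __init__.py. The sketch below is an illustration under assumed names: it builds two temporary directory trees, each holding a package directory called hypo_pkg, and places only the second tree on sys.path.

```python
import pathlib
import pkgutil
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
base = root / 'base' / 'hypo_pkg'
extra = root / 'extra' / 'hypo_pkg'
for directory in (base, extra):
    directory.mkdir(parents=True)
    (directory / '__init__.py').write_text('')

# Only the tree containing the extra copy is on sys.path. The base
# location plays the role of the package's existing __path__.
sys.path.insert(0, str(root / 'extra'))

combined = pkgutil.extend_path([str(base)], 'hypo_pkg')
for entry in combined:
    print(entry)
```

The returned list starts with the original path entries, followed by every matching package directory found on sys.path that was not already listed.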
The example package called demopkg1 includes two files, __init__.py and shared.py. The
__init__.py file in demopkg1 contains print statements to show the search path before and
after it is modified, to highlight the differences between these paths.
Listing 19.9: demopkg1/__init__.py
import pkgutil
import pprint
print('demopkg1.__path__ before:')
pprint.pprint(__path__)
print()
__path__ = pkgutil.extend_path(__path__, __name__)
print('demopkg1.__path__ after:')
pprint.pprint(__path__)
print()
The extension directory, with add-on features for demopkg1, contains three more source
files. An __init__.py is present at each directory level, as well as a not_shared.py.
$ find extension -name '*.py'
extension/__init__.py
extension/demopkg1/__init__.py
extension/demopkg1/not_shared.py
The next simple test program imports the demopkg1 package.
Listing 19.10: pkgutil_extend_path.py
import demopkg1
print('demopkg1 :', demopkg1.__file__)

try:
    import demopkg1.shared
except Exception as err:
    print('demopkg1.shared : Not found ({})'.format(err))
else:
    print('demopkg1.shared :', demopkg1.shared.__file__)

try:
    import demopkg1.not_shared
except Exception as err:
    print('demopkg1.not_shared: Not found ({})'.format(err))
else:
    print('demopkg1.not_shared:', demopkg1.not_shared.__file__)
When this test program is run directly from the command line, the not_shared module is
not found.
NOTE
The full file system paths in these examples have been shortened to emphasize the parts that change.
$ python3 pkgutil_extend_path.py
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1']
demopkg1 : .../demopkg1/__init__.py
demopkg1.shared : .../demopkg1/shared.py
demopkg1.not_shared: Not found (No module named 'demopkg1.not_shared')
However, if the extension directory is added to the PYTHONPATH and the program is run
again, different results are produced.
$ PYTHONPATH=extension python3 pkgutil_extend_path.py
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1',
'.../extension/demopkg1']
demopkg1 : .../demopkg1/__init__.py
demopkg1.shared : .../demopkg1/shared.py
demopkg1.not_shared: .../extension/demopkg1/not_shared.py
The version of demopkg1 inside the extension directory has been added to the search path,
so the not_shared module is found there.
Extending the path in this manner is useful for combining platform-specific versions
of packages with common packages, especially if the platform-specific versions include C
extension modules.
19.2.2 Development Versions of Packages
While creating enhancements to a project, a developer often needs to test changes to an
installed package. Replacing the installed copy with a development version may be a bad
idea, since that version is not necessarily correct and other tools on the system are likely
to depend on the installed package.
A completely separate copy of the package could be configured in a development environment using virtualenv or venv (page 1163). For small modifications, however, the
overhead of setting up a virtual environment with all of the dependencies may be excessive.
Another option is to use pkgutil to modify the module search path for modules that
belong to the package under development. In this case, the path must be reversed so the
development version will override the installed version.
Given a package demopkg2 containing an __init__.py and overloaded.py, with the function under development located in demopkg2/overloaded.py, the installed version contains
Listing 19.11: demopkg2/overloaded.py
def func():
    print('This is the installed version of func().')
and demopkg2/__init__.py contains
Listing 19.12: demopkg2/__init__.py
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
__path__.reverse()
reverse() is used to ensure that any directories added to the search path by pkgutil are
scanned for imports before the default location.
The next program imports demopkg2.overloaded and calls func().
Listing 19.13: pkgutil_devel.py
import demopkg2
print('demopkg2 :', demopkg2.__file__)
import demopkg2.overloaded
print('demopkg2.overloaded:', demopkg2.overloaded.__file__)
print()
demopkg2.overloaded.func()
Running it without any special path treatment produces output from the installed version
of func().
$ python3 pkgutil_devel.py
demopkg2 : .../demopkg2/__init__.py
demopkg2.overloaded: .../demopkg2/overloaded.py
This is the installed version of func().
A development directory containing
$ find develop/demopkg2 -name '*.py'
develop/demopkg2/__init__.py
develop/demopkg2/overloaded.py
and a modified version of overloaded,
Listing 19.14: develop/demopkg2/overloaded.py
def func():
    print('This is the development version of func().')
will be loaded when the test program is run with the develop directory in the search path.
$ PYTHONPATH=develop python3 pkgutil_devel.py
demopkg2 : .../demopkg2/__init__.py
demopkg2.overloaded: .../develop/demopkg2/overloaded.py
This is the development version of func().
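The same override can be reproduced in a single script, which may help when experimenting with the ordering. This is a sketch under assumed names: it writes an "installed" tree and a "develop" tree for a hypothetical package hypo_dev_pkg to a temporary directory, with the same three-line __init__.py in both.

```python
import importlib
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
init_source = (
    'import pkgutil\n'
    '__path__ = pkgutil.extend_path(__path__, __name__)\n'
    '__path__.reverse()\n'
)
for tree in ('installed', 'develop'):
    pkg_dir = root / tree / 'hypo_dev_pkg'
    pkg_dir.mkdir(parents=True)
    (pkg_dir / '__init__.py').write_text(init_source)
    # Each copy of func() reports the tree it was loaded from.
    (pkg_dir / 'overloaded.py').write_text(
        'def func():\n    return {!r}\n'.format(tree))

# The installed tree comes first, so its __init__.py is the one that runs.
sys.path[:0] = [str(root / 'installed'), str(root / 'develop')]
importlib.invalidate_caches()

overloaded = importlib.import_module('hypo_dev_pkg.overloaded')
print(overloaded.__file__)
print(overloaded.func())
```

Because the package path is reversed after it is extended, the develop copy of overloaded.py is found first even though the package itself was located in the installed tree.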
19.2.3 Managing Paths with PKG Files
The first example illustrated how to extend the search path using extra directories included
in the PYTHONPATH. It is also possible to extend the search path by using *.pkg files containing
directory names. PKG files are similar to the PTH files used by the site (page 1169) module.
They can contain directory names, one per line, to be added to the search path for the
package.
Another way to structure the platform-specific portions of the application from the first
example is to use a separate directory for each operating system, and include a .pkg file to
extend the search path.
The next example uses the same demopkg1 files, and also includes the following files.
$ find os_* -type f
os_one/demopkg1/__init__.py
os_one/demopkg1/not_shared.py
os_one/demopkg1.pkg
os_two/demopkg1/__init__.py
os_two/demopkg1/not_shared.py
os_two/demopkg1.pkg
The PKG files are named demopkg1.pkg to match the package being extended. They both
contain one line.
demopkg
This demonstration program shows the version of the module being imported.
Listing 19.15: pkgutil_os_specific.py
import demopkg1
print('demopkg1:', demopkg1.__file__)
import demopkg1.shared
print('demopkg1.shared:', demopkg1.shared.__file__)
import demopkg1.not_shared
print('demopkg1.not_shared:', demopkg1.not_shared.__file__)
A simple wrapper script can be used to switch between the two packages.
Listing 19.16: with_os.sh
#!/bin/sh
export PYTHONPATH=os_${1}
echo "PYTHONPATH=$PYTHONPATH"
echo
python3 pkgutil_os_specific.py
When this script is run with "one" or "two" as the argument, the path is adjusted.
$ ./with_os.sh one
PYTHONPATH=os_one
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1',
'.../os_one/demopkg1',
'demopkg']
demopkg1: .../demopkg1/__init__.py
demopkg1.shared: .../demopkg1/shared.py
demopkg1.not_shared: .../os_one/demopkg1/not_shared.py
$ ./with_os.sh two
PYTHONPATH=os_two
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1',
'.../os_two/demopkg1',
'demopkg']
demopkg1: .../demopkg1/__init__.py
demopkg1.shared: .../demopkg1/shared.py
demopkg1.not_shared: .../os_two/demopkg1/not_shared.py
PKG files can appear anywhere in the normal search path, so a single PKG file in the
current working directory could also be used to include a development tree.
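The way a PKG file feeds into the search path can be checked in isolation. In the sketch below, every name (hypo_pkg3, platform_specific, and the /original/location placeholder) is hypothetical; the only requirement is that the PKG file sits in a directory listed on sys.path.

```python
import pathlib
import pkgutil
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
extra_dir = root / 'platform_specific'
extra_dir.mkdir()

# A PKG file is named after the package and lists one directory per line.
# Blank lines and lines starting with '#' are skipped.
pkg_file = root / 'hypo_pkg3.pkg'
pkg_file.write_text('# extra locations for hypo_pkg3\n' + str(extra_dir) + '\n')

sys.path.insert(0, str(root))

combined = pkgutil.extend_path(['/original/location'], 'hypo_pkg3')
print(combined)
```

Each line of the PKG file is appended to the path as written, without a check that the directory exists, which is why the relative name demopkg appears unchanged in the output above.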
19.2.4 Nested Packages
For nested packages, only the path of the top-level package needs to be modified. As an
example, consider the directory structure
$ find nested -name '*.py'
nested/__init__.py
nested/second/__init__.py
nested/second/deep.py
nested/shallow.py
where nested/__init__.py contains
Listing 19.17: nested/__init__.py
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
__path__.reverse()
and a development tree like
$ find develop/nested -name '*.py'
develop/nested/__init__.py
develop/nested/second/__init__.py
develop/nested/second/deep.py
develop/nested/shallow.py
Both the shallow and deep modules contain a simple function to print out a message
indicating whether they come from the installed or development version. The following test
program exercises the new packages.
Listing 19.18: pkgutil_nested.py
import nested
import nested.shallow
print('nested.shallow:', nested.shallow.__file__)
nested.shallow.func()
print()
import nested.second.deep
print('nested.second.deep:', nested.second.deep.__file__)
nested.second.deep.func()
When pkgutil_nested.py is run without any path manipulation, the installed versions of
both modules are used.
$ python3 pkgutil_nested.py
nested.shallow: .../nested/shallow.py
This func() comes from the installed version of nested.shallow
nested.second.deep: .../nested/second/deep.py
This func() comes from the installed version of nested.second.deep
When the develop directory is added to the path, the development versions of both functions
override the installed versions.
$ PYTHONPATH=develop python3 pkgutil_nested.py
nested.shallow: .../develop/nested/shallow.py
This func() comes from the development version of nested.shallow
nested.second.deep: .../develop/nested/second/deep.py
This func() comes from the development version of nested.second.deep
19.2.5 Package Data
In addition to code, Python packages can contain data files such as templates, default
configuration files, images, and other supporting files used by the code in the package. The
get_data() function gives access to the data in the files in a format-agnostic way, so it does
not matter if the package is distributed as an EGG, as part of a frozen binary, or as regular
files on the file system.
Suppose a package pkgwithdata contains a templates directory.
$ find pkgwithdata -type f
pkgwithdata/__init__.py
pkgwithdata/templates/base.html
The file pkgwithdata/templates/base.html contains a simple HTML template.
Listing 19.19: pkgwithdata/templates/base.html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>PyMOTW Template</title>
</head>
<body>
<h1>Example Template</h1>
<p>This is a sample data file.</p>
</body>
</html>
The following program uses get_data() to retrieve the template contents and print them
out.
Listing 19.20: pkgutil_get_data.py
import pkgutil
template = pkgutil.get_data('pkgwithdata', 'templates/base.html')
print(template.decode('utf-8'))
The arguments to get_data() are the dotted name of the package and a filename relative
to the top of the package. The return value is a byte sequence, so it is decoded from UTF-8
before being printed.
$ python3 pkgutil_get_data.py
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>PyMOTW Template</title>
</head>
<body>
<h1>Example Template</h1>
<p>This is a sample data file.</p>
</body>
</html>
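For experimenting without the pkgwithdata files, a similar package can be generated on the fly. This sketch assumes a hypothetical package name, hypo_data_pkg, and a plain-text template, and shows that the bytes returned by get_data() can be decoded and used like any other string.

```python
import importlib
import pathlib
import pkgutil
import sys
import tempfile

# Create a throwaway package that carries one data file.
tmp_dir = tempfile.mkdtemp()
templates = pathlib.Path(tmp_dir) / 'hypo_data_pkg' / 'templates'
templates.mkdir(parents=True)
(templates.parent / '__init__.py').write_text('')
(templates / 'greeting.txt').write_text('Hello, {name}!', encoding='utf-8')
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# The resource name is relative to the package and always uses '/'.
raw = pkgutil.get_data('hypo_data_pkg', 'templates/greeting.txt')
text = raw.decode('utf-8')
print(type(raw).__name__)
print(text.format(name='reader'))
```

get_data() imports the package if it has not been imported already, so any code in the package's __init__.py runs as a side effect.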
get_data() is distribution format-agnostic because it uses the import hooks defined in
PEP 302 to access the package contents. Any loader that provides the hooks can be used,
including the ZIP archive importer in zipimport (page 1344). The next example builds the
archive with zipfile (page 511).
Listing 19.21: pkgutil_get_data_zip.py
import pkgutil
import zipfile
import sys
# Create a ZIP file with code from the current directory
# and the template using a name that does not appear on the
# local file system.
with zipfile.PyZipFile('pkgwithdatainzip.zip', mode='w') as zf:
    zf.writepy('.')
    zf.write('pkgwithdata/templates/base.html',
             'pkgwithdata/templates/fromzip.html',
             )
# Add the ZIP file to the import path.
sys.path.insert(0, 'pkgwithdatainzip.zip')
# Import pkgwithdata to show that it comes from the ZIP archive.
import pkgwithdata
print('Loading pkgwithdata from', pkgwithdata.__file__)
# Print the template body.
print('\nTemplate:')
data = pkgutil.get_data('pkgwithdata', 'templates/fromzip.html')
print(data.decode('utf-8'))
This example uses PyZipFile.writepy() to create a ZIP archive containing a copy of the
pkgwithdata package, including a renamed version of the template file. It then adds the ZIP
archive to the import path, before using pkgutil to load the template and print it. Refer
to the discussion of zipfile (page 511) for more details about using writepy().
$ python3 pkgutil_get_data_zip.py
Loading pkgwithdata from pkgwithdatainzip.zip/pkgwithdata/__init__.pyc
Template:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>PyMOTW Template</title>
</head>
<body>
<h1>Example Template</h1>
<p>This is a sample data file.</p>...
</body>
</html>
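A smaller variation avoids compiling the current directory by writing the archive members directly. It is a sketch under assumed names (hypo_bundle.zip and hypo_zip_pkg are hypothetical) and uses plain zipfile.ZipFile rather than PyZipFile, so the archive holds source files instead of bytecode.

```python
import pathlib
import pkgutil
import sys
import tempfile
import zipfile

# Build an archive holding a tiny package and one data file.
archive = pathlib.Path(tempfile.mkdtemp()) / 'hypo_bundle.zip'
with zipfile.ZipFile(archive, mode='w') as zf:
    zf.writestr('hypo_zip_pkg/__init__.py', '')
    zf.writestr('hypo_zip_pkg/templates/greeting.txt',
                'Hello from the archive')

# Putting the archive itself on sys.path makes its contents importable.
sys.path.insert(0, str(archive))

import hypo_zip_pkg
print('Loading hypo_zip_pkg from', hypo_zip_pkg.__file__)

data = pkgutil.get_data('hypo_zip_pkg', 'templates/greeting.txt')
print(data.decode('utf-8'))
```

The call to get_data() is identical to the one used for a package on the file system; only the loader behind it differs.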
TIP
Related Reading
• Standard library documentation for pkgutil: https://docs.python.org/3.5/library/pkgutil.html
• virtualenv (http://pypi.python.org/pypi/virtualenv): Ian Bicking’s virtual environment script.
• distutils: Packaging tools from the Python standard library.
• setuptools: Next-generation packaging tools.
• PEP 302: Import Hooks.
• zipfile (page 511): Create importable ZIP archives.
• zipimport (page 1344): Importer for packages in ZIP archives.