19.3 zipimport: Load Python Code from ZIP Archives

The zipimport module implements the zipimporter class, which can be used to find and

load Python modules inside ZIP archives. The zipimporter supports the import hooks API

specified in PEP 302; it is how Python Eggs work.

Using the zipimport module directly is rarely necessary, since importing directly from a

ZIP archive is feasible as long as that archive appears in sys.path. Nevertheless, studying

how the importer API can be used can help a programmer learn the features available and

understand how module importing works. Knowing how the ZIP importer works will also

help when debugging the issues that may come up when distributing applications packaged

as ZIP archives created with zipfile.PyZipFile.
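The point that an archive only needs to appear in sys.path can be checked with a minimal sketch. The archive name, the module name, and the import_from_zip() helper are hypothetical and are not part of the example files used in this section.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write source into a new ZIP archive and import it from there.

    Hypothetical helper, for illustration only.
    """
    archive = os.path.join(tempfile.mkdtemp(), 'demo_archive.zip')
    with zipfile.ZipFile(archive, 'w') as zf:
        zf.writestr(module_name + '.py', source)

    # Putting the archive itself on sys.path is enough for the
    # regular import machinery to look inside it.
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(archive)


mod = import_from_zip('hypothetical_zip_mod', 'VALUE = 42\n')
print(mod.VALUE)
print(type(mod.__loader__).__name__)
print(mod.__file__)
```

No explicit use of zipimport is needed here; the loader recorded on the module shows that the ZIP importer was selected automatically.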

19.3.1 Example

The following examples reuse some of the code from the discussion of zipfile (page 511)

to create an example ZIP archive containing a few Python modules.

Listing 19.22: zipimport_make_example.py

import sys
import zipfile

if __name__ == '__main__':
    zf = zipfile.PyZipFile('zipimport_example.zip', mode='w')
    try:
        zf.writepy('.')
        zf.write('zipimport_get_source.py')
        zf.write('example_package/README.txt')
    finally:
        zf.close()
    for name in zf.namelist():
        print(name)

Run zipimport_make_example.py before trying any of the other examples, so as to create

a ZIP archive containing all of the modules in the example directory, along with some test

data needed for the examples in this section.

$ python3 zipimport_make_example.py

__init__.pyc

example_package/__init__.pyc

zipimport_find_module.pyc

zipimport_get_code.pyc

zipimport_get_data.pyc

zipimport_get_data_nozip.pyc

zipimport_get_data_zip.pyc

zipimport_get_source.pyc

zipimport_is_package.pyc

zipimport_load_module.pyc

zipimport_make_example.pyc

zipimport_get_source.py

example_package/README.txt

19.3.2 Finding a Module

Given the full name of a module, find_module() will try to locate that module inside the

ZIP archive.

Listing 19.23: zipimport_find_module.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

for module_name in ['zipimport_find_module', 'not_there']:
    print(module_name, ':', importer.find_module(module_name))

If the module is found, the zipimporter instance is returned. Otherwise, None is returned.

$ python3 zipimport_find_module.py

zipimport_find_module : <zipimporter object

"zipimport_example.zip">

not_there : None
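find_module() belongs to the original PEP 302 protocol, and later Python 3 releases favor a spec-based lookup through find_spec(). The following is a minimal sketch that uses find_spec() when the importer provides it and falls back to find_module() otherwise. The archive, the module names, and the is_in_archive() helper are hypothetical.

```python
import os
import tempfile
import zipfile
import zipimport

# Build a small throwaway archive (hypothetical names).
archive = os.path.join(tempfile.mkdtemp(), 'find_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('present.py', 'x = 1\n')

importer = zipimport.zipimporter(archive)


def is_in_archive(importer, name):
    """Return True if the importer can locate the named module."""
    if hasattr(importer, 'find_spec'):
        # Spec-based lookup, where the interpreter provides it.
        return importer.find_spec(name) is not None
    # Older interpreters: the PEP 302 method used in this section.
    return importer.find_module(name) is not None


for module_name in ['present', 'not_there']:
    print(module_name, ':', is_in_archive(importer, module_name))
```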

19.3.3 Accessing Code

The get_code() method loads the code object for a module from the archive.

Listing 19.24: zipimport_get_code.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

code = importer.get_code('zipimport_get_code')

print(code)

The code object is not the same as a module object, but is used to create one.
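How a code object becomes a module can be sketched by hand: create an empty module object and execute the code in its namespace. This is only an illustration of the idea, not a description of what the import machinery does internally, and the archive and module names are hypothetical.

```python
import os
import tempfile
import types
import zipfile
import zipimport

# Build a throwaway archive holding one small module.
archive = os.path.join(tempfile.mkdtemp(), 'code_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('greeting.py', 'MESSAGE = "hello from the archive"\n')

importer = zipimport.zipimporter(archive)
code = importer.get_code('greeting')

# Create an empty module and run the code object in its namespace.
module = types.ModuleType('greeting')
module.__loader__ = importer
exec(code, module.__dict__)

print(module.MESSAGE)
```

A module built this way is not registered in sys.modules and lacks most of the attributes a real import sets.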

$ python3 zipimport_get_code.py

<code object <module> at 0x1012b4ae0, file

"./zipimport_get_code.py", line 6>

To load the code as a usable module, use load_module() instead.

Listing 19.25: zipimport_load_module.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

module = importer.load_module('zipimport_get_code')

print('Name :', module.__name__)

print('Loader :', module.__loader__)

print('Code :', module.code)

The result is a module object configured as though the code had been loaded from a regular

import.

$ python3 zipimport_load_module.py

<code object <module> at 0x1007b4c00, file

"./zipimport_get_code.py", line 6>

Name : zipimport_get_code

Loader : <zipimporter object "zipimport_example.zip">

Code : <code object <module> at 0x1007b4c00, file

"./zipimport_get_code.py", line 6>

19.3.4 Source

As with the inspect (page 1311) module, it is possible to retrieve the source code for a module from the ZIP archive with the zipimport module, if the archive includes the source. In

the following example, only zipimport_get_source.py is added to zipimport_example.zip as source;

the rest of the modules are added only as .pyc files.

Listing 19.26: zipimport_get_source.py

import zipimport

modules = [
    'zipimport_get_code',
    'zipimport_get_source',
]

importer = zipimport.zipimporter('zipimport_example.zip')

for module_name in modules:
    source = importer.get_source(module_name)
    print('=' * 80)
    print(module_name)
    print('=' * 80)
    print(source)
    print()

If the source for a module is not available, get_source() returns None.

$ python3 zipimport_get_source.py

================================================================

zipimport_get_code

================================================================

None

================================================================

zipimport_get_source

================================================================

#!/usr/bin/env python3

#

# Copyright 2007 Doug Hellmann.

#

"""Retrieving the source code for a module within a zip archive.

"""

#end_pymotw_header

import zipimport

modules = [
    'zipimport_get_code',
    'zipimport_get_source',
]

importer = zipimport.zipimporter('zipimport_example.zip')

for module_name in modules:
    source = importer.get_source(module_name)
    print('=' * 80)
    print(module_name)
    print('=' * 80)
    print(source)
    print()

19.3.5 Packages

To determine whether a name refers to a package instead of a regular module, use

is_package().

Listing 19.27: zipimport_is_package.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

for name in ['zipimport_is_package', 'example_package']:
    print(name, importer.is_package(name))

In this case, zipimport_is_package came from a module and the example_package is a

package.

$ python3 zipimport_is_package.py

zipimport_is_package False

example_package True
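is_package() does not return False for a name that is missing from the archive; it raises zipimport.ZipImportError instead. The following minimal sketch separates the three cases, using a hypothetical archive and a hypothetical describe() helper.

```python
import os
import tempfile
import zipfile
import zipimport

# Throwaway archive with one package and one plain module.
archive = os.path.join(tempfile.mkdtemp(), 'package_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('hypothetical_pkg/__init__.py', '# package marker\n')
    zf.writestr('plain_module.py', 'x = 1\n')

importer = zipimport.zipimporter(archive)


def describe(importer, name):
    """Classify a name as 'package', 'module', or 'missing'."""
    try:
        return 'package' if importer.is_package(name) else 'module'
    except zipimport.ZipImportError:
        # Raised when the archive has no module with this name.
        return 'missing'


for name in ['hypothetical_pkg', 'plain_module', 'not_there']:
    print(name, describe(importer, name))
```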

19.3.6 Data

Sometimes source modules or packages need to be distributed with non-code data. Images,

configuration files, default data, and test fixtures are just a few examples of these types of

data. Frequently, the module __path__ or __file__ attributes are used to find these data

files relative to where the code is installed.

For example, with a “normal” module, the file system path can be constructed from the

__file__ attribute of the imported package as in the following code.

Listing 19.28: zipimport_get_data_nozip.py

import os

import example_package

# Find the directory containing the imported

# package and build the data filename from it.

pkg_dir = os.path.dirname(example_package.__file__)

data_filename = os.path.join(pkg_dir, 'README.txt')

# Read the file and show its contents.

print(data_filename, ':')

print(open(data_filename, 'r').read())

The output will depend on where the sample code is located on the file system.

$ python3 zipimport_get_data_nozip.py

.../example_package/README.txt :

This file represents sample data which could be embedded in the

ZIP archive. You could include a configuration file, images, or

any other sort of noncode data.

If the example_package is imported from the ZIP archive instead of the file system, using

__file__ does not work.

Listing 19.29: zipimport_get_data_zip.py

import sys

sys.path.insert(0, 'zipimport_example.zip')

import os

import example_package

print(example_package.__file__)

data_filename = os.path.join(
    os.path.dirname(example_package.__file__),
    'README.txt',
)

print(data_filename, ':')

print(open(data_filename, 'rt').read())

The __file__ of the package refers to the ZIP archive, rather than a directory, so building

up the path to the README.txt file gives the wrong value.

$ python3 zipimport_get_data_zip.py

zipimport_example.zip/example_package/__init__.pyc

zipimport_example.zip/example_package/README.txt :

Traceback (most recent call last):

File "zipimport_get_data_zip.py", line 20, in <module>

print(open(data_filename, 'rt').read())

NotADirectoryError: [Errno 20] Not a directory:

'zipimport_example.zip/example_package/README.txt'

A more reliable way to retrieve the file is to use the get_data() method. The zipimporter

instance that loaded the module can be accessed through the __loader__ attribute of the

imported module.

Listing 19.30: zipimport_get_data.py

import sys

sys.path.insert(0, 'zipimport_example.zip')

import os

import example_package

print(example_package.__file__)

data = example_package.__loader__.get_data(
    'example_package/README.txt')

print(data.decode('utf-8'))

pkgutil.get_data() uses this interface to access data from within a package. The value

returned is a byte string, which needs to be decoded to a Unicode string before it is printed.

$ python3 zipimport_get_data.py

zipimport_example.zip/example_package/__init__.pyc

This file represents sample data which could be embedded in the

ZIP archive. You could include a configuration file, images, or

any other sort of noncode data.

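For modules not imported via zipimport, __loader__ refers to a different loader object (or may not be set at all), and that loader is not guaranteed to provide get_data().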
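One way to avoid hard-coding the archive-relative name is to build the path from __file__ and hand it to the loader rather than to open(). The following is a minimal sketch under the assumption that the loader accepts a path that begins with the archive location, as zipimporter does. The package name, archive name, and read_package_data() helper are hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def read_package_data(package, filename):
    """Read a data file stored next to a package's __init__ file.

    Hypothetical helper: the path is built from __file__, but the
    loader, not open(), is asked to read it.
    """
    path = os.path.join(os.path.dirname(package.__file__), filename)
    return package.__loader__.get_data(path)


# Throwaway archive containing a package and a data file.
archive = os.path.join(tempfile.mkdtemp(), 'data_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('hypothetical_data_pkg/__init__.py', '# package marker\n')
    zf.writestr('hypothetical_data_pkg/README.txt', 'sample data\n')

sys.path.insert(0, archive)
try:
    importlib.invalidate_caches()
    pkg = importlib.import_module('hypothetical_data_pkg')
    text = read_package_data(pkg, 'README.txt').decode('utf-8')
finally:
    sys.path.remove(archive)

print(text)
```

The standard file system source loader also has a get_data() method that takes a path, so the same helper is expected to work for a package imported from a directory.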

TIP

Related Reading

• Standard library documentation for zipimport. [9]

• Python 2 to 3 porting notes for zipimport (page 1365).

• imp: Other import-related functions.

• pkgutil (page 1334): Provides a more generic interface to get_data().

• zipfile (page 511): Read and write ZIP archive files.

• PEP 302 [10]: New Import Hooks.

[9] https://docs.python.org/3.5/library/zipimport.html

[10] www.python.org/dev/peps/pep-0302

Listing 19.2: example/submodule.py

print('Importing submodule')

Watch for the text from the print() calls in the sample output when the package or

module is imported.

19.1.2 Module Types

Python supports several styles of modules. Each requires its own handling when opening the

module and adding it to the namespace, and support for the formats varies by platform.

For example, under Microsoft Windows, shared libraries are loaded from files with the

extensions .dll and .pyd, instead of .so. The extensions for C modules may also change

when using a debug build of the interpreter instead of a normal release build, since they

can be compiled with debug information included as well. If a C extension library or other

module is not loading as expected, use the constants defined in importlib.machinery to find

the supported types for the current platform, as well as the parameters for loading them.

Listing 19.3: importlib_suffixes.py

import importlib.machinery

SUFFIXES = [
    ('Source:', importlib.machinery.SOURCE_SUFFIXES),
    ('Debug:',
     importlib.machinery.DEBUG_BYTECODE_SUFFIXES),
    ('Optimized:',
     importlib.machinery.OPTIMIZED_BYTECODE_SUFFIXES),
    ('Bytecode:', importlib.machinery.BYTECODE_SUFFIXES),
    ('Extension:', importlib.machinery.EXTENSION_SUFFIXES),
]


def main():
    tmpl = '{:<10} {}'
    for name, value in SUFFIXES:
        print(tmpl.format(name, value))


if __name__ == '__main__':
    main()

Each constant is a list of the file name suffixes recognized for that type of module.

The following output is incomplete, because some of the importable module or

package types do not correspond to single files.

$ python3 importlib_suffixes.py

Source: ['.py']

Debug: ['.pyc']

Optimized: ['.pyc']

Bytecode: ['.pyc']

Extension: ['.cpython-35m-darwin.so', '.abi3.so', '.so']
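The suffix lists can also be used to classify a file name. The following is a minimal sketch; the module_kind() helper and the file names are hypothetical, and the exact suffixes vary by platform and interpreter build.

```python
import importlib.machinery as machinery

# Pair a label with each suffix list provided by importlib.machinery.
KINDS = [
    ('extension', machinery.EXTENSION_SUFFIXES),
    ('source', machinery.SOURCE_SUFFIXES),
    ('bytecode', machinery.BYTECODE_SUFFIXES),
]


def module_kind(filename):
    """Return a label for the module type, or None if unrecognized."""
    for kind, suffixes in KINDS:
        if any(filename.endswith(suffix) for suffix in suffixes):
            return kind
    return None


for filename in ['example.py', 'example.pyc', 'notes.txt']:
    print(filename, module_kind(filename))
```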

19.1.3 Importing Modules

The high-level API in importlib simplifies the process of importing a module given an absolute or relative name. When using a relative module name, specify the package containing

the module as a separate argument.

Listing 19.4: importlib_import_module.py

import importlib

m1 = importlib.import_module('example.submodule')

print(m1)

m2 = importlib.import_module('.submodule', package='example')

print(m2)

print(m1 is m2)

The return value from import_module() is the module object that was created by the import.

$ python3 importlib_import_module.py

Importing example package

Importing submodule

<module 'example.submodule' from '.../example/submodule.py'>

<module 'example.submodule' from '.../example/submodule.py'>

True

If the module cannot be imported, import_module() raises ImportError.

Listing 19.5: importlib_import_module_error.py

import importlib

try:
    importlib.import_module('example.nosuchmodule')
except ImportError as err:
    print('Error:', err)

The error message includes the name of the missing module.

$ python3 importlib_import_module_error.py

Importing example package

Error: No module named 'example.nosuchmodule'

To reload an existing module, use reload().

Listing 19.6: importlib_reload.py

import importlib

m1 = importlib.import_module('example.submodule')

print(m1)

m2 = importlib.reload(m1)

print(m1 is m2)

The return value from reload() is the new module. Depending on which type of loader was

used, it may be the same module instance.

$ python3 importlib_reload.py

Importing example package

Importing submodule

<module 'example.submodule' from '.../example/submodule.py'>

Importing submodule

True

19.1.4 Loaders

The lower-level API in importlib provides access to the loader objects, as described in

Section 17.2.6, “Modules and Imports” (page 1200) in the section on the sys module. To

get a loader for a module, use find_loader(). Then, to retrieve the module, use the loader’s

load_module() method.

Listing 19.7: importlib_find_loader.py

import importlib

loader = importlib.find_loader('example')

print('Loader:', loader)

m = loader.load_module()

print('Module:', m)

This example loads the top level of the example package.

$ python3 importlib_find_loader.py

Loader: <_frozen_importlib_external.SourceFileLoader object at

0x101be0da0>

Importing example package

Module: <module 'example' from '.../example/__init__.py'>

Submodules within packages need to be loaded separately using the path from the

package. In the following example, the package is loaded first, and then its path is passed

to find_loader() to create a loader capable of loading the submodule.

Listing 19.8: importlib_submodule.py

import importlib

pkg_loader = importlib.find_loader('example')

pkg = pkg_loader.load_module()

loader = importlib.find_loader('submodule', pkg.__path__)

print('Loader:', loader)

m = loader.load_module()

print('Module:', m)

Unlike with import_module(), the name of the submodule should be given without any

relative path prefix, since the loader will already be constrained by the package’s path.

$ python3 importlib_submodule.py

Importing example package

Loader: <_frozen_importlib_external.SourceFileLoader object at

0x1012e5390>

Importing submodule

Module: <module 'submodule' from '.../example/submodule.py'>
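find_loader() and load_module() reflect the loader API as presented in this section; later Python releases deprecate both in favor of module specs. The following is a minimal sketch of the spec-based equivalent, assuming importlib.util.find_spec() and importlib.util.module_from_spec() are available (Python 3.5 and later). The package hypothetical_example is created in a temporary directory purely for illustration.

```python
import importlib.util
import os
import sys
import tempfile

# Create a throwaway package with one submodule.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'hypothetical_example')
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
    f.write('# package marker\n')
with open(os.path.join(pkg_dir, 'submodule.py'), 'w') as f:
    f.write('NAME = "submodule"\n')

sys.path.insert(0, root)
importlib.invalidate_caches()

# find_spec() accepts the full dotted name and imports the parent
# package as needed in order to locate the submodule.
spec = importlib.util.find_spec('hypothetical_example.submodule')
print('Spec  :', spec.name)

# Build a module object from the spec, then execute its code.
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print('Module:', module.NAME)
```

Unlike import_module(), this sequence does not add the new module to sys.modules.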

TIP

Related Reading

• Standard library documentation for importlib. [1]

• Section 17.2.6, “Modules and Imports” (page 1200): Import hooks, the module search path, and other related machinery in the sys module.

• inspect (page 1311): Load information from a module programmatically.

• PEP 302 [2]: New import hooks.

• PEP 369 [3]: Post-import hooks.

• PEP 488 [4]: Elimination of PYO files.

[1] https://docs.python.org/3.5/library/importlib.html

19.2 pkgutil: Package Utilities

The pkgutil module includes functions for changing the import rules for Python packages

and for loading non-code resources from files distributed within a package.

19.2.1 Package Import Paths

The extend_path() function is used to modify the search path and change the way submodules are imported from within a package so that several different directories can be

combined as though they were one. This function can be used to override installed versions

of packages with development versions, or to combine platform-specific and shared modules

into a single package namespace.

The most common way to call extend_path() is by adding two lines to the __init__.py

inside the package.

import pkgutil

__path__ = pkgutil.extend_path(__path__, __name__)

extend_path() scans sys.path for directories that include a subdirectory whose name is

based on the package given as the second argument. The list of directories is combined with

the path value passed as the first argument and returned as a single list, suitable for use as

the package import path.
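The scan described above can be observed directly with two throwaway directory trees. This is a minimal sketch: the package name hypothetical_pkg and the make_pkg() helper are invented for illustration, and extend_path() is called outside of an __init__.py only to show its return value.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def make_pkg(root, name):
    """Create root/name/__init__.py and return the package directory."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
        f.write('# package marker\n')
    return pkg_dir


# Two separate trees that both contain a 'hypothetical_pkg' directory.
base = make_pkg(tempfile.mkdtemp(), 'hypothetical_pkg')
extra = make_pkg(tempfile.mkdtemp(), 'hypothetical_pkg')

# Only the parent of the second tree is placed on sys.path.
extra_root = os.path.dirname(extra)
sys.path.append(extra_root)
try:
    importlib.invalidate_caches()
    combined = pkgutil.extend_path([base], 'hypothetical_pkg')
finally:
    sys.path.remove(extra_root)

for entry in combined:
    print(entry)
```

The original entry stays first and the directory discovered through sys.path is appended after it.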

The example package called demopkg1 includes two files, __init__.py and shared.py. The

__init__.py file in demopkg1 contains print statements to show the search path before and

after it is modified, to highlight the differences between these paths.

Listing 19.9: demopkg1/__init__.py

import pkgutil
import pprint

print('demopkg1.__path__ before:')
pprint.pprint(__path__)
print()

__path__ = pkgutil.extend_path(__path__, __name__)

print('demopkg1.__path__ after:')
pprint.pprint(__path__)
print()

[2] www.python.org/dev/peps/pep-0302

[3] www.python.org/dev/peps/pep-0369

[4] www.python.org/dev/peps/pep-0488

The extension directory, with add-on features for demopkg1, contains three more source

files. An __init__.py is present at each directory level, as well as a not_shared.py.

$ find extension -name '*.py'

extension/__init__.py

extension/demopkg1/__init__.py

extension/demopkg1/not_shared.py

The next simple test program imports the demopkg1 package.

Listing 19.10: pkgutil_extend_path.py

import demopkg1

print('demopkg1 :', demopkg1.__file__)

try:
    import demopkg1.shared
except Exception as err:
    print('demopkg1.shared : Not found ({})'.format(err))
else:
    print('demopkg1.shared :', demopkg1.shared.__file__)

try:
    import demopkg1.not_shared
except Exception as err:
    print('demopkg1.not_shared: Not found ({})'.format(err))
else:
    print('demopkg1.not_shared:', demopkg1.not_shared.__file__)

When this test program is run directly from the command line, the not_shared module is

not found.

NOTE

The full file system paths in these examples have been shortened to emphasize the parts that change.

$ python3 pkgutil_extend_path.py

demopkg1.__path__ before:

['.../demopkg1']

demopkg1.__path__ after:

['.../demopkg1']

demopkg1 : .../demopkg1/__init__.py

demopkg1.shared : .../demopkg1/shared.py

demopkg1.not_shared: Not found (No module named 'demopkg1.not_sh

ared')

However, if the extension directory is added to the PYTHONPATH and the program is run

again, different results are produced.

$ PYTHONPATH=extension python3 pkgutil_extend_path.py

demopkg1.__path__ before:

['.../demopkg1']

demopkg1.__path__ after:

['.../demopkg1',

'.../extension/demopkg1']

demopkg1 : .../demopkg1/__init__.py

demopkg1.shared : .../demopkg1/shared.py

demopkg1.not_shared: .../extension/demopkg1/not_shared.py

The version of demopkg1 inside the extension directory has been added to the search path,

so the not_shared module is found there.

Extending the path in this manner is useful for combining platform-specific versions

of packages with common packages, especially if the platform-specific versions include C

extension modules.

19.2.2 Development Versions of Packages

While creating enhancements to a project, a developer often needs to test changes to an

installed package. Replacing the installed copy with a development version may be a bad

idea, since that version is not necessarily correct and other tools on the system are likely

to depend on the installed package.

A completely separate copy of the package could be configured in a development environment using virtualenv or venv (page 1163). For small modifications, however, the

overhead of setting up a virtual environment with all of the dependencies may be excessive.

Another option is to use pkgutil to modify the module search path for modules that

belong to the package under development. In this case, the path must be reversed so the

development version will override the installed version.

Given a package demopkg2 containing an __init__.py and overloaded.py, with the function under development located in demopkg2/overloaded.py, the installed version contains

Listing 19.11: demopkg2/overloaded.py

def func():
    print('This is the installed version of func().')

and demopkg2/__init__.py contains

Listing 19.12: demopkg2/__init__.py

import pkgutil

__path__ = pkgutil.extend_path(__path__, __name__)

__path__.reverse()

reverse() is used to ensure that any directories added to the search path by pkgutil are

scanned for imports before the default location.

The next program imports demopkg2.overloaded and calls func().

Listing 19.13: pkgutil_devel.py

import demopkg2

print('demopkg2 :', demopkg2.__file__)

import demopkg2.overloaded

print('demopkg2.overloaded:', demopkg2.overloaded.__file__)

print()

demopkg2.overloaded.func()

Running it without any special path treatment produces output from the installed version

of func().

$ python3 pkgutil_devel.py

demopkg2 : .../demopkg2/__init__.py

demopkg2.overloaded: .../demopkg2/overloaded.py

This is the installed version of func().

A development directory containing

$ find develop/demopkg2 -name '*.py'

develop/demopkg2/__init__.py

develop/demopkg2/overloaded.py

and a modified version of overloaded,

Listing 19.14: develop/demopkg2/overloaded.py

def func():
    print('This is the development version of func().')

will be loaded when the test program is run with the develop directory in the search path.

$ PYTHONPATH=develop python3 pkgutil_devel.py

demopkg2 : .../demopkg2/__init__.py

demopkg2.overloaded: .../develop/demopkg2/overloaded.py

This is the development version of func().

19.2.3 Managing Paths with PKG Files

The first example illustrated how to extend the search path using extra directories included

in the PYTHONPATH. It is also possible to extend the search path by using *.pkg files containing

directory names. PKG files are similar to the PTH files used by the site (page 1169) module.

They can contain directory names, one per line, to be added to the search path for the

package.
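The effect of a PKG file can be sketched in isolation. In this minimal example the directory names are hypothetical and do not need to exist, because extend_path() appends each line of the PKG file without checking it; the function is called directly only to show the resulting list.

```python
import os
import pkgutil
import sys
import tempfile

# A directory on the search path that holds only a PKG file.
search_dir = tempfile.mkdtemp()
with open(os.path.join(search_dir, 'hypothetical_pkg.pkg'), 'w') as f:
    f.write('# comment lines and blank lines are skipped\n')
    f.write('\n')
    f.write('/hypothetical/extra/location\n')

sys.path.append(search_dir)
try:
    combined = pkgutil.extend_path(
        ['/hypothetical/base'],
        'hypothetical_pkg',
    )
finally:
    sys.path.remove(search_dir)

print(combined)
```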

Another way to structure the platform-specific portions of the application from the first

example is to use a separate directory for each operating system, and include a .pkg file to

extend the search path.

The next example uses the same demopkg1 files, and also includes the following files.

$ find os_* -type f

os_one/demopkg1/__init__.py

os_one/demopkg1/not_shared.py

os_one/demopkg1.pkg

os_two/demopkg1/__init__.py

os_two/demopkg1/not_shared.py

os_two/demopkg1.pkg

The PKG files are named demopkg1.pkg to match the package being extended. They both

contain one line.

demopkg

This demonstration program shows the version of the module being imported.

Listing 19.15: pkgutil_os_specific.py

import demopkg1

print('demopkg1:', demopkg1.__file__)

import demopkg1.shared

print('demopkg1.shared:', demopkg1.shared.__file__)

import demopkg1.not_shared

print('demopkg1.not_shared:', demopkg1.not_shared.__file__)

A simple wrapper script can be used to switch between the two packages.

Listing 19.16: with_os.sh

#!/bin/sh

export PYTHONPATH=os_${1}

echo "PYTHONPATH=$PYTHONPATH"

echo

python3 pkgutil_os_specific.py

When this script is run with "one" or "two" as the argument, the path is adjusted.

$ ./with_os.sh one

PYTHONPATH=os_one

demopkg1.__path__ before:

['.../demopkg1']

demopkg1.__path__ after:

['.../demopkg1',

'.../os_one/demopkg1',

'demopkg']

demopkg1: .../demopkg1/__init__.py

demopkg1.shared: .../demopkg1/shared.py

demopkg1.not_shared: .../os_one/demopkg1/not_shared.py

$ ./with_os.sh two

PYTHONPATH=os_two

demopkg1.__path__ before:

['.../demopkg1']

demopkg1.__path__ after:

['.../demopkg1',

'.../os_two/demopkg1',

'demopkg']

demopkg1: .../demopkg1/__init__.py

demopkg1.shared: .../demopkg1/shared.py

demopkg1.not_shared: .../os_two/demopkg1/not_shared.py

PKG files can appear anywhere in the normal search path, so a single PKG file in the

current working directory could also be used to include a development tree.

19.2.4 Nested Packages

For nested packages, only the path of the top-level package needs to be modified. As an

example, consider the directory structure

$ find nested -name '*.py'

nested/__init__.py

nested/second/__init__.py

nested/second/deep.py

nested/shallow.py

where nested/__init__.py contains

Listing 19.17: nested/__init__.py

import pkgutil

__path__ = pkgutil.extend_path(__path__, __name__)

__path__.reverse()

and a development tree like

$ find develop/nested -name '*.py'

develop/nested/__init__.py

develop/nested/second/__init__.py

develop/nested/second/deep.py

develop/nested/shallow.py

Both the shallow and deep modules contain a simple function to print out a message

indicating whether they come from the installed or development version. The following test

program exercises the new packages.

Listing 19.18: pkgutil_nested.py

import nested

import nested.shallow

print('nested.shallow:', nested.shallow.__file__)

nested.shallow.func()

print()

import nested.second.deep

print('nested.second.deep:', nested.second.deep.__file__)

nested.second.deep.func()

When pkgutil_nested.py is run without any path manipulation, the installed versions of

both modules are used.

$ python3 pkgutil_nested.py

nested.shallow: .../nested/shallow.py

This func() comes from the installed version of nested.shallow

nested.second.deep: .../nested/second/deep.py

This func() comes from the installed version of nested.second.de

ep

When the develop directory is added to the path, the development versions of both functions

override the installed versions.

$ PYTHONPATH=develop python3 pkgutil_nested.py

nested.shallow: .../develop/nested/shallow.py

This func() comes from the development version of nested.shallow

nested.second.deep: .../develop/nested/second/deep.py

This func() comes from the development version of nested.second.

deep

19.2.5 Package Data

In addition to code, Python packages can contain data files such as templates, default

configuration files, images, and other supporting files used by the code in the package. The

get_data() function gives access to the data in the files in a format-agnostic way, so it does

not matter if the package is distributed as an EGG, as part of a frozen binary, or as regular

files on the file system.

Suppose a package pkgwithdata contains a templates directory.

$ find pkgwithdata -type f

pkgwithdata/__init__.py

pkgwithdata/templates/base.html

The file pkgwithdata/templates/base.html contains a simple HTML template.

Listing 19.19: pkgwithdata/templates/base.html

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">

<html> <head>

<title>PyMOTW Template</title>

</head>

<body>

<h1>Example Template</h1>

<p>This is a sample data file.</p>

</body>

</html>

The following program uses get_data() to retrieve the template contents and print them

out.

Listing 19.20: pkgutil_get_data.py

import pkgutil

template = pkgutil.get_data('pkgwithdata', 'templates/base.html')

print(template.decode('utf-8'))

The arguments to get_data() are the dotted name of the package and a filename relative

to the top of the package. The return value is a byte sequence, so it is decoded from UTF-8

before being printed.

$ python3 pkgutil_get_data.py

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">

<html> <head>

<title>PyMOTW Template</title>

</head>

<body>

<h1>Example Template</h1>

<p>This is a sample data file.</p>

</body>

</html>

get_data() is distribution format-agnostic because it uses the import hooks defined in

PEP 302 to access the package contents. Any loader that provides the hooks can be used,

including the ZIP archive importer in zipimport (page 1344).

Listing 19.21: pkgutil_get_data_zip.py

import pkgutil

import zipfile

import sys

# Create a ZIP file with code from the current directory

# and the template using a name that does not appear on the

# local file system.

with zipfile.PyZipFile('pkgwithdatainzip.zip', mode='w') as zf:
    zf.writepy('.')
    zf.write('pkgwithdata/templates/base.html',
             'pkgwithdata/templates/fromzip.html',
             )

# Add the ZIP file to the import path.

sys.path.insert(0, 'pkgwithdatainzip.zip')

# Import pkgwithdata to show that it comes from the ZIP archive.

import pkgwithdata

print('Loading pkgwithdata from', pkgwithdata.__file__)

# Print the template body.

print('\nTemplate:')

data = pkgutil.get_data('pkgwithdata', 'templates/fromzip.html')

print(data.decode('utf-8'))

This example uses PyZipFile.writepy() to create a ZIP archive containing a copy of the

pkgwithdata package, including a renamed version of the template file. It then adds the ZIP

archive to the import path, before using pkgutil to load the template and print it. Refer

to the discussion of zipfile (page 511) for more details about using writepy().

$ python3 pkgutil_get_data_zip.py

Loading pkgwithdata from

pkgwithdatainzip.zip/pkgwithdata/__init__.pyc

Template:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">

<html> <head>

<title>PyMOTW Template</title>

</head>

<body>

<h1>Example Template</h1>

<p>This is a sample data file.</p>

</body>

</html>

TIP

Related Reading

• Standard library documentation for pkgutil. [5]

• virtualenv [6]: Ian Bicking’s virtual environment script.

• distutils: Packaging tools from the Python standard library.

• setuptools [7]: Next-generation packaging tools.

• PEP 302 [8]: Import Hooks.

• zipfile (page 511): Create importable ZIP archives.

• zipimport (page 1344): Importer for packages in ZIP archives.

[5] https://docs.python.org/3.5/library/pkgutil.html

[6] http://pypi.python.org/pypi/virtualenv

19.3 zipimport: Load Python Code from ZIP Archives

The zipimport module implements the zipimporter class, which can be used to find and

load Python modules inside ZIP archives. The zipimporter supports the import hooks API

specified in PEP 302; it is how Python Eggs work.

Using the zipimport module directly is rarely necessary, since importing directly from a

ZIP archive is feasible as long as that archive appears in sys.path. Nevertheless, studying

how the importer API can be used can help a programmer learn the features available and

understand how module importing works. Knowing how the ZIP importer works will also

help when debugging the issues that may come up when distributing applications packaged

as ZIP archives created with zipfile.PyZipFile.

19.3.1 Example

The followiong examples reuse some of the code from the discussion of zipfile (page 511)

to create an example ZIP archive containing a few Python modules.

Listing 19.22: zipimport_make_example.py

import sys

import zipfile

if __name__ == '__main__':

zf = zipfile.PyZipFile('zipimport_example.zip', mode='w')

try:

zf.writepy('.')

zf.write('zipimport_get_source.py')

zf.write('example_package/README.txt')

finally:

zf.close()

for name in zf.namelist():

print(name)

Run zipimport_make_example.py before trying any of the other examples, so as to create

a ZIP archive containing all of the modules in the example directory, along with some test

data needed for the examples in this section.

7 https://setuptools.readthedocs.io/en/latest/

8 www.python.org/dev/peps/pep-030219.3 zipimport: Load Python Code from ZIP Archives 1345

$ python3 zipimport_make_example.py

__init__.pyc

example_package/__init__.pyc

zipimport_find_module.pyc

zipimport_get_code.pyc

zipimport_get_data.pyc

zipimport_get_data_nozip.pyc

zipimport_get_data_zip.pyc

zipimport_get_source.pyc

zipimport_is_package.pyc

zipimport_load_module.pyc

zipimport_make_example.pyc

zipimport_get_source.py

example_package/README.txt

19.3.2 Finding a Module

Given the full name of a module, find_module() will try to locate that module inside the

ZIP archive.

Listing 19.23: zipimport_find_module.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

for module_name in ['zipimport_find_module', 'not_there']:
    print(module_name, ':', importer.find_module(module_name))

If the module is found, the zipimporter instance is returned. Otherwise, None is returned.

$ python3 zipimport_find_module.py

zipimport_find_module : <zipimporter object

"zipimport_example.zip">

not_there : None

19.3.3 Accessing Code

The get_code() method loads the code object for a module from the archive.

Listing 19.24: zipimport_get_code.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

code = importer.get_code('zipimport_get_code')

print(code)

The code object is not the same as a module object, but is used to create one.

$ python3 zipimport_get_code.py

<code object <module> at 0x1012b4ae0, file

"./zipimport_get_code.py", line 6>
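How a code object can be turned into a module by hand can be sketched as follows. This is only an illustration of the relationship, not what the import system does in full: the resulting module is not registered in sys.modules and has no __file__ or __loader__ set. The archive code_demo.zip and the module name answer are hypothetical.

```python
import os
import tempfile
import types
import zipfile
import zipimport

# Throwaway archive with one hypothetical module, answer.py.
archive = os.path.join(tempfile.mkdtemp(), 'code_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('answer.py', 'ANSWER = 6 * 7\n')

importer = zipimport.zipimporter(archive)
code = importer.get_code('answer')

# Create an empty module object and execute the code object
# inside its namespace to populate it.
module = types.ModuleType('answer')
exec(code, module.__dict__)

print(module.ANSWER)
```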

To load the code as a usable module, use load_module() instead.

Listing 19.25: zipimport_load_module.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

module = importer.load_module('zipimport_get_code')

print('Name :', module.__name__)

print('Loader :', module.__loader__)

print('Code :', module.code)

The result is a module object configured as though the code had been loaded from a regular

import.

$ python3 zipimport_load_module.py

<code object <module> at 0x1007b4c00, file

"./zipimport_get_code.py", line 6>

Name : zipimport_get_code

Loader : <zipimporter object "zipimport_example.zip">

Code : <code object <module> at 0x1007b4c00, file

"./zipimport_get_code.py", line 6>

19.3.4 Source

As with the inspect (page 1311) module, it is possible to retrieve the source code for a module from the ZIP archive with the zipimport module, if the archive includes the source. In

the following example, only zipimport_get_source.py is added to zipimport_example.zip;

the rest of the modules are just added as the .pyc files.

Listing 19.26: zipimport_get_source.py

import zipimport

modules = [
    'zipimport_get_code',
    'zipimport_get_source',
]

importer = zipimport.zipimporter('zipimport_example.zip')

for module_name in modules:
    source = importer.get_source(module_name)
    print('=' * 80)
    print(module_name)
    print('=' * 80)
    print(source)
    print()

If the source for a module is not available, get_source() returns None.

$ python3 zipimport_get_source.py

================================================================

zipimport_get_code

================================================================

None

================================================================

zipimport_get_source

================================================================

#!/usr/bin/env python3

#

# Copyright 2007 Doug Hellmann.

#

"""Retrieving the source code for a module within a zip archive.

"""

#end_pymotw_header

import zipimport

modules = [
    'zipimport_get_code',
    'zipimport_get_source',
]

importer = zipimport.zipimporter('zipimport_example.zip')

for module_name in modules:
    source = importer.get_source(module_name)
    print('=' * 80)
    print(module_name)
    print('=' * 80)
    print(source)
    print()

19.3.5 Packages

To determine whether a name refers to a package instead of a regular module, use

is_package().

Listing 19.27: zipimport_is_package.py

import zipimport

importer = zipimport.zipimporter('zipimport_example.zip')

for name in ['zipimport_is_package', 'example_package']:
    print(name, importer.is_package(name))

In this case, zipimport_is_package came from a module and the example_package is a

package.

$ python3 zipimport_is_package.py

zipimport_is_package False

example_package True

19.3.6 Data

Sometimes source modules or packages need to be distributed with non-code data. Images,

configuration files, default data, and test fixtures are just a few examples of these types of

data. Frequently, the module __path__ or __file__ attributes are used to find these data

files relative to where the code is installed.

For example, with a “normal” module, the file system path can be constructed from the

__file__ attribute of the imported package as in the following code.

Listing 19.28: zipimport_get_data_nozip.py

import os

import example_package

# Find the directory containing the imported

# package and build the data filename from it.

pkg_dir = os.path.dirname(example_package.__file__)

data_filename = os.path.join(pkg_dir, 'README.txt')

# Read the file and show its contents.

print(data_filename, ':')

print(open(data_filename, 'r').read())

The output will depend on where the sample code is located on the file system.

$ python3 zipimport_get_data_nozip.py

.../example_package/README.txt :

This file represents sample data which could be embedded in the

ZIP archive. You could include a configuration file, images, or

any other sort of noncode data.

If the example_package is imported from the ZIP archive instead of the file system, using

__file__ does not work.

Listing 19.29: zipimport_get_data_zip.py

import sys

sys.path.insert(0, 'zipimport_example.zip')

import os

import example_package

print(example_package.__file__)

data_filename = os.path.join(
    os.path.dirname(example_package.__file__),
    'README.txt',
)

print(data_filename, ':')

print(open(data_filename, 'rt').read())

The __file__ of the package refers to the ZIP archive, rather than a directory, so building

up the path to the README.txt file gives the wrong value.

$ python3 zipimport_get_data_zip.py

zipimport_example.zip/example_package/__init__.pyc

zipimport_example.zip/example_package/README.txt :

Traceback (most recent call last):

File "zipimport_get_data_zip.py", line 20, in <module>

print(open(data_filename, 'rt').read())

NotADirectoryError: [Errno 20] Not a directory:

'zipimport_example.zip/example_package/README.txt'

A more reliable way to retrieve the file is to use the get_data() method. The zipimporter

instance that loaded the module can be accessed through the __loader__ attribute of the

imported module.

Listing 19.30: zipimport_get_data.py

import sys

sys.path.insert(0, 'zipimport_example.zip')

import os

import example_package

print(example_package.__file__)

data = example_package.__loader__.get_data(
    'example_package/README.txt')

print(data.decode('utf-8'))

pkgutil.get_data() uses this interface to access data from within a package. The value

returned is a byte string, which needs to be decoded to a Unicode string before it is printed.

$ python3 zipimport_get_data.py

zipimport_example.zip/example_package/__init__.pyc

This file represents sample data which could be embedded in the

ZIP archive. You could include a configuration file, images, or

any other sort of noncode data.
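The pkgutil.get_data() route mentioned above can be sketched in the same way. The archive data_demo.zip and the package name demo_pkg are hypothetical and are built in a temporary directory. The call takes the package name and a resource path relative to that package, so the caller never touches __loader__ or __file__ directly.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile

# Throwaway archive holding a hypothetical package and a data file.
archive = os.path.join(tempfile.mkdtemp(), 'data_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('demo_pkg/__init__.py', '# package marker\n')
    zf.writestr(
        'demo_pkg/README.txt',
        'sample data inside the archive\n',
    )

sys.path.insert(0, archive)
importlib.invalidate_caches()

# The resource name is relative to the package, and the result
# is a byte string, just as with the loader's own get_data().
data = pkgutil.get_data('demo_pkg', 'README.txt')
print(data.decode('utf-8'))
```

The same call works when the package is installed as a regular directory, which makes it a reasonable default for code that may or may not be shipped inside a ZIP archive.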

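For modules not imported via zipimport, __loader__ refers to a different loader object, or may not be set at all, so code using this technique should not assume a zipimporter.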

TIP

Related Reading

• Standard library documentation for zipimport. [9]

• Python 2 to 3 porting notes for zipimport (page 1365).

• imp: Other import-related functions.

• pkgutil (page 1334): Provides a more generic interface to get_data().

• zipfile (page 511): Read and write ZIP archive files.

• PEP 302 [10]: New Import Hooks.

9 https://docs.python.org/3.5/library/zipimport.html

10 www.python.org/dev/peps/pep-030


