Schroedinger's importer
27th August, 2018
A Python package I'm working on combines several submodules with mixed licensing — some will be open source and redistributed, others will be proprietary and in-house only. I wanted to ease importing of the package by automatically detecting which submodules are present, and dynamically importing only those.
We wanted the option to include or exclude modules from redistribution by simply removing them. For example, we might have a directory structure like this:
    package
    |-- __init__.py
    |-- open_source
    |   |-- included1.py
    |   `-- included2.py
    `-- proprietary
        |-- inhouse1.py
        `-- inhouse2.py
The files inhouse1.py and inhouse2.py would be present for internal development, but would be removed when distributing an open-source version of the package.
I needed to find a solution that satisfied several constraints:
- All modules should be available when using from package import *
- Adding a new module should only require one line of code in __init__.py
- Modules could be excluded by simply deleting a subdirectory or .py file
- Modules that didn't exist should be silently skipped
- If importing a module failed for another reason, a warning would be raised
The direct solution would be to write a custom __init__.py, something like
    import warnings

    __all__ = []

    # - Import `included1` (full dotted path, bound to the name `included1`)
    try:
        import package.open_source.included1 as included1
        __all__.append('included1')
    except ModuleNotFoundError:
        pass
    except ImportError:
        warnings.warn("Couldn't import included1")

    # - Import `inhouse1`
    try:
        import package.proprietary.inhouse1 as inhouse1
        __all__.append('inhouse1')
    except ModuleNotFoundError:
        pass
    except ImportError:
        warnings.warn("Couldn't import inhouse1")

    …
but this fails on a couple of counts — it's overly verbose, and you have to do extensive edits to add or remove a module.
Enter… importlib!
Python handles importing by internally using the importlib package. This provides several facilities for looking up packages and modules; creating module loaders; loading and reloading, etc. You can therefore dynamically load modules, and even write your own importer package if you want. Unfortunately I found the documentation of importlib somewhat brief, and couldn't find an example that felt like native import in Python, as far as the end user was concerned. In particular, I couldn't find an example that would load the modules into the namespace the same way that import does.
The key function for our purposes is import_module():
mod = importlib.import_module(name, package=None)
This function imports a module, handling the process of lookup, importing parent modules, and running other __init__.py files when necessary. It returns a module object mod. name is the name of the module to import, specified in either absolute or relative terms. If you use relative referencing (a name with a leading dot), then the package argument specifies the base package for resolving the module name.
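As a small illustration of the two naming styles, here is a sketch that uses the standard library's json package as a stand-in (it has nothing to do with the package discussed in this post). Both calls should resolve to the same module object.

```python
import importlib

# Absolute name: the full dotted path to the module
mod_abs = importlib.import_module('json.decoder')

# Relative name: the leading dot is resolved against `package`
mod_rel = importlib.import_module('.decoder', package='json')

# Both routes hand back the same cached module object
same_module = mod_abs is mod_rel
print(mod_abs.__name__, same_module)
```

Note that a relative name without a package argument raises a TypeError, so the base package always has to come from somewhere.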
Below, as a functioning outline, is my solution: a smart __init__.py that scans for a list of defined modules, and configures __all__ so that it contains the modules that can be successfully imported.
    ## __init__.py Smart importer for submodules

    # - Dictionary {module file} -> {class name to import}
    dModules = {
        '.open_source.included1': 'Included1Class',
        '.open_source.included2': 'Included2Class',
        '.inhouse.inhouse1': 'InHouse1Class',
        '.inhouse.inhouse2': 'InHouse2Class',
        '.inhouse.inhouse3': 'InHouse3Class',
    }

    # - Define current package (i.e. location of this `__init__.py`)
    strBasePackage = 'mypackage.subpackage'

    # - Required imports
    import importlib
    from warnings import warn

    # - Initialise list of available modules
    __all__ = []

    # - Loop over submodules to attempt import
    for strModule, strClass in dModules.items():
        try:
            # - Attempt to import the package
            print('Trying to import ' + strBasePackage + strModule + '.' + strClass)

            # - Look up the class by name (the variable, not the string 'strClass')
            #   and bind it in this module's namespace
            globals()[strClass] = getattr(importlib.import_module(strModule, strBasePackage), strClass)

            # - Add the resulting class to __all__
            __all__.append(strClass)

        except ModuleNotFoundError:
            # - Ignore ModuleNotFoundError
            pass

        except ImportError:
            # - Raise a warning if the package could not be imported for any other reason
            warn('Could not load package ' + strModule)
Usage
The only configuration needed is at the beginning of the script: a dictionary of the modules and classes to import.
    # - Dictionary {module file} -> {class name to import}
    dModules = {
        '.open_source.included1': 'Included1Class',
        '.open_source.included2': 'Included2Class',
        '.inhouse.inhouse1': 'InHouse1Class',
        '.inhouse.inhouse2': 'InHouse2Class',
        '.inhouse.inhouse3': 'InHouse3Class',
    }
The package location also needs to be specified:
    # - Define current package (i.e. location of this `__init__.py`)
    strBasePackage = 'mypackage.subpackage'
This line needs to be set once per __init__.py to specify its location in the package/module namespace. Possibly this could be inferred programmatically, but I didn't find a simple way to do that (please let me know if you have a smart way to solve this!).
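One candidate answer to the question above, offered as a sketch and not a tested drop-in: inside a package's __init__.py, the built-in __name__ variable holds the package's own dotted name, so it could stand in for the hard-coded string. The demo below builds a throwaway package on disk to show this; demo_pkg, sub and the file contents are all hypothetical names made up for the illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: demo_pkg/sub/{__init__.py, included1.py}
# (hypothetical names, for illustration only)
root = Path(tempfile.mkdtemp())
sub = root / 'demo_pkg' / 'sub'
sub.mkdir(parents=True)
(root / 'demo_pkg' / '__init__.py').write_text('')
(sub / 'included1.py').write_text('class Included1Class:\n    pass\n')

# Inside __init__.py, __name__ is the dotted name of the package itself,
# so it can replace a hard-coded strBasePackage
(sub / '__init__.py').write_text(
    'import importlib\n'
    'strBasePackage = __name__\n'
    'Included1Class = importlib.import_module(".included1", strBasePackage).Included1Class\n'
)

# Make the throwaway package importable, then import it
sys.path.insert(0, str(root))
pkg = importlib.import_module('demo_pkg.sub')
print(pkg.strBasePackage)
```

This only holds when the file is imported as part of a package; if an __init__.py is executed directly as a script, __name__ is '__main__' instead.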
The code then loops over the list of modules, and attempts to import the required class from each using importlib.import_module(). If the module doesn't exist (ModuleNotFoundError) then it is silently skipped. If importing the module fails (ImportError) then a warning is raised.
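To see the skip-or-warn behaviour in isolation, here is a minimal sketch that wraps the same try/except logic in a helper and runs it against a throwaway package. The helper name try_import, the package demo_optional and its files are assumptions made up for this demo; they are not part of the importer above.

```python
import importlib
import sys
import tempfile
import warnings
from pathlib import Path

# Throwaway package (hypothetical names): one good module, one that fails on import
root = Path(tempfile.mkdtemp())
pkg_dir = root / 'demo_optional'
pkg_dir.mkdir()
(pkg_dir / '__init__.py').write_text('')
(pkg_dir / 'good.py').write_text('class GoodClass:\n    pass\n')
(pkg_dir / 'broken.py').write_text('raise ImportError("simulated failure")\n')
sys.path.insert(0, str(root))


def try_import(module, cls, package):
    """Return the class, or None if the module is absent or fails to import."""
    try:
        return getattr(importlib.import_module(module, package), cls)
    except ModuleNotFoundError:
        # Must be listed first: ModuleNotFoundError is a subclass of ImportError
        return None
    except ImportError:
        warnings.warn('Could not load package ' + module)
        return None


# Record warnings so they can be inspected instead of printed
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    good = try_import('.good', 'GoodClass', 'demo_optional')
    missing = try_import('.missing', 'MissingClass', 'demo_optional')
    broken = try_import('.broken', 'BrokenClass', 'demo_optional')

messages = [str(w.message) for w in caught]
print(good, missing, broken, messages)
```

One caveat worth checking in your own setup: a module that exists but imports an uninstalled dependency also raises ModuleNotFoundError, so it would be skipped silently too. The exception's name attribute records which module could not be found, which may help tell the two cases apart.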
Enhancements
The code above is a simple example of how to perform dynamic importing in __init__.py. There are various ways that the importer could be improved, for example:
- Permitting modules alone to be imported, not just classes within modules;
- Allowing you to specify a list of classes from each module;
- Automatically detecting the base namespace, as mentioned above;
- Allowing renaming of modules / classes similar to the import … as … syntax; and
- Handling import errors more gracefully by printing a stack trace.
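A rough sketch of how a few of these enhancements might fit together is below. It is one possible design, not the importer from this post: the function collect_imports and the shape of its spec dictionary are hypothetical, and the standard library's json package is used as a stand-in so the example runs anywhere.

```python
import importlib
import traceback
import warnings


def collect_imports(spec, package):
    """Resolve {module: None | [name | (name, alias), ...]} into {exported name: object}."""
    found = {}
    for module, names in spec.items():
        try:
            mod = importlib.import_module(module, package)
        except ModuleNotFoundError:
            continue  # absent module: skip silently
        except ImportError:
            # Put the stack trace into the warning text
            warnings.warn('Could not load ' + module + '\n' + traceback.format_exc())
            continue

        if names is None:
            # Export the module itself under its last dotted component
            found[module.rsplit('.', 1)[-1]] = mod
            continue

        for entry in names:
            # A tuple means (name, alias); a plain string keeps its own name
            name, alias = entry if isinstance(entry, tuple) else (entry, entry)
            found[alias] = getattr(mod, name)
    return found


spec = {
    '.decoder': ['JSONDecoder', ('JSONDecodeError', 'DecodeError')],
    '.encoder': None,              # import the module alone
    '.not_there': ['Nothing'],     # does not exist, so it is skipped
}
exports = collect_imports(spec, 'json')
print(sorted(exports))
```

Inside an __init__.py, the result could then be merged in with globals().update(exports) and __all__ = list(exports).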
Photo credit
I need ideas for this prop by Robert Couse-Baker (CC BY 2.0)