Schroedinger's importer

27th August, 2018

A Python package I'm working on combines several submodules with mixed licensing — some will be open source and redistributed, others will be proprietary and in-house only. I wanted to ease importing of the package by automatically detecting which submodules are present, and dynamically importing only those.

We wanted the option to include or exclude modules from redistribution by simply removing them. For example, we might have a directory structure like this:

package
|-- __init__.py
|-- open_source
|   |-- included1.py
|   `-- included2.py
`-- proprietary
    |-- inhouse1.py
    `-- inhouse2.py

The files inhouse1.py and inhouse2.py would be present for internal development, but would be removed when distributing an open-source version of the package.

I needed to find a solution that satisfied several constraints:

The direct solution would be to write a custom __init__.py, something like

import warnings
__all__ = []

# - Import `included1`
try:
    import open_source.included1
    __all__.append('included1')
except ModuleNotFoundError:
    # - Module was removed from this distribution: skip silently
    pass
except ImportError:
    warnings.warn("Couldn't import included1")

# - Import `inhouse1`
try:
    import proprietary.inhouse1
    __all__.append('inhouse1')
except ModuleNotFoundError:
    # - Module was removed from this distribution: skip silently
    pass
except ImportError:
    warnings.warn("Couldn't import inhouse1")

…

but this fails on a couple of counts: it's overly verbose, and you have to make extensive edits to add or remove a module.

Enter… importlib!

Python handles importing by internally using the importlib package. This provides several facilities for looking up packages and modules; creating module loaders; loading and reloading, etc. You can therefore dynamically load modules, and even write your own importer package if you want. Unfortunately I found the documentation of importlib somewhat brief, and couldn't find an example that felt like native import in Python, as far as the end user was concerned. In particular, I couldn't find an example that would load the modules into the namespace the same way that import does.

The key function for our purposes is import_module():

mod = importlib.import_module(name, package=None)

This function imports a module, handling lookup, importing parent modules, and running other __init__.py files where necessary. It returns a module object mod. name is the name of the module to import, specified in either absolute or relative terms. If you use relative referencing, then the package argument specifies the base package for resolving the module name.
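To make the absolute and relative forms concrete, here is a minimal sketch that uses the standard-library json package as a stand-in (it is not part of the package discussed in this post). Both calls should resolve to the same module object:

```python
import importlib

# Absolute form: the full dotted path to the module.
abs_mod = importlib.import_module('json.decoder')

# Relative form: a leading dot, resolved against the `package` argument.
rel_mod = importlib.import_module('.decoder', package='json')

# Both calls return the same module object, cached in sys.modules.
print(abs_mod is rel_mod)  # True
print(abs_mod.__name__)    # json.decoder
```

The relative form is the one that matters inside an __init__.py, because the submodule names can then be written without repeating the package name.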

Below, as a functioning outline, is my solution: a smart __init__.py that scans for a list of defined modules, and configures __all__ so that it contains the modules that can be successfully imported.

## __init__.py Smart importer for submodules

# - Dictionary {module file} -> {class name to import}
dModules = {
    '.open_source.included1': 'Included1Class',
    '.open_source.included2': 'Included2Class',
    '.inhouse.inhouse1': 'InHouse1Class',
    '.inhouse.inhouse2': 'InHouse2Class',
    '.inhouse.inhouse3': 'InHouse3Class',
}

# - Define current package (i.e. location of this `__init__.py`)
strBasePackage = 'mypackage.subpackage'

# - Required imports
import importlib
from warnings import warn

# - Initialise list of available modules
__all__ = []

# - Loop over submodules to attempt import
for strModule, strClass in dModules.items():
    try:
        # - Attempt to import the module and fetch the named class from it
        print('Trying to import ' + strBasePackage + strModule + '.' + strClass)
        globals()[strClass] = getattr(importlib.import_module(strModule, strBasePackage), strClass)

        # - Add the resulting class to __all__
        __all__.append(strClass)

    except ModuleNotFoundError:
        # - Ignore ModuleNotFoundError
        pass

    except ImportError:
        # - Raise a warning if the package could not be imported for any other reason
        warn('Could not load package ' + strModule)

Usage

The only configuration needed is the dictionary at the beginning of the script, which maps each module to the class to import from it:

# - Dictionary {module file} -> {class name to import}
dModules = {
    '.open_source.included1': 'Included1Class',
    '.open_source.included2': 'Included2Class',
    '.inhouse.inhouse1': 'InHouse1Class',
    '.inhouse.inhouse2': 'InHouse2Class',
    '.inhouse.inhouse3': 'InHouse3Class',
}

The package location also needs to be specified:

# - Define current package (i.e. location of this `__init__.py`)
strBasePackage = 'mypackage.subpackage'

This line needs to be set once per __init__.py to specify its location in the package/module namespace. Possibly this could be inferred programmatically, but I didn't find a simple way to do that (please let me know if you have a smart way to solve this!).
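One candidate for inferring the location, offered as a sketch rather than something tested against this package: inside an __init__.py, the built-in __name__ holds the fully qualified package name, so it can be passed as the package argument in place of a hard-coded string. The example below builds a throwaway package on disk to try this out. The names demo_pkg, INIT_SOURCE and build_demo_package are hypothetical and exist only for illustration.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Source for the hypothetical demo_pkg/__init__.py.
INIT_SOURCE = textwrap.dedent("""
    import importlib

    # Hypothetical module table; only included1 exists on disk below.
    dModules = {
        '.open_source.included1': 'Included1Class',
        '.inhouse.inhouse1': 'InHouse1Class',
    }

    __all__ = []
    for strModule, strClass in dModules.items():
        try:
            # __name__ is 'demo_pkg' here, so no hard-coded base package.
            mod = importlib.import_module(strModule, __name__)
        except ModuleNotFoundError:
            continue
        globals()[strClass] = getattr(mod, strClass)
        __all__.append(strClass)
""")


def build_demo_package(root):
    """Create demo_pkg with only the open-source module present."""
    pkg = Path(root) / 'demo_pkg'
    (pkg / 'open_source').mkdir(parents=True)
    (pkg / '__init__.py').write_text(INIT_SOURCE)
    (pkg / 'open_source' / 'included1.py').write_text(
        'class Included1Class:\n    pass\n')


tmp_root = tempfile.mkdtemp()
build_demo_package(tmp_root)
sys.path.insert(0, tmp_root)
importlib.invalidate_caches()

demo_pkg = importlib.import_module('demo_pkg')
print(demo_pkg.__all__)  # ['Included1Class']
```

Because __name__ follows the package wherever it is moved or nested, the same __init__.py could be copied between subpackages without editing, assuming nothing else in it depends on the absolute location.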

The code then loops over the dictionary of modules, and attempts to import the required class from each using importlib.import_module(). If the module doesn't exist (ModuleNotFoundError) then it is silently skipped. If importing the module fails for any other reason (ImportError) then a warning is issued.
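The order of the two except clauses matters, because ModuleNotFoundError is a subclass of ImportError. The following sketch isolates that logic in a small helper; try_import is a hypothetical name used for illustration and is not part of the __init__.py above.

```python
import importlib
import warnings


def try_import(name, package=None):
    """Return the module, or None if it is absent or fails to import."""
    try:
        return importlib.import_module(name, package)
    except ModuleNotFoundError:
        # Must come first: ModuleNotFoundError is a subclass of ImportError,
        # so the broader clause below would otherwise catch it.
        return None
    except ImportError:
        warnings.warn('Could not load module ' + name)
        return None


print(issubclass(ModuleNotFoundError, ImportError))  # True
print(try_import('no_such_module_for_demo'))         # None
print(try_import('json').__name__)                   # json
```

One caveat: ModuleNotFoundError is also raised when a module is present but one of its own dependencies is missing, so a silent skip can hide that case. The exception's name attribute identifies which module was not found, and could be compared against the expected module name if the distinction matters.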

Enhancements

The code above is a simple example of how to perform dynamic importing in __init__.py. There are various ways that the importer could be improved, for example: