
Preparation for packaging

Carl Sandrock edited this page Feb 6, 2020 · 20 revisions

At some point we want to package the functions we have developed here as a package on PyPI. This document details the steps that we should follow to do that.

The following points need to be addressed:

  • Minimal package structure requirements
  • Correct arguments for setup()
  • Choose a versioning scheme
  • Testing
  • Documentation
  • Project dependencies
  • Clean repository
  • Licensing
  • Project packaging

1. Package Structure

The package must conform to the required PyPI structure before it can be published. This section discusses the files and structure needed to comply with the minimum requirements.

1.1 setup.py

Perhaps the most important file in the distribution, the setup.py file is used to install the distribution onto the user’s computer. This file is also used for the configuration of various aspects of the distribution.

1.2 setup.cfg

This is a file that contains the default options that are used in the setup.py commands.

1.3 README.rst

This is the readme file for the project. Although not strictly required, it gives information on the goals of the project. The most common file format is reStructuredText, using the .rst extension. Sites such as GitHub render this file automatically.

1.4 MANIFEST.in

This file is only needed in cases where python setup.py sdist (or bdist_wheel) doesn't automatically include certain non-install files. For details on writing a MANIFEST.in file see this MANIFEST.in template.

1.5 Package Folder

Common practice dictates that all the Python modules and packages should be included in a single top-level package with the same name as the project.

1.6 .gitignore

The Python build system creates a number of intermediate files that should not be committed to the repository. The .gitignore file tells Git to ignore these files so that they do not appear in the change list. A typical .gitignore looks as follows:

# Compiled python modules
*.pyc

# Setuptools distribution folder
/dist/

# Python egg metadata
/*.egg-info

1.7 Minimal Structure

The minimal structure of the initial files must look like the following:

skogestad/
    skogestad/
        __init__.py
        ...
    setup.py
    setup.cfg
    README.rst
    MANIFEST.in
    .gitignore
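
The layout above can be checked mechanically before a release. The following is a minimal sketch, not part of the project: the helper name missing_entries is hypothetical, and the list of required paths is simply copied from the structure shown above.

```python
from pathlib import Path

# Paths copied from the minimal structure listed above.
REQUIRED_ENTRIES = [
    "skogestad/__init__.py",
    "setup.py",
    "setup.cfg",
    "README.rst",
    "MANIFEST.in",
    ".gitignore",
]


def missing_entries(repo_root, required=REQUIRED_ENTRIES):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [entry for entry in required if not (root / entry).exists()]
```

Calling missing_entries('.') from the repository root returns an empty list once every file in the minimal structure is present.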

2. setup() args

The setup.py file contains a call to the setup() function, which is imported from setuptools. The most important arguments to setup() are the following:

  • name: The name under which the project is distributed on PyPI and installed with pip. By convention it matches the name of the importable top-level package. The name in this case is skogestad.
  • version: Gives the current version of the project.
  • description: Short description of the project.
  • url: Homepage for the project. In this case the GitHub repository.
  • author: Gives the author's name.
  • author_email: Contact details of the author.
  • license: Provides details about the project license.
  • packages: All the packages to be included in the distribution. 'setuptools.find_packages' can be used to find all packages automatically; otherwise the packages are listed manually.
  • install_requires: This keyword specifies the package dependencies, e.g. numpy. Each requirement must conform to PEP 508.

For a full list of arguments please see the setuptools documentation. The setup.py file will have the following format:

from setuptools import setup, find_packages

setup(
    name='skogestad',
    version='',
    description='some description',
    url='https://github.com/alchemyst/Skogestad-Python',
    author='',
    author_email='',
    license='',
    packages=find_packages(),
    install_requires=[],
)

The format above is not final and is only for illustration purposes.

3. Versioning Scheme

Several different versioning schemes exist, and the choice between them depends greatly on the needs of the project. All versioning schemes must comply with PEP 440. The following schemes are most common:

  • Semantic: Recommended for new projects. Three-part MAJOR.MINOR.MAINTENANCE scheme. This scheme makes it easy to use "compatible release" specifiers. Python projects using this scheme must conform to clauses 1-8 of the Semantic Versioning specification.
  • Date based: Recommended for projects with a regular time-based release cadence. Generally takes the format YEAR.MONTH.
  • Serial: Simplest form, consisting of a single number that is incremented with each release.

Other, more complex schemes exist, but the schemes above are most likely sufficient for the needs of this project. The choice of versioning scheme must be discussed with the author before implementation.
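
To make the semantic scheme concrete, the sketch below increments one part of a MAJOR.MINOR.MAINTENANCE version string and resets the lower-order parts. The function name bump_version is hypothetical and the version numbers are only illustrative; pre-release and development suffixes allowed by PEP 440 are deliberately not handled.

```python
def bump_version(version, part):
    """Return `version` with the named part incremented.

    `version` is a 'MAJOR.MINOR.MAINTENANCE' string and `part` is one of
    'major', 'minor' or 'maintenance'. Lower-order parts are reset to zero.
    """
    major, minor, maintenance = (int(piece) for piece in version.split("."))
    if part == "major":
        return "{}.0.0".format(major + 1)
    if part == "minor":
        return "{}.{}.0".format(major, minor + 1)
    if part == "maintenance":
        return "{}.{}.{}".format(major, minor, maintenance + 1)
    raise ValueError("unknown version part: {!r}".format(part))
```

For example, bump_version('1.4.2', 'minor') gives '1.5.0'. A user who pins the package with the compatible release specifier ~=1.4.2 would receive 1.4.3 but not 1.5.0, which is why the part that is incremented matters.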

4. Testing

Tests are required to ensure the package performs the way it is intended. The testing of the package must be included in the setup() arguments in setup.py. The tests will also have their own subdirectory to prevent unnecessary clutter in the main directory.

The directory structure will now include a tests folder as displayed below:

skogestad/
    skogestad/
        __init__.py
        ...
    tests/
        __init__.py
        ...
    setup.py
    setup.cfg
    README.rst
    MANIFEST.in
    .gitignore

A test runner is needed to run the tests on the package. If unsure about what to use, Nose is suggested here, although its own documentation describes it as being in maintenance mode. The testing must be incorporated into setup() as follows:

setup(
    ...
    test_suite='nose.collector',
    tests_require=['nose'],
    ...
)

To run the tests use the command python setup.py test.
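
A test module in the tests folder could look like the sketch below. It uses the standard library's unittest, whose TestCase classes Nose also collects. The function dc_gain is a hypothetical stand-in defined inline so that the example is self-contained; a real test would import the function under test from skogestad instead. The file name tests/test_dc_gain.py is likewise an assumption.

```python
import unittest


def dc_gain(numerator, denominator):
    """Hypothetical stand-in for a package function.

    Returns the steady-state gain of a transfer function whose polynomial
    coefficients are given in descending powers of s.
    """
    return numerator[-1] / denominator[-1]


class TestDcGain(unittest.TestCase):
    def test_first_order_system(self):
        # G(s) = 2 / (5s + 1) has a steady-state gain of 2.
        self.assertAlmostEqual(dc_gain([2.0], [5.0, 1.0]), 2.0)

    def test_integrator_has_no_finite_gain(self):
        # G(s) = 1 / s: the constant term of the denominator is zero.
        with self.assertRaises(ZeroDivisionError):
            dc_gain([1.0], [1.0, 0.0])
```

Each test method checks one behaviour and carries a comment stating the expected result, which keeps failures easy to interpret.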

5. Documentation

The project's documentation must be expanded and completed before the publication of the project on PyPI. It is recommended that the documentation be set up according to the Numpy/Scipy style guide. It is also important for the function docstrings to be complete and checked for spelling mistakes.
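
As an illustration of the Numpy/Scipy docstring layout, the sketch below documents a small function with Parameters, Returns and Examples sections. The function first_order_step is hypothetical and is not taken from the project.

```python
import math


def first_order_step(gain, time_constant, t):
    """Evaluate the unit step response of a first-order system.

    Parameters
    ----------
    gain : float
        Steady-state gain of the system.
    time_constant : float
        Time constant of the system, in the same units as `t`. Must be
        positive.
    t : float
        Time at which the response is evaluated.

    Returns
    -------
    float
        Value of the response ``gain * (1 - exp(-t / time_constant))``.

    Examples
    --------
    >>> first_order_step(2.0, 5.0, 0.0)
    0.0
    """
    return gain * (1.0 - math.exp(-t / time_constant))
```

The section headings underlined with dashes are what numpydoc parses when the documentation is built.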

6. Project Dependencies

The end goal of this task is to have as few project dependencies as possible. These dependencies are specified in setup() using the install_requires argument. The main reason for reducing the number of dependencies is ease of maintenance. If the project depends on some feature of, e.g., sympy, and the sympy developers remove that feature in a new release, the project will no longer function properly. The likelihood of such a failure increases with the number of dependencies, and keeping track of the changes in each dependency makes the project's source code harder to maintain.

A module that can be considered part of the core structure of the project, such as numpy, does not have to be removed. Any other dependency can only be justified if a large part of that module is used for various tasks in the project; otherwise it must be removed. The current project dependencies include:

control
future
matplotlib
numpy
scipy
sympy
mock
numpydoc

numpy, scipy and matplotlib are core calculation and plotting libraries; future is used for cross-version compatibility; and numpydoc, along with mock, is used for the documentation. The dependency on sympy and control must therefore be reduced and, if possible, removed entirely.
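
One way to judge how heavily the project leans on a dependency such as sympy or control is to count the import statements that reference it. The sketch below does this for a single source string using the standard library's ast module. The function name count_imports is hypothetical, and the number of import statements is only a rough proxy for how much of a module is actually used.

```python
import ast
from collections import Counter


def count_imports(source):
    """Count how often each top-level module is imported in `source`."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                counts[alias.name.split(".")[0]] += 1
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) refer to the project itself.
            counts[node.module.split(".")[0]] += 1
    return counts
```

Summing the counters over every .py file in the repository gives a first indication of which dependencies are used in only a handful of places and are therefore candidates for removal.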

7. Repository Cleanup

The current state of the repository in terms of organisation is not good. Several files in the repository have been accidentally added by previous students and must be removed or reclassified. The end goal is to have a structured scheme where only the install and configuration files are not inside a subdirectory. A possible format might look like the following:

skogestad/
    skogestad/
        __init__.py
        ...
    tests/
        __init__.py
        ...
    examples/
        __init__.py
        ...
    figures/
        __init__.py
        ...
    chapters/
        __init__.py
        ...
    setup.py
    setup.cfg
    README.rst
    MANIFEST.in

Here the files are organised in the following way:

  • skogestad: Contains utils.py and utilsplot.py.
  • tests: Contains all the test scripts.
  • examples: Code for the worked examples.
  • figures: Code generating certain figures in the book.
  • chapters: Any code pertaining to a specific chapter.

Any other scripts in the repository must be either removed or reclassified to fit into one of the folders above. A problem that might arise from this structure is that the example code cannot be run if skogestad is not installed on the user's computer. This can be worked around by using the PyCharm IDE, by adding the folder to the environment variables in Windows (typically PYTHONPATH), or, when using other IDEs, by adding the following lines at the beginning of a script:

import sys

sys.path.append('path/to/folder')

It is important to add an empty __init__.py file to the root level of the folder to enable Python to detect the contents when using the above method.
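
A hard-coded path only works on one machine. As a possible alternative, the sketch below computes the folder relative to the script itself. The helper name add_repo_root_to_path is hypothetical, and the default of one level up assumes the script lives in a folder such as examples/ directly below the repository root.

```python
import os
import sys


def add_repo_root_to_path(script_path, levels_up=1):
    """Put the folder `levels_up` levels above the script on sys.path."""
    folder = os.path.dirname(os.path.abspath(script_path))
    for _ in range(levels_up):
        folder = os.path.dirname(folder)
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return folder
```

A script in examples/ would call add_repo_root_to_path(__file__) before importing skogestad.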

8. Licensing

The project will most likely be licensed using an open-source license. Open-source licenses make it easy for others to contribute to the project and allow the author to get recognition for their work. A license grants the users of the project specific permissions. Two of the most common licenses are:

  • General Public License (GPL): Grants the user permission to copy, freely distribute, sell and modify the software in any way, on condition that distributed copies and derived works are released under the same license. For more information see this page.
  • MIT License: One of the shortest and most permissive licenses. Its only restriction is that the copyright notice and license text must accompany the software. For the actual license see this MIT License page.

Many more licenses are available; for a full list see the Open Source Initiative's licenses page. For more information on choosing a license see the following links:

The license file must be included as a text file in the root directory of the project as follows:

skogestad/
    ...
    License.txt
    setup.py
    ...

NOTE: The MIT license explicitly requires this. The GPL likewise expects a copy of the license to accompany the software.

To indicate the license in setup() (required) use the following argument:

setup(
    ...
    license='License name',
    ...
)

9. Final Packaging

9.1 Source Distribution

To install the package from PyPI, a distribution must first be created. The minimal requirement is a Source Distribution (sdist) of the project. This is an unbuilt package and requires an additional build step when installed using pip.
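
Before uploading, it is worth confirming that the source distribution contains the files it should, for example the license file or anything listed in MANIFEST.in. The sketch below lists the members of the archive with the standard library's tarfile module. It assumes the sdist is a gzipped tarball, and the helper names sdist_members and missing_from_sdist are hypothetical.

```python
import tarfile


def sdist_members(path):
    """Return the sorted member names of a gzipped source distribution."""
    with tarfile.open(path, "r:gz") as archive:
        return sorted(archive.getnames())


def missing_from_sdist(path, expected):
    """Return the entries of `expected` that no archive member ends with."""
    names = sdist_members(path)
    return [item for item in expected
            if not any(name.endswith("/" + item) for name in names)]
```

Calling missing_from_sdist on the archive in the dist folder with a list such as ['setup.py', 'README.rst'] returns whichever of those files were left out.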

9.2 Built Distribution

A built distribution can be installed without going through a build step on the user's machine. This is achieved by creating a wheel for the project. A wheel is a built package and has a substantially faster install time for the end user. There are three types of wheels for Python projects:

  • Universal Wheel: For pure Python projects that natively support both Python 2 and 3.
  • Pure Python Wheel: For pure Python projects that do not natively support both Python 2 and 3.
  • Platform Wheel: For Python projects containing compiled extensions.

Since this project natively supports Python 2 and 3 a universal wheel can be used. To build a Universal wheel use the following command:

python setup.py bdist_wheel --universal

The --universal flag can also be permanently enabled in setup.cfg:

[bdist_wheel]
universal=1
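
The type of a wheel can be read from its file name, whose last three dash-separated fields are the Python, ABI and platform tags. The sketch below classifies a file name into the three types listed above. The function name classify_wheel and the example file names are hypothetical.

```python
def classify_wheel(filename):
    """Classify a wheel by the compatibility tags in its file name."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: {!r}".format(filename))
    # The last three dash-separated fields are the Python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = filename[:-len(".whl")].split("-")[-3:]
    if abi_tag != "none" or platform_tag != "any":
        return "platform"
    if python_tag == "py2.py3":
        return "universal"
    return "pure python"
```

A universal build of this project would be expected to produce a file name ending in py2.py3-none-any.whl, which the function reports as 'universal'.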

Final Comments

The descriptions given above can be used to comply with the minimum requirements in order to publish the project on PyPI. For more information please see the following pages: