Why should you care about creating packages?

- Others can install and use your code with a single command (pip install demo).
- You can install your package in editable mode while working on it (pip install -e . installs your package and keeps it up-to-date during development).
- Your code becomes importable from anywhere (from demo.main import say_hello, then test the function).
- Users can pin an exact version of your code (pip install demo==1.0.3).

Library vs Package vs Module:

- Module: a single .py file containing functions that belong together
- Package: a folder of such modules (with an __init__.py)
- Library: a collection of packages

Packaging Python code is quite easy: you require a single setup.py script to package your code in several distribution formats.
Let’s use the folder structure from an earlier post, and create a virtual environment in it:
➜ tree -a -L 2
.
├── .venv
│   └── ...
├── Pipfile
├── Pipfile.lock
├── src
│   └── demo
│       └── main.py
└── tests
    └── demo
        └── ...

9 directories, 3 files
Create a setup.py
file in the root directory that we will use to define the way we’d like to package our code, containing the following code:
"""Setup.py script for packaging project."""
from setuptools import setup, find_packages
import json
import os
def read_pipenv_dependencies(fname):
"""Get default dependencies from Pipfile.lock."""
filepath = os.path.join(os.path.dirname(__file__), fname)
with open(filepath) as lockfile:
lockjson = json.load(lockfile)
return [dependency for dependency in lockjson.get('default')]
if __name__ == '__main__':
setup(
name='demo',
version=os.getenv('PACKAGE_VERSION', '0.0.dev0'),
package_dir={'': 'src'},
packages=find_packages('src', include=[
'demo*'
]),
description='A demo package.',
install_requires=[
*read_pipenv_dependencies('Pipfile.lock'),
]
)
You can now call this script to package your code in several ways:
python setup.py develop # don't generate anything, just install locally
python setup.py bdist_egg # generate egg distribution, doesn't include dependencies
python setup.py bdist_wheel # generate versioned wheel, includes dependency metadata
python setup.py sdist --formats=zip,gztar,bztar,ztar,tar # source code
Run the first one in the list above. When it succeeds, you will be able to import your code as follows:
from demo.main import say_hello
Note: if you are receiving “No module named demo…”, you’ll need to add an empty __init__.py file in all folders you want to import from. In our example, that only includes the demo folder. You can read more about these __init__.py files here.
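For reference, a minimal sketch of what the package contents could look like; the say_hello function itself is an assumption here (it comes from the earlier post), only the import above is given in this article:

# src/demo/__init__.py (empty file that marks the folder as a package)

# src/demo/main.py
def say_hello(name: str = "world") -> str:
    """Return a friendly greeting (assumed implementation)."""
    return f"Hello, {name}!"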
Now that we were able to install the project, we should take a closer look at the arguments that we pass to the setuptools.setup
function:
- name: the name of your package
- version: every change to your code should yield a different package version, or else developers may install the same package version that’s suddenly behaving differently, breaking their code
- packages: a list of paths of all your python files
- install_requires: a list of package names and versions (just as in a requirements.txt file)

You can see that I wrote a simple function read_pipenv_dependencies to read the non-dev dependencies from the Pipfile.lock. Now I won’t have to specify dependencies manually. I also use os.getenv to read in an environment variable to determine the package version, which are nice segues to the next topics.
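Note that the list comprehension in read_pipenv_dependencies only yields package names. If you also want to pin the exact versions recorded in Pipfile.lock, a hedged variant could look like this (in the lock file, the 'version' key already contains the == specifier):

def read_pipenv_dependencies(fname):
    """Get default dependencies from Pipfile.lock, pinned to the locked versions."""
    filepath = os.path.join(os.path.dirname(__file__), fname)
    with open(filepath) as lockfile:
        lockjson = json.load(lockfile)
    # Each entry looks like {"requests": {"version": "==2.25.1", ...}, ...}
    return [
        name + spec.get('version', '')
        for name, spec in lockjson.get('default', {}).items()
    ]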
Just as I read in the Pipfile.lock
to specify my dependencies, I can also read in a README.md
file to display useful documentation as the long_description
. More information about this can be read on packaging.python.org.
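A minimal sketch of what that could look like, assuming a README.md sits next to setup.py:

with open(os.path.join(os.path.dirname(__file__), 'README.md')) as readme:
    long_description = readme.read()

setup(
    # ...the arguments shown earlier...
    long_description=long_description,
    long_description_content_type='text/markdown',
)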
In addition, you can create a proper documentation web page using readthedocs and sphinx. Create a folder for your documentation:
mkdir docs
Install sphinx:
pipenv install -d sphinx
Run the quickstart to generate the source directory for your documentation:
sphinx-quickstart
Now you can start populating the docs/index.rst
file with your documentation. Learn more about automating this process on sphinx-doc.org.
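Sphinx can generate a large part of that documentation from your docstrings (for example via the autodoc extension), and doctest examples in those docstrings are also picked up by the pytest --doctest-modules flag used in the Makefile further below. A small sketch, reusing the assumed say_hello function:

def say_hello(name: str = "world") -> str:
    """Return a friendly greeting.

    >>> say_hello("packaging")
    'Hello, packaging!'
    """
    return f"Hello, {name}!"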
As part of your packaging process, you’d want to apply some static code analyses, linting, and testing.
pipenv install -d mypy autopep8 \
flake8 pytest bandit pydocstyle
Preferably, you’d run a command that verifies the code style and runs tests and checks before you push your commits to the remote repository, and have the build pipeline fail if those checks don’t pass.
As we’re rapidly introducing new commands that are part of packaging our specific project, it is useful to record common commands. Most build automation tools (such as Gradle or npm) provide this feature by default.
Make is a tool to organize code compilation, traditionally used in C-oriented projects. But it can be used to run any other command.
By default, when you run make, it executes the first target in the Makefile; in the example below that means it will execute make help and print out the contents of the Makefile. If we run make test, it will first run make dev, as it’s stated as a dependency in the Makefile:
help:
	@echo "Tasks in \033[1;32mdemo\033[0m:"
	@cat Makefile

lint:
	mypy src --ignore-missing-imports
	flake8 src --ignore=$(shell cat .flakeignore)

dev:
	pip install -e .

test: dev
	pytest --doctest-modules --junitxml=junit/test-results.xml
	bandit -r src -f xml -o junit/security.xml || true

build: clean
	pip install wheel
	python setup.py bdist_wheel

clean:
	@rm -rf .pytest_cache/ .mypy_cache/ junit/ build/ dist/
	@find . -not -path './.venv*' -path '*/__pycache__*' -delete
	@find . -not -path './.venv*' -path '*/*.egg-info*' -delete
As you can see, it’s now quite easy for new developers to contribute to the project: they get a nice overview of common commands, for example make build to build a wheel.
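And if you want the lint and test targets to run automatically before every push (as suggested earlier), a minimal, hypothetical .git/hooks/pre-push hook could simply delegate to the Makefile:

#!/bin/sh
# .git/hooks/pre-push (make it executable with chmod +x)
# Abort the push if linting or tests fail.
make lint && make test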
When you run make build
, it will use the setup.py
file to create a wheel distribution. You’ll find a .whl
file in the dist/
folder, having 0.0.dev0
in the name. You can now specify an environment variable to change the version of the wheel:
export PACKAGE_VERSION='1.0.0'
make build
ls dist
Having built the wheel, you can create a new folder somewhere on your desktop, copy the wheel into it, and install it using:
mkdir test-whl && cd test-whl
pipenv shell
pip install *.whl
Print out the installed packages:
pip list
It’s also possible to add data to your package by adding the following lines to your setup.py
script:
Note: This may not work on distributed systems (such as Databricks).
if __name__ == '__main__':
    setup(
        data_files=[
            ('data', ['data/my-config.json'])
        ]
    )
You’ll then be able to read the file by using this function:
import json
import os
import sys


def get_cfg_file(filename: str, foldername: str) -> dict:
    """Get config file.

    Using 'data_files' property from setup.py script.
    """
    if not isinstance(foldername, str):
        raise ValueError('Foldername must be string.')
    if foldername[0] == '/':
        raise ValueError('Foldername must not start with \'/\'')
    if not isinstance(filename, str):
        raise ValueError('Filename must be string.')
    # Will first try to read the file from the installed location
    # (this only applies to .whl installations),
    # otherwise it will read the file directly from the source tree.
    try:
        filepath = os.path.join(sys.prefix, foldername, filename)
        with open(filepath) as f:
            return json.load(f)
    except FileNotFoundError:
        filepath = os.path.join(foldername, filename)
        with open(filepath) as f:
            return json.load(f)
If you create a wheel again, and install it in a virtual environment in a new folder, without copying the data file, you should be able to access the data by executing the function above.
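For example, assuming the my-config.json from the data_files example above:

cfg = get_cfg_file('my-config.json', 'data')
print(cfg)  # the parsed contents of data/my-config.json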
As part of our packaging process, we want to integrate changes from many contributors, and automate as much of the repetitive work required to successfully release a new version as possible.
For this example we’ll use Azure DevOps, where the following pipeline will be triggered on git tags
as well as the master
branch.
Have a look, and we’ll discuss the various stages and tasks afterwards:
resources:
- repo: self

trigger:
- master
- refs/tags/v*

variables:
  python.version: "3.7"
  project: demo
  feed: demo
  major_minor: $[format('{0:yy}.{0:MM}', pipeline.startTime)]
  counter_unique_key: $[format('{0}.demo', variables.major_minor)]
  patch: $[counter(variables.counter_unique_key, 0)]
  fallback_tag: $(major_minor).dev$(patch)

stages:
- stage: Test
  jobs:
  - job: Test
    displayName: Test
    steps:
    - task: UsePythonVersion@0
      displayName: "Use Python $(python.version)"
      inputs:
        versionSpec: "$(python.version)"
    - script: pip install pipenv && pipenv install -d --system --deploy --ignore-pipfile
      displayName: "Install dependencies"
    - script: pip install typed_ast && make lint
      displayName: Lint
    - script: pip install pathlib2 && make test
      displayName: Test
    - task: PublishTestResults@2
      displayName: "Publish Test Results junit/*"
      condition: always()
      inputs:
        testResultsFiles: "junit/*"
        testRunTitle: "Python $(python.version)"

- stage: Build
  dependsOn: Test
  jobs:
  - job: Build
    displayName: Build
    steps:
    - task: UsePythonVersion@0
      displayName: "Use Python $(python.version)"
      inputs:
        versionSpec: "$(python.version)"
    - script: "pip install wheel twine"
      displayName: "Wheel and Twine"
    - script: |
        # Get version from git tag (v1.0.0) -> (1.0.0)
        git_tag=`git describe --abbrev=0 --tags | cut -d'v' -f 2`
        echo "##vso[task.setvariable variable=git_tag]$git_tag"
      displayName: Set GIT_TAG variable if tag is pushed
      condition: contains(variables['Build.SourceBranch'], 'refs/tags/v')
    - script: |
        # Get variables that are shared across jobs
        GIT_TAG=$(git_tag)
        FALLBACK_TAG=$(fallback_tag)
        echo GIT TAG: $GIT_TAG, FALLBACK_TAG: $FALLBACK_TAG
        # Export variable so python can access it
        export PACKAGE_VERSION=${GIT_TAG:-${FALLBACK_TAG:-default}}
        echo Version used in setup.py: $PACKAGE_VERSION
        # Use PACKAGE_VERSION in setup()
        python setup.py bdist_wheel
      displayName: Build
    - task: CopyFiles@2
      displayName: Copy dist files
      inputs:
        sourceFolder: dist/
        contents: demo*.whl
        targetFolder: $(Build.ArtifactStagingDirectory)
        flattenFolders: true
    - task: PublishBuildArtifacts@1
      displayName: PublishArtifact
      inputs:
        pathtoPublish: $(Build.ArtifactStagingDirectory)
        ArtifactName: demo.whl
    - task: TwineAuthenticate@1
      inputs:
        artifactFeed: $(project)/$(feed)
    - script: |
        twine upload -r $(feed) --config-file $(PYPIRC_PATH) dist/*
      displayName: PublishFeed
In the Test
stage we install the project in the pipeline container, without creating a virtual environment. We then run the make lint
and make test
commands, just like you would on your machine.
In the Build
stage we will try to extract the package version from a git tag, and we construct a fallback package version. We run the python setup.py bdist_wheel
command to build a wheel, knowing that our package version environment variable is set. Finally, we publish the artifact to Azure DevOps artifacts, and (optionally) to our feed.
You’ll need a .pypirc file to publish your package to a feed; you can copy its contents after creating a feed in Azure DevOps. It looks something like this:
[distutils]
Index-servers =
    stefanschenk

[stefanschenk]
Repository = https://pkgs.dev.azure.com/stefanschenk/_packaging/stefanschenk/pypi/upload
For instructions on how to install packages from a private feed, have a look at this post.
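As a quick sketch (the simple-index URL mirrors the upload URL from the .pypirc above; authentication, for example via a personal access token or the artifacts-keyring package, is assumed to be configured):

pip install demo --index-url https://pkgs.dev.azure.com/stefanschenk/_packaging/stefanschenk/pypi/simple/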