Python package versions seem simple on the surface because we're used to seeing the common and well-defined version schemes like SemVer (1.2.3) or CalVer (2023.6.1) and occasionally a pre-release suffix (1.2.3pre1). But Python package versioning can get much more complicated than that! 🤯
Below is a list of quirks in Python package versioning you may not know about:
This article should serve as a list of reasons to not handle Python versions as simple strings due to their complexity. Instead use the official packaging.version.Version class to parse, compare, and reason about Python versions.
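To make that advice concrete, here's a minimal sketch comparing naive string comparison with Version objects. It assumes the third-party packaging library is installed (python -m pip install packaging); the variable names are only for illustration.

```python
from packaging.version import Version

# Plain string comparison goes character by character, so "1.10.0"
# is considered smaller than "1.9.0" because "1" < "9".
string_says_older = "1.10.0" < "1.9.0"

# Version objects compare release segments as integers instead.
version_says_older = Version("1.10.0") < Version("1.9.0")

# Different spellings of the same version compare equal once parsed.
same_version = Version("v1.0") == Version("1.0.0")

print(string_says_older, version_says_older, same_version)
```

The string comparison claims 1.10.0 is older than 1.9.0, while the parsed comparison does not.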
This nerdsnipe blog post idea was given to me by William Woodruff at PyCon US 2023. Thanks Will!
PEP 440, which superseded PEP 386, is the current standard for Python package versions. At the time PEP 440 came into effect, it was capable of representing 97% of existing package versions.
Python package versions are made of up to 6 types of "version parts". There is a "Release" part which is required and 3 common optional parts named "Pre-releases", "Post-releases", and "Dev". The remaining 2 optional and less frequently encountered parts are the "Epoch" and "Local" version parts. Below is a graphic showing all the types of version part, how they are ordered, and what the delimiters are:
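As a rough illustration of the six parts, the sketch below parses a made-up version string that uses every part and reads each one back as an attribute. It assumes the packaging library is installed; the version string itself is invented for the example and doesn't belong to any real project.

```python
from packaging.version import Version

# A made-up version exercising all six parts:
# epoch ! release, pre-release, post-release, dev release, + local
v = Version("1!2.3.4rc1.post5.dev6+local.7")

parts = {
    "epoch": v.epoch,      # integer before the "!"
    "release": v.release,  # tuple of integers
    "pre": v.pre,          # (label, number) tuple
    "post": v.post,        # integer
    "dev": v.dev,          # integer
    "local": v.local,      # string after the "+"
}

for name, value in parts.items():
    print(f"{name:8} {value!r}")
```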
Any version number can start with an optional v prefix. This character is stripped off during normalization and doesn't carry any further data about the version.
__version__ = "v1.0" # Normalizes to '1.0'
Epoch versions give a method for packages to change their versioning scheme without the burden of the past versioning scheme impacting which release will be considered "latest". For example, if your project used a large number for the first release version segment it can be clunky to then use an even larger number in order to "break free" from that previous versioning scheme.
From PEP 440:
In particular, supporting version epochs allows a project that was previously using date based versioning to switch to semantic versioning by specifying a new version epoch.
Here's an example of that in action. Say a project used CalVer but wanted to switch to SemVer; this is what their version scheme might look like:
# CalVer...
"2023.7.23"
"2023.8.4"
# Switch to SemVer!
"1!2.0.0"
"1!2.1.0"
"1!2.1.1"
# and so on...
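Here's a small sketch of how those versions would sort, assuming the packaging library is installed. The release list is just the hypothetical project from the example above, shuffled.

```python
from packaging.version import Version

# The hypothetical project's releases, deliberately out of order.
releases = ["1!2.1.0", "2023.8.4", "1!2.0.0", "2023.7.23", "1!2.1.1"]

# The epoch is compared before the release segment, so every
# "1!" version sorts after every implicit-epoch-0 CalVer version.
ordered = sorted(releases, key=Version)
latest = max(releases, key=Version)

print(ordered)
print("latest:", latest)
```

Without the epoch, 2.1.1 would sort far below 2023.8.4 and never be considered the latest release.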
Local versions are arbitrary identifiers consisting of ASCII alphanumerics and periods. They come at the very end of the version, separated from everything before them by a +. Local versions don't have any pre-defined semantics but are typically used to differentiate upstream releases from potentially altered rebuilds by downstream integrators.
Local version "segments" can be delimited by a period (.), hyphen (-), or underscore (_), but all of those characters will normalize to a period (.):
__version__ = "1.0.0+ubuntu-1" # Normalizes to '1.0.0+ubuntu.1'
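A quick sketch of what that looks like once parsed, assuming the packaging library is installed. The public and local attributes split the version at the +.

```python
from packaging.version import Version

v = Version("1.0.0+ubuntu-1")

normalized = str(v)      # hyphen in the local part becomes a period
public_part = v.public   # everything before the "+"
local_part = v.local     # everything after the "+", or None if absent

# A local version sorts after the same version without a local part.
sorts_after_upstream = v > Version("1.0.0")

print(normalized, public_part, local_part, sorts_after_upstream)
```

Checking whether v.local is None is one way to detect a local version before trying to publish anything.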
Local versions also can't be uploaded to PyPI; the service will reject them:
$ python -m twine upload dist/*
Uploading distributions to https://upload.pypi.org/legacy/
...
WARNING Error during upload. Retry with the --verbose option for more details.
ERROR HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/
'1.0.0+local' is an invalid value for Version. Error: Can't use PEP 440 local
versions. See https://packaging.python.org/specifications/core-metadata for more
information.
This behavior is called out explicitly in the PEP:
As the Python Package Index is intended solely for indexing and hosting upstream projects, it MUST NOT allow the use of local version identifiers.
PEP 440 doesn't define a maximum length for a version number, meaning that they can be arbitrarily long according to the standard. There's a package that uses the first 217 digits of pi as its version number.
Version numbers always normalize to lowercase, meaning that you can "yell" all you want in your version numbers:
__version__ = "V1.0.0-RC0" # Normalizes to '1.0.0rc0'
The pre, post, and dev version parts aren't exclusive, meaning you can combine all three into one version:
# All the suffixes!
__version__ = "1.0.0-pre0-post0-dev0"
The pre, post, and dev version parts all have an optional delimiter (-, _, or .) meaning that each of the parts can be combined without any characters separating them. This means you can have a version with all three that looks quite strange:
# Who needs delimiters?
__version__ = "0previewpostdev" # Normalizes to '0rc0.post0.dev0'
This also means that "rc" can be typo-ed as "cr" into a valid version identifier that means something completely different due to c being valid for pre-releases and r for post-releases:
# An actual release candidate:
__version__ = "1.0.0rc1"
# A typo-ed release candidate:
__version__ = "1.0.0cr1" # Normalizes to '1.0.0rc0.post1'
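To see how differently the two spellings are interpreted, here's a minimal sketch that parses both, assuming the packaging library is installed:

```python
from packaging.version import Version

rc = Version("1.0.0rc1")    # release candidate 1
typo = Version("1.0.0cr1")  # "c" = release candidate 0, "r1" = post-release 1

typo_normalized = str(typo)
typo_parts = (typo.pre, typo.post)
rc_parts = (rc.pre, rc.post)

# The typo is both a pre-release and a post-release at the same time,
# and it sorts *before* the real rc1 because rc0 < rc1.
typo_is_both = typo.is_prerelease and typo.is_postrelease
typo_sorts_first = typo < rc

print(typo_normalized, typo_parts, rc_parts, typo_is_both, typo_sorts_first)
```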
PEP 386 preferred the c prefix for release candidates. PEP 440, which obsoletes PEP 386, changed that preference to rc.
The alpha, beta, and release candidate suffixes are normalized without a delimiter, but post and dev release suffixes are normalized with a delimiter:
__version__ = "1.0.0-alpha0" # --> 1.0.0a0
__version__ = "1.0.0-beta" # --> 1.0.0b0
__version__ = "1.0.0-rc0" # --> 1.0.0rc0
__version__ = "1.0.0post0" # --> 1.0.0.post0
__version__ = "1.0.0dev0" # --> 1.0.0.dev0
Having a hyphen delimiter (-) and a number after any release version is equivalent to a post-release:
__version__ = "0-0" # Normalizes to '0.post0'
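The normalizations from the last few sections can be checked in one go. This is a sketch assuming the packaging library is installed; the dictionary maps each odd spelling from this article to the normalized form I'd expect from str(Version(...)).

```python
from packaging.version import Version

# Raw spelling -> expected normalized form.
expected = {
    "V1.0.0-RC0": "1.0.0rc0",
    "1.0.0-pre0-post0-dev0": "1.0.0rc0.post0.dev0",
    "0previewpostdev": "0rc0.post0.dev0",
    "1.0.0-alpha0": "1.0.0a0",
    "1.0.0-beta": "1.0.0b0",
    "1.0.0post0": "1.0.0.post0",
    "1.0.0dev0": "1.0.0.dev0",
    "0-0": "0.post0",
}

normalized = {raw: str(Version(raw)) for raw in expected}

for raw, result in normalized.items():
    print(f"{raw!r:26} -> {result!r}")
```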
The versions 1.0 and 1.0.0 are not considered distinct release numbers. Release segment comparison always pads missing release segments with zeroes, so the two compare as equal.
This also means you can add a ton of zeroes after a version and pip handles it totally fine:
# Add a bunch of zeroes after the requested version...
$ python -m pip install urllib3==2.0.4.0.0.0.0.0.0
# ...resolves to '2.0.4':
Downloading urllib3-2.0.4-py3-none-any.whl (123 kB)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ 123.9/123.9 kB 1.4 MB/s eta 0:00:00
Installing collected packages: urllib3
Successfully installed urllib3-2.0.4
This also applies to epoch versions! Every version has a default epoch version of 0:
# Add a leading 0 epoch version.
$ python -m pip install 'urllib3==0!2.0.4'
# ...resolves to '2.0.4':
Downloading urllib3-2.0.4-py3-none-any.whl (123 kB)
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ 123.9/123.9 kB 1.4 MB/s eta 0:00:00
Installing collected packages: urllib3
Successfully installed urllib3-2.0.4
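The same equivalences can be checked without pip. A minimal sketch, assuming the packaging library is installed:

```python
from packaging.version import Version

base = Version("2.0.4")

# Trailing zero release segments are ignored when comparing...
padded_equal = Version("2.0.4.0.0.0.0.0.0") == base

# ...and so is an explicit epoch of 0.
epoch_equal = Version("0!2.0.4") == base

# Careful: the string form keeps the extra zeroes, so comparing
# str(Version(...)) values is not the same as comparing Versions.
padded_str = str(Version("2.0.4.0.0"))
epoch_str = str(Version("0!2.0.4"))

print(padded_equal, epoch_equal, padded_str, epoch_str)
```

This is one more reason to compare Version objects rather than their string forms.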
All integers are interpreted via the int() builtin, meaning leading zeroes on an integer are normalized away without error:
__version__ = "01.001.0000" # Normalizes to '1.1.0'
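The effect is easy to reproduce with nothing but the standard library. This toy sketch only handles a plain dotted release segment and is not a substitute for a real version parser:

```python
raw_release = "01.001.0000"

# int() silently drops leading zeroes from each segment.
segments = tuple(int(part) for part in raw_release.split("."))
normalized = ".".join(str(number) for number in segments)

print(segments, normalized)
```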
PEP 440 only came into effect in 2014; there was a lot of Python happening before that! Retroactive application of PEP 440 rules for versions can cause some ambiguous parsing situations for source distributions.
William Woodruff wrote an awesome article describing this issue.
✨ Bonus quirk! ✨ Wheel build numbers are a feature that's very similar to a version number and are only applicable to wheel distributions. They act as a tie-breaker when two wheels have identical name, version, platform, Python, and ABI tags.
From PEP 427, build numbers come after the name and version in a wheel filename:
{name}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
This means that you can have two distributions that have the same name, version, and tags but still not be identical due to a wheel build number. One more reason it's important to pin hashes in addition to versions in lock-files!
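Here's a rough standard-library sketch of pulling the build tag out of a wheel filename using the format above. The split_wheel_filename helper is hypothetical, it assumes every component has already had its hyphens escaped, and the second filename is an invented rebuild rather than a real upload.

```python
def split_wheel_filename(filename):
    """Split a wheel filename into its components (illustrative only)."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")

    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        # No build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")

    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "tags": (python_tag, abi_tag, platform_tag),
    }


plain = split_wheel_filename("urllib3-2.0.4-py3-none-any.whl")
rebuilt = split_wheel_filename("urllib3-2.0.4-1-py3-none-any.whl")

# Same name, version, and tags, but still two different files.
same_except_build = all(plain[key] == rebuilt[key] for key in ("name", "version", "tags"))
print(plain["build"], rebuilt["build"], same_except_build)
```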
Combining this quirk with the hyphen post-release shorthand described earlier means that with a dist name of 0 you can have a wheel file named 0-0-0-0-py3-none-any.whl
0-0-0-0-py3-none-any.whl
^ ^ ^ ^
| | | |
| | | +- build number
| | +- post-release
| +- version
+- name
Thanks for reading!
- Seth