
Quirks of Python package versioning

Published 2023-08-03 by Seth Larson
Reading time: 6 minutes

Python package versions seem simple on the surface because we're used to seeing the common and well-defined version schemes like SemVer (1.2.3) or CalVer (2023.6.1) and occasionally a pre-release suffix (1.2.3pre1). But Python package versioning can get much more complicated than that! 🤯

The sections below walk through quirks in Python package versioning you may not know about.

This article should serve as a list of reasons to not handle Python versions as simple strings due to their complexity. Instead use the official packaging.version.Version class to parse, compare, and reason about Python versions.
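As a rough illustration of why plain strings fall short, the sketch below compares two versions both ways. It assumes the third-party packaging distribution is installed (python -m pip install packaging), and the two version strings are made up for the example.

```python
from packaging.version import Version

older, newer = "1.9.0", "1.10.0"

# Plain strings compare character by character, so "1.1..." sorts before "1.9...".
string_says_newer_is_greater = newer > older

# Version objects compare release segments as integers, so 10 > 9.
version_says_newer_is_greater = Version(newer) > Version(older)

print(string_says_newer_is_greater)   # False
print(version_says_newer_is_greater)  # True
```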

This nerdsnipe blog post idea was given to me by William Woodruff at PyCon US 2023. Thanks Will!

What is a Python package version?

PEP 440 is the current standard for Python package versions which superseded PEP 386. At the time PEP 440 came into effect, it was capable of representing 97% of existing package versions.

Python package versions are made of up to 6 types of "version parts". There is a "Release" part which is required and 3 common optional parts named "Pre-releases", "Post-releases", and "Dev". The remaining 2 optional and less frequently encountered parts are the "Epoch" and "Local" version parts. Below is a graphic showing all the types of version part, how they are ordered, and what the delimiters are:

[Figure: the six version parts in order, with their delimiters. An optional "v" prefix comes first, then: Epoch (optional, followed by "!"), Release (required), Pre-Release (optional, preceded by an optional "-", "." or "_"), Post-Release (optional, same delimiters), Dev (optional, same delimiters), Local (optional, preceded by "+").]
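To make the six parts concrete, here is a small sketch that parses one made-up version using every part and reads each one back. It assumes the packaging distribution is installed; the version string itself is invented for illustration.

```python
from packaging.version import Version

# A made-up version that uses all six parts at once.
v = Version("1!2.3.4rc5.post6.dev7+local.8")

print(v.epoch)    # 1
print(v.release)  # (2, 3, 4)
print(v.pre)      # ('rc', 5)
print(v.post)     # 6
print(v.dev)      # 7
print(v.local)    # local.8
```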

v Prefixes

Any version number can start with an optional v prefix. This character is stripped off during normalization and doesn't carry any further data about the version.

__version__ = "v1.0"  # Normalizes to '1.0'

Epoch versions

Epoch versions give a method for packages to change their versioning scheme without the burden of the past versioning scheme impacting which release will be considered "latest". For example, if your project used a large number for the first release version segment it can be clunky to then use an even larger number in order to "break free" from that previous versioning scheme.

From PEP 440:

In particular, supporting version epochs allows a project that was previously using date based versioning to switch to semantic versioning by specifying a new version epoch.

Here's an example of that in action. Say a project used CalVer but wanted to switch to SemVer; this is what their version scheme might look like:

# CalVer...
"2023.7.23"
"2023.8.4"

# Switch to SemVer!
"1!2.0.0"
"1!2.1.0"
"1!2.1.1"

# and so on...
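A quick way to see the epoch doing its job is to sort the versions above. This is a minimal sketch that assumes the packaging distribution is installed; it reuses the example versions from this section.

```python
from packaging.version import Version

releases = ["1!2.1.0", "2023.8.4", "1!2.0.0", "2023.7.23"]

# Sorting with Version as the key puts every epoch-1 release after the CalVer ones.
ordered = sorted(releases, key=Version)
print(ordered)  # ['2023.7.23', '2023.8.4', '1!2.0.0', '1!2.1.0']

# Without the epoch, the SemVer release would look older than the CalVer releases.
semver_looks_older = Version("2.0.0") < Version("2023.7.23")
print(semver_looks_older)  # True
```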

Local versions

Local versions are arbitrary identifiers made up of alphanumerics and periods that come at the very end of a version, after a + delimiter. Local versions don't have any pre-defined semantics but are typically used to differentiate upstream releases from potentially altered rebuilds by downstream integrators.

Local version "segments" can be delimited by ., -, or _ but all of those characters will normalize to .:

__version__ = "1.0.0+ubuntu-1"  # Normalizes to '1.0.0+ubuntu.1'

Local versions also can't be uploaded to PyPI; the service will reject them:

$ python -m twine upload dist/* 

Uploading distributions to https://upload.pypi.org/legacy/
...
WARNING  Error during upload. Retry with the --verbose option for more details.                      
ERROR    HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/                             
         '1.0.0+local' is an invalid value for Version. Error: Can't use PEP 440 local    
         versions. See https://packaging.python.org/specifications/core-metadata for more            
         information.

This behavior is called out explicitly in the PEP:

As the Python Package Index is intended solely for indexing and hosting upstream projects, it MUST NOT allow the use of local version identifiers.
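If you want to catch a local version before an upload attempt, something like the sketch below may help. It assumes the packaging distribution is installed, and has_local_part is a hypothetical helper name. It is only a local pre-check, not a reproduction of PyPI's own validation.

```python
from packaging.version import Version


def has_local_part(version_string):
    """Hypothetical helper: True if the version carries a local segment."""
    return Version(version_string).local is not None


v = Version("1.0.0+ubuntu-1")
print(v.local)   # ubuntu.1
print(v.public)  # 1.0.0

print(has_local_part("1.0.0+local"))  # True
print(has_local_part("1.0.0"))        # False
```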

Looooong versions

PEP 440 doesn't define a maximum length for a version number, meaning that they can be arbitrarily long according to the standard. There's a package that uses the first 217 digits of pi as its version number.

Case-insensitivity

Version numbers always normalize to lowercase, meaning that you can "yell" all you want in your version numbers:

__version__ = "V1.0.0-RC0"  # Normalizes to '1.0.0rc0'

Pre-post-dev releases

The pre, post, and dev version parts aren't exclusive, meaning you can combine all three into one version:

# All the suffixes!
__version__ = "1.0.0-pre0-post0-dev0"
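Here is a sketch of how that combined version parses and where it lands relative to its neighbours. It assumes the packaging distribution is installed; the other versions in the list are made up for comparison.

```python
from packaging.version import Version

v = Version("1.0.0-pre0-post0-dev0")
print(str(v))                # 1.0.0rc0.post0.dev0
print(v.pre, v.post, v.dev)  # ('rc', 0) 0 0

# A dev release of a post release of a release candidate still sorts
# before the post release itself and before the final release.
candidates = ["1.0.0", "1.0.0rc0.post0", "1.0.0rc0", "1.0.0rc0.post0.dev0"]
ordered = sorted(candidates, key=Version)
print(ordered)
# ['1.0.0rc0', '1.0.0rc0.post0.dev0', '1.0.0rc0.post0', '1.0.0']
```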

No delimiters needed

The pre, post, and dev version parts all have an optional delimiter (-, _, or .) meaning that each of the parts can be combined without any characters separating them. This means you can have a version with all three that looks quite strange:

# Who needs delimiters?
__version__ = "0previewpostdev"  # Normalizes to 0rc0.post0.dev0

This also means that "rc" can be typo-ed as "cr" into a valid version identifier that means something completely different due to c being valid for pre-releases and r for post-releases:

# An actual release candidate:
__version__ = "1.0.0rc1"

# A typo-ed release candidate:
__version__ = "1.0.0cr1"  # Normalizes to '1.0.0rc0.post1'

PEP 386 preferred the c prefix for release candidates. PEP 440, which obsoletes PEP 386, changed that preference to rc.
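The difference between the two spellings is easy to inspect. The sketch below assumes the packaging distribution is installed and simply prints how each string is broken into parts.

```python
from packaging.version import Version

intended = Version("1.0.0rc1")
typo = Version("1.0.0cr1")

print(intended.pre, intended.post)  # ('rc', 1) None
print(typo.pre, typo.post)          # ('rc', 0) 1
print(str(typo))                    # 1.0.0rc0.post1

# The typo parses as a post release of rc0, so it sorts before the real rc1.
print(typo < intended)              # True
```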

Delimiter normalization

The alpha, beta, and release candidate suffixes are normalized without a delimiter, but post and dev release suffixes are normalized with a delimiter:

__version__ = "1.0.0-alpha0"  # --> 1.0.0a0
__version__ = "1.0.0-beta"    # --> 1.0.0b0
__version__ = "1.0.0-rc0"     # --> 1.0.0rc0

__version__ = "1.0.0post0"    # --> 1.0.0.post0
__version__ = "1.0.0dev0"     # --> 1.0.0.dev0
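One way to check these rules is to run a few spellings through the parser and compare against the expected normal form. This is a sketch that assumes the packaging distribution is installed; the input spellings mix delimiters on purpose and are made up for the example.

```python
from packaging.version import Version

# Raw spelling -> normal form expected from the rules above.
spellings = {
    "1.0.0-alpha0": "1.0.0a0",
    "1.0.0_beta": "1.0.0b0",
    "1.0.0.rc0": "1.0.0rc0",
    "1.0.0post0": "1.0.0.post0",
    "1.0.0-dev0": "1.0.0.dev0",
}

# str() of a parsed Version gives the normalized spelling.
normalized = {raw: str(Version(raw)) for raw in spellings}

for raw, result in normalized.items():
    print(f"{raw} -> {result}")
```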

Implicit post releases

Having a hyphen delimiter (-) and a number after any release version is equivalent to a post-release:

__version__ = "0-0"  # Normalizes to '0.post0'
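A short sketch of the implicit post-release, assuming the packaging distribution is installed:

```python
from packaging.version import Version

v = Version("0-0")

print(v.release)         # (0,)
print(v.post)            # 0
print(v.is_postrelease)  # True

# A post release sorts after the release it is attached to.
print(v > Version("0"))  # True
```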

Implicit zeroes

The versions 1.0 and 1.0.0 are not considered distinct release numbers. Release segment comparison always pads missing release segments with zeroes, so the two compare as identical values.

This also means you can add a ton of zeroes after a version and pip handles it totally fine:

# Add a bunch of zeroes after the requested version...
$ python -m pip install urllib3==2.0.4.0.0.0.0.0.0

# ...resolves to '2.0.4':
  Downloading urllib3-2.0.4-py3-none-any.whl (123 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 123.9/123.9 kB 1.4 MB/s eta 0:00:00
Installing collected packages: urllib3
Successfully installed urllib3-2.0.4

This also applies to epoch versions! Every version has a default epoch version of 0:

# Add a leading 0 epoch version.
$ python -m pip install 'urllib3==0!2.0.4'

# ...resolves to '2.0.4':
  Downloading urllib3-2.0.4-py3-none-any.whl (123 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 123.9/123.9 kB 1.4 MB/s eta 0:00:00
Installing collected packages: urllib3
Successfully installed urllib3-2.0.4
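The same behavior can be seen without pip. The sketch below assumes the packaging distribution is installed. Note that equality ignores the padding, but the string form of a parsed version keeps whatever trailing zeroes it was given.

```python
from packaging.version import Version

short = Version("2.0.4")
padded = Version("2.0.4.0.0.0.0.0.0")
with_epoch = Version("0!2.0.4")

print(short == padded)       # True
print(len({short, padded}))  # 1, equal versions also hash the same
print(str(padded))           # 2.0.4.0.0.0.0.0.0

print(with_epoch == short)   # True
print(str(with_epoch))       # 2.0.4, an epoch of 0 is left out
```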

Zero-prefixed numerical normalization

All integers are interpreted via the int() builtin meaning they will normalize away prefixed zeroes on an integer without error:

__version__ = "01.001.0000"  # Normalizes to '1.1.0'

Ambiguous parsing of pre-PEP 440 source distribution versions

PEP 440 only came into effect in 2014, and there was a lot of Python happening before that! Retroactive application of PEP 440 rules for versions can cause some ambiguous parsing situations for source distributions.

William Woodruff wrote an awesome article describing this issue.

Wheel build numbers

✨ Bonus quirk! ✨ Wheel build numbers are a feature that's very similar to a version number and are only applicable to wheel distributions. They act as a tie-breaker when two wheels have identical name, version, platform, Python, and ABI tags.

From PEP 427, build numbers come after the name and version in a wheel filename:

{name}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl

This means that you can have two distributions that have the same name, version, and tags but still not be identical due to a wheel build number. One more reason it's important to pin hashes in addition to versions in lock-files!
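To show where the build number sits, here is a standard-library-only sketch that splits a wheel filename into the fields from the format above. split_wheel_filename is a hypothetical helper, the second filename is invented, and the code assumes each field has already been escaped so that it contains no - of its own.

```python
def split_wheel_filename(filename):
    """Hypothetical helper: split a wheel filename into its PEP 427 fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


plain = split_wheel_filename("urllib3-2.0.4-py3-none-any.whl")
rebuilt = split_wheel_filename("urllib3-2.0.4-1-py3-none-any.whl")

print(plain["build_tag"])    # None
print(rebuilt["build_tag"])  # 1
```

Both filenames share a name, version, and tags, and only the build tag tells them apart.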

Combining this quirk with the implicit post-release quirk above means that with a dist name of 0 you can have a wheel file named 0-0-0-0-py3-none-any.whl

0-0-0-0-py3-none-any.whl
^ ^ ^ ^
| | | |
| | | +- build number
| | +- post-release
| +- version 
+- name



This work is licensed under CC BY-SA 4.0