Security Developer-in-Residence Weekly Report #24

Published 2024-01-09 by Seth Larson

This critical role would not be possible without funding from the OpenSSF Alpha-Omega project. Massive thank-you to Alpha-Omega for investing in the security of the Python ecosystem!

Welcome to the first weekly report of 2024!

Software Bill-of-Materials for CPython

Continuing from 2023, there will be a focus on Software Bill-of-Materials (SBOMs) for CPython and on incremental improvements to the CPython release process as more of it is automated.

I made a suggestion to release managers to backport SBOM tooling in the CPython repository to all supported release streams in an effort to treat SBOMs more like an additional artifact instead of a new feature of CPython. This would mean SBOMs would be available for previous CPython releases and we won't have to wait until the 3.13.0 stable release in October to make them available for consumption.

Trusted Publisher provenance on PyPI

Last week William Woodruff published a pre-PEP discussion for using Trusted Publisher configurations to bootstrap publish provenance on PyPI. I was involved in reviewing the initial draft, so I'm excited to see the discussion! Some things to highlight that came from the discussion:

Looking forward to helping however I can with this project once it is proposed as a PEP!

Build reproducibility of macOS artifacts

Previously I worked on build reproducibility for CPython's source artifacts, which are both tarballs. I want to extend build reproducibility to every artifact that CPython provides, including the Windows and macOS binary installers.

It turns out that macOS package files (.pkg) use the eXtensible ARchive (XAR) format internally. This format isn't supported by diffoscope, the tool I've been using to verify reproducibility. I put together a quick bit of functionality to diff .pkg files, which appears to work nicely, and have submitted it upstream to the diffoscope project.
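To give a feel for what sits inside a .pkg file, here is a minimal sketch of reading the XAR table of contents. It is not the code submitted to diffoscope. It assumes the commonly documented XAR layout: a big-endian header starting with the magic bytes `xar!`, followed by a zlib-compressed XML table of contents. The function name `read_xar_toc` is hypothetical.

```python
import struct
import zlib

XAR_MAGIC = b"xar!"
# Assumed header layout (big-endian): magic, header size, version,
# compressed TOC length, uncompressed TOC length, checksum algorithm.
HEADER_FORMAT = ">4sHHQQI"


def read_xar_toc(data: bytes) -> str:
    """Return the XML table of contents of a XAR archive held in memory."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("data is too short to contain a XAR header")
    magic, header_size, _version, toc_compressed, toc_uncompressed, _cksum = (
        struct.unpack_from(HEADER_FORMAT, data, 0)
    )
    if magic != XAR_MAGIC:
        raise ValueError("not a XAR archive")
    # The compressed TOC starts right after the header.
    raw_toc = data[header_size:header_size + toc_compressed]
    toc = zlib.decompress(raw_toc)
    if len(toc) != toc_uncompressed:
        raise ValueError("table of contents length does not match the header")
    return toc.decode("utf-8")
```

Diffing the tables of contents of two builds is one cheap way to spot differing file lists, timestamps, or checksums before comparing the payloads themselves.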

Next steps for reproducibility would be to apply diffoscope inside an automated macOS build process to shake out any sources of non-determinism and address them.
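As a rough illustration of the first check in such a process, the sketch below compares the SHA-256 digests of two independently built artifacts. The helper names `sha256_of` and `builds_match` are hypothetical, and this is only one possible shape for the check, not the actual CPython release tooling. When the digests differ, a tool such as diffoscope can then be pointed at the two files to find out why.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Hash a file in chunks so large installers don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first_path: str, second_path: str) -> bool:
    """Return True when two build outputs are byte-for-byte identical."""
    return sha256_of(first_path) == sha256_of(second_path)
```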

Software identifiers

I listened to the Open Source Security Podcast (which I recommend), where Josh Bressers and Kurt Seifried discussed software identifiers as they relate to vulnerabilities and Software Bill-of-Materials, and more specifically CISA's RFI on software identifiers and OpenSSF's response.

The CPE system could work if it were open for others to collaborate on. Currently, it is mostly a closed system. Anecdotally, I've also found that CPEs tend to work much better than Package URLs for returning CVE matches today, even though OSV works with Package URLs natively. I suspect tooling will improve in this area as time goes on.

Package URLs (PURLs) are distributed, namespaced, and intrinsic (easily discoverable). The downside is that two completely different Package URLs may reference the same software retrieved by different methods (which may be relevant!). A Package URL also sometimes ties software identity to its source code platform, which can change (see CPython moving to GitHub).

Package URLs being namespaced also means that they can carve out namespaces that are governed by different standards. For example, the pkg:pypi/... namespace is governed by PEPs for names and versions, whereas pkg:npm/ is governed by different standards. I think this ability will be critical for software identifiers to model different ecosystems: ecosystems won't converge on one set of standards, so identifying software requires being able to model each of them properly.
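The namespacing described above can be illustrated with a small sketch. It is a deliberately simplified parser, not a complete implementation of the Package URL specification: it ignores qualifiers, subpaths, and percent-encoding. The `SimplePurl` type and `parse_purl` function are hypothetical names, and applying PEP 503-style name normalization to the pypi type is an assumption made here to show how one namespace can carry its own rules.

```python
import re
from typing import NamedTuple, Optional


class SimplePurl(NamedTuple):
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> SimplePurl:
    """Parse the type, namespace, name, and version of a Package URL."""
    scheme, sep, rest = purl.partition(":")
    if scheme != "pkg" or not sep:
        raise ValueError("a Package URL must start with 'pkg:'")
    # Drop the subpath and qualifiers, which this sketch does not model.
    rest = rest.split("#", 1)[0].split("?", 1)[0].strip("/")
    path, _, version = rest.partition("@")
    parts = path.split("/")
    if len(parts) < 2 or not parts[-1]:
        raise ValueError("a Package URL needs at least a type and a name")
    purl_type = parts[0].lower()
    namespace = "/".join(parts[1:-1]) or None
    name = parts[-1]
    if purl_type == "pypi":
        # Assumption: apply PEP 503-style normalization to PyPI names.
        name = re.sub(r"[-_.]+", "-", name).lower()
    return SimplePurl(purl_type, namespace, name, version or None)


# Two different identifiers that could plausibly point at the same software,
# retrieved from different places (illustrative values only).
from_pypi = parse_purl("pkg:pypi/Typing_Extensions@4.9.0")
from_github = parse_purl("pkg:github/python/typing_extensions@4.9.0")
```

Here `from_pypi` and `from_github` have different types and names even though a human would recognize them as related, which is the matching problem described above.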

Other items

That's all for this week! 👋 If you're interested in more you can read next week's report or last week's report.

This work is licensed under CC BY-SA 4.0