> Why not include the public key in the package? Because PyPI (or an attacker) c...

zimmerfrei · on May 23, 2023

>> but I don't think you actually want this: lots of large packages have multiple release managers (and contributors who come and go); you don't want to manually resolve each new human identity that appears for a package distribution.

Nope, you assume wrong. That's exactly what I (also) want, that is, knowing that the *authors* remained the same, whoever they are.

>> What most people actually want is a strong cryptographic attestation that the package distribution came from the same source as the thing hosting the source code

Nope, nobody really needs more of that, since that's what's your HTTPS certificate is for.

People *really* want to mitigate the risk of pypi infrastructure getting fully compromised, which is very likely, given how many eggs you keep in the same basket there.

PGP signatures were the last ditch: not very convenient, but also not as bad as they are painted. From now on there will not be even that little.

woodruffw · on May 23, 2023

> Nope, you assume wrong. That's exactly what I (also) want, that is, knowing that the authors remained the same, whoever they are.

The point is that they don't remain the same. Assuming that they do is an operational error.

> Nope, nobody really needs more of that, since that's what's your HTTPS certificate is for.

HTTPS provides transport security, i.e. an authenticity relationship between you and GitHub's servers. It doesn't provide artifact authenticity for the source on that server, and cannot. That's what the comment above is referring to.

zimmerfrei · on May 23, 2023

> The point is that they don't remain the same. Assuming that they do is an operational error.

How many projects are signing each release with a different PGP key each time? And what are the odds that such projects will actually correct their practices as soon as pip implements key verification and makes the problems more visible? A lot I guess?

It seems a lot of assumptions are being made...

But it is a self-fulfilling prophecy: the more you hide, hamper, and cripple the signature metadata, the more people will misuse it (without knowing it), which leads to these articles that argue for more crippling because people are misusing it.

The elephant in the room remains that pypi is a big target, and even though I highly appreciate the work done by maintainers (mostly volunteers?) I have a hard time believing they will always be able to keep skilled attackers away from its infra.

donaldstufft · on May 23, 2023

Nobody at PyPI is opposed to package signing, and removing or minimizing the damage that compromised infrastructure can do.

However, GPG is not a good tool to build those features on top of, and the vestigial support for GPG signing that PyPI had in no way aided the long term efforts to get proper, secure package signing into PyPI.

zzzeek · on May 23, 2023

maybe your blog post can use a little extra line at the end that says, one of these three things:

1. "Nobody at Pypi is opposed to package signing, so long term here is the technology we want to use for this: XYZ..."

2. "Nobody at Pypi is opposed to package signing, however after years of discussion there seem to be no feasible ways of doing this, so going forward there are no plans to actually add package signing" (refer to @tptacek's post at https://news.ycombinator.com/item?id=36048373 which seems to claim there are many, IIRC)

3. "Nobody at Pypi is opposed to package signing, however we simply don't have the resources to implement any new approaches. We would require a grant of $X million dollars to hire people to do this (which would be using technology XYZ)"

is there a choice 4?

donaldstufft · on May 23, 2023

The current expectation is it will be a combination of sigstore and TUF, but if someone proposes something better then we're open to that.

Implementing those things takes time though.

zimmerfrei · on May 24, 2023

And why exactly should pypi implement it?

Pypi should just be the organized repository of packages, with only some limited assurance over their authenticity. That is, pypi should just let authors upload signatures and metadata.

Something else, something to plug into pip (taking into account also the bootstrap problem), should be responsible for validating the signatures and providing assurance over identities.

People that just want TOFU will use that one plugin. People that trust Microsoft will use the github plugin. People that trust pypi also for identity, will use maybe what you will write when you have time for it. Maybe a popular plugin will allow people to choose many IdPs. Over time, the community will converge and that may be adopted as de facto standard.

But your choice of removing PGP signatures (as one type of signature) is now making that impossible, and you intend to decide in the future for others what the only blessed verification mechanism is (also, with no indication of when that will happen).

donaldstufft · on May 24, 2023

Well, PGP signatures have been part of PyPI for 18 years now; if someone was going to build a secure system on top of that, they would have by now.

PyPI should implement it though, because fundamentally the question of who is authorized to release for "requests" on PyPI is a question of who PyPI authorizes to release for that.

oxfordmale · on May 23, 2023

The problem is that package signing is removed without providing an alternative. I am not volunteering to this project, so will quietly sit in the corner.

Taywee · on May 23, 2023

> That's exactly what I (also) want, that is, knowing that the authors remained the same, whoever they are

The authors are often many people. You can have one person signing on behalf of all the others. PGP isn't going to tell you that the authors remained the same, only that the signer did (or that many people have access to the same private key and hopefully every one of them is completely trustworthy).

PGP doesn't let you verify that the authors remained the same. Only the key. If you wanted to actually verify authors, you'd have to have all of them sign their own commits, and you'd have to validate every commit, not just the release, otherwise you're just back to trusting whoever holds the key. Many projects very regularly get new committers, too, so you'd have to validate many new signatures with every single update.

> Nope, nobody really needs more of that, since that's what's your HTTPS certificate is for.

No it's not. Your HTTPS certificate will not tell you "this PyPi package release is actually built and uploaded by the same person who controls the GitHub repository linked on the package page". PyPi hosts distributions. It frequently has source distributions, but it doesn't necessarily host "source code", which would usually mean the source repository. Even with that, it's Transport Layer Security, or a Secure Socket Layer. It does not authenticate anything other than the Socket/Transport itself.

I'm fine with PGP, but most people don't really know how to use it. They add a key and think they're safe when it validates, but that only protects you if you already trust the key. PGP signing doesn't tell you "this is safe", just "this was signed by the person who has the private key for this public key", which isn't as useful without a lot of personal footwork or a trusted authority.

PGP key signing parties were a thing for a reason. Using PGP properly requires either an initial leap of trust (importing your distro's keys and trusting what they trust), a lot of diligence (personally verifying identities), or a small amount of diligence with a good web of trust (you sign keys that you know are good, and so does everybody you know, so a lot of what you find online you can validate through your links).

bombolo · on May 23, 2023

The people in charge of doing a release, with the permissions to do so, are a much smaller subset than the authors.

And PGP does support web of trust, so if the previous release guy trusts the new release guy… perhaps we could accept it as well.

woodruffw · on May 23, 2023

> And PGP does support web of trust, so if the previous release guy trusts the new release guy… perhaps we could accept it as well.

PGP's web of trust has been broken since at least 2019[1]. GPG removed support for it years ago.

(This is a recurring problem with PGP: if you search these things, you're given the false impression that it's all still humming along.)

[1]: https://inversegravity.net/2019/web-of-trust-dead/

upofadown · on May 23, 2023

Web of trust based on signatures on keyservers is dead. That is not what is being suggested here.

Beldin · on May 23, 2023

> that's what's your HTTPS certificate is for.

Not really... That certificate doesn't go back in time. If a domain expires, an attacker could reregister it under their name and get a valid certificate.

You'd be downloading from the right domain name with a valid HTTPS certificate, but you're not downloading from the same place as before.

recursive · on May 23, 2023

> That certificate doesn't go back in time.

It does, kind of, if it's pinned.

dgrove · on May 23, 2023

HPKP doesn't have a ton of adoption and only works in browsers. So this does nothing for curl, wget, pip.

crote · on May 23, 2023

>> knowing that the authors remained the same

The problem is that "authors" is not a well-defined concept, and especially larger projects will have very regular author changes. Is the author the person who made the last commit? The person who uploaded it to PyPI? The person who is currently managing the project? What if it isn't a person but a company?

>> that's what's your HTTPS certificate is for

A lot of open source projects rely on untrusted third-party mirrors. The main server will just randomly redirect you to a mirror near you, so HTTPS certificates are pretty much useless because you are connecting to a third-party domain. They use signatures to prevent the mirror from doing weird stuff, and they guarantee that the mirror is serving the upstream content as-is.

drexlspivey · on May 23, 2023

The author is the person holding the release signing key

eesmith · on May 23, 2023

"The"? Multiple people may hold the key.

"Person"? The release could be part of an automated process.

bombolo · on May 23, 2023

> and especially larger projects will have very regular author changes

We're not checking the signature of every commit, just of the release. It is usually 1 or 2 people who do releases.

heyoni · on May 23, 2023

But in this case we lose one way of defining authors.

slaymaker1907 · on May 23, 2023

There is some security even if they provide the public key. Bootstrapping is a problem, but clients can keep track of a mapping from package names to public keys and issue a warning if that ever changes. That’s how SSH and RDP work, and while I’ve never had an actual security hole plugged by this, I’ve had cases where my remote machine went down, the DNS hadn’t updated yet, and the IP had been reassigned, so the warning about mismatching keys was actually helpful.
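A minimal sketch of the client-side mapping described above, assuming the client already has a fingerprint for the key that signed a release. The names `check_pin` and `KeyChangedError` and the JSON pin file are hypothetical illustrations; nothing like this exists in pip.

```python
import json
from pathlib import Path


class KeyChangedError(Exception):
    """Raised when a package's signing key differs from the pinned one."""


def check_pin(store_path: Path, package: str, fingerprint: str) -> str:
    """Trust-on-first-use check of a signing-key fingerprint for a package.

    Returns "pinned" the first time a package is seen and "match" when the
    fingerprint equals the stored one; raises KeyChangedError otherwise.
    """
    # Hypothetical storage format: one JSON object mapping name -> fingerprint.
    pins = json.loads(store_path.read_text()) if store_path.exists() else {}
    known = pins.get(package)
    if known is None:
        # First use: there is nothing to compare against, so the key is
        # simply trusted. This is the bootstrapping problem.
        pins[package] = fingerprint
        store_path.write_text(json.dumps(pins, indent=2, sort_keys=True))
        return "pinned"
    if known != fingerprint:
        raise KeyChangedError(
            f"{package}: pinned key {known}, but release is signed by {fingerprint}"
        )
    return "match"
```

The sketch cannot tell a legitimate key rotation from a substituted key; both raise the same error, and the user has to decide which one it was.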

woodruffw · on May 23, 2023

> There is some security even if they provide the public key.

That security is integrity, which PyPI already provides through strong cryptographic digests of each package distribution. Codesigning schemes need to provide authenticity, not just integrity; a codesigning scheme that's downgradeable to arbitrary key trust is a more complicated than necessary hashing scheme.
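To illustrate the integrity-versus-authenticity distinction, here is a small sketch of a digest check, assuming the expected SHA-256 value comes from the same index that serves the file. The helper name `digest_matches` and the sample bytes are made up for the example.

```python
import hashlib
import hmac


def digest_matches(data: bytes, expected_sha256_hex: str) -> bool:
    """Return True if data hashes to the expected SHA-256 hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# Integrity: a file altered in transit no longer matches the published digest.
original = b"example distribution bytes"
published = hashlib.sha256(original).hexdigest()
print(digest_matches(original, published))
print(digest_matches(original + b" tampered", published))

# No authenticity: whoever controls the index can replace the file and the
# digest together, and the check still passes.
malicious = b"attacker-controlled bytes"
print(digest_matches(malicious, hashlib.sha256(malicious).hexdigest()))
```

A signature checked against a key that the same server supplies has the same limitation as the last case.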

donaldstufft · on May 23, 2023

The problem with TOFU is that it assumes long lived keys (itself a bad practice) OR it assumes that the end user will be fine with regular notices that the keys that have signed their packages have changed, and will be able to correctly differentiate false positives from real positives.

specialist · on May 23, 2023

> Because PyPI ... could always substitute a new key.

Isn't that what public key servers are for?

For publishing my FOSS to sonatype, I had to first publish my public key, eg keyserver.ubuntu.com.

I don't know PyPI, but from this OC, it sounds like PyPI does not have the same prerequisite.

woodruffw · on May 23, 2023

Yep. Unfortunately, PGP's keyserver network has been dead for years[1]. There are two big (non-synchronizing) ones left, and they're the two I used to do the analysis that's linked in this announcement (meaning they're the ones that are largely missing well-formed keys for the signatures on PyPI).

This was discussed a bit on Sunday's thread[2], and my understanding is that Maven's ability to use PGP in this way is effectively due to Sonatype assuming a large amount of operational and maintenance burden. PyPI doesn't have that kind of resources available to it. Even assuming that the service was gifted that kind of support, it would still cause a lot of heartburn with existing signatures and carry forward all of the legacy baggage of PGP that we're trying to eliminate entirely.

[1]: https://gist.github.com/rjhansen/67ab921ffb4084c865b3618d695...

[2]: https://news.ycombinator.com/item?id=36021172

bombolo · on May 23, 2023

It seems pypi should launch their own new keyserver, rather than removing PGP.

In any event, they will ask for a photo of the ID in the future. Google has already written on their security blog that this is where they're going, and from the whole google titan keys event, we know who decides on behalf of pypi.

masklinn · on May 23, 2023

Pypi doesn’t have the resources it needs to do its own job, they’re not going to waste more resources they don’t have on a dead-end technology they don’t have a use for.

alerighi · on May 23, 2023

> Notably, PGP is incapable of providing either of these: you only get key IDs, which are neither strong human identities nor a strong binding to a service. Key IDs might correspond to keys with email (or other identities) in them, but that's (1) not guaranteed, and (2) not a strong proof of identity (since anybody can claim any identity in a PGP key).

Depends. If the distributor maintains a repository of trusted public keys (as the repositories of Linux distributions do, for example), it gives you a guarantee. As has been said, most of the time you just want to know that the key used to sign a package has not changed. That is the same level of security that SSH offers (the first time you connect to a server it saves the public key, then gives an error if that public key has changed). That is really enough for a package on PyPI, or for signing git commits and similar.

We should ask ourselves whether the complexity of PGP is needed. Probably not, just as the complexity of X.509 certificates is not needed, since a simple RSA signature of the package with a public key hosted on a server would be sufficient. But PGP is practical: it has good tooling built around it and is pretty universal, so why not?