Recent versions of Rails take this approach. But I’ve wondered: is it possible to avoid copying the key onto disk and leave it in 1Password, e.g. via their CLI?
Yes, it is possible: op run injects secrets directly into a subprocess's environment without writing them to disk. Also see https://news.ycombinator.com/item?id=41482194 … we responded to the parent around the same time :D
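A minimal sketch of how that looks (the vault and item names here are placeholders, not anything from the thread): the .env file holds op:// references rather than the secret values, and op run resolves them only into the environment of the child process it spawns:

    # .env -- contains references, not the secrets themselves
    RAILS_MASTER_KEY=op://Private/my-rails-app/master-key

    # resolve the references and hand the real values to the subprocess
    op run --env-file=.env -- bin/rails server

    # or read a single secret on demand
    op read "op://Private/my-rails-app/master-key"

The key never lands in config/master.key; Rails picks it up from RAILS_MASTER_KEY in the environment of the process op run launches.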
As far as I can tell there's zero real-world impact here; I think they just want to maintain a stellar track record of taking seriously any reported bug that could affect certificate issuance in any way.
Basically, whether it was off by a second, a day, or a month doesn't matter - they still treated it seriously. That sort of thing goes a long way towards building trust.
I want to trust that when a CA has a written policy, they think it's important to stick to that policy and have plans for doing so.
For instance, Symantec had a policy that they validate their subscribers before issuing a certificate. What they actually did was validate their subscribers before issuing a certificate, unless they were testing things out. In 2015, it was found that they had tested out a google.com certificate, and they fired the employees involved in the incident: https://archive.is/Ro70U
At no point in either incident were those certs outside of the control of Symantec employees. Still, they lost a lot of trust (and ultimately their CA was marked untrusted) because they did not fix their problems: https://wiki.mozilla.org/CA:Symantec_Issues
So apparently letting people use human judgment, firing people who misuse it, and hiring different people with hopefully-better human judgment is not the way to be trustworthy.
(To be clear, I think it's extremely reasonable for them not to revoke the certificates, but I think it's good and important that they're following the procedure which requires them to make an explicit decision not to revoke them in consultation with the community.)
acting like an automaton would mean they just revoke all the certificates, which they won't do. but thinking about it, and perhaps adapting for the future, they do. and that's the right thing
computers are about exactness. if they do not value that and inspect incidents thoroughly, i would not trust them for security.
Trust extended by the root program maintainers (who serve as a proxy for you, the user, and should make decisions in your interests) to the CA operators (who all too often make decisions that are good, in a prisoner’s dilemma sort of way, for the certificate holders but are terrible for you). This is meant to correct the broken incentive structure of the CA model where the people who pay the CAs are not the people who consume the results of the CA’s work.
(How much the root program maintainers can be relied on to represent your interests varies, but all other alternatives so far seem even more terrible.)
Nobody is seriously suggesting that revoking every LE certificate is the proportionate response here. But in the background for all of this is the fact that CAs historically have resisted revocation as a remedy for practically every misissuance or security incident, no matter their severity. They also frequently talked as though revocation was not only not considered, but did not even come to mind when the incident was recognized.
Arguments from lack of security impact or the number of “affected customers” (who are not, in fact, the people the CAs exist to provide a service to, to reiterate the incentive problem) were used to argue against almost everything, including eliminating shady sub-CAs, reducing astronomical maximum validity periods, and prohibiting broken security schemes, even after the corresponding decisions were confirmed by a vote of the CA/Browser Forum and publicly announced by the root programs. (In fact, there’s an open Mozilla bug right now where Google’s Ryan Sleevi is lambasting Google Trust Services for cross-signing a historical SHA-1 CA using a currently trusted root.)
This is why Mozilla’s policy is absolutely merciless regarding revocation and incident reports. In an issue like this, where there is, in fact, no actual security impact, it is expected that the CA will suck it up and file an additional report saying that yes, on balance we aren’t going to revoke, and here’s our reasoning (because again, bare statements or restatements of conclusions masquerading as justification are the norm for CA communications, as can be seen from the current compliance bugs on the Mozilla tracker). For every decision not to revoke, there needs to exist at least perceived harm to the CA(’s reputation with the root program), because history shows that otherwise nobody revokes anything.
(Browse the relevant category of the Mozilla tracker, mozilla.dev.security.policy, or the CA/B Forum mailing lists if you want a chilling read. This is what finally turned me off DNSSEC+DANE, because if the people for whom PKI and key management is literally their only job are that bad at it, I don’t even want to imagine how bad domain resellers are going to be, and unlike with CAs you can’t just toss your DNS delegation out and get a new one in a couple of minutes.)
Thus the revocation part of this issue is posturing, but it’s posturing that both sides recognize and have decided to accept as the only way of ensuring the WebPKI remains somewhat functional.
The operations part that’s going to be discussed in the linked bug itself (and not in the so far nonexistent no-revocation report), on the other hand, has immediate importance to LE’s operations, because it means there were no additional controls beyond the issuing software itself enforcing LE’s declared policies, and that’s just too brittle a design to work with. The policies declared in the Certification Practice Statement are (and are meant to be) rigid and hard to change, whereas people will routinely reconfigure or even modify the issuing software, and somebody, sometime, will fumble the certificate template, check the wrong verification box, or even mistakenly enter a command to issue a sub-CA from prod. That’s not a problem in itself, but not preventing it from causing a misissuance is.
The accepted remedy is to run independent linters (plural, because bespoke tooling integrating them into the pipeline has mistakes as well) configured to enforce both the common Baseline Requirements and the particular CA’s CPS and left untouched until there’s a (carefully reviewed) change to those. It seems that LE’s setup failed in this regard, because while of course Boulder can and will have the occasional off-by-one bug, it’s highly unlikely that ZLint or Cablint or whatever has the same one, so somebody configured them wrong.
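As an illustration of the kind of independent check being described (a sketch, not a claim about LE's actual pipeline), the standalone zlint tool can be pointed at any issued certificate and will report which of its built-in Baseline Requirements and RFC 5280 lints fail:

    # install the standalone linter (needs a Go toolchain)
    go install github.com/zmap/zlint/v3/cmd/zlint@latest

    # lint a PEM-encoded certificate; each lint reports pass/warn/error
    zlint cert.pem

Running something like that post-issuance but pre-delivery, with a configuration that changes only alongside reviewed CPS changes, is the control in question; CA-specific CPS rules generally need additional custom checks on top of the built-in lints.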
The security concerns of this particular bug are essentially zero. The meta question is if there are other related bugs that may not have been caught. We should stamp out bug classes, not individual bugs.
It's a brown M&Ms sort of situation: low impact, but the appropriate response is to audit how the mistake was made and figure out what failed for it to slip through, which might lead to insight into other latent problems.
My guess is that the Facebook employee thought it simply wasn't worth depending on an untrusted (22-star) library like this for the project. Maybe there was some additional feature where they just didn't want to deal with asking the original author and preferred to build it themselves, but I'm guessing that because the project is going to make significant use of this (probably growing as more complicated use cases come up), in-house control is easier to manage (e.g. imagine the case where the repo goes offline or something). Also, it doesn't look like this was 'Facebook inspired'; it was done on the developer's own time, likely some weekend project because they were bored. I really don't think this was done out of malice, as some other comments hint.