sevenoftwelve's comments (Hacker News)

The article is interesting, but it misses the most practical and unambiguously safe way to generate streams of random data: Use cryptography.

To generate a stream of random data, use a hash function with arbitrary-length output (XOF) such as blake2x[^0] or shake256[^1]. Make sure your key contains at least 256 bits of entropy. Absolutely never use a key with less than 128 bits of entropy.
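As a concrete sketch (using Python's `hashlib`, where SHAKE256 is available as an XOF; the all-zero seed is a placeholder, not a real key):

```python
import hashlib

# Sketch: expand a high-entropy seed into an arbitrarily long random
# stream using SHAKE256, the XOF from the SHA-3 standard.
# This seed is a placeholder -- in practice it must come from a real
# entropy source and contain at least 256 bits of entropy.
seed = b"\x00" * 32

stream = hashlib.shake_256(seed).digest(1024)  # first 1024 bytes
```

Because SHAKE256's output is a deterministic function of the seed, asking for a longer digest later yields a prefix-consistent continuation of the same stream.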

Since it's impossible to know how much entropy there is in a key, you probably want to use something like the Fortuna RNG[^2]. Substitute your XOF for the SHA-2/AES-based construction; Bruce Schneier designed Fortuna back when XOFs were harder to come by.

If you want more performance, you can use blake2 to compress your input seed into 256 bits and generate the random stream using chacha20[^3].
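A sketch of this compress-then-expand approach; Python's stdlib has no ChaCha20, so SHAKE256 stands in for the stream cipher step here:

```python
import hashlib

# Sketch of the compress-then-expand construction: BLAKE2b squeezes a
# long, unevenly distributed seed down to a 256-bit key, which is then
# expanded into a stream. Python's stdlib has no ChaCha20, so SHAKE256
# stands in for the stream cipher here.
raw_seed = b"long seed material with entropy spread unevenly ..." * 8

key = hashlib.blake2b(raw_seed, digest_size=32).digest()  # 256-bit key
stream = hashlib.shake_256(key).digest(4096)
```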

All of this is usually handled by the Linux kernel[^4], so it's best to use the getrandom(2)[^5] system call or simply read from /dev/urandom[^6]. If you are writing a Rust program, you can use the rand[^7] crate, which takes a mixed approach: it reads a seed from the operating system and expands it in-process using chacha[^8]. This is a valid strategy.
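For illustration, reading key material from the OS pool looks like this in Python:

```python
import os

# In application code, prefer the operating system's CSPRNG; on Linux,
# os.urandom draws from the same pool that getrandom(2) and
# /dev/urandom expose.
key = os.urandom(32)  # 256 bits of OS-provided randomness
```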

I am omitting some subtleties[^10] about mathematical definitions of randomness extractors as used by the author of the article. When you are using a cryptographic approach, you are dealing with a complexity-theory based security notion[^9], which does not precisely equate to creating a stream with a specific amount of entropy. Everywhere – except for a physics or a math paper dealing with information theory – I would call this a technicality. For most intents and purposes, cryptographic security notions are the most real-world robust conceptions of randomness available.

[^0]: https://www.blake2.net/

[^1]: https://csrc.nist.gov/pubs/fips/202/final (shake256 is part of the SHA-3 standard)

[^2]: https://www.schneier.com/academic/fortuna/

[^3]: https://protonvpn.com/blog/chacha20

[^4]: https://lwn.net/Articles/884875/

[^5]: https://man.archlinux.org/man/getrandom.2

[^6]: https://man.archlinux.org/man/urandom.4

[^7]: https://crates.io/crates/rand

[^8]: https://docs.rs/rand/0.8.5/rand/rngs/struct.StdRng.html

[^9]: https://en.wikipedia.org/wiki/Security_of_cryptographic_hash...

[^10]: In particular, PRFs are not guaranteed to output tokens with a certain amount of entropy – if I recall correctly – because they can map two inputs to the same output.

---

I am the main author of the Rosenpass[^11] post-quantum secure key exchange for WireGuard. My expertise comes from developing this protocol, as well as a couple of years of engagement with the real-world cryptography community and from my own scientific research on cryptography and secure implementations of cryptography.

[^11]: https://rosenpass.eu/


This is a great post with really valuable resources for any practical attempt to use strong randomness, but aren't you missing the whole point of the article?

Surely if you can just use a strong RNG to generate the key for the cryptographic algorithm you could just use that for all your randomness and ignore the stream of input entirely? The whole point of the article is how to extract the entropy from an unknown/untrusted input stream.

It's like the author has presented a recipe for a chocolate cake and you've said "it's better if you already have a cake, then you can just take a slice of that". Well yes.

Or in the domain of the article, faced with von Neumann's algorithm for getting fair flips from a biased coin, your solution amounts to "Instead I just flip my own coin which I know is fair."
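For reference, von Neumann's extractor is tiny; a sketch:

```python
import random

# Von Neumann's extractor: read biased flips in pairs; emit the first
# bit of each unequal pair (HT -> 1, TH -> 0), discard equal pairs.
# Both surviving outcomes have probability p*(1-p), hence fair output.
def von_neumann(flips):
    out = []
    for a, b in zip(flips[::2], flips[1::2]):
        if a != b:
            out.append(a)
    return out

biased = [1 if random.random() < 0.8 else 0 for _ in range(10_000)]
fair = von_neumann(biased)  # roughly 1600 approximately-fair bits
```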


I don't think so.

Using hash functions requires a minimum amount of entropy in the seed. So do the schemes put forward in the article. In particular, these schemes require a relatively high degree of certainty about the amount of entropy in the stream, with low variance. For the entropy extractors, the total entropy required scales linearly with the length of the output stream. If you are using a hash function, the entropy requirement is constant.

The fact remains, both the method put forward in the post and using hash functions have a minimum entropy requirement. Even if you have only a small amount of entropy, hash functions will still get you more bang for the buck.

If push really comes to shove, you can still use a key stretching function[^0] to make it as hard as possible for an attacker to brute force what little entropy you have, as is routinely done with passwords.
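A minimal sketch with the stdlib's PBKDF2 (scrypt or Argon2 would be stronger choices in practice; the salt here is a placeholder):

```python
import hashlib

# Key stretching sketch: derive a key from a low-entropy secret with a
# deliberately expensive KDF. PBKDF2-HMAC-SHA256 is in the stdlib.
# The salt is a placeholder -- use a random, per-device salt.
weak_secret = b"correct horse battery staple"
salt = b"\x00" * 16
key = hashlib.pbkdf2_hmac("sha256", weak_secret, salt, 600_000)
```

The iteration count is what makes each brute-force guess expensive for the attacker.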

To illustrate the difference in entropy requirement, imagine running the Von Neumann generator for a while, achieving the needed entropy level. In this scenario, the output stream of randomness will be fine. If you then get a section from the stream with very little entropy – much lower than the required amount – you get a section with very little entropy in your output stream. The Von Neumann generator can degrade, hash functions won't (for all practical intents and purposes).

Fortuna is designed to address the entropy estimation problem. It is eventually secure even in the face of a determined attacker; because it drains its entropy pools with exponentially decreasing frequency, it covers an extreme range of entropy levels – around ten orders of magnitude with 32 pools.
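A sketch of just the pool schedule (not a full Fortuna implementation): on reseed number n, pool i is drained iff 2^i divides n, so pool i contributes only every 2^i-th reseed.

```python
# Sketch of Fortuna's pool schedule only: incoming entropy events are
# spread round-robin over 32 pools; on reseed number n, pool i is
# drained iff 2**i divides n. Pool i therefore contributes only every
# 2**i-th reseed, which is what covers such a wide range of entropy
# rates.
def pools_used(reseed_counter, num_pools=32):
    return [i for i in range(num_pools)
            if reseed_counter % (2 ** i) == 0]
```

Reseed 1 drains only pool 0; reseed 8 drains pools 0 through 3. A source that trickles entropy in too slowly to secure pool 0 still eventually fills some higher pool faster than the attacker can observe it.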

Crucially, once a secure level of entropy is reached, the RNG stays secure.

Of course, if the amount of entropy is low, Fortuna will produce quite a bit of insecure output before eventually producing secure output. In practice, this issue can be solved by running Fortuna for a while without producing any output. In a factory producing secure hardware devices, you might just allow the device to be active for a day, and you might inject some extra, high-quality entropy from the device that flashes your hardware devices in the first place.

Fortuna also allows you to combine different sources of entropy securely. With a Von Neumann generator, for instance, achieving this is much harder.

Entropy estimation is borderline impossible in practice; Fortuna deals with this head on, with von Neumann you are lost if your estimate is off.

[^0]: https://en.wikipedia.org/wiki/Key_stretching


I am researching formal-methods systems for conducting cryptographic proofs; I think a big issue in that space is having syntax sugar in the core compiler, so I am pursuing ways to move syntactic sugar into a separate macro language.

Your post sounds like an interesting research project, but the examples given in the blog post seem somewhat discouraging to me. When I look at one of your examples, it seems obtuse. I am sure I could understand the examples by reading the article properly rather than skimming it, but this tells me that the grammars you used are not very self-explanatory: your syntax expresses meaning relative to the specific grammar, as opposed to expressing meaning relative to the English language or to pre-existing, well-established programming languages.

Is there some example of self-explanatory grammars and self-explanatory bootstrapping code in your language?


Self-explanatory grammars: there's the stem-like syntax that I showcase, and you could write a Lisp-like syntax in it pretty easily. I'd imagine you don't want to use this in production yet. As for self-explanatory bootstrap code: the bootstrap code is the way it is because cognition doesn't want to assume some complex syntax from the start.


I am the Rosenpass author; Rosenpass is an add-on that makes WireGuard even more secure. There are certain types of attacks (e.g. those using quantum computers) that could become feasible for governments and very large organizations to perform.

To protect the infrastructure of smaller companies and the data of private WireGuard users, Rosenpass makes WireGuard immune against these potential attacks from states and large organizations.


Without looking exactly how Rosenpass works (outside of your brief explanation), could this just be merged into WireGuard? If yes, why hasn't it?


Rosenpass author here;

Mate, you could just read the code…or give it a try ;)

> the WG author seems like he doesn't care about PQC

This is plainly not true; WG supports post-quantum security through the PSK mechanism, which is exactly what we use. PQ crypto is high quality, but it is also new and fairly inefficient; not a good thing to integrate directly into the kernel. Using the PSK mechanism is the best way to do this that I know of at this point in time.

> It's also not clear how the WG PSK change is coordinated, and whether that entails a brief loss of connectivity - packet loss, latency spike.

WireGuard establishes a session with the existing PSK; we replace the PSK every two minutes but WireGuard keeps its established session around until it renegotiates a session.

Both WG and RP rekey their session every two minutes; there is no interruption.


So is the rosenpass tunnel separate from the non PQC tunnel (the non PQC tunnel being used just for rosenpass)?

Because afaik the moment the PSK is changed all packets immediately start being encrypted by it.

If the change doesn't coincide on both the sender and receiver (within an instant), there will be dropped packets until both PSKs are the same again. Being separate from WG, I don't see how you can insert yourself into their state machine for better coordination.


We are based on the work of Hülsing, Ning, Zimmermann, Weber and Schwabe and are in contact with the group. I haven't heard about kuedlki; the slides seem to refer to a reimplementation of the 2020 paper, and the Go implementation has been stale for two years. They also seem to use a tweaked CRYSTALS implementation, which I would not trust. The link to the blog post they refer to is dead.

The Rosenpass protocol builds on the 2020 paper but also adds security against state disruption attacks (CVE-2021-46873). The implementation is actively maintained and written in Rust, not in Go. We use Classic McEliece and Kyber from the OQS library.


Rosenpass author here;

Nope, that is not correct. NIST has selected Kyber as one of the algorithms to standardize, and we are using that.

As other commenters mentioned (very good info there, thank you all!) the other algorithm we use – Classic McEliece – is one of the oldest algorithms and has been well studied. There is no known efficient attack against it.


Have you seen https://isd.mceliece.org/1347.html ? DJB agrees with you.


DJB says the parameters designed for long-term use, mceliece6960119 and mceliece6688128, are fine even against an attack billions or trillions of times stronger.


Rosenpass author here;

There is some confusion about terminology here, I think. Mathematical proofs, including cryptography proofs, use models that simplify reality; i.e., the real practical system might still be susceptible to attacks despite a proof of security.

For crypto primitives (Classic McEliece, curve25519, ed25519, RSA, etc.) the current standard for proofs is showing that they are as hard as some well-studied mathematical problem. This is done by showing that an attack on the primitive leads to an attack on the underlying mathematical problem. The proof for Diffie-Hellman shows that attacking DH leads to an efficient solution for the discrete log problem. I.e., the proof is a reduction to the underlying problem.

No primitive is perfectly secure (at the very least, brute force – guessing each possible key – is always possible); there is some probability that the adversary can guess the right key. We call this probability the adversary's advantage. One task in cryptanalysis is to find better attacks against primitives, with a higher advantage; if an attack with a polynomial average runtime is found, the primitive is broken. Finding a faster but still non-polynomial attack is an interesting result too.

The standard for protocols is proving that the protocol is secure assuming the primitives are secure; since multiple primitives are used you basically get a formula deriving an advantage for breaking the entire protocol. The proof is a reduction to a set of primitives.
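To give a feel for the shape of such a result, the bound typically looks something like this (illustrative only; the constants and terms depend on the concrete protocol and its primitives):

```latex
\mathrm{Adv}^{\mathrm{protocol}}(\mathcal{A}) \;\le\;
  c_1 \cdot \mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\mathrm{KEM}}(\mathcal{B}_1)
  \;+\; c_2 \cdot \mathrm{Adv}^{\mathrm{PRF}}_{F}(\mathcal{B}_2)
  \;+\; \cdots
```

Each term on the right is the advantage against one primitive, so breaking the protocol implies breaking at least one of its building blocks.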

We did not build a proof to that gold standard, although we are working on it. We built a proof in the symbolic model – known as a symbolic analysis. This uses the perfect cryptography assumption; i.e., we assumed that the advantages for each primitive are zero. Google "Dolev-Yao model".

This makes the proof much easier; a proof assistant such as ProVerif can basically find a proof automatically using logic programming methods (Horn clauses).

The definitions of security are fairly well understood; unfortunately there is a lot to go into so I can't expand on that here. Looking up "IND-CPA" and "IND-CCA" might be a good start; these are the security games/models of security for asymmetric encryption; you could move on to the models for key exchange algorithms there. Reading the [noise protocol spec](https://noiseprotocol.org/) is also a good start.


A professor at university had an interesting illustration of the attacker's advantage.

First off, there is the regime where an attack is straight-up impossible. If you need to invest ~10k operations for each atom in the observable universe to break a system with more than 50% probability, well. That won't get broken until breakthroughs in the related mathematics happen. Even if you were lucky enough to guess a key once, you never will be twice.
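The rough arithmetic behind that first regime (my numbers, using the commonly cited ~1e80 atoms estimate):

```python
import math

# Back-of-envelope for the "impossible" regime described above:
# ~1e80 atoms in the observable universe (commonly cited estimate),
# times 10k operations per atom, is about 1e84 operations total --
# the work needed to search a space of roughly 2**279 keys.
atoms = 10 ** 80
ops_total = atoms * 10_000
search_bits = math.log2(ops_total)
print(round(search_bits))  # -> 279
```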

Then you enter the area of throwing money at it. You can conquer quite a few powers of two of search space if you throw a distributed system worth billions of dollars at it, plus a couple million more on post-docs shaving fractions off that exponent. Here you are usually still safe, since it'll be hard even with all that hardware, manpower and math research.

But once it's exponential growth with low exponents, or even polynomial, it's just an implementation and optimization issue on the way to real-time decryption.

However, even if the math is hard, the implementation might not be. And that's why a formally verified implementation of a very hard algorithm is exciting. If breaking the implementation is provably as hard as discrete logarithms, and you do get broken, a silly amount of modern crypto gets broken all at once.

Or we might learn something about formal verification and your method and tooling. Which is also good progress.


Thank you for the informative comment!


Author here;

We are :) Rosenpass is a fancy way of generating a PSK for WireGuard.


Rosenpass author here.

It does yes. But it is a mitigation, not a real fix.

An attacker could still just speed up time. Although not being able to produce a KillPacket for the year three thousand is a good thing :)


Rosenpass author here; I myself am independent, hence the funding by NLNet. Some project participants are freelancers; two of my co-authors are employed at research institutes. One of my co-authors is employed at MPI-SP.

The cookie mechanism is a defense against WireGuard CVE-2021-46873; the attack is, in my view, not bad enough to get rid of the WireGuard protocol. WG is still the standard for pre-quantum VPN implementations. Rosenpass also needs to use post-quantum crypto primitives that take a lot of CPU and memory resources.

Rosenpass and WireGuard work together; Rosenpass runs in userspace and hands keys to WireGuard, so we do not plan to replace it any time soon.

It would be possible to apply the biscuit mechanism to classical WireGuard; unfortunately that would cause a protocol incompatibility. I am not sure if they are going to take that path.


