Hacker News new | past | comments | ask | show | jobs | submit login
The XAES-256-GCM extended-nonce AEAD (filippo.io)
192 points by FiloSottile 5 months ago | hide | past | favorite | 56 comments



The design is very clever: since it's based on CMAC, we can use AES-CBC to derive keys where lower level primitives are not available.

In AES-CBC terms, the algorithm can be described as:

    1. L = AES-CBC-256ₖ(iv = 0¹²⁸, plaintext = 0¹²⁸)[:16]
    2. If MSB₁(L) = 0, then K1 = L << 1;
       Else K1 = (L << 1) ⊕ 0¹²⁰10000111
    3. M1 = 0x00 || 0x01 || X || 0x00 || N[:12]
    4. M2 = 0x00 || 0x02 || X || 0x00 || N[:12]
    5. Kₓ = AES-CBC-256ₖ(iv = K1, plaintext = M1)[:16] || AES-CBC-256ₖ(iv = K1, plaintext = M2)[:16]
    6. Nₓ = N[12:]
Where AES-CBC-256 returns the first 128-bit block of the ciphertext, discarding the padded block. (Thus, if you can't turn off padding, it costs three additional AES calls with the same key compared to a lower level implementation — not bad). After deriving a key, use it with the standard AES-GCM.

Here's my JS implementation based on WebCrypto API, which uses this fact: https://github.com/dchest/xaes

It accepts a proper CryptoKey intended for AES-CBC, supporting all CryptoKey features, e.g. storing it in IndexedDB with "extractable" bit set to false.

Great job, Filippo!


It seems to be standard in that field, but to be honest, I hate cryptographic notation. AFAICT, about half of the numbers in that pseudo-code count a number of bytes, and the other half a number of bits, and without already knowing the algorithms, it's almost impossible to tell which is which. E.g., N[:12] seems to be twelve bytes, while 0¹²⁸ is 16 bytes. X is actually the character 'X', i.e. the bit string 01011000. While L is actually a variable, not the bit string 01001100. And so on. It's clear that mathematicians don't like unambiguous notation nearly as much as CS people do.


Well, X is a variable that's equal to 'X' :) It's a mix of cryptographic notation, Go/Python notations for slices, and my ad-hoc notation for AES-CBC. To be fair, I find such pseudocode easier to read than math notation, but it's because I'm more used to it.


Agree. Bytes and bits mix is quite awful. For implementation purposes it could be more programming oriented. 0¹²⁸ -> `[0] * 16` (it's python oriented, maybe C has short idiom for that) or nice example with padded data: 0¹²⁰10000111 -> `[0] * 15 || 0b10000111`. `X` should be clearly 'X'.

I guess crypto pros are ok with notation. But for hobbyist it requires some reverse engineering every time.


I don't disagree, actually. I was copying the NIST source document notation with 0¹²⁸ and 0¹²⁰10000111, but it probably does more harm than good. `X` was just me being too clever. (In my defense, `X` is formatted differently from variables in the original, and all variables are defined.)

Done. https://github.com/C2SP/C2SP/pull/86/files


> 0¹²⁰10000111

for those of you(like me) wondering where this apparently spooky constant is coming from, it is a bitstring of the coefficients of the lexically first irreducible polynomial of degree b with the minimum possible number of non-zero terms, where b is the block size(in bits) of the underlying block cipher with which CMAC is instantiated. So, nothing up the sleeve here.


My natural follow-up question was "why can't you just have K1 = L?" Obviously it's inherited from CMAC, but why does CMAC do it?

Investigating further, general-case CMAC involves generating a K1 and a K2, which afaict just need to be arbitrarily different from each other. So why not something even simpler, like "xor with 1"?


The multiplication in CMAC is there to distinguish between full and partial final input blocks. It can't be simply a xor with a constant because that would be easily cancelable in the input, and wouldn't satisfy the required xor-universal-like properties required by the security proof.

The input here is highly restricted so there's no point to it.


My reaction was "Huh? What multiplication?"

The answer is that we're treating this as a Galois field/finite field of order 2^128 with the reducing polynomial (2^128 + 0b10000111).

Under that framework, the left shift and possible XOR implement multiplication by 2. (An example of general multiplication here: https://en.wikipedia.org/wiki/Finite_field_arithmetic#Progra...)


It's an 8 bit constant padded to 128 bits, so while I appreciate the explanation it's not nearly enough entropy to spook me.


Not a crypto guy. So would the high-level summary be something like this?

Standard AES-GCM AEAD will catastrophically fail if you use the same nonce twice[1] for different messages, and the size of the size of the nonce is small enough that one cannot safely use a random nonce in many cases.

This work provides an easy to use way to avoid that. It does so by changing not just the nonce but also the key used per message for the AES-GCM call.

And it uses only "plain" AES which is readily available if you have AES-GCM, and no fancy new constructions which may have unknown weaknesses.

The overhead per message is just two small buffers that needs to be encrypted/decrypted with "plain" AES and a longer 192 bit nonce.

[1]: https://frereit.de/aes_gcm/


It seems like this removes the footgun in vanilla AES-GCM where you really need to rotate keys every ~2^32 messages if you are using a random nonce. Nonce collision in AES-GCM is catastrophic (it allows attackers to at least sign arbitrary messages). You don't need to use a random nonce, but it's usually recommended. Fairly clever to use two primitives (counter-based KDF and vanilla GCM) to make this FIPS compliant.


No, this makes random nonces safe in the first place. With standard AES-GCM, you should use deterministic nonce generation since 96 bits is not enough to avoid random collisions. Also, you must change the nonce (or key) after 2^32 blocks regardless of how it was generated because the counter rolls over and the next block would use the same nonce+counter as the first block.


You are wrong. NIST recommendations outline both deterministic or RBG-based construction. The 2^32 invocation limit is only for random nonce or if your deterministic construction is less than 96bits. Lots of people are doing random nonces.

https://csrc.nist.gov/pubs/sp/800/38/d/final


That link says the document needs revising, specifically "to clarify the guidance in connection with the IV constructions" (NIST calls the nonce an IV). It defines GCM normatively but its non-normative recommendations are outdated.

I also don't agree with the specific claim (made or at least implied by NIST) that a single 96-bit deterministic nonce isn't limited to 2³² blocks. The counter block will wrap around regardless of how the nonce was generated, because the GCTR function that is used to compute the ciphertext and authentication tag sets CBᵢ = inc32(CBᵢ₋₁) and CB₁ = ICB = inc₃₂(J₀) with J₀ a function of the nonce and inc₃₂ only incrementing the bottom 32 bits. Modern recommendations do not make this distinction around how the nonce was generated and I see no justification made for it by NIST. Perhaps it was meant to apply to the case where a portion of the nonce was implicit and thus not sent or stored in the clear, but deterministic generation doesn't always mean partially implicit nonces and the implicit part is too small (usually 32 bits) and too easy to obtain (often derived from a hardware identifier) to provide any additional security anyway.

Using any nonce lengths other than 96 bits is not recommended today, regardless of the recommendations in 2007. Shorter lengths are obviously a poor choice, but longer lengths are not always supported by implementations. Moreover, while the published standard supports various lengths (with support for <96 bits marked for removal), the invocation of the GHASH function and its effect on nonce entropy is not well studied AFAIK and all nonces other than 96 bits are fed into GHASH. Thus, one shouldn't use a nonce longer than 96 bits, which means the birthday paradox can become a real problem if the same key is used to encrypt a large number of messages, each with different nonces. A single or relatively small number of CSPRNG-generated nonces for the same key is usually okay; a lot is a problem. This problem is a major reason why AES-GCM-SIV, XChaCha20-Poly1305, and XAES-256-GCM even exist.


For some elaboration on my issues with NIST recommendations, let us consider NIST's response to public comments from 2021:

NSA raised the issue of "Counter wrapping, or integer overflow, because counter is 32 bits" to which NIST replied that "WITH CURRENT COMPUTING ABILITIES [...] Counter should not overflow". I find that to be a thoroughly inadequate response. Obviously current computing capabilities can overflow a 32-bit counter. That also translates (as NSA also pointed out) to 68GB of data encrypted with the same nonce, which is still "a lot" for some use cases, but easy to exceed for other use cases in the age of terabytes and petabytes.

On the issue of nonce reuse specifically, NIST respond to NSA's concerns with 'Generate a new 96-bit nonce for each message using a cryptographically strong PRNG. Re-key at reasonably regular intervals, where "reasonably regular" is defined by how much data and how many messages are being encrypted'. I think that broadly validates what I said. However, "reasonably regular" is not actionable guidance, and it is not always possible to re-key easily.

https://csrc.nist.gov/csrc/media/projects/crypto-publication...


I framed that as NIST responses to NSA concerns, but on re-reading, it seems that the table I'm quoting is entirely produced by the NSA. This doesn't really affect the substance of what I wrote with regard to technical details, but I may have misattributed statements to NIST that came from NSA.


They're not wrong, and the limited nonce space in standard GCM is one of the most common problems cryptography engineers have with GCM.


This is freaking fantastic- I only wish it existed when I wrote my last encrypted filesystem a few years back.

Nonce collision is a huge concern on large file system deployments. 2^32 seems huge but when you’re writing 100k iops a second on a PB array the chance of collision is almost guaranteed if you’re betting on PRNG randomness.


The CAESAR competition [1] ended in 2019 and resulted in multiple different AEADs, most with plenty of nonce space.

[1] https://en.m.wikipedia.org/wiki/CAESAR_Competition


Why is nonce collision a problem though? It just means that two blocks share the same encryption key, right? Without knowing the plaintext in either block, how does that weaken the security of the system?


Encryption with the same key and repeated nonce/counter produce the same cipher stream. Ciphertext in GCM (or CTR) mode is cipherstream XOR plaintext, thus given two ciphertexts with the same key/nonce:

ciphertext1 XOR ciphertext2 = (cipherstream XOR plaintext1) XOR (cipherstream XOR plaintext2) = plaintext1 XOR plaintext2

In GCM it can also break authentication.


I would love to see this used in a FIPS-compliant variant of age[1] for archival file encryption use cases. We had banking industry auditors veto age for this use case due to the use of ChaCha instead of AES (they were fine with the X25519 public key part of age which I think was somewhat recently approved by NIST).

I’ve no experience with golang but it seems like it should drop right in based on the age spec. I might give it a shot if time ever permits. I guess I should call it “cage” as in “compliant actually good encryption”

1: https://github.com/FiloSottile/age


> “compliant actually good encryption”

an oxymoron, perhaps


also an unfortunate acronym


> they were fine with the X25519 public key part of age which I think was somewhat recently approved by NIST

As far as I can tell, Ed25519 is approved (FIPS 186-5), but X25519 is still not (yet).


FWIW Go doesn't have an implementation of XAES in the stdlib yet, there's only the reference implementation in C2SP.


Thank goodness. I am kind of sick of the constant churn in the crypto package.

I get that you want to keep up to date with security, but the entire crypto tree is basically a playground for Filippo Valsorda at this point. Meanwhile stuff that I actually need like CMAC is "won't fix"


What churn does the crypto package get? It's part of the standard library and so bound by the compatibility promise, which basically freezes existing things in place.



Churn implies upheaval, breaking things that used to (and ought to) work. I don't see examples of that from the first few commits I examined.


so pretty stable, and you're mad that other people won't do free work for you to implement a mostly unused spec.


There's a lot of this kind of entitlement around.

Trash other peoples' work as "playground" activity, and demand they work on something else for free.


What's another language with a stdlib that includes CMAC?



Erlang / Elixir


Link? Does Elixir even have cryptography in the stdlib?


https://www.erlang.org/doc/apps/crypto/crypto

Elixir can call Erlang/OTP modules directly, so they also have access to that same module. Erlang/OTP is a hard dependency for Elixir afaik.


Gotcha, thanks!


bge, bureaucratic good encryption.


Question from a non-cryptographer: why use 192bit nonces instead of 256? I can’t imagine those extra bits would be considered costly in any practical application.


There is no space for 256 bits: 192 bits is 96 bits from the underlying nonce space, and 96 bits that go into the 128-bit CMAC block (along with the necessary prefix). We could make the CMAC input longer, but then we'd have to run the AES-256 block function more times (and we'd hit some annoying key control issues in the CMAC KDF).

This is actually similar to why XChaCha20Poly1305 has 192-bit nonces, and consistency with the other major extended-nonce AEAD is another mild advantage.


Reducing security below 128 bits in order to save a block of AES will anger the gods and surely we will be made to pay. Turn back now, while there is still time.


>(2⁸⁰ messages with collision risk 2⁻³²)

Would there be an issue before that due to the fact that the AES block size is only 128 bits?


Assuming you’re referring to the birthday bound on blocks (https://sweet32.info) that’s a limit on blocks encrypted with a single key. XAES derives large keys per message, so it achieves what are commonly referred to as “better-than-birthday” bounds.


No

(it's difficult to give a more detailed answer than that, without more detail on why you think it'd be an issue)


> safe, boring, compliant, and interoperable

My favorite kind of technology.


I like it, because it is indeed nice to have a NIST-backed construction.

But at the same time, it is disappointing that you get locked out of several niceties of NIST KDFs, such as label and context. I get that they are sacrificed to minimize the number of AES calls, but still I would prioritize strong cryptographic separation over just a few saved AES calls, especially for messages longer than a few hundred bytes.

Finally, *random* GCM nonces longer than 96 bits are definitely misunderstood and bring better guarantees than 96 bits nonces [1]. But of course, if you can derive a fresh key for every message, that's definitely to prefer.

[1] https://neilmadden.blog/2024/05/23/galois-counter-mode-and-r...


[flagged]


Somebody make one of those tricky quizzes, "is it a crypto algorithm or Elon Musk baby name?"


Only natural to see him doing crypto grifting


[flagged]


The British meaning is much newer, and is a bastardization of an extremely commonly-used word in the sciences (derived from a Latin root and all). Cringing at its use evokes the feeling I get when someone snickers at the pronunciation of "Uranus" in an astronomy course.


I guess this thread is talking about the word nonce, https://www.collinsdictionary.com/dictionary/english/nonce


This is why people have the concept of context.


Don't feed egregious comments by replying; flag them instead.

https://news.ycombinator.com/newsguidelines.html


I don't think the point was raised in bad faith nor do I find myself shocked by it's content. Flagging here seems extreme.


The flags are for the effects these comments have on threads (offtopic tangents, grumpery, etc) not the faith of the commenter or one reader's resilience to shock.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: