How does it compare to XChaChaPoly? I guess it's only useful when you are sure to have CPU support for AES and don't have to deal with the annoyance of non-constant-time software implementations. I feel like XChaChaPoly has the benefit of being annoyance-free (constant-time by design, even in a software implementation), of being fairly fast even without hardware support, and of already existing, so it's probably a good choice for most use cases.
EDIT: my question is probably more clearly asked as: what's wrong with XChaChaPoly that is solved by this new construct?
Because of AES-NI and CLMUL, AES-GCM is faster than a SIMD implementation of ChaCha20-Poly1305 on CPUs that have those instructions. In theory, ChaCha20 is overkill: 8 rounds would have been enough, and people do use 12 rounds for disk encryption.
Yes, those were my thoughts as well: if you are concerned about speed and are dropping AES rounds, you might as well do the same with ChaCha and use ChaCha20/8 (I don't know whether that "exists", but Salsa20/8 is a thing, used in scrypt and available in libsodium, for example). At least it's also fast when you don't have AES-NI.
Really pedantic nit here, but the cipher Salsa20 is called “Salsa20” and so reduced round versions are called e.g. “Salsa20/8”. However the ChaCha cipher is just called “ChaCha” so the specific-round versions are just “ChaCha20” or “ChaCha8”.
I feel like I just “actually it’s GNU/Linux”ed you there though... I feel bad, I’m sorry.
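To make the round-count naming concrete, here is a minimal sketch of the ChaCha block function with the number of rounds as a parameter, so that ChaCha8, ChaCha12 and ChaCha20 differ only in that one argument. It assumes the RFC 8439 state layout (32-bit counter, 96-bit nonce), which is a choice made for this illustration; the function names are hypothetical, and pure Python like this is neither constant-time nor suitable for real use.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(s, a, b, c, d):
    """Apply the ChaCha quarter round in place to state words a, b, c, d."""
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 7)


def chacha_block(key, counter, nonce, rounds=20):
    """Return one 64-byte keystream block (RFC 8439 layout assumed).

    rounds=20 gives ChaCha20, rounds=12 gives ChaCha12, rounds=8 gives ChaCha8.
    """
    if len(key) != 32 or len(nonce) != 12:
        raise ValueError("expected a 32-byte key and a 12-byte nonce")
    if rounds <= 0 or rounds % 2:
        raise ValueError("rounds must be a positive even number")

    state = list(struct.unpack("<4I", b"expand 32-byte k"))
    state += struct.unpack("<8I", key)
    state.append(counter & MASK32)
    state += struct.unpack("<3I", nonce)

    working = state[:]
    # Each loop iteration is a "double round": 4 column + 4 diagonal quarter rounds.
    for _ in range(rounds // 2):
        quarter_round(working, 0, 4, 8, 12)
        quarter_round(working, 1, 5, 9, 13)
        quarter_round(working, 2, 6, 10, 14)
        quarter_round(working, 3, 7, 11, 15)
        quarter_round(working, 0, 5, 10, 15)
        quarter_round(working, 1, 6, 11, 12)
        quarter_round(working, 2, 7, 8, 13)
        quarter_round(working, 3, 4, 9, 14)

    # Feed-forward: add the original state to the permuted state.
    out = [(w + s) & MASK32 for w, s in zip(working, state)]
    return struct.pack("<16I", *out)


if __name__ == "__main__":
    key = bytes(range(32))
    nonce = bytes.fromhex("000000090000004a00000000")
    for r in (8, 12, 20):
        print(f"ChaCha{r}:", chacha_block(key, 1, nonce, rounds=r)[:16].hex())
```

The reduced-round variants do proportionally less work per block (4, 6 or 10 double rounds), which is where the speed argument for ChaCha8 and ChaCha12 comes from.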
I've unfortunately seen many situations where only NIST-approved constructions are allowed. ChaCha is not a NIST-approved algorithm, but AES-256-GCM is.