The key here is to understand that by constant-time, we don’t mean that operations can’t take a varying amount of time. Instead, we mean that the timing signal generated by our operations doesn’t leak any secret information.
It’s very unfortunate that the crypto-community uses the term “constant-time” this way. Without further context, I thought of algorithmic complexity, where “constant time” means “bounded by a constant” — which doesn’t guarantee resistance to timing attacks at all!
So if you’re looking for a crypto algorithm, make sure it’s constant-time in the relevant sense.
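As a rough illustration of what "constant-time" means in this sense, here is a sketch in Python (which itself gives no real constant-time guarantees, so treat it as pseudocode for the idea rather than something you'd ship):

```python
import hmac

def leaky_equal(a: bytes, b: bytes) -> bool:
    # Variable-time: returns as soon as a mismatching byte is found, so the
    # runtime reveals how long the matching prefix is.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def constant_time_equal(a: bytes, b: bytes) -> bool:
    # "Constant-time" in the crypto sense: the amount of work depends only on
    # the (public) lengths, never on where the secret bytes differ.
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0

# The standard library already provides this: hmac.compare_digest(a, b)
```

Both versions still leak the length, but that is usually public anyway.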
> This means representing numbers in base 2^{m}, where m is your register width. With components this big, we usually call these “limbs”: wordplay on “digits”, of course.
I never realised limbs was wordplay on digits. Duh!
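For anyone who wants to see the limb representation concretely, a small sketch (assuming 64-bit limbs, least significant first):

```python
def to_limbs(n: int, limb_bits: int = 64) -> list[int]:
    # Write n in base 2**limb_bits, least significant limb first.
    mask = (1 << limb_bits) - 1
    limbs = []
    while n:
        limbs.append(n & mask)
        n >>= limb_bits
    return limbs or [0]

print([hex(limb) for limb in to_limbs(2**130 + 5)])
# ['0x5', '0x0', '0x4']  ->  three 64-bit "digits" in base 2**64
```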
Just wanted to congratulate the author on a beautifully structured and penned article. It's not a topic that is directly related to my work, but I still enjoyed it and learned from it.
And I wanted to concur with this. A great description of a complicated topic. Just enough information in each section to get you ready for the next, nothing extra. Well done.
> because constant-time exponentiation is theoretically 2x slower
For uniform inputs, at least, the two should be quite close to the same speed, in theory, for sufficiently advanced implementations.
An intuition that helps see why this is so is that all but a vanishingly small fraction of uniform inputs have a size close to the maximum size. E.g. 1/2 are 1 bit smaller, 1/4 are 2 bits smaller... So inputs small enough to cut off even a single word will be rare enough that your benchmark will never see them.
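A quick empirical check of that intuition (hypothetical parameters: 2048-bit uniform exponents and 64-bit words):

```python
import random

bits, word, samples = 2048, 64, 1_000_000
# Count how many uniform 2048-bit values are short enough to skip a whole
# 64-bit word of the exponentiation loop, i.e. fit in 1984 bits or fewer.
short = sum(random.getrandbits(bits).bit_length() <= bits - word
            for _ in range(samples))
print(short / samples)  # the true probability is 2**-64, so this prints 0.0
```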
> Because of this, adding in timing noise can only act as a mitigation, and not a particularly effective one at that.
I think the author is assuming the random delay is small relative to the operation time (X << t). If you flip this around, you get the opposite conclusion. For instance:
Actual crypto operation: 10 µs
Random delay: 100-1000 ms
In this arrangement, the amount of time the actual cryptographic operation takes to complete falls under the noise floor.
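In code, the proposal looks roughly like this (a sketch; `do_crypto` is a hypothetical stand-in for the ~10 µs operation):

```python
import random
import time

def do_crypto(secret: bytes) -> bytes:
    # Hypothetical stand-in for the ~10 us cryptographic operation.
    return secret[::-1]

def guarded_operation(secret: bytes) -> bytes:
    result = do_crypto(secret)
    time.sleep(random.uniform(0.1, 1.0))   # 100-1000 ms of random padding
    return result
# Any single measurement is now dominated by the padding, which is the sense
# in which the operation's own timing "falls under the noise floor".
```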
Isn't this essentially a big part of why PBKDF2 and friends are good at hashing passwords?
To illustrate my point further, let's say that your operation takes either 0.1ms or 0.0ms.
If you add in a uniform random delay of 100-1000ms, your expected random delay becomes 550ms.
The expected total runtime is thus either 550.1ms or 550.0ms. If you can estimate this expected value with enough precision, for example by averaging over many measurements, you can completely recover the timing signal.
The amount of noise you inject can only make recovering the signal harder, not impossible. I also believe that the difficulty is only polynomial in the amount of noise, which limits the effectiveness of this strategy.
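A back-of-envelope sketch with the numbers from this example (my own assumption: the attacker simply averages repeated measurements):

```python
# A uniform 100-1000 ms delay has standard deviation (1000 - 100)/sqrt(12),
# roughly 260 ms.  Averaging n measurements shrinks the uncertainty of the
# mean to about 260/sqrt(n) ms, so the 0.1 ms signal emerges once that falls
# well below 0.1 ms.
sigma = (1000 - 100) / 12 ** 0.5   # ms, noise standard deviation
signal = 0.1                       # ms, the difference we want to detect
for k in (1, 5, 10):               # desired separation in standard errors
    n = (k * sigma / signal) ** 2
    print(f"{k} sigma separation needs roughly {n:.0e} measurements")
```

That is a lot of measurements, but the count grows only quadratically with the noise, which matches the "polynomial, not impossible" point above.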
The reason password hashes like PBKDF2 make the operation take longer is different, as far as I know. If you try to crack a password hash by hashing many common passwords until you find a match, the attack benefits from each hash being very cheap to compute. Password hashes intentionally make hashing more expensive, both in time and in memory consumption (in Argon2, for example), to limit the effectiveness of this attack.
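For reference, in Python that cost knob is the iteration count (the parameters here are illustrative, not a recommendation):

```python
import hashlib
import os

salt = os.urandom(16)                      # random salt defeats precomputed tables
password = b"correct horse battery staple"
# 600,000 iterations of HMAC-SHA-256: every guess an attacker makes has to
# pay the same cost, which is the whole point of a slow password hash.
digest = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000)
print(digest.hex())
```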
This is a good place to ask without a top-level comment: Do people really tend to do it that way? Whenever I've considered using sleep to mitigate timing attacks, it has been to sleep an amount of time that makes the resultant time constant. That is, to make the operation in this case always take, say, 450ms.
Sleeping just long enough to fill the remaining time-slot for an operation is actually a valid countermeasure, in theory.
The problem is figuring out how much sleep is necessary based on what variable time work has been done, and actually sleeping the right amount. This can be difficult with the granularity involved.
In practice, attempting to fill the remaining time is either not effective or requires a level of instrumentation that severely degrades performance.
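A minimal sketch of the fill-the-time-slot idea (the 450 ms budget is arbitrary; timer granularity, scheduler jitter, and budget overruns are exactly where this gets hard in practice):

```python
import time

def run_with_fixed_budget(work, budget_s: float = 0.45):
    # Do the variable-time work, then sleep out whatever is left of a fixed
    # time budget so callers always observe (roughly) the same duration.
    deadline = time.monotonic() + budget_s
    result = work()
    remaining = deadline - time.monotonic()
    if remaining > 0:
        time.sleep(remaining)  # sleep granularity and wake-up jitter still leak a little
    # If remaining <= 0 the budget was blown and the countermeasure has failed.
    return result

print(run_with_fixed_budget(lambda: sum(range(10_000))))
```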
> In this arrangement, the amount of time the actual cryptographic operation takes to complete falls under the noise floor.
And if you remember Shannon's law, you can send and receive data just fine below the noise floor.
The lower you go, the less each unit of additional noise harms you, since your bandwidth is roughly proportional to your signal-to-noise ratio (when it's below 1). When someone is trying to receive a short "signal" like a key, no amount of noise is going to help for long.
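(For the record, the Shannon-Hartley capacity is C = B * log2(1 + S/N); when S/N is well below 1 that is approximately B * (S/N) * log2(e), which is where the "roughly proportional" comes from.)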