This is very similar to Montgomery reduction [1]. Both do a multiply taking low bits followed by a multiply taking the high bits to reduce a number.
Montgomery reduction requires an augmented number system with conversions to and from. The article's approach works on numbers directly, but if I understand it correctly the multiplications are twice the width.
Are these two methods algebraically related? Would it be possible to get fastmod working without the double width?
Montgomery reduction requires an augmented number system with conversions to and from. The article's approach works on numbers directly, but if I understand it correctly the multiplications are twice the width.
Are these two methods algebraically related? Would it be possible to get fastmod working without the double width?
[1]: https://en.wikipedia.org/wiki/Montgomery_modular_multiplicat...