The matrices that make up the state space (A, B and C) are constant in S4, i.e. they don't depend on the input or on the position in the sequence. Because of that, the recurrence can be unrolled into a single convolution over the input, which can be parallelized.
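To make the convolution point concrete, here is a minimal sketch in plain Python. It assumes an already-discretized SSM with a diagonal state matrix and made-up values for `a`, `b`, `c` and `u`; the function names are mine, not from the S4 code. It only shows that the step-by-step recurrence and the convolution give the same output when the parameters are constant.

```python
# Toy, already-discretized SSM with a diagonal state matrix (an assumption
# made to keep the example short). All numbers are made up for illustration.

def ssm_recurrent(a, b, c, u):
    """Run x_k = a * x_{k-1} + b * u_k and y_k = c . x_k one step at a time."""
    x = [0.0] * len(a)
    ys = []
    for u_k in u:
        x = [a_i * x_i + b_i * u_k for a_i, b_i, x_i in zip(a, b, x)]
        ys.append(sum(c_i * x_i for c_i, x_i in zip(c, x)))
    return ys

def ssm_kernel(a, b, c, length):
    """Kernel K[j] = sum_i c_i * a_i**j * b_i. Only valid because a, b, c are constant."""
    return [
        sum(c_i * (a_i ** j) * b_i for a_i, b_i, c_i in zip(a, b, c))
        for j in range(length)
    ]

def causal_conv(kernel, u):
    """y_k = sum_{j=0..k} K[j] * u[k-j]. Each y_k can be computed independently."""
    return [
        sum(kernel[j] * u[k - j] for j in range(k + 1))
        for k in range(len(u))
    ]

a = [0.9, 0.5]    # diagonal of the state matrix
b = [1.0, 2.0]    # input projection
c = [0.3, -0.1]   # output projection
u = [1.0, 0.0, 2.0, -1.0, 0.5]

y_rec = ssm_recurrent(a, b, c, u)
y_conv = causal_conv(ssm_kernel(a, b, c, len(u)), u)
```

The direct convolution here is quadratic in the sequence length; it is only meant to show the equivalence, not to be fast.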
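The difference between S4 and Mamba is that the parameters are input-dependent in Mamba: B, C and the step size Δ are computed from the input, so the discretized matrices change at every position (A itself is still a learned constant, but it gets discretized with the input-dependent Δ). That breaks the convolution trick, so they add a hardware-aware CUDA kernel ("parallel scan") to keep it fast on a GPU even though the matrices are no longer constant.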
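Here is a rough sketch of why a scan still works when the coefficients change at every step. It is plain Python, not CUDA, with a scalar state, and the way `a` and `b` are derived from the input is a toy parameterization I made up (loosely in the spirit of a decay controlled by a step size), not Mamba's actual one. The idea is that each step h -> a_k * h + b_k is an affine map, and composing affine maps is associative, so the prefix results can be built up in log-many rounds where all updates within a round are independent.

```python
import math

def combine(left, right):
    """Compose two steps of the form h -> a * h + b; `left` is applied first."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)

def sequential_scan(a, b):
    """Reference: h_k = a_k * h_{k-1} + b_k with h starting at 0."""
    h = 0.0
    out = []
    for a_k, b_k in zip(a, b):
        h = a_k * h + b_k
        out.append(h)
    return out

def parallel_style_scan(a, b):
    """Hillis-Steele inclusive scan. Within each round, every position is
    updated independently of the others, which is what a GPU can exploit."""
    elems = list(zip(a, b))
    n = len(elems)
    step = 1
    while step < n:
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(n)
        ]
        step *= 2
    # With h starting at 0, the offset of the composed map is h_k.
    return [b_k for _, b_k in elems]

u = [1.0, -0.5, 2.0, 0.3, -1.2, 0.7]
# Toy input-dependent parameters (hypothetical, for illustration only).
delta = [math.log1p(math.exp(x)) for x in u]   # softplus keeps the step size positive
a = [math.exp(-d) for d in delta]              # per-step decay depends on the input
b = [d * x for d, x in zip(delta, u)]          # per-step input term

h_seq = sequential_scan(a, b)
h_par = parallel_style_scan(a, b)
```

A real kernel would use a work-efficient scan and care a lot about memory traffic; this version just checks that the associative formulation matches the sequential loop.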
The Annotated S4 is a good walkthrough of the S4 side: https://srush.github.io/annotated-s4/
Yannic Kilcher's video on Mamba might also be a good resource: https://youtu.be/9dSkvxS2EB0