
Actually, it's a little more nuanced:

Operation Type | Mamba Complexity | Transformer Complexity

Training (per iteration) | O(L) | O(L^2)

Autoregressive Inference (per step) | O(T) | O(L)

Memory Requirements | O(C) | O(L)

Where: L is the sequence length. T is a fixed constant that accounts for compression and selection time in Mamba's autoregressive inference, so the per-step cost does not grow with L. C is the fixed size of the SSM (State Space Model) latent state in Mamba, so memory does not grow with L either.
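To make the inference and memory rows concrete, here is a toy sketch in Python. It is not Mamba's selective scan or a real attention layer: `ssm_step`, `attention_step`, `run`, the `decay` value and the uniform weights are all made up for illustration. It only shows that a recurrent model carries a fixed-size state while an attention-style model carries a cache that grows with every token.

```python
def ssm_step(state, x, decay=0.9):
    """One recurrent step: fold the new input into a fixed-size state.

    Hypothetical update rule; a real SSM uses learned, input-dependent
    parameters, but the state size stays constant in the same way.
    """
    return [decay * s + (1.0 - decay) * x for s in state]


def attention_step(cache, x):
    """One attention-like step: append to the cache, then read every entry.

    Uniform averaging stands in for softmax-weighted attention; the point
    is that the work and the cache both scale with the tokens seen so far.
    """
    cache.append(x)
    return sum(cache) / len(cache)


def run(num_steps, state_size=4):
    """Record how much memory each approach holds after every step."""
    state = [0.0] * state_size
    cache = []
    ssm_memory, attn_memory = [], []
    for t in range(num_steps):
        x = float(t)
        state = ssm_step(state, x)
        attention_step(cache, x)
        ssm_memory.append(len(state))   # stays at state_size: O(C)
        attn_memory.append(len(cache))  # grows by one per token: O(L)
    return ssm_memory, attn_memory


if __name__ == "__main__":
    ssm_memory, attn_memory = run(5)
    print(ssm_memory)   # [4, 4, 4, 4, 4]
    print(attn_memory)  # [1, 2, 3, 4, 5]
```

The same picture explains the per-step cost: each `ssm_step` touches `state_size` numbers regardless of position, while each `attention_step` touches every cached entry.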

Per: https://github.com/state-spaces/mamba/issues/196





