Mamba processes a sequence step by step,
updating a hidden state that compresses
history into a fixed-size representation.
Key Equations
- x' = Ax + Bu (continuous)
- y = Cx + Du (output)
- B, C, delta = f(input) (selection: parameters depend on the current input)
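The equations above can be sketched as a step-by-step recurrence. This is a minimal NumPy illustration, not Mamba's actual implementation. It assumes a diagonal A per channel, a softplus to keep the step size delta positive, and the common discretization A_bar = exp(delta * A), B_bar = delta * B. The projection matrices W_B, W_C, and W_delta are hypothetical stand-ins for the learned function f(input).

```python
import numpy as np

def softplus(z):
    # Numerically stable softplus; keeps the step size delta positive.
    return np.logaddexp(0.0, z)

def selective_scan(u, A, W_B, W_C, W_delta, D):
    """Toy selective state-space recurrence (illustrative sketch only).

    u:       (T, d) input sequence
    A:       (d, n) diagonal state matrix per channel (negative for stability)
    W_B:     (d, n) hypothetical projection, B_t = u_t @ W_B
    W_C:     (d, n) hypothetical projection, C_t = u_t @ W_C
    W_delta: (d, d) hypothetical projection, delta_t = softplus(u_t @ W_delta)
    D:       (d,)   skip-connection weight
    """
    T, d = u.shape
    n = A.shape[1]
    x = np.zeros((d, n))   # hidden state: its size never depends on T
    y = np.empty((T, d))
    for t in range(T):
        # Selection: B, C, and delta are recomputed from the current input.
        B_t = u[t] @ W_B                          # (n,)
        C_t = u[t] @ W_C                          # (n,)
        delta_t = softplus(u[t] @ W_delta)        # (d,)
        # Discretize the continuous system x' = Ax + Bu with step delta_t.
        A_bar = np.exp(delta_t[:, None] * A)      # (d, n)
        B_bar = delta_t[:, None] * B_t[None, :]   # (d, n), Euler-style approximation
        # State update, then output y = Cx + Du.
        x = A_bar * x + B_bar * u[t][:, None]
        y[t] = x @ C_t + D * u[t]
    return y

rng = np.random.default_rng(0)
T, d, n = 6, 4, 3
u = rng.normal(size=(T, d))
A = -np.exp(rng.normal(size=(d, n)))   # negative entries, so the state decays
W_B = rng.normal(size=(d, n))
W_C = rng.normal(size=(d, n))
W_delta = rng.normal(size=(d, d))
D = np.ones(d)

y = selective_scan(u, A, W_B, W_C, W_delta, D)
print(y.shape)  # (6, 4)
```

A large delta_t pushes A_bar toward zero, so the state forgets its history and focuses on the current input. A small delta_t leaves the state nearly unchanged, so the input is mostly ignored. This is one way to read the selection mechanism as a gate.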
vs Transformers
- O(N) time in sequence length N, instead of O(N^2) for self-attention
- Fixed-size state at inference, not a cache that grows with context
- Selection enables context-aware gating
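The scaling difference can be made concrete with some rough per-layer arithmetic. The sketch below uses hypothetical sizes and simplified formulas: one d_model x d_state state for the SSM, one key and one value vector per token for attention. It ignores details such as Mamba's expanded inner dimension and multi-head layouts.

```python
def ssm_state_floats(d_model, d_state):
    # The recurrent state is d_model x d_state, whatever the sequence length.
    return d_model * d_state

def kv_cache_floats(seq_len, d_model):
    # Attention stores one key and one value vector per past token.
    return 2 * seq_len * d_model

def causal_attention_scores(seq_len):
    # Token t attends to tokens 1..t, so the total is N(N+1)/2, i.e. O(N^2).
    return seq_len * (seq_len + 1) // 2

d_model, d_state = 64, 16   # hypothetical sizes, for illustration only
for seq_len in (1_000, 10_000, 100_000):
    print(
        seq_len,
        ssm_state_floats(d_model, d_state),
        kv_cache_floats(seq_len, d_model),
        causal_attention_scores(seq_len),
    )
```

The second column stays constant while the third grows linearly and the fourth quadratically with sequence length.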
Interactions
- Click hidden states for details
- Click matrices A, B, C for formulas
- Watch selection mechanism