docs/source/en/model_doc/mamba.md
1 addition & 1 deletion
@@ -30,7 +30,7 @@ Tips:
 - Mamba is a new `state space model` architecture that rivals the classic Transformers. It is based on the line of progress on structured state space models, with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).
 - Mamba stacks `mixer` layers, which are the equivalent of `Attention` layers. The core logic of `mamba` is held in the `MambaMixer` class.
-- Two implementation cohabit: one is optimized and uses fast cuda kernels, while the other one is naive but can run on any device!
+- Two implementations cohabit: one is optimized and uses fast cuda kernels, while the other one is naive but can run on any device!
 - The current implementation leverages the original cuda kernels: the equivalent of flash attention for Mamba are hosted in the [`mamba-ssm`](https://github.com/state-spaces/mamba) and the [`causal_conv1d`](https://github.com/Dao-AILab/causal-conv1d) repositories. Make sure to install them if your hardware supports them!
 - Contributions to make the naive path faster are welcome 🤗
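
As context for the tips in the hunk above, here is a minimal usage sketch (not part of the diff). It assumes the `transformers` Mamba integration and the `state-spaces/mamba-130m-hf` checkpoint; the optimized path is selected automatically only if the optional `mamba-ssm` and `causal-conv1d` kernels are installed, otherwise the naive implementation runs on any device.

```python
# Minimal sketch: the fast CUDA path is used automatically when the optional
# kernels are installed (e.g. `pip install mamba-ssm causal-conv1d`);
# otherwise the naive implementation is used and runs on any device.
from transformers import AutoTokenizer, MambaForCausalLM

model_id = "state-spaces/mamba-130m-hf"  # example checkpoint, assumed here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = MambaForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hey how are you doing?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.batch_decode(outputs))
```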