Update docs/source/en/model_doc/mamba.md · huggingface/transformers@f963e38 · GitHub

Commit f963e38

Update docs/source/en/model_doc/mamba.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
1 parent 28e5ef0 commit f963e38

File tree: 1 file changed (+1 −1 lines)

docs/source/en/model_doc/mamba.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Tips:
 
 - Mamba is a new `state space model` architecture that rivals the classic Transformers. It is based on the line of progress on structured state space models, with an efficient hardware-aware design and implementation in the spirit of [FlashAttention](https://github.com/Dao-AILab/flash-attention).
 - Mamba stacks `mixer` layers, which are the equivalent of `Attention` layers. The core logic of `mamba` is held in the `MambaMixer` class.
-- Two implementation cohabit: one is optimized and uses fast cuda kernels, while the other one is naive but can run on any device!
+- Two implementations cohabit: one is optimized and uses fast cuda kernels, while the other one is naive but can run on any device!
 - The current implementation leverages the original cuda kernels: the equivalent of flash attention for Mamba are hosted in the [`mamba-ssm`](https://github.com/state-spaces/mamba) and the [`causal_conv1d`](https://github.com/Dao-AILab/causal-conv1d) repositories. Make sure to install them if your hardware supports them!
 - Contributions to make the naive path faster are welcome 🤗
 
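For readers landing on this commit, here is a minimal usage sketch of the behaviour the tips above describe: the same modeling code runs on any device, and the optimized CUDA kernels are picked up automatically when `mamba-ssm` and `causal_conv1d` are installed, falling back to the naive path otherwise. The checkpoint name below is an illustrative assumption and is not part of this commit.

```python
# Minimal sketch, not part of this commit. The checkpoint name is an
# illustrative assumption; any Mamba checkpoint on the Hub should work.
# With `mamba-ssm` and `causal_conv1d` installed on supported hardware,
# the fast CUDA kernels are used; otherwise the naive path runs anywhere.
from transformers import AutoTokenizer, MambaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")
model = MambaForCausalLM.from_pretrained("state-spaces/mamba-130m-hf")

# Tokenize a prompt and generate a short continuation.
inputs = tokenizer("Mamba is a new state space model architecture", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```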

0 commit comments
