llama : custom attention mask + parallel decoding + no context swaps#3228

Merged

ggerganov merged 57 commits into master from custom-attention-mask on Sep 28, 2023
Commits

Commits on Sep 21, 2023

Commits on Sep 27, 2023
