Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
mistral-7b-openorca.Q4_K_S.gguf works correctly, as it did before the BPE update.
Current Behavior
mistral-7b-openorca.Q4_K_S.gguf crashes main.exe after entering (and processing?) the prompt.
Additionally, I've merged that commit into my own chat project (a slightly rewritten main example); it generates, but crashes at the end of generation (an EOS issue?).
- Physical (or virtual) hardware you are using:
i5 3470 (AVX only).
- Operating System:
Windows 8.1
- Environment:
Compiled with w64devkit-fortran-1.20.0
Additionally, I've tested main.exe from the b1311 AVX release and got the same crash.
Failure Information (for bugs)
The crash message points to llama.cpp, line 7716: GGML_ASSERT(false);
Failure Logs
[1696334675] Log start
[1696334675] Cmd: main -t 3 -m F:/GGML/test/models/mistral_7b_openorca_Q4_K_S.gguf -p "system: complete the given task with precision, adding methodical explanations. user:" --temp 0.9 --repeat_penalty 1.133 --top-p 0.7 -r user: --interactive-first
[1696334675] main: build = 0 (unknown)
[1696334675] main: built with cc (GCC) 13.1.0 for x86_64-w64-mingw32
[1696334675] main: seed = 1696334675
[1696334675] main: llama backend init
[1696334675] main: load the model and apply lora adapter, if any
[1696334676] warming up the model with an empty run
[1696334677] n_ctx: 512
[1696334677]
[1696334677] system_info: n_threads = 3 / 4 | AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
[1696334677] add_bos: 1
[1696334677] tokenize the prompt
[1696334677] prompt: "system: complete the given task with precision, adding methodical explanations. user:"
[1696334677] tokens: [ '':1, ' system':1587, ':':28747, ' complete':4160, ' the':272, ' given':2078, ' task':3638, ' with':395, ' precision':16021, ',':28725, ' adding':8833, ' method':2038, 'ical':745, ' explan':10928, 'ations':697, '.':28723, ' user':2188, ':':28747 ]
[1696334677] recalculate the cached logits (check): embd_inp.empty() false, n_matching_session_tokens 0, embd_inp.size() 18, session_tokens.size() 0, embd_inp.size() 18
[1696334677] inp_pfx: [ '':1, ' ':28705, '':13, '':13, '###':27332, ' Inst':3133, 'ruction':3112, ':':28747, '':13, '':13 ]
[1696334677] inp_sfx: [ ' ':28705, '':13, '':13, '###':27332, ' Response':12107, ':':28747, '':13, '':13 ]
[1696334677] main: interactive mode on.
[1696334677] Reverse prompt: 'user:'
[1696334677] sampling: repeat_last_n = 64, repeat_penalty = 1.133000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.700000, typical_p = 1.000000, temp = 0.900000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
[1696334677] generate: n_ctx = 512, n_batch = 512, n_predict = -1, n_keep = 0
[1696334677]
[1696334677] == Running in interactive mode. ==
[1696334677] - Press Ctrl+C to interject at any time.
[1696334677] - Press Return to return control to LLaMa.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
[1696334677] embd_inp.size(): 18, n_consumed: 0
[1696334677] found antiprompt: ▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅ […long run of '▅' characters…] system: complete the given task with precision, adding methodical explanations. user:
[1696334677] eval: [ '':1, ' system':1587, ':':28747, ' complete':4160, ' the':272, ' given':2078, ' task':3638, ' with':395, ' precision':16021, ',':28725, ' adding':8833, ' method':2038, 'ical':745, ' explan':10928, 'ations':697, '.':28723, ' user':2188, ':':28747 ]
[1696334681] n_past = 18
[1696334681] embd_inp.size(): 18, n_consumed: 18
[1696334681] found antiprompt: ▅▅▅▅▅▅▅▅
53F3
▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅ […long run of '▅' characters…] system: complete the given task with precision, adding methodical explanations. user:
[1696334681] waiting for user input
[1696334689] buffer: 'Write a joke about llamas.
'
[1696334689] input tokens: [ ' Write':12018, ' a':264, ' joke':13015, ' about':684, ' llam':17620, 'as':293, '.':28723, '':13 ]
[1696334689] n_remain: -9
[1696334689] embd_inp.size(): 26, n_consumed: 18
[1696334689] eval: [ ' Write':12018, ' a':264, ' joke':13015, ' about':684, ' llam':17620, 'as':293, '.':28723, '':13 ]
[1696334691] n_past = 26
[1696334691] top 10 candidates: