Prerequisites
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Attempted to load "Phi-3-mini-4k-instruct-q4.gguf" with the latest (0.2.84) release. Expected the load to succeed.
Current Behavior
Load fails:

```
<snip>
llama_model_loader: - type q5_K: 32 tensors
llama_model_loader: - type q6_K: 17 tensors
llama_model_load: error loading model: error loading model hyperparameters: key not found in model: phi3.attention.sliding_window
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "C:\Users\riedgar\source\repos\guidance\llama_load_error.py", line 10, in <module>
    llama_model = llama_cpp.Llama(downloaded_file, logits_all=True, verbose=True, n_ctx=4096)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\riedgar\AppData\Local\miniconda3\envs\guidance-312\Lib\site-packages\llama_cpp\llama.py", line 372, in __init__
    _LlamaModel(
  File "C:\Users\riedgar\AppData\Local\miniconda3\envs\guidance-312\Lib\site-packages\llama_cpp\_internals.py", line 55, in __init__
    raise ValueError(f"Failed to load model from file: {path_model}")
ValueError: Failed to load model from file: C:\Users\riedgar\.cache\huggingface\hub\models--microsoft--Phi-3-mini-4k-instruct-gguf\snapshots\999f761fe19e26cf1a339a5ec5f9f201301cbb83\Phi-3-mini-4k-instruct-q4.gguf
Exception ignored in: <function Llama.__del__ at 0x0000029C3603D4E0>
Traceback (most recent call last):
  File "C:\Users\riedgar\AppData\Local\miniconda3\envs\guidance-312\Lib\site-packages\llama_cpp\llama.py", line 2089, in __del__
    if self._lora_adapter is not None:
       ^^^^^^^^^^^^^^^^^^
AttributeError: 'Llama' object has no attribute '_lora_adapter'
```
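The secondary `AttributeError` arises because `Llama.__del__` runs even when `__init__` raised before `_lora_adapter` was ever assigned. A minimal sketch of the pattern (the class and attribute here are illustrative, not the library's actual code), including the usual `getattr` guard that avoids it:

```python
class Loader:
    def __init__(self, path):
        # The attribute is only assigned after validation succeeds, so a
        # failed load leaves the instance without it when __del__ runs.
        if not path:
            raise ValueError(f"Failed to load model from file: {path}")
        self._lora_adapter = None

    def __del__(self):
        # getattr with a default guards against partially-initialized objects;
        # a bare self._lora_adapter access would raise AttributeError here.
        if getattr(self, "_lora_adapter", None) is not None:
            print("freeing adapter")


try:
    Loader("")  # __init__ raises; __del__ on the half-built object is safe
except ValueError:
    pass
```

With the plain `if self._lora_adapter is not None:` check from the traceback, the same scenario would emit "Exception ignored in ... `__del__`" instead.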
This was working with previous llama-cpp-python versions.
Environment and Context
- Physical (or virtual) hardware you are using, e.g. for Linux:
  i7-13800H (for Windows); GitHub-provided runner machines (for Linux)
- Operating System, e.g. for Linux:
  Windows and Linux
- SDK version, e.g. for Linux:
  Python 3.12.3
Failure Information (for bugs)
Steps to Reproduce
```python
import llama_cpp
from huggingface_hub import hf_hub_download

repo_id = "microsoft/Phi-3-mini-4k-instruct-gguf"
filename = "Phi-3-mini-4k-instruct-q4.gguf"

downloaded_file = hf_hub_download(repo_id=repo_id, filename=filename)
llama_model = llama_cpp.Llama(downloaded_file, logits_all=True, verbose=True, n_ctx=4096)
assert llama_model is not None
```
Running this produces the stack trace shown above.
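For context, the log shows llama.cpp treating `phi3.attention.sliding_window` as a required hyperparameter key, which older GGUF conversions of Phi-3 evidently do not contain. A minimal Python sketch of that required-key behavior (the helper function and metadata values are hypothetical; only the key names come from the log above):

```python
def require_key(metadata: dict, key: str):
    """Mimic llama.cpp's required-hyperparameter lookup: a missing key aborts the load."""
    if key not in metadata:
        raise ValueError(f"key not found in model: {key}")
    return metadata[key]


# Metadata as an older Phi-3 GGUF conversion might expose it (illustrative values).
old_gguf = {
    "phi3.context_length": 4096,
    "phi3.embedding_length": 3072,
}

try:
    require_key(old_gguf, "phi3.attention.sliding_window")
except ValueError as e:
    print(e)  # key not found in model: phi3.attention.sliding_window
```

If that is the cause, either the GGUF file on the Hub needs reconverting with the newer metadata key, or the loader needs to treat the key as optional for older files.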