memory : migrate from llama_kv_cache to more generic llama_memory #14006
Conversation
Force-pushed from fe4b1b3 to bca2671
Force-pushed from bca2671 to f149a8e
```cpp
// general concept of LLM memory
// the KV cache is a type of LLM memory, but there can be other types
struct llama_memory_i {
```
Changed this from `class` to `struct` to be compatible with the C-header declaration.
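The constraint can be illustrated in isolation. This is a hedged sketch, not the actual llama.cpp headers: C has no `class` keyword, so the public C header can only forward-declare the type with the `struct` tag, and defining it as `class` on the C++ side would mismatch that tag (which some compilers warn about, e.g. MSVC's C4099).

```cpp
// Sketch (simplified stand-in for the real headers): the C declaration and
// the C++ definition must agree on the `struct` tag.

// --- what a public C header would declare ---
#ifdef __cplusplus
extern "C" {
#endif
struct llama_memory_i;                          // forward declaration, valid C
typedef struct llama_memory_i * llama_memory_t; // opaque handle for the C API
#ifdef __cplusplus
}
#endif

// --- C++ side: the tag stays `struct` to match the declaration above ---
struct llama_memory_i {
    virtual ~llama_memory_i() = default;
};
```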
src/llama-context.cpp (outdated)
```diff
-    llama_kv_cache * kv_self = static_cast<llama_kv_cache *>(memory.get());
-    return kv_self;
+llama_memory_t llama_context::get_memory() const {
+    return static_cast<llama_memory_t>(memory.get());
```
This cast shouldn't be necessary.
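The reviewer's point can be demonstrated with a minimal stand-in (the type names below are simplified, not the real llama.cpp classes): a derived-to-base pointer conversion is implicit in C++, so writing `static_cast` for it is redundant.

```cpp
// simplified stand-ins for the real types
struct llama_memory_i {
    virtual ~llama_memory_i() = default;
};

struct llama_kv_cache : llama_memory_i {};

// derived* -> base* converts implicitly; no static_cast is needed here
llama_memory_i * as_memory(llama_kv_cache * kv) {
    return kv; // implicit upcast
}
```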
src/llama-context.cpp (outdated)
```diff
 llama_kv_cache * llama_get_kv_self(llama_context * ctx) {
-    return ctx->get_kv_self();
+    return static_cast<llama_kv_cache *>(ctx->get_memory());
```
I think this is not a safe cast, so it should be checked with `dynamic_cast`.
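The checked alternative looks roughly like this (a sketch with simplified stand-in types, not the actual implementation): unlike `static_cast`, `dynamic_cast` verifies at runtime that the memory object really is a KV cache and yields `nullptr` otherwise, which matters once other `llama_memory_i` implementations exist.

```cpp
struct llama_memory_i {
    // a virtual member makes the base polymorphic, which dynamic_cast requires
    virtual ~llama_memory_i() = default;
};

struct llama_kv_cache : llama_memory_i {};
struct other_memory   : llama_memory_i {}; // some hypothetical non-KV memory type

// checked downcast: returns nullptr if mem is not actually a llama_kv_cache
llama_kv_cache * as_kv_cache(llama_memory_i * mem) {
    return dynamic_cast<llama_kv_cache *>(mem);
}
```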
Commits:

- …ml-org#14006): memory : merge llama_kv_cache into llama_memory + new `llama_memory` API (ggml-ci); context : fix casts (ggml-ci)
- Revert "…mory (ggml-org#14006)". This reverts commit 7f37b6c.
cont #13988

- Merged `llama_kv_cache` into `llama_memory_i`
- `llama_kv_cache_unified` now implements `llama_memory_i`
- `llama_kv_cache_recurrent` now implements `llama_memory_i`
- New `llama_memory_` public API added to `libllama`
- The `llama_kv_self_*` public API is now simply routing to the new `llama_memory_` API and it will be deprecated in the next PR

TODO:

- `llama_memory_` public API

Next PRs:

- Deprecate the `llama_kv_self_*` public API in favor of the new `llama_memory_` API
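The migration described above can be sketched in miniature. All names and signatures below are illustrative assumptions rather than the actual `libllama` API: a generic memory interface, one concrete KV-cache implementation, and a legacy-style `kv_self` entry point that now just routes through the generic memory accessor.

```cpp
#include <memory>

// generic memory interface (sketch)
struct llama_memory_i {
    virtual ~llama_memory_i() = default;
    virtual void clear() = 0;
};

// one concrete memory type: a unified KV cache (sketch)
struct llama_kv_cache_unified : llama_memory_i {
    int n_tokens = 0;
    void clear() override { n_tokens = 0; }
};

// the context owns its memory through the generic interface
struct llama_context {
    std::unique_ptr<llama_memory_i> memory =
        std::make_unique<llama_kv_cache_unified>();

    llama_memory_i * get_memory() const { return memory.get(); }
};

// new-style entry point operating on the generic interface (sketch)
void llama_memory_clear(llama_memory_i * mem) { mem->clear(); }

// legacy-style entry point, now simply routing to the memory API
void llama_kv_self_clear(llama_context * ctx) {
    llama_memory_clear(ctx->get_memory());
}
```

The key design point is that callers of the legacy function keep working while the context no longer assumes its memory is a KV cache.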