Tags: Pints-AI/llama.cpp
finetune : add training data file to log message (ggml-org#4979)

This commit adds the name of the training data file to the log message printed when the training data is tokenized. The motivation for this change is that it can be useful to show which file is being tokenized when running the finetune example.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
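For context, the added log line amounts to something like the sketch below; the parameter name (fn_train_data) is an assumption, not the example's confirmed API:

```cpp
#include <cstdio>

// hedged sketch: report which file is about to be tokenized;
// the field name fn_train_data is an assumption, not confirmed here
static void log_tokenize_start(const char * fn_train_data) {
    printf("%s: tokenize training data from %s\n", __func__, fn_train_data);
}
```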
ggml : importance matrix support for legacy quants (ggml-org#4969)

* imatrix: adding support for legacy quants
* imatrix: guard Q4_0/Q5_0 against ffn_down craziness

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
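The commit does not spell out the mechanism, but the importance-matrix idea can be sketched as weighting the per-column quantization error by activation statistics, so that columns which contribute more to the matmul output are fit more carefully. The names and signature below are illustrative, not ggml's actual API:

```cpp
#include <cstddef>

// hedged sketch of the importance-matrix idea: when choosing quantized values
// for a row of weights, score candidates by an error that is weighted with
// per-column activation statistics (the "imatrix") instead of a plain RMSE.
static double imatrix_weighted_error(const float * x,        // original weights
                                     const float * q,        // dequantized candidate
                                     const float * imatrix,  // per-column importance
                                     size_t n) {
    double err = 0.0;
    for (size_t i = 0; i < n; ++i) {
        const double d = (double) x[i] - (double) q[i];
        err += (double) imatrix[i] * d * d; // important columns count more
    }
    return err;
}
```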
perplexity : fix kv cache handling for hellaswag (ggml-org#4981)

ggml-ci
metal : localized logic in `ggml_metal_graph_compute` (ggml-org#4924)

* Metal: Localized logic in `ggml_metal_graph_compute`, minor performance improvement
* Whitespace
* Collecting command buffer completions on single thread
* Whitespace
* Reduce diff noise
android : introduce starter project example (ggml-org#4926)

* Introduce starter project for Android, based on examples/llama.swiftui
* Add github workflow
* Set NDK version
* Only build arm64-v8a in CI
* Sync bench code
* Rename CI prop to skip-armeabi-v7a
* Remove unused tests
metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (ggml-org#4936)

* metal: Log `recommendedMaxWorkingSetSize` on iOS 16+
* Only log on iOS and macOS, ignoring tvOS and other platforms
* Check for Xcode version before using recommendedMaxWorkingSetSize

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
ggml : introduce GGML_CALL function annotation (ggml-org#4850)

This change makes it possible to build ggml-cuda.cu and ggml-metal.m as independent dynamic shared objects that may be conditionally linked at runtime in a multiplatform binary. It introduces a GGML_CALL annotation that documents which functions have a cyclic call relationship between the application code and the GPU modules.

This change does nothing unless the build defines -DGGML_MULTIPLATFORM, which causes back-references and function pointers to conform to the MS ABI, which is supported by NVCC, ROCm, Xcode, GCC and Clang across platforms.
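A minimal sketch of what such an annotation can look like, matching the behavior described above (a no-op by default, an MS-ABI attribute under -DGGML_MULTIPLATFORM); the exact definition should be checked against ggml.h:

```cpp
// hedged sketch of a GGML_CALL-style annotation: a no-op unless
// GGML_MULTIPLATFORM is defined, in which case cross-module calls are
// pinned to the MS ABI so GPU backends built as shared objects and the
// host application agree on one calling convention.
#ifdef GGML_MULTIPLATFORM
#    if defined(_WIN32)
#        define GGML_CALL          // MS ABI is already the native convention
#    else
#        define GGML_CALL __attribute__((__ms_abi__))
#    endif
#else
#    define GGML_CALL
#endif

// usage: annotate functions that cross the application/GPU-module boundary
// GGML_CALL void ggml_backend_example_fn(void);
```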
finetune : use LLAMA_FILE_MAGIC_GGLA (ggml-org#4961)

This commit replaces the magic number LLAMA_FILE_MAGIC_LORA used in finetune.cpp with LLAMA_FILE_MAGIC_GGLA defined in llama.h.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
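For reference, LLAMA_FILE_MAGIC_GGLA is the 'ggla' FourCC from llama.h; a hedged sketch of validating a checkpoint header against it follows (the reader function itself is illustrative, not finetune.cpp's):

```cpp
#include <cstdint>
#include <cstdio>

// from llama.h: the 'ggla' LoRA file magic
#define LLAMA_FILE_MAGIC_GGLA 0x67676c61u

// hedged sketch: check a checkpoint header against the named constant
// instead of a bare numeric literal
static bool check_lora_magic(FILE * fp) {
    uint32_t magic = 0;
    if (fread(&magic, sizeof(magic), 1, fp) != 1) {
        return false;
    }
    return magic == LLAMA_FILE_MAGIC_GGLA;
}
```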
speculative : threading options (ggml-org#4959)

* speculative: expose draft threading
* fix usage format
* accept -td and -tbd args
* speculative: revert default behavior when -td is unspecified
* fix trailing whitespace
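A hedged sketch of how the -td/-tbd flags might be wired up; the parameter names and the fall-back-to--t behavior are assumptions inferred from the change list above:

```cpp
#include <cstdlib>
#include <cstring>

// hedged sketch of draft-thread flag parsing; field names are assumptions
struct params_sketch {
    int n_threads             = 4;   // -t:   threads for the target model
    int n_threads_draft       = -1;  // -td:  threads for the draft model
    int n_threads_batch_draft = -1;  // -tbd: draft batch threads
};

static void parse_draft_threads(int argc, char ** argv, params_sketch & p) {
    for (int i = 1; i < argc; ++i) {
        if (strcmp(argv[i], "-td") == 0 && i + 1 < argc) {
            p.n_threads_draft = atoi(argv[++i]);
        } else if (strcmp(argv[i], "-tbd") == 0 && i + 1 < argc) {
            p.n_threads_batch_draft = atoi(argv[++i]);
        }
    }
    // when -td is unspecified, the draft model reuses the main thread count
    if (p.n_threads_draft <= 0) {
        p.n_threads_draft = p.n_threads;
    }
}
```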