Tags · Pints-AI/llama.cpp · GitHub

Tags: Pints-AI/llama.cpp

b1892

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
finetune : add training data file to log message (ggml-org#4979)

This commit adds the name of the training data file to the log message
printed when the training data is tokenized.

The motivation for this change is that it can be useful to show which
file is being tokenized when running the finetune example.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

b1891

ggml : importance matrix support for legacy quants (ggml-org#4969)

* imatrix: adding support for legacy quants

* imatrix: guard Q4_0/Q5_0 against ffn_down craziness

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

b1889

perplexity : fix kv cache handling for hellaswag (ggml-org#4981)

ggml-ci

b1887

metal : localized logic in `ggml_metal_graph_compute` (ggml-org#4924)

* Metal: Localized logic in `ggml_metal_graph_compute`, minor performance improvement

* Whitespace

* Collecting command buffer completions on single thread

* Whitespace

* Reduce diff noise

b1886

android : introduce starter project example (ggml-org#4926)

* Introduce starter project for Android

Based on examples/llama.swiftui.

* Add github workflow

* Set NDK version

* Only build arm64-v8a in CI

* Sync bench code

* Rename CI prop to skip-armeabi-v7a

* Remove unused tests

b1885

metal : replace loop of dispatch_async with dispatch_apply (ggml-org#4934)

* Replace loop of dispatch_async with dispatch_apply

* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

b1884

metal : log `recommendedMaxWorkingSetSize` on iOS 16+ (ggml-org#4936)

* metal: Log `recommendedMaxWorkingSetSize` on iOS 16+

* Only log on iOS and macOS, ignoring tvOS and other platforms

* Check for Xcode version before using recommendedMaxWorkingSetSize

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

b1882

ggml : introduce GGML_CALL function annotation (ggml-org#4850)

This change makes it possible to build ggml-cuda.cu and ggml-metal.m as
independent dynamic shared objects, that may be conditionally linked at
runtime in a multiplatform binary. It introduces a GGML_CALL annotation
that documents which functions have a cyclic call relationship, between
the application code and GPU modules.

This change does nothing unless the build defines -DGGML_MULTIPLATFORM,
which causes back-references and function pointers to conform to the MS ABI,
which is supported by NVCC, ROCm, XCode, GCC and Clang across platforms.
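
As a rough illustration of the annotation described above, the sketch below shows how a calling-convention macro of this kind could be defined and applied. It is not taken from the ggml sources: the exact preprocessor condition, and the names `app_callback_t`, `app_double` and `module_run`, are assumptions made up for this example. Only `GGML_CALL` and `GGML_MULTIPLATFORM` come from the commit message.

```cpp
// Assumed shape of the annotation: it expands to nothing unless the build
// defines GGML_MULTIPLATFORM, in which case annotated functions use the MS ABI.
// The extra compiler/architecture checks are an assumption of this sketch.
#if defined(GGML_MULTIPLATFORM) && !defined(_WIN32) && defined(__x86_64__) && \
    (defined(__GNUC__) || defined(__clang__))
#    define GGML_CALL __attribute__((__ms_abi__))
#else
#    define GGML_CALL
#endif

// Hypothetical callback type: a GPU module calls back into application code
// through a pointer that carries the same calling convention.
typedef int (GGML_CALL *app_callback_t)(int);

// Hypothetical application-side function that the module may call back.
GGML_CALL int app_double(int x) { return 2 * x; }

// Hypothetical module-side entry point that calls back through the pointer.
GGML_CALL int module_run(app_callback_t cb, int x) { return cb(x) + 1; }
```

With the macro empty (the default build), `module_run(app_double, 20)` is an ordinary call chain; the annotation only marks which functions sit on the boundary between application code and a separately linked module.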

b1881

finetune : use LLAMA_FILE_MAGIC_GGLA (ggml-org#4961)

This commit replaces the magic number LLAMA_FILE_MAGIC_LORA used in
finetune.cpp with LLAMA_FILE_MAGIC_GGLA defined in llama.h.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
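
For readers unfamiliar with file magics, here is a minimal sketch of how a shared magic constant is typically checked when opening a file. The helper `has_ggla_magic` is hypothetical and not part of llama.cpp, and the constant's value (chosen so that it spells 'ggla' in ASCII) is an assumption that should be confirmed against llama.h.

```cpp
#include <cstdint>
#include <cstdio>

// Assumption: the value spells 'ggla' in ASCII; confirm against llama.h.
#ifndef LLAMA_FILE_MAGIC_GGLA
#define LLAMA_FILE_MAGIC_GGLA 0x67676c61u
#endif

// Hypothetical helper: read the first 4 bytes of a file as a host-order
// 32-bit integer and compare them to the expected magic.
// Returns false if the file cannot be opened or is shorter than 4 bytes.
static bool has_ggla_magic(const char * path) {
    std::FILE * f = std::fopen(path, "rb");
    if (!f) {
        return false;
    }
    uint32_t magic = 0;
    const bool ok = std::fread(&magic, sizeof(magic), 1, f) == 1;
    std::fclose(f);
    return ok && magic == LLAMA_FILE_MAGIC_GGLA;
}
```

Referring to one named constant from a shared header, rather than repeating the number in each tool, keeps the writer and the reader of the file in agreement.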

b1880

speculative : threading options (ggml-org#4959)

* speculative: expose draft threading

* fix usage format

* accept -td and -tbd args

* speculative: revert default behavior when -td is unspecified

* fix trailing whitespace
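
The sketch below illustrates one way the `-td` and `-tbd` arguments named above could be parsed, with a fallback when they are not given. It is not the code from the speculative example: the struct, the function names and the fallback order (draft threads fall back to the main `-t` value, draft batch threads fall back to the draft value) are assumptions made for illustration.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical options struct; field names are illustrative only.
struct thread_opts {
    int n_threads             = 4;   // main model threads (-t)
    int n_threads_draft       = -1;  // draft model threads (-td); -1 = unspecified
    int n_threads_batch_draft = -1;  // draft batch threads (-tbd); -1 = unspecified
};

// Parses -t, -td and -tbd; returns false if a flag lacks a positive value.
static bool parse_thread_args(int argc, const char * const * argv, thread_opts & o) {
    for (int i = 1; i < argc; ++i) {
        int * dst = nullptr;
        if (std::strcmp(argv[i], "-t") == 0) {
            dst = &o.n_threads;
        } else if (std::strcmp(argv[i], "-td") == 0) {
            dst = &o.n_threads_draft;
        } else if (std::strcmp(argv[i], "-tbd") == 0) {
            dst = &o.n_threads_batch_draft;
        } else {
            continue; // ignore arguments this sketch does not handle
        }
        if (++i >= argc) {
            return false; // flag is missing its value
        }
        *dst = std::atoi(argv[i]);
        if (*dst <= 0) {
            return false;
        }
    }
    return true;
}

// Assumed fallback: an unspecified draft value inherits the main thread count.
static int draft_threads(const thread_opts & o) {
    return o.n_threads_draft > 0 ? o.n_threads_draft : o.n_threads;
}

// Assumed fallback: an unspecified draft batch value inherits the draft value.
static int draft_batch_threads(const thread_opts & o) {
    return o.n_threads_batch_draft > 0 ? o.n_threads_batch_draft : draft_threads(o);
}
```

Keeping "unspecified" as a sentinel, instead of copying the main value at parse time, lets the default behavior stay unchanged for users who never pass `-td`.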