[pull] main from abetlen:main #3
Forked from abetlen/llama-cpp-python. Status: Open.
Diff: +19,916 −5,313, viewing changes from 1 commit of 879.
Commits (250 of 879 shown):

df45a4b  fix: fix string value kv_overrides. Closes #1487 (abetlen)
91d05ab  fix: adjust kv_override member names to match llama.cpp (abetlen)
165b4dc  fix: Fix typo in Llama3VisionAlphaChatHandler. Closes #1488 (abetlen)
af3ed50  fix: Use numpy recarray for candidates data, fixes bug with temp < 0 (abetlen)
a6457ba  Merge branch 'main' of https://github.com/abetlen/llama-cpp-python in… (abetlen)
6b018e0  misc: Improve llava error messages (abetlen)
cd3f1bb  feat: Update llama.cpp (abetlen)
ae5682f  fix: Disable Windows+CUDA workaround when compiling for HIPBLAS (#1493) (Engininja2)
c3ef41b  chore: Bump version (abetlen)
951e39c  Merge branch 'main' of https://github.com/abetlen/llama-cpp-python in… (abetlen)
027f7bc  fix: Avoid duplicate special tokens in chat formats (#1439) (CISC)
6e0642c  fix: fix logprobs when BOS is not present (#1471) (a-ghorbani)
d634efc  feat: adding `rpc_servers` parameter to `Llama` class (#1477) (chraac)
255e1b4  feat: Update llama.cpp (abetlen)
83d6b26  feat: Update llama.cpp (abetlen)
1615eb9  feat: Update llama.cpp (abetlen)
86a38ad  chore: Bump version (abetlen)
e342161  feat: Update llama.cpp (abetlen)
dbcf64c  feat: Support SPM infill (#1492) (CISC)
320a5d7  feat: Add `.close()` method to `Llama` class to explicitly free model… (jkawamoto) [usage sketch after this list]
5af8163  chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.0 (#1522) (dependabot[bot])
9e396b3  feat: Update workflows and pre-built wheels (#1416) (Smartappli)
8401c6f  feat: Update llama.cpp (abetlen)
f4491c4  feat: Update llama.cpp (abetlen)
4c1d74c  fix: Make destructor to automatically call .close() method on Llama c… (abetlen)
554fd08  feat: Update llama.cpp (abetlen)
6c33190  chore: Bump version (abetlen)
d98a24a  docs: Remove references to deprecated opencl backend. Closes #1512 (abetlen)
5beec1a  feat: Update llama.cpp (abetlen)
27d5358  docs: Update readme examples to use newer Qwen2 model (#1544) (jncraton)
398fe81  chore(deps): bump docker/build-push-action from 5 to 6 (#1539) (dependabot[bot])
35c980e  chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.1 (#1527) (dependabot[bot])
04959f1  feat: Update llama_cpp.py bindings (abetlen)
117cbb2  feat: Update llama.cpp (abetlen)
bf5e0bb  fix(server): Update `embeddings=False` by default. Embeddings should … (abetlen)
73ddf29  fix(ci): Fix the CUDA workflow (#1551) (oobabooga)
c546c94  misc: Install shared libraries to lib subdirectory (abetlen)
92bad6e  Merge branch 'main' of https://github.com/abetlen/llama-cpp-python in… (abetlen)
139774b  fix: Update shared library rpath (abetlen)
d5f6a15  fix: force $ORIGIN rpath for shared library files (abetlen)
e51f200  fix: Fix installation location for shared libraries (abetlen)
73fe013  fix: Fix RPATH so it works on macos (abetlen)
dc20e8c  fix: Copy dependencies for windows (abetlen)
296304b  fix(server): Fix bug in FastAPI streaming response where dependency w… (abetlen)
bd5d17b  feat: Update llama.cpp (abetlen)
b4cc923  chore: Bump version (abetlen)
4fb6fc1  fix(ci): Use LLAMA_CUDA for cuda wheels (abetlen)
387d01d  fix(misc): Fix type errors (abetlen)
8992a1a  feat: Update llama.cpp (abetlen)
3a551eb  fix(ci): Update macos image (macos-11 is removed) (abetlen)
01bddd6  chore: Bump version (abetlen)
7e20e34  feat: Update llama.cpp (abetlen)
62804ee  feat: Update llama.cpp (abetlen)
157d913  fix: update token_to_piece (abetlen)
218d361  feat: Update llama.cpp (abetlen)
1a55417  fix: Update LLAMA_ flags to GGML_ flags (abetlen)
09a4f78  fix(ci): Update LLAMA_ flags to GGML_ (abetlen)
0481a3a  fix(docs): Update LLAMA_ flags to GGML_ flags (abetlen)
fccff80  fix(docs): Remove kompute backend references (abetlen)
276ea28  fix(misc): Update LLAMA_ flags to GGML_ (abetlen)
aaf4cbe  chore: Bump version (abetlen)
14760c6  chore(deps): bump pypa/cibuildwheel from 2.19.1 to 2.19.2 (#1568) (dependabot[bot])
e31f096  chore(deps): bump microsoft/setup-msbuild from 1.1 to 1.3 (#1569) (dependabot[bot])
b77e507  feat(ci): Dockerfile update base images and post-install cleanup (#1530) (Smartappli)
c1ae815  fix(misc): Format (abetlen)
08f2bb3  fix(minor): Minor ruff fixes (abetlen)
f7f4fa8  feat(ci): Update simple Dockerfile (#1459) (yentur)
7613d23  feat: Update llama.cpp (abetlen)
66d5cdd  fix(server): Use split_mode from model settings (#1594) (grider-withourai)
797f54c  fix(docs): Update README.md typo (#1589) (ericcurtin)
0700476  fix: Change repeat_penalty to 1.0 to match llama.cpp defaults (#1590) (ddh0)
3638f73  feat: Add 'required' literal to ChatCompletionToolChoiceOption (#1597) (mjschock)
f95057a  chore(deps): bump microsoft/setup-msbuild from 1.3 to 2 (#1585) (dependabot[bot])
5105f40  feat: Update llama.cpp (abetlen)
816d491  chore: Bump version (abetlen)
a14b49d  feat: Update llama.cpp (abetlen)
dccb148  feat: Update llama.cpp (abetlen)
9ed6b27  fix: Correcting run.sh filepath in Simple Docker implementation (#1626) (mashuk999)
4bf3b43  chore: Bump version (abetlen)
cffb4ec  feat: Update llama.cpp (abetlen)
53c6f32  feat: Update llama.cpp (abetlen)
0b1a8d8  feat: FreeBSD compatibility (#1635) (yurivict)
8297a0d  fix(docker): Update Dockerfile build options from `LLAMA_` to `GGML_`… (Smartappli)
ac02174  fix(docker): Fix GGML_CUDA param (#1633) (Smartappli)
8a12c9f  fix(docker): Update Dockerfile BLAS options (#1632) (Smartappli)
1f0b9a2  fix : Missing LoRA adapter after API change (#1630) (shamitv)
f7b9e6d  chore: Bump version (abetlen)
5575fed  fix: llama_grammar_accept_token arg order (#1649) (tc-wolf)
dff186c  feat: Ported back new grammar changes from C++ to Python implementati… (ExtReMLapin)
18f58fe  feat: Update llama.cpp (abetlen)
ce6466f  chore: Bump version (abetlen)
198f47d  feat(ci): Re-build wheel index automatically when releases are created (abetlen)
a07b337  feat: Update llama.cpp (abetlen)
9cad571  fix: Include all llama.cpp source files and subdirectories (abetlen)
8432116  chore: Bump version (abetlen)
e966f3b  feat: Add more detailed log for prefix-match (#1659) (xu-song)
131db40  chore(deps): bump pypa/cibuildwheel from 2.19.2 to 2.20.0 (#1657) (dependabot[bot])
5e39a85  feat: Enable recursive search of HFFS.ls when using `from_pretrained`… (benHeid) [see the from_pretrained sketch after this list]
c5de5d3  feat: Update llama.cpp (abetlen)
bfb42b7  Merge branch 'main' of github.com:abetlen/llama-cpp-python into main (abetlen)
0998ea0  fix: grammar prints on each call. Closes #1666 (abetlen)
7aaf701  fix: typo (abetlen)
45de9d5  feat: Update llama.cpp (abetlen)
4244151  feat: Update llama.cpp (abetlen)
95a1533  fix: Added back from_file method to LlamaGrammar (#1673) (ExtReMLapin)
9bab46f  fix: only print 'cache saved' in verbose mode (#1668) (lsorber)
8ed663b  feat: Update llama.cpp (abetlen)
fc19cc7  chore: Bump version (abetlen)
63d65ac  feat: Update llama.cpp (abetlen)
78e35c4  fix: missing dependencies for test (#1680) (jkawamoto)
3c7501b  fix: Llama.close didn't free lora adapter (#1679) (jkawamoto)
7bf07ec  feat: Update llama.cpp (abetlen)
658b244  Merge branch 'main' of github.com:abetlen/llama-cpp-python into main (abetlen)
a2ba731  feat: Update llama.cpp (abetlen)
d7328ef  chore: Bump version (abetlen)
a20f13f  feat: Update llama.cpp (abetlen)
259ee15  feat: Update llama.cpp (abetlen)
82ae7f9  feat: Update llama.cpp (abetlen)
f70df82  feat: Add MiniCPMv26 chat handler. (abetlen)
e251a0b  fix: Update name to MiniCPMv26ChatHandler (abetlen)
c68e7fb  fix: pull all gh releases for self-hosted python index (abetlen)
97d527e  feat: Add server chat_format minicpm-v-2.6 for MiniCPMv26ChatHandler (abetlen)
b570fd3  docs: Add project icon courtesy of 🤗 (abetlen)
cbbfad4  docs: center icon and resize (abetlen)
ad2deaf  docs: Add MiniCPM-V-2.6 to multi-modal model list (abetlen)
332720d  feat: Update llama.cpp (abetlen)
077ecb6  chore: Bump version (abetlen)
45001ac  misc(fix): Update CHANGELOG (abetlen)
4b1e364  docs: Update README (abetlen)
8b853c0  docs: Update README (abetlen)
9cba3b8  docs: Update README (abetlen)
d981d32  feat: Enable detokenizing special tokens with `special=True` (#1596) (benniekiss)
98eb092  fix: Use system message in og qwen format. Closes #1697 (abetlen)
dcb0d0c  feat: Update llama.cpp (abetlen)
9769e57  feat: Update llama.cpp (abetlen)
c3fc80a  feat: Update llama.cpp (abetlen)
9497bcd  feat: Update llama.cpp (abetlen)
c032fc6  feat: Update llama.cpp (abetlen)
e529940  feat(ci): Speed up CI workflows using `uv`, add support for CUDA 12.5… (Smartappli)
a4e1451  chore(deps): bump pypa/cibuildwheel from 2.20.0 to 2.21.1 (#1743) (dependabot[bot])
f8fcb3e  feat: Update sampling API for llama.cpp (#1742) (abetlen)
1e64664  feat: Update llama.cpp (abetlen)
9b64bb5  misc: Format (abetlen)
22cedad  fix: Fix memory allocation of ndarray (#1704) (xu-song)
29afcfd  fix: Don't store scores internally unless logits_all=True. Reduces me… (abetlen)
84c0920  feat: Add loading sharded GGUF files from HuggingFace with Llama.from… (Gnurro)
47d7a62  feat: Update llama.cpp (abetlen)
6c44a3f  feat: Add option to configure n_ubatch (abetlen)
49b1e73  docs: Add cuda 12.5 to README.md (#1750) (Smartappli)
1324c0c  chore(deps): bump actions/cache from 3 to 4 (#1751) (dependabot[bot])
4744551  feat: Update llama.cpp (abetlen)
926b414  feat: Update llama.cpp (abetlen)
b3dfb42  chore: Bump version (abetlen)
8e07db0  fix: install build dependency (abetlen)
65222bc  fix: install build dependency (abetlen)
9992c50  fix: Fix speculative decoding (abetlen)
11d9562  misc: Rename all_text to remaining_text (#1658) (xu-song)
e975dab  fix: Additional fixes for speculative decoding (abetlen)
dca0c9a  feat: Update llama.cpp (abetlen)
01c7607  feat: Expose libggml in internal APIs (#1761) (abetlen)
57e70bb  feat: Update llama.cpp (abetlen)
7c4aead  chore: Bump version (abetlen)
7403e00  feat: Update llama.cpp (abetlen)
e712cff  feat: Update llama.cpp (abetlen)
cafa33e  feat: Update llama.cpp (abetlen)
d1cb50b  Add missing ggml dependency (abetlen)
2796f4e  Add all missing ggml dependencies (abetlen)
7ecdd94  chore: Bump version (abetlen)
f3fb90b  feat: Update llama.cpp (abetlen)
7ba257e  feat: Update llama.cpp (abetlen)
9d06e36  fix(ci): Explicitly install arm64 python version (abetlen)
fb0b8fe  fix(ci): Explicitly set cmake osx architecture (abetlen)
72ed7b8  fix(ci): Explicitly test on arm64 macos runner (abetlen)
8988aaf  fix(ci): Use macos-14 runner (abetlen)
f11a781  fix(ci): Use macos-13 runner (abetlen)
9a09fc7  fix(ci): Debug print python system architecture (abetlen)
a412ba5  fix(ci): Update config (abetlen)
df05096  fix(ci): Install with regular pip (abetlen)
1cd3f2c  fix(ci): gg (abetlen)
b34f200  fix(ci): Use python3 (abetlen)
d8cc231  fix(ci): Use default architecture chosen by action (abetlen)
d5d5099  fix(ci): Update CMakeLists.txt for macos (abetlen)
4f17ae5  fix(ci): Remove cuda version 12.5.0 incompatibility with VS (#1838) (pabl-o-ce)
991d9cd  fix(ci): Remove CUDA 12.5 from index (abetlen)
2795303  chore(deps): bump pypa/cibuildwheel from 2.21.1 to 2.22.0 (#1844) (dependabot[bot])
2523472  fix: Fix pickling of Llama class by setting seed from _seed member. C… (abetlen)
d553a54  Merge branch 'main' of github.com:abetlen/llama-cpp-python into main (abetlen)
ddac04c  chore(deps): bump conda-incubator/setup-miniconda from 3.0.4 to 3.1.0… (dependabot[bot])
fa04cdc  fix logit-bias type hint (#1802) (ddh0)
38fbd29  docs: Remove ref to llama_eval in llama_cpp.py docs (#1819) (richdougherty)
4192210  fix: make content not required in ChatCompletionRequestAssistantMessa… (feloy)
77a12a3  fix: Re-add suport for CUDA 12.5, add CUDA 12.6 (#1775) (Smartappli)
073b7e4  fix: added missing exit_stack.close() to /v1/chat/completions (#1796) (Ian321)
9bd0c95  fix: Avoid thread starvation on many concurrent requests by making us… (gjpower)
1ea6154  fix(docs): Update development instructions (#1833) (Florents-Tselai)
d610477  fix(examples): Refactor Batching notebook to use new sampler chain AP… (lukestanley)
4f0ec65  fix: chat API logprobs format (#1788) (domdomegg)
df136cb  misc: Update development Makefile (abetlen)
6889429  Merge branch 'main' of github.com:abetlen/llama-cpp-python into main (abetlen)
b9b50e5  misc: Update run server command (abetlen)
5585f8a  feat: Update llama.cpp (abetlen)
61508c2  Add CUDA 12.5 and 12.6 to generated output wheels (abetlen)
a9fe0f8  chore: Bump version (abetlen)
ca80802  fix(ci): hotfix for wheels (abetlen)
002f583  chore: Bump version (abetlen)
ea4d86a  fix(ci): update macos runner image to non-deprecated version (abetlen)
afedfc8  fix: add missing await statements for async exit_stack handling (#1858) (gjpower)
801a73a  feat: Update llama.cpp (abetlen)
803924b  chore: Bump version (abetlen)
2bc1d97  feat: Update llama.cpp (abetlen)
c9dfad4  feat: Update llama.cpp (abetlen)
1d5f534  feat: Update llama.cpp (abetlen)
e8f14ce  fix: streaming resource lock (#1879) (gjpower)
0580cf2  chore: Bump version (abetlen)
80be68a  feat: Update llama.cpp (abetlen)
0b89fe4  feat: Update llama.cpp (abetlen)
14879c7  fix(ci): Fix the CUDA workflow (#1894) (oobabooga)
4442ff8  fix: error showing time spent in llama perf context print (#1898) (shakalaca)
710e19a  chore: Bump version (abetlen)
344c106  feat: Update llama.cpp (abetlen)
e232fae  feat: Update llama.cpp (abetlen)
37eb5f0  chore: Bump version (abetlen)
99f2ebf  feat: Update llama.cpp (abetlen)
4c6514d  feat: Update llama.cpp (abetlen)
cb2edb9  chore: Bump version (abetlen)
b1d23df  hotfix: Disable curl support (abetlen)
0d475d7  feat: Update llama.cpp (abetlen)
51dce74  misc: Fix support for new parameters, deprecate rpc_servers parameter (abetlen)
5a635f4  fix(minor): Fix type hint for older versions of python (abetlen)
0dec788  fix: Fix missing deprecated symbols on windows with missing LLAMA_API… (abetlen)
cd548bd  feat: Add support for new mtmd api, add Qwen2.5-VL chat handler (abetlen)
07a979f  fix: Use num_threads from llama model for mtmd (abetlen)
6f3f0bf  docs: Add Qwen2.5-VL to README (abetlen)
9770b84  chore: Bump version (abetlen)
9e5a4ea  fix: Update reference to in Llama.embed. Closes #2037 (abetlen)
ae54cde  fix(ci): Update cuda build action to use ubuntu 22.04 (abetlen)
083fcf6  fix(ci): Add git to package list (abetlen)
11d28df  fix(ci): Remove macos-13 builds to fix cross compilation error (abetlen)
1580839  chore: Bump version (abetlen)
82ad829  fix(ci): update runners for cpu builds (abetlen)
7011bc1  fix(ci): Update docker runner (abetlen)
b39e9d4  feat: Update llama.cpp (abetlen)
98fda8c  fix(ci): Temporarily disable windows cuda wheels (abetlen)
8866fbd  chore: Bump version (abetlen)
cce4887  fix(ci): Fix macos cpu builds (abetlen)
a99fd21  feat: Update llama.cpp (abetlen)
c8579d7  fix: Better chat format for Qwen2.5-VL (#2040) (alcoftTAO)
d9749cb  chore: Bump version (abetlen)
95292e3  feat: Update llama.cpp (abetlen)
e1af05f  chore: Bump version (abetlen)
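Several of the commits above change how a Llama instance releases its native resources: 320a5d7 adds an explicit .close() method, 4c1d74c makes the destructor call it automatically, and 3c7501b fixes .close() not freeing LoRA adapters. A minimal sketch of the explicit-free pattern, assuming a local GGUF model at a placeholder path:

from llama_cpp import Llama

# "./model.gguf" is a placeholder path, not a file from this PR.
llm = Llama(model_path="./model.gguf", n_ctx=2048, verbose=False)
try:
    out = llm("Q: What is the capital of France? A:", max_tokens=8)
    print(out["choices"][0]["text"])
finally:
    # Frees the underlying llama.cpp model and context deterministically
    # instead of waiting for the garbage collector.
    llm.close()

Since 4c1d74c the destructor also calls .close() on collection, so the explicit call mainly matters for long-lived processes that load and discard many models.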
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
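Two commits also touch the Llama.from_pretrained download path (5e39a85 adds recursive repo search, 84c0920 adds sharded GGUF loading). A hedged sketch of that path; the repo id and filename glob are illustrative (they mirror the Qwen2 README example referenced in 27d5358), and the optional huggingface_hub dependency must be installed:

from llama_cpp import Llama

# repo_id/filename are illustrative; filename is a glob matched against
# the GGUF files published in the Hugging Face repo.
llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen2-0.5B-Instruct-GGUF",
    filename="*q8_0.gguf",
    verbose=False,
)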
The commit whose changes are shown is d7328efabf841a750259dc42089efece13efc0f5 (chore: Bump version). It bumps the package version in llama_cpp/__init__.py:
@@ -1,4 +1,4 @@
 from .llama_cpp import *
 from .llama import *

-__version__ = "0.2.88"
+__version__ = "0.2.89"
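This __version__ string is what the installed package reports, so a quick sanity check after upgrading is:

import llama_cpp

# Prints the installed binding version; "0.2.89" corresponds to commit d7328ef.
print(llama_cpp.__version__)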