chore: bump llama_cpp_python to 0.3.6 #2368
Conversation
Latest llama-cpp-python has different CMAKE_ARGS for CUDA, ROCm, and MPS support. You have to update the README, other docs, and tests, too. Look for …
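For illustration, a hedged sketch of how the renamed flags might look under the GGML_* naming that the 0.3.x releases pick up from llama.cpp's build rework; the exact spellings (especially for ROCm) should be verified against the llama-cpp-python README for the pinned version:

```shell
# Sketch only: the old LLAMA_*-prefixed CMake options were renamed to GGML_*
# in the llama.cpp build rework vendored by llama-cpp-python 0.3.x.
# Flag spellings below are assumptions to check against the upstream README.

# CUDA (previously -DLLAMA_CUBLAS=on / -DLLAMA_CUDA=on)
CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama_cpp_python

# ROCm (previously -DLLAMA_HIPBLAS=on; newer llama.cpp trees may call this -DGGML_HIP=on)
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install --force-reinstall --no-cache-dir llama_cpp_python

# Metal / MPS on Apple silicon (previously -DLLAMA_METAL=on)
CMAKE_ARGS="-DGGML_METAL=on" pip install --force-reinstall --no-cache-dir llama_cpp_python
```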
The tests with reduced context size …
@Ian321 could you make changes to the tests that fail due to the reduced context size?
@alimaredia what kind of changes do you have in mind? This worked as expected with previous versions of llama-cpp-python, and now there seems to be a regression. I tested llama.cpp directly with a reduced ctx and it did not crash, so I'm planning on fixing it in llama-cpp-python and then bumping this PR (to hopefully 0.3.2).
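A rough sketch of that kind of direct check against llama.cpp (the model path, context size, and prompt are placeholders, not the values from the failing test):

```shell
# Hypothetical reproduction: run llama.cpp's CLI directly with a small context
# window to see whether a crash comes from llama.cpp itself or from the
# llama-cpp-python bindings. Paths and prompt are placeholders.
./llama-cli \
    -m ./models/merlinite-7b-lab-Q4_K_M.gguf \
    -c 512 \
    -n 64 \
    -p "Say hello."
```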
We just need CI to be passing here to ensure there are no regressions, since this is a non-trivial bump.
If you look at the failure here: https://github.com/instructlab/instructlab/actions/runs/11153487993/job/31003244761?pr=2368#step:15:338, some of the tests in https://github.com/instructlab/instructlab/blob/main/scripts/functional-tests.sh are failing. Those tests have to pass for every PR to merge, so we'd expect this PR to include changes to the tests in order to do the version bump. Removal of certain functional tests is on the table if properly justified.
Our downstream build pipeline is now configured to handle llama_cpp_python 0.2.75 and 0.3.1.
@alimaredia the tests fail because there is a bug, abetlen/llama-cpp-python#1759, for which I have provided a fix, abetlen/llama-cpp-python#1796. We just have to wait for the next release (where it's hopefully merged or fixed some other way). @tiran if you mean the changes to the pipeline, I only see some general cleanup and nothing that would affect this PR directly. The test that caught this should not be modified or removed, as it's what caught the above-mentioned bug and helped me submit a PR for it. The only thing I could recommend is to add a timeout to …. I will still rebase it, just in case I missed something.
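A minimal sketch of what such a timeout could look like in the functional tests (the ilab invocation and the 300-second limit are illustrative guesses, not the actual script contents):

```shell
# Illustrative only: wrap the step that can hang in coreutils `timeout` so CI
# fails fast instead of stalling. The command and the limit are assumptions.
if ! timeout 300 ilab model chat -qq "Hello"; then
    echo "chat did not complete within 5 minutes" >&2
    exit 1
fi
```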
My update regarding the downstream pipeline was for @nathan-weinberg and @alimaredia. We have an internal build pipeline that rebuilds all Python wheels from source. Some packages like llama-cpp-python need extra configuration to build correctly. llama-cpp-python 0.3 has deprecated some options and introduced new build flags. Our internal builds are now able to handle >=0.2.75 and 0.3.x.
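As a rough sketch, such a pipeline might branch on the requested version to pick the matching flag names (the version variable and the pre-0.3 flag spelling are assumptions, not the real pipeline configuration):

```shell
# Sketch of a version-aware rebuild step; LLAMA_CPP_PYTHON_VERSION and the
# pre-0.3 flag spelling are assumptions to adjust to the actual pipeline.
version="${LLAMA_CPP_PYTHON_VERSION:-0.3.6}"
case "${version}" in
    0.2.*) export CMAKE_ARGS="-DLLAMA_CUDA=on" ;;  # pre-rename option
    *)     export CMAKE_ARGS="-DGGML_CUDA=on"  ;;  # renamed GGML_* option in 0.3.x
esac
pip wheel --no-deps --no-binary :all: "llama_cpp_python==${version}"
```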
This pull request has merge conflicts that must be resolved before it can be merged.
@Ian321 I've set this for our 0.24.0 milestone. Since we were able to bump to 0.3.2 in ilab 0.23.0, we're hoping to follow up with this shortly after the release! cc @fabiendupont
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Ignaz Kraft <ignaz.k@live.de>
Nice solution, @Ian321. Much shorter than I thought.
InstructLab had been using an outdated version of llama_cpp_python that did not support models such as Mistral NeMo. This PR simply bumps that dependency to the latest version and updates the pipelines and documentation to use the new build flags.

Checklist:
- Commit messages follow conventional commits.
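After the bump lands, a quick sanity check that the installed binding matches the new pin might look like this (a generic sketch, not part of this PR):

```shell
# Sanity-check sketch: confirm which llama-cpp-python binding is installed.
pip show llama_cpp_python | grep '^Version'
python -c "import llama_cpp; print(llama_cpp.__version__)"
```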