llama : bump max layers from 256 to 512 by ggerganov · Pull Request #8530 · ggml-org/llama.cpp
llama : bump max layers from 256 to 512 #8530


Merged: 2 commits into master on Jul 19, 2024

Conversation

ggerganov (Member)

fix #8528
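
The headline change is a one-line bump of the compile-time layer cap that sizes the fixed per-layer hyperparameter arrays (a sketch; the surrounding file is not shown in this thread):

```cpp
// The cap that sizes the fixed-size per-layer hparams arrays.
#define LLAMA_MAX_LAYERS  512 // previously 256
```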

@ggerganov force-pushed the gg/bump-max-layers branch from fac0355 to b268edf on July 17, 2024 07:04
slaren (Member) commented Jul 17, 2024

I would suggest moving the assert so that it applies only when the key is actually an array. That way it wouldn't affect other models. The assert should also be replaced with an exception to avoid crashing on invalid models. Ideally it would be a dynamic container instead of a fixed array, but that requires more changes.
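
For illustration, a minimal sketch of the suggestion, assuming a hypothetical loader helper (`set_per_layer_u32` and its signature are invented here; only `LLAMA_MAX_LAYERS` comes from the PR): the bounds check runs only for array-valued keys, and a `std::runtime_error` replaces the assert so an invalid model reports an error instead of crashing.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

constexpr uint32_t LLAMA_MAX_LAYERS = 512; // bumped from 256 by this PR

// Hypothetical helper, not the actual llama.cpp loader code.
void set_per_layer_u32(
        std::array<uint32_t, LLAMA_MAX_LAYERS> & dst, // fixed-size hparams array
        const std::vector<uint32_t> & values,         // parsed from the GGUF key
        uint32_t n_layer,
        const std::string & key) {
    if (values.size() == 1) {
        // scalar key: broadcast to all layers, no bounds check in this branch
        dst.fill(values[0]);
        return;
    }
    // array-valued key: the only case that can overflow the fixed array,
    // so validate here and throw instead of asserting
    if (n_layer > LLAMA_MAX_LAYERS || values.size() < n_layer) {
        throw std::runtime_error(
            "invalid model: key '" + key + "' has " +
            std::to_string(values.size()) + " values for " +
            std::to_string(n_layer) + " layers (max " +
            std::to_string(LLAMA_MAX_LAYERS) + ")");
    }
    for (uint32_t i = 0; i < n_layer; ++i) {
        dst[i] = values[i];
    }
}
```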

ggerganov (Member, Author)

> I would suggest moving the assert so that it applies only when the key is actually an array. That way it wouldn't affect other models.

Wouldn't this loop potentially overflow the fixed array if we check only the array-valued keys?

https://github.com/ggerganov/llama.cpp/blob/b268edf87c3bb996feb88e2d8b2c85b38bff6f36/src/llama.cpp#L4061-L4063
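
The linked lines are not reproduced in the thread, but the concern can be sketched with approximate names (`n_head_arr` as the fixed per-layer array, `broadcast_scalar` invented for illustration): if the layer count is validated only for array-valued keys, a scalar key broadcast across `n_layer` entries can still write past the end of the array.

```cpp
#include <cstdint>

constexpr uint32_t LLAMA_MAX_LAYERS = 512;

// Sketch of the overflow concern: broadcasting a scalar value across all
// layers of a fixed-size array. Nothing clamps n_layer here, so a model
// with more than LLAMA_MAX_LAYERS layers makes the loop write out of
// bounds even though no array-valued key was involved.
void broadcast_scalar(uint32_t (&n_head_arr)[LLAMA_MAX_LAYERS],
                      uint32_t n_layer, uint32_t scalar_value) {
    for (uint32_t i = 0; i < n_layer; ++i) {
        n_head_arr[i] = scalar_value; // OOB once i >= LLAMA_MAX_LAYERS
    }
}
```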

slaren (Member) commented Jul 17, 2024

Yes, I assumed that it would only set the first element in that case. From what I can tell, only OpenELM ever uses more than the first element.
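
Under that assumption the scalar path cannot overflow, so only the array path needs validation. A hypothetical variant along those lines (names reused from the sketches above, not the merged code):

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr uint32_t LLAMA_MAX_LAYERS = 512;

// Scalar keys write only the first element, which is always in bounds;
// only the array-valued form (used by e.g. OpenELM for per-layer head
// counts) has to fit within LLAMA_MAX_LAYERS.
void set_per_layer_first_only(uint32_t (&n_head_arr)[LLAMA_MAX_LAYERS],
                              const std::vector<uint32_t> & values) {
    if (values.size() == 1) {
        n_head_arr[0] = values[0]; // scalar: first element only
        return;
    }
    if (values.size() > LLAMA_MAX_LAYERS) {
        throw std::runtime_error("per-layer array exceeds LLAMA_MAX_LAYERS");
    }
    std::copy(values.begin(), values.end(), n_head_arr);
}
```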

@ggerganov requested a review from slaren on July 17, 2024 07:50
@ggerganov merged commit d197545 into master on Jul 19, 2024 (54 checks passed)
@ggerganov deleted the gg/bump-max-layers branch on July 19, 2024 13:50
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 27, 2024
* llama : bump max layers from 256 to 512

* llama : replace asserts with exceptions
Successfully merging this pull request may close these issues:

* Bug: Can't quantize 405B Mega merge (#8528)