Draft: Merge LoRA Adapters with AWQ BaseModels #2418
base: main
Conversation
Thanks for adding merging capabilities to AWQ. I only skimmed the PR so far, but could you please:
- Also implement the `unmerge` method? It should be very similar to the `merge` method, but remove the delta weight (see the sketch after this list for the symmetry between the two).
- There should be a unit test to ensure that merging works, e.g. similar to this test (without DoRA).
- Let's run `make style` on your changes.
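For reference, here is a minimal, self-contained sketch (plain torch, no AWQ specifics) of the symmetry being asked for: `merge` adds the LoRA delta weight to the base weight, and `unmerge` subtracts that same delta again. The tensor names and shapes are illustrative only.

```python
import torch

out_features, in_features, r = 8, 16, 4
scaling = 0.5

base_weight = torch.randn(out_features, in_features)
lora_A = torch.randn(r, in_features)
lora_B = torch.randn(out_features, r)

delta = scaling * (lora_B @ lora_A)   # the adapter's delta weight

merged = base_weight + delta          # what merge() produces
restored = merged - delta             # what unmerge() should give back

assert torch.allclose(restored, base_weight, atol=1e-6)
```

For the actual AwqLoraLinear layer, both operations would additionally round-trip the base weight through dequantization and requantization with the stored scales and zeros, so `unmerge` can only recover the pre-merge weights up to requantization error.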
@BenjaminBossan Thanks for looking into it already! Your three points are on my agenda, I will give you a ping when I commit the changes.
Great, thanks a lot.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
@Whadup Do you still plan on working on this?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
It's not quite clear to me, but it appears like AutoAWQ will be integrated into llm-compressor.
This PR extends the AwqLoraLinear class to allow merging LoRA adapters into the quantized base weights.
Instead of re-quantizing the whole model, we reuse the original quantization scales and zeros.
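As a rough sketch of that idea (not the PR's actual implementation): dequantize the AWQ base weight with the stored group-wise scales and zero points, add the LoRA delta weight, and quantize the result again with those same parameters instead of re-running AWQ calibration. The helper functions and the simple affine group-wise scheme below are illustrative assumptions; AWQ's packed int4 storage format is not reproduced.

```python
import torch

def dequantize_with_fixed_params(qweight, scales, zeros, group_size=128):
    """Recover a float weight from group-wise quantized values."""
    out_f, in_f = qweight.shape
    q = qweight.reshape(out_f, in_f // group_size, group_size)
    w = (q - zeros.unsqueeze(-1)) * scales.unsqueeze(-1)
    return w.reshape(out_f, in_f)

def quantize_with_fixed_params(weight, scales, zeros, group_size=128, bits=4):
    """Group-wise affine quantization that reuses precomputed scales/zeros."""
    out_f, in_f = weight.shape
    w = weight.reshape(out_f, in_f // group_size, group_size)
    q = torch.round(w / scales.unsqueeze(-1)) + zeros.unsqueeze(-1)
    return torch.clamp(q, 0, 2**bits - 1).reshape(out_f, in_f)

# Toy layer: 8 output features, 256 input features, group size 128.
out_f, in_f, r, group_size = 8, 256, 4, 128
scales = torch.rand(out_f, in_f // group_size) + 0.01   # original quantization scales
zeros = torch.full((out_f, in_f // group_size), 8.0)    # original zero points

base_fp = torch.randn(out_f, in_f)
qweight = quantize_with_fixed_params(base_fp, scales, zeros, group_size)

# LoRA adapter and its delta weight.
lora_A, lora_B, scaling = torch.randn(r, in_f), torch.randn(out_f, r), 0.5
delta = scaling * (lora_B @ lora_A)

# Merge: dequantize, add the delta, requantize with the *same* scales/zeros.
merged_fp = dequantize_with_fixed_params(qweight, scales, zeros, group_size) + delta
merged_q = quantize_with_fixed_params(merged_fp, scales, zeros, group_size)
```

Reusing the original scales and zeros keeps merging cheap, but any part of the LoRA delta that falls outside the representable range of the fixed quantization grid is clipped, so the merged layer is an approximation of base plus adapter rather than an exact combination.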