Add lora+ implementation by kallewoof · Pull Request #1915 · huggingface/peft · GitHub

Add lora+ implementation #1915

Merged · 34 commits merged into huggingface:main on Jul 29, 2024

Conversation

@kallewoof (Contributor) commented Jul 8, 2024

LoRA+

Builds on #1509.
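
For context, LoRA+ (arXiv:2402.12354) trains the LoRA B matrices at a higher learning rate than the A matrices, controlled by a ratio. Below is a minimal usage sketch based on the create_loraplus_optimizer signature discussed later in this thread; the peft.optimizers import path and the example hyperparameters are assumptions, not part of this PR's review.

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_loraplus_optimizer  # import path assumed

# Attach LoRA adapters to a small causal LM (any model works here).
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
model = get_peft_model(base_model, LoraConfig(task_type="CAUSAL_LM"))

# Build an optimizer whose LoRA B parameters use lr * loraplus_lr_ratio.
optimizer = create_loraplus_optimizer(
    model=model,
    optimizer_cls=torch.optim.AdamW,
    lr=2e-4,               # base learning rate (used for the LoRA A matrices)
    loraplus_lr_ratio=16,  # B matrices are trained at 16x the base rate
)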

@kallewoof changed the title from "202407 loraplus" to "Add lora+ implementation" on Jul 8, 2024
@kallewoof force-pushed the 202407-loraplus branch 2 times, most recently from 64dc23c to 6f42ed4 on July 9, 2024 at 14:43
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan (Member)

@kallewoof Thanks for pushing this PR forward. By now, we have waited long enough that I feel it's fair to continue with this PR.

I haven't done a full review yet, but it seems that tests are failing. Could you please take a look?

@kallewoof (Contributor, Author)

@BenjaminBossan Thanks. I think I addressed the errors.

@BenjaminBossan (Member)

We get an error in Python 3.8 as we're missing a from __future__ import annotations.
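
For reference, the one-line fix, needed on Python 3.8 because the helper's signature uses built-in generics such as type[Optimizer], which only become subscriptable at runtime in Python 3.9:

from __future__ import annotations  # must come right after the module docstring/license header, before other imports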

@kallewoof (Contributor, Author)

Thanks, fixed. Also added a license header.

@BenjaminBossan (Member) left a comment

Thanks for the updates. No in-depth review yet, but I found a couple of issues that should be quick to address.

@kallewoof (Contributor, Author)

Failing tests appear unrelated to the PR.

@BenjaminBossan (Member) left a comment

Thanks for continuing the work on LoRA+. I found a few areas for improvement, but overall it looks good already. Tested it on a small example and results were slightly improved.

@BenjaminBossan (Member)

@kallewoof LMK when this is ready for another review.

@kallewoof (Contributor, Author)

@BenjaminBossan Should be ready! Sorry if I missed anything.

@BenjaminBossan (Member) left a comment

Fantastic, I think we're almost done. I only have two small comments left, the rest looks good.

@kallewoof force-pushed the 202407-loraplus branch 3 times, most recently from fa6d661 to f983257 on July 18, 2024 at 10:27
@BenjaminBossan (Member)

@kallewoof Please ping me once this is ready for review. Otherwise, I don't know if you're still working on some changes or not :)

@kallewoof (Contributor, Author)

@BenjaminBossan Ping! Sorry about all the force-pushes.

@kallewoof (Contributor, Author)

@BenjaminBossan Sorry, I thought about this overnight and I think you're right that weight_decay should be popped. If someone provides a weight decay meant for something other than LoRA+ then it will be picked up by both, which is most likely undesired.

@kallewoof (Contributor, Author) commented Jul 22, 2024

Playing around with this, I noticed that LoraPlusConfig is actually not used anywhere. I think the code is missing a part where you create a LoraConfig with a LoRA+ feature and it initializes the optimizer and so on.

@BenjaminBossan (Member) commented Jul 22, 2024

> Sorry about all the force-pushes.

Not a big deal for this PR, but especially on bigger ones it's better to avoid them to make reviews easier. Note that there is no need to clean up the git history, if that was what you were going for, as we squash before merging.

> Sorry, I thought about this overnight and I think you're right that weight_decay should be popped. If someone provides a weight decay meant for something other than LoRA+ then it will be picked up by both, which is most likely undesired.

What do you mean by "both"?

> Playing around with this, I noticed that LoraPlusConfig is actually not used anywhere.

Hmm, yes, you're right. How about removing it completely then? I guess an argument could be made that something like this API could be useful:

from peft import LoraPlusConfig

optimizer_config = LoraPlusConfig(...)
optimizer = create_loraplus_optimizer(model, optimizer_config)

to make it easier to share the config settings, but IMO the value is very marginal.

@kallewoof (Contributor, Author) commented Jul 22, 2024

>> Sorry about all the force-pushes.

> Not a big deal for this PR, but especially on bigger ones it's better to avoid them to make reviews easier. Note that there is no need to clean up the git history, if that was what you were going for, as we squash before merging.

Right. It's a very ingrained habit from other projects where they don't squash.

>> Sorry, I thought about this overnight and I think you're right that weight_decay should be popped. If someone provides a weight decay meant for something other than LoRA+ then it will be picked up by both, which is most likely undesired.

> What do you mean by "both"?

Imagine a FutureOptimizerX that has its own fancy weight_decay parameter, and now imagine someone creating a LoRA+ optimizer with it as the optimizer_cls:

def create_loraplus_optimizer(
    model: PeftModel, optimizer_cls: type[Optimizer], *, lr: float, loraplus_lr_ratio: float, **kwargs
) -> Optimizer:

The LoRA+ optimizer picks out and uses the passed-in weight decay value in its setup. Then, if we don't pop it,

    optimizer = optimizer_cls(optimizer_grouped_parameters, **kwargs)

the FutureOptimizerX optimizer will now also use our weight decay param. It is unclear which optimizer the user meant the weight decay for, but it is unlikely that they intended both FutureOptimizerX and the LoRA+ parameter groups to end up with the same value.
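
To make the concern concrete, here is a minimal sketch, not the PR's exact implementation (the lora_B name check is an assumption about PEFT's parameter naming): weight_decay is consumed when building the parameter groups and should not also reach the wrapped optimizer's constructor.

from __future__ import annotations

from torch.optim import Optimizer

def create_loraplus_optimizer_sketch(
    model, optimizer_cls: type[Optimizer], *, lr: float, loraplus_lr_ratio: float, **kwargs
) -> Optimizer:
    # Pop, don't just read: the value is applied per parameter group below and
    # must not also be forwarded to optimizer_cls(...) as a global default.
    weight_decay = kwargs.pop("weight_decay", 0.0)

    lora_a, lora_b = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        (lora_b if "lora_B" in name else lora_a).append(param)

    optimizer_grouped_parameters = [
        {"params": lora_a, "lr": lr, "weight_decay": weight_decay},
        {"params": lora_b, "lr": lr * loraplus_lr_ratio, "weight_decay": weight_decay},
    ]
    # kwargs no longer contains weight_decay, so FutureOptimizerX only sees the
    # remaining pass-through options (betas, eps, ...).
    return optimizer_cls(optimizer_grouped_parameters, **kwargs)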

Maybe I'm overcomplicating this?

>> Playing around with this, I noticed that LoraPlusConfig is actually not used anywhere.

> Hmm, yes, you're right. How about removing it completely then? I guess an argument could be made that something like this API could be useful:
>
> from peft import LoraPlusConfig
>
> optimizer_config = LoraPlusConfig(...)
> optimizer = create_loraplus_optimizer(model, optimizer_config)
>
> to make it easier to share the config settings, but IMO the value is very marginal.

I think the cleanest approach is to remove it in this PR and then make a follow-up where we make it easier to use, if necessary.

Ultimately, without some tweaks to transformers, I don't think we can do this automatically, e.g. from get_peft_model, because we need access to the Trainer.

Edit: We should probably at least add an example to the docs on how to use it. Let me look into that.
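
As a starting point for such a docs example, here is a hedged sketch of how the optimizer can already be handed to a transformers Trainer today via its existing optimizers=(optimizer, lr_scheduler) argument; model, optimizer, and train_dataset are assumed to exist from the earlier setup:

from transformers import Trainer, TrainingArguments

# `model` is assumed to be a PeftModel with LoRA adapters attached, `optimizer`
# the result of create_loraplus_optimizer(...), and `train_dataset` a tokenized dataset.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,
    optimizers=(optimizer, None),  # None lets the Trainer build its default LR scheduler
)
trainer.train()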

@kallewoof
Copy link
Contributor Author

@BenjaminBossan Sounds good. I rebased on main. If it's easier, I'll merge instead of rebasing in the future.

@BenjaminBossan BenjaminBossan merged commit 273acf0 into huggingface:main Jul 29, 2024
14 checks passed
@BenjaminBossan (Member)

Thanks everyone involved in this PR, good work.

@kallewoof kallewoof deleted the 202407-loraplus branch July 29, 2024 13:09
Guy-Bilitski pushed a commit to Guy-Bilitski/peft that referenced this pull request May 13, 2025
Add LoRA+: Efficient Low Rank Adaptation of Large Models

https://arxiv.org/abs/2402.12354

Call create_loraplus_optimizer to initialize an optimizer with settings that are especially effective for LoRA training.

Builds upon this code base:

https://github.com/nikhil-ghosh-berkeley/loraplus

---------

Co-authored-by: moghadas76 <s.m.moghadas2012@gmail.com>
Co-authored-by: Chris Hua <stillmatic@users.noreply.github.com>