Update OFT to fix merge bugs #1996
Conversation
Thanks a lot for fixing the issues you mentioned with OFT. I don't understand all the details, but in general, this is to bring OFT closer to the BOFT implementation, is that right? Is that also why you decided to no longer use LycorisTuner as a base?
Did you check whether after these changes, the same code using OFT would still produce the exact same results? E.g. let's say a user has trained an OFT model checkpoint in PEFT 0.12. Now we merge this PR and release it in PEFT 0.13. If the user upgrades to 0.13 and loads the OFT checkpoint, would they still get the same results? This is really important because otherwise it would be a source of bugs.
If we cannot ensure the same results after this PR because, for instance, something was not implemented correctly previously, at the very least we have to add some mechanism to error or warn users when they try to load an old checkpoint with new PEFT versions.
I have added some smaller comments too, please check them out. On top of those, please run make style and ensure the 120 character line limit (e.g. for long strings).
Also, pinging @okotaku and @lukaskuhn-lku since they worked on the original OFT PR.
src/peft/tuners/boft/layer.py
Outdated
elif boft_block_size != 0 and boft_block_num != 0:
    raise ValueError(f"You can only specify either boft_block_size ({boft_block_size}) or boft_block_num ({boft_block_num}), but not both simultaneously.")

else:
    raise ValueError(
        f"You can only specify either boft_block_size ({boft_block_size}) or boft_block_num ({boft_block_num}), but not both simultaneously or setting both"
        "to be 0, because boft_block_size x boft_block_num != in_features."
    )
raise ValueError(f"Either `boft_block_size` or `boft_block_num` must be non-zero. Currently, boft_block_size = {boft_block_size} and boft_block_num = {boft_block_num}.")
Do we need these checks? We already have checks here:
peft/src/peft/tuners/boft/config.py
Lines 127 to 133 in 41c274e
if self.boft_block_size == 0 and self.boft_block_num == 0:
    raise ValueError("You must specify either boft_block_size or boft_block_num.")
if not (self.boft_block_size != 0) ^ (self.boft_block_num != 0):
    raise ValueError(
        f"You can only specify either boft_block_size ({self.boft_block_size}) or boft_block_num ({self.boft_block_num}), "
        "but not both simultaneously, because boft_block_size x boft_block_num != in_features."
    )
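For illustration, a quick sketch of how these config checks behave (the target_modules value is just an assumption for the example):

from peft import BOFTConfig

# setting both to non-zero values fails the XOR check above and raises a ValueError
try:
    BOFTConfig(boft_block_size=4, boft_block_num=2, target_modules=["q_proj"])
except ValueError as exc:
    print(exc)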
Yes, you are right, I forgot there is already a check in config.py. Updated.
src/peft/tuners/boft/layer.py
Outdated
elif boft_block_size != 0 and boft_block_num != 0:
    raise ValueError(f"You can only specify either boft_block_size ({boft_block_size}) or boft_block_num ({boft_block_num}), but not both simultaneously.")

else:
    raise ValueError("Unknown error!")
raise ValueError(f"Either `boft_block_size` or `boft_block_num` must be non-zero. Currently, boft_block_size = {boft_block_size} and boft_block_num = {boft_block_num}.")
Same comment.
Updated.
src/peft/tuners/oft/layer.py
Outdated
class OFTLayer(nn.Module, LycorisLayer):
class MultiplicativeDropoutLayer(nn.Module):
    """
    Implements the multiplicative dropout layer for BOFT.
Implements the multiplicative dropout layer for BOFT.
Implements the multiplicative dropout layer for OFT.
Updated.
src/peft/tuners/oft/layer.py
Outdated
if D == 1:
    return x

# Create a mask with 1s for matrices to be replaced with identity and 0s otherwise
Sentence doesn't quite make sense, could you please check again?
Updated.
@BenjaminBossan Hi, thank you so much. Because the main change (besides changing the format) is in the merge/unmerge, it should be possible to update it without affecting previously trained model checkpoints. I am currently testing it. But I am currently facing a problem:

import transformers
import accelerate
import peft
from datasets import load_dataset
from transformers import AutoImageProcessor
from peft import LoraConfig, BOFTConfig, OFTConfig, get_peft_model
import numpy as np
import evaluate
import torch
import os
from torchvision.transforms import (
    CenterCrop,
    Compose,
    Normalize,
    RandomHorizontalFlip,
    RandomResizedCrop,
    Resize,
    ToTensor,
)
from transformers import AutoModelForImageClassification, TrainingArguments, Trainer
from peft import PeftConfig, PeftModel, AutoPeftModel

os.environ["HF_API_TOKEN"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

if __name__ == '__main__':
    dataset = load_dataset("food101", split="train[:5000]")
    output_dir = "output"
    model_checkpoint = "google/vit-base-patch16-224-in21k"
    labels = dataset.features["label"].names
    label2id, id2label = dict(), dict()
    for i, label in enumerate(labels):
        label2id[label] = i
        id2label[i] = label

    image_processor = AutoImageProcessor.from_pretrained(model_checkpoint)
    normalize = Normalize(mean=image_processor.image_mean, std=image_processor.image_std)
    train_transforms = Compose(
        [
            RandomResizedCrop(image_processor.size["height"]),
            RandomHorizontalFlip(),
            ToTensor(),
            normalize,
        ]
    )
    val_transforms = Compose(
        [
            Resize(image_processor.size["height"]),
            CenterCrop(image_processor.size["height"]),
            ToTensor(),
            normalize,
        ]
    )

    def preprocess_train(example_batch):
        """Apply train_transforms across a batch."""
        example_batch["pixel_values"] = [train_transforms(image.convert("RGB")) for image in example_batch["image"]]
        return example_batch

    def preprocess_val(example_batch):
        """Apply val_transforms across a batch."""
        example_batch["pixel_values"] = [val_transforms(image.convert("RGB")) for image in example_batch["image"]]
        return example_batch

    splits = dataset.train_test_split(test_size=0.1)
    train_ds = splits["train"]
    val_ds = splits["test"]
    train_ds.set_transform(preprocess_train)
    val_ds.set_transform(preprocess_val)

    def print_trainable_parameters(model):
        trainable_params = 0
        all_param = 0
        for _, param in model.named_parameters():
            all_param += param.numel()
            if param.requires_grad:
                trainable_params += param.numel()
        print(
            f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param:.2f}"
        )

    model = AutoModelForImageClassification.from_pretrained(
        model_checkpoint,
        label2id=label2id,
        id2label=id2label,
        ignore_mismatched_sizes=True,  # provide this in case you're planning to fine-tune an already fine-tuned checkpoint
    )
    print_trainable_parameters(model)

    config = LoraConfig(
        r=32,
        lora_alpha=16,
        target_modules=["query", "value"],
        lora_dropout=0.1,
        bias="none",
        # modules_to_save=["classifier"],
    )
    lora_model = get_peft_model(model, config)
    print_trainable_parameters(lora_model)

    model_name = model_checkpoint.split("/")[-1]
    batch_size = 128
    args = TrainingArguments(
        f"{model_name}-finetuned-lora-food101",
        remove_unused_columns=False,
        evaluation_strategy="epoch",
        save_strategy="no",
        learning_rate=5e-3,
        per_device_train_batch_size=batch_size,
        gradient_accumulation_steps=1,
        per_device_eval_batch_size=batch_size,
        fp16=True,
        num_train_epochs=5,
        logging_steps=10,
        metric_for_best_model="accuracy",
        label_names=["labels"],
        eval_strategy="steps",
        eval_steps=100000000000,
    )
    metric = evaluate.load("accuracy")

    def compute_metrics(eval_pred):
        """Computes accuracy on a batch of predictions"""
        predictions = np.argmax(eval_pred.predictions, axis=1)
        return metric.compute(predictions=predictions, references=eval_pred.label_ids)

    def collate_fn(examples):
        pixel_values = torch.stack([example["pixel_values"] for example in examples])
        labels = torch.tensor([example["label"] for example in examples])
        return {"pixel_values": pixel_values, "labels": labels}

    trainer = Trainer(
        lora_model,
        args,
        train_dataset=train_ds,
        eval_dataset=val_ds,
        tokenizer=image_processor,
        compute_metrics=compute_metrics,
        data_collator=collate_fn,
    )
    eval_results = trainer.evaluate(val_ds)
    print('Before training', eval_results)

    repo_name = "sgp-bench/peft-finetuned-lora-food101"
    train_results = trainer.train()
    token = os.getenv("HF_API_TOKEN")
    lora_model.push_to_hub(repo_name, token=token)
    eval_results = trainer.evaluate(val_ds)
    print('After training', eval_results)

    config = PeftConfig.from_pretrained(repo_name)
    model = AutoModelForImageClassification.from_pretrained(
        config.base_model_name_or_path,
        label2id=label2id,
        id2label=id2label,
        ignore_mismatched_sizes=True,
    )

I am currently debugging with LoRA and the official guide from the docs.
I could reproduce your issue of low accuracy (I did not train, just loaded your checkpoint). I think the issue is this line, right?
For this reason, the classifier layer is randomly initialized, so the accuracy is low.
Could you show me which steps you took to achieve this?
Ah, ok, you are right, in the old checkpoint with merge_and_unload I did have the classifier saved. It should be working now; I have tested that it can load an old checkpoint created with the previous version of PEFT OFT.
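For reference, a minimal sketch of how the classifier head can be persisted together with the adapter, based on the training script above (the values are taken from that script):

from peft import LoraConfig

# saving the classifier head together with the adapter avoids a randomly
# initialized classifier (and thus low accuracy) when the checkpoint is reloaded
config = LoraConfig(
    r=32,
    lora_alpha=16,
    target_modules=["query", "value"],
    lora_dropout=0.1,
    bias="none",
    modules_to_save=["classifier"],  # persist the fine-tuned classification head
)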
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks a lot for the updates. Could you please run make style so that CI can pass successfully?
I did some further tests regarding the backwards compatibility of the changes in this PR. If my testing is correct, this PR works without a problem for BOFT, but it will cause an issue with OFT, namely that old OFT checkpoints will not work anymore after this PR. They will load fine, but the result will be different, which is bad, as users will not expect this. We need to either ensure that the old checkpoint works as expected, or we have to add a check: if we detect an old checkpoint, we have to raise an error and tell the user that they either need to re-train their OFT checkpoint or use an older PEFT version.
Here is how I tested the compatibility:
# try loading old oft and boft checkpoints
# First checkout and pull all branches to be tested. Then call like this:
# git checkout <branch 1>  # e.g. main
# python script.py oft  # either oft or boft
# git checkout <branch 2>  # e.g. PR branch
# python script.py oft  # same method as above
# call `rm -r /tmp/peft/` to try again with different branches
import os
import sys

import torch
from transformers import AutoModelForCausalLM

from peft import BOFTConfig, OFTConfig, PeftModel, get_peft_model

model_id = "facebook/opt-125m"
inputs = torch.arange(5).view(-1, 1)


def main(method):
    if method.lower() == "oft":
        config_cls = OFTConfig
        path = "/tmp/peft/oft"
    else:
        config_cls = BOFTConfig
        path = "/tmp/peft/boft"

    model = AutoModelForCausalLM.from_pretrained(model_id)
    if not os.path.exists(path + "/adapter_model.safetensors"):
        print(f"{method} adapter does not exist yet, creating it")
        peft_config = config_cls(target_modules=["q_proj", "v_proj"], init_weights=False)
        model = get_peft_model(model, peft_config)
        model.save_pretrained(path)
        model.eval()
        with torch.inference_mode():
            outputs = model(inputs).logits
        torch.save(outputs, path + "/output.pt")
        print(f"{method} adapter and output created and saved in {path}")
    else:
        print(f"{method} adapter and output exist, loading it from {path}")
        model = PeftModel.from_pretrained(model, path)
        print(f"{method} adapter loaded successfully")
        model.eval()
        with torch.inference_mode():
            outputs_new = model(inputs).logits
        outputs_old = torch.load(path + "/output.pt", weights_only=True)
        assert torch.allclose(outputs_old, outputs_new, atol=1e-4, rtol=1e-4)
        print(f"{method} outputs match")


if __name__ == "__main__":
    method = sys.argv[1] if len(sys.argv) > 1 else "boft"
    assert method.lower() in ["boft", "oft"]
    main(method)
Does that make sense to you? I summarized the results from my tests in this table:
branch 1 | branch 2 | method | status
--- | --- | --- | ---
main | PR | OFT | fail: outputs different
PR | main | OFT | TypeError: OFTConfig.__init__() got an unexpected keyword argument 'bias'
main | PR | BOFT | success
PR | main | BOFT | success
In general, we really want to retain backwards compatibility but in my book it's acceptable if the previous implementation was not correct and is now fixed. Also note that a new OFT checkpoint created after this PR cannot be loaded with older PEFT versions -- there is no forward compatibility -- but this is okay.
src/peft/tuners/boft/layer.py
Outdated
f"You can only specify either boft_block_size ({boft_block_size}) or boft_block_num ({boft_block_num}), but not both simultaneously or setting both" | ||
"to be 0, because boft_block_size x boft_block_num != in_features." | ||
) | ||
raise ValueError("Unknown error!") |
So this code should not be reachable. Still, let's use a more helpful error message. My proposal:
"Something went wrong, please report this error: https://github.com/huggingface/peft/issues".
# OFT info
self.oft_r = nn.ParameterDict({})
self.oft_s = nn.ParameterDict({})
Here we introduce a new parameter, oft_s. All existing checkpoints don't have this parameter, so when a user loads an old checkpoint, this parameter is randomly initialized, which means the checkpoint behaves differently. See my longer comment on compatibility.
Actually, it depends on init_weights: if init_weights=True (which should be the default behavior), then oft_s is basically an identity and should not make changes. If init_weights=False, then it is randomly initialized.
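For illustration, a minimal sketch of why an identity-style init makes oft_s a no-op (the shapes and the multiplicative application here are assumptions for the example, not the actual PEFT internals):

import torch
import torch.nn as nn

out_features = 8
# identity-style init: a multiplicative scaling of all ones leaves the weight unchanged
oft_s = nn.Parameter(torch.ones(out_features, 1))

weight = torch.randn(out_features, 4)
scaled = oft_s * weight  # with all-ones oft_s, scaled == weight
assert torch.equal(scaled, weight)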
True, but when users load checkpoints, they have trained the model, so oft_s won't be an identity transform.
src/peft/tuners/oft/layer.py
Outdated
elif r != 0 and oft_block_size != 0:
    raise ValueError(f"You can only specify either r ({r}) or oft_block_size ({oft_block_size}), but not both simultaneously.")
else:
    raise ValueError(f"Either `r` or `oft_block_size` must be non-zero. Currently, r = {r} and oft_block_size = {oft_block_size}.")
These lines should not be reachable because of the checks we have in the config class, but IMO it's fine to leave them here.
Updated.
src/peft/tuners/oft/layer.py
Outdated
elif r != 0 and oft_block_size != 0:
    raise ValueError(f"You can only specify either r ({r}) or oft_block_size ({oft_block_size}), but not both simultaneously.")
else:
    raise ValueError(f"Either `r` or `oft_block_size` must be non-zero. Currently, r = {r} and oft_block_size = {oft_block_size}.")
Same comment as earlier: Should be unreachable but it's fine if we leave the code.
Updated.
tests/test_custom_models.py
Outdated
("Vanilla MLP 2 OFT", "MLP", OFTConfig, {"target_modules": ["lin0"]}), | ||
("Vanilla MLP 5 OFT", "MLP", OFTConfig, {"target_modules": ["lin0"], "modules_to_save": ["lin1"]}), | ||
("Vanilla MLP 1 OFT", "MLP", OFTConfig, {"r": 2, "target_modules": "lin0"}), | ||
("Vanilla MLP 2 OFT", "MLP", OFTConfig, {"r": 2, "target_modules": ["lin0"], "r": 2}), |
("Vanilla MLP 2 OFT", "MLP", OFTConfig, {"r": 2, "target_modules": ["lin0"], "r": 2}), | |
("Vanilla MLP 2 OFT", "MLP", OFTConfig, {"r": 2, "target_modules": ["lin0"]}), |
Updated.
Yes, this should be expected; the reason is that the loading was not correct in the previous version. I think the best way is what you proposed: "we have to add a check and if we detect an old checkpoint, we have to raise an error and tell the user that they either need to re-train their OFT checkpoint or use an older PEFT version." Where do you suggest adding the check?
Good questions. I checked what would be the best place to change this, keeping in mind that in the future, we could have more of these cases.
My suggestion is the following. In this line, we load all the arguments for the config:
Line 151 in 0421234
kwargs = {**class_kwargs, **loaded_attributes}
Here, we could add a check on these arguments that could be used to raise an error if an old OFT config is detected. So the new code would be:
loaded_attributes = cls.from_json_file(config_file)
kwargs = {**class_kwargs, **loaded_attributes}
kwargs = cls.check_kwargs(**kwargs)  # <= new line
return cls.from_peft_type(**kwargs)
By default, this method should be defined on PeftConfigMixin to just return the kwargs without any validation or change. Then on OFTConfig, we can override this method to check if it's from an old OFT checkpoint and raise an error.
To check if the config is from an old checkpoint, we could simply check for the presence of one of the new arguments added in this PR. WDYT?
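A hedged sketch of what this could look like (class and method bodies are illustrative assumptions, not the final implementation):

class PeftConfigMixin:
    @classmethod
    def check_kwargs(cls, **kwargs):
        # default: no validation, just pass the kwargs through
        return kwargs


class OFTConfig(PeftConfigMixin):
    @classmethod
    def check_kwargs(cls, **kwargs):
        # `oft_block_size` only exists in configs saved by the new OFT implementation,
        # so its absence indicates an old, incompatible checkpoint
        if "oft_block_size" not in kwargs:
            raise ValueError(
                "Old OFT checkpoint detected; please retrain the adapter or use an older PEFT version."
            )
        return kwargs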
Maybe long term it would be better to also store the PEFT version in the adapter_config.json so that this can be checked instead. But it's not quite easy, as existing checkpoints don't have this info yet. So let's not add that in the current PR.
# OFT info
self.oft_r = nn.ParameterDict({})
self.oft_s = nn.ParameterDict({})
True, but when users load checkpoints, they have trained the model, so oft_s won't be an identity transform.
@BenjaminBossan Sorry for the late reply, I added the check_kwargs, do you think it is ok like that? As for oft_s, if the user used an older version, it would not exist; therefore, even if the model is trained, it will still be an identity transform and should make no difference.
Thanks for making these latest changes.
Concerning the compatibility check, I have a few comments, please check.
Also, I think it would be good to have a unit test for this check. I created a checkpoint with the old OFT implementation here: https://huggingface.co/peft-internal-testing/OFT-tiny-random-OPTForCausalLM. You can use it to load and then check that the warning has been raised.
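A hedged sketch of such a test (the base model id, the warning type, and the message substring are assumptions to be adapted to the actual check):

import pytest
from transformers import AutoModelForCausalLM
from peft import PeftModel

def test_loading_old_oft_checkpoint_warns():
    # base model id is an assumption; it should match the checkpoint's base model
    base_model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-OPTForCausalLM")
    # the old-format checkpoint from the comment above should trigger the compatibility warning
    with pytest.warns(UserWarning, match="OFT has been updated"):
        PeftModel.from_pretrained(base_model, "peft-internal-testing/OFT-tiny-random-OPTForCausalLM")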
src/peft/config.py
Outdated
""" | ||
if "oft_block_size" in kwargs: | ||
warnings.warn( | ||
'OFT has been updated since 0.12.1.dev0. Your trained adapter weights may not be compatible with the latest version of OFT. Please retrain your adapter weights.') |
- Note that dev0 is not a real version, so let's not refer to that. The next release version will be 0.13.0, so let's use that version.
- "may not be compatible": We are pretty sure it is incompatible when trained, right? Let's phrase it as "is incompatible".
- Let's also mention that users can downgrade PEFT to version 0.12.0 and then the adapter will still work.
src/peft/config.py
Outdated
kwargs (additional keyword arguments, *optional*):
    Additional keyword arguments passed along to the child class initialization.
"""
if "oft_block_size" in kwargs:
Hmm, so oft_block_size is the new parameter, right? So if we're loading an old OFT model, it should be missing. Therefore, should the check not be if "oft_block_size" not in kwargs?
@BenjaminBossan I noticed some other problems in the previous implementation of OFT: the size of the OFT adapters is also defined incorrectly (OFT should rotate the neurons, not the feature dimensions; the current way could have some fine-tuning effect, but it is not optimal). Therefore, I added the raise ValueError in the OFTConfig, because it is now not possible to load the old adapter weights. I would suggest the user either use the old version of PEFT or retrain the adapter weights entirely. Best.
Thanks for making further corrections to OFT. I have a few smaller comments and also a question about the parameter finding algorithm, please take a look.
Also, please ensure to run make style on your code changes.
src/peft/tuners/oft/config.py
Outdated
""" | ||
if "oft_block_size" not in kwargs: | ||
raise ValueError( | ||
'OFT has been updated since PEFT 0.13.0. Your trained adapter weights is incompatible with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions.' |
'OFT has been updated since PEFT 0.13.0. Your trained adapter weights is incompatible with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions.'
'OFT has been updated since PEFT 0.13.0. Your trained adapter weights are incompatible with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions.'
Also, let's ensure the 120 char line limit.
src/peft/tuners/oft/config.py
Outdated
if "oft_block_size" not in kwargs: | ||
raise ValueError( | ||
'OFT has been updated since PEFT 0.13.0. Your trained adapter weights is incompatible with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions.' | ||
'Downgrade PEFT to version 0.12.0 to merge the old adapter weights.' |
'Downgrade PEFT to version 0.12.0 to merge the old adapter weights.'
'Alternatively, downgrade PEFT to version 0.12.0 to use the old adapter weights.'
Not specific to merging, right?
src/peft/tuners/oft/layer.py
Outdated
warnings.warn(f"Invalid `oft_block_size` ({oft_block_size})!") | ||
oft_block_size = self.adjust_oft_parameters(self.in_features, oft_block_size) | ||
warnings.warn(f"Adjusted `oft_block_size` to ({oft_block_size}).") |
Instead of two warnings, let's make this one warning by prepending the first message to the second.
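For example, the two warnings could be combined into one like this (a sketch of the method body using the identifiers from the diff above; the exact wording is an assumption):

old_oft_block_size = oft_block_size
oft_block_size = self.adjust_oft_parameters(self.in_features, oft_block_size)
warnings.warn(
    f"Invalid `oft_block_size` ({old_oft_block_size})! Adjusted `oft_block_size` to ({oft_block_size})."
)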
src/peft/tuners/oft/layer.py
Outdated
if self.in_features % r != 0 or r > self.in_features:
    warnings.warn(f"Invalid `r` ({r})!")
    r = self.adjust_oft_parameters(self.in_features, r)
    warnings.warn(f"Adjusted `r` to ({r}).")
Same about having a single warning.
src/peft/tuners/oft/layer.py
Outdated
if conv_filter_dim % oft_block_size != 0 or oft_block_size > conv_filter_dim:
    warnings.warn(f"Invalid `oft_block_size` ({oft_block_size})!")
    oft_block_size = self.adjust_oft_parameters(conv_filter_dim, oft_block_size)
    warnings.warn(f"Adjusted `oft_block_size` to ({oft_block_size}).")
Same comment about having a single warning.
src/peft/tuners/oft/layer.py
Outdated
if conv_filter_dim % r != 0 or r > conv_filter_dim:
    warnings.warn(f"Invalid `r` ({r})!")
    r = self.adjust_oft_parameters(conv_filter_dim, r)
    warnings.warn(f"Adjusted `r` to ({r}).")
Same comment about having a single warning.
@BenjaminBossan So locally, I ran the tests and I do get some error with Deberta.
Yeah, I can reproduce that, strange. Maybe it's related to the special layer types that Deberta uses. I think we can skip this specific Deberta test, similar to how we skip AdaLoRA + Roberta (Lines 1105 to 1107 in c29810b).
Please also merge with/rebase on the latest main from PEFT.
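A hypothetical sketch of the analogous skip (the exact condition and variable names are assumptions modeled on the referenced lines):

# inside the affected test, before the failing assertions
if config_cls == OFTConfig and "deberta" in model_id.lower():
    self.skipTest("OFT with Deberta does not work correctly")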
@BenjaminBossan I fixed the error in test_feature_extraction_models and fetched the latest main. Best.
Hmm, it seems like test_deeply_nested is failing on CI.
@BenjaminBossan Thank you so much! I tested test_deeply_nested, and locally there was no error.
@Zeju1997 I can confirm that the test still fails on the GitHub CI runners.
@BenjaminBossan Thanks! I have updated.
Thanks, but unfortunately, this has not solved the issue. I tested on another Linux PC with an Intel CPU and still can't reproduce the issue. At this point, I don't know what else can be done; it's most likely some problem with the hardware of the GitHub CI runners. I would propose to skip this test on Linux for now:

...
import platform
...

# in test
if platform.system() == "Linux":
    self.skipTest("This test fails but only on GitHub CI with Linux systems.")

If you add this, you can also revert the tolerance changes, since they didn't help.
@BenjaminBossan Updated.
@BenjaminBossan I think all the tests have passed. Thank you again for the great help during the process!
Thanks a lot for this PR and for your patience. Finally CI is green, I'll merge right away.
src/peft/tuners/oft/config.py
Outdated
"OFT has been updated since PEFT 0.13.0. Your trained adapter weights are incompatible " | ||
"with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions." | ||
"Alternatively, downgrade PEFT to version 0.12.0 to use the old adapter weights." |
Ah, damn, a minor issue I noticed just now: Since PEFT 0.13.0 was released already, this message is no longer correct.
"OFT has been updated since PEFT 0.13.0. Your trained adapter weights are incompatible " | |
"with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions." | |
"Alternatively, downgrade PEFT to version 0.12.0 to use the old adapter weights." | |
"OFT has been updated since PEFT 0.14.0. Your trained adapter weights are incompatible " | |
"with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions." | |
"Alternatively, downgrade PEFT to version 0.13.0 to use the old adapter weights." |
Ping @Zeju1997 just this small change and we're good.
@BenjaminBossan Sure, updated. Best.
src/peft/tuners/oft/config.py
Outdated
"OFT has been updated since PEFT 0.14.0. Your trained adapter weights are incompatible" | ||
"with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions." |
Please add back the spaces:
"OFT has b F438 een updated since PEFT 0.14.0. Your trained adapter weights are incompatible" | |
"with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions." | |
"OFT has been updated since PEFT 0.14.0. Your trained adapter weights are incompatible " | |
"with the latest version of OFT. Please retrain your adapter weights with newer PEFT versions. " |
@BenjaminBossan Updated. Best.
We've got it, finally. Thanks for your patience.
@Zeju1997 Unfortunately, the regression tests are failing now for BOFT: https://github.com/huggingface/peft/actions/runs/11136746324/job/30949028303#step:8:182
The issue is most likely that the … This means that users wouldn't be able to load BOFT checkpoints including …
To test it locally, run this pytest command:
pytest tests/regression/test_regression.py -s --regression -k boft
@BenjaminBossan OK, I will look into this.
@BenjaminBossan Is this related to old checkpoints from previous versions, or also the current version?
It's an old BOFT checkpoint.
The previous OFT implementation contained a few errors, which are fixed now. Unfortunately, this makes previous OFT checkpoints invalid, which is why an error will be raised. Users are instructed to either retrain the OFT adapter or switch to an old PEFT version.
changes for BOFT
changes for OFT