Computer Science > Computer Vision and Pattern Recognition

arXiv:2311.18763 (cs)

[Submitted on 30 Nov 2023 (v1), last revised 2 May 2024 (this version, v2)]

Title: Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters

Authors: James Seale Smith, Yen-Chang Hsu, Zsolt Kira, Yilin Shen, Hongxia Jin

Abstract: Recent work has demonstrated a remarkable ability to customize text-to-image diffusion models to multiple, fine-grained concepts in a sequential (i.e., continual) manner while only providing a few example images for each concept. This setting is known as continual diffusion. Here, we ask the question: Can we scale these methods to longer concept sequences without forgetting? Although prior work mitigates the forgetting of previously learned concepts, we show that its capacity to learn new tasks reaches saturation over longer sequences. We address this challenge by introducing a novel method, STack-And-Mask INcremental Adapters (STAMINA), which is composed of low-rank attention-masked adapters and customized MLP tokens. STAMINA is designed to enhance the robust fine-tuning properties of LoRA for sequential concept learning via learnable hard-attention masks parameterized with low-rank MLPs, enabling precise, scalable learning via sparse adaptation. Notably, all introduced trainable parameters can be folded back into the model after training, inducing no additional inference parameter costs. We show that STAMINA outperforms the prior SOTA for the setting of text-to-image continual customization on a 50-concept benchmark composed of landmarks and human faces, with no stored replay data. Additionally, we extend our method to the setting of continual learning for image classification, demonstrating that our gains also translate to state-of-the-art performance on this standard benchmark.
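
The folding property mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the shapes, the variable names (`W`, `A`, `B`, `M_a`, `M_b`), the thresholding rule in `hard_mask`, and the use of a plain low-rank matrix product in place of the paper's low-rank MLPs are all assumptions chosen for illustration. It only shows that a masked low-rank update can be added into a frozen weight matrix so that inference uses a single matrix of the original size.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 6, 2  # hypothetical layer and adapter sizes

# Frozen pretrained weight (stand-in for one attention projection).
W = rng.normal(size=(d_out, d_in))

# Low-rank adapter factors, as in LoRA: A @ B has rank at most `rank`.
A = rng.normal(size=(d_out, rank))
B = rng.normal(size=(rank, d_in))

# Mask logits, also built from low-rank factors (an assumption; the
# abstract describes low-rank MLPs, which are simplified away here).
M_a = rng.normal(size=(d_out, rank))
M_b = rng.normal(size=(rank, d_in))


def hard_mask(logits):
    """Return a binary mask: 1.0 where the logit is positive, else 0.0."""
    return (logits > 0).astype(logits.dtype)


def fold(W, A, B, M_a, M_b):
    """Merge the masked low-rank update into the base weight."""
    mask = hard_mask(M_a @ M_b)
    return W + mask * (A @ B)


W_folded = fold(W, A, B, M_a, M_b)

# The folded weight reproduces the base-plus-adapter output exactly.
x = rng.normal(size=(d_in,))
y_adapter = W @ x + (hard_mask(M_a @ M_b) * (A @ B)) @ x
y_folded = W_folded @ x
```

Because `W_folded` has the same shape as `W`, the adapter and mask parameters can be discarded after folding. Training through a hard threshold would need a gradient surrogate, which this sketch does not cover.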

Comments:	CVPR-W 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2311.18763 [cs.CV]
	(or arXiv:2311.18763v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.18763

Submission history

From: James Smith
[v1] Thu, 30 Nov 2023 18:04:21 UTC (17,078 KB)
[v2] Thu, 2 May 2024 19:24:23 UTC (9,513 KB)
