Large-scale pre-trained vision-language models (VLMs), such as CLIP, have demonstrated striking generalizability when adapted to image classification in a few-shot setting. Most existing methods introduce a set of learnable tokens, e.g., through prompt learning, for data-efficient task adaptation. However, they focus either on the coupled-modality property via prompt projection or on the decoupled-modality characteristic via prompt consistency, and thus ignore effective interaction between prompts. To model deep yet sufficient cross-modal interaction and to enhance generalization across both seen and unseen tasks, we propose a novel coupled and decoupled prompt learning framework, dubbed PromptCD, for vision-language models. Specifically, we introduce a bi-directional coupled-modality mechanism that intensifies the interaction between the vision and language branches. In addition, we propose a mixture consistency that further improves the generalization and discrimination of the model on unseen tasks. The integration of this mechanism and consistency allows the proposed framework to adapt to various downstream tasks. We conduct extensive experiments on 11 image classification datasets under a range of evaluation protocols, including base-to-novel generalization, domain generalization, and cross-dataset recognition. Experimental results demonstrate that PromptCD outperforms state-of-the-art methods overall.
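The abstract does not give implementation details, but the two core ideas, learnable prompt tokens in both branches that are coupled in both directions, plus a consistency regularizer against the frozen zero-shot model, can be illustrated with a minimal, hypothetical PyTorch sketch. All names (CoupledPrompts, text_to_vision, vision_to_text, consistency_loss), dimensions, and the specific coupling and loss forms below are assumptions made for illustration, not the actual PromptCD design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CoupledPrompts(nn.Module):
    """Hypothetical sketch: learnable prompt tokens for the text and vision
    branches, coupled in both directions by small linear projections."""

    def __init__(self, n_tokens: int = 4, text_dim: int = 512, vision_dim: int = 768):
        super().__init__()
        self.text_prompts = nn.Parameter(torch.randn(n_tokens, text_dim) * 0.02)
        self.vision_prompts = nn.Parameter(torch.randn(n_tokens, vision_dim) * 0.02)
        # Bi-directional coupling (assumed form): each branch's prompts are
        # refined with a projection of the other branch's prompts.
        self.text_to_vision = nn.Linear(text_dim, vision_dim)
        self.vision_to_text = nn.Linear(vision_dim, text_dim)

    def forward(self):
        text = self.text_prompts + self.vision_to_text(self.vision_prompts)
        vision = self.vision_prompts + self.text_to_vision(self.text_prompts)
        return text, vision

def consistency_loss(adapted_logits: torch.Tensor, frozen_logits: torch.Tensor) -> torch.Tensor:
    """Hypothetical consistency regularizer: keep the adapted model's
    class predictions close to those of the frozen zero-shot CLIP model."""
    return F.kl_div(
        F.log_softmax(adapted_logits, dim=-1),
        F.softmax(frozen_logits, dim=-1),
        reduction="batchmean",
    )

In such a setup, the coupled prompts would be prepended to the token sequences of the respective CLIP encoders, and the consistency term would be added to the few-shot classification loss; how PromptCD realizes these steps concretely is specified in the full paper, not in this abstract.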