Computer Science > Computer Vision and Pattern Recognition

arXiv:2308.03364 (cs)

[Submitted on 7 Aug 2023 (v1), last revised 11 Aug 2023 (this version, v2)]

Title: Dual Aggregation Transformer for Image Super-Resolution

Authors: Zheng Chen, Yulun Zhang, Jinjin Gu, Linghe Kong, Xiaokang Yang, Fisher Yu

Abstract: Transformers have recently gained considerable popularity in low-level vision tasks, including image super-resolution (SR). These networks apply self-attention along different dimensions, spatial or channel, and achieve impressive performance. This inspires us to combine the two dimensions in a Transformer for more powerful representation capability. Based on this idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. DAT aggregates features across the spatial and channel dimensions in a dual manner: inter-block and intra-block. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. This alternating strategy enables DAT to capture global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements the two self-attention mechanisms from their corresponding dimensions. Meanwhile, SGFN introduces additional non-linear spatial information into the feed-forward network. Extensive experiments show that our DAT surpasses current methods. Code and models are obtainable at this https URL.
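
The alternating strategy described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single head, global (non-windowed) attention over a flattened token matrix of shape (N, C), random projection weights, and a plain residual connection, and it omits AIM and SGFN entirely. The function names `spatial_attention`, `channel_attention`, and `alternating_blocks` are hypothetical. The point is only to show how the attention map changes shape: N x N when attending across spatial positions, C x C when attending across channels.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(x, wq, wk, wv):
    # x: (N, C) tokens. The attention map is (N, N): tokens attend to tokens.
    q, k, v = x @ wq, x @ wk, x @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]))
    return attn @ v  # (N, C)

def channel_attention(x, wq, wk, wv):
    # The attention map is (C, C): channels attend to channels.
    q, k, v = x @ wq, x @ wk, x @ wv
    attn = softmax(q.T @ k / np.sqrt(q.shape[0]))
    return v @ attn.T  # (N, C)

def alternating_blocks(x, params):
    # Assumption for illustration: even-indexed blocks use spatial attention,
    # odd-indexed blocks use channel attention, each with a residual connection.
    for i, (wq, wk, wv) in enumerate(params):
        attend = spatial_attention if i % 2 == 0 else channel_attention
        x = x + attend(x, wq, wk, wv)
    return x

rng = np.random.default_rng(0)
N, C = 16, 8  # e.g. a 4x4 feature map with 8 channels, flattened
x = rng.standard_normal((N, C))
params = [tuple(rng.standard_normal((C, C)) * 0.1 for _ in range(3))
          for _ in range(4)]
y = alternating_blocks(x, params)
print(y.shape)
```

Because both attention types return an (N, C) tensor, they can be stacked in any order; alternating them lets consecutive blocks mix information along both dimensions, which is the inter-block aggregation the abstract refers to.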

Comments:	Accepted to ICCV 2023. Code is available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2308.03364 [cs.CV]
	(or arXiv:2308.03364v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.03364

Submission history

From: Zheng Chen
[v1] Mon, 7 Aug 2023 07:39:39 UTC (5,557 KB)
[v2] Fri, 11 Aug 2023 05:21:15 UTC (5,557 KB)
