Computer Science > Computer Vision and Pattern Recognition

arXiv:2206.10075 (cs)

[Submitted on 21 Jun 2022 (v1), last revised 14 Oct 2022 (this version, v2)]

Title: Counting Varying Density Crowds Through Density Guided Adaptive Selection CNN and Transformer Estimation

Authors: Yuehai Chen, Jing Yang, Badong Chen, Shaoyi Du

Abstract: In real-world crowd counting applications, crowd density varies greatly within an image. When facing density variation, humans tend to locate and count the targets in low-density regions and to estimate the number in high-density regions by reasoning. We observe that CNNs focus on local information correlation using fixed-size convolution kernels, whereas the Transformer can effectively extract semantic crowd information through its global self-attention mechanism. Thus, a CNN can locate and estimate crowds accurately in low-density regions but struggles to perceive densities properly in high-density regions. Conversely, the Transformer is highly reliable in high-density regions but fails to locate targets in sparse regions. Neither a CNN nor a Transformer alone handles this kind of density variation well. To address this problem, we propose a CNN and Transformer Adaptive Selection Network (CTASNet), which adaptively selects the appropriate counting branch for different density regions. First, CTASNet generates the prediction results of the CNN and Transformer branches. Then, considering that the CNN/Transformer is appropriate for low/high-density regions, a density guided adaptive selection module is designed to automatically combine the predictions of the two branches. Moreover, to reduce the influence of annotation noise, we introduce a Correntropy based optimal transport loss. Extensive experiments on four challenging crowd counting datasets validate the proposed method.
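The selection idea in the abstract can be illustrated with a minimal sketch. This is not the CTASNet module itself: the abstract does not specify how the selection is computed, and the paper's module is presumably learned. The sketch below assumes a fixed count threshold per square region, and the names `select_by_density`, `region`, and `threshold` are hypothetical. It only shows the general pattern of using the CNN prediction in sparse regions and the Transformer prediction in dense ones.

```python
import numpy as np

def select_by_density(cnn_map, trans_map, region=2, threshold=1.0):
    """Combine two predicted density maps region by region.

    Assumption (not from the paper): a region is treated as "dense" when the
    average of the two branches' predicted counts in it reaches `threshold`.
    Dense regions take the Transformer prediction, sparse ones the CNN one.
    """
    if cnn_map.shape != trans_map.shape:
        raise ValueError("both density maps must have the same shape")
    h, w = cnn_map.shape
    if h % region or w % region:
        raise ValueError("map size must be divisible by the region size")

    def block_sum(m):
        # Predicted count inside each (region x region) block.
        return m.reshape(h // region, region, w // region, region).sum(axis=(1, 3))

    # Density guide: mean of the two branches' per-block counts.
    guide = 0.5 * (block_sum(cnn_map) + block_sum(trans_map))
    dense = guide >= threshold

    # Expand the per-block decision back to pixel resolution.
    mask = dense.repeat(region, axis=0).repeat(region, axis=1)
    combined = np.where(mask, trans_map, cnn_map)
    return combined, mask

if __name__ == "__main__":
    cnn = np.full((4, 4), 0.1)
    trans = np.full((4, 4), 0.2)
    cnn[:2, :2] = 1.0    # top-left block is crowded
    trans[:2, :2] = 2.0
    combined, mask = select_by_density(cnn, trans)
    print("estimated count:", combined.sum())
```

A hard threshold is used here only for readability; a soft, learned weighting between the two branches would be the more natural choice in a trainable network.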

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2206.10075 [cs.CV]
	(or arXiv:2206.10075v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2206.10075
Related DOI:	https://doi.org/10.1109/TCSVT.2022.3208714

Submission history

From: Yuehai Chen
[v1] Tue, 21 Jun 2022 02:05:41 UTC (5,700 KB)
[v2] Fri, 14 Oct 2022 01:15:04 UTC (7,565 KB)
