Paper

Authors: Chenyu Wang 1, 2; Toshio Endo 1; Takahiro Hirofuchi 2 and Tsutomu Ikegami 2

Affiliations: 1 Tokyo Institute of Technology, Tokyo, Japan; 2 National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan

Keyword(s): Swin Transformer, Object Detection, Image Classification, Feature Pyramid Network, Multiscale.

Abstract: We present the Pyramid Swin Transformer for image classification and object detection. It takes advantage of more shifted-window operations and of smaller windows in a greater variety of sizes. We also add a Feature Pyramid Network for object detection, which produces excellent results. The architecture is implemented in four stages that contain layers with different window sizes. We test our architecture on ImageNet classification and COCO detection. Pyramid Swin Transformer achieves 85.4% accuracy on ImageNet classification and 54.3 box AP on COCO.
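
The shifted-window idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the window sizes, the grid size, or how the layers are arranged within each stage, so the 8x8 token grid, the window sizes (2, 4, 8), and the helper names `cyclic_shift`, `window_partition`, and `shifted_windows` below are all assumptions chosen for illustration. The sketch only shows how one grid of tokens can be tiled into windows of several sizes, with and without a half-window cyclic shift.

```python
from typing import List

Grid = List[List[int]]


def cyclic_shift(grid: Grid, shift: int) -> Grid:
    """Roll the grid up and to the left by `shift` cells, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid: Grid, window: int) -> List[Grid]:
    """Split the grid into non-overlapping window x window tiles (row-major)."""
    h, w = len(grid), len(grid[0])
    if h % window or w % window:
        raise ValueError("grid size must be divisible by the window size")
    return [
        [row[c:c + window] for row in grid[r:r + window]]
        for r in range(0, h, window)
        for c in range(0, w, window)
    ]


def shifted_windows(grid: Grid, window: int) -> List[Grid]:
    """Partition after a half-window cyclic shift (shifted-window tiling)."""
    return window_partition(cyclic_shift(grid, window // 2), window)


# Hypothetical 8x8 token grid; each token is labelled by its flat index.
tokens = [[r * 8 + c for c in range(8)] for r in range(8)]

# Hypothetical window sizes, chosen only to show several scales on one grid.
for size in (2, 4, 8):
    regular = window_partition(tokens, size)
    shifted = shifted_windows(tokens, size)
    print(size, len(regular), regular[0][0], shifted[0][0])
```

In a real model, attention would be computed inside each tile; the shifted tiling lets tokens that sat in different tiles of the regular partition share a tile, and smaller windows yield more tiles per grid.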

CC BY-NC-ND 4.0

Paper citation in several formats:
Wang, C.; Endo, T.; Hirofuchi, T. and Ikegami, T. (2023). Pyramid Swin Transformer: Different-Size Windows Swin Transformer for Image Classification and Object Detection. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP; ISBN 978-989-758-634-7; ISSN 2184-4321, SciTePress, pages 583-590. DOI: 10.5220/0011675800003417

@conference{visapp23,
author={Chenyu Wang and Toshio Endo and Takahiro Hirofuchi and Tsutomu Ikegami},
title={Pyramid Swin Transformer: Different-Size Windows Swin Transformer for Image Classification and Object Detection},
booktitle={Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP},
year={2023},
pages={583--590},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011675800003417},
isbn={978-989-758-634-7},
issn={2184-4321},
}

TY - CONF
JO - Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP
TI - Pyramid Swin Transformer: Different-Size Windows Swin Transformer for Image Classification and Object Detection
SN - 978-989-758-634-7
IS - 2184-4321
AU - Wang, C.
AU - Endo, T.
AU - Hirofuchi, T.
AU - Ikegami, T.
PY - 2023
SP - 583
EP - 590
DO - 10.5220/0011675800003417
PB - SciTePress