Abstract
The role of mobile cameras has increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline that replaces the standard mobile ISP and can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with the large-scale Fujifilm UltraISP dataset, consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon 8 Gen 1 GPU, which provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs and can process Full HD photos in less than 20–50 ms while achieving high-fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
Andrey Ignatov and Radu Timofte are the main Mobile AI & AIM 2022 challenge organizers. The other authors participated in the challenge.
Mobile AI 2022 Workshop website:
https://ai-benchmark.com/workshops/mai/2022/
Appendix A contains the authors’ team names and affiliations.
References
Afifi, M., Brubaker, M.A., Brown, M.S.: HistoGAN: controlling colors of GAN-generated and real images via color histograms. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7941–7950 (2021)
Cai, J., Gu, S., Timofte, R., Zhang, L.: NTIRE 2019 challenge on real image super-resolution: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Cai, Y., Yao, Z., Dong, Z., Gholami, A., Mahoney, M.W., Keutzer, K.: ZeroQ: a novel zero shot quantization framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13169–13178 (2020)
Chiang, C.M., et al.: Deploying image deblurring across mobile devices: a perspective of quality and latency. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 502–503 (2020)
Conde, M.V., McDonagh, S., Maggioni, M., Leonardis, A., Pérez-Pellitero, E.: Model-based image signal processors via learnable dictionaries. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 481–489 (2022)
Conde, M.V., Timofte, R., et al.: Reversed image signal processing and RAW reconstruction. AIM 2022 challenge report. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2022)
Dai, L., Liu, X., Li, C., Chen, J.: AWNet: attentive wavelet network for image ISP. arXiv preprint arXiv:2008.09228 (2020)
Ding, X., et al.: ResRep: lossless CNN pruning via decoupling remembering and forgetting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4510–4520 (2021)
Ding, X., Xia, C., Zhang, X., Chu, X., Han, J., Ding, G.: RepMLP: re-parameterizing convolutions into fully-connected layers for image recognition. arXiv preprint arXiv:2105.01883 (2021)
Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., Sun, J.: RepVGG: making VGG-style convnets great again. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13733–13742 (2021)
Ganin, Y., Lempitsky, V.: Unsupervised domain adaptation by backpropagation. In: International Conference on Machine Learning, pp. 1180–1189. PMLR (2015)
Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., Lempitsky, V.: Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17(1), 2030–2096 (2016)
He, K., Chen, X., Xie, S., Li, Y., Dollár, P., Girshick, R.: Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16000–16009 (2022)
Howard, A., et al.: Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314–1324 (2019)
Huang, J., et al.: Range scaling global U-Net for perceptual image enhancement on mobile devices. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 230–242. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_15
Hui, Z., Wang, X., Deng, L., Gao, X.: Perception-preserving convolutional networks for image enhancement on smartphones. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)
Ignatov, A., Byeoung-su, K., Timofte, R.: Fast camera image denoising on mobile GPUs with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Chiang, J., Kuo, H.K., Sycheva, A., Timofte, R.: Learned smartphone ISP on mobile NPUs with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Kobyshev, N., Timofte, R., Vanhoey, K., Van Gool, L.: DSLR-quality photos on mobile devices with deep convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3277–3285 (2017)
Ignatov, A., Kobyshev, N., Timofte, R., Vanhoey, K., Van Gool, L.: WESPE: weakly supervised photo enhancer for digital cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 691–700 (2018)
Ignatov, A., Malivenko, G., Plowman, D., Shukla, S., Timofte, R.: Fast and accurate single-image depth estimation on mobile devices, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Malivenko, G., Timofte, R.: Fast and accurate quantized camera scene detection on smartphones, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., et al.: PyNet-V2 mobile: efficient on-device photo processing with neural networks. In: 2021 26th International Conference on Pattern Recognition (ICPR). IEEE (2022)
Ignatov, A., Malivenko, G., Timofte, R., et al.: Efficient single-image depth estimation on mobile devices, mobile AI & AIM 2022 challenge: report. In: European Conference on Computer Vision (2022)
Ignatov, A., Patel, J., Timofte, R.: Rendering natural camera bokeh effect with deep learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 418–419 (2020)
Ignatov, A., et al.: AIM 2019 challenge on bokeh effect synthesis: methods and results. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3591–3598. IEEE (2019)
Ignatov, A., et al.: MicroISP: processing 32MP photos on mobile devices with deep learning. In: European Conference on Computer Vision (2022)
Ignatov, A., Timofte, R.: NTIRE 2019 challenge on image enhancement: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Ignatov, A., et al.: Power efficient video super-resolution on mobile NPUs with deep learning, mobile AI & AIM 2022 challenge: report. In: European Conference on Computer Vision (2022)
Ignatov, A., et al.: AI benchmark: running deep neural networks on android smartphones. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 288–314. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_19
Ignatov, A., Timofte, R., Denna, M., Younes, A.: Real-time quantized image super-resolution on mobile NPUs, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Timofte, R., Denna, M., Younes, A., et al.: Efficient and accurate quantized image super-resolution on mobile NPUs, mobile AI & AIM 2022 challenge: report. In: European Conference on Computer Vision (2022)
Ignatov, A., et al.: AIM 2019 challenge on RAW to RGB mapping: methods and results. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3584–3590. IEEE (2019)
Ignatov, A., et al.: AI benchmark: all about deep learning on smartphones in 2019. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3617–3635. IEEE (2019)
Ignatov, A., et al.: AIM 2020 challenge on rendering realistic Bokeh. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 213–228. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_13
Ignatov, A., et al.: PIRM challenge on perceptual image enhancement on smartphones: report. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)
Ignatov, A., et al.: AIM 2020 challenge on learned image signal processing pipeline. arXiv preprint arXiv:2011.04994 (2020)
Ignatov, A., Timofte, R., et al.: Realistic bokeh effect rendering on mobile GPUs, mobile AI & AIM 2022 challenge: report (2022)
Ignatov, A., Van Gool, L., Timofte, R.: Replacing mobile camera ISP with a single deep learning model. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 536–537 (2020)
Ignatov, D., Ignatov, A.: Controlling information capacity of binary neural network. Pattern Recogn. Lett. 138, 276–281 (2020)
Jacob, B., et al.: Quantization and training of neural networks for efficient integer-arithmetic-only inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2704–2713 (2018)
Jain, S.R., Gural, A., Wu, M., Dick, C.H.: Trained quantization thresholds for accurate and efficient fixed-point inference of deep neural networks. arXiv preprint arXiv:1903.08066 (2019)
Kim, B.-H., Song, J., Ye, J.C., Baek, J.H.: PyNET-CA: enhanced PyNET with channel attention for end-to-end mobile image signal processing. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 202–212. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_12
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kınlı, F.O., Menteş, S., Özcan, B., Kirac, F., Timofte, R., et al.: AIM 2022 challenge on Instagram filter removal: methods and results. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2022)
Lee, J., et al.: On-device neural net inference with mobile GPUs. arXiv preprint arXiv:1907.01989 (2019)
Li, Y., Gu, S., Gool, L.V., Timofte, R.: Learning filter basis for convolutional neural network compression. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5623–5632 (2019)
Lim, B., Son, S., Kim, H., Nah, S., Mu Lee, K.: Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 136–144 (2017)
Liu, H., Navarrete Michelini, P., Zhu, D.: Deep networks for image-to-image translation with Mux and Demux layers. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 150–165. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_10
Liu, J., Tang, J., Wu, G.: Residual feature distillation network for lightweight image super-resolution. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 41–55. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_2
Liu, Z., et al.: MetaPruning: meta learning for automatic neural network channel pruning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3296–3305 (2019)
Liu, Z., Wu, B., Luo, W., Yang, X., Liu, W., Cheng, K.T.: Bi-Real net: enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 722–737 (2018)
Lugmayr, A., Danelljan, M., Timofte, R.: Unsupervised learning for real-world super-resolution. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3408–3416. IEEE (2019)
Lugmayr, A., Danelljan, M., Timofte, R.: NTIRE 2020 challenge on real-world image super-resolution: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 494–495 (2020)
Obukhov, A., Rakhuba, M., Georgoulis, S., Kanakis, M., Dai, D., Van Gool, L.: T-basis: a compact representation for neural networks. In: International Conference on Machine Learning, pp. 7392–7404. PMLR (2020)
Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classifier architecture search. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 4780–4789 (2019)
Romero, A., Ignatov, A., Kim, H., Timofte, R.: Real-time video super-resolution on smartphones with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Seif, G., Androutsos, D.: Edge-based loss function for single image super-resolution. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1468–1472. IEEE (2018)
Silva, J.I.S., et al.: A deep learning approach to mobile camera image signal processing. In: Anais Estendidos do XXXIII Conference on Graphics, Patterns and Images, pp. 225–231. SBC (2020)
de Stoutz, E., Ignatov, A., Kobyshev, N., Timofte, R., Van Gool, L.: Fast perceptual image enhancement. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)
Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.A.: Inception-v4, inception-resnet and the impact of residual connections on learning. In: Thirty-first AAAI Conference on Artificial Intelligence (2017)
Tan, M., et al.: MNASNet: platform-aware neural architecture search for mobile. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2820–2828 (2019)
TensorFlow-Lite. https://www.tensorflow.org/lite
Timofte, R., Gu, S., Wu, J., Van Gool, L.: NTIRE 2018 challenge on single image super-resolution: methods and results. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 852–863 (2018)
Truong, P., Danelljan, M., Van Gool, L., Timofte, R.: Learning accurate dense correspondences and when to trust them. arXiv preprint arXiv:2101.01710 (2021)
Uhlich, S., et al.: Mixed precision DNNs: all you need is a good parametrization. arXiv preprint arXiv:1905.11452 (2019)
Vu, T., Nguyen, C.V., Pham, T.X., Luu, T.M., Yoo, C.D.: Fast and efficient image quality enhancement via DesubPixel convolutional neural networks. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 243–259. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_16
Wan, A., et al.: FBNetV2: differentiable neural architecture search for spatial and channel dimensions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12965–12974 (2020)
Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_1
Wu, Y., Zheng, J., Fan, Z., Wu, X., Zhang, F.: Residual feature distillation channel spatial attention network for ISP on smartphone. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2022)
Wu, B., et al.: FBNet: hardware-aware efficient convnet design via differentiable neural architecture search. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10734–10742 (2019)
Yang, J., et al.: Quantization networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7308–7316 (2019)
Yang, R., Timofte, R., et al.: AIM 2022 challenge on super-resolution of compressed image and video: dataset, methods and results. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2022)
Zhang, X., Zeng, H., Zhang, L.: Edge-oriented convolution block for real-time super resolution on mobile devices. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 4034–4043 (2021)
Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., Fu, Y.: Image super-resolution using very deep residual channel attention networks. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 286–301 (2018)
Acknowledgements
We thank the sponsors of the Mobile AI and AIM 2022 workshops and challenges: AI Witchlabs, MediaTek, Huawei, Reality Labs, OPPO, Synaptics, Raspberry Pi, ETH Zürich (Computer Vision Lab) and University of Würzburg (Computer Vision Lab).
A Teams and Affiliations
1.1 Mobile AI 2022 Team
Title:
Mobile AI 2022 Learned Smartphone ISP Challenge
Members:
Andrey Ignatov\(^{1,2}\) (andrey@vision.ee.ethz.ch), Radu Timofte\(^{1,2,3}\)
Affiliations:
\(^1\) Computer Vision Lab, ETH Zurich, Switzerland
\(^2\) AI Witchlabs, Switzerland
\(^3\) University of Wuerzburg, Germany
1.2 MiAlgo
Title:
3Convs and BigUNet for Smartphone ISP
Members:
Shuai Liu (liushuai21@xiaomi.com), Chaoyu Feng, Furui Bai, Xiaotao Wang, Lei Lei
Affiliations:
Xiaomi Inc., China
1.3 Multimedia
Title:
FGARepNet: A real-time end-to-end ISP network based on Fine-Granularity attention and Re-parameter convolution
Members:
Ziyao Yi (yi.ziyao@sanechips.com.cn), Yan Xiang, Zibin Liu, Shaoqing Li, Keming Shi, Dehui Kong, Ke Xv
Affiliations:
Sanechips Co. Ltd, China
1.4 ENERZAi Research
Title:
Latency-Aware NAS and Histogram Feature Loss
Members:
Minsu Kwon (minsu.kwon@enerzai.com)
Affiliations:
ENERZAi, Seoul, Korea
enerzai.com
1.5 HITZST01
Title:
Residual Feature Distillation Channel Spatial Attention Network for ISP on Smartphones [71]
Members:
Yaqi Wu\(^1\) (titimasta@163.com), Jiesi Zheng\(^2\), Zhihao Fan\(^3\), Xun Wu\(^4\), Feng Zhang
Affiliations:
\(^1\) Harbin Institute of Technology, China
\(^2\) Zhejiang University, China
\(^3\) University of Shanghai for Science and Technology, China
\(^4\) Tsinghua University, China
1.6 MINCHO
Title:
Mobile-Smallnet: Smallnet with MobileNet blocks for an end-to-end ISP Pipeline
Members:
Albert No (albertno@hongik.ac.kr), Minhyeok Cho
Affiliations:
Hongik University, Korea
1.7 CASIA 1st
Title:
Learned Smartphone ISP Based On Distillation Acceleration
Members:
Zewen Chen\(^1\) (chenzewen2022@ia.ac.cn), Xiaze Zhang\(^2\), Ran Li\(^3\), Juan Wang\(^1\), Zhiming Wang\(^4\)
Affiliations:
\(^1\) Institute of Automation, Chinese Academy of Sciences, China
\(^2\) School of Computer Science, Fudan University, China
\(^3\) Washington University in St. Louis
\(^4\) Tsinghua University, China
1.8 JMU-CVLab
Title:
Shallow Non-linear CNNs as ISP
Members:
Marcos V. Conde (marcos.conde-osorio@uni-wuerzburg.de), Ui-Jin Choi
Affiliations:
University of Wuerzburg, Germany
1.9 DANN-ISP
Title:
Learning End-to-End Deep Learning Based Image Signal Processing Pipeline Using Adversarial Domain Adaptation
Members:
Georgy Perevozchikov (perevozchikov.gp@phystech.edu), Egor Ershov
Affiliations:
Moscow Institute of Physics and Technology, Russia
1.10 Rainbow
Title:
Auto White Balance UNet for Learned Smartphone ISP
Members:
Zheng Hui (huizheng.hz@alibaba-inc.com)
Affiliations:
Alibaba DAMO Academy, China
1.11 SKD-VSP
Title:
IFS Net-Image Frequency Separation Residual Network
Members:
Mengchuan Dong (mengchuan61@gmail.com), Wei Zhou, Cong Pang
Affiliations:
ShanghaiTech University, China
1.12 CHannel Team
Title:
GaUss-DWT net
Members:
Haina Qin (qinhaina2020@ia.ac.cn), Mingxuan Cai
Affiliations:
Institute of Automation, Chinese Academy of Sciences, China
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ignatov, A. et al. (2023). Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13803. Springer, Cham. https://doi.org/10.1007/978-3-031-25066-8_3
DOI: https://doi.org/10.1007/978-3-031-25066-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25065-1
Online ISBN: 978-3-031-25066-8
eBook Packages: Computer Science, Computer Science (R0)