MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

Andrey Ignatov¹⁰,¹¹,
Anastasia Sycheva¹⁰,
Radu Timofte¹⁰,¹¹,
Yu Tseng¹²,
Yu-Syuan Xu¹²,
Po-Hsiang Yu¹²,
Cheng-Ming Chiang¹²,
Hsien-Kai Kuo¹²,
Min-Hung Chen¹²,
Chia-Ming Cheng¹² &
…
Luc Van Gool¹⁰,¹¹

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13802)

Included in the following conference series:

European Conference on Computer Vision

Abstract

While neural network-based photo processing solutions can provide better image quality than traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity. In this paper, we present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The proposed solution is capable of processing up to 32MP photos on recent smartphones using the standard mobile ML libraries and requiring less than 1 s to perform the inference, while for FullHD images it achieves real-time performance. The architecture of the model is flexible, allowing its complexity to be adjusted to devices of different computational power. To evaluate the performance of the model, we collected a novel Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The experiments demonstrated that, despite its compact size, the MicroISP model is able to provide comparable or better visual results than the traditional mobile ISP systems, while outperforming the previously proposed efficient deep learning-based solutions. Finally, this model is also compatible with the latest mobile AI accelerators, achieving good runtime and low power consumption on smartphone NPUs and APUs. The code, dataset and pre-trained models are available on the project website: https://people.ee.ethz.ch/~ihnatova/microisp.html.
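
To give a sense of scale for the resolutions named in the abstract, the following is a rough back-of-the-envelope sketch, not taken from the paper. It assumes a nominal 32,000,000 pixels for a "32MP" photo, three colour channels per pixel, and 30 frames per second as the threshold for "real-time"; the helper names are hypothetical and exist only for this illustration.

```python
# Back-of-the-envelope figures for the image sizes mentioned in the abstract.
# Assumptions (NOT from the paper): a nominal 32,000,000 pixels for "32MP",
# 3 channels per pixel, and 30 fps as the "real-time" threshold.

FULL_HD_PIXELS = 1920 * 1080   # FullHD frame
PIXELS_32MP = 32_000_000       # nominal 32MP photo (assumed exact count)
CHANNELS = 3                   # assumed RGB buffer
REAL_TIME_FPS = 30             # assumed definition of "real-time"


def buffer_mib(pixels: int, channels: int, bytes_per_value: int) -> float:
    """Size in MiB of one uncompressed image buffer."""
    return pixels * channels * bytes_per_value / 2**20


def min_throughput_mp_per_s(pixels: int, max_seconds: float) -> float:
    """Megapixels per second needed to process `pixels` within `max_seconds`."""
    return pixels / 1e6 / max_seconds


if __name__ == "__main__":
    print(f"FullHD frame: {FULL_HD_PIXELS / 1e6:.2f} MP")
    print(f"32MP photo is {PIXELS_32MP / FULL_HD_PIXELS:.1f}x a FullHD frame")
    print(f"32MP RGB float32 buffer: {buffer_mib(PIXELS_32MP, CHANNELS, 4):.1f} MiB")
    print(f"32MP RGB uint8 buffer:   {buffer_mib(PIXELS_32MP, CHANNELS, 1):.1f} MiB")
    print(f"32MP within 1 s needs at least "
          f"{min_throughput_mp_per_s(PIXELS_32MP, 1.0):.1f} MP/s")
    print(f"FullHD at {REAL_TIME_FPS} fps needs "
          f"{min_throughput_mp_per_s(FULL_HD_PIXELS, 1 / REAL_TIME_FPS):.1f} MP/s")
```

Under these assumptions, a single uncompressed 32MP floating-point buffer already occupies several hundred MiB, which illustrates why the abstract singles out memory limitations alongside compute.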

A. Ignatov, R. Timofte and L. Van Gool are the main contact authors.

Author information

Authors and Affiliations

ETH Zurich, Zurich, Switzerland
Andrey Ignatov, Anastasia Sycheva, Radu Timofte & Luc Van Gool
AI Witchlabs Ltd., Zollikerberg, Switzerland
Andrey Ignatov, Radu Timofte & Luc Van Gool
MediaTek Inc., Hsinchu, Taiwan
Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen & Chia-Ming Cheng

Corresponding author

Correspondence to Andrey Ignatov.

Editor information

Editors and Affiliations

IBM Research - MIT-IBM Watson AI Lab, Massachusetts, USA
Leonid Karlinsky
Technion – Israel Institute of Technology, Haifa, Israel
Tomer Michaeli
Kyoto University, Kyoto, Japan
Ko Nishino

About this paper

Cite this paper

Ignatov, A. et al. (2023). MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13802. Springer, Cham. https://doi.org/10.1007/978-3-031-25063-7_46

DOI: https://doi.org/10.1007/978-3-031-25063-7_46
Published: 16 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25062-0
Online ISBN: 978-3-031-25063-7
eBook Packages: Computer Science, Computer Science (R0)
