Abstract
While neural networks-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity. In this paper, we present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The proposed solution is capable of processing up to 32MP photos on recent smartphones using the standard mobile ML libraries and requiring less than 1 s to perform the inference, while for FullHD images it achieves real-time performance. The architecture of the model is flexible, allowing to adjust its complexity to devices of different computational power. To evaluate the performance of the model, we collected a novel Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format FujiFilm GFX100 camera. The experiments demonstrated that, despite its compact size, the MicroISP model is able to provide comparable or better visual results than the traditional mobile ISP systems, while outperforming the previously proposed efficient deep learning based solutions. Finally, this model is also compatible with the latest mobile AI accelerators, achieving good runtime and low power consumption o n smartphone NPUs and APUs. The code, dataset and pre-trained models are available on the project website: https://people.ee.ethz.ch/~ihnatova/microisp.html.
A. Ignatov, R. Timofte and L. Van Gool—The main contact authors.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Abdelhamed, A., Afifi, M., Timofte, R., Brown, M.S.: NTIRE 2020 challenge on real image denoising: dataset, methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 496–497 (2020)
Abdelhamed, A., Timofte, R., Brown, M.S.: NTIRE 2019 challenge on real image denoising: Methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
API, A.N.N.: https://source.android.com/devices/neural-networks
Cai, J., Gu, S., Timofte, R., Zhang, L.: NTIRE 2019 challenge on real image super-resolution: Methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Cai, J., Gu, S., Zhang, L.: Learning a deep single image contrast enhancer from multi-exposure images. IEEE Trans. Image Process. 27(4), 2049–2062 (2018)
Dai, L., Liu, X., Li, C., Chen, J.: AWNet: attentive wavelet network for image ISP. arXiv preprint arXiv:2008.09228 (2020)
Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 295–307 (2015)
Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 391–407. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_25
Fu, X., Zeng, D., Huang, Y., Liao, Y., Ding, X., Paisley, J.: A fusion-based enhancing method for weakly illuminated images. Signal Process. 129, 82–96 (2016)
Gu, S., Timofte, R.: A brief review of image denoising algorithms and beyond. In: Inpainting and Denoising Challenges, pp. 1–21 (2019)
Hsyu, M.C., Liu, C.W., Chen, C.H., Chen, C.W., Tsai, W.C.: CSANet: high speed channel spatial attention network for mobile ISP. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Huang, J., et al.: Range scaling global U-Net for perceptual image enhancement on mobile devices. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 230–242. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_15
Hui, Z., Wang, X., Deng, L., Gao, X.: Perception-preserving convolutional networks for image enhancement on smartphones. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 197–213. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_13
Ignatov, A., Byeoung-su, K., Timofte, R., Pouget, A.: Fast camera image denoising on mobile gpus with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2515–2524 (2021)
Ignatov, A., Chiang, J., Kuo, H.K., Sycheva, A., Timofte, R.: Learned smartphone isp on mobile npus with deep learning, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., Kobyshev, N., Timofte, R., Vanhoey, K., Van Gool, L.: DSLR-quality photos on mobile devices with deep convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3277–3285 (2017)
Ignatov, A., et al.: PyNet-V2 Mobile: efficient on-device photo processing with neural networks. In: 2021 26th International Conference on Pattern Recognition (ICPR). IEEE (2022)
Ignatov, A., Timofte, R.: NTIRE 2019 challenge on image enhancement: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2019)
Ignatov, A., et al.: AI benchmark: running deep neural networks on android smartphones. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 288–314. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_19
Ignatov, A., Timofte, R., Denna, M., Younes, A.: Real-time quantized image super-resolution on mobile NPUs, mobile AI 2021 challenge: report. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2021)
Ignatov, A., et al.: Aim 2019 challenge on raw to RGB mapping: methods and results. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3584–3590. IEEE (2019)
Ignatov, A., et al.: AI benchmark: all about deep learning on smartphones in 2019. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3617–3635. IEEE (2019)
Ignatov, A., et al.: PIRM challenge on perceptual image enhancement on smartphones: report. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 315–333. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_20
Ignatov, A., et al.: AIM 2020 challenge on learned image signal processing pipeline. arXiv preprint arXiv:2011.04994 (2020)
Ignatov, A., Van Gool, L., Timofte, R.: Replacing mobile camera ISP with a single deep learning model. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 536–537 (2020)
Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 694–711. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46475-6_43
Kim, B.-H., Song, J., Ye, J.C., Baek, J.H.: PyNET-CA: enhanced PyNET with channel attention for end-to-end mobile image signal processing. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 202–212. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_12
Kim, J., Lee, J.K., Lee, K.M.: Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646–1654 (2016)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lee, J., et al.: On-device neural net inference with mobile GPUs. arXiv preprint arXiv:1907.01989 (2019)
Lim, B., Son, S., Kim, H., Nah, S., Lee, K.M.: Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition workshops, pp. 136–144 (2017)
Liu, H., Navarrete Michelini, P., Zhu, D.: Deep networks for image-to-image translation with Mux and Demux layers. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 150–165. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_10
Lugmayr, A., Danelljan, M., Timofte, R.: Unsupervised learning for real-world super-resolution. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3408–3416. IEEE (2019)
Lugmayr, A., Danelljan, M., Timofte, R.: NTIRE 2020 challenge on real-world image super-resolution: Methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 494–495 (2020)
Ma, K., Yeganeh, H., Zeng, K., Wang, Z.: High dynamic range image tone mapping by optimizing tone mapped image quality index. In: 2014 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6. IEEE (2014)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Salih, Y., Malik, A.S., Saad, N., et al.: Tone mapping of HDR images: a review. In: 2012 4th International Conference on Intelligent and Advanced Systems (ICIAS2012), vol. 1, pp. 368–373. IEEE (2012)
Silva, J.I.S., et al.: A deep learning approach to mobile camera image signal processing. In: Anais Estendidos do XXXIII Conference on Graphics, Patterns and Images, pp. 225–231. SBC (2020)
Specifications, A.N.N.A.: https://android.googlesource.com/platform/hardware/interfaces/+/refs/heads/master/neuralnetworks/1.0/types.hal
Specifications, A.N.N.A.: https://android.googlesource.com/platform/hardware/interfaces/+/refs/heads/master/neuralnetworks/1.2/types.hal
Specifications, A.N.N.A.: https://android.googlesource.com/platform/hardware/interfaces/+/refs/heads/master/neuralnetworks/1.3/types.hal
de Stoutz, E., Ignatov, A., Kobyshev, N., Timofte, R., Van Gool, L.: Fast perceptual image enhancement. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 260–275. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_17
Tai, Y., Yang, J., Liu, X., Xu, C.: MemNet: a persistent memory network for image restoration. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4539–4547 (2017)
Timofte, R., Agustsson, E., Van Gool, L., Yang, M.H., Zhang, L.: NTIRE 2017 challenge on single image super-resolution: Methods and results. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 114–125 (2017)
Timofte, R., Gu, S., Wu, J., Van Gool, L.: NTIRE 2018 challenge on single image super-resolution: Methods and results. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 852–863 (2018)
Truong, P., Danelljan, M., Van Gool, L., Timofte, R.: Learning accurate dense correspondences and when to trust them. arXiv preprint arXiv:2101.01710 (2021)
Vu, T., Nguyen, C.V., Pham, T.X., Luu, T.M., Yoo, C.D.: Fast and efficient image quality enhancement via Desubpixel convolutional neural networks. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 243–259. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_16
Wang, X., et al.: ESRGAN: enhanced super-resolution generative adversarial networks. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11133, pp. 63–79. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11021-5_5
Yan, Z., Zhang, H., Wang, B., Paris, S., Yu, Y.: Automatic photo adjustment using deep neural networks. ACM Trans. Graph. (TOG) 35(2), 11 (2016)
Yan, Z., Zhang, H., Wang, B., Paris, S., Yu, Y.: Automatic photo adjustment using deep neural networks, vol. 35, p. 11. In: ACM (2016)
Yuan, L., Sun, J.: Automatic exposure correction of consumer photographs. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7575, pp. 771–785. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33765-9_55
Zhang, K., Gu, S., Timofte, R.: NTIRE 2020 challenge on perceptual extreme super-resolution: methods and results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 492–493 (2020)
Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017)
Zhang, K., Zuo, W., Zhang, L.: FFDNet: toward a fast and flexible solution for CNN-based image denoising. IEEE Trans. Image Process. 27(9), 4608–4622 (2018)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ignatov, A. et al. (2023). MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13802. Springer, Cham. https://doi.org/10.1007/978-3-031-25063-7_46
Download citation
DOI: https://doi.org/10.1007/978-3-031-25063-7_46
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25062-0
Online ISBN: 978-3-031-25063-7
eBook Packages: Computer ScienceComputer Science (R0)