Parallel Implementation of a Convolutional Neural Network on an MPSoC

Luiza de Macedo Mourelle, Nadia Nedjah & Alexandre Nietupski Cardoso

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14748)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

A Convolutional Neural Network (CNN) is a machine learning model commonly used for pattern recognition and classification in image- and video-based applications. A CNN typically comprises a sequence of convolutional layers paired with pooling layers, with the final output classified by a fully connected layer. The convolutional layer extracts distinctive image features, while the pooling layer reduces the dimensionality of the resulting matrices and simplifies the data. In this work, we assess the performance of a parallel implementation of a CNN executed on a Multiprocessor System-on-Chip (MPSoC).
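To make the roles of the two layer types concrete, the following is a minimal serial sketch in plain Python. It is not the authors' implementation and says nothing about how the work is mapped onto the MPSoC; the function names `conv2d_valid` and `max_pool2d`, the 5x5 input, and the 2x2 kernel are all illustrative assumptions.

```python
def conv2d_valid(image, kernel):
    """Stride-1, no-padding 2D cross-correlation of image with kernel."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(ow)
        ]
        for r in range(oh)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping size x size max pooling.

    Trailing rows/columns that do not fill a whole window are dropped.
    """
    oh, ow = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(ow)
        ]
        for r in range(oh)
    ]


# Hypothetical 5x5 input and 2x2 diagonal-difference kernel.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)           # 2x2 map after pooling
print(features)
print(pooled)
```

Each output element of the convolution depends only on a small window of the input, so the output elements are independent of one another; this is the general property that makes the layer a candidate for parallel execution.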


Acknowledgment

The authors are grateful to FAPERJ (Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro, http://www.faperj.br), CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, http://www.cnpq.br) and CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, http://www.capes.gov.br/) for their continuous financial support.

Author information

Authors and Affiliations

Department of Systems Engineering and Computation, Faculty of Engineering, State University of Rio de Janeiro, Rio de Janeiro, Brazil
Luiza de Macedo Mourelle
Department of Electronics Engineering and Telecommunications, Faculty of Engineering, State University of Rio de Janeiro, Rio de Janeiro, Brazil
Nadia Nedjah
Postgraduate Program in Electronics Engineering, Faculty of Engineering, State University of Rio de Janeiro, Rio de Janeiro, Brazil
Alexandre Nietupski Cardoso

Corresponding author

Correspondence to Luiza de Macedo Mourelle.

Editor information

Editors and Affiliations

Malaysia-Japan International Institute of Technology (MJIIT), University of Technology Malaysia, Kuala Lumpur, Malaysia
Hamido Fujita
University of Hradec Kralove, Hradec Kralove, Czech Republic
Richard Cimler
Meiji University, Tokyo, Japan
Andres Hernandez-Matamoros
Department of Computer Science, Texas State University, San Marcos, TX, USA
Moonis Ali

About this paper

Cite this paper

de Macedo Mourelle, L., Nedjah, N., Cardoso, A.N. (2024). Parallel Implementation of a Convolutional Neural Network on an MPSoC. In: Fujita, H., Cimler, R., Hernandez-Matamoros, A., Ali, M. (eds) Advances and Trends in Artificial Intelligence. Theory and Applications. IEA/AIE 2024. Lecture Notes in Computer Science, vol 14748. Springer, Singapore. https://doi.org/10.1007/978-981-97-4677-4_28

DOI: https://doi.org/10.1007/978-981-97-4677-4_28
Published: 10 July 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-4676-7
Online ISBN: 978-981-97-4677-4
eBook Packages: Computer Science, Computer Science (R0)
