MDPI - Publisher of Open Access Journals

25 pages, 13951 KiB

Open Access Article

1D-CNN-Transformer for Radar Emitter Identification and Implemented on FPGA

by Xiangang Gao, Bin Wu, Peng Li and Zehuan Jing

Remote Sens. 2024, 16(16), 2962; https://doi.org/10.3390/rs16162962 - 12 Aug 2024

Deep learning has greatly advanced radar emitter identification technology, and specific emitter identification (SEI), a branch of radar emitter identification, has benefited as well. However, the complexity of most deep learning algorithms makes it difficult to meet the low-power, high-performance processing requirements of SEI on embedded devices, so this article proposes solutions on both the software and hardware sides. On the software side, we design a Transformer variant network, the lightweight convolutional Transformer (LW-CT), that supports parameter sharing. We then cascade convolutional neural networks (CNNs) and the LW-CT to construct a lightweight one-dimensional CNN-Transformer (1D-CNN-Transformer) neural network model that captures the long-range dependencies of radar emitter signals while also extracting spatial-domain signal features. On the hardware side, we design a low-power neural network accelerator based on an FPGA to perform real-time recognition of radar emitter signals. The accelerator provides high-efficiency computing engines for the network and also introduces a reconfigurable buffer called “Ping-pong CBUF” and a two-level pipeline architecture for the convolution layer to alleviate the bottleneck caused by off-chip storage access bandwidth. Experimental results show that the algorithm achieves high SEI recognition performance with low computational overhead. In addition, the hardware acceleration platform meets the requirements of the radar emitter recognition system for low power consumption and high-performance processing, and it outperforms the accelerators in other papers in terms of the energy efficiency ratio of Transformer layer processing.

(This article belongs to the Special Issue Advances in Remote Sensing and Electromagnetic Spectrum Sensing: Data Acquisition and Signal Processing)

24 pages, 8201 KiB

Open Access Article

Enhancing Sustainable Transportation Infrastructure Management: A High-Accuracy, FPGA-Based System for Emergency Vehicle Classification

by Pemila Mani, Pongiannan Rakkiya Goundar Komarasamy, Narayanamoorthi Rajamanickam, Mohammad Shorfuzzaman and Waleed Mohammed Abdelfattah

Sustainability 2024, 16(16), 6917; https://doi.org/10.3390/su16166917 (registering DOI) - 12 Aug 2024

Traffic congestion is a prevalent problem in modern societies worldwide, affecting both large cities and smaller communities. Emergency vehicles tend to group tightly together in these crowded scenarios, often masking one another. For traffic surveillance systems tasked with maintaining order and enforcing laws, this poses serious difficulties. Recent developments in machine learning for image processing have significantly increased the accuracy and effectiveness of emergency vehicle classification (EVC) systems, especially when combined with specialized hardware accelerators. The widespread use of these technologies in safety and traffic management applications has led to more sustainable transportation infrastructure management. Vehicle classification has traditionally been carried out manually by specialists, which is a laborious and subjective procedure that depends largely on the available expertise. Furthermore, erroneous EVC might result in major operational problems, highlighting the need for a more dependable, precise, and effective method of classifying vehicles. Although image processing for EVC involves a variety of machine learning techniques, the process is still labor intensive and time consuming because the techniques now in use frequently fail to capture each type of vehicle appropriately. To improve the sustainability of transportation infrastructure management, this article emphasizes the creation of a reliable and accurate hardware system for identifying emergency vehicles in complex contexts. The proposed system extracts features with the ResNet50 model on a Field Programmable Gate Array (FPGA) and then optimizes them with a multi-objective genetic algorithm (MOGA). A CatBoost (CB) classifier is used to categorize vehicles based on these features. Surpassing the previous state-of-the-art accuracy of 98%, the ResNet50-MOP-CB network achieved a classification accuracy of 99.87% for four primary categories of emergency vehicles. In tests conducted on tablets, laptops, and smartphones, it demonstrated excellent accuracy, fast classification times, and robustness for real-world applications. On average, it took 0.9 nanoseconds for every image to be classified with a 96.65% accuracy rate.

(This article belongs to the Special Issue Sustainable Transportation Infrastructure Management)

25 pages, 329 KiB

Open Access Review

The Role of FPGAs in Modern Option Pricing Techniques: A Survey

by Aidan O Mahony, Bernard Hanzon and Emanuel Popovici

Electronics 2024, 13(16), 3186; https://doi.org/10.3390/electronics13163186 - 12 Aug 2024

In financial computation, Field Programmable Gate Arrays (FPGAs) have emerged as a transformative technology, particularly in the domain of option pricing. This study presents the impact of FPGAs on computational methods in finance, with an emphasis on option pricing. Our review examined 99 selected studies from an initial pool of 131, revealing how FPGAs substantially enhance both the speed and energy efficiency of various financial models, particularly Black–Scholes and Monte Carlo simulations. Notably, the performance gains, ranging from 270- to 5400-times faster than conventional CPU implementations, are highly dependent on the specific option pricing model employed. These findings illustrate FPGAs’ capability to efficiently process complex financial computations while consuming less energy. Despite these benefits, this paper highlights persistent challenges in FPGA design optimization and programming complexity. This study not only emphasises the potential of FPGAs to further innovate financial computing but also outlines the critical areas for future research to overcome existing barriers and fully leverage FPGA technology in future financial applications.

(This article belongs to the Section Circuit and Signal Processing)
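
For readers unfamiliar with the two model families named in this abstract, the following is a minimal CPU-side sketch of a Black–Scholes closed-form price and a Monte Carlo estimate for a European call. It is not taken from the survey or from any FPGA design it reviews; the function names, parameter names, and example values are assumptions chosen for illustration.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, maturity):
    """Closed-form Black-Scholes price of a European call option."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def monte_carlo_call(spot, strike, rate, vol, maturity, n_paths, seed=0):
    """Monte Carlo estimate of the same price under geometric Brownian motion."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        payoff_sum += max(terminal - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / n_paths


if __name__ == "__main__":
    # Hypothetical example parameters, not values from the survey.
    print(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0))
    print(monte_carlo_call(100.0, 100.0, 0.05, 0.2, 1.0, 50_000))
```

The Monte Carlo loop is independent across paths, which is one reason this class of workload is commonly mapped to parallel hardware.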

24 pages, 13367 KiB

Open Access Article

Compact Walsh–Hadamard Transform-Driven S-Box Design for ASIC Implementations

by Omer Tariq, Muhammad Bilal Akram Dastagir and Dongsoo Han

Electronics 2024, 13(16), 3148; https://doi.org/10.3390/electronics13163148 - 9 Aug 2024

Viewed by 322

With the exponential growth of the Internet of Things (IoT), ensuring robust end-to-end encryption is paramount. Current cryptographic accelerators often struggle with balancing security, area efficiency, and power consumption, which are critical for compact IoT devices and system-on-chips (SoCs). This work presents a novel approach to designing substitution boxes (S-boxes) for Advanced Encryption Standard (AES) encryption, leveraging dual quad-bit structures to enhance cryptographic security and hardware efficiency. By utilizing Algebraic Normal Forms (ANFs) and Walsh–Hadamard Transforms, the proposed Register Transfer Level (RTL) circuitry ensures optimal non-linearity, low differential uniformity, and bijectiveness, making it a robust and efficient solution for ASIC implementations. Implemented on 65 nm CMOS technology, our design undergoes rigorous statistical analysis to validate its security strength, followed by hardware implementation and functional verification on a ZedBoard. Leveraging Cadence EDA tools, the ASIC implementation achieves a central circuit area of approximately 199 μm². The design incurs a hardware cost of roughly 80 gate equivalents and exhibits a maximum path delay of 0.38 ns. Power dissipation is measured at approximately 28.622 μW with a supply voltage of 0.72 V. According to the ASIC implementation on the TSMC 65 nm process, the proposed design achieves the best area efficiency, approximately 66.46% better than state-of-the-art designs.

(This article belongs to the Special Issue Advanced High-Performance Integrated Circuits for Sensing Technologies and IoT Applications)
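
The three S-box properties named in this abstract (non-linearity, differential uniformity, bijectiveness) can be checked with a few lines of code. The sketch below is an illustration only: it evaluates a well-known 4-bit S-box (the one used by the PRESENT cipher) as a stand-in, since the abstract does not give the proposed S-box; all function names are assumptions.

```python
def walsh_hadamard(truth_table):
    """Fast Walsh-Hadamard transform of a Boolean function (0/1 truth table)."""
    spectrum = [1 - 2 * bit for bit in truth_table]  # map 0/1 to +1/-1
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for i in range(start, start + step):
                a, b = spectrum[i], spectrum[i + step]
                spectrum[i], spectrum[i + step] = a + b, a - b
        step *= 2
    return spectrum


def parity(value):
    """XOR of all bits of value."""
    return bin(value).count("1") & 1


def sbox_nonlinearity(sbox, n_bits):
    """Minimum distance of any output-bit combination to the affine functions."""
    size = 1 << n_bits
    worst = 0
    for mask in range(1, size):  # every non-zero combination of output bits
        table = [parity(sbox[x] & mask) for x in range(size)]
        worst = max(worst, max(abs(w) for w in walsh_hadamard(table)))
    return (size - worst) // 2


def differential_uniformity(sbox, n_bits):
    """Largest entry of the difference distribution table (input difference != 0)."""
    size = 1 << n_bits
    best = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        best = max(best, max(counts))
    return best


def is_bijective(sbox):
    return sorted(sbox) == list(range(len(sbox)))


# Stand-in example: the 4-bit S-box of the PRESENT cipher (not the paper's design).
EXAMPLE_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

if __name__ == "__main__":
    print(is_bijective(EXAMPLE_SBOX),
          sbox_nonlinearity(EXAMPLE_SBOX, 4),
          differential_uniformity(EXAMPLE_SBOX, 4))
```

Higher non-linearity and lower differential uniformity are the desirable directions; a purely linear mapping such as the identity scores zero non-linearity.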

16 pages, 3834 KiB

Open Access Article

A Device-on-Chip Solution for Real-Time Diffuse Correlation Spectroscopy Using FPGA

by Christopher H. Moore, Ulas Sunar and Wei Lin

Biosensors 2024, 14(8), 384; https://doi.org/10.3390/bios14080384 - 8 Aug 2024

Viewed by 292

Diffuse correlation spectroscopy (DCS) is a non-invasive technology for the evaluation of blood perfusion in deep tissue. However, it requires high computational resources for data analysis, which poses challenges in its implementation for real-time applications. To address this unmet need, we developed a novel device-on-chip solution that fully integrates all the necessary computational components needed for DCS. It takes the output of a photon detector and determines the blood flow index (BFI). It is implemented on a field-programmable gate array (FPGA) chip including a multi-tau correlator for the calculation of the temporal light intensity autocorrelation function and a DCS analyzer to perform the curve fitting operation that derives the BFI at a rate of 6000 BFIs/s. The FPGA DCS system was evaluated against a lab-standard DCS system for both phantom and cuff ischemia studies. The results indicate that the light intensity autocorrelation and BFI from both the FPGA DCS and the reference DCS matched well. Furthermore, the FPGA DCS system was able to achieve a measurement rate of 50 Hz and resolve pulsatile blood flow. This can significantly lower the cost and footprint of the computational components of DCS and pave the way for portable, real-time DCS systems.

(This article belongs to the Special Issue Advances in Biosensors Based on Reflectometry)
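
As a rough illustration of the quantity the correlator in this abstract computes, the sketch below estimates a normalized intensity autocorrelation g2(τ) = ⟨I(t)·I(t+τ)⟩ / ⟨I⟩² at linearly spaced lags. This is a simplification and an assumption on our part: a multi-tau correlator typically uses quasi-logarithmic lag spacing with progressively binned data, and the paper's implementation details are not given in the abstract. The function name `g2` is hypothetical.

```python
def g2(intensity, max_lag):
    """Normalized intensity autocorrelation at lags 1..max_lag (linear spacing)."""
    n = len(intensity)
    mean = sum(intensity) / n
    result = []
    for lag in range(1, max_lag + 1):
        pairs = n - lag  # number of sample pairs separated by this lag
        acc = sum(intensity[t] * intensity[t + lag] for t in range(pairs))
        result.append(acc / pairs / (mean * mean))
    return result


if __name__ == "__main__":
    # A constant photon count rate carries no fluctuations, so g2 stays at 1.
    print(g2([5] * 100, 3))
    # A strictly alternating signal is anti-correlated at lag 1, correlated at lag 2.
    print(g2([0, 2] * 50, 2))
```

In DCS, the decay of this curve with lag is what the curve-fitting stage turns into a blood flow index.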

19 pages, 1303 KiB

Open Access Article

Natural Language Processing for Hardware Security: Case of Hardware Trojan Detection in FPGAs

by Jaya Dofe, Wafi Danesh, Vaishnavi More and Aaditya Chaudhari

Cryptography 2024, 8(3), 36; https://doi.org/10.3390/cryptography8030036 - 8 Aug 2024

Viewed by 359

Field-programmable gate arrays (FPGAs) offer the inherent ability to reconfigure at runtime, making them ideal for applications such as data centers, cloud computing, and edge computing. This reconfiguration, often achieved through remote access, enables efficient resource utilization but also introduces critical security vulnerabilities. An adversary could exploit this access to insert a dormant hardware trojan (HT) into the configuration bitstream, bypassing conventional security and verification measures. To address this security threat, we propose a supervised learning approach using deep recurrent neural networks (RNNs) for HT detection within FPGA configuration bitstreams. We explore two RNN architectures: basic RNN and long short-term memory (LSTM) networks. Our proposed method analyzes bitstream patterns to identify anomalies indicative of malicious modifications. We evaluated its effectiveness on ISCAS 85 benchmark circuits of varying sizes and topologies, implemented on a Xilinx Artix-7 FPGA. The experimental results revealed that the basic RNN model showed lower accuracy in identifying HT-compromised bitstreams for most circuits. In contrast, the LSTM model achieved a significantly higher average accuracy of 93.5%. These results demonstrate that the LSTM model is more successful for HT detection in FPGA bitstreams. This research paves the way for using RNN architectures for HT detection in FPGAs, eliminating the need for time-consuming and resource-intensive reverse engineering or performance-degrading bitstream conversions.

(This article belongs to the Special Issue Emerging Topics in Hardware Security)

12 pages, 3131 KiB

Open Access Article

Efficient Twiddle Factor Generators for NTT

by Nari Im, Heehun Yang, Yujin Eom, Seong-Cheon Park and Hoyoung Yoo

Electronics 2024, 13(16), 3128; https://doi.org/10.3390/electronics13163128 - 7 Aug 2024

Viewed by 227

Fully Homomorphic Encryption (FHE) allows computations on encrypted data without decryption, providing strong security for sensitive information. However, the computational and memory demands of FHE are significant challenges, particularly in the Number Theoretic Transform (NTT) phase. This paper presents three efficient Twiddle Factor Generators (TFGs) to address these challenges: the Half-Memory TFG, the On-the-fly Serial TFG, and the On-the-fly Parallel TFG. The Half-Memory TFG reduces memory usage by storing only half of the twiddle factors and calculating the rest as needed. The On-the-fly Serial TFG eliminates memory requirements by computing twiddle factors on the fly, while the On-the-fly Parallel TFG enhances computational speed through parallel processing. Implemented on the FPGA KCU105 board, these TFGs demonstrated significant improvements in hardware resource utilization and computational efficiency. The Half-Memory TFG effectively reduces memory footprint, the On-the-fly Serial TFG eliminates memory usage with acceptable computational overhead, and the On-the-fly Parallel TFG offers superior performance for high-throughput applications. These innovations make FHE more practical for real-world applications, contributing to the broader goal of enabling secure, privacy-preserving computations on encrypted data.

(This article belongs to the Section Circuit and Signal Processing)

Figure 1. Primitive roots of unity for N = 16.
Figure 2. Block diagram of Processing Element (PE).
Figure 3. Diagram of NTT for N = 16.
Figure 4. Block diagram of conventional Memory-based TFG.
Figure 5. Symmetry properties of primitive roots of unity for N = 16.
Figure 6. Block diagram of proposed Half-Memory-based TFG.
Figure 7. Block diagram of proposed On-the-fly Serial (O-Serial) TFG.
Figure 8. Block diagram of proposed On-the-fly Parallel (O-Parallel) TFG.
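
The half-memory and on-the-fly ideas described in this entry can be sketched in software. The example below assumes the standard symmetry of a primitive N-th root of unity ω modulo a prime q, namely ω^(k + N/2) ≡ −ω^k (mod q), and uses a toy modulus q = 17 with N = 16 (matching the N = 16 used in the figure captions). The class and function names are hypothetical, and the hardware architectures in the paper may differ from this sketch.

```python
def find_primitive_root_of_unity(n, q):
    """Return an element of order exactly n modulo prime q (n a power of two)."""
    for g in range(2, q):
        # For power-of-two n, g^(n/2) == -1 implies the order is exactly n.
        if pow(g, n // 2, q) == q - 1:
            return g
    raise ValueError("no primitive n-th root of unity modulo q")


def full_table(omega, n, q):
    """Conventional memory-based table: all n twiddle factors stored."""
    return [pow(omega, k, q) for k in range(n)]


class HalfMemoryTFG:
    """Stores only the first n/2 twiddle factors; derives the rest by negation."""

    def __init__(self, omega, n, q):
        self.n, self.q = n, q
        self.half = [pow(omega, k, q) for k in range(n // 2)]

    def get(self, k):
        k %= self.n
        if k < self.n // 2:
            return self.half[k]
        # Symmetry: omega^(k) = -omega^(k - n/2) mod q for k >= n/2.
        return self.q - self.half[k - self.n // 2]


def on_the_fly_serial(omega, n, q):
    """Generates twiddle factors one per step with a single modular multiply."""
    w = 1
    for _ in range(n):
        yield w
        w = w * omega % q


if __name__ == "__main__":
    Q, N = 17, 16  # toy parameters; real FHE moduli are far larger
    omega = find_primitive_root_of_unity(N, Q)
    tfg = HalfMemoryTFG(omega, N, Q)
    print(omega, [tfg.get(k) for k in range(N)])
```

The trade-off is visible even at this scale: the half-memory variant halves the stored table at the cost of one subtraction, while the serial variant stores nothing but must produce factors in order.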

18 pages, 22304 KiB

Open Access Article

A High-Performance FPGA PRNG Based on Multiple Deep-Dynamic Transformations

by Shouliang Li, Zichen Lin, Yi Yang and Ruixuan Ning

Entropy 2024, 26(8), 671; https://doi.org/10.3390/e26080671 - 7 Aug 2024

Viewed by 246

Pseudo-random number generators (PRNGs) are important cornerstones of many fields, such as statistical analysis and cryptography, and the need for PRNGs for information security (in fields such as blockchain, big data, and artificial intelligence) is becoming increasingly prominent, resulting in a steadily growing demand for high-speed, high-quality random number generators. To meet this demand, the multiple deep-dynamic transformation (MDDT) algorithm is developed. This algorithm is incorporated into the skewed tent map, endowing it with more complex dynamical properties. The improved one-dimensional discrete chaotic mapping method is realized on a field-programmable gate array (FPGA), specifically the Xilinx xc7k325tffg900-2 model. The proposed PRNG passes all evaluations of the National Institute of Standards and Technology (NIST) SP800-22, diehard, and TestU01 test suites. Additional experimental results show that the PRNG operates at a clock frequency of 150 MHz, achieving a maximum throughput of 14.4 Gbps. This performance not only surpasses that of most related studies but also makes it well suited for embedded applications.

(This article belongs to the Section Multidisciplinary Applications)
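
Two things in this abstract lend themselves to a short worked example: the skewed tent map that the MDDT algorithm builds on, and the relationship between the reported clock frequency and throughput. The sketch below shows only the plain skewed tent map with a naive threshold bit extractor; the MDDT transformation itself is not described in the abstract and is not reproduced here. All names and the parameter values in the demo are assumptions.

```python
def skewed_tent_map(x, p):
    """One iteration of the skewed tent map on [0, 1] with peak at p (0 < p < 1)."""
    return x / p if x < p else (1.0 - x) / (1.0 - p)


def tent_map_bits(seed, p, count):
    """Naive bit extraction: threshold each iterate at 0.5 (illustration only).

    Floating-point chaotic maps degrade under finite precision, so this is
    not a usable PRNG; it only shows the iteration structure.
    """
    x = seed
    bits = []
    for _ in range(count):
        x = skewed_tent_map(x, p)
        bits.append(1 if x >= 0.5 else 0)
    return bits


def bits_per_clock(throughput_bps, clock_hz):
    """Average number of output bits a design must emit per clock cycle."""
    return throughput_bps // clock_hz


if __name__ == "__main__":
    print(tent_map_bits(seed=0.123456, p=0.3, count=16))
    # 14.4 Gbps at 150 MHz, the figures quoted in the abstract.
    print(bits_per_clock(14_400_000_000, 150_000_000))
```

The arithmetic shows that the quoted throughput corresponds to 96 output bits per clock cycle at 150 MHz.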

19 pages, 6482 KiB

Open Access Article

Field-Programmable Gate Array Architecture for the Discrete Orthonormal Stockwell Transform (DOST) Hardware Implementation

by Martin Valtierra-Rodriguez, Jose-Luis Contreras-Hernandez, David Granados-Lieberman, Jesus Rooney Rivera-Guillen, Juan Pablo Amezquita-Sanchez and David Camarena-Martinez

J. Low Power Electron. Appl. 2024, 14(3), 42; https://doi.org/10.3390/jlpea14030042 - 7 Aug 2024

Viewed by 376

Time–frequency analysis is critical in studying linear and non-linear signals that exhibit variations across both time and frequency domains. Such analysis not only facilitates the identification of transient events and extraction of key features but also aids in displaying signal properties and pattern recognition. Recently, the Discrete Orthonormal Stockwell Transform (DOST) has become increasingly utilized in many specialized signal processing tasks. Given its growing importance, this work proposes a reconfigurable field-programmable gate array (FPGA) architecture designed to efficiently implement the DOST algorithm on cost-effective FPGA chips. An accompanying MATLAB app enables the automatic configuration of the DOST method for varying sizes (64, 128, 256, 512, and 1024 points). For the implementation, a Cyclone V series FPGA device from Intel Altera, featuring the 5CSEMA5F31C6N chip, is used. To provide a complete hardware solution, the proposed DOST core has been integrated into a hybrid ARM-HPS (Advanced RISC Machine–Hard Processor System) control unit, which allows the control of different peripherals, such as communication protocols and VGA-based displays. Results show that less than 5% of the chip’s resources are occupied, indicating a low-cost architecture that can be easily integrated into other FPGA structures or hardware systems for diverse applications. Moreover, the accuracy of the proposed FPGA-based implementation is underscored by a root mean squared error of 6.0155 × 10⁻³ when compared with results from floating-point processors.

15 pages, 26053 KiB

Open Access Article

Module Tester for Positron Emission Tomography and Particle Physics

by David Baranyai, Stefan Oniga, Balazs Gyongyosi, Balazs Ujvari and Attia Mohamed

Electronics 2024, 13(15), 3066; https://doi.org/10.3390/electronics13153066 - 2 Aug 2024

Viewed by 354

The combination of high-density, high-time-resolution inorganic scintillation crystals such as Lutetium Yttrium Oxyorthosilicate (LYSO), Yttrium Orthosilicate (YSO) and Bismuth Germanate (BGO) with Silicon Photomultiplier (SiPM) sensors is widely employed in medical imaging, particularly in Positron Emission Tomography (PET), as well as in modern particle physics detectors for precisely timing sub-detectors and calorimeters. During the assembly of each module, following individual component testing, the crystals and SiPMs are bonded together using optical glue and enclosed in a light-tight, temperature-controlled cooling box. After integration with the readout electronics, the bonding is initially tested. The final readout electronics typically comprise Application-Specific Integrated Circuits (ASICs) or low-power Analog-to-Digital Converters (ADCs) and amplifiers, designed not to heat up the temperature-sensitive SiPM sensors. However, these setups are not optimal for testing the optical bonding. Specific setups were developed to test the LYSO + SiPM modules that are already bonded but not enclosed in a box. Through large data collection, small deviations in bonding can be detected if the SiPMs and LYSOs have been thoroughly tested before our measurement. We used Monte Carlo simulations to study how parameters that are difficult to measure in the laboratory (LYSO absorption length, refractive index of the coating) affect the final result. Our setups for particle physics and PET applications are already in use by research institutes and industrial partners.

(This article belongs to the Special Issue Sensor Based Big Data Analysis)

24 pages, 5669 KiB

Open Access Article

Design of Multichannel Spectrum Intelligence Systems Using Approximate Discrete Fourier Transform Algorithm for Antenna Array-Based Spectrum Perception Applications

by Arjuna Madanayake, Keththura Lawrance, Bopage Umesha Kumarasiri, Sivakumar Sivasankar, Thushara Gunaratne, Chamira U. S. Edussooriya and Renato J. Cintra

Algorithms 2024, 17(8), 338; https://doi.org/10.3390/a17080338 - 1 Aug 2024

Viewed by 411

The radio spectrum is a scarce and extremely valuable resource that demands careful real-time monitoring and dynamic resource allocation. Dynamic spectrum access (DSA) is a new paradigm for managing the radio spectrum, which requires AI/ML-driven algorithms for optimum performance under rapidly changing channel conditions and possible cyber-attacks in the electromagnetic domain. Fast sensing across multiple directions using array processors, with subsequent AI/ML-based algorithms for the sensing and perception of waveforms that are measured from the environment, is critical for providing decision support in DSA. As part of directional and wideband spectrum perception, the ability to finely channelize wideband inputs using efficient Fourier analysis is much needed. However, a fine-grain fast Fourier transform (FFT) across a large number of directions is computationally intensive and leads to a high chip area and power consumption. We address this issue by exploiting the recently proposed approximate discrete Fourier transform (ADFT), which has its own sparse factorization for real-time implementation at a low complexity and power consumption. The ADFT is used to create a wideband multibeam RF digital beamformer and temporal spectrum-based attention unit that monitors 32 discrete directions across 32 sub-bands in real time using a multiplierless algorithm with low computational complexity. The output of this spectral attention unit is applied as a decision variable to an intelligent receiver that adapts its center frequency and frequency resolution via FFT channelizers that are custom-built for real-time monitoring at high resolution. This two-step process allows the fine-grain FFT to be applied only to directions and bands of interest as determined by the ADFT-based low-complexity 2D spacetime attention unit. The fine-grain FFT provides a spectral signature that can find future use cases in neural network engines for achieving modulation recognition, IoT device identification, and RFI identification. Beamforming and spectral channelization algorithms, a digital computer architecture, and early prototypes using a 32-element fully digital multichannel receiver and field programmable gate array (FPGA)-based high-speed software-defined radio (SDR) are presented.

(This article belongs to the Collection Feature Papers in Randomized, Online and Approximation Algorithms)

13 pages, 2678 KiB

Open Access Article

An FPGA-Based Data Acquisition System with Embedded Processing for Real-Time Gas Sensing Applications

by Godwin Enemali and Ryan M. Gibson

Appl. Sci. 2024, 14(15), 6738; https://doi.org/10.3390/app14156738 - 1 Aug 2024

Viewed by 388

Real-time gas sensing based on wavelength modulation spectroscopy (WMS) has been widely adopted for several gas sensing applications. It is attractive for its accurate, non-invasive, and fast determination of critical gas parameters such as concentration, temperature, and pressure. To implement real-time gas sensing, data acquisition and processing must be implemented to accurately extract harmonics of interest from transmitted laser signals. In this work, we present an FPGA-based data acquisition architecture with embedded processing capable of achieving both real-time and accurate gas detection. By leveraging real-time processing on-chip, we minimised the data transfer bandwidth requirement, hence enabling better resolution of data transferred for high-level processing. The proposed architecture has a significantly lower bandwidth requirement compared to both the conventional offline processing architecture and the standard I-Q architecture. Specifically, it is capable of reducing data transfer overhead by 25% compared to the standard I-Q method, and it only requires a fraction of the bandwidth needed by the offline processing architecture. The feasibility of the proposed architecture is demonstrated on a commercial off-the-shelf SoC board, where measurement results show that the proposed architecture has better accuracy compared to the standard I-Q demodulation architecture for the same signal bandwidth. The proposed DAQ system has potential for more accurate and faster real-time gas sensing.

(This article belongs to the Special Issue Current Updates of Programmable Logic Devices and Synthesis Methods)

Figure 1. Overview of the WMS system for gas sensing.
Figure 2. Block diagram of real-time lock-in.
Figure 3. RTL schematic of implemented real-time demodulation.
Figure 4. Spectral absorption profile of water around 1391.
Figure 5. Transmitted signal with 56 dB additive white Gaussian noise.
Figure 6. DAQ set-up for WMS. The AWG contains signals generated from realistic absorption data from the HITRAN database. The generated signals are digitised, processed to extract harmonics of interest, and transmitted to a high-level processor.
Figure 7. Comparison of accuracy of the proposed DAQ system with conventional techniques. In (a), raw transmitted samples were demodulated offline on a high-level processor. In (b,c), the standard I-Q demodulated technique and the proposed DAQ system operated in real time for a fixed bandwidth.
Figure 8. Comparison of mean squared errors. The proposed DAQ method maintains a lower error value compared to the standard I-Q method.
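
The harmonic extraction this entry refers to is conventionally done with I-Q (lock-in) demodulation: the signal is multiplied by cosine and sine references at the harmonic of interest and averaged over an integer number of modulation periods. The sketch below illustrates that standard technique only; it is not the proposed architecture, whose details are not given in the abstract. Function names, the sample rate, and the test signal are assumptions.

```python
import math


def lock_in(samples, sample_rate, mod_freq, harmonic):
    """Return (amplitude, phase) of the given harmonic of mod_freq in samples.

    Assumes len(samples) spans an integer number of modulation periods so that
    the averaging acts as an ideal low-pass filter.
    """
    i_sum = 0.0
    q_sum = 0.0
    for n, s in enumerate(samples):
        theta = 2.0 * math.pi * harmonic * mod_freq * n / sample_rate
        i_sum += s * math.cos(theta)  # in-phase product
        q_sum += s * math.sin(theta)  # quadrature product
    i_avg = i_sum / len(samples)
    q_avg = q_sum / len(samples)
    amplitude = 2.0 * math.hypot(i_avg, q_avg)
    phase = math.atan2(-q_avg, i_avg)  # sign chosen so A*cos(theta + phase) is recovered
    return amplitude, phase


def synth_signal(sample_rate, mod_freq, n_samples):
    """Hypothetical test signal: first harmonic (amp 1.0, phase 0.3) plus second (amp 0.25)."""
    out = []
    for n in range(n_samples):
        theta = 2.0 * math.pi * mod_freq * n / sample_rate
        out.append(1.0 * math.cos(theta + 0.3) + 0.25 * math.cos(2.0 * theta))
    return out


if __name__ == "__main__":
    fs, fm = 100_000.0, 1_000.0
    signal = synth_signal(fs, fm, 1_000)  # exactly 10 modulation periods
    print(lock_in(signal, fs, fm, 1))
    print(lock_in(signal, fs, fm, 2))
```

Only two numbers per harmonic leave the demodulator, which is why performing this step on-chip reduces the bandwidth needed toward a high-level processor compared with shipping raw samples.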

16 pages, 1503 KiB

Open Access Feature Paper Article

FPGA Implementation of Pillar-Based Object Classification for Autonomous Mobile Robot

by Chaewoon Park, Seongjoo Lee and Yunho Jung

Electronics 2024, 13(15), 3035; https://doi.org/10.3390/electronics13153035 - 1 Aug 2024

Viewed by 316

With the advancement in artificial intelligence technology, autonomous mobile robots have been utilized in various applications. In autonomous driving scenarios, object classification is essential for robot navigation. To perform this task, light detection and ranging (LiDAR) sensors, which can obtain depth and height information and have higher resolution than radio detection and ranging (radar) sensors, are preferred over camera sensors. The pillar-based method employs a pillar feature encoder (PFE) to encode 3D LiDAR point clouds into 2D images, enabling high-speed inference using 2D convolutional neural networks. The pillar-based method is employed to ensure real-time responsiveness of autonomous driving systems, yet research on accelerating the PFE is not actively being conducted, even though the PFE consumes a significant amount of computation time within the system. Therefore, this paper proposes a PFE hardware accelerator and pillar-based object classification model for autonomous mobile robots. The proposed object classification model was trained and tested using 2971 datasets comprising eight classes, achieving a classification accuracy of 94.3%. The PFE hardware accelerator was implemented in a field-programmable gate array (FPGA) through a register-transfer level design, which achieved a 40-fold speedup compared with the firmware for the ARM Cortex-A53 microprocessor unit; the object classification network was implemented in the FPGA using the FINN framework. By integrating the PFE and object classification network, we implemented a real-time pillar-based object classification acceleration system on an FPGA with a latency of 6.41 ms.

(This article belongs to the Special Issue System-on-Chip (SoC) and Field-Programmable Gate Array (FPGA) Design)

Figure 1. Overview of the proposed acceleration system.

Figure 2. Operation mechanism of DSCNN.

Figure 3. Examples of dataset.

Figure 4. Configuration of dataset classes: (a) building; (b) tree; (c) vehicle; (d) bicycle; (e) obstacle; (f) greenery; (g) person; (h) urban fixture.

Figure 5. Architecture of DSCNN-based object classification network: (a) Network 1; (b) Network 2; (c) Network 3; (d) Network 4; (e) Network 5.

Figure 6. Architecture of proposed acceleration system on FPGA.

Figure 7. Block diagram of PFE accelerator.

Figure 8. Operation of the PU: (a) tensor-level operation; (b) matrix-level operation; (c) hardware-level operation.

25 pages, 10247 KiB

Open Access Article

Development of Power-Delay Product Optimized ASIC-Based Computational Unit for Medical Image Compression

by Tanya Mendez, Tejasvi Parupudi, Vishnumurthy Kedlaya K and Subramanya G. Nayak

Technologies 2024, 12(8), 121; https://doi.org/10.3390/technologies12080121 - 29 Jul 2024

Abstract

The proliferation of battery-operated end-user electronic devices due to technological advancements, especially in medical image processing applications, demands low power consumption, high-speed operation, and efficient coding. The design of these devices is centered on the Application-Specific Integrated Circuit (ASIC), General Purpose Processor (GPP), and Field Programmable Gate Array (FPGA) frameworks. The need for low-power functional blocks arises from the growing demand for high-performance computational units that are part of high-speed processors operating at high clock frequencies. The operational speed of the processor is determined by the computational unit, which is the workhorse of high-speed processors. A novel approach to integrating Very Large-Scale Integration (VLSI) ASIC design and the concepts of low-power VLSI compatible with medical image compression was embraced in this research. The focus of this study was the design, development, and implementation of a Power Delay Product (PDP) optimized computational unit targeted for medical image compression using the ASIC design flow. This stimulates the research community's quest to develop an ideal architecture, with an emphasis on minimizing power consumption and enhancing device performance for medical image processing applications. The study uses area, delay, power, PDP, and Peak Signal-to-Noise Ratio (PSNR) as performance metrics. The research work takes inspiration from this and aims to enhance the efficiency of the computational unit through minor design modifications that significantly impact performance. This research explores the trade-offs among high-performance adder and multiplier designs in order to build an ASIC-based computational unit that uses low-power techniques to improve efficiency in power and delay. The computational unit utilized for the digital image compression process was synthesized and implemented using gpdk 45 nm standard libraries with the Genus tool of Cadence. A reduced PDP of 46.87% was observed when the image compression was performed on a medical image, along with an improved PSNR of 5.89% for the reconstructed image.
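For reference, the two headline metrics in this abstract have standard definitions, restated below. The symbols are notation assumed here rather than taken from the paper: $P_{\text{avg}}$ is average power, $t_d$ is propagation delay, $\mathrm{MAX}$ is the maximum pixel value, and $\mathrm{MSE}$ is the mean squared error between the original and reconstructed images. The percentage formula is one common way to express a relative PDP reduction against a reference design; the abstract does not state which baseline the authors used.

```latex
\mathrm{PDP} = P_{\text{avg}} \cdot t_{d}
\qquad
\text{PDP reduction (\%)} =
  \frac{\mathrm{PDP}_{\text{ref}} - \mathrm{PDP}_{\text{new}}}{\mathrm{PDP}_{\text{ref}}} \times 100
\qquad
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}\right)\ \text{dB}
```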

(This article belongs to the Topic Advances in Microelectronics and Semiconductor Engineering)

22 pages, 7285 KiB

Open Access Article

Design and Application of an Onboard Particle Identification Platform Based on Convolutional Neural Networks

by Chaoping Bai, Xin Zhang, Shenyi Zhang, Yueqiang Sun, Xianguo Zhang, Ziting Wang and Shuai Zhang

Appl. Sci. 2024, 14(15), 6628; https://doi.org/10.3390/app14156628 - 29 Jul 2024

Abstract

Space radiation particle detection plays a crucial role in scientific research and engineering practice, especially in particle species identification. Currently, commonly used in-orbit particle identification techniques include telescope methods, electrostatic analysis time of flight (ESA × TOF), time-of-flight energy (TOF × E), and pulse shape discrimination (PSD). However, these methods usually fail to utilize the full waveform information containing rich features, and their particle identification results may be affected by the random rise and fall of particle deposition and noise interference. In this study, a low-latency and lightweight onboard FPGA real-time particle identification platform based on full waveform information was developed, utilizing the superior target classification, robustness, and generalization capabilities of convolutional neural networks (CNNs). The platform constructs diversified input datasets based on the physical features of waveforms and uses the Optuna and PyTorch software frameworks for model training. The hardware platform is responsible for the real-time inference of waveform data and the dynamic expansion of the dataset. The platform was used for deep learning training and testing on historical neutron and gamma-ray waveform data; inference on a single waveform takes 4.9 microseconds, with an accuracy rate of over 97%. The classification expectation FOM (figure-of-merit) value of this CNN model is 133, which is better than the traditional pulse shape discrimination (PSD) algorithm's FOM value of 0.8. The development of this platform not only improves the accuracy and efficiency of space particle discrimination but also provides an advanced tool for future space environment monitoring, which is of great value for engineering applications.
