
Data Structures for Graphics Processing Units (GPUs)

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 July 2025 | Viewed by 2043

Special Issue Editors


Dr. Byunghyun Jang
Guest Editor
Department of Computer and Information Science, University of Mississippi, University, MS 38677, USA
Interests: hardware architecture and compilers for parallel and heterogeneous processors; GPU computing (GPGPU); CPU-GPU heterogeneous computing

Prof. Dr. Juan A. Gómez-Pulido
Guest Editor

Special Issue Information

Dear Colleagues,

This Special Issue will delve into the innovative and rapidly evolving field of data structures tailored to graphics processing units (GPUs). GPUs, originally designed for rendering graphics, have emerged as powerful parallel processors, revolutionizing computational tasks across diverse domains. This Special Issue will explore the development and optimization of data structures that leverage the parallel processing capabilities of GPUs to achieve significant performance enhancements.

Contributors to this Special Issue will present cutting-edge research into a variety of GPU-optimized data structures, including, but not limited to, stacks, queues, trees, graphs, hash tables, and priority queues. The articles will highlight novel approaches to memory management, data access patterns, and algorithmic modifications that harness the massive parallelism of GPUs. Furthermore, this Special Issue will address practical challenges, such as synchronization, load balancing, and efficient data transfer between CPU and GPU memory.
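
To make one of these challenges concrete, the following minimal CUDA sketch (illustrative only, not drawn from any contribution in this Special Issue) overlaps CPU-GPU data transfer with kernel work using pinned host memory and streams; all names in it are hypothetical:

    // Minimal sketch: overlapping host-device copies with kernel execution.
    #include <cuda_runtime.h>

    __global__ void process(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;  // stand-in for real per-element work
    }

    int main() {
        const int n = 1 << 20, chunks = 4, chunk = n / chunks;
        float *h_buf, *d_buf;
        cudaMallocHost(&h_buf, n * sizeof(float));  // pinned memory enables async copies
        cudaMalloc(&d_buf, n * sizeof(float));

        cudaStream_t s[chunks];
        for (int c = 0; c < chunks; ++c) {
            cudaStreamCreate(&s[c]);
            float* h = h_buf + c * chunk;
            float* d = d_buf + c * chunk;
            // Work in stream c overlaps with copies issued in the other streams.
            cudaMemcpyAsync(d, h, chunk * sizeof(float), cudaMemcpyHostToDevice, s[c]);
            process<<<(chunk + 255) / 256, 256, 0, s[c]>>>(d, chunk);
            cudaMemcpyAsync(h, d, chunk * sizeof(float), cudaMemcpyDeviceToHost, s[c]);
        }
        cudaDeviceSynchronize();
        for (int c = 0; c < chunks; ++c) cudaStreamDestroy(s[c]);
        cudaFreeHost(h_buf);
        cudaFree(d_buf);
        return 0;
    }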

By featuring both theoretical advancements and practical implementations, this Special Issue will bridge the gap between traditional CPU-centric data structures and their GPU-optimized counterparts. Readers will gain insights into the latest techniques for maximizing GPU performance, making this Special Issue an essential resource for researchers and practitioners seeking to exploit the full potential of GPU computing for data-intensive applications.

Dr. Byunghyun Jang
Prof. Dr. Juan A. Gómez-Pulido
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • GPU computing
  • concurrent data structures
  • GPGPU
  • parallel computing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

30 pages, 1684 KiB  
Article
Efficient GPU Implementation of the McMurchie–Davidson Method for Shell-Based ERI Computations
by Haruto Fujii, Yasuaki Ito, Nobuya Yokogawa, Kanta Suzuki, Satoki Tsuji, Koji Nakano, Victor Parque and Akihiko Kasagi
Appl. Sci. 2025, 15(5), 2572; https://doi.org/10.3390/app15052572 - 27 Feb 2025
Viewed by 217
Abstract
Quantum chemistry offers the formal machinery to derive molecular and physical properties arising from (sub)atomic interactions. However, as molecules of practical interest are largely polyatomic, contemporary approximation schemes such as the Hartree–Fock scheme are computationally expensive due to the large number of electron repulsion integrals (ERIs). Central to the Hartree–Fock method is the efficient computation of ERIs over Gaussian functions (GTO-ERIs). Here, the well-known McMurchie–Davidson method (MD) offers an elegant formalism by incrementally extending Hermite Gaussian functions and auxiliary tabulated functions. Although the MD method offers a high degree of versatility to acceleration schemes through Graphics Processing Units (GPUs), the current GPU implementations limit the practical use of supported values of the azimuthal quantum number. In this paper, we propose a generalized framework capable of computing GTO-ERIs for arbitrary azimuthal quantum numbers, provided that the intermediate terms of the MD method can be stored. Our approach benefits from extending the MD recurrence relations through shells, batches, and triple-buffering of the shared memory, and ordering similar ERIs, thus enabling the effective parallelization and use of GPU resources. Furthermore, our approach proposes four GPU implementation schemes considering the suitable mappings between Gaussian basis and CUDA blocks and threads. Our computational experiments involving the GTO-ERI computations of molecules of interest on an NVIDIA A100 Tensor Core GPU (NVIDIA, Santa Clara, CA, USA) have revealed the merits of the proposed acceleration schemes in terms of computation time, including up to a 72× improvement over our previous GPU implementation and up to a 4500× speedup compared to a naive CPU implementation, highlighting the effectiveness of our method in accelerating ERI computations for both monatomic and polyatomic molecules. Our work has the potential to explore new parallelization schemes of distinct and complex computation paths involved in ERI computation.
(This article belongs to the Special Issue Data Structures for Graphics Processing Units (GPUs))
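
As a rough illustration of the triple-buffering idea mentioned in the abstract, the CUDA sketch below keeps only three rotating batches of recurrence values in shared memory instead of the full intermediate tensor. The kernel name, BATCH_SIZE, and the recurrence body are placeholders, not the authors' implementation; the real McMurchie–Davidson relations combine several lower-order terms:

    // Hedged sketch: batch k of the recurrence reads only batches k-1 and k-2,
    // so three shared-memory buffers, rotated modulo 3, suffice.
    #include <cuda_runtime.h>

    #define BATCH_SIZE 128  // assumed batch width; launch with blockDim.x == BATCH_SIZE

    __global__ void md_recurrence_sketch(const double* base_values, double* out, int K) {
        __shared__ double buf[3][BATCH_SIZE];  // three rotating batch buffers
        int t = threadIdx.x;

        buf[0][t] = base_values[t];  // batch 0: tabulated base terms (e.g., Boys function)
        __syncthreads();

        for (int k = 1; k <= K; ++k) {
            // The buffer being overwritten (k mod 3) holds a batch that is no
            // longer needed; only the two newest batches are read.
            double a = buf[(k - 1) % 3][t];
            double b = buf[(k - 1) % 3][(t + 1) % BATCH_SIZE];  // neighboring term
            double c = (k >= 2) ? buf[(k - 2) % 3][t] : 0.0;
            buf[k % 3][t] = 0.5 * a + 0.25 * b + 0.25 * c;      // placeholder body
            __syncthreads();  // make batch k visible before it feeds batch k+1
        }
        out[blockIdx.x * BATCH_SIZE + t] = buf[K % 3][t];  // out sized to grid
    }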
Figures:

Figure 1. Example of the configuration of two basis functions χ1 and χ2 (M = 2): (a) the quartet combinations, and (b) the symmetry-based combinations for Basis-ERIs when M = 2. By considering symmetrical relations, the number of Basis-ERIs can be reduced, as shown by the upper triangular matrix in (b).
Figure 2. Example of the relation between the Basis-ERIs and the GTO-ERIs. The row (column) directions represent the bra (ket) Basis-ERIs. Each cell in the upper triangular matrix corresponds to a single Basis-ERI.
Figure 3. Basic idea behind the definition of shell-based ERIs. The term (ss| implies that a bra consists of two s-shells, and the term |sp) implies that a ket consists of one s-shell and one p-shell. The integral (ss|sp) consists of three GTO-ERIs: [ss|sp_x], [ss|sp_y], and [ss|sp_z].
Figure 4. Basic idea of the dependencies (denoted by arrows) behind computing the values of the corresponding recurrences R using batch concepts when K = 4. The values required by the MD method are highlighted in red.
Figure 5. Basic idea of the computation of R values for each batch using triple-buffering of the shared memory.
Figure 6. Comparison of the required size of shared memory to store R values.
Figure 7. Basic idea of the parallel thread assignment of CUDA blocks and CUDA threads to the Basis-ERI computation in BBM.
Figure 8. Basic idea of the parallel thread assignment of CUDA blocks and CUDA threads to the Basis-ERI computation in BTM.
Figure 9. Parallel thread assignment of CUDA blocks and CUDA threads to the shell-based ERI computation in SBM.
Figure 10. Basic idea behind the parallel thread assignment of CUDA blocks and CUDA threads to the shell-based ERI computation in STM.
Figure 11. Schematic of the 64-bit key for sorting the Basis-ERIs.
20 pages, 899 KiB  
Article
Boundary-Aware Concurrent Queue: A Fast and Scalable Concurrent FIFO Queue on GPU Environments
by Md. Sabbir Hossain Polak, David A. Troendle and Byunghyun Jang
Appl. Sci. 2025, 15(4), 1834; https://doi.org/10.3390/app15041834 - 11 Feb 2025
Viewed by 387
Abstract
This paper presents Boundary-Aware Concurrent Queue (BACQ), a high-performance queue designed for modern GPUs, which focuses on high concurrency in massively parallel environments. BACQ operates at the warp level, leveraging intra-warp locality to improve throughput. A key to BACQ’s design is its ability to replace conflicting accesses to shared data with independent accesses to private data. It uses a ticket-based system to ensure fair ordering of operations and supports infinite growth of the head and tail across its ring buffer. The leader thread of each warp coordinates enqueue and dequeue operations, broadcasting offsets for intra-warp synchronization. BACQ dynamically adjusts operation priorities based on the queue’s state, especially as it approaches boundary conditions such as overfilling the buffer. It also uses a virtual caching layer for intra-warp communication, reducing memory latency. Rigorous benchmarking results show that BACQ outperforms the BWD (Broker Queue Work Distributor), the fastest known GPU queue, by more than 2× while preserving FIFO semantics. The paper demonstrates BACQ’s superior performance through real-world empirical evaluations.
(This article belongs to the Special Issue Data Structures for Graphics Processing Units (GPUs))
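
The warp-level pattern the abstract describes, in which one leader thread issues a single global atomic for the whole warp and broadcasts the claimed offset with a shuffle, can be sketched as follows. This is a simplified reading of the design: the names (RingBuffer, enqueue_warp) are hypothetical, and BACQ's ticket priorities and boundary handling are omitted:

    // Hedged sketch: warp-aggregated enqueue with a leader-thread atomic.
    #include <cstdint>
    #include <cuda_runtime.h>

    struct RingBuffer {
        int*               slots;     // payload storage with `capacity` entries
        unsigned long long tail;      // monotonically growing enqueue counter
        unsigned int       capacity;  // ring size
    };

    __device__ void enqueue_warp(RingBuffer* q, int value) {
        unsigned mask   = __activemask();        // lanes taking part in this enqueue
        int      lane   = threadIdx.x & 31;
        int      leader = __ffs(mask) - 1;       // lowest active lane leads
        int      count  = __popc(mask);          // items this warp enqueues

        unsigned long long base = 0;
        if (lane == leader)                      // one global atomic per warp, not per lane
            base = atomicAdd(&q->tail, (unsigned long long)count);
        base = __shfl_sync(mask, base, leader);  // broadcast the base offset to the warp

        // Each lane's slot: base plus its rank among the active lanes.
        int rank = __popc(mask & ((1u << lane) - 1));
        unsigned long long ticket = base + rank;
        q->slots[ticket % q->capacity] = value;  // the real design first checks the head
                                                 // boundary so the ring cannot overfill
    }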
Figures:

Figure 1. Circular buffer queue structure with a size of N. "Head % queue_size" points to the next dequeue element, and "Tail % queue_size" points to the next enqueue element.
Figure 2. Only four warps run on each SM at a single moment.
Figure 3. Leader thread aggregates the global atomic calls from each warp.
Figure 4. Leader thread is selected from only the active threads (threads in the gray box).
Figure 5. Throughput of CAS and FAA operations in a contended scenario (8 billion simultaneous requests).
Figure 6. Memory hierarchy of the GPU.
Figure 7. Shuffle semantics of CUDA.
Figure 8. BACQ local tail calculation using the virtual cache layer.
Figure 9. Broken FIFO semantics without tickets (buffer size is 2).
Figure 10. FIFO semantics with tickets (buffer size is 2); Index = (Head or Tail) % Buffer Size.
Figure 11. Close-to-empty queue boundary, when head and tail are at the same index.
Figure 12. Close-to-full queue boundary, when head and tail are at the same index.
Figure 13. Fair publication order for an enqueue request.
Figure 14. Time difference between enqueue-only and dequeue-only requests.
Figure 15. Performance of the enqueue–dequeue pair (X = number of requests, Y = BOPS (billion ops/s)).
Figure 16. Performance of the enqueue–dequeue mix (50:50) (X = number of requests, Y = BOPS (billion ops/s)).
Figure 17. Performance of the enqueue–dequeue mix (70:30) (X = number of requests, Y = BOPS (billion ops/s)).
Figure 18. Performance of the enqueue–dequeue mix (40:60) (X = number of requests, Y = BOPS (billion ops/s)).
Figure 19. Comparison of BFS algorithm performance using the broker queue and BACQ (X = number of edges, Y = elapsed time (s)).
21 pages, 6218 KiB  
Article
Multi-GPU Acceleration for Finite Element Analysis in Structural Mechanics
by David Herrero-Pérez and Humberto Martínez-Barberá
Appl. Sci. 2025, 15(3), 1095; https://doi.org/10.3390/app15031095 - 22 Jan 2025
Viewed by 860
Abstract
This work evaluates the computing performance of finite element analysis in structural mechanics using modern multi-GPU systems. Using multiple GPUs for scientific computing allows us to avoid the memory limitations usually encountered when using a single GPU device for many-core computing. We use a GPU-aware MPI approach, implementing a suitable smoothed aggregation multigrid to precondition an iterative distributed conjugate gradient solver for GPU computing. We evaluate the performance and scalability of different models, problem sizes, and computing resources. We take an efficient multi-core implementation as the reference to assess the computing performance of the numerical results. The numerical results show the advantages and limitations of using distributed many-core architectures to address structural mechanics problems.
(This article belongs to the Special Issue Data Structures for Graphics Processing Units (GPUs))
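
The GPU-aware MPI pattern underlying such a distributed conjugate gradient solver can be sketched as follows: device pointers are handed directly to MPI calls, so global reductions and subdomain halo exchanges avoid staging through host memory. Function names and structure here are illustrative assumptions, not the authors' code:

    // Hedged sketch of GPU-aware (CUDA-aware) MPI in a distributed CG iteration.
    #include <mpi.h>
    #include <cublas_v2.h>
    #include <cuda_runtime.h>

    // Global dot product: each rank holds its slice of the distributed vectors
    // in device memory; partial results are reduced across ranks.
    double distributed_dot(cublasHandle_t handle, const double* d_x, const double* d_y,
                           int local_n, MPI_Comm comm) {
        double local = 0.0, global = 0.0;
        cublasDdot(handle, local_n, d_x, 1, d_y, 1, &local);  // local partial product
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
        return global;
    }

    // With a CUDA-aware MPI build, halo values move directly between device
    // buffers (d_send/d_recv were allocated with cudaMalloc), no host copy needed.
    void exchange_halo(double* d_send, double* d_recv, int n, int neighbor, MPI_Comm comm) {
        MPI_Sendrecv(d_send, n, MPI_DOUBLE, neighbor, /*sendtag=*/0,
                     d_recv, n, MPI_DOUBLE, neighbor, /*recvtag=*/0,
                     comm, MPI_STATUS_IGNORE);
    }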
Figures:

Figure 1. Multi-GPU node architecture.
Figure 2. Multi-GPU system with four Nvidia TITAN V GPUs.
Figure 3. Simply supported beam experiment with a hole: (a) geometric configuration and boundary conditions, and (b) mesh parameterization and partitioning into four subdomains.
Figure 4. Single-arch dam experiment: (a) geometric configuration, (b) boundary conditions, and (c) mesh parameterization and partitioning into four subdomains.
Figure 5. L-shaped cantilever experiment: (a) geometric configuration, (b) boundary conditions, and (c) mesh parameterization and partitioning into four subdomains.
Figure 6. Simply supported beam experiment: (a) wall-clock time, (b) device memory, and speedup from (c) one and (d) eight MPI processes.
Figure 7. Single-arch dam experiment: (a) wall-clock time, (b) device memory, and speedup from (c) one and (d) eight MPI processes.
Figure 8. L-shaped cantilever experiment: (a) wall-clock time, (b) device memory, and speedup from (c) one and (d) eight MPI processes.
Figure 9. Wall-clock time of the setup and solving stages using GPU computing.