Zhang et al., 2012 - Google Patents

A hybrid method for solving tridiagonal systems on the GPU

Zhang et al., 2012

Document ID: 12196687706262727602
Author: Zhang Y; Cohen J; Davidson A; Owens J
Publication year: 2012
Publication venue: GPU Computing Gems Jade Edition

External Links

Cited by

Snippet

Publisher Summary Tridiagonal linear systems are of importance to many problems in numerical analysis and computational fluid dynamics, as well as to computer graphics applications in video games and computer-animated films. Typical applications require …

Continue reading at www.sciencedirect.com (other versions)

238000004422 calculation algorithm 0 abstract description 44

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by a single instruction, e.g. SIMD
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30076—Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
- G06F9/30087—Synchronisation or serialisation instructions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code

Similar Documents

Publication	Publication Date	Title
Göddeke et al.	2010	Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid
Kiss et al.	2012	Parallel realization of the element-by-element FEM technique by CUDA
Zhang et al.	2010	Fast tridiagonal solvers on the GPU
Schenk et al.	2008	Algorithmic performance studies on graphics processing units
Dang et al.	2013	CUDA-enabled Sparse Matrix–Vector Multiplication on GPUs using atomic operations
Romero et al.	2020	High performance implementations of the 2D Ising model on GPUs
Allard et al.	2012	Implicit FEM solver on GPU for interactive deformation simulation
Krotkiewski et al.	2013	Efficient 3D stencil computations using CUDA
Banaś et al.	2016	Finite element numerical integration for first order approximations on multi-and many-core architectures
Amorim et al.	2009	Comparing CUDA and OpenGL implementations for a Jacobi iteration
Valero-Lara et al.	2012	Block tridiagonal solvers on heterogeneous architectures
Pikle et al.	2018	GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review
Aliaga et al.	2016	Exploiting task and data parallelism in ILUPACK’s preconditioned CG solver on NUMA architectures and many-core accelerators
Yount et al.	2019	Multi-level spatial and temporal tiling for efficient HPC stencil computation on many-core processors with large shared caches
Zhang et al.	2012	A hybrid method for solving tridiagonal systems on the GPU
Wu et al.	2013	An optimized FFT-based direct Poisson solver on CUDA GPUs
Oyarzun et al.	2017	Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers
Woźniak	2015	Fast GPU integration algorithm for isogeometric finite element method solvers using task dependency graphs
Stpiczyński	2020	Algorithmic and language-based optimization of Marsa-LFIB4 pseudorandom number generator using OpenMP, OpenACC and CUDA
Vasconcellos et al.	2020	Accelerating simulations of cardiac electrical dynamics through a multi‐GPU platform and an optimized data structure
Kiss et al.	2012	High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation
Magee et al.	2018	Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time–space decomposition
Bisson et al.	2025	Massive-scale simulations of 2D Ising and Blume-Capel models on rack-scale multi-GPU systems
Bleichrodt et al.	2012	Accelerating a barotropic ocean model using a GPU
Pikle et al.	2022	Low occupancy high performance elemental products in assembly free FEM on GPU