[go: up one dir, main page]

Zhang et al., 2012 - Google Patents

A hybrid method for solving tridiagonal systems on the GPU

Zhang et al., 2012

Document ID
12196687706262727602
Author
Zhang Y
Cohen J
Davidson A
Owens J
Publication year
Publication venue
GPU Computing Gems Jade Edition

External Links

Snippet

Publisher Summary Tridiagonal linear systems are of importance to many problems in numerical analysis and computational fluid dynamics, as well as to computer graphics applications in video games and computer-animated films. Typical applications require …
Continue reading at www.sciencedirect.com (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141Discrete Fourier transforms
    • G06F17/142Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G06F9/3887Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by a single instruction, e.g. SIMD
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30076Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/30087Synchronisation or serialisation instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/80Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861Retrieval from the Internet, e.g. browsers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code

Similar Documents

Publication Publication Date Title
Göddeke et al. Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid
Kiss et al. Parallel realization of the element-by-element FEM technique by CUDA
Zhang et al. Fast tridiagonal solvers on the GPU
Schenk et al. Algorithmic performance studies on graphics processing units
Dang et al. CUDA-enabled Sparse Matrix–Vector Multiplication on GPUs using atomic operations
Romero et al. High performance implementations of the 2D Ising model on GPUs
Allard et al. Implicit FEM solver on GPU for interactive deformation simulation
Krotkiewski et al. Efficient 3D stencil computations using CUDA
Banaś et al. Finite element numerical integration for first order approximations on multi-and many-core architectures
Amorim et al. Comparing CUDA and OpenGL implementations for a Jacobi iteration
Valero-Lara et al. Block tridiagonal solvers on heterogeneous architectures
Pikle et al. GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review
Aliaga et al. Exploiting task and data parallelism in ILUPACK’s preconditioned CG solver on NUMA architectures and many-core accelerators
Yount et al. Multi-level spatial and temporal tiling for efficient HPC stencil computation on many-core processors with large shared caches
Zhang et al. A hybrid method for solving tridiagonal systems on the GPU
Wu et al. An optimized FFT-based direct Poisson solver on CUDA GPUs
Oyarzun et al. Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers
Woźniak Fast GPU integration algorithm for isogeometric finite element method solvers using task dependency graphs
Stpiczyński Algorithmic and language-based optimization of Marsa-LFIB4 pseudorandom number generator using OpenMP, OpenACC and CUDA
Vasconcellos et al. Accelerating simulations of cardiac electrical dynamics through a multi‐GPU platform and an optimized data structure
Kiss et al. High locality and increased intra-node parallelism for solving finite element models on GPUs by novel element-by-element implementation
Magee et al. Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time–space decomposition
Bisson et al. Massive-scale simulations of 2D Ising and Blume-Capel models on rack-scale multi-GPU systems
Bleichrodt et al. Accelerating a barotropic ocean model using a GPU
Pikle et al. Low occupancy high performance elemental products in assembly free FEM on GPU