Chen et al., 2022 - Google Patents

Exploiting hierarchical parallelism and reusability in tensor kernel processing on heterogeneous HPC systems

Document ID: 38452485182481226
Author: Chen Y; Xiao G; Özsu M; Tang Z; Zomaya A; Li K
Publication year: 2022
Publication venue: 2022 IEEE 38th International Conference on Data Engineering (ICDE)

Snippet

Canonical Polyadic Decomposition (CPD) of sparse tensors is an effective tool in various machine learning and data analytics applications, in which sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is the major performance bottleneck. To overcome this …
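For readers unfamiliar with the kernel named in the snippet, the following is a minimal sketch of a mode-0 MTTKRP over a 3-way sparse tensor stored in coordinate (COO) form. It is a plain sequential reference, not the method of Chen et al.; it does not show their hierarchical parallelism or reuse scheme. The function name `mttkrp_mode0`, the parameter names, and the sample data are all hypothetical and chosen only for illustration.

```python
def mttkrp_mode0(coords, values, B, C, num_rows):
    """Sequential mode-0 MTTKRP for a 3-way sparse tensor in COO form.

    coords   : list of (i, j, k) index triples, one per nonzero
    values   : list of nonzero values, aligned with coords
    B, C     : dense factor matrices for modes 1 and 2 (lists of rows),
               both with the same number of columns (the CPD rank)
    num_rows : size of mode 0, i.e. the number of rows in the result
    """
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(num_rows)]
    # Each nonzero X[i, j, k] contributes X[i, j, k] * B[j, r] * C[k, r]
    # to M[i, r]; the Khatri-Rao product is never formed explicitly.
    for (i, j, k), v in zip(coords, values):
        for r in range(rank):
            M[i][r] += v * B[j][r] * C[k][r]
    return M


# Hypothetical 2 x 2 x 2 tensor with three nonzeros and rank-2 factors.
coords = [(0, 0, 0), (0, 1, 1), (1, 1, 0)]
values = [1.0, 2.0, 3.0]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]

M = mttkrp_mode0(coords, values, B, C, num_rows=2)
print(M)  # [[47.0, 76.0], [45.0, 72.0]]
```

The work per nonzero is proportional to the rank, and the rows of B and C that get read depend on the tensor's sparsity pattern. That irregular access is one common reason the kernel tends to be memory-bound, which is consistent with the snippet calling it the major performance bottleneck.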

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30442—Query optimisation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30389—Query formulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology

Similar Documents

Publication	Publication Date	Title
Zhang et al.	2021	Gamma: Leveraging Gustavson’s algorithm to accelerate sparse matrix multiplication
Gómez-Luna et al.	2021	Benchmarking a new paradigm: An experimental analysis of a real processing-in-memory architecture
Li et al.	2018	HiCOO: Hierarchical storage of sparse tensors
Springer et al.	2017	HPTT: A high-performance tensor transposition C++ library
EP3757754B1 (en)	2023-01-04	Sorting for data-parallel computing devices
Kim et al.	2013	Parallel multi-dimensional range query processing with R-trees on GPU
Choi et al.	2018	Blocking optimization techniques for sparse tensor computation
Koza et al.	2014	Compressed multirow storage format for sparse matrices on graphics processing units
CN102799416B (en)	2014-09-17	GPU-oriented fine grit parallel application mapping method
Gmys et al.	2016	A GPU-based Branch-and-Bound algorithm using Integer–Vector–Matrix data structure
Weigel	2011	Connected-component identification and cluster update on graphics processing units
Yang et al.	2023	Isosceles: Accelerating sparse cnns through inter-layer pipelining
Liu	2015	Parallel and scalable sparse basic linear algebra subprograms
Odemuyiwa et al.	2023	Accelerating sparse data orchestration via dynamic reflexive tiling
Kelefouras et al.	2014	A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD
Chen et al.	2022	Exploiting hierarchical parallelism and reusability in tensor kernel processing on heterogeneous HPC systems
Wang et al.	2023	A novel parallel algorithm for sparse tensor matrix chain multiplication via TCU-acceleration
Xiao et al.	2023	A survey of accelerating parallel sparse linear algebra
Malik et al.	2012	Task scheduling for GPU accelerated hybrid OLAP systems with multi-core support and text-to-integer translation
Samsi et al.	2016	Benchmarking scidb data import on hpc systems
Zhang et al.	2017	Towards GPU-accelerated Web-GIS for query-driven visual exploration
Tavakoli et al.	2024	FSpGEMM: A framework for accelerating sparse general matrix–matrix multiplication using Gustavson’s algorithm on FPGAs
Williams et al.	2008	PERI-auto-tuning memory-intensive kernels for multicore
Kislal et al.	2018	Data access skipping for recursive partitioning methods
Abdel-Hafeez et al.	2018	A comparison-free sorting algorithm on CPUs and GPUs