Exploiting hierarchical parallelism and reusability in tensor kernel processing on heterogeneous HPC systems
Chen et al., 2022
- Document ID
- 38452485182481226
- Author
- Chen Y
- Xiao G
- Özsu M
- Tang Z
- Zomaya A
- Li K
- Publication year
- 2022
- Publication venue
- 2022 IEEE 38th International Conference on Data Engineering (ICDE)
Snippet
Canonical Polyadic Decomposition (CPD) of sparse tensors is an effective tool in various machine learning and data analytics applications, in which sparse Matricized Tensor Times Khatri-Rao Product (MTTKRP) is the major performance bottleneck. To overcome this …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30442—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30389—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
Similar Documents
Publication | Title |
---|---|
Zhang et al. | Gamma: Leveraging Gustavson’s algorithm to accelerate sparse matrix multiplication |
Gómez-Luna et al. | Benchmarking a new paradigm: An experimental analysis of a real processing-in-memory architecture |
Li et al. | HiCOO: Hierarchical storage of sparse tensors |
Springer et al. | HPTT: A high-performance tensor transposition C++ library |
EP3757754B1 (en) | Sorting for data-parallel computing devices |
Kim et al. | Parallel multi-dimensional range query processing with R-trees on GPU |
Choi et al. | Blocking optimization techniques for sparse tensor computation |
Koza et al. | Compressed multirow storage format for sparse matrices on graphics processing units |
CN102799416B (en) | GPU-oriented fine grit parallel application mapping method |
Gmys et al. | A GPU-based Branch-and-Bound algorithm using Integer–Vector–Matrix data structure |
Weigel | Connected-component identification and cluster update on graphics processing units |
Yang et al. | Isosceles: Accelerating sparse CNNs through inter-layer pipelining |
Liu | Parallel and scalable sparse basic linear algebra subprograms |
Odemuyiwa et al. | Accelerating sparse data orchestration via dynamic reflexive tiling |
Kelefouras et al. | A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD |
Chen et al. | Exploiting hierarchical parallelism and reusability in tensor kernel processing on heterogeneous HPC systems |
Wang et al. | A novel parallel algorithm for sparse tensor matrix chain multiplication via TCU-acceleration |
Xiao et al. | A survey of accelerating parallel sparse linear algebra |
Malik et al. | Task scheduling for GPU accelerated hybrid OLAP systems with multi-core support and text-to-integer translation |
Samsi et al. | Benchmarking SciDB data import on HPC systems |
Zhang et al. | Towards GPU-accelerated Web-GIS for query-driven visual exploration |
Tavakoli et al. | FSpGEMM: A framework for accelerating sparse general matrix–matrix multiplication using Gustavson’s algorithm on FPGAs |
Williams et al. | PERI-auto-tuning memory-intensive kernels for multicore |
Kislal et al. | Data access skipping for recursive partitioning methods |
Abdel-Hafeez et al. | A comparison-free sorting algorithm on CPUs and GPUs |