Mehrez et al., 2018 - Google Patents
Understanding the performances of SMVP on multiprocessor platform
- Document ID
- 2337960006929931258
- Author
- Mehrez I
- Hamdi-Larbi O
- Dufaud T
- Emad N
- Publication year
- 2018
- Publication venue
- Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA)
Snippet
Abstract: Sparse Matrix Vector Product (SMVP) is an important kernel in many scientific applications. In this paper we study the performance of this kernel on a multiprocessor platform using four different compression formats (CSR, CSC, ELL and COO). Our aim is to …
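For readers unfamiliar with the formats named in the snippet, the following is a minimal illustrative sketch of SMVP in two of them, CSR and COO. It is not the authors' implementation. The function names (`to_csr`, `smvp_csr`, `to_coo`, `smvp_coo`) and the small test matrix are assumptions made for illustration, and the code is sequential, so it says nothing about the multiprocessor behaviour the paper studies.

```python
# Illustrative sketch only: sequential SMVP (y = A * x) in CSR and COO.
# All names here are hypothetical and do not come from the paper.

def to_csr(dense):
    """Compress a dense matrix (list of rows) into CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def smvp_csr(values, col_idx, row_ptr, x):
    """Row-wise product: each y[i] only reads the nonzeros of row i."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def to_coo(dense):
    """Compress a dense matrix into COO (row, column, value) triplets."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals


def smvp_coo(rows, cols, vals, x, n_rows):
    """Triplet-wise product: each nonzero scatters into y[row]."""
    y = [0.0] * n_rows
    for i, j, v in zip(rows, cols, vals):
        y[i] += v * x[j]
    return y


A = [[4, 0, 0],
     [0, 0, 2],
     [1, 3, 0]]
x = [1, 2, 3]

y_csr = smvp_csr(*to_csr(A), x)
y_coo = smvp_coo(*to_coo(A), x, n_rows=len(A))
```

In the CSR version each output element depends on one row only, so rows can be computed independently. In the COO version several nonzeros may update the same `y[i]`, which a parallel version would have to coordinate.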
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30442—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5068—Physical circuit design, e.g. layout for integrated circuits or printed circuit boards
- G06F17/5072—Floorplanning, e.g. partitioning, placement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30778—Audio database index structures and management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0207—Addressing or allocation; Relocation with multidimensional access, e.g. row/column, matrix
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
Similar Documents
Publication | Title |
---|---|
US10296556B2 (en) | System and method for efficient sparse matrix processing |
CA3090329C (en) | Neural network accelerator |
US20190266217A1 (en) | Apparatus and method for matrix computation |
Anderson et al. | Communication-avoiding QR decomposition for GPUs |
Ashari et al. | An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs |
KR102065672B1 (en) | Apparatus and method for convolution operation |
Zhang et al. | Algorithm-hardware co-design of attention mechanism on FPGA devices |
KR20190049593A (en) | Method and apparatus for performing operations in convolutional neural network |
US20190095790A1 (en) | Method and apparatus for adapting parameters of neural network |
US20230281271A1 (en) | Distributing matrix multiplication processing among processing nodes |
Sun et al. | Optimizing SpMV for diagonal sparse matrices on GPU |
EP3295300B1 (en) | System and method for determining concurrency factors for dispatch size of parallel processor kernels |
Lee et al. | Flexible group-level pruning of deep neural networks for on-device machine learning |
Zhang et al. | Regularizing irregularity: bitmap-based and portable sparse matrix multiplication for graph data on GPUs |
KR20240149907A (en) | Adaptive tensor computation kernel for sparse neural networks |
Abubaker et al. | Spatiotemporal graph and hypergraph partitioning models for sparse matrix-vector multiplication on many-core architectures |
Mehrez et al. | Understanding the performances of SMVP on multiprocessor platform |
Jiang et al. | Characterizing and optimizing transformer inference on ARM many-core processor |
Mondal et al. | A unified engine for accelerating GNN weighting/aggregation operations, with efficient load balancing and graph-specific caching |
Arrigoni et al. | Efficiently parallelizable Strassen-based multiplication of a matrix by its transpose |
US11989257B2 (en) | Assigning processing threads for matrix-matrix multiplication |
Sun et al. | CRSD: application specific auto-tuning of SpMV for diagonal sparse matrices |
Mehrez et al. | Machine learning for optimal compression format prediction on multiprocessor platform |
Mehrez et al. | Understanding the performances of sparse compression formats using data parallel programming model |
Ibrahim et al. | Improvement of data throughput in data-intensive cloud computing applications |