Yu et al., 2024 - Google Patents
Case study: optimization methods with TVM hybrid-op on RISC-V packed SIMDYu et al., 2024
View PDF- Document ID
- 2185950162648510863
- Author
- Yu M
- Yuan C
- Chen T
- Lee J
- Publication year
- Publication venue
- IEEE Access
External Links
Snippet
In recent years, considerable research has focused on the use of custom hardware to accelerate deep learning on edge devices. However, the end-to-end flow of deep learning includes preprocessing and postprocessing. Deep learning hardware accelerators cannot …
- 238000000034 method 0 title abstract description 57
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
- G06F8/4442—Reducing the number of cache misses; Data prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/32—Address formation of the next instruction, e.g. incrementing the instruction counter, jump
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/34—Graphical or visual programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP2007518176A (en) | Automatic generation system for optimized code | |
| Yu et al. | Case study: optimization methods with TVM hybrid-op on RISC-V packed SIMD | |
| Oancea et al. | Financial software on GPUs: between Haskell and Fortran | |
| Moren et al. | Automatic mapping for OpenCL-programs on CPU/GPU heterogeneous platforms | |
| Daly et al. | Synthesizing instruction selection rewrite rules from RTL using SMT | |
| Ballabriga et al. | Symbolic WCET computation | |
| Yu et al. | Optimizing computer vision algorithms with TVM on VLIW architecture based on RVV | |
| Bai et al. | Gtco: Graph and tensor co-design for transformer-based image recognition on tensor cores | |
| Carrington et al. | An idiom-finding tool for increasing productivity of accelerators | |
| Wang et al. | Soter: Analytical tensor-architecture modeling and automatic tensor program tuning for spatial accelerators | |
| Blackmore et al. | Automatically tuning the gcc compiler to optimize the performance of applications running on embedded systems | |
| Fahringer et al. | A unified symbolic evaluation framework for parallelizing compilers | |
| Sovetov | Development of dsl compilers for specialized processors | |
| Mukherjee et al. | DFGenTool: A dataflow graph generation tool for coarse grain reconfigurable architectures | |
| Zheng et al. | Reaap: A reconfigurable and algorithm-oriented array processor with compiler-architecture co-design | |
| Chikin et al. | Memory-access-aware safety and profitability analysis for transformation of accelerator-bound OpenMP loops | |
| Rapaport et al. | Streamlining whole function vectorization in C using higher order vector semantics | |
| Gupta et al. | Efficient Compiler Design for a Geometric Shape Domain-Specific Language: Emphasizing Abstraction and Optimization Techniques. | |
| Winterstein | Separation Logic for High-level Synthesis | |
| Mukhin et al. | Benchmarking Deep Learning Inference on RISC-V CPUs | |
| Singh | An Empirical Study of Programming Languages from the Point of View of Scientific Computing | |
| Luo et al. | A RISC-V extended infrastructure for CNNs through pipelined computing and data dependence optimization | |
| Mihajlenko et al. | A method for decompilation of AMD GCN kernels to OpenCL | |
| Hu et al. | A configurable simulator for neural network processors | |
| Das et al. | ML-driven Hardware Cost Model for MLIR |