Lee et al., 2022 - Google Patents

MVP: An efficient CNN accelerator with matrix, vector, and processing-near-memory units

Document ID
6824833216726758650
Author
Lee S
Choi J
Jung W
Kim B
Park J
Kim H
Ahn J
Publication year
2022
Publication venue
ACM Transactions on Design Automation of Electronic Systems (TODAES)

Snippet

Mobile and edge devices become common platforms for inferring convolutional neural networks (CNNs) due to superior privacy and service quality. To reduce the computational costs of convolution (CONV), recent CNN models adopt depth-wise CONV (DW-CONV) and …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G06F17/5045 Circuit design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformations of program code
    • G06F8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored programme computers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00 Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS

Similar Documents

Publication | Publication Date | Title
Xu et al. | | A survey of design and optimization for systolic array-based DNN accelerators
Hegde et al. | | ExTensor: An accelerator for sparse tensor algebra
Qu et al. | | DOTA: Detect and omit weak attentions for scalable transformer acceleration
Zhang et al. | | BoostGCN: A framework for optimizing GCN inference on FPGA
Mittal | | A survey of techniques for approximate computing
Mahajan et al. | | TABLA: A unified template-based framework for accelerating statistical machine learning
Nguyen et al. | | ShortcutFusion: From TensorFlow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data
Chen et al. | | A high-throughput neural network accelerator
Lee et al. | | MVP: An efficient CNN accelerator with matrix, vector, and processing-near-memory units
US12321849B1 (en) | | Performing hardware operator fusion
Liu et al. | | An efficient FPGA-based depthwise separable convolutional neural network accelerator with hardware pruning
Arora et al. | | Tensor slices: FPGA building blocks for the deep learning era
Que et al. | | Remarn: A reconfigurable multi-threaded multi-core accelerator for recurrent neural networks
Pellauer et al. | | Symphony: Orchestrating sparse and dense tensors with hierarchical heterogeneous processing
Potocnik et al. | | Optimizing foundation model inference on a many-tiny-core open-source RISC-V platform
Raha et al. | | Efficient hardware acceleration of emerging neural networks for embedded machine learning: An industry perspective
Cicek et al. | | Energy efficient boosting of GEMM accelerators for DNN via reuse
Ioannou et al. | | Streaming Overlay Architecture for Lightweight LSTM Computation on FPGA SoCs
Gan et al. | | High performance reconfigurable computing for numerical simulation and deep learning
Agullo et al. | | Task-based sparse hybrid linear solver for distributed memory heterogeneous architectures
Qararyah et al. | | An efficient hybrid deep learning accelerator for compact and heterogeneous CNNs
Shin et al. | | PIMFlow: Compiler and runtime support for CNN models on processing-in-memory DRAM
CN113642722A (en) | | Chip for convolution calculation, control method thereof and electronic device
Gupta et al. | | Store-n-learn: Classification and clustering with hyperdimensional computing across flash hierarchy
Lee et al. | | ReSA: Reconfigurable systolic array for multiple tiny DNN tensors