Cambricon-u: A systolic random increment memory architecture for unary computing

Guo et al., 2023

Document ID
13538713478930611456
Author
Guo H
Zhao Y
Li Z
Hao Y
Liu C
Song X
Li X
Du Z
Zhang R
Guo Q
Chen T
Xu Z
Publication year
2023
Publication venue
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture

Snippet

Unary computing, whose arithmetics require only one logic gate, has enabled efficient DNN processing, especially on strictly power-constrained devices. However, unary computing still confronts the power efficiency bottleneck for buffering unary bitstreams. The buffering of …
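
The claim that unary arithmetic needs only one logic gate can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a deterministic thermometer encoding (the value k/n is an n-bit stream with k leading ones) and a repeat-and-hold alignment of the two operands, and the helper names `thermometer` and `unary_multiply` are hypothetical. Under those assumptions, a bitwise AND of the two aligned streams yields a stream whose fraction of ones equals the product.

```python
def thermometer(k, n):
    """Return an n-bit unary stream with k leading ones, encoding k / n."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return [1] * k + [0] * (n - k)


def unary_multiply(ka, kb, n):
    """Multiply ka/n by kb/n using only a per-bit AND.

    Assumed alignment scheme (illustrative, not the paper's design):
    stream A is repeated n times, while each bit of stream B is held
    for n cycles, so every bit of A meets every bit of B exactly once.
    The result is an n*n-bit stream containing ka * kb ones.
    """
    a = thermometer(ka, n)
    b = thermometer(kb, n)
    stream_a = a * n                                   # repeat A n times
    stream_b = [bit for bit in b for _ in range(n)]    # hold each B bit n cycles
    return [x & y for x, y in zip(stream_a, stream_b)]  # one AND gate per cycle


product = unary_multiply(3, 2, 4)
value = sum(product) / len(product)  # (3/4) * (2/4) = 6/16
```

In this sketch the output stream has n*n bits, so stream length grows quickly with precision; that growth is one plausible reason buffering unary bitstreams is costly, as the snippet suggests.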

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G06F7/505 Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/53 Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformations of program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2207/00 Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00 Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application

Similar Documents

Publication    Title
Catthoor et al. Custom memory management methodology: Exploration of memory organisation for embedded multimedia system design
Liang et al. High‐Level Synthesis: Productivity, Performance, and Software Constraints
Lee et al. Application codesign of near-data processing for similarity search
Nguyen et al. ShortcutFusion: From tensorflow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data
TW201602813A (en) Systems, apparatuses, and methods for feature searching
Drozd et al. Green IT engineering in the view of resource-based approach
Li et al. MeNTT: A compact and efficient processing-in-memory number theoretic transform (NTT) accelerator
Fu et al. 2-in-1 accelerator: Enabling random precision switch for winning both adversarial robustness and efficiency
Gao et al. Millimeter-scale and billion-atom reactive force field simulation on sunway taihulight
US9626334B2 (en) Systems, apparatuses, and methods for K nearest neighbor search
Ghaffar et al. A low power in-DRAM architecture for quantized CNNs using fast Winograd convolutions
Guo et al. Cambricon-u: A systolic random increment memory architecture for unary computing
Li et al. A precision-scalable deep neural network accelerator with activation sparsity exploitation
Potocnik et al. Optimizing foundation model inference on a many-tiny-core open-source risc-v platform
Liu et al. FPGA-Based Sparse Matrix Multiplication Accelerators: From State-of-the-art to Future Opportunities
Lee et al. MVP: An efficient CNN accelerator with matrix, vector, and processing-near-memory units
Choi et al. A deep neural network training architecture with inference-aware heterogeneous data-type
Zhang et al. Tensorcache: Reconstructing memory architecture with sram-based in-cache computing for efficient tensor computations in gpgpus
Li et al. Mathematical framework for optimizing crossbar allocation for reram-based CNN accelerators
Angizi et al. Processing-in-memory acceleration of mac-based applications using residue number system: A comparative study
Servais et al. Adaptive computation reuse for energy-efficient training of deep neural networks
Ghanbari et al. Energy-efficient acceleration of convolutional neural networks using computation reuse
US20230064886A1 (en) Techniques for data type detection with learned metadata
Haghi et al. O⁴-DNN: A Hybrid DSP-LUT-Based Processing Unit With Operation Packing and Out-of-Order Execution for Efficient Realization of Convolutional Neural Networks on FPGA Devices
Luo et al. A single clock cycle approximate adder with hybrid prediction and error compensation methods