Guo et al., 2023 - Google Patents

Cambricon-u: A systolic random increment memory architecture for unary computing

Document ID: 13538713478930611456
Author: Guo H; Zhao Y; Li Z; Hao Y; Liu C; Song X; Li X; Du Z; Zhang R; Guo Q; Chen T; Xu Z
Publication year: 2023
Publication venue: Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture

Snippet

Unary computing, whose arithmetics require only one logic gate, has enabled efficient DNN processing, especially on strictly power-constrained devices. However, unary computing still confronts the power efficiency bottleneck for buffering unary bitstreams. The buffering of …
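As a generic illustration of the "only one logic gate" claim in the snippet, the sketch below multiplies two unary bitstreams with a single AND per cycle. It is not taken from the paper and does not model Cambricon-U's memory architecture. The function names (`unary_stream`, `unary_multiply`) and the encoding choice (thermometer code, with one stream repeated and the other held so that every bit pair meets exactly once) are assumptions made for this example.

```python
def unary_stream(value, length):
    """Thermometer-coded unary bitstream: `value` ones followed by zeros."""
    if not 0 <= value <= length:
        raise ValueError("value must lie in [0, length]")
    return [1 if i < value else 0 for i in range(length)]


def unary_multiply(a, b, length):
    """Multiply a/length by b/length using one AND gate per cycle.

    Runs for length * length cycles and returns the number of ones
    in the output stream, which equals a * b.
    """
    stream_a = unary_stream(a, length)
    stream_b = unary_stream(b, length)
    ones = 0
    for t in range(length * length):
        bit_a = stream_a[t % length]   # a's stream repeats every `length` cycles
        bit_b = stream_b[t // length]  # each bit of b is held for `length` cycles
        ones += bit_a & bit_b          # the single AND gate
    return ones


if __name__ == "__main__":
    n = 8
    ones = unary_multiply(3, 5, n)
    # The product (3/8) * (5/8) is represented by `ones` out of n*n output bits.
    print(ones, ones / (n * n))
```

The cost of this simplicity is stream length: the output takes `length * length` cycles, so the bitstreams that must be produced and buffered grow quickly with precision.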

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/50—Adding; Subtracting
- G06F7/505—Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application

Similar Documents

Publication	Publication Date	Title
Catthoor et al.	2013	Custom memory management methodology: Exploration of memory organisation for embedded multimedia system design
Liang et al.	2012	High‐Level Synthesis: Productivity, Performance, and Software Constraints
Lee et al.	2018	Application codesign of near-data processing for similarity search
Nguyen et al.	2022	ShortcutFusion: From tensorflow to FPGA-based accelerator with a reuse-aware memory allocation for shortcut data
TW201602813A (en)	2016-01-16	Systems, apparatuses, and methods for feature searching
Drozd et al.	2017	Green IT engineering in the view of resource-based approach
Li et al.	2022	MeNTT: A compact and efficient processing-in-memory number theoretic transform (NTT) accelerator
Fu et al.	2021	2-in-1 accelerator: Enabling random precision switch for winning both adversarial robustness and efficiency
Gao et al.	2020	Millimeter-scale and billion-atom reactive force field simulation on sunway taihulight
US9626334B2 (en)	2017-04-18	Systems, apparatuses, and methods for K nearest neighbor search
Ghaffar et al.	2020	A low power in-DRAM architecture for quantized CNNs using fast Winograd convolutions
Guo et al.	2023	Cambricon-u: A systolic random increment memory architecture for unary computing
Li et al.	2023	A precision-scalable deep neural network accelerator with activation sparsity exploitation
Potocnik et al.	2024	Optimizing foundation model inference on a many-tiny-core open-source risc-v platform
Liu et al.	2024	FPGA-Based Sparse Matrix Multiplication Accelerators: From State-of-the-art to Future Opportunities
Lee et al.	2022	MVP: An efficient CNN accelerator with matrix, vector, and processing-near-memory units
Choi et al.	2021	A deep neural network training architecture with inference-aware heterogeneous data-type
Zhang et al.	2023	Tensorcache: Reconstructing memory architecture with sram-based in-cache computing for efficient tensor computations in gpgpus
Li et al.	2023	Mathematical framework for optimizing crossbar allocation for reram-based CNN accelerators
Angizi et al.	2021	Processing-in-memory acceleration of mac-based applications using residue number system: A comparative study
Servais et al.	2021	Adaptive computation reuse for energy-efficient training of deep neural networks
Ghanbari et al.	2022	Energy-efficient acceleration of convolutional neural networks using computation reuse
US20230064886A1 (en)	2023-03-02	Techniques for data type detection with learned metadata
Haghi et al.	2020	O⁴-DNN: A Hybrid DSP-LUT-Based Processing Unit With Operation Packing and Out-of-Order Execution for Efficient Realization of Convolutional Neural Networks on FPGA Devices
Luo et al.	2019	A single clock cycle approximate adder with hybrid prediction and error compensation methods