[go: up one dir, main page]

Catanzaro et al., 2008 - Google Patents

A map reduce framework for programming graphics processors

Catanzaro et al., 2008

View PDF
Document ID
960278597208957899
Author
Catanzaro B
Sundaram N
Keutzer K
Publication year
Publication venue
Workshop on Software Tools for MultiCore Systems

External Links

Snippet

Recent developments in programmable, highly parallel Graphics Processing Units (GPUs) have enabled high performance general purpose computation. We describe a framework designed for high performance GPU programming, built on Nvidia's Compute Unified Device …
Continue reading at parlab.eecs.berkeley.edu (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code
    • G06F8/41Compilation
    • G06F8/44Encoding
    • G06F8/443Optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/44Arrangements for executing specific programmes
    • G06F9/4421Execution paradigms
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/44Arrangements for executing specific programmes
    • G06F9/455Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code
    • G06F8/41Compilation
    • G06F8/45Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/456Parallelism detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/70Software maintenance or management
    • G06F8/76Adapting program code to run in a different environment; Porting
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • G06N5/02Knowledge representation
    • G06N5/022Knowledge engineering, knowledge acquisition

Similar Documents

Publication Publication Date Title
Catanzaro et al. A map reduce framework for programming graphics processors
Catanzaro et al. Fast support vector machine training and classification on graphics processors
Lai et al. HeteroCL: A multi-paradigm programming infrastructure for software-defined reconfigurable computing
Wu et al. Machine learning at facebook: Understanding inference at the edge
De Sa et al. Understanding and optimizing asynchronous low-precision stochastic gradient descent
US8225074B2 (en) Methods and systems for managing computations on a hybrid computing platform including a parallel accelerator
US20200327448A1 (en) Predicting machine learning or deep learning model training time
Yeung et al. Map-reduce as a programming model for custom computing machines
Kumar et al. Energy-efficient machine learning on the edges
US11789711B2 (en) Using artificial intelligence to optimize software to run on heterogeneous computing resource
Wang et al. Deep learning at scale and at ease
CN102779207A (en) Wing profile optimal design method of parallel difference evolutionary algorithm based on open computing language (Open CL)
Meloni et al. ALOHA: an architectural-aware framework for deep learning at the edge
Carvalho et al. Using machine learning techniques to analyze the performance of concurrent kernel execution on GPUs
CN115469931B (en) Instruction optimization method, device, system, equipment and medium of loop program
Li et al. Deep learning and machine learning with gpgpu and cuda: Unlocking the power of parallel computing
Ma et al. Efficiently emulating high-bitwidth computation with low-bitwidth hardware
Palkowski et al. Tuning iteration space slicing based tiled multi-core code implementing Nussinov’s RNA folding
Wang et al. Paralleljs: An execution framework for javascript on heterogeneous systems
WO2022031561A1 (en) Memory usage prediction for machine learning and deep learning models
CN118536565A (en) AI algorithm acceleration method, device, equipment and readable storage medium
Šinkarovs et al. Convolutional neural networks in apl
Ivanov et al. Sten: Productive and efficient sparsity in pytorch
Kayraklioglu et al. A machine-learning-based framework for productive locality exploitation
Nanjappa CAFFE2 QUICK START GUIDE: modular and scalable deep learning made easy