Catanzaro et al., 2008 - Google Patents
A map reduce framework for programming graphics processors
Catanzaro et al., 2008
- Document ID: 960278597208957899
- Authors: Catanzaro B; Sundaram N; Keutzer K
- Publication year: 2008
- Publication venue: Workshop on Software Tools for MultiCore Systems
Snippet
Recent developments in programmable, highly parallel Graphics Processing Units (GPUs) have enabled high performance general purpose computation. We describe a framework designed for high performance GPU programming, built on Nvidia's Compute Unified Device …
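The snippet only names the ingredients: MapReduce-style abstractions built on Nvidia's CUDA. As a rough illustration of the pattern the paper targets, the sketch below hand-writes a map stage and a block-level reduce stage directly in CUDA. The kernel names, the square/sum operations, and the launch configuration are illustrative assumptions, not the authors' framework API.

```cuda
// Minimal sketch of the map-reduce pattern on a GPU, written directly in CUDA.
// This is NOT the framework described in the paper; the operations and names
// are illustrative only.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Map stage: apply a per-element function (here, squaring).
__global__ void mapSquare(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * in[i];
}

// Reduce stage: tree reduction in shared memory, one partial sum per block.
__global__ void reduceSum(const float* in, float* out, int n) {
    extern __shared__ float sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

int main() {
    const int n = 1 << 20;
    const int threads = 256;                       // power of two for the tree reduction
    const int blocks = (n + threads - 1) / threads;

    std::vector<float> h_in(n, 1.0f);
    float *d_in, *d_mapped, *d_partial;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_mapped, n * sizeof(float));
    cudaMalloc(&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    mapSquare<<<blocks, threads>>>(d_in, d_mapped, n);
    reduceSum<<<blocks, threads, threads * sizeof(float)>>>(d_mapped, d_partial, n);

    // Finish combining the per-block partial sums on the host.
    std::vector<float> h_partial(blocks);
    cudaMemcpy(h_partial.data(), d_partial, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    float total = 0.0f;
    for (float p : h_partial) total += p;
    printf("sum of squares = %f\n", total);

    cudaFree(d_in); cudaFree(d_mapped); cudaFree(d_partial);
    return 0;
}
```

In a map-reduce framework such stages would presumably be generated from higher-level map/reduce specifications rather than hand-written kernels; the point here is only the division of labor: an element-wise map kernel, then a tree reduction that leaves one partial result per block for a final host-side (or second-kernel) combine.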
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F8/00—Arrangements for software engineering > G06F8/40—Transformations of program code > G06F8/41—Compilation > G06F8/44—Encoding > G06F8/443—Optimisation
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F9/00—Arrangements for programme control, e.g. control unit > G06F9/06—Arrangements for programme control using stored programme, i.e. using internal store of processing equipment to receive and retain programme > G06F9/44—Arrangements for executing specific programmes > G06F9/4421—Execution paradigms
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F9/00—Arrangements for programme control, e.g. control unit > G06F9/06—Arrangements for programme control using stored programme, i.e. using internal store of processing equipment to receive and retain programme > G06F9/46—Multiprogramming arrangements > G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU] > G06F9/5061—Partitioning or combining of resources
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F9/00—Arrangements for programme control, e.g. control unit > G06F9/06—Arrangements for programme control using stored programme, i.e. using internal store of processing equipment to receive and retain programme > G06F9/44—Arrangements for executing specific programmes > G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F8/00—Arrangements for software engineering > G06F8/40—Transformations of program code > G06F8/41—Compilation > G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions > G06F8/456—Parallelism detection
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F8/00—Arrangements for software engineering > G06F8/70—Software maintenance or management > G06F8/76—Adapting program code to run in a different environment; Porting
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions > G06F17/50—Computer-aided design
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F8/00—Arrangements for software engineering > G06F8/30—Creation or generation of source code
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions > G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N99/00—Subject matter not provided for in other groups of this subclass > G06N99/005—Learning machines, i.e. computers in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F11/00—Error detection; Error correction; Monitoring
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computer systems based on biological models > G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F15/00—Digital computers in general; Data processing equipment in general > G06F15/76—Architectures of general purpose stored programme computers
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06F—ELECTRICAL DIGITAL DATA PROCESSING > G06F15/00—Digital computers in general; Data processing equipment in general > G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G—PHYSICS > G06—COMPUTING; CALCULATING; COUNTING > G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N5/00—Computer systems utilising knowledge based models > G06N5/02—Knowledge representation > G06N5/022—Knowledge engineering, knowledge acquisition
Similar Documents
Publication | Title
---|---
Catanzaro et al. | A map reduce framework for programming graphics processors
Catanzaro et al. | Fast support vector machine training and classification on graphics processors
Lai et al. | HeteroCL: A multi-paradigm programming infrastructure for software-defined reconfigurable computing
Wu et al. | Machine learning at Facebook: Understanding inference at the edge
De Sa et al. | Understanding and optimizing asynchronous low-precision stochastic gradient descent
US8225074B2 (en) | Methods and systems for managing computations on a hybrid computing platform including a parallel accelerator
US20200327448A1 (en) | Predicting machine learning or deep learning model training time
Yeung et al. | Map-reduce as a programming model for custom computing machines
Kumar et al. | Energy-efficient machine learning on the edges
US11789711B2 (en) | Using artificial intelligence to optimize software to run on heterogeneous computing resource
Wang et al. | Deep learning at scale and at ease
CN102779207A (en) | Wing profile optimal design method of parallel difference evolutionary algorithm based on open computing language (OpenCL)
Meloni et al. | ALOHA: an architectural-aware framework for deep learning at the edge
Carvalho et al. | Using machine learning techniques to analyze the performance of concurrent kernel execution on GPUs
CN115469931B (en) | Instruction optimization method, device, system, equipment and medium of loop program
Li et al. | Deep learning and machine learning with GPGPU and CUDA: Unlocking the power of parallel computing
Ma et al. | Efficiently emulating high-bitwidth computation with low-bitwidth hardware
Palkowski et al. | Tuning iteration space slicing based tiled multi-core code implementing Nussinov's RNA folding
Wang et al. | ParallelJS: An execution framework for JavaScript on heterogeneous systems
WO2022031561A1 (en) | Memory usage prediction for machine learning and deep learning models
CN118536565A (en) | AI algorithm acceleration method, device, equipment and readable storage medium
Šinkarovs et al. | Convolutional neural networks in APL
Ivanov et al. | STen: Productive and efficient sparsity in PyTorch
Kayraklioglu et al. | A machine-learning-based framework for productive locality exploitation
Nanjappa | Caffe2 Quick Start Guide: Modular and scalable deep learning made easy