Lam et al., 2004 - Google Patents

A data locality optimizing algorithm

Lam et al., 2004

Document ID: 8970840579794733109
Author: Lam M; Wolf M
Publication year: 2004
Publication venue: ACM SIGPLAN Notices

External Links

Cited by

Snippet

This paper proposes an algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling. The loop transformation algorithm is based on two concepts: a mathematical formulation of reuse and locality, and a loop …

Continue reading at www.cs.cmu.edu (PDF) (other versions)

230000001131 transforming 0 abstract description 15

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
- G06F8/4442—Reducing the number of cache misses; Data prefetching
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/451—Code distribution
- G06F8/452—Loops
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/445—Exploiting fine grain parallelism, i.e. parallelism at instruction level
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/436—Semantic checking
- G06F8/437—Type checking
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/433—Dependency analysis; Data or control flow analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/42—Syntactic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation

Similar Documents

Publication	Publication Date	Title
Rus et al.	2002	Hybrid analysis: static & dynamic memory reference analysis
Ding et al.	2004	Improving effective bandwidth through compiler enhancement of global cache reuse
Hiranandani et al.	1992	Compiling Fortran D for MIMD distributed-memory machines
Yang et al.	2010	A GPGPU compiler for memory optimization and parallelism management
Rogers et al.	1989	Process decomposition through locality of reference
Carr	1993	Memory-hierarchy management
Gokhale et al.	1998	NAPA C: Compiling for a hybrid RISC/FPGA architecture
Leung et al.	2010	A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction
Kennedy et al.	1998	Automatic data layout for distributed-memory machines
Sarkar	1997	Automatic selection of high-order transformations in the IBM XL FORTRAN compilers
Lam et al.	2004	A data locality optimizing algorithm
Franchetti et al.	2005	Efficient utilization of SIMD extensions
Lastovetsky	2008	Parallel computing on heterogeneous networks
Amarasinghe	1997	Parallelizing compiler techniques based on linear inequalities
Su et al.	1994	Processor Tagged Descriptors: A Data Structure for Compiling for Distributed-Memory Multicomputers.
Palermo	1996	Compiler techniques for optimizing communication and data distribution for distributed-memory multicomputers
Nicolau et al.	2009	Techniques for efficient placement of synchronization primitives
Lewis et al.	2000	Pipelining wavefront computations: Experiences and performance
Yang et al.	2008	Fei teng 64 stream processing system: architecture, compiler, and programming
Mellor-Crummey et al.	1998	Simplifying control flow in compiler-generated parallel code
Chang	2017	Toward performance portability for CPUS and GPUS through algorithmic compositions
Muller et al.	2001	Caches with compositional performance
Seo et al.	2002	HPF/JA: extensions of High Performance Fortran for accelerating real‐world applications
Paek	1997	Compiling for distributed memory multiprocessors based on access region analysis
Rinard et al.	2003	Eliminating synchronization bottlenecks using adaptive replication