
    Keshav Pingali

    In this paper, we discuss various performance overheads in MATLAB codes and propose different program transformation strategies to overcome them. In particular, we demonstrate that high-level source-to-source transformations of MATLAB programs are effective in obtaining substantial performance gains regardless of whether programs are interpreted or later compiled into C or FORTRAN. We argue that automating such transformations provides a promising area of future research.
    Abstract. We have recently developed a new program analysis strategy called fractal symbolic analysis that addresses some of the limitations of techniques such as dependence analysis. In this paper, we show how fractal symbolic analysis can be used to convert between ...
    Abstract. The problem of writing software for multicore processors would be greatly simplified if we could automatically parallelize sequential programs. Although auto-parallelization has been studied for many decades, it has succeeded only in a few application areas such as dense matrix computations. In particular, auto-parallelization of irregular programs, which are organized around large, pointer-based data structures like graphs, has seemed
    Fractal symbolic analysis is a symbolic analysis technique for verifying the legality of program transformations. It is strictly more powerful than dependence analysis; for example, it can be used to verify the legality of blocking LU factorization with pivoting, a task for which ...
    ... and send the block in one message to the right. The best block size depends on the size of the matrix. In the remainder of the paper, we discuss how our compiler generates code similar to that of Figure 2 from the program of Figure 1. 3 Code Generation ...
    ... Calculation of Pseudospectra by the Arnoldi Iteration. Kim-Chuan Toh and Lloyd N. Trefethen, CTC94TR179, May 1994. ... Chunguang Sun, CTC94TR1û5, July 1994. Modifying a Rank-revealing ULLV Decomposition. James M. Lebak and Adam W. ...
    ... Structures. Vladimir Kotlyar, Keshav Pingali and Paul Stodghill, Department of Computer Science, Cornell University, Ithaca, NY 14853, {vladimir,pingali,stodghil}@cs.cornell.edu, June 2, 1997. Abstract ... SPMD translation phase takes an HPF-like parallel program description ...
    Restructuring compilers use dependence analysis to prove that the meaning of a program is not changed by a transformation. A well-known limitation of dependence analysis is that it examines only the memory locations read and written by a statement, and does not assume any particular interpretation for the operations in that statement. Exploiting the semantics of these operations enables a wider set of transformations to be used, and is critical for optimizing important codes such as LU factorization with pivoting. Symbolic execution of programs enables the exploitation of such semantic properties, but it is intractable for all but the simplest programs. In this paper, we propose a new form of symbolic analysis for use in restructuring compilers. Fractal symbolic analysis compares a program and its transformed version by repeatedly simplifying these programs until symbolic analysis becomes tractable, ensuring that equality of simplified programs is sufficient to guarantee equality of the original programs. We present a prototype implementation of fractal symbolic analysis, and show how it can be used to optimize the cache performance of LU factorization with pivoting.
    Control dependence information is useful for a wide range of software maintenance and testing tasks. For example, program slicers use it to determine statements and predicates that might affect the value of a particular variable at a particular program location. In the ...
    ... Tree: Computing Control Regions in Linear Time. David Pearson, Keshav Pingali. pearson@cs.cornell.edu, pingali@cs.cornell.edu ... David Pearson is supported by a Fannie and John Hertz Fellowship. Permission to copy without fee all or part of this material is ...
    We describe a novel approach to sparse and dense SPMD code generation: we view arrays (sparse and dense) as distributed relations and parallel loop execution as distributed relational query evaluation. This approach provides for a uniform treatment of arbitrary sparse matrix formats and partitioning information formats. The relational algebra view of computation and communication sets provides new opportunities for the optimization of node program performance and the reduction of communication set generation and index translation overhead.
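The relational view described above can be illustrated with a small, single-process sketch. It is not the paper's code generator: the function name `spmv_as_query` and the tuple layouts are assumptions made here for illustration, and nothing in it is distributed. A sparse matrix is stored as a relation of `(i, j, value)` tuples, a vector as a relation of `(j, value)` tuples, and the matrix-vector product becomes a join on `j` followed by a grouped sum over `i`.

```python
from collections import defaultdict


def spmv_as_query(a_rel, x_rel):
    """Compute y = A x by treating A and x as relations (hypothetical helper).

    a_rel: iterable of (i, j, a_ij) tuples, one per stored matrix entry.
    x_rel: iterable of (j, x_j) tuples, one per stored vector entry.
    Returns a dict mapping row index i to y_i for rows with at least one match.
    """
    # Build a hash index on the join attribute j of the vector relation.
    x_index = dict(x_rel)
    y = defaultdict(float)
    # Join A with x on j, then aggregate (sum) the products grouped by i.
    for i, j, a_ij in a_rel:
        if j in x_index:
            y[i] += a_ij * x_index[j]
    return dict(y)


if __name__ == "__main__":
    a = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0)]
    x = [(0, 1.0), (1, 2.0), (2, 4.0)]
    print(spmv_as_query(a, x))
```

Because the loop only iterates over stored tuples, the same function works regardless of how the nonzeros were originally laid out, which is the kind of format independence the abstract refers to.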
    ance, string-to-string communication. They do not have a common semantic base that would allow one system to "collaborate" with another. This is the connectivity problem. To address the connectivity problem, a common mathematical bus (the MathBus) will serve as the backbone of the system. Its communication protocols will be based on a typed formal language which provides the semantics for collaboration. A major design objective is to raise the level of communication among software tools, allowing the communication of mathematical objects instead of being restricted to simple strings. Although existing software has contributed substantially to scientific programming productivity, the time taken to generate code remains a major impediment to progress in computational science. This is the code creation problem. In part, this problem is due to the difficulty of expressing certain mathematical techniques as subroutines. The problem of code creation is addressed with a method of transformation and refinement, allowing the transformatio
    Tiling is one of the more important transformations for enhancing locality of reference in programs. Intuitively, tiling a set of loops achieves the effect of interleaving iterations of these loops. Tiling of perfectly-nested loop nests (which are loop nests in which all assignment ...
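As a rough illustration of what tiling a perfectly-nested loop nest looks like, the sketch below tiles the three loops of a dense matrix multiplication. The function names and the `tile` parameter are assumptions made here, not taken from the paper, and Python is used only to show the loop structure (it will not exhibit the cache benefit that motivates tiling in compiled code).

```python
def matmul_naive(a, b, n):
    """Reference i-k-j matrix multiplication on n x n lists of lists."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            for j in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile):
    """Same computation with each loop split into a tile loop and an intra-tile loop."""
    c = [[0] * n for _ in range(n)]
    # Outer loops enumerate tiles of the iteration space.
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Inner loops enumerate the iterations inside one tile;
                # min() handles tiles that overhang the matrix edge.
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        a_ik = a[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += a_ik * b[k][j]
    return c


if __name__ == "__main__":
    n = 5
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j + 1 for j in range(n)] for i in range(n)]
    print(matmul_tiled(a, b, n, tile=2) == matmul_naive(a, b, n))
```

The tiled version performs exactly the same multiply-add operations as the reference version; only the order in which iterations are visited changes, so that iterations of the `i`, `k`, and `j` loops are interleaved within each tile.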
    Many applications that require high-performance computing perform computations on sparse matrices. For example, the finite-element method for solving partial differential equations approximately requires the solution of large linear systems of the form Ax = b where A is a large sparse ...
    We describe and evaluate ordered and unordered algorithms for shared-memory parallel breadth-first search. The unordered algorithm is based on viewing breadth-first search as a fixpoint computation, and in general, it may perform more work than the ordered algorithms while requiring less global synchronization.
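The contrast between the two styles can be sketched sequentially. This is a minimal illustration under assumptions made here, not the algorithms evaluated in the paper: the graph is a dict mapping every node to a list of its neighbors, the function names are hypothetical, and no parallelism is shown. The ordered version processes one level at a time, so each node is labeled once; the unordered version repeatedly relaxes labels from a worklist in arbitrary order until no label changes, which may relabel a node several times.

```python
import math


def bfs_ordered(graph, source):
    """Level-by-level BFS: every node receives its final distance exactly once."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph[u]:
                if dist[v] == math.inf:
                    dist[v] = level + 1
                    next_frontier.append(v)
        # In a parallel setting, this level boundary is a global synchronization point.
        frontier = next_frontier
        level += 1
    return dist


def bfs_unordered(graph, source):
    """Fixpoint BFS: relax distance labels in any order until none can be lowered."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    worklist = [source]
    while worklist:
        u = worklist.pop()  # LIFO order, deliberately not level order
        for v in graph[u]:
            if dist[u] + 1 < dist[v]:
                dist[v] = dist[u] + 1  # may lower a label assigned earlier
                worklist.append(v)
    return dist


if __name__ == "__main__":
    g = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
    print(bfs_ordered(g, 0))
    print(bfs_unordered(g, 0))
```

Both functions converge to the same distances; the unordered one trades possible redundant relaxations for the absence of a per-level barrier.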
    It is becoming important for long-running scientific applications to tolerate hardware faults. The most commonly used approach is checkpoint and restart (CPR) - the computation's state is saved periodically to disk. Upon failure the computation is restarted from the last saved state. The common CPR mechanism, called System-level Checkpointing (SLC), requires modifying the Operating System and the communication libraries to enable them to save the state of the entire parallel application. This approach is not portable since a checkpointer for one system rarely works on another. Application-level Checkpointing (ALC) is a portable alternative where the programmer manually modifies their program to enable CPR, a very labor-intensive task. We are investigating the use of compiler technology to instrument codes to embed the ability to tolerate faults into applications themselves, making them self-checkpointing and self-restarting on any platform. In [9] we described a general approach ...
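To make the idea of application-level checkpointing concrete, here is a minimal hand-written sketch for a sequential loop. It shows the kind of manual modification the abstract calls labor-intensive, not the compiler instrumentation the authors investigate. The function name `run`, its parameters, and the simulated fault are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def run(n_steps, ckpt_path, every=10, fail_at=None):
    """Sum the integers 0..n_steps-1, saving state every `every` steps (hypothetical example)."""
    # Restart path: resume from the last saved state if a checkpoint exists.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    else:
        state = {"step": 0, "total": 0}

    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated hardware fault")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            # Write to a temporary file and rename it, so a crash in the
            # middle of saving never leaves a half-written checkpoint behind.
            tmp_path = ckpt_path + ".tmp"
            with open(tmp_path, "wb") as f:
                pickle.dump(state, f)
            os.replace(tmp_path, ckpt_path)
    return state["total"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "state.ckpt")
        try:
            run(100, path, fail_at=57)
        except RuntimeError:
            pass  # work done after the step-50 checkpoint is lost
        print(run(100, path))  # resumes from step 50 and finishes
```

The program decides for itself what state to save and when, which is what makes the approach portable across platforms.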