Joel Emer

Papers by Joel Emer

A comparative study of arbitration algorithms for the Alpha 21364 pipelined router

Computer architecture news, Oct 1, 2002

Optimizing Compression Schemes for Parallel Sparse Tensor Algebra

2023 Data Compression Conference (DCC)

Advanced Technologies

Synthesis Lectures on Computer Architecture, 2020

Hierarchical circuit integrated cache memory

An Architectural Perspective on Soft Errors From Cosmic Radiation

Multiplying Alpha Performance

Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling

2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)

The Sparse Abstract Machine

Fractal

Computer architecture news, Jun 24, 2017

A-Port Networks

ACM Transactions on Reconfigurable Technology and Systems, Sep 1, 2009

(FPL 2015) Scavenger

ACM Transactions on Reconfigurable Technology and Systems, Mar 22, 2017

High-level abstractions separate algorithm design from platform implementation, allowing programmers to focus on algorithms while building complex systems. This separation also provides system programmers and compilers an opportunity to optimize platform services on an application-by-application basis. In field-programmable gate arrays (FPGAs), platform-level malleability extends to the memory system: unlike general-purpose processors, in which memory hardware is fixed at design time, the capacity, associativity, and topology of FPGA memory systems may all be tuned to improve application performance. Since application kernels may explicitly use only a few memory resources, substantial memory capacity may be available to the platform for use on behalf of the user program. In this work, we present Scavenger, which utilizes spare resources to construct program-optimized memories, and we also perform an initial exploration of methods for automating the construction of these application-specific memory hierarchies. Although exploiting spare resources can be beneficial, naïvely consuming all memory resources may cause frequency degradation. To relieve timing pressure in large block RAM (BRAM) structures, we provide microarchitectural techniques to trade memory latency for design frequency. We demonstrate, by examining a set of benchmarks, that our scalable cache microarchitecture achieves performance gains of 7% to 74% (with a geometric mean of 26%) over the baseline cache microarchitecture when scaling the size of first-level caches to the maximum.

Late-binding

A comparative study of arbitration algorithms for the Alpha 21364 pipelined router

Sigplan Notices, Oct 1, 2002

LoopTree: Enabling Exploration of Fused-layer Dataflow Accelerators

2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

RAELLA: Reforming the Arithmetic for Efficient, Low-Resolution, and Low-Loss Analog PIM: No Retraining Required!

Proceedings of the 50th Annual International Symposium on Computer Architecture

The Sparse Abstract Machine

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3

Overview of Deep Neural Networks

Synthesis Lectures on Computer Architecture, 2020

Designing Efficient DNN Models

Springer eBooks, 2020

IEEE 68 Computer

such machines. Asim addresses these needs by providing a framework for creating many models, instead of being a single performance model. More specifically, Asim achieves these goals through modularity and reusability. Modularity helps break down the performance-modeling problem into individual pieces that can be modeled separately, while reusability allows using a software component repeatedly in different contexts. Reusability increases productivity and confidence in the robustness of the software component itself. Asim provides a set of tools that can effectively manage these software components to help model writers deal with a large software base's complexity.

BASIC COMPONENTS

In Asim, the basic software component, or module, will usually represent a physical component of a design, such as a cache, or capture a hardware algorithm's operation, such as the cache's replacement policy. A particular model will be represented as a user-selected hierarchy of modul
