2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2010)

Atlanta, Georgia, USA
4 – 8 December 2010

IEEE Catalog Number: CFP10071-PRT
ISBN: 978-1-4244-9071-4

Table of Contents

Message from the General Co-Chairs
Message from the Program Chair
Organizing Committee
Program Committee
MICRO 2010 Reviewers

Session 1: Transactional Systems

Scalable Speculative Parallelization on Commodity Clusters
    Hanjun Kim, Arun Raman, Feng Liu, Jae W. Lee, and David I. August
Hardware Support for Relaxed Concurrency Control in Transactional Memory
    Utku Aydonat and Tarek S. Abdelrahman
A Dynamically Adaptable Hardware Transactional Memory
    Marc Lupon, Grigorios Magklis, and Antonio González
ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory
    Jaewoong Chung, Luke Yen, Stephan Diestelhorst, Martin Pohlack, Michael Hohmuth, David Christie, and Dan Grossman

Session 2A: Scheduling

Memory Latency Reduction via Thread Throttling
    Hsiang-Yun Cheng, Chung-Hsiang Lin, Jian Li, and Chia-Lin Yang
Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior
    Yoongu Kim, Michael Papamichael, Onur Mutlu, and Mor Harchol-Balter
Voltage Smoothing: Characterizing and Mitigating Voltage Noise in Production Processors via Software-Guided Thread Scheduling
    Vijay Janapa Reddi, Svilen Kanev, Wonyoung Kim, Simone Campanoni, Michael D. Smith, Gu-Yeon Wei, and David Brooks
Task Superscalar: An Out-of-Order Task Pipeline
    Yoav Etsion, Felipe Cabarcas, Alejandro Rico, Alex Ramirez, Rosa M. Badia, Eduard Ayguade, Jesus Labarta, and Mateo Valero

Session 2B: Reliability/Scheduling

Combating Aging with the Colt Duty Cycle Equalizer
    Erika Gunadi, Abhisek A. Sinkar, Nam Sung Kim, and Mikko H. Lipasti
SAFER: Stuck-At-Fault Error Recovery for Memories
    Nak Hee Seong, Dong Hyuk Woo, Vijayalakshmi Srinivasan, Jude A. Rivers, and Hsien-Hsin S. Lee
AVF Stressmark: Towards an Automated Methodology for Bounding the Worst-Case Vulnerability to Soft Errors
    Arun Arvind Nair, Lizy Kurian John, and Lieven Eeckhout
Flexible and Efficient Instruction-Grained Run-Time Monitoring Using On-Chip Reconfigurable Fabric
    Daniel Y. Deng, Daniel Lo, Greg Malysa, Skyler Schneider, and G. Edward Suh

Session 3A: Caching

Achieving Non-Inclusive Cache Performance with Inclusive Caches: Temporal Locality Aware (TLA) Cache Management Policies
    Aamer Jaleel, Eric Borch, Malini Bhandaru, Simon C. Steely Jr., and Joel Emer
STEM: Spatiotemporal Management of Capacity for Intra-core Last Level Caches
    Dongyuan Zhan, Hong Jiang, and Sharad C. Seth
Sampling Dead Block Prediction for Last-Level Caches
    Samira Manabi Khan, Yingying Tian, and Daniel A. Jiménez
The ZCache: Decoupling Ways and Associativity
    Daniel Sanchez and Christos Kozyrakis

Session 3B: Data Parallelism

Efficient Selection of Vector Instructions Using Dynamic Programming
    Rajkishore Barik, Jisheng Zhao, and Vivek Sarkar
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
    Jaekyu Lee, Nagesh B. Lakshminarayana, Hyesoon Kim, and Richard Vuduc
Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?
    Eric S. Chung, Peter A. Milder, James C. Hoe, and Ken Mai
Improving SIMT Efficiency of Global Rendering Algorithms with Architectural Support for Dynamic Micro-Kernels
    Michael Steffen and Joseph Zambreno

Session 4A: Concurrency

InstantCheck: Checking the Determinism of Parallel Programs Using On-the-Fly Incremental Hashing
    Adrian Nistor, Darko Marinov, and Josep Torrellas
Tolerating Concurrency Bugs Using Transactions as Lifeguards
    Jie Yu and Satish Narayanasamy
Architectural Support for Fair Reader-Writer Locking
    Enrique Vallejo, Ramón Beivide, Adrián Cristal, Tim Harris, Fernando Vallejo, Osman Unsal, and Mateo Valero
AtomTracker: A Comprehensive Approach to Atomic Region Inference and Violation Detection
    Abdullah Muzahid, Norimasa Otsuki, and Josep Torrellas

Session 4B: Microarchitecture I

Register Cache System Not for Latency Reduction Purpose
    Ryota Shioya, Kazuo Horio, Masahiro Goshima, and Shuichi Sakai
Synergistic TLBs for High Performance Address Translation in Chip Multiprocessors
    Shekhar Srikantaiah and Mahmut Kandemir
Erasing Core Boundaries for Robust and Configurable Performance
    Shantanu Gupta, Shuguang Feng, Amin Ansari, and Scott Mahlke
Minimal Multi-threading: Finding and Removing Redundant Instructions in Multi-threaded Processors
    Guoping Long, Diana Franklin, Susmit Biswas, Pablo Ortiz, Jason Oberg, Dongrui Fan, and Frederic T. Chong

Session 5A: Memories

Parichute: Generalized Turbocode-Based Error Correction for Near-Threshold Caches
    Timothy N. Miller, Renji Thomas, James Dinan, Bruce Adcock, and Radu Teodorescu
Understanding the Energy Consumption of Dynamic Random Access Memories
    Thomas Vogelsang
Elastic Refresh: Techniques to Mitigate Refresh Penalties in High Density Memory
    Jeffrey Stuecheli, Dimitris Kaseridis, Hillery C. Hunter, and Lizy K. John
Moneta: A High-Performance Storage Array Architecture for Next-Generation, Non-volatile Memories
    Adrian M. Caulfield, Arup De, Joel Coburn, Todor I. Mollow, Rajesh K. Gupta, and Steven Swanson

Session 5B: NoCs

Pseudo-Circuit: Accelerating Communication for On-Chip Interconnection Networks
    Minseon Ahn and Eun Jung Kim
LOFT: A High Performance Network-on-Chip Providing Quality-of-Service Support
    Jin Ouyang and Yuan Xie
Throughput-Effective On-Chip Networks for Manycore Accelerators
    Ali Bakhoda, John Kim, and Tor M. Aamodt
Adaptive Flow Control for Robust Performance and Energy
    Syed Ali Raza Jafri, Yu-Ju Hong, Mithuna Thottethodi, and T.N. Vijaykumar

Session 6A: Coherence

ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment
    Xuehai Qian, Wonsun Ahn, and Josep Torrellas
Virtual Snooping: Filtering Snoops in Virtualized Multi-cores
    Daehoon Kim, Hwanju Kim, and Jaehyuk Huh
Fractal Coherence: Scalably Verifiable Cache Coherence
    Meng Zhang, Alvin R. Lebeck, and Daniel J. Sorin

Session 6B: Microarchitecture II

A Predictive Model for Dynamic Microarchitectural Adaptivity Control
    Christophe Dubach, Timothy M. Jones, Edwin V. Bonilla, and Michael F.P. O’Boyle
ReMAP: A Reconfigurable Heterogeneous Multicore Architecture
    Matthew A. Watkins and David H. Albonesi
Probabilistic Distance-Based Arbitration: Providing Equality of Service for Many-Core CMPs
    Michael M. Lee, John Kim, Dennis Abts, Michael Marty, and Jae W. Lee

Session 7: Tools

Adaptive and Speculative Slack Simulations of CMPs on CMPs
    Jainwei Chen, Lakshmi Kumar Dabbiru, Daniel Wong, Murali Annavaram, and Michel Dubois
SD3: A Scalable Approach to Dynamic Data-Dependence Profiling
    Minjang Kim, Hyesoon Kim, and Chi-Keung Luk
Automatic Parallelization in a Binary Rewriter
    Aparna Kotha, Kapil Anand, Matthew Smithson, Greeshma Yellareddy, and Rajeev Barua

Author Index