COMPUTER ORGANIZATION AND DESIGN RISC-V Edition
The Hardware/Software Interface
Chapter 6
Selected topics from Parallel Processors from Client to Cloud
Adapted by Prof. Gheith Abandah
Last updated 13/4/2023
Contents
6.3 SISD, MIMD, SIMD, SPMD, and Vector
6.4 Hardware Multithreading
Chapter 6 — Parallel Processors from Client to Cloud — 2
SIMD
◼ Modern processors feature single instruction, multiple data (SIMD) instruction extensions.
◼ Operate elementwise on vectors of data
◼ E.g., MMX and SSE instructions in x86
◼ Multiple data elements packed in wide registers (64-bit for MMX, 128-bit for SSE)
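The elementwise behavior above can be illustrated with a small software model. This is only a sketch, not real SSE code: it assumes a 128-bit register holding four unsigned 32-bit lanes, and the names `pack`, `unpack`, and `simd_add` are hypothetical helpers introduced for illustration.

```python
LANES = 4                        # assumption: four lanes per register
LANE_BITS = 32                   # assumption: 32-bit unsigned elements
LANE_MASK = (1 << LANE_BITS) - 1


def pack(elements):
    """Pack LANES values into one 128-bit integer, lane 0 in the low bits."""
    assert len(elements) == LANES
    reg = 0
    for i, x in enumerate(elements):
        reg |= (x & LANE_MASK) << (i * LANE_BITS)
    return reg


def unpack(reg):
    """Split a 128-bit integer back into its LANES elements."""
    return [(reg >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def simd_add(a, b):
    """Model one SIMD add: every lane is added independently.

    Overflow wraps inside a lane and never carries into the next lane.
    """
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


a = pack([1, 2, 3, 4])
b = pack([10, 20, 30, 40])
print(unpack(simd_add(a, b)))
```

One call to `simd_add` stands in for a single instruction that produces four results, which is the point of the SIMD bullets: one instruction, multiple data elements.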
§6.4 Hardware Multithreading
Hardware Multithreading
◼ The processor executes multiple threads of execution in
parallel
◼ Replicate registers, PC, etc.
◼ Fast switching between threads
◼ Fine-grain multithreading
◼ Switch threads after each cycle
◼ Interleave instruction execution
◼ If one thread stalls, others are executed
◼ Coarse-grain multithreading
◼ Only switch on long stall (e.g., L2-cache miss)
◼ Simplifies hardware, but doesn’t hide short stalls (e.g., data hazards)
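The difference between the two policies can be seen in a toy simulation. This is a minimal sketch under stated assumptions, not a model of any real pipeline: the processor issues one instruction per cycle, each thread is a list of integers giving the stall (in cycles) that follows each instruction, thread switches are free, and a stall counts as "long" when it is at least `long_stall` cycles. The names `pick` and `simulate` are hypothetical.

```python
def pick(threads, pc, ready, cycle, start):
    """Return the first unfinished, ready thread in round-robin order, or None."""
    n = len(threads)
    for k in range(n):
        t = (start + k) % n
        if pc[t] < len(threads[t]) and ready[t] <= cycle:
            return t
    return None


def simulate(threads, policy, long_stall=4):
    """Return a timeline: the thread issued in each cycle, or None if idle."""
    n = len(threads)
    pc = [0] * n          # next instruction index per thread
    ready = [0] * n       # first cycle in which each thread may issue again
    timeline = []
    current = 0           # last thread that issued
    must_switch = False   # coarse-grain only: current thread hit a long stall
    while any(pc[t] < len(threads[t]) for t in range(n)):
        cycle = len(timeline)
        if policy == "fine":
            # Switch every cycle: start looking just after the last issuer.
            start = (current + 1) % n if timeline else 0
            chosen = pick(threads, pc, ready, cycle, start)
        else:
            done = pc[current] >= len(threads[current])
            if must_switch or done:
                chosen = pick(threads, pc, ready, cycle, (current + 1) % n)
            else:
                # Short stalls are not hidden: the processor just waits.
                chosen = current if ready[current] <= cycle else None
        if chosen is None:
            timeline.append(None)
            continue
        stall = threads[chosen][pc[chosen]]
        pc[chosen] += 1
        ready[chosen] = cycle + 1 + stall
        current = chosen
        must_switch = stall >= long_stall
        timeline.append(chosen)
    return timeline


# Thread 0 has a short 2-cycle stall after its second instruction.
work = [[0, 2, 0], [0, 0, 0]]
print("fine  :", simulate(work, "fine"))
print("coarse:", simulate(work, "coarse"))
```

With this input the fine-grain schedule has no idle cycles, because thread 1 fills the slots while thread 0 waits. The coarse-grain schedule idles for two cycles, since a 2-cycle stall is below the switch threshold. A real coarse-grain design would also pay a pipeline refill cost on each switch, which this sketch ignores.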
Hardware Multithreading
◼ Simultaneous multithreading (SMT): in a multiple-issue, dynamically scheduled processor
◼ Schedule instructions from multiple threads
◼ Instructions from independent threads execute when function units are available
◼ Within threads, dependencies are handled by scheduling and register renaming
◼ Example: Intel Pentium 4 with Hyper-Threading (HT)
◼ Two threads: duplicated registers, shared function units and caches
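The idea of filling issue slots from several threads can be sketched with a simplified model. The assumptions are loud ones: the issue width is 4, each thread is described by the sizes of successive groups of independent instructions, a thread's next group becomes available only in the cycle after its current group has fully issued, and priority rotates among threads each cycle. Real SMT hardware uses dynamic scheduling and register renaming rather than precomputed groups, and `smt_issue` is a hypothetical name.

```python
def smt_issue(threads, width=4):
    """Return, for each cycle, the list of thread ids occupying issue slots."""
    remaining = [list(groups) for groups in threads]  # work on a copy
    schedule = []
    while any(remaining):
        slots = []
        start = len(schedule) % len(threads)  # rotate priority each cycle
        for k in range(len(threads)):
            t = (start + k) % len(threads)
            if not remaining[t]:
                continue  # this thread has finished
            # Take as many independent instructions as free slots allow.
            take = min(remaining[t][0], width - len(slots))
            slots.extend([t] * take)
            remaining[t][0] -= take
            if remaining[t][0] == 0:
                remaining[t].pop(0)  # next group is available next cycle
        schedule.append(slots)
    return schedule


# Thread 0 offers groups of 2, 1, and 3 independent instructions;
# thread 1 offers groups of 1 and 2.
for cycle, slots in enumerate(smt_issue([[2, 1, 3], [1, 2]])):
    print(cycle, slots)
```

In this model, thread 0 alone needs 3 cycles and thread 1 alone needs 2, so running them one after the other takes 5 cycles. Sharing the issue slots finishes both in 3 cycles, because slots that one thread cannot use are given to the other.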
Multithreading Examples
[Figure: issue-slot usage over time (processor cycles) for five approaches: Superscalar, Fine-Grained Multithreading, Coarse-Grained Multithreading, Multiprocessing, and Simultaneous Multithreading. Legend: Thread 1 to Thread 5, and idle slots.]