The document summarizes a book titled "Multi-Core Architectures and Programming" that discusses multi-core processor architecture and parallel programming using OpenMP. It contains 5 units that cover: (1) multi-core processor architecture including cache coherence and performance issues; (2) parallel programming challenges like synchronization, data sharing and scalability; (3) shared memory programming using OpenMP directives and libraries; (4) distributed memory programming using MPI; and (5) parallel programming case studies on n-body solvers and tree search problems. The book provides an introduction to multi-core processors and parallel programming concepts.


Publication page: https://www.researchgate.net/publication/311986658

Multi-Core Architectures and Programming

Book · October 2016

Citations: 2 · Reads: 14,837

2 authors:

Krishna Sankar P, Tata Consultancy Services Limited (58 publications, 177 citations)

Shangaranarayanee N P, Angel College of Engineering and Technology (25 publications, 116 citations)

All content following this page was uploaded by Krishna Sankar P on 24 August 2020.



MULTI-CORE ARCHITECTURES AND PROGRAMMING

Professional Elective common to all BE / B.Tech departments (Semesters IV / VII)

(CS8083 - Multi-core Architectures and Programming)

As per the Latest Syllabus of Anna University, Chennai

(Regulation 2017)

Mr. P. Krishna Sankar, B.E., M.E.,

Data Science – Freelance Trainer & Consultant

Ms. N. P. Shangaranarayanee, B.E., M.E.,

Associate – Software Development,

Senteena Business Solutions


Preface

This book, "Multi-Core Architectures and Programming", offers an introductory treatment of multi-core processor architecture and of parallel programming with the OpenMP API. It outlines the functional blocks of a multi-core architecture, including the interconnect, caches, and memory, and explains how operating-system process scheduling is performed on a multi-core processor. Shared-memory programming with the OpenMP API and its library routines for the C language is also discussed.

Unit I: Introduces single-core and multi-core processor architectures, together with their working mechanisms: shared-memory architecture, caches, interconnection networks, and queuing policies. The performance issues of single-core and multi-core architectures are compared.

Unit II: Covers the challenges of parallel programming, building on prerequisite operating-system concepts: resource allocation, data sharing, synchronization, deadlock, semaphores, signals, pipes, and threads in multi-core processing.

Unit III: Outlines shared-memory programming with OpenMP. It illuminates how to set up the environment and write a simple OpenMP Application Programming Interface program, assuming only a preliminary knowledge of C.

Unit IV: Introduces distributed-memory programming with MPI, demonstrating step by step how a simple program is compiled and executed across processes. It outlines MPI constructs, derived datatypes, and the corresponding library routines, along with their communication patterns.

Unit V: Develops parallel programs through case studies: n-body solvers (both basic and reduced) and tree search, which finds a tour for the travelling salesman problem by depth-first search. Each case study is parallelized using OpenMP, Pthreads, and MPI, and the unit closes by comparing the OpenMP and MPI implementations.
Contents
UNIT I

MULTI-CORE PROCESSORS

1.1 Single core to Multi-core architectures

1.1.1 Introduction

1.1.2 Single-Core Processor

1.1.3 Multi-core processor

1.1.4 Development

1.1.5 Technical factors

1.1.6 Advantages

1.2 SIMD and MIMD systems

1.2.1 SIMD Systems

1.2.2 MIMD Systems

1.3 Interconnection networks

1.3.1 Shared-memory interconnects

1.3.2 Distributed-memory interconnects

1.4 Symmetric and Distributed Shared Memory Architectures

1.4.1 Symmetric Shared Memory Architectures

1.4.2 Distributed Shared Memory Architectures

1.5 Cache coherence

1.5.1 Snooping cache coherence

1.5.2 Directory-based cache coherence

1.5.3 False sharing

1.6 Performance Issues

1.6.1 Speedup and efficiency

1.6.2 Amdahl’s law

1.6.3 Scalability

1.6.4 Taking timings

1.7 Parallel program design

1.7.1 An example
UNIT II

PARALLEL PROGRAM CHALLENGES

2.1 Performance

2.1.1 Defining Performance

2.1.2 Understanding Algorithmic Complexity

2.1.3 Why Algorithmic Complexity Is Important

2.1.4 Using Algorithmic Complexity with Care

2.1.5 How Structure Impacts Performance

2.1.6 The Impact of Data Structures on Performance

2.2 Scalability

2.2.1 Constraints to Application Scaling

2.2.2 Performance Limited by Serial Code

2.2.3 Superlinear Scaling

2.2.4 Scaling of Library Code

2.2.5 Hardware Constraints to Scaling

2.2.6 Operating System Constraints to Scaling

2.2.7 Multicore Processors and Scaling

2.3 Synchronization and data sharing

2.4 Data races

2.4.1 Using Tools to Detect Data Races

2.4.2 Avoiding Data Races

2.5 Synchronization primitives

2.5.1 Mutexes and Critical Regions

2.5.2 Spin Locks

2.5.3 Semaphores

2.5.4 Readers-Writer Locks

2.5.5 Barriers

2.6 Deadlocks and livelocks

2.7 Communication between threads

2.7.1 Memory, Shared Memory, and Memory-Mapped Files

2.7.2 Condition Variables


2.7.3 Signals and Events

2.7.4 Message Queues

2.7.5 Named Pipes

2.7.6 Communication Through the Network Stack

2.7.7 Other Approaches to Sharing Data Between Threads

2.7.8 Storing Thread-Private Data

UNIT III

SHARED MEMORY PROGRAMMING WITH OpenMP

3.1 OpenMP Execution Model

3.2 Memory Model

3.2.1 Structure of the OpenMP Memory Model

3.2.2 Device Data Environments

3.2.3 The Flush Operation

3.2.4 OpenMP Memory Consistency

3.3 OpenMP Directives

3.3.1 Directive Format

3.3.2 Conditional Compilation

3.3.3 Internal Control Variables

3.3.4 Array Sections

3.3.5 Parallel Construct

3.3.6 Canonical Loop Form

3.4 Work-sharing Constructs

3.4.1 Loop Construct

3.4.2 Sections Construct

3.4.3 Single Construct

3.4.4 workshare Construct

3.5 Library functions

3.5.1 Runtime Library Definitions

3.5.2 Execution Environment Routines

3.5.3 Lock Routines

3.5.4 Timing Routines


3.6 Handling Data and Functional Parallelism

3.6.1 General Data Parallelism

3.6.2 Functional Parallelism

3.7 Handling Loops

3.7.1 Parallel for Pragma

3.7.2 Function omp_get_num_procs

3.8 Performance Considerations

3.8.1 Inverting loops

3.8.2 Conditionally Executing Loops

3.8.3 Scheduling

3.8.4 Loops

UNIT IV

DISTRIBUTED MEMORY PROGRAMMING WITH MPI

4.1 Introduction

4.2 MPI program execution

4.2.1 Compilation and execution

4.2.2 MPI programs

4.2.3 MPI_Init and MPI_Finalize

4.2.4 Communicators, MPI_Comm_size and MPI_Comm_rank

4.2.5 SPMD programs

4.2.6 Communication

4.2.7 Message matching

4.2.8 The status_p argument

4.3 MPI constructs

4.3.1 Datatype Constructors

4.3.2 Subarray Datatype Constructor

4.3.3 Distributed Array Datatype Constructor

4.3.4 Cartesian Constructor

4.3.5 Distributed Graph Constructor


4.4 Libraries

4.4.1 Contexts of communication

4.4.2 Groups of processes

4.4.3 Virtual topologies

4.4.4 Attribute caching

4.4.5 Communicators

4.5 MPI send and receive

4.5.1 MPI_Send

4.5.2 MPI_Recv

4.5.3 Semantics of MPI_Send and MPI_Recv

4.6 Point-to-point and Collective communication

4.6.1 Point-to-point communication

4.6.2 Collective communication

4.7 MPI derived datatypes

4.8 Performance evaluation

4.8.1 Taking timings

4.8.2 Results

4.8.3 Speedup and efficiency

4.8.4 Scalability

UNIT V

PARALLEL PROGRAM DEVELOPMENT – CASE STUDIES

5.1 n-Body solvers

5.1.1 The problem

5.1.2 Two serial programs

5.1.3 Parallelizing the n-body solvers

5.1.4 A word about I/O

5.1.5 Parallelizing the basic solver using OpenMP

5.1.6 Parallelizing the reduced solver using OpenMP

5.1.7 Evaluating the OpenMP codes

5.1.8 Parallelizing the solvers using pthreads


5.1.9 Parallelizing the basic solver using MPI

5.1.10 Parallelizing the reduced solver using MPI

5.1.11 Performance of the MPI solvers

5.2 Tree Search

5.2.1 Recursive depth-first search

5.2.2 Nonrecursive depth-first search

5.2.3 Data structures for the serial implementations

5.2.4 Performance of the serial implementations

5.2.5 Parallelizing tree search

5.2.6 A static parallelization of tree search using pthreads

5.2.7 A dynamic parallelization of tree search using pthreads

5.2.8 Evaluating the Pthreads tree-search programs

5.2.9 Parallelizing the tree-search programs using OpenMP

5.2.10 Performance of the OpenMP implementations

5.3 OpenMP and MPI implementations and comparison

5.3.1 Implementation of tree search using MPI and static partitioning

5.3.2 Implementation of tree search using MPI and dynamic partitioning

