PDC Last Min Notes For MCQS - Theory

- Un-Parallelizable Tasks: Tasks that have dependencies and cannot be easily parallelized.

- Distributed memory: each node has its own private memory; nodes communicate via messages.

3. Flynn's Taxonomy, Parallel Architectures:

- Flynn's Taxonomy: Classification of computer architectures based on instruction and data streams.

- Categories: SISD, SIMD, MISD, MIMD.

4. Building a Cluster:

- Cluster: Group of interconnected computers that work together as a single system.

- Components: Compute nodes, interconnection network, storage, software stack.

- Cluster Management: Resource allocation, load balancing, fault tolerance.

- Formula (Gustafson's Law): Speedup = (1 - P) + (P * N), where P is the parallelizable portion and N is the number of processors.
Fundamentals of Parallel and Distributed Computing:

- Parallel computing: Multiple people/computers working together to solve a problem faster.

- Distributed computing: Multiple interconnected computers working together to solve a problem.

- Benefits: Faster processing, increased computational power, improved scalability.

- Parallelism breaks big problems into smaller tasks for simultaneous solving.

- Parallelizable tasks: Independent tasks that can be solved concurrently.

- Un-parallelizable tasks: Tasks with dependencies that can't be easily parallelized.

- Domain decomposition: Dividing a problem based on different data categories.

- Functional decomposition: Breaking a problem into smaller steps or functions.

Flynn's Taxonomy, Parallel Architectures:

- Shared Nothing: Independent nodes/computers solving their tasks without communication.

- Cluster Management: Resource allocation, load balancing, fault tolerance.
—-------------------------------------------- IMPORTANT PLEASE!!!!!!!

A shorter example for the "firstprivate" clause: in this example, the "firstprivate" clause initializes each thread's private copy of the variable "x" with the original value of "x" (which is 0). Each thread then independently performs its calculations on its own private copy.
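The code for that example did not survive in these notes; the following is a minimal sketch of what such a "firstprivate" example typically looks like (the variable name x and the per-thread update are illustrative assumptions, not the original code):

```c
#include <stdio.h>
#include <omp.h>

int main(void) {
    int x = 0;  /* original value that firstprivate copies into each thread */

    /* firstprivate(x): every thread gets its own copy of x,
       initialized to the value x had before the parallel region (0). */
    #pragma omp parallel firstprivate(x)
    {
        int tid = omp_get_thread_num();
        x += tid;                                   /* updates only this thread's copy */
        printf("Thread %d: private x = %d\n", tid, x);
    }

    /* The original x is unchanged because the threads worked on copies. */
    printf("After the parallel region: x = %d\n", x);
    return 0;
}
```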
Data Parallelism vs. Task Parallelism:

* Data Parallelism: Divides data into smaller parts and processes them simultaneously.

- Example: Each student in a group works on a different subset of the same data at the same time.

* Task Parallelism: Executes different tasks concurrently, possibly on different data.

- Key concept: Parallel execution of independent tasks.

Parallelism Granularity:

* Parallelism Granularity: Refers to the size of the tasks or data that are executed in parallel.

Load Balancing:

* Load Balancing: Distributing the workload evenly across processors to maximize parallelism.

- Ensures that each processor has a similar amount of work to do.

- Key concept: Avoiding idle processors while others are still working.
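A minimal OpenMP sketch of load balancing in practice (the uneven work() function and the chunk size are made up for illustration): schedule(dynamic) hands out iterations to whichever thread becomes idle, so no processor sits idle while others still have work.

```c
#include <stdio.h>
#include <omp.h>

/* Placeholder workload whose cost grows with i, so iterations are uneven. */
static long work(int i) {
    long s = 0;
    for (int k = 0; k < i * 1000; k++) s += k;
    return s;
}

int main(void) {
    long total = 0;

    /* Dynamic scheduling balances the load: idle threads grab the next
       chunk of 8 iterations instead of waiting on a fixed assignment. */
    #pragma omp parallel for schedule(dynamic, 8) reduction(+:total)
    for (int i = 0; i < 1000; i++)
        total += work(i);

    printf("total = %ld\n", total);
    return 0;
}
```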
Introduction to Parallel Programming Environments:

* Shared Memory Systems: Multiple processors access a shared memory space.

- Parallel Algorithms using OpenMP:

- Mutual Exclusion: Ensuring exclusive access to shared resources.

- Reduction: Combining results from multiple threads into a single value.
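A minimal OpenMP sketch of both ideas (the summed array and the shared counter are illustrative): the reduction clause combines per-thread partial sums into one value, and the critical section gives threads mutually exclusive access to the shared counter.

```c
#include <stdio.h>
#include <omp.h>

#define N 1000

int main(void) {
    int data[N];
    for (int i = 0; i < N; i++) data[i] = 1;

    long sum = 0;   /* combined by the reduction */
    int hits = 0;   /* protected by mutual exclusion */

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        sum += data[i];              /* each thread accumulates a private partial sum */

        if (data[i] > 0) {
            #pragma omp critical     /* only one thread at a time updates hits */
            hits++;
        }
    }

    printf("sum = %ld, hits = %d\n", sum, hits);
    return 0;
}
```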
- Formula (Amdahl's Law): Speedup = 1 / [(1 - P) + (P / N)]

- Gustafson's Law: Focuses on scaling the problem size with more resources.

- Formula: Speedup = (1 - P) + (P * N)

* Efficiency: Measures how effectively resources are utilized in parallel computing.

- Formula: Efficiency = Speedup / Number of Processors

* Measuring Parallelism:

- P: Fraction of the program that can be parallelized

- N: Number of processors

- Example: If 80% of a program is parallelizable and we have 4 processors (N = 4):

- Speedup (by Gustafson's Law) = (1 - 0.8) + (0.8 * 4) = 3.4

—--------------------------------------------

Note: The code examples are simplified for illustration purposes, and the outputs may vary depending on the system and the number of threads or processors used.

3. Efficiency: Measures how effectively resources are utilized in parallel computing.

- Formula: Efficiency = Speedup / Number of Processors
- N: Number of processors

- Example: If 80% of the program is parallelizable and we have 4 processors (N = 4):

- Speedup (by Gustafson's Law) = (1 - 0.8) + (0.8 * 4) = 3.4

3. **Speedup**: Represents the improvement gained by executing a task in parallel.

- Formula: *Speedup = Sequential Execution Time / Parallel Execution Time*

- Example: If a task takes 10 seconds to execute sequentially and 2 seconds to execute in parallel:

- Speedup = 10 seconds / 2 seconds = 5

4. **Efficiency**: Measures the utilization of resources in parallel execution.

- Formula: *Efficiency = Speedup / Number of Processors*

- Example: If a task achieves a speedup of 5 and uses 4 processors:

- Efficiency = 5 / 4 = 1.25

5. **Measuring Parallelism**: Several metrics can be used to quantify the level of parallelism in a program.

- **Parallel Fraction**: The portion of the program that can be executed in parallel.

- **Parallel Overhead**: The additional time or resources required for parallel execution, beyond the sequential baseline.

- **Scalability**: The ability of a program to maintain or improve performance as the problem size or number of processors increases.

- **Load Balance**: The distribution of work among parallel tasks, aiming for an equal amount of work on each processor.

1. Amdahl's Law: Predicts the achievable speedup for a fixed workload. It takes into account the portion of the computation that is sequential and the portion that can be parallelized.

● Formula: Speedup = 1 / [(1 - P) + (P / N)]

● P: Fraction of the computation that is parallelizable

● N: Number of processors

● Example: If 75% of the computation is parallelizable, the speedup is limited to at most 1 / (1 - 0.75) = 4, no matter how many processors are used.
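To make the two laws above concrete, here is a small worked sketch in C using the example values from earlier in these notes (P = 0.8, N = 4); the helper function names are just for illustration:

```c
#include <stdio.h>

/* Amdahl's Law: fixed problem size. */
static double amdahl(double p, double n)    { return 1.0 / ((1.0 - p) + p / n); }

/* Gustafson's Law: problem size scales with the number of processors. */
static double gustafson(double p, double n) { return (1.0 - p) + p * n; }

int main(void) {
    double p = 0.8, n = 4.0;
    double sa = amdahl(p, n);       /* 1 / (0.2 + 0.2) = 2.5 */
    double sg = gustafson(p, n);    /* 0.2 + 3.2       = 3.4 */

    printf("Amdahl:    speedup = %.2f, efficiency = %.2f\n", sa, sa / n);
    printf("Gustafson: speedup = %.2f, efficiency = %.2f\n", sg, sg / n);
    return 0;
}
```

For the same P and N, the fixed-size (Amdahl) speedup of 2.5 is lower than the scaled (Gustafson) speedup of 3.4, which is exactly the contrast the two laws are meant to capture.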
Commonly used parallel programming models include Shared Memory (e.g., OpenMP), Message Passing (e.g., MPI), and Data Parallel (e.g., CUDA).

9. Parallel Patterns: Parallel patterns are reusable templates for solving common parallel computing problems. They capture the essence of parallel algorithms and provide high-level abstractions for expressing parallelism. Examples of parallel patterns include MapReduce, Parallel Prefix Sum, and Pipeline.

10. Parallel Performance Tools: Tools and frameworks are available to analyze and optimize the performance of parallel programs. These tools help identify performance bottlenecks, measure scalability, analyze parallel execution traces, and optimize resource utilization. Examples include Intel VTune, NVIDIA Nsight, and OpenMP Tools.

1. Amdahl's Law: Speedup = 1 / [(1 - P) + (P / N)], where P is the parallelizable portion and N is the number of parallel processing units.

2. Gustafson's Law: Speedup = (1 - S) + (S * N), where S is the parallelizable portion and N is the number of parallel processing units.

3. Speedup measures the performance improvement achieved by parallel execution: Speedup = Sequential Execution Time / Parallel Execution Time.

4. Amdahl's Law focuses on non-parallelizable portions, Gustafson's Law considers problem size and available resources, and Speedup quantifies actual improvement in parallel execution.
1. **Amdahl's Law:**

- *Amdahl's Law* is a formula that helps us understand the potential *speedup* achievable in a program when *parallelizing* a portion of it.

- The formula for *Amdahl's Law* is: **Max. speedup = 1 / (1 - P)**, where *P* represents the fraction of code that can be parallelized.

2. **Superlinear Speedup:**

- *Superlinear speedup* refers to achieving a *speedup factor greater than the number of processors* used, which is rare in practice.

- Although theoretically not expected, *superlinear speedup* can occur due to factors such as increased memory availability or *nondeterministic algorithms*.

3. **Speedup (ts/tp):**

- *Speedup* is the ratio of the sequential execution time (ts) to the parallel execution time (tp) when utilizing multiple processors to execute a program.

**Gustafson's Law and Scalability:**

- *Scalability* is the ability to maintain performance or efficiency as the number of processors increases.

- *Gustafson's Law* states that if the *problem size scales with the number of processors*, *scalability* can be achieved.

- *Scaled Speedup (S(p))* using *P processors* can be calculated using the formula: **S(p) = fs + P * fp**, where *fs* is the serial (non-parallelizable) fraction of the work and *fp* is the parallelizable fraction.

- *Strong scalability* occurs when increasing the number of processors without increasing the problem size maintains efficiency.

- *Weak scalability* occurs when the efficiency remains fixed by increasing the problem size proportionally to the number of processors.
Dependency Analysis in Parallel and Distributed Computing (PDC)

Dependency analysis is a critical step in parallelizing and optimizing programs in parallel and distributed computing. It involves identifying and analyzing the dependencies between instructions or data elements to determine if they can be executed or accessed in parallel. Here is a brief overview of dependency analysis in PDC:

2. Types of Dependencies: There are two main types of dependencies: control dependence and data dependence.

- Control Dependence: Control dependence exists between instructions when the execution of one instruction depends on the outcome of a preceding instruction's control flow, such as conditional branches.

3. Dependence Analysis Techniques: Dependency analysis involves various techniques to identify and analyze dependencies, including:

- Data Flow Analysis: It tracks the flow of data through the program to determine dependencies between instructions or data elements.

- Loop Dependence Analysis: It focuses on analyzing dependencies within loop structures, as loops are crucial for parallelization.

4. Dependency Testing: Dependency testing is performed to determine if dependencies can be safely eliminated or transformed without affecting the correctness of the program. This testing involves evaluating the distance vectors, direction vectors, and dependence directions between instructions or data elements.

1. Example: Loop-Carried Dependence

```python
for i in range(1, N):
    A[i] = A[i-1] + C[i+1]   # reads the element written by the previous iteration
```

Dependency Analysis: There is a loop-carried dependence between the iterations because A[i] depends on the value of A[i-1] produced in the previous iteration, so the iterations cannot run in parallel as written.

2. Example: No Loop-Carried Dependence

```python
for i in range(1, N - 1):
    A[i] = B[i-1] + C[i+1]   # writes a distinct element of A, only reads B and C
```

Dependency Analysis: There are no dependencies between the iterations because each iteration writes a different element of A and only reads elements of arrays B and C, so the loop can be parallelized (see the OpenMP sketch at the end of this section).

3. Example: Control Dependence

```python
if x > 0:
    y = y + 1
else:
    z = z + 1
```

Dependency Analysis: The updates to y and z are control dependent on the condition x > 0; which assignment executes depends on the outcome of the branch.

4. Example: True Dependency

```python
x = y + z
a = x + b
```
Dependency Analysis: There is a true (flow) dependency between the instructions. The instruction `a = x + b` depends on the value produced by `x = y + z`.

5. Example: Anti Dependency

```python
y = x - c
x = a + b
```

Dependency Analysis: There is an anti-dependency (write-after-read) between the instructions. The instruction `x = a + b` overwrites x, which the earlier instruction `y = x - c` reads, so the second instruction must not execute before the first has read the old value of x.

These examples illustrate common types of dependencies encountered in parallel and distributed computing. The dependency analysis process involves analyzing the code structure, tracking variable usage and definitions, and determining the dependencies based on their characteristics.

It's important to note that the actual analysis and transformation of dependencies may vary depending on the specific programming language, compiler, and optimization techniques used in parallel and distributed computing systems.
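Once the analysis shows that the iterations of Example 2 are independent, the loop can be parallelized directly. A minimal OpenMP sketch (the array size and initialization are illustrative):

```c
#include <stdio.h>
#include <omp.h>

#define N 16

int main(void) {
    double A[N], B[N], C[N];
    for (int i = 0; i < N; i++) { A[i] = 0; B[i] = i; C[i] = 2 * i; }

    /* No loop-carried dependence: each iteration writes its own A[i]
       and only reads B and C, so the iterations can run in parallel. */
    #pragma omp parallel for
    for (int i = 1; i < N - 1; i++)
        A[i] = B[i - 1] + C[i + 1];

    printf("A[1] = %.1f\n", A[1]);
    return 0;
}
```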
—--------------------------------------------
Exam-style Question:

Q: What is parallelism granularity, and why is it important in parallel computing?

A: Parallelism granularity refers to the size of tasks or units of work executed in parallel. It affects concurrency, communication, and synchronization overhead in parallel programs.

Exam-style Question:

Q: What is load balancing in parallel computing, and why is it important?

A: Load balancing involves distributing the workload evenly across the processors in a parallel system. It aims to optimize resource utilization and minimize idle time by ensuring that each processor has a similar amount of work.
- **Concurrency**: Concurrency allows multiple tasks or operations to make progress simultaneously. It allows different tasks to execute independently and concurrently.

- **Synchronization**: Synchronization is the coordination of tasks to ensure proper ordering and consistency in parallel programs. It involves using synchronization primitives like locks, semaphores, and barriers.

Exam-style Question:

Q: What is the difference between concurrency and synchronization?

A: Concurrency allows tasks to make progress simultaneously, while synchronization involves coordinating tasks to ensure proper ordering and consistency using synchronization primitives.

Task Graph and Task Dependency Graph:

- **Task Graph**: A task graph is a representation of the dependencies and relationships among tasks; it enables efficient scheduling and execution of tasks.

- **Data Dependency Analysis**: Data dependency analysis is the process of identifying dependencies between data elements or instructions in a program. It helps determine the order of execution and ensures correctness and data consistency in parallel programs.

Introduction to Parallel Programming Environments:
- **Parallel Programming Environments**: Parallel programming environments provide tools, libraries, and frameworks to develop and execute parallel programs. They offer high-level abstractions and APIs to simplify parallel program development.

Exam-style Question:

Q: What are parallel programming environments, and why are they used?

A: Parallel programming environments provide tools and abstractions to develop and execute parallel programs efficiently.

Shared Memory Systems and Parallel Algorithms using OpenMP:

- **Shared Memory Systems**: Shared memory systems are parallel computing architectures where multiple processors share a common memory space. They allow concurrent access to shared data and support parallel programming models like OpenMP.

- **Parallel Algorithms using OpenMP**: OpenMP is a programming model for shared memory systems. It provides directives and libraries for parallelizing code and exploiting parallelism. Parallel algorithms in OpenMP include mutual exclusion, reduction, and locks.

- **Amdahl's Law**: Amdahl's Law is a formula that predicts the maximum potential speedup of a program when only a portion of it can be parallelized. It highlights the impact of serial portions on the overall achievable speedup.
- **Speedup**: Speedup is the ratio of the execution time in the sequential case to the execution time in the parallel case.

- **Efficiency**: Efficiency quantifies the utilization of resources in parallel computing. It is calculated as the ratio of the speedup achieved to the number of processors used.

Speedup measures the performance improvement from parallel execution, and efficiency quantifies resource utilization in parallel computing.

Measuring Parallelism: Work Done and Speedup:

- **Work Done**: Work done refers to the total amount of computation performed across all processors.

- OpenMP provides a standard API for developing portable parallel applications on shared memory systems.

Exam-style Question:
Q: How do OpenMP and OpenCL differ?

A: They differ in target architecture and programming model, as OpenCL is for heterogeneous systems and uses a task-based programming model.

Distributed Memory Programming using MPI - Basics:

- **MPI**: MPI (Message Passing Interface) is a standard for message passing in distributed memory systems. It enables communication and coordination among multiple processes running on separate nodes.

- **Distributed Memory Programming**: Distributed memory programming involves designing and implementing parallel programs that operate on data distributed across multiple memory spaces. MPI provides the necessary tools and libraries for this purpose.

Exam-style Question:

Q: What is MPI, and what does distributed memory programming involve?

A: MPI is a standard for message passing in distributed memory systems. Distributed memory programming involves designing parallel programs for distributed data and utilizing MPI for communication and coordination.
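A minimal sketch of the message-passing style described above (the ranks, the single integer payload, and the message tag are illustrative; run with at least two processes, e.g. mpirun -np 2):

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    if (rank == 0) {
        int value = 42;
        /* process 0 sends one integer to process 1 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 of %d received %d from rank 0\n", size, value);
    }

    MPI_Finalize();
    return 0;
}
```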
- **Parallelism**: Parallelism refers to the simultaneous execution of multiple tasks or computations, which can improve performance and solve larger problems efficiently.

- **Distributed Computing**: Distributed computing involves utilizing multiple computers or nodes to work together on a task, sharing resources and coordinating their actions.

Exam-style Question:

Q: What is the difference between parallelism and distributed computing?

A: Parallelism is the simultaneous execution of multiple tasks or computations, while distributed computing involves multiple computers working together on a task by sharing resources and coordinating actions.

Flynn's Taxonomy, Parallel Architectures, and Building a Cluster:

- **Flynn's Taxonomy**: Flynn's Taxonomy classifies computer architectures based on the number of instruction streams (single or multiple) and data streams (single or multiple). It categorizes architectures as SISD, SIMD, MISD, and MIMD.
- **Parallel Architectures**: Parallel architectures include shared nothing, shared disk, and shared memory architectures.

- **Building a Cluster**: Building a cluster involves assembling multiple computers or nodes and connecting them to work together as a single system. Clusters are commonly used in parallel and distributed computing.

Exam-style Question:

Q: Explain Flynn's Taxonomy, different parallel architectures, and the process of building a cluster.

A: Flynn's Taxonomy categorizes architectures based on instruction and data streams. Parallel architectures include shared nothing, shared disk, and shared memory. Building a cluster involves assembling multiple computers to function as a unified system.

Data Parallelism vs Task Parallelism:

- **Data Parallelism**: Data parallelism is a parallel programming technique where the same operation is performed concurrently on different subsets of data. It involves dividing the data into chunks and assigning each chunk to a separate processing unit.

- **Task Parallelism**: Task parallelism is a parallel programming technique where different tasks or operations are executed concurrently. Each task can operate on different data or perform different computations independently.
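A hedged OpenMP sketch contrasting the two styles (the array computation and the two "tasks" inside the sections are made-up examples):

```c
#include <stdio.h>
#include <omp.h>

#define N 8

int main(void) {
    int a[N], b[N];

    /* Data parallelism: the same operation is applied to different
       chunks of the array by different threads. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = i * i;

    /* Task parallelism: two different operations run concurrently,
       each in its own section. */
    #pragma omp parallel sections
    {
        #pragma omp section
        {   /* task 1: derive b from a */
            for (int i = 0; i < N; i++) b[i] = a[i] + 1;
        }

        #pragma omp section
        {   /* task 2: report the thread count */
            printf("max threads: %d\n", omp_get_max_threads());
        }
    }

    printf("a[3] = %d, b[3] = %d\n", a[3], b[3]);
    return 0;
}
```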
Parallelism Granularity:

- Granularity ranges from fine-grained parallelism, where many small tasks run concurrently, to coarse-grained parallelism, where larger tasks are executed concurrently.

Load Balancing:
Exam-style Question:

Q: What is load balancing in parallel computing?

A: Load balancing involves distributing the computational workload evenly across multiple processing units to ensure efficient resource utilization.

- **Concurrency**: Concurrency refers to the ability of multiple tasks or operations to make progress simultaneously. It allows overlapping execution and can be achieved through parallelism or interleaved execution.

- **Synchronization**: Synchronization is the coordination of tasks or processes to ensure proper order of execution or to prevent conflicts. It involves mechanisms such as locks, semaphores, and barriers.

Exam-style Question:

Q: What is the difference between concurrency and synchronization?

A: Concurrency enables simultaneous progress of tasks, while synchronization ensures proper order or avoids conflicts through coordination mechanisms.

Task Graph and Task Dependency Graph:
- **Task Graph**: A task graph represents the dependencies and relationships among tasks in a parallel program. It visually depicts the flow of execution and data dependencies between tasks.

- **Task Dependency Graph**: A task dependency graph illustrates the dependencies between tasks, showing which tasks must complete before others can start. It helps in determining the order of task execution.

Exam-style Question:

Q: What is the difference between a task graph and a task dependency graph?

A: A task graph visually depicts tasks and their execution flow, while a task dependency graph represents the dependencies between tasks.
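A small sketch of how such a dependency graph can be expressed directly with OpenMP tasks (tasks A, B, and C and their depend clauses are illustrative; requires OpenMP 4.0 or later):

```c
#include <stdio.h>
#include <omp.h>

int main(void) {
    int x = 0, y = 0;

    #pragma omp parallel
    #pragma omp single      /* one thread creates the tasks; the team runs them */
    {
        /* Task A produces x. */
        #pragma omp task depend(out: x)
        x = 10;

        /* Task B produces y; it has no edge to A, so it may run concurrently. */
        #pragma omp task depend(out: y)
        y = 20;

        /* Task C consumes x and y: the runtime schedules it only after
           A and B finish, exactly as the dependency graph requires. */
        #pragma omp task depend(in: x, y)
        printf("x + y = %d\n", x + y);
    }
    return 0;
}
```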
Data Dependency Analysis:

- **Data Dependency Analysis**: Data dependency analysis is a technique used to identify dependencies between data elements or variables in a program. It helps determine if certain operations can be executed in parallel or if synchronization is required.

Exam-style Question:

Q: What is data dependency analysis, and why is it important?

A: Data dependency analysis identifies dependencies between data elements to determine parallelizability and synchronization requirements.

Exam-style Question:

Q: What are parallel programming environments, and why are they used?

A: Parallel programming environments provide tools and abstractions for developing and executing parallel programs.