PDC Last Min Notes for MCQs - Theory

LAST MIN REVISION NOTES: PDC
(PDC, duas, hope :) you got this - you only miss 100 percent of the shots you don't take!

—--------------------------------------

MOD 1: BASICS

1. Fundamentals of Parallel and Distributed Computing:
- Parallel Computing: Simultaneous execution of multiple tasks to solve a larger problem faster.
- Distributed Computing: Computing performed on multiple interconnected computers.
- Benefits of Parallel and Distributed Computing: faster processing, increased computational power, improved scalability.

2. Need for Parallelism, Parallelizable vs. Un-Parallelizable Tasks, Domain and Functional Decomposition:
- Need for Parallelism: Large problems can be divided into smaller tasks for parallel processing.
- Parallelizable Tasks: Tasks that can be executed independently or concurrently.
- Un-Parallelizable Tasks: Tasks that have dependencies and cannot be easily parallelized.
- Domain Decomposition: Breaking a problem into parts based on different data domains.
- Functional Decomposition: Breaking a problem into parts based on different functions or subtasks.

3. Flynn's Taxonomy, Parallel Architectures:
- Flynn's Taxonomy: Classification of computer architectures based on instruction and data streams.
- Categories: SISD, SIMD, MISD, MIMD.
- Parallel Architectures:
  - Shared Nothing: Independent nodes with private memory that communicate via messages.
  - Shared Disk: Nodes share a global disk but have private memory.
  - Shared Memory: Nodes share a common address space, allowing direct memory access.

4. Building a Cluster:
- Cluster: Group of interconnected computers that work together as a single system.
- Components: Compute nodes, interconnection network, storage, software stack.
- Network Topologies: Bus, ring, mesh, tree, hypercube.
- Cluster Management: Resource allocation, load balancing, fault tolerance.

5. Amdahl's Law vs. Gustafson's Law, Speedup, Efficiency, Measuring Parallelism:
- Amdahl's Law: Predicts the maximum speedup achievable when a portion of a program is parallelized while its sequential portion stays fixed.
  - Formula: Speedup = 1 / ((1 - P) + (P / N)), where P is the parallelizable portion and N is the number of processors.
- Gustafson's Law: Assumes the problem size scales with the number of processors and predicts the resulting (scaled) speedup.
  - Formula: Speedup = (1 - P) + (P * N), where P is the parallelizable portion and N is the number of processors.
- Efficiency: Measures how effectively parallel resources are used.
  - Formula: Efficiency = (Speedup / Number of Processors) * 100%.
- Measuring Parallelism:
  - Work Done: The total amount of work performed by a parallel algorithm.
  - Speedup: Ratio of the time taken by the sequential algorithm to the time taken by the parallel algorithm.
  (A small worked sketch of these formulas follows below.)
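As a quick illustration of the formulas above, here is a minimal Python sketch (the function names and the sample values P = 0.8, N = 4 are assumptions made just for this example):

```python
def amdahl_speedup(p, n):
    """Amdahl's Law: fixed workload; p = parallelizable fraction, n = processors."""
    return 1.0 / ((1.0 - p) + p / n)

def gustafson_speedup(p, n):
    """Gustafson's Law: the problem size scales with the number of processors."""
    return (1.0 - p) + p * n

def efficiency(speedup, n):
    """Efficiency = Speedup / Number of processors (multiply by 100 for a percentage)."""
    return speedup / n

p, n = 0.8, 4                       # assumed example values
s_a = amdahl_speedup(p, n)          # 1 / (0.2 + 0.2) = 2.5
s_g = gustafson_speedup(p, n)       # 0.2 + 3.2 = 3.4
print(f"Amdahl speedup    = {s_a:.2f}, efficiency = {efficiency(s_a, n):.0%}")
print(f"Gustafson speedup = {s_g:.2f}, efficiency = {efficiency(s_g, n):.0%}")
```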

—--------------------------------------

SUMMARY FOR CONCEPT
Fundamentals of Parallel and Distributed Computing:
- Parallel computing: Multiple people/computers working together to solve a problem faster.
- Distributed computing: Multiple interconnected computers working together to solve a problem.
- Benefits: Faster processing, increased computational power, improved scalability.

Need for Parallelism, Parallelizable vs. Un-Parallelizable Tasks, Domain and Functional Decomposition:
- Parallelism breaks big problems into smaller tasks for simultaneous solving.
- Parallelizable tasks: Independent tasks that can be solved concurrently.
- Un-parallelizable tasks: Tasks with dependencies that can't be easily parallelized.
- Domain decomposition: Dividing a problem based on different data categories.
- Functional decomposition: Breaking a problem into smaller steps or functions.

Flynn's Taxonomy, Parallel Architectures:
- Flynn's Taxonomy categorizes computer architectures based on instruction and data streams.
- Shared Nothing: Independent nodes, each with its own memory and disk, that cooperate only by exchanging messages.
- Shared Disk: Nodes work on different tasks but share common storage.
- Shared Memory: Nodes work on the same task and share a common memory.

Building a Cluster:
- Cluster: Group of interconnected computers working together.
- Components: Computers, interconnection network, storage, software stack.
- Network Topologies: Different ways computers are connected (bus, ring, mesh, tree, hypercube).
- Cluster Management: Resource allocation, load balancing, fault tolerance.

Amdahl's Law vs. Gustafson's Law, Speedup, Efficiency, Measuring Parallelism:
- Amdahl's Law: Predicts the maximum speedup when only a portion of a program is parallelized.
- Gustafson's Law: Focuses on the speedup obtained when the problem size grows with the available parallel resources.
- Speedup: Ratio of time taken by the sequential algorithm to time taken by the parallel algorithm.
- Efficiency: Measures effective resource utilization.
- Measuring Parallelism: Work Done (total work performed) and Speedup (sequential time divided by parallel time).
—--------------------------------------

A short note on OpenMP's "firstprivate" and "lastprivate" clauses: with "firstprivate", each thread's private copy of the variable "x" is initialized with the original value of "x" (0 in the example being described). Each thread then performs its calculations independently, and the final value of "x" is not shared among threads. The "lastprivate" clause ensures that the value of "y" from the last iteration of the loop is preserved: each thread updates its private copy of "y" independently, but the value of "y" outside the parallel region reflects the value from the last iteration of the loop (4 in this case).

—-------------------------------------- IMPORTANT!

Data Parallelism vs. Task Parallelism:
* Data Parallelism: Divides the data into smaller parts and processes them simultaneously.
  - Example: Each student in a classroom solving a different math problem.
  - Key concept: Parallel processing of independent data elements.
* Task Parallelism: Divides a task into smaller sub-tasks and executes them simultaneously.
  - Example: Each student in a classroom working on a different aspect of a group project.
  - Key concept: Parallel execution of independent tasks.
  (A small sketch contrasting the two follows below.)
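To make the contrast concrete, here is a minimal, hedged Python sketch (the function names, the input list, and the use of `multiprocessing` are illustrative assumptions, not something prescribed by the notes): data parallelism applies the same function to many data items in parallel, while task parallelism runs different functions at the same time.

```python
from multiprocessing import Pool, Process

def square(x):                 # same operation applied to many data items
    return x * x

def summarize(data):           # two *different* sub-tasks of one bigger job
    print("sum =", sum(data))

def find_extremes(data):
    print("min/max =", min(data), max(data))

if __name__ == "__main__":
    data = list(range(10))

    # Data parallelism: identical work, different data elements.
    with Pool(processes=4) as pool:
        print(pool.map(square, data))

    # Task parallelism: different work items running concurrently.
    tasks = [Process(target=summarize, args=(data,)),
             Process(target=find_extremes, args=(data,))]
    for t in tasks: t.start()
    for t in tasks: t.join()
```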
Parallelism Granularity:
* Parallelism Granularity: Refers to the size of the tasks or data units that are executed in parallel.
  - Coarse-grained: The work is split into a few large tasks that execute independently.
  - Fine-grained: The work is split into many small tasks that execute independently.
  - Key concept: Determining the appropriate task size for efficient parallel execution.

Load Balancing:
* Load Balancing: Distributing the workload evenly across processors to maximize parallelism.
  - Ensures that each processor has a similar amount of work to do.
  - Key concept: Avoiding idle processors while others are still working.

Concurrency and Synchronization:
* Concurrency: Execution of multiple tasks overlapping in time.
  - Key concept: Simultaneous progress without a strict order.
* Synchronization: Coordination of tasks to ensure correctness and avoid conflicts.
  - Key concept: Enforcing ordering, mutual exclusion, and data consistency.
  (A small synchronization sketch follows below.)
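The following is a minimal sketch of concurrency plus synchronization in Python, assuming a shared counter protected by a lock (the variable names and thread count are illustrative): without the lock, the concurrent increments can interleave and lose updates; the lock enforces mutual exclusion.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:            # mutual exclusion: one thread in the critical section at a time
            counter += 1      # a read-modify-write on shared state must be protected

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                # 400000 with the lock; possibly less without it
```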
Introduction to Parallel Programming Environments:
* Shared Memory Systems: Multiple processors access a shared memory space.
  - Parallel Algorithms using OpenMP:
    - Mutual Exclusion: Ensuring exclusive access to shared resources.
    - Reduction: Combining results from multiple threads into a single value (see the reduction sketch after this list).
    - Locks: Synchronization mechanisms that protect critical sections.
* Amdahl's Law vs. Gustafson's Law:
  - Amdahl's Law: Calculates the maximum speedup achievable for a fixed workload.
    - Formula: Speedup = 1 / [(1 - P) + (P / N)]
  - Gustafson's Law: Focuses on scaling the problem size with more resources.
    - Formula: Speedup = (1 - P) + (P * N)
* Efficiency: Measures how effectively resources are utilized in parallel computing.
  - Formula: Efficiency = Speedup / Number of Processors
* Measuring Parallelism:
  - Work Done: Total amount of work accomplished by the parallel program.
  - Speedup: Ratio of the execution time of a sequential program to the parallel program's execution time.
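Here is a minimal Python analogue of the reduction pattern mentioned above (this is not OpenMP syntax; the chunking scheme and worker function are assumptions made for the sketch). Each worker reduces its own chunk privately, and the partial results are combined at the end, which is roughly what an OpenMP reduction clause automates.

```python
from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker reduces its own chunk privately (no shared state, no lock needed).
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    # Split the data into roughly equal chunks (simple static load balancing).
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        partials = pool.map(partial_sum, chunks)
    total = sum(partials)                       # final combine step of the reduction
    print(total == sum(x * x for x in data))    # True
```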
—-----------------------------------------------------------------

Shared Memory Systems: Parallel Algorithms using OpenMP
Amdahl's Law vs. Gustafson's Law, Speedup, Efficiency, Measuring Parallelism

1. Amdahl's Law: Calculates the maximum speedup achievable for a fixed workload.
- Formula: Speedup = 1 / [(1 - P) + (P / N)]
- P: Fraction of the program that can be parallelized
- N: Number of processors
- Example: If 80% of a program is parallelizable and we have 4 processors (N = 4):
  - Speedup = 1 / [(1 - 0.8) + (0.8 / 4)] = 1 / 0.4 = 2.5

2. Gustafson's Law: Focuses on scaling the problem size with more resources.
- Formula: Speedup = (1 - P) + (P * N)
- P: Fraction of the program that can be parallelized
- N: Number of processors
- Example: If 20% of a program is sequential (so P = 0.8) and we have 4 processors (N = 4):
  - Speedup = (1 - 0.8) + (0.8 * 4) = 3.4

3. Efficiency: Measures how effectively resources are utilized in parallel computing.
- Formula: Efficiency = Speedup / Number of Processors
- Example: If the speedup is 3 and we have 4 processors:
  - Efficiency = 3 / 4 = 0.75

4. Measuring Parallelism: Work Done and Speedup:
- Work Done: Total amount of work accomplished by the parallel program.
- Speedup: Ratio of the execution time of a sequential program to the parallel program's execution time. (A small timing sketch that measures this follows below.)

Note: The code examples are simplified for illustration purposes, and the outputs may vary depending on the system and the number of threads or processors used.
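To connect these formulas to something measurable, here is a hedged sketch that times a serial run against a multiprocessing run and reports the observed speedup and efficiency (the workload function `count_primes` and the process count are assumptions for illustration; real numbers depend heavily on the machine and on process start-up overhead):

```python
import time
from multiprocessing import Pool

def count_primes(bounds):
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limit, n_procs = 200_000, 4

    # Serial baseline: one worker does all the work.
    t0 = time.perf_counter()
    serial = count_primes((0, limit))
    t_serial = time.perf_counter() - t0

    # Parallel run: same total work, split into equal ranges.
    ranges = [(i * limit // n_procs, (i + 1) * limit // n_procs) for i in range(n_procs)]
    t0 = time.perf_counter()
    with Pool(n_procs) as pool:
        parallel = sum(pool.map(count_primes, ranges))
    t_parallel = time.perf_counter() - t0

    speedup = t_serial / t_parallel
    print(serial == parallel, f"speedup={speedup:.2f}", f"efficiency={speedup / n_procs:.2f}")
```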
—-----------------------------------------------------------------

Continuing the worked definitions (item 1, Amdahl's Law, is covered above):

2. **Gustafson's Law**: Focuses on scaling the problem size with more resources.
- Formula: *Speedup = (1 - P) + (P * N)*
- P: Fraction of the program that is parallelized
- N: Number of processors
- Example: If 80% of the program is parallelizable and we have 4 processors (N = 4):
  - Speedup = (1 - 0.8) + (0.8 * 4) = 3.4

3. **Speedup**: Represents the improvement gained by executing a task in parallel.
- Formula: *Speedup = Sequential Execution Time / Parallel Execution Time*
- Example: If a task takes 10 seconds to execute sequentially and 2 seconds to execute in parallel:
  - Speedup = 10 seconds / 2 seconds = 5

4. **Efficiency**: Measures the utilization of resources in parallel execution.
- Formula: *Efficiency = Speedup / Number of Processors*
- Example: If a task achieves a speedup of 5 and uses 4 processors:
  - Efficiency = 5 / 4 = 1.25 (an efficiency above 1 indicates superlinear speedup, which is rare in practice)

5. **Measuring Parallelism**: Several metrics can be used to quantify the level of parallelism in a program.
- **Parallel Fraction**: The portion of the program that can be executed in parallel.
- **Parallel Overhead**: The additional time or resources required for parallel execution, beyond the sequential baseline.
- **Scalability**: The ability of a program to maintain or improve performance as the problem size or number of processors increases.
- **Load Balance**: The distribution of work among parallel tasks, aiming for an equal or optimal load distribution.

These concepts and metrics are important in analyzing and optimizing parallel algorithms and systems, helping developers make informed decisions about parallelization and resource utilization.

—-----------------------------------------------------------------

MORE NOTES (EASIER)

6. Amdahl's Law: Amdahl's Law is a formula that estimates the maximum speedup achievable by parallelizing a computation with a fixed workload. It takes into account the portion of the computation that is sequential and the portion that can be parallelized.
- Formula: Speedup = 1 / [(1 - P) + (P / N)]
- P: Fraction of the computation that is parallelizable
- N: Number of processors
- Example: If 75% of the computation is parallelizable and we have 4 processors (N = 4):
  - Speedup = 1 / [(1 - 0.75) + (0.75 / 4)] = 1 / 0.4375 ≈ 2.29

7. Parallel Algorithms: These are algorithms specifically designed to take advantage of parallel computing resources. Parallel algorithms divide the problem into subtasks that can be executed simultaneously on different processors, and then combine the results to obtain the final solution. Examples include parallel sorting algorithms, parallel matrix multiplication, and parallel graph algorithms.

8. Parallel Programming Models: Parallel programming models provide a high-level abstraction for expressing parallelism in programs. They offer constructs and APIs for creating parallel tasks, synchronizing them, and managing data dependencies. Common parallel programming models include Shared Memory (e.g., OpenMP), Message Passing (e.g., MPI), and Data Parallel (e.g., CUDA).
9. Parallel Patterns: Parallel patterns are reusable templates for solving common parallel computing problems. They capture the essence of parallel algorithms and provide high-level abstractions for expressing parallelism. Examples of parallel patterns include MapReduce, Parallel Prefix Sum, and Pipeline (a small MapReduce-style sketch appears at the end of this list).

10. Parallel Performance Tools: Tools and frameworks are available to analyze and optimize the performance of parallel programs. These tools help identify performance bottlenecks, measure scalability, analyze parallel execution traces, and optimize resource utilization. Examples include Intel VTune, NVIDIA Nsight, and OpenMP tools.

Quick recap:
1. Amdahl's Law: Speedup = 1 / [(1 - P) + (P / N)], where P is the parallelizable portion and N is the number of parallel processing units.
2. Gustafson's Law: Speedup = (1 - P) + (P * N), with P and N defined as above.
3. Speedup measures the performance improvement achieved by parallel execution: Speedup = Sequential Execution Time / Parallel Execution Time.
4. Amdahl's Law focuses on the non-parallelizable portion, Gustafson's Law considers problem size and available resources, and Speedup quantifies the actual improvement obtained from parallel execution.
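As an illustration of the MapReduce pattern named above, here is a minimal, hedged Python sketch (the word-count task, the helper names, and the use of `multiprocessing.Pool` are assumptions made for the example): each worker maps its chunk of lines to partial counts, and a reduce step merges them.

```python
from collections import Counter
from multiprocessing import Pool

def map_count(lines):
    # Map step: each worker counts words in its own chunk of lines.
    return Counter(word for line in lines for word in line.split())

if __name__ == "__main__":
    text = ["the quick brown fox", "jumps over the lazy dog", "the end"] * 1000
    n_workers = 4
    chunks = [text[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        partial_counts = pool.map(map_count, chunks)
    # Reduce step: merge the per-worker counters into one result.
    totals = Counter()
    for c in partial_counts:
        totals += c
    print(totals.most_common(3))
```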
Worked examples for each subtopic:

1. **Amdahl's Law:**
- *Amdahl's Law* helps us understand the potential *speedup* achievable in a program when *parallelizing* a portion of it.
- In the limit of infinitely many processors, it reduces to **Max. speedup = 1 / (1 - P)**, where *P* is the fraction of code that can be parallelized.
- The *maximum speedup* is the theoretical limit of performance improvement when parallelizing a program: it is *infinite* when all code is parallelizable (*P = 1*) and *1* when no code can be parallelized (*P = 0*).

2. **Superlinear Speedup:**
- *Superlinear speedup* refers to achieving a *speedup factor greater than the number of processors* used.
- Although the simple model says it should not happen, *superlinear speedup* can occur in practice due to factors such as increased aggregate cache and memory availability or *nondeterministic algorithms* (for example, a parallel search finding the answer early).

3. **Speedup (ts/tp):**
- *Speedup* is defined as the ratio of *execution time on a single processor (ts)* to *execution time on a multiprocessor system (tp)*.
- The formula for *speedup* is: **Speedup = ts / tp**.
- *Speedup* quantifies the improvement in performance when utilizing multiple processors to execute a program.

4. **Speedup for N CPUs or Machines:**
- With *Proc* processors, a *parallel fraction (fP)*, and a *serial fraction (fS)* (where fS + fP = 1), the *speedup* can be calculated as: **Speedup = 1 / (fS + fP / Proc)**.

5. **Linear vs. Superlinear Speedup:**
- *Linear speedup*: a speedup factor *equal to the number of processors* used.
- *Superlinear speedup*: a speedup factor *greater than the number of processors* used, which is rare in practice.

6. **Efficiency:**
- *Efficiency* measures the ability to utilize resources effectively without waste.
- In parallel computing, *efficiency* is often defined as the ratio of *speedup to the number of processors*.
- The formula for *efficiency* is: **Efficiency = Speedup / Proc**.

7. **Gustafson's Law:**
- *Gustafson's Law* provides an alternative perspective to *Amdahl's Law*, focusing on increasing the *problem size* as the number of processors increases.
- It states that if the *problem size scales with the number of processors*, good *scalability* can be achieved.
- The *scaled speedup S(p)* using *P processors* is: **S(p) = fS + P * fP**.

8. **Speedups and Efficiencies on Different Problem Sizes:**
- *Speedups* and *efficiencies* can vary for parallel programs depending on the *problem size*.
- *Scalability* is determined by the ability to handle *increasing problem sizes* effectively.
- *Strong scalability*: efficiency is maintained when the number of processors increases while the problem size stays fixed.
- *Weak scalability*: efficiency stays fixed when the problem size grows proportionally to the number of processors.
- (A small sketch comparing fixed-size and scaled speedup follows below.)

These worked examples should help clarify Amdahl's Law, the speedup factor, and Gustafson's Law.
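The following hedged sketch contrasts fixed-size (Amdahl-style) speedup with scaled (Gustafson-style) speedup for a few processor counts, using fS = 0.1 and fP = 0.9 as assumed example fractions:

```python
def fixed_size_speedup(f_s, f_p, procs):
    # Amdahl-style: the problem size stays the same as processors are added.
    return 1.0 / (f_s + f_p / procs)

def scaled_speedup(f_s, f_p, procs):
    # Gustafson-style: the problem grows with the number of processors.
    return f_s + procs * f_p

f_s, f_p = 0.1, 0.9            # assumed serial and parallel fractions
for procs in (2, 4, 8, 16, 64):
    a = fixed_size_speedup(f_s, f_p, procs)
    g = scaled_speedup(f_s, f_p, procs)
    print(f"{procs:>3} procs: fixed-size {a:5.2f} (eff {a/procs:4.2f}),"
          f" scaled {g:6.2f} (eff {g/procs:4.2f})")
```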
—-----------------------------------------------------------------

Dependency Analysis in Parallel and Distributed Computing (PDC)

Dependency analysis is a critical step in parallelizing and optimizing programs in parallel and distributed computing. It involves identifying and analyzing the dependencies between instructions or data elements to determine whether they can be executed or accessed in parallel. Here is a brief overview of dependency analysis in PDC:

1. Goals: The primary goal of dependency analysis is to identify dependencies accurately so that instructions or data elements can be executed in parallel safely and efficiently.

2. Types of Dependencies: There are two main types of dependencies: control dependence and data dependence.
- Control Dependence: Control dependence exists between instructions when the execution of one instruction depends on the outcome of a preceding instruction's control flow, such as a conditional branch.
- Data Dependence: Data dependence exists between instructions when their execution or ordering is determined by the data they read and write, including true, anti, and output dependencies.

3. Dependence Analysis Techniques: Dependency analysis involves various techniques to identify and analyze dependencies, including:
- Data Flow Analysis: Tracks the flow of data through the program to determine dependencies between instructions or data elements.
- Use-Def and Def-Use Chains: These chains identify the usage and definition of variables to analyze data dependencies.
- Loop Dependence Analysis: Focuses on analyzing dependencies within loop structures, as loops are crucial for parallelization.

4. Dependency Testing: Dependency testing is performed to determine whether dependencies can be safely eliminated or transformed without affecting the correctness of the program. This testing involves evaluating distance vectors, direction vectors, and dependence directions between instructions or data elements.

Worked Examples of PDC Dependency Analysis:

1. Example: Loop-Carried Dependence

```python
for i in range(1, N):
    A[i] = A[i - 1] + C[i + 1]
```

Dependency Analysis: There is a loop-carried dependence between the iterations because iteration i reads A[i - 1], which was written by the previous iteration, so the iterations cannot execute independently.

2. Example: Loop-Independent Dependence

```python
for i in range(1, N):
    A[i] = B[i] + C[i]
```

Dependency Analysis: There are no loop-carried dependencies between iterations because each iteration reads B[i] and C[i] and writes only A[i]; any dependence stays within a single iteration, so the loop can be parallelized.

3. Example: Control Dependence

```python
if x > 0:
    y = y + 1
else:
    z = z + 1
```

Dependency Analysis: There is a control dependence between the branches. The execution of the instructions in the if branch depends on the outcome of the condition (x > 0).

4. Example: True Dependency

```python
x = y + z
a = x + b
```

Dependency Analysis: There is a true (read-after-write) dependency between the instructions. The instruction `a = x + b` depends on the value produced by `x = y + z`.

5. Example: Anti Dependency

```python
y = x - c
x = a + b
```

Dependency Analysis: There is an anti (write-after-read) dependency between the instructions. The instruction `x = a + b` writes `x` after `y = x - c` has read it, so the two cannot be reordered even though no value flows between them.

Note: The above examples demonstrate different types of dependencies commonly encountered in parallel and distributed computing. The dependency analysis process involves analyzing the code structure, tracking variable usage and definitions, and determining the dependencies based on their characteristics.

It's important to note that the actual analysis and transformation of dependencies may vary depending on the specific programming language, compiler, and optimization techniques used in parallel and distributed computing systems. (A small sketch relating the first two examples to parallel execution follows below.)
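To tie the examples above back to parallel execution, here is a hedged Python sketch showing that the loop without a carried dependence can be computed in parallel chunks, while the loop-carried example must stay sequential (the array sizes and the chunking scheme are assumptions made for illustration):

```python
from multiprocessing import Pool

N = 12
B = list(range(N))
C = list(range(N))

def independent_chunk(indices):
    # A[i] = B[i] + C[i]: each index can be computed in isolation.
    return [(i, B[i] + C[i]) for i in indices]

if __name__ == "__main__":
    # Parallel version of the loop-independent example.
    A = [0] * N
    chunks = [list(range(start, N, 4)) for start in range(4)]
    with Pool(4) as pool:
        for part in pool.map(independent_chunk, chunks):
            for i, value in part:
                A[i] = value

    # The loop-carried example (A[i] = A[i-1] + C[i+1]) cannot be chunked this way:
    # each iteration needs the A[i-1] that was just produced, so it runs sequentially.
    A2 = [0] * (N + 1)
    C2 = list(range(N + 2))
    for i in range(1, N):
        A2[i] = A2[i - 1] + C2[i + 1]

    print(A[:5], A2[:5])
```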
—-----------------------------------------------------------------

Data Parallelism vs. Task Parallelism:
- **Data Parallelism**: In data parallelism, the same task is performed on different data sets simultaneously. It aims to divide the data and execute the same operations on multiple processors concurrently. The focus is on distributing the workload across processors to achieve parallelism.

- **Task Parallelism**: In task parallelism, different tasks or operations are executed simultaneously by multiple processors. Each processor performs a unique task independently. Task parallelism aims to divide the workload into smaller tasks and execute them concurrently.

Exam-style Question:
Q: What are the differences between data parallelism and task parallelism in parallel computing?
A: Data parallelism focuses on distributing the workload by performing the same task on different data sets concurrently. Task parallelism involves executing different tasks simultaneously by multiple processors.

Parallelism Granularity:

- **Parallelism Granularity**: Parallelism granularity refers to the size of the tasks or units of work that are executed in parallel. It determines the level of concurrency and the amount of communication and synchronization overhead in parallel programs.

Exam-style Question:
Q: What is parallelism granularity, and why is it important in parallel computing?
A: Parallelism granularity refers to the size of tasks or units of work executed in parallel. It affects concurrency, communication, and synchronization overhead in parallel programs.

Load Balancing:

- **Load Balancing**: Load balancing is the process of distributing the workload evenly across multiple processors or nodes in a parallel system. It aims to optimize resource utilization and minimize idle time by ensuring that each processor has a similar amount of work (a small dynamic load-balancing sketch follows after this question).

Exam-style Question:
Q: What is load balancing in parallel computing, and why is it important?
A: Load balancing involves distributing the workload evenly across processors to optimize resource utilization and minimize idle time.
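Here is a minimal, hedged sketch of dynamic load balancing in Python (the job list, the sleep-based workload, and the process count are assumptions for illustration): handing out one job at a time lets a worker that finishes early pull the next job immediately, instead of sitting idle behind a fixed chunk.

```python
import random
import time
from multiprocessing import Pool

def process(job):
    # Jobs take uneven amounts of time, so static chunking could leave workers idle.
    time.sleep(job)
    return job

if __name__ == "__main__":
    jobs = [random.uniform(0.0, 0.05) for _ in range(40)]
    with Pool(processes=4) as pool:
        # chunksize=1 hands out one job at a time, which balances the load dynamically.
        results = list(pool.imap_unordered(process, jobs, chunksize=1))
    print(len(results), "jobs done")
```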
Concurrency and Synchronization:

- **Concurrency**: Concurrency refers to the ability of multiple tasks or operations to make progress simultaneously. It allows different tasks to execute independently and concurrently.

- **Synchronization**: Synchronization is the coordination of tasks to ensure proper ordering and consistency in parallel programs. It involves using synchronization primitives such as locks, barriers, and semaphores to control access to shared resources and enforce mutual exclusion.

Exam-style Question:
Q: What is the difference between concurrency and synchronization in parallel computing?
A: Concurrency refers to the ability of tasks to execute simultaneously, while synchronization involves coordinating tasks to ensure proper ordering and consistency using synchronization primitives.

Task Graph and Task Dependency Graph:

- **Task Graph**: A task graph is a visual representation of tasks and their dependencies in a parallel program. It shows the relationships and dependencies between tasks and helps in understanding the parallel execution flow.

- **Task Dependency Graph**: A task dependency graph illustrates the dependencies between tasks in a parallel program. It provides a detailed representation of the dependencies and enables efficient scheduling and execution of tasks (a small scheduling sketch appears at the end of this subsection).

Exam-style Question:
Q: What are task graphs and task dependency graphs in parallel computing?
A: A task graph is a visual representation of tasks and their dependencies, while a task dependency graph illustrates the dependencies between tasks in a parallel program.

Data Dependency Analysis:

- **Data Dependency Analysis**: Data dependency analysis is the process of identifying dependencies between data elements or instructions in a program. It helps determine the order of execution and ensures correctness and data consistency in parallel programs.

Exam-style Question:
Q: What is data dependency analysis in parallel computing, and why is it important?
A: Data dependency analysis identifies dependencies between data elements or instructions to determine the order of execution and ensure correctness and data consistency in parallel programs.
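Here is a hedged sketch of how a task dependency graph can drive scheduling (the graph, the task names, and the thread pool are made-up assumptions): tasks whose dependencies are all complete form a "wave" that can run in parallel, and waves execute one after another.

```python
from concurrent.futures import ThreadPoolExecutor

# A tiny task dependency graph: each task lists the tasks it depends on.
deps = {"load": [], "clean": ["load"], "stats": ["clean"],
        "plot": ["clean"], "report": ["stats", "plot"]}

def run(name):
    print("running", name)
    return name

done = set()
with ThreadPoolExecutor(max_workers=4) as pool:
    while len(done) < len(deps):
        # A task is ready when all of its dependencies have completed.
        ready = [t for t in deps if t not in done and all(d in done for d in deps[t])]
        # Tasks in the same wave have no unmet dependencies, so they can run in parallel.
        for finished in pool.map(run, ready):
            done.add(finished)
```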
Introduction to Parallel Programming Environments:

- **Parallel Programming Environments**: Parallel programming environments provide tools, libraries, and frameworks to develop and execute parallel programs. They offer high-level abstractions and APIs to simplify parallel program development.

Exam-style Question:
Q: What are parallel programming environments, and why are they used?
A: Parallel programming environments provide tools and abstractions to develop and execute parallel programs efficiently.

Shared Memory Systems: Parallel Algorithms using OpenMP:

- **Shared Memory Systems**: Shared memory systems are parallel computing architectures where multiple processors share a common memory space. They allow concurrent access to shared data and support parallel programming models like OpenMP.

- **Parallel Algorithms using OpenMP**: OpenMP is a programming model for shared memory systems. It provides directives and libraries for parallelizing code and exploiting parallelism. Parallel algorithms in OpenMP include mutual exclusion, reduction, and locks.

Exam-style Question:
Q: What are shared memory systems, and how are parallel algorithms implemented using OpenMP?
A: Shared memory systems are architectures where multiple processors share a common memory. OpenMP is a programming model for such systems, and parallel algorithms in OpenMP utilize techniques like mutual exclusion, reduction, and locks.
case to the execution time in and efficiency quantifies
the parallel case. resource utilization in parallel
computing.

- **Efficiency**: Efficiency
quantifies the utilization of Measuring Parallelism: Work
resources in parallel Done and Speedup:
computing. It is calculated as
the ratio of the speedup
achieved to the number of - **Work Done**: Work done

processors used. refers to the total amount of


computation or tasks
performed in a program. It is a
Exam-style Question: measure of the overall
workload and is often used to
Q: Explain the difference
assess the efficiency of parallel
between Amdahl's Law and
programs.
Gustafson's Law, and define
speedup and efficiency in
parallel computing.
- **Speedup**: Speedup
A: Amdahl's Law predicts measures the performance
maximum speedup when a improvement achieved by
portion of the program is parallel execution. It is
parallelized, while Gustafson's calculated as the ratio of the
Law focuses on scalability. execution time in the serial
Speedup measures case to the execution time in
performance improvement, the parallel case.

31
a standard API for developing
portable parallel applications.
Exam-style Question:

Q: What is meant by work done


and speedup in parallel - **Differences with
computing? OpenMP**: OpenCL and
OpenMP are both parallel
A: Work done quantifies the
programming models but
total computation in a
differ in their target
program, and speedup
architectures and
measures the performance
programming models. OpenCL
improvement achieved by
targets heterogeneous systems
parallel execution.
and employs a task-based
programming model, while

OpenCL basics and differences OpenMP focuses on shared

with OpenMP: memory systems and relies on


directives for parallelization.

- **OpenCL**: OpenCL (Open


Computing Language) is a Exam-style Question:

framework for heterogeneous Q: What is OpenCL, and how


computing that enables the does it differ from OpenMP?
execution of parallel programs
A: OpenCL is a framework for
across different devices, such
heterogeneous computing,
as CPUs and GPUs. It provides
targeting various devices. It
differs from OpenMP in its

32
Distributed Memory Programming using MPI - Basics:

- **MPI**: MPI (Message Passing Interface) is a standard for message passing in distributed memory systems. It enables communication and coordination among multiple processes running on separate nodes (a small message-passing sketch follows after this question).

- **Distributed Memory Programming**: Distributed memory programming involves designing and implementing parallel programs that operate on data distributed across multiple memory spaces. MPI provides the necessary tools and libraries for this purpose.

Exam-style Question:
Q: What is MPI, and what does distributed memory programming involve?
A: MPI is a standard for message passing in distributed memory systems. Distributed memory programming involves designing parallel programs for distributed data and utilizing MPI for communication and coordination.
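As a minimal illustration of message passing, here is a hedged sketch using the `mpi4py` Python bindings (the choice of mpi4py, the message contents, and the file name are assumptions; course material for MPI is typically in C). Each process has its own private memory, so data moves only through explicit send and receive calls.

```python
# Run with, for example: mpiexec -n 2 python mpi_hello.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # this process's id
size = comm.Get_size()          # total number of processes

if rank == 0:
    # Process 0 sends a message to process 1.
    comm.send({"greeting": "hello from rank 0"}, dest=1, tag=11)
    print(f"rank 0 of {size}: message sent")
elif rank == 1:
    msg = comm.recv(source=0, tag=11)
    print(f"rank 1 of {size}: received {msg}")
```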
Fundamentals of Parallel and Distributed Computing:

- **Parallelism**: Parallelism refers to the simultaneous execution of multiple tasks or computations, which can improve performance and solve larger problems efficiently.

- **Distributed Computing**: Distributed computing involves utilizing multiple computers or nodes to work together on a task, sharing resources and coordinating their actions.

Exam-style Question:
Q: Define parallelism and distributed computing.
A: Parallelism is the concurrent execution of tasks, while distributed computing involves multiple computers working together on a task by sharing resources and coordinating actions.

Flynn's Taxonomy, Parallel Architectures, and Building a Cluster:

- **Flynn's Taxonomy**: Flynn's Taxonomy classifies computer architectures based on the number of instruction streams (single or multiple) and data streams (single or multiple). It categorizes architectures as SISD, SIMD, MISD, and MIMD.

- **Parallel Architectures**: Parallel architectures refer to the different types of hardware configurations used for parallel computing. Examples include shared nothing, shared disk, and shared memory architectures.

- **Building a Cluster**: Building a cluster involves assembling multiple computers or nodes and connecting them to work together as a single system. Clusters are commonly used in parallel and distributed computing.

Exam-style Question:
Q: Explain Flynn's Taxonomy, different parallel architectures, and the process of building a cluster.
A: Flynn's Taxonomy categorizes architectures based on instruction and data streams. Parallel architectures include shared nothing, shared disk, and shared memory. Building a cluster involves assembling multiple computers to function as a unified system.
Data Parallelism vs Task Parallelism:

- **Data Parallelism**: Data parallelism is a parallel programming technique where the same operation is performed concurrently on different subsets of data. It involves dividing the data into chunks and assigning each chunk to a separate processing unit.

- **Task Parallelism**: Task parallelism is a parallel programming technique where different tasks or operations are executed concurrently. Each task can operate on different data or perform different computations independently.

Exam-style Question:
Q: What is the difference between data parallelism and task parallelism?
A: Data parallelism involves concurrent processing of data subsets, while task parallelism focuses on executing different tasks concurrently.

Parallelism Granularity:

- **Parallelism Granularity**: Parallelism granularity refers to the size or scale of tasks that are executed in parallel. It can range from fine-grained parallelism, where small tasks are executed concurrently, to coarse-grained parallelism, where larger tasks are executed concurrently.

Exam-style Question:
Q: What is parallelism granularity?
A: Parallelism granularity refers to the size or scale of tasks executed in parallel, ranging from fine-grained to coarse-grained parallelism.

Load Balancing:

- **Load Balancing**: Load balancing is the distribution of computational workload across multiple processing units or nodes in a parallel system. It aims to achieve equal utilization of resources and minimize idle time.

Exam-style Question:
Q: What is load balancing in parallel computing?
A: Load balancing involves distributing the computational workload evenly across multiple processing units to ensure efficient resource utilization.

Concurrency and Synchronization:

- **Concurrency**: Concurrency refers to the ability of multiple tasks or operations to make progress simultaneously. It allows overlapping execution and can be achieved through parallelism or interleaved execution.

- **Synchronization**: Synchronization is the coordination of tasks or processes to ensure a proper order of execution or to prevent conflicts. It involves mechanisms such as locks, semaphores, and barriers.

Exam-style Question:
Q: Define concurrency and synchronization in parallel computing.
A: Concurrency enables simultaneous progress of tasks, while synchronization ensures proper order or avoids conflicts through coordination mechanisms.
Task Graph and Task Dependency Graph:

- **Task Graph**: A task graph represents the dependencies and relationships among tasks in a parallel program. It visually depicts the flow of execution and the data dependencies between tasks.

- **Task Dependency Graph**: A task dependency graph illustrates the dependencies between tasks, showing which tasks must complete before others can start. It helps in determining the order of task execution.

Exam-style Question:
Q: What are task graphs and task dependency graphs?
A: A task graph depicts task relationships and execution flow, while a task dependency graph represents the dependencies between tasks.

Data Dependency Analysis:

- **Data Dependency Analysis**: Data dependency analysis is a technique used to identify dependencies between data elements or variables in a program. It helps determine whether certain operations can be executed in parallel or whether synchronization is required.

Exam-style Question:
Q: What is data dependency analysis?
A: Data dependency analysis identifies dependencies between data elements in a program to determine parallelizability and synchronization requirements.

Introduction to Parallel Programming Environments:

- **Parallel Programming Environments**: Parallel programming environments provide tools, libraries, and frameworks for developing and executing parallel programs. They offer abstractions and APIs that simplify the task of writing parallel code.

Exam-style Question:
Q: What is meant by parallel programming environments?
A: Parallel programming environments provide tools and abstractions for developing and executing parallel programs.

Note: The provided summaries are condensed versions of the topics. For a comprehensive understanding, it is recommended to refer to the complete material available to you.
