PDC Last Min Notes For MCQS - Theory

- Un-Parallelizable Tasks: Tasks that have dependencies and cannot be easily parallelized.

- Distributed memory: each node has its own private memory; nodes communicate via messages.

3. Flynn's Taxonomy, Parallel Architectures:

- Flynn's Taxonomy: Classification of computer architectures based on instruction and data streams.

- Categories: SISD, SIMD, MISD, MIMD.

4. Building a Cluster:

- Cluster: Group of interconnected computers that work together as a single system.

- Components: Compute nodes, interconnection network, storage, software stack.

- Cluster Management: Resource allocation, load balancing, fault tolerance.

- Formula (Gustafson's Law): Speedup = (1 - P) + (P * N), where P is the parallelizable portion and N is the number of processors.
Fundamentals of Parallel and Distributed Computing:

- Parallel computing: Multiple people/computers working together to solve a problem faster.

- Distributed computing: Multiple interconnected computers working together to solve a problem.

- Benefits: Faster processing, increased computational power, improved scalability.

- Parallelism breaks big problems into smaller tasks for simultaneous solving.

- Parallelizable tasks: Independent tasks that can be solved concurrently.

- Un-parallelizable tasks: Tasks with dependencies that can't be easily parallelized.

- Domain decomposition: Dividing a problem based on different data categories.

- Functional decomposition: Breaking a problem into smaller steps or functions.

Flynn's Taxonomy, Parallel Architectures:

- Shared Nothing: Independent nodes/computers solving their tasks without communication.

- Cluster Management: Resource allocation, load balancing, fault tolerance.
—-------------------------------------------- IMPORTANT PLEASE!!!!!!!

A shorter example for the "firstprivate" clause: in this example, the "firstprivate" clause initializes each thread's private copy of the variable "x" with the original value of "x" (which is 0). Each thread then independently performs its calculations on its own private copy.
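The code for that example did not survive in these notes; the following is a minimal sketch of what such a "firstprivate" example typically looks like (the variable name x and the per-thread update are illustrative assumptions, not the original code):

```c
#include <stdio.h>
#include <omp.h>

int main(void) {
    int x = 0;  /* original value that firstprivate copies into each thread */

    /* firstprivate(x): every thread gets its own copy of x,
       initialized to the value x had before the parallel region (0). */
    #pragma omp parallel firstprivate(x)
    {
        int tid = omp_get_thread_num();
        x += tid;                                   /* updates only this thread's copy */
        printf("Thread %d: private x = %d\n", tid, x);
    }

    /* The original x is unchanged because the threads worked on copies. */
    printf("After the parallel region: x = %d\n", x);
    return 0;
}
```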
Data Parallelism vs. Task Parallelism:

* Data Parallelism: Divides data into smaller parts and processes them simultaneously.

- Example: Each student in a group works on a different subset of the same data at the same time.

* Task Parallelism: Executes different tasks concurrently, possibly on different data.

- Key concept: Parallel execution of independent tasks.

Parallelism Granularity:

* Parallelism Granularity: Refers to the size of the tasks or data that are executed in parallel.

Load Balancing:

* Load Balancing: Distributing the workload evenly across processors to maximize parallelism.

- Ensures that each processor has a similar amount of work to do.

- Key concept: Avoiding idle processors while others are still working.
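A minimal OpenMP sketch of load balancing in practice (the uneven work() function and the chunk size are made up for illustration): schedule(dynamic) hands out iterations to whichever thread becomes idle, so no processor sits idle while others still have work.

```c
#include <stdio.h>
#include <omp.h>

/* Placeholder workload whose cost grows with i, so iterations are uneven. */
static long work(int i) {
    long s = 0;
    for (int k = 0; k < i * 1000; k++) s += k;
    return s;
}

int main(void) {
    long total = 0;

    /* Dynamic scheduling balances the load: idle threads grab the next
       chunk of 8 iterations instead of waiting on a fixed assignment. */
    #pragma omp parallel for schedule(dynamic, 8) reduction(+:total)
    for (int i = 0; i < 1000; i++)
        total += work(i);

    printf("total = %ld\n", total);
    return 0;
}
```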
Introduction to Parallel Programming Environments:

* Shared Memory Systems: Multiple processors access a shared memory space.

- Parallel Algorithms using OpenMP:

- Mutual Exclusion: Ensuring exclusive access to shared resources.

- Reduction: Combining results from multiple threads into a single value.
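A minimal OpenMP sketch of both ideas (the summed array and the shared counter are illustrative): the reduction clause combines per-thread partial sums into one value, and the critical section gives threads mutually exclusive access to the shared counter.

```c
#include <stdio.h>
#include <omp.h>

#define N 1000

int main(void) {
    int data[N];
    for (int i = 0; i < N; i++) data[i] = 1;

    long sum = 0;   /* combined by the reduction */
    int hits = 0;   /* protected by mutual exclusion */

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        sum += data[i];              /* each thread accumulates a private partial sum */

        if (data[i] > 0) {
            #pragma omp critical     /* only one thread at a time updates hits */
            hits++;
        }
    }

    printf("sum = %ld, hits = %d\n", sum, hits);
    return 0;
}
```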
- Formula (Amdahl's Law): Speedup = 1 / [(1 - P) + (P / N)]

- Gustafson's Law: Focuses on scaling the problem size with more resources.

- Formula: Speedup = (1 - P) + (P * N)

* Efficiency: Measures how effectively resources are utilized in parallel computing.

- Formula: Efficiency = Speedup / Number of Processors

* Measuring Parallelism:

- P: Fraction of the program that can be parallelized

- N: Number of processors

- Example: If 80% of a program is parallelizable and we have 4 processors (N = 4):

- Speedup (by Gustafson's Law) = (1 - 0.8) + (0.8 * 4) = 3.4

—--------------------------------------------

Note: The code examples are simplified for illustration purposes, and the outputs may vary depending on the system and the number of threads or processors used.

3. Efficiency: Measures how effectively resources are utilized in parallel computing.

- Formula: Efficiency = Speedup / Number of Processors
- N: Number of processors

- Example: If 80% of the program is parallelizable and we have 4 processors (N = 4):

- Speedup (by Gustafson's Law) = (1 - 0.8) + (0.8 * 4) = 3.4

3. **Speedup**: Represents the improvement gained by executing a task in parallel.

- Formula: *Speedup = Sequential Execution Time / Parallel Execution Time*

- Example: If a task takes 10 seconds to execute sequentially and 2 seconds to execute in parallel:

- Speedup = 10 seconds / 2 seconds = 5

4. **Efficiency**: Measures the utilization of resources in parallel execution.

- Formula: *Efficiency = Speedup / Number of Processors*

- Example: If a task achieves a speedup of 5 and uses 4 processors:

- Efficiency = 5 / 4 = 1.25

5. **Measuring Parallelism**: Several metrics can be used to quantify the level of parallelism in a program.

- **Parallel Fraction**: The portion of the program that can be executed in parallel.

- **Parallel Overhead**: The additional time or resources required for parallel execution, beyond the sequential baseline.

- **Scalability**: The ability of a program to maintain or improve performance as the problem size or number of processors increases.

- **Load Balance**: The distribution of work among parallel tasks, aiming for an equal amount of work on each processor.

1. Amdahl's Law: Predicts the achievable speedup for a fixed workload. It takes into account the portion of the computation that is sequential and the portion that can be parallelized.

● Formula: Speedup = 1 / [(1 - P) + (P / N)]

● P: Fraction of the computation that is parallelizable

● N: Number of processors

● Example: If 75% of the computation is parallelizable, the speedup is limited to at most 1 / (1 - 0.75) = 4, no matter how many processors are used.
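To make the two laws above concrete, here is a small worked sketch in C using the example values from earlier in these notes (P = 0.8, N = 4); the helper function names are just for illustration:

```c
#include <stdio.h>

/* Amdahl's Law: fixed problem size. */
static double amdahl(double p, double n)    { return 1.0 / ((1.0 - p) + p / n); }

/* Gustafson's Law: problem size scales with the number of processors. */
static double gustafson(double p, double n) { return (1.0 - p) + p * n; }

int main(void) {
    double p = 0.8, n = 4.0;
    double sa = amdahl(p, n);       /* 1 / (0.2 + 0.2) = 2.5 */
    double sg = gustafson(p, n);    /* 0.2 + 3.2       = 3.4 */

    printf("Amdahl:    speedup = %.2f, efficiency = %.2f\n", sa, sa / n);
    printf("Gustafson: speedup = %.2f, efficiency = %.2f\n", sg, sg / n);
    return 0;
}
```

For the same P and N, the fixed-size (Amdahl) speedup of 2.5 is lower than the scaled (Gustafson) speedup of 3.4, which is exactly the contrast the two laws are meant to capture.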
Commonly used parallel programming models include Shared Memory (e.g., OpenMP), Message Passing (e.g., MPI), and Data Parallel (e.g., CUDA).

9. Parallel Patterns: Parallel patterns are reusable templates for solving common parallel computing problems. They capture the essence of parallel algorithms and provide high-level abstractions for expressing parallelism. Examples of parallel patterns include MapReduce, Parallel Prefix Sum, and Pipeline.

10. Parallel Performance Tools: Tools and frameworks are available to analyze and optimize the performance of parallel programs. These tools help identify performance bottlenecks, measure scalability, analyze parallel execution traces, and optimize resource utilization. Examples include Intel VTune, NVIDIA Nsight, and OpenMP Tools.

1. Amdahl's Law: Speedup = 1 / [(1 - P) + (P / N)], where P is the parallelizable portion and N is the number of parallel processing units.

2. Gustafson's Law: Speedup = (1 - S) + (S * N), where S is the parallelizable portion and N is the number of parallel processing units.

3. Speedup measures the performance improvement achieved by parallel execution: Speedup = Sequential Execution Time / Parallel Execution Time.

4. Amdahl's Law focuses on non-parallelizable portions, Gustafson's Law considers problem size and available resources, and Speedup quantifies actual improvement in parallel execution.
1. **Amdahl's Law:**

- *Amdahl's Law* is a formula that helps us understand the potential *speedup* achievable in a program when *parallelizing* a portion of it.

- The formula for *Amdahl's Law* is: **Max. speedup = 1 / (1 - P)**, where *P* represents the fraction of code that can be parallelized.

2. **Superlinear Speedup:**

- *Superlinear speedup* refers to achieving a *speedup factor greater than the number of processors* used, which is rare in practice.

- Although theoretically not expected, *superlinear speedup* can occur due to factors such as increased memory availability or *nondeterministic algorithms*.

3. **Speedup (ts/tp):**

- *Speedup* is the ratio of the sequential execution time (ts) to the parallel execution time (tp) when utilizing multiple processors to execute a program.

**Gustafson's Law and Scalability:**

- *Scalability* is the ability to maintain performance or efficiency as the number of processors increases.

- *Gustafson's Law* states that if the *problem size scales with the number of processors*, *scalability* can be achieved.

- *Scaled Speedup (S(p))* using *P processors* can be calculated using the formula: **S(p) = fs + P * fp**, where *fs* is the serial (non-parallelizable) fraction of the work and *fp* is the parallelizable fraction.

- *Strong scalability* occurs when increasing the number of processors without increasing the problem size maintains efficiency.

- *Weak scalability* occurs when the efficiency remains fixed by increasing the problem size proportionally to the number of processors.
Dependency Analysis in Parallel and Distributed Computing (PDC)

Dependency analysis is a critical step in parallelizing and optimizing programs in parallel and distributed computing. It involves identifying and analyzing the dependencies between instructions or data elements to determine if they can be executed or accessed in parallel. Here is a brief overview of dependency analysis in PDC:

2. Types of Dependencies: There are two main types of dependencies: control dependence and data dependence.

- Control Dependence: Control dependence exists between instructions when the execution of one instruction depends on the outcome of a preceding instruction's control flow, such as conditional branches.

3. Dependence Analysis Techniques: Dependency analysis involves various techniques to identify and analyze dependencies, including:

- Data Flow Analysis: It tracks the flow of data through the program to determine dependencies between instructions or data elements.

- Loop Dependence Analysis: It focuses on analyzing dependencies within loop structures, as loops are crucial for parallelization.

4. Dependency Testing: Dependency testing is performed to determine if dependencies can be safely eliminated or transformed without affecting the correctness of the program. This testing involves evaluating the distance vectors, direction vectors, and dependence directions between instructions or data elements.

1. Example: Loop-Carried Dependence

```python
for i in range(1, N):
    A[i] = A[i-1] + C[i+1]   # reads the element written by the previous iteration
```

Dependency Analysis: There is a loop-carried dependence between the iterations because A[i] depends on the value of A[i-1] produced in the previous iteration, so the iterations cannot run in parallel as written.

2. Example: No Loop-Carried Dependence

```python
for i in range(1, N - 1):
    A[i] = B[i-1] + C[i+1]   # writes a distinct element of A, only reads B and C
```

Dependency Analysis: There are no dependencies between the iterations because each iteration writes a different element of A and only reads elements of arrays B and C, so the loop can be parallelized (see the OpenMP sketch at the end of this section).

3. Example: Control Dependence

```python
if x > 0:
    y = y + 1
else:
    z = z + 1
```

Dependency Analysis: The updates to y and z are control dependent on the condition x > 0; which assignment executes depends on the outcome of the branch.

4. Example: True Dependency

```python
x = y + z
a = x + b
```
Dependency Analysis: There is a true (flow) dependency between the instructions. The instruction `a = x + b` depends on the value produced by `x = y + z`.

5. Example: Anti Dependency

```python
y = x - c
x = a + b
```

Dependency Analysis: There is an anti-dependency (write-after-read) between the instructions. The instruction `x = a + b` overwrites x, which the earlier instruction `y = x - c` reads, so the second instruction must not execute before the first has read the old value of x.

These examples illustrate common types of dependencies encountered in parallel and distributed computing. The dependency analysis process involves analyzing the code structure, tracking variable usage and definitions, and determining the dependencies based on their characteristics.

It's important to note that the actual analysis and transformation of dependencies may vary depending on the specific programming language, compiler, and optimization techniques used in parallel and distributed computing systems.
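Once the analysis shows that the iterations of Example 2 are independent, the loop can be parallelized directly. A minimal OpenMP sketch (the array size and initialization are illustrative):

```c
#include <stdio.h>
#include <omp.h>

#define N 16

int main(void) {
    double A[N], B[N], C[N];
    for (int i = 0; i < N; i++) { A[i] = 0; B[i] = i; C[i] = 2 * i; }

    /* No loop-carried dependence: each iteration writes its own A[i]
       and only reads B and C, so the iterations can run in parallel. */
    #pragma omp parallel for
    for (int i = 1; i < N - 1; i++)
        A[i] = B[i - 1] + C[i + 1];

    printf("A[1] = %.1f\n", A[1]);
    return 0;
}
```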
—--------------------------------------------
Exam-style Question:

Q: What is parallelism granularity, and why is it important in parallel computing?

A: Parallelism granularity refers to the size of tasks or units of work executed in parallel. It affects concurrency, communication, and synchronization overhead in parallel programs.

Exam-style Question:

Q: What is load balancing in parallel computing, and why is it important?

A: Load balancing involves distributing the workload evenly across the processors in a parallel system. It aims to optimize resource utilization and minimize idle time by ensuring that each processor has a similar amount of work.
- **Concurrency**: Concurrency allows multiple tasks or operations to make progress simultaneously. It allows different tasks to execute independently and concurrently.

- **Synchronization**: Synchronization is the coordination of tasks to ensure proper ordering and consistency in parallel programs. It involves using synchronization primitives like locks, semaphores, and barriers.

Exam-style Question:

Q: What is the difference between concurrency and synchronization?

A: Concurrency allows tasks to make progress simultaneously, while synchronization involves coordinating tasks to ensure proper ordering and consistency using synchronization primitives.

Task Graph and Task Dependency Graph:

- **Task Graph**: A task graph is a representation of the dependencies and relationships among tasks; it enables efficient scheduling and execution of tasks.

- **Data Dependency Analysis**: Data dependency analysis is the process of identifying dependencies between data elements or instructions in a program. It helps determine the order of execution and ensures correctness and data consistency in parallel programs.

Introduction to Parallel Programming Environments:
- **Parallel Programming Environments**: Parallel programming environments provide tools, libraries, and frameworks to develop and execute parallel programs. They offer high-level abstractions and APIs to simplify parallel program development.

Exam-style Question:

Q: What are parallel programming environments, and why are they used?

A: Parallel programming environments provide tools and abstractions to develop and execute parallel programs efficiently.

Shared Memory Systems and Parallel Algorithms using OpenMP:

- **Shared Memory Systems**: Shared memory systems are parallel computing architectures where multiple processors share a common memory space. They allow concurrent access to shared data and support parallel programming models like OpenMP.

- **Parallel Algorithms using OpenMP**: OpenMP is a programming model for shared memory systems. It provides directives and libraries for parallelizing code and exploiting parallelism. Parallel algorithms in OpenMP include mutual exclusion, reduction, and locks.

- **Amdahl's Law**: Amdahl's Law is a formula that predicts the maximum potential speedup of a program when only a portion of it can be parallelized. It highlights the impact of serial portions on the overall achievable speedup.
- **Speedup**: Speedup is the ratio of the execution time in the sequential case to the execution time in the parallel case.

- **Efficiency**: Efficiency quantifies the utilization of resources in parallel computing. It is calculated as the ratio of the speedup achieved to the number of processors used.

Speedup measures the performance improvement from parallel execution, and efficiency quantifies resource utilization in parallel computing.

Measuring Parallelism: Work Done and Speedup:

- **Work Done**: Work done refers to the total amount of computation performed across all processors.

- OpenMP provides a standard API for developing portable parallel applications on shared memory systems.

Exam-style Question:
Q: How do OpenMP and OpenCL differ?

A: They differ in target architecture and programming model, as OpenCL is for heterogeneous systems and uses a task-based programming model.

Distributed Memory Programming using MPI - Basics:

- **MPI**: MPI (Message Passing Interface) is a standard for message passing in distributed memory systems. It enables communication and coordination among multiple processes running on separate nodes.

- **Distributed Memory Programming**: Distributed memory programming involves designing and implementing parallel programs that operate on data distributed across multiple memory spaces. MPI provides the necessary tools and libraries for this purpose.

Exam-style Question:

Q: What is MPI, and what does distributed memory programming involve?

A: MPI is a standard for message passing in distributed memory systems. Distributed memory programming involves designing parallel programs for distributed data and utilizing MPI for communication and coordination.
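A minimal sketch of the message-passing style described above (the ranks, the single integer payload, and the message tag are illustrative; run with at least two processes, e.g. mpirun -np 2):

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    if (rank == 0) {
        int value = 42;
        /* process 0 sends one integer to process 1 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 of %d received %d from rank 0\n", size, value);
    }

    MPI_Finalize();
    return 0;
}
```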
- **Parallelism**: Parallelism refers to the simultaneous execution of multiple tasks or computations, which can improve performance and solve larger problems efficiently.

- **Distributed Computing**: Distributed computing involves utilizing multiple computers or nodes to work together on a task, sharing resources and coordinating their actions.

Exam-style Question:

Q: What is the difference between parallelism and distributed computing?

A: Parallelism is the simultaneous execution of multiple tasks or computations, while distributed computing involves multiple computers working together on a task by sharing resources and coordinating actions.

Flynn's Taxonomy, Parallel Architectures, and Building a Cluster:

- **Flynn's Taxonomy**: Flynn's Taxonomy classifies computer architectures based on the number of instruction streams (single or multiple) and data streams (single or multiple). It categorizes architectures as SISD, SIMD, MISD, and MIMD.
- **Parallel Architectures**: Parallel architectures include shared nothing, shared disk, and shared memory architectures.

- **Building a Cluster**: Building a cluster involves assembling multiple computers or nodes and connecting them to work together as a single system. Clusters are commonly used in parallel and distributed computing.

Exam-style Question:

Q: Explain Flynn's Taxonomy, different parallel architectures, and the process of building a cluster.

A: Flynn's Taxonomy categorizes architectures based on instruction and data streams. Parallel architectures include shared nothing, shared disk, and shared memory. Building a cluster involves assembling multiple computers to function as a unified system.

Data Parallelism vs Task Parallelism:

- **Data Parallelism**: Data parallelism is a parallel programming technique where the same operation is performed concurrently on different subsets of data. It involves dividing the data into chunks and assigning each chunk to a separate processing unit.

- **Task Parallelism**: Task parallelism is a parallel programming technique where different tasks or operations are executed concurrently. Each task can operate on different data or perform different computations independently.
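A hedged OpenMP sketch contrasting the two styles (the array computation and the two "tasks" inside the sections are made-up examples):

```c
#include <stdio.h>
#include <omp.h>

#define N 8

int main(void) {
    int a[N], b[N];

    /* Data parallelism: the same operation is applied to different
       chunks of the array by different threads. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = i * i;

    /* Task parallelism: two different operations run concurrently,
       each in its own section. */
    #pragma omp parallel sections
    {
        #pragma omp section
        {   /* task 1: derive b from a */
            for (int i = 0; i < N; i++) b[i] = a[i] + 1;
        }

        #pragma omp section
        {   /* task 2: report the thread count */
            printf("max threads: %d\n", omp_get_max_threads());
        }
    }

    printf("a[3] = %d, b[3] = %d\n", a[3], b[3]);
    return 0;
}
```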
Parallelism Granularity:

- Granularity ranges from fine-grained parallelism, where many small tasks run concurrently, to coarse-grained parallelism, where larger tasks are executed concurrently.

Load Balancing:
Exam-style Question:

Q: What is load balancing in parallel computing?

A: Load balancing involves distributing the computational workload evenly across multiple processing units to ensure efficient resource utilization.

- **Concurrency**: Concurrency refers to the ability of multiple tasks or operations to make progress simultaneously. It allows overlapping execution and can be achieved through parallelism or interleaved execution.

- **Synchronization**: Synchronization is the coordination of tasks or processes to ensure proper order of execution or to prevent conflicts. It involves mechanisms such as locks, semaphores, and barriers.

Exam-style Question:

Q: What is the difference between concurrency and synchronization?

A: Concurrency enables simultaneous progress of tasks, while synchronization ensures proper order or avoids conflicts through coordination mechanisms.

Task Graph and Task Dependency Graph:
- **Task Graph**: A task graph represents the dependencies and relationships among tasks in a parallel program. It visually depicts the flow of execution and data dependencies between tasks.

- **Task Dependency Graph**: A task dependency graph illustrates the dependencies between tasks, showing which tasks must complete before others can start. It helps in determining the order of task execution.

Exam-style Question:

Q: What is the difference between a task graph and a task dependency graph?

A: A task graph visually depicts tasks and their execution flow, while a task dependency graph represents the dependencies between tasks.
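A small sketch of how such a dependency graph can be expressed directly with OpenMP tasks (tasks A, B, and C and their depend clauses are illustrative; requires OpenMP 4.0 or later):

```c
#include <stdio.h>
#include <omp.h>

int main(void) {
    int x = 0, y = 0;

    #pragma omp parallel
    #pragma omp single      /* one thread creates the tasks; the team runs them */
    {
        /* Task A produces x. */
        #pragma omp task depend(out: x)
        x = 10;

        /* Task B produces y; it has no edge to A, so it may run concurrently. */
        #pragma omp task depend(out: y)
        y = 20;

        /* Task C consumes x and y: the runtime schedules it only after
           A and B finish, exactly as the dependency graph requires. */
        #pragma omp task depend(in: x, y)
        printf("x + y = %d\n", x + y);
    }
    return 0;
}
```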
Data Dependency Analysis:

- **Data Dependency Analysis**: Data dependency analysis is a technique used to identify dependencies between data elements or variables in a program. It helps determine if certain operations can be executed in parallel or if synchronization is required.

Exam-style Question:

Q: What is data dependency analysis, and why is it important?

A: Data dependency analysis identifies dependencies between data elements to determine parallelizability and synchronization requirements.

Exam-style Question:

Q: What are parallel programming environments, and why are they used?

A: Parallel programming environments provide tools and abstractions for developing and executing parallel programs.