
MAP55611-1

Faculty of Science, Technology, Engineering and Mathematics


School of Mathematics
M.Sc. in High-Performance Computing, Michaelmas Term 2022

MAP55611: High-Performance Computing Software

Tuesday, 13 December 2022, RDS Simmonscourt, 9.30-11.30

Michael Peardon and Darach Golden

Instructions to candidates:

Attempt ALL FOUR questions. All questions are worth 25 marks.

You may not start this examination until you are instructed to do so by the Invigilator.


1. (a) A computational task comprises three parts. The first step initialises data and takes 10 seconds to complete on a single core. The second part evaluates a function on the data and takes 200 seconds on one core. The final part performs post-processing and takes 10 seconds to complete on a single core. Only the algorithm for the second part of the problem can be parallelised.

i. State Amdahl’s law and explain how it applies to estimating the benefit of executing this composite task on more than one compute core. [3]

ii. Using Amdahl’s law, estimate how much faster this task can be executed on
a system with 8 cores and another system with a very large number of cores.
[4]

iii. Estimate how many cores would be needed to achieve at least 95% of the
maximum possible parallel performance for this task. [3]
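
For reference, Amdahl’s law in the usual textbook notation (with P the fraction of the single-core run time that can be parallelised and N the number of cores, symbols introduced here rather than defined in the question) bounds the achievable speed-up by

    S(N) = 1 / ((1 − P) + P/N),

so S(N) approaches 1/(1 − P) as N becomes very large.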

(b) Explain which one of the following two operations on arrays of length n can be parallelised efficiently, and write a function in OpenMP to perform the case where a simple parallel implementation is possible:
g_k = √f_k + g_{k−1}   or   g_k = √f_k + f_{k−1},   for k = 1, …, n − 1

[8]
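
A minimal sketch of the kind of OpenMP function asked for, assuming the second recurrence is the parallelisable one (its right-hand side depends only on the input array f, so the iterations are independent) and taking the square-root reading of the formula above; the function and argument names are illustrative:

#include <math.h>

/* Each g[k] depends only on f[k] and f[k-1], never on another element of g,
   so the loop iterations can be shared among threads. */
void compute_g(const double *f, double *g, int n)
{
    #pragma omp parallel for
    for (int k = 1; k < n; k++)
        g[k] = sqrt(f[k]) + f[k-1];
}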

(c) Execution times for six runs of a parallel code, performed with two different numbers of compute cores, n, and three problem sizes, p, are given in the table below. Comment on the strong and weak scaling behaviour of this software. [7]

            p = 100    p = 1,000    p = 10,000
  n = 64     200 s      2000 s       30000 s
  n = 256    150 s      1500 s       25000 s

Table 1: Run times for different problem sizes, p, and numbers of compute cores, n.
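
As a check on the arithmetic behind the table: quadrupling the core count from n = 64 to n = 256 at fixed problem size gives speed-ups of 200/150 ≈ 1.33 at p = 100, 2000/1500 ≈ 1.33 at p = 1,000 and 30000/25000 = 1.2 at p = 10,000, against an ideal factor of 4.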


2. (a) Describe the output of the following OpenMP code fragment: [7]

#pragma omp parallel num_threads(3)
{
    int id = omp_get_thread_num();
    printf("%d\n", id);
    #pragma omp single
    printf("%d\n", id);
}

(b) A function to sum the elements in an array is written to use the OpenMP library.

double sum=0; int n=1000;

#pragma omp parallel num_threads(3)
for (int i=0;i<n;i++)
    sum += data[i];

Describe what is likely to fail with this code, and write a corrected version.
[6]
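
One possible corrected form is sketched below; it assumes, as in the fragment above, that data is an array of at least n doubles, and uses a work-sharing loop with a reduction clause so that each thread accumulates a private partial sum:

double sum=0; int n=1000;

/* reduction(+:sum) gives every thread a private copy of sum and combines
   the partial sums at the end of the loop, removing the race on sum. */
#pragma omp parallel for num_threads(3) reduction(+:sum)
for (int i=0;i<n;i++)
    sum += data[i];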

(c) The loop below calls the function do_task fifty times:

for (i=0;i<50;i++)
do_task(i);

For each of the OpenMP directives given below, state, where a prediction can be made, which thread will execute the function for each value of the argument i = 0, 1, …, 49. Explain your answers.

i. #pragma omp parallel for num_threads(4) [4]

ii. #pragma omp parallel for num_threads(4) schedule(static,3) [4]

iii. #pragma omp parallel for num_threads(4) schedule(dynamic) [4]
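
Where the mapping is in doubt, one way to inspect the actual iteration-to-thread assignment is a small test harness along the following lines (an illustrative sketch in which the do_task body is replaced by a print of the executing thread; the schedule clause is swapped for each of the three cases):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    #pragma omp parallel for num_threads(4) schedule(static,3)
    for (int i = 0; i < 50; i++)
        printf("i = %2d executed by thread %d\n", i, omp_get_thread_num());
    return 0;
}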


3. The Gauss-Seidel iteration to solve an approximation to the Laplace equation in three dimensions visits every interior point (x, y, z) on a grid of dimension 40×40×40. These points have co-ordinates x, y, z ∈ [1, 39]. At each visited site, the field is replaced with the average of its six nearest neighbours.

(a) Explain why the ordering of site visits matters when considering if this algorithm
can be parallelised. [5]

(b) Write a function

double gauss_seidel(double phi[40][40][40])

that uses the OpenMP library to overwrite the array phi with the next Gauss-Seidel iteration. The function should return the modulus of the difference in phi between the two iterations. [20]
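
One common way to expose parallelism here (a sketch only, not necessarily the intended solution) is a red-black ordering: sites are coloured by the parity of x + y + z, and each colour is updated in its own parallel sweep, since all six neighbours of a site carry the opposite colour. The interior is taken below as indices 1 to 38 so that every neighbour lies inside the 40×40×40 array, and the returned modulus is interpreted as the 2-norm of the change in phi:

#include <math.h>

double gauss_seidel(double phi[40][40][40])
{
    double diff2 = 0.0;                   /* accumulates the squared change */
    for (int colour = 0; colour < 2; colour++) {
        /* sites of one colour depend only on sites of the other colour */
        #pragma omp parallel for collapse(2) reduction(+:diff2)
        for (int x = 1; x < 39; x++)
            for (int y = 1; y < 39; y++)
                for (int z = 1; z < 39; z++) {
                    if ((x + y + z) % 2 != colour)
                        continue;
                    double old = phi[x][y][z];
                    phi[x][y][z] = (phi[x-1][y][z] + phi[x+1][y][z]
                                  + phi[x][y-1][z] + phi[x][y+1][z]
                                  + phi[x][y][z-1] + phi[x][y][z+1]) / 6.0;
                    double d = phi[x][y][z] - old;
                    diff2 += d * d;
                }
    }
    return sqrt(diff2);
}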

4. (a) For a matched pair of MPI_Send() and MPI_Recv() on two processes in the same MPI communicator, what information are the send and receive matched on? [3]

(b) Describe how using MPI_Send and MPI_Recv can lead to deadlock between processes. Write a C snippet showing deadlock. Assume exactly two MPI processes. [7]
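
A minimal sketch of the classic failure, assuming exactly two ranks and a message count large enough that the sends cannot complete through the eager protocol (N is an arbitrary illustrative size):

#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    const int N = 1 << 20;          /* large enough to defeat eager sends */
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int other = 1 - rank;           /* assumes exactly two processes */
    double *out = calloc(N, sizeof(double));
    double *in  = calloc(N, sizeof(double));
    /* Both ranks post a blocking send first; each waits for a matching
       receive the other rank never reaches, so the program hangs. */
    MPI_Send(out, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD);
    MPI_Recv(in, N, MPI_DOUBLE, other, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    free(out); free(in);
    MPI_Finalize();
    return 0;
}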

(c) Using the same assumptions as above, write a C snippet showing a version of this
code that will not deadlock under any circumstances. [5]
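
With the same set-up as the sketch above, one deadlock-free variant replaces the Send/Recv pair with the combined call MPI_Sendrecv, which the library can progress in both directions at once regardless of message size or of how the two ranks are scheduled:

    /* exchange in one call: no ordering between the two ranks is required */
    MPI_Sendrecv(out, N, MPI_DOUBLE, other, 0,
                 in,  N, MPI_DOUBLE, other, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);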

(d) Explain the operation of MPI_Gather. Write a function my_gather() implementing the effect of MPI_Gather using at least some of the basic MPI functions MPI_Comm_rank, MPI_Comm_size, MPI_Send, MPI_Recv and MPI_Barrier. The function my_gather() should take the same arguments as MPI_Gather. [8]
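
A sketch of one way my_gather() might be built from the point-to-point calls listed in the appendix; it assumes recvtype is a basic (contiguous) datatype whose size equals its extent, that the send and receive signatures match as MPI_Gather requires, and it uses MPI_Type_size (a standard MPI call not in the appendix list) to compute byte offsets into recvbuf:

#include <string.h>
#include <mpi.h>

int my_gather(void *sendbuf, int sendcount, MPI_Datatype sendtype,
              void *recvbuf, int recvcount, MPI_Datatype recvtype,
              int root, MPI_Comm comm)
{
    int rank, size, typesize;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    if (rank != root)               /* non-root ranks simply send to root */
        return MPI_Send(sendbuf, sendcount, sendtype, root, 0, comm);
    MPI_Type_size(recvtype, &typesize);
    for (int r = 0; r < size; r++) {    /* root places each rank's block in order */
        char *dst = (char *)recvbuf + (size_t)r * recvcount * typesize;
        if (r == root)
            memcpy(dst, sendbuf, (size_t)recvcount * typesize);
        else
            MPI_Recv(dst, recvcount, recvtype, r, 0, comm, MPI_STATUS_IGNORE);
    }
    return MPI_SUCCESS;
}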

(e) What is the difference between MPI_Gather and MPI_Allgather? [2]


Appendix: MPI function declarations

int MPI_Allgather(void *sendbuf, int sendcount, MPI_Datatype sendtype,
                  void *recvbuf, int recvcount, MPI_Datatype recvtype,
                  MPI_Comm comm);
int MPI_Allreduce(void *send_buffer, void *recv_buffer, int count,
                  MPI_Datatype datatype, MPI_Op op, MPI_Comm comm);
int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype sendtype,
                 void *recvbuf, int recvcnt, MPI_Datatype recvtype,
                 MPI_Comm comm);
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root,
              MPI_Comm comm);
int MPI_Cart_shift(MPI_Comm comm, int direction, int displ, int *src, int *dest);
int MPI_Comm_rank(MPI_Comm comm, int *rank);
int MPI_Comm_size(MPI_Comm comm, int *size);
int MPI_Comm_spawn(const char *command, char *argv[], int maxprocs,
                   MPI_Info info, int root, MPI_Comm comm,
                   MPI_Comm *intercomm, int array_of_errcodes[]);
int MPI_Comm_split(MPI_Comm comm, int colour, int key, MPI_Comm *newcomm);
int MPI_Finalize();
int MPI_Gather(void *send_buffer, int send_count, MPI_Datatype sendtype,
               void *recv_buffer, int recv_count, MPI_Datatype recvtype,
               int root, MPI_Comm comm);
int MPI_Init(int *argc, char ***argv);
int MPI_Init_thread(int *argc, char ***argv, int required, int *provided);
int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source,
              int tag, MPI_Comm comm, MPI_Request *request);
int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag,
              MPI_Comm comm, MPI_Request *request);


int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source,
             int tag, MPI_Comm comm, MPI_Status *status);
int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype,
               MPI_Op op, int root, MPI_Comm comm);
int MPI_Scatter(void *sendbuf, int sendcnt, MPI_Datatype sendtype,
                void *recvbuf, int recvcnt, MPI_Datatype recvtype,
                int root, MPI_Comm comm);
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest,
             int tag, MPI_Comm comm);
int MPI_Sendrecv(void *sendbuf, int sendcount, MPI_Datatype sendtype,
                 int dest, int sendtag,
                 void *recvbuf, int recvcount, MPI_Datatype recvtype,
                 int source, int recvtag,
                 MPI_Comm comm, MPI_Status *status);
int MPI_Test(MPI_Request *request, int *flag, MPI_Status *status);
int MPI_Type_commit(MPI_Datatype *datatype);
int MPI_Type_contiguous(int count, MPI_Datatype oldtype,
                        MPI_Datatype *newtype);
int MPI_Type_free(MPI_Datatype *datatype);
int MPI_Type_struct(int count, int *array_of_blocklengths,
                    MPI_Aint *array_of_displacements, MPI_Datatype *array_of_types,
                    MPI_Datatype *newtype);
int MPI_Type_vector(int count, int blocklength, int stride,
                    MPI_Datatype oldtype, MPI_Datatype *newtype);
int MPI_Wait(MPI_Request *request, MPI_Status *status);

© TRINITY COLLEGE DUBLIN, THE UNIVERSITY OF DUBLIN 2022
