ACA 2024W 04 Shared-memory programming with OpenMP
Modifications by H. Weber
Objectives
! Learn how to use OpenMP compiler directives to introduce concurrency in a sequential program.
! Learn the most important OpenMP #pragma directives and associated clauses for controlling the concurrent constructs generated by the compiler.
! Understand which loops can be parallelized with OpenMP directives.
! Address the dependency issues that OpenMP-generated threads face, using synchronization constructs.
! Learn how to use OpenMP to create function-parallel programs.
! Learn how to write thread-safe functions.
! Understand the issue of cache false sharing and learn how to eliminate it.
! The decomposition of a sequential program into components that can execute in parallel is a tedious enterprise.
! OpenMP has been designed to alleviate much of the effort involved by accommodating the incremental conversion of sequential programs into parallel ones, with the assistance of the compiler.
! OpenMP relies on compiler directives for decorating portions of the code that the compiler will attempt to parallelize (a minimal sketch follows below).
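A minimal sketch of this incremental, directive-based style (not taken from the slides; the array and loop body are illustrative only): a single pragma placed in front of an ordinary loop asks the compiler to distribute its iterations over threads.

#include <stdio.h>
#include <math.h>

#define N 1000000

int main(void)
{
    static double a[N];   /* static keeps the large array off the stack */

    /* The directive below is the only change to the sequential code.
       A compiler that does not know this pragma simply ignores it. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = sin(i * 0.001);

    printf("a[N-1] = %f\n", a[N - 1]);
    return 0;
}

Compiled with gcc -fopenmp (and -lm for sin), the loop iterations run concurrently; compiled without the flag, the program remains a correct sequential version.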
OpenMP History
! OpenMP: Open Multi-Processing is an API for shared-memory programming.
! OpenMP was specifically designed for parallelizing existing sequential
programs.
! Uses compiler directives and a library of functions to support its operation.
! OpenMP v.1 was published in 1998.
! OpenMP v.4.0 was published in 2013.
! Standard controlled by the OpenMP Architecture Review Board (ARB).
! GNU C support:
− GCC 4.7 supports the OpenMP 3.1 specification
− GCC 4.9 supports OpenMP 4.0.
! Can you match some of the previous definitions with parts of this program?
! Pragma directives allow a programmer to access compiler-specific preprocessor extensions.
! For example, a common use of pragmas is in the management of include files, e.g.:
#pragma once
! Pragma directives in OpenMP can have a number of optional clauses that modify their behavior.
! In the previous example the clause is num_threads(numThr) (a minimal parallel region using this clause is sketched after this list).
! Compilers that do not support certain pragma directives ignore them.
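Since the example the slide refers to is not reproduced on this page, here is a minimal hypothetical parallel region using that clause (numThr and the printed message are assumptions):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int numThr = 4;                      /* illustrative value */

    /* num_threads(numThr) requests exactly numThr threads
       for this one parallel region. */
    #pragma omp parallel num_threads(numThr)
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}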
! Universally: via the OMP_NUM_THREADS environment variable:
$ echo ${OMP_NUM_THREADS} # to query the value
$ export OMP_NUM_THREADS=4 # to set it in BASH
! Program level: via the omp_set_num_threads function, called outside an OpenMP construct (see the sketch after this list).
! Pragma level: via the num_threads clause.
! The omp_get_num_threads call returns the number of active threads in a parallel region. If it is called in a sequential part, it returns 1.
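A short sketch (not from the slides) combining the program-level setting with the query call; it assumes the runtime grants the four threads requested:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    omp_set_num_threads(4);   /* program level: call outside any parallel construct */

    printf("sequential part: %d thread(s)\n",
           omp_get_num_threads());              /* prints 1 */

    #pragma omp parallel
    {
        #pragma omp single                      /* let only one thread print */
        printf("parallel region: %d thread(s)\n",
               omp_get_num_threads());          /* typically prints 4 */
    }
    return 0;
}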
\[
\int_{start}^{end} f(x)\,dx \;\approx\; \sum_{i=0}^{n-1} step \cdot \frac{f(x_i) + f(x_{i+1})}{2}
\;=\; step \cdot \left( \frac{f(start) + f(end)}{2} + \sum_{i=1}^{n-1} f(x_i) \right)
\]
where \(x_0 = start\), \(x_n = end\), and \(step = (end - start)/n\).
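/* Below: the tail of the slides' integrate() implementation (its body is not
   reproduced on this page) and the call from main(): */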
return localRes;
}
//---------------------------------------
int main (int argc, char *argv[])
{
. . .
double finalRes = integrate (start, end, divisions, testf);
Race condition!
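The loop that actually races is not shown on this page. A hedged sketch of the kind of code (presumably the slides' first, V.0 version) that produces such a race is given below; all names are assumptions chosen to match the visible fragment. Every thread adds into the shared variable localRes without any synchronization, so concurrent updates can be lost.

double integrate(double start, double end, int n, double (*f)(double))
{
    double step = (end - start) / n;
    double localRes = (f(start) + f(end)) / 2.0;

    #pragma omp parallel for
    for (int i = 1; i < n; i++)
        localRes += f(start + i * step);   /* unsynchronized update of a shared variable: race! */

    localRes *= step;
    return localRes;
}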
OpenMP V.1: Removing the race condition
! Give each thread its own private storage; a sequential reduction over the per-thread partial results is required afterwards (a sketch follows below).
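A hedged sketch of this V.1 approach (the names and layout are assumptions, not the slides' exact code): each thread accumulates into its own slot of a partial-results array, and the partial sums are reduced sequentially after the parallel region ends.

#include <stdlib.h>
#include <omp.h>

double integrate_v1(double start, double end, int n, double (*f)(double))
{
    double step = (end - start) / n;
    double localRes = (f(start) + f(end)) / 2.0;
    int maxThr = omp_get_max_threads();
    double *partial = calloc(maxThr, sizeof(double));  /* one slot per thread */

    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        #pragma omp for
        for (int i = 1; i < n; i++)
            partial[id] += f(start + i * step);        /* each thread writes only its own slot */
    }

    /* Sequential reduction of the per-thread partial results. */
    for (int t = 0; t < maxThr; t++)
        localRes += partial[t];

    free(partial);
    localRes *= step;
    return localRes;
}

Compile with gcc -fopenmp. Note that adjacent slots of partial[] share a cache line, so this layout already exhibits the cache false sharing named in the objectives; eliminating it is treated later.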