HPC & BigData
Grid Computing
High Performance Computing Curriculum
UvA-SARA
http://www.hpc.uva.nl/
outline
• e-Science
• Grid approach
• Grid computing
• Programming models for the Grid
• Grid-middleware
• Web Services
• Open Grid Service Architecture (OGSA)
Doing Science in the 21st century
• Nowadays scientific applications
– Are CPU intensive
– Produce/process huge sets of data
– Require access to geographically distributed and expensive instruments
Online Access to Scientific Instruments
[Figure: Advanced Photon Source pipeline: real-time collection, tomographic reconstruction, archival storage, wide-area dissemination, desktop & VR clients with shared controls]
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
From the Grid tutorials available at: http://www.globus.org
CPU intensive Science: Optimization problem NUG30
• The problem, a quadratic assignment problem (QAP) known as NUG30
– given a set of n locations and n facilities, the goal is to assign each facility to a location.
– There are n! possible assignments
• NUG30 proposed in 1968 as a test of computer capabilities, but remained unsolved because of its great complexity.
Nug30 Quadratic Assignment Problem Solved by 1,000
https://scout.wisc.edu/archives/r7125
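To make the n! growth concrete, here is a minimal brute-force sketch in Python. It assumes the standard flow-times-distance QAP cost, which the slide does not spell out. The function names and the 3x3 matrices are illustrative only. Enumerating permutations is feasible only for tiny n, since 30! is roughly 2.65 × 10^32.

```python
import math
from itertools import permutations


def qap_cost(flow, dist, assignment):
    """Cost of placing facility i at location assignment[i].

    Assumes the standard QAP objective: sum of flow[i][j] * dist[loc(i)][loc(j)].
    """
    n = len(assignment)
    return sum(
        flow[i][j] * dist[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
    )


def solve_qap_brute_force(flow, dist):
    """Try all n! assignments and return (best_assignment, best_cost)."""
    n = len(flow)
    best = min(permutations(range(n)), key=lambda p: qap_cost(flow, dist, p))
    return best, qap_cost(flow, dist, best)


# Illustrative 3-facility instance (made-up numbers).
flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
dist = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]

best, cost = solve_qap_brute_force(flow, dist)
print(best, cost)
print("assignments to check for n = 30:", math.factorial(30))
```

The exhaustive search is only there to show the size of the search space. A problem like NUG30 needs a far smarter search spread over many machines.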
To solve these problems?
[Figure: three applications (LHC, NUG30, Online Access), each split into an application-specific part and a potential generic part (management of comm. & computing); the generic part is moved into Grid Services, which harness multi-domain distributed resources]
“VL-e project” UvA
The Grid Problem
• Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
• Enable communities (“Virtual Organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of: central location, central control, existing trust relationships.
From the Grid tutorials available at: http://www.globus.org
Some Definitions of the Grid?
“A Computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities”. Carl Kesselman & Ian Foster.
“The overall motivation for Grids is to enable the routine interactions of resources geographically and organizationally dispersed to facilitate Large-scale Science and engineering” The Vision for a DOE Science Grid, William Johnston, Lawrence Berkeley Nat. Lab.
“Making possible a shared large wide-area Computational infrastructure, a concept which has been named the Grid” Peter Dinda, Georgia Tech, 2001.
The real Grid target
• A Grid is a system that is able to
– Coordinate resources
• not subject to centralized control
– Use standard, open, general-purpose protocols and interfaces
– Deliver nontrivial qualities of service.
“Ian Foster’s 3 point checklist”
Coordinated Sharing
• The sharing is controlled by the providers and consumers
– what is shared?
– who is allowed to share?
– and the conditions under which sharing occurs?
• sharing relationships
– client-server, peer-to-peer, and brokered
– access control: fine-grained AC, delegation, local/global policies
From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, Foster et al.
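The three questions above (what, who, under which conditions) can be sketched as a small policy check. This is only an illustration under stated assumptions: the `SharingRule` fields, the CPU-hour condition, and the rule that a request must pass both the local and the global policy are all hypothetical, not taken from the Anatomy paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingRule:
    resource: str       # what is shared
    vo: str             # who is allowed to share (a Virtual Organization)
    max_cpu_hours: int  # condition under which sharing occurs


def is_allowed(rules, resource, vo, cpu_hours):
    """True if at least one rule covers this request."""
    return any(
        r.resource == resource and r.vo == vo and cpu_hours <= r.max_cpu_hours
        for r in rules
    )


def authorize(local_rules, global_rules, resource, vo, cpu_hours):
    """Assumption: the provider's local policy and the VO-wide global policy must both agree."""
    return is_allowed(local_rules, resource, vo, cpu_hours) and is_allowed(
        global_rules, resource, vo, cpu_hours
    )


# Hypothetical policies: the site is stricter than the VO.
local_rules = [SharingRule("cluster-a", "physics-vo", 100)]
global_rules = [SharingRule("cluster-a", "physics-vo", 500)]

print(authorize(local_rules, global_rules, "cluster-a", "physics-vo", 50))
print(authorize(local_rules, global_rules, "cluster-a", "physics-vo", 200))
```

The second request fails because the local provider keeps control over its own limit, even though the global policy would allow it.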
What is Grid Computing
• Grid computing is the use of hundreds, thousands, or millions of geographically and organizationally dispersed and diverse resources to solve:
– problems that require more computing power than is available from a single machine or from a local area distributed system
Potential Grid Application
• An application which requires the grid solution is likely distributed (Distributed Computing) and fits in one of the following paradigms:
– High Throughput Computing
– High Performance Computing
Grid computing will be mainly needed for large-scale, high-performance computing.
Distributed Computing
• Distributed computing is a programming model in which processing occurs in many geographically distributed places.
– Processing can occur wherever it makes the most sense, whether that is on a server, Web site, personal computer, etc.
• Distributed computing and grid computing either
– overlap, or distributed computing is a subset of grid computing
From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, Foster et al.
High Throughput Computing
• HTC employs large amounts of computing power for very lengthy periods
– HTC is needed for doing sensitivity analyses, parametric studies or simulations to establish statistical confidence.
• The features of HTC are
– Availability of computing power for a long period of time
– Efficient fault tolerance mechanism
• The key to HTC in grids
– Efficiently harness the use of all available resources across organizations
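A parametric study with a simple fault tolerance mechanism can be sketched as follows. This is a minimal local sketch, assuming independent tasks and a retry-on-failure policy. A thread pool stands in for grid resources, and `run_with_retries`, `parameter_sweep` and the flaky task are hypothetical names, not part of any grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(task, param, max_attempts=3):
    """Re-run a failed task, as a stand-in for resubmitting a job to another node."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(param)
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt


def parameter_sweep(task, params, workers=4, max_attempts=3):
    """Run task(param) for every parameter; return {param: result}."""
    params = list(params)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: run_with_retries(task, p, max_attempts), params)
        return dict(zip(params, results))


# Hypothetical task that fails the first time it sees each parameter.
_seen = set()


def flaky_square(x):
    if x not in _seen:
        _seen.add(x)
        raise RuntimeError("simulated node failure")
    return x * x


sweep_result = parameter_sweep(flaky_square, [1, 2, 3])
print(sweep_result)
```

Because the tasks are independent, throughput grows with the number of available workers, and a single failure costs only one re-run rather than the whole study.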
High Performance Computing
• HPC brings enormous amounts of computing power to bear over relatively short periods of time.
– HPC is needed for decision-support or applications under sharp time-constraint, such as weather modeling
• HPC applications are:
– Large in scale and complex in structure.
– Real time requirements.
– Ultimately must run on more than one type of HPC system.
HPC/HTC requirements
• HPC/HTC requires a balance of computation and communication among all resources involved.
– Managing computation,
– communication,
– data locality
Programming Model for the grid
• To achieve petaflop rates on tightly/loosely coupled grid clusters, applications will have to allow:
– extremely large granularity or produce massive parallelism such that high latencies can be tolerated.
• This type of parallelism, and the performance delivered by it in a heterogeneous environment, is
– currently manageable by hand-coded applications
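Why large granularity lets an application tolerate high latency can be shown with a toy model. The sketch assumes each work unit costs its compute time plus one fixed communication latency with no overlap between the two. This is a deliberate simplification, and the function names are illustrative.

```python
def parallel_efficiency(compute_seconds, latency_seconds):
    """Fraction of wall time spent computing, for one work unit."""
    return compute_seconds / (compute_seconds + latency_seconds)


def min_granularity(latency_seconds, target_efficiency):
    """Smallest compute time per work unit that reaches the target efficiency.

    Solves c / (c + l) >= e for c, giving c >= e * l / (1 - e).
    """
    if not 0 < target_efficiency < 1:
        raise ValueError("target_efficiency must be strictly between 0 and 1")
    return target_efficiency * latency_seconds / (1 - target_efficiency)


# With 100 ms wide-area latency, 1 ms tasks waste almost all of their time.
print(parallel_efficiency(0.001, 0.1))
# Work units of about 0.9 s are needed to reach 90% efficiency.
print(min_granularity(0.1, 0.9))
```

Under this model, raising the latency by a factor of ten requires work units ten times larger to keep the same efficiency.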
Programming Model for the grid
• A programming model can be presented in different forms: a language, a library API, or a tool with extensible functionality.
• The successful programming model will
– enable both high-performance and the flexible composition and management of resources.
– influence the entire software lifecycle: design, implementation, debugging, operation, maintenance, etc.
– facilitate the effective use of all manner of development tools, e.g., compilers, debuggers, performance monitors, etc.
Grid Programming Issues
• Portability, Interoperability, and Adaptability
• Discovery
• Performance
• Fault Tolerance
• Security
Programming models
• Shared-state models
• Message passing models
• RPC and RMI models
• Hybrid Models
• Peer to Peer Models
• Web Service Models
• ...
Grid Middleware Definition
• Architecture identifies the fundamental system components, specifies purpose and function of these components, and indicates how these components interact with each other.
• Grid architecture is a protocol architecture, with protocols defining the basic mechanisms by which VO users and resources negotiate, establish, manage and exploit sharing relationships.
• Grid architecture is also a service standard-based open architecture that facilitates extensibility, interoperability, portability and code sharing.
“Introduction to Grid Technology”, B. Ramamurthy
Architecture
[Figure: layered Grid protocol architecture shown next to the Internet Protocol Architecture]
Grid layers, top to bottom:
– Application
– Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
– Resource: “Sharing single resources”: negotiating access, controlling use
– Connectivity: “Talking to things”: communication (Internet protocols) & security
– Fabric: “Controlling things locally”: access to, & control of, resources
Internet Protocol Architecture layers, top to bottom: Application, Transport, Internet, Link
Emergence of Open Grid Standards
[Figure: timeline 1990 to 2010 of increased functionality and standardization: custom solutions; Globus Toolkit (de facto standard, single implementation), alongside Internet standards; Open Grid Services Arch (real standards, multiple implementations), alongside Web services, etc.; managed shared virtual systems (computer science research)]
“Grid Computing and Scaling Up the Internet” I. Foster, IPv6 Forum, an
Examples of Grid Middleware
• Globus Toolkit (GT4.X, now GT5.X)
– www.globus.org
• Legion/Avaki
– http://www.avaki.com/
– http://legion.virginia.edu/
• Sun Grid Engine
– http://www.sun.com/service/sungrid/overview.jsp
• Unicore
– http://www.unicore.org
The Grid Middleware
• Software toolkit addressing key technical areas
– Offer a modular “bag of technologies”
– Enable incremental development of grid-enabled tools and applications
– Define and standardize grid protocols and APIs
• Focus is on inter-domain issues, not clustering
– Collaborative resource use spanning multiple organizations
– Integrates cleanly with intra-domain services
– Creates a “collective” service layer
“Basics Globus Toolkit™ Developer Tutorial”, Globus Team, 2003
Globus Approach
• Focus on architecture issues
– Provide implementations of grid protocols and APIs as basic infrastructure
– Use to construct high-level, domain-specific solutions
• Design principles
– Keep participation cost low
– Enable local control
– Support for adaptation
[Figure: layered stack, top to bottom: Applications; Diverse global services; Core Globus services; Local OS]
“Basics Globus Toolkit™ Developer Tutorial”, Globus Team, 2003
Globus Toolkit 2.0 Components
[Figure: numbered interactions between a Client and the services behind a site boundary, secured by the Globus Security Infrastructure]
1 MDS client API calls to locate resources (MDS: Grid Index Info Server)
2 MDS client API calls to get resource info (MDS: Grid Resource Info Server)
3 Query current status of resource (Local Resource Manager)
4 GRAM client API calls to request resource allocation and process creation (Gatekeeper); GRAM client API state change callbacks
5 Create Job Manager (Gatekeeper)
6 Parse (RSL Library)
7, 8 Allocate & create processes; Monitor & control (Job Manager, Local Resource Manager, Processes)
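The job request that the Job Manager hands to the RSL Library is a Resource Specification Language string. The sketch below builds and parses only the simplest RSL form, a conjunction of attribute=value pairs such as `&(executable=/bin/date)(count=1)`. The helpers `build_rsl` and `parse_rsl` are hypothetical illustrations, not the Globus RSL library API, and they do not handle quoting, nesting or the other RSL operators.

```python
import re


def build_rsl(attrs):
    """Render a dict as a simple RSL conjunction of (attribute=value) relations."""
    return "&" + "".join(f"({name}={value})" for name, value in attrs.items())


def parse_rsl(text):
    """Parse the simple conjunction form produced by build_rsl into a dict of strings."""
    if not text.startswith("&"):
        raise ValueError("only the '&' conjunction form is handled in this sketch")
    pairs = re.findall(r"\(([^=()]+)=([^()]*)\)", text)
    return {name.strip(): value.strip() for name, value in pairs}


request = build_rsl({"executable": "/bin/date", "count": 1})
print(request)
print(parse_rsl(request))
```

In the figure's terms, a client would send such a string with its GRAM request in step 4, and the Job Manager would have it parsed in step 6 before allocating and creating processes.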
Best of Two Worlds
[Figure: the Open Grid Services Architecture (share, manage, access) combines Web Services (applications on demand, secure and universal access, business integration) with Grid Protocols (resources on demand, global accessibility, vast resource scalability)]
“Open Grid Services Architecture Evolution”, J.P. Prost, IBM Montpellier, France, Ecole Bruide 2004
Web Services
• Increasingly popular standards-based framework for accessing network applications
– W3C standardization; Microsoft, IBM, Sun, others
• WSDL: Web Services Description Language
– Interface Definition Language for Web services
• SOAP: Simple Object Access Protocol
– XML-based RPC protocol; common WSDL target
• WS-Inspection
– Conventions for locating service descriptions
• UDDI: Universal Desc., Discovery, & Integration
– Directory for Web services
“Globus Toolkit Futures: An Open Grid Services Architecture”, Ian Foster et al., Globus Tutorial, Argonne National Laboratory, January 29, 2002
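To show what "XML-based RPC protocol" means in practice, the sketch below builds and reads back a SOAP 1.1 request envelope with Python's standard `xml.etree.ElementTree`. The envelope namespace is the SOAP 1.1 one. The operation name `submitJob`, its parameters and the service namespace `urn:example:grid` are made up for illustration and do not belong to any real service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_soap_request(operation, params, service_ns="urn:example:grid"):
    """Wrap an RPC-style call in a SOAP Envelope/Body and return it as XML text."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_request(xml_text):
    """Return (operation name, {parameter: text}) from a request built as above."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    return operation, {child.tag: child.text for child in call}


message = build_soap_request("submitJob", {"executable": "/bin/date", "count": 1})
print(message)
print(parse_soap_request(message))
```

A WSDL document would describe which operations and parameters such a message may carry, which is why the slide calls SOAP a common WSDL target.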
The Need to Support Transient Service Instances
• “Web services” address discovery & invocation of persistent services
– Interface to persistent state of entire enterprise
• In Grids, must also support transient service instances, created/destroyed dynamically
– Interfaces to the states of distributed activities
– E.g. workflow, video conf., dist. data analysis
• Significant implications for how services are managed, named, discovered, and used
– In fact, much of the work is concerned with the management of service instances
“Globus Toolkit Futures: An Open Grid Services Architecture”, Ian Foster et al., Globus Tutorial, Argonne National Laboratory, January 29, 2002
Open Grid Services Architecture
• Service orientation to virtualize resources
• From Web services:
– Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
• Building on Globus Toolkit:
– Grid service: semantics for service interactions
– Management of transient instances (& state)
– Factory, Registry, Discovery, other services
– Reliable and secure transport
• Multiple hosting targets: J2EE, .NET, …
“Globus Toolkit Futures: An Open Grid Services Architecture”, Ian Foster et al., Globus Tutorial, Argonne National Laboratory, January 29, 2002
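The Factory and Registry roles, together with the management of transient instances, can be sketched in a few lines. This is a toy in-memory model under stated assumptions: the class names, the handle format and the explicit `now` argument are illustrative choices, not the OGSA interface definitions. Each instance gets a lifetime and disappears from discovery once it expires.

```python
import itertools


class Registry:
    """Keeps track of live service instances so that clients can discover them."""

    def __init__(self):
        self._instances = {}

    def register(self, handle, instance):
        self._instances[handle] = instance

    def lookup(self, handle, now):
        """Return the instance, or None if it is unknown or its lifetime has ended."""
        instance = self._instances.get(handle)
        if instance is None or instance["expires_at"] <= now:
            return None
        return instance

    def sweep(self, now):
        """Destroy expired instances and return their handles."""
        expired = [h for h, i in self._instances.items() if i["expires_at"] <= now]
        for handle in expired:
            del self._instances[handle]
        return expired


class Factory:
    """Creates transient service instances and registers them."""

    def __init__(self, registry, prefix="job"):
        self._registry = registry
        self._ids = itertools.count(1)
        self._prefix = prefix

    def create(self, state, now, lifetime):
        handle = f"{self._prefix}-{next(self._ids)}"
        self._registry.register(handle, {"state": state, "expires_at": now + lifetime})
        return handle


registry = Registry()
factory = Factory(registry)
handle = factory.create({"step": 0}, now=0, lifetime=10)
print(handle, registry.lookup(handle, now=5))
print(registry.sweep(now=10))
```

Passing `now` explicitly keeps the sketch deterministic. A real hosting environment would use wall-clock time and let clients extend an instance's lifetime.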
Open Grid Services Architecture Objectives
• Manage resources across distributed heterogeneous platforms
• Deliver seamless QoS
• Provide a common base for autonomic management solutions
• Define open, published interfaces
• Exploit industry-standard integration technologies
– Web Services, SOAP, XML,...
• Integrate with existing IT resources
“Open Grid Services Architecture Evolution”, J.P. Prost, IBM Montpellier, France, Ecole Bruide 2004