Linux Clusters Institute:
Cluster Stack Basics
Brett Zimmerman, University of Oklahoma
Senior Systems Analyst, OU Supercomputing Center for Education and Research (OSCER)
A Bunch of Computers
- Users can log in to any node
- Filesystems aren't shared between nodes
- Work is run wherever you can find space
- Nodes maintained individually
4-8 August 2014
What's wrong with a bunch of nodes?
- Competition for resources
- Size and type of problem is limited
- Nodes get out of sync
- Problems for users
- Difficulty in management
Cluster Approach
- Shared filesystems
- Job management
- Nodes dedicated to compute
- Consistent environment
- Interconnect
What's right about the cluster approach?
- Easier to use
- Maximize efficiency
- Can do bigger and better problems
- Nodes can be used cooperatively
The Types of Nodes
- Login -- users log in here
  - Compiling
  - Editing
  - Submitting and monitoring jobs
- Compute -- users might log in here
  - Run jobs as directed by the scheduler
- Support -- users don't log in here
  - Do all the other stuff
What a cluster needs -- the mundane
- Network services: NTP, DNS, DHCP
- Shared storage -- NFS
- Logging: consolidated syslog as a starting point
- Licensing: FlexLM and the like
- Database: user and administrative data
- Boot/provisioning: PXE, build system
- Authentication: LDAP
What a cluster needs -- Specialized
- Interconnect: an ideally low-latency network
- Job manager: resource manager/scheduler
- Parallel storage: gets around the limitations of NFS
Network Services
- NTP: Network Time Protocol, provides clock synchronization across all nodes in the cluster
- DHCP: Dynamic Host Configuration Protocol, allows central configuration of host networking
- DNS: provides name-to-address translation for the cluster
- NFS: basic UNIX network filesystem
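As a concrete sketch of the shared-storage piece, a head node can export /home over NFS and each compute node can mount it at boot. The hostname, network range, and mount options below are assumptions, not recommendations:

```shell
# Head node -- /etc/exports: share /home with the private cluster network
#   /home  10.1.0.0/16(rw,async,no_root_squash)
exportfs -ra                 # re-read /etc/exports after editing

# Compute node -- /etc/fstab: mount the share at boot
#   head:/home   /home   nfs   rw,hard   0 0
mount /home                  # picks up the fstab entry
```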
Logging
- Syslog: the classic system for UNIX logging; an application has to opt in to emit messages
- Monitoring: active monitoring to catch conditions that elective (opt-in) logging doesn't catch
  - Resource manager
  - Nagios/Cacti/Zabbix/Ganglia
- IDS: intrusion detection; monitoring targeting misuse of/attacks on the cluster
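A minimal sketch of consolidated syslog, assuming rsyslog on the nodes (the log host name is hypothetical):

```shell
# Each compute node -- /etc/rsyslog.d/forward.conf:
#   *.*  @loghost.cluster.local     # @ = UDP, @@ = TCP
# then restart the rsyslog service so the rule takes effect

# Applications opt in to logging by emitting messages, e.g. via logger:
logger -t scheduler "job 1234 started on $(hostname)"
```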
Basic services, continued
- Licensing: FlexNet/FlexLM or equivalent, mediates access to a pool of shared licenses.
- Database: administrative use for logging/monitoring, dynamic configuration. Requirements of user software.
- Boot/provisioning: for example PXE/Cobbler, PXE/image, or part of a cluster management suite
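As one example of the PXE piece, dnsmasq can serve both DHCP and TFTP for node provisioning. This is only a sketch; the address range and paths are assumed, not prescribed:

```shell
# /etc/dnsmasq.conf -- hand out addresses and point PXE clients
# at a bootloader served over TFTP
#   dhcp-range=10.1.0.100,10.1.0.200,12h
#   dhcp-boot=pxelinux.0
#   enable-tftp
#   tftp-root=/srv/tftp
```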
Authentication
- Flat files -- passwd, group, shadow entries
- NIS -- network access to central flat files
- LDAP -- read/write access to a dynamic tree structure of account and other information
- Host equivalency
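On the client side, gluing LDAP into account lookups usually comes down to nsswitch.conf; a minimal sketch (the username is hypothetical):

```shell
# /etc/nsswitch.conf -- consult local flat files first, then LDAP
#   passwd: files ldap
#   group:  files ldap
#   shadow: files ldap

# Verify that a directory account resolves on a node:
getent passwd jdoe
```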
Cluster Networking
- Hardware management: lights-out management
- External: public interfaces to the cluster
- Internal: general node-to-node communication
- Storage: access to network filesystems
- Interconnect: high-speed, low-latency for multi-node jobs
Some of these can share a medium.
Interconnect
In the most recent Top 500 list (http://top500.org) there were 224 installations relying on InfiniBand, 100 using Gigabit Ethernet, and 88 using 10 Gigabit Ethernet.
- Ethernet: latency of 50-125 μs (GbE), 5-50 μs (10GbE), ~5 μs (RoCEE)
- InfiniBand: latency of 1.3 μs (QDR), 0.7 μs (FDR-10/FDR), 0.5 μs (EDR)
Parallel Filesystem
- Lustre - http://lustre.org/
- PanFS - http://www.panasas.com/
- GPFS - http://www-03.ibm.com/software/products/en/software
Parallel filesystems take the general approach of separating filesystem metadata from the storage. Lustre and PanFS have dedicated nodes for metadata (MDS or director blades). GPFS distributes metadata throughout the cluster.
Cluster Management
- Automates the building of a cluster
- Some way to easily maintain cluster system consistency
- The ability to automate cluster maintenance tasks
- Offers some way to monitor cluster health and performance
Cluster Management Software
The resource manager knows the state of the various resources on the cluster and maintains a list of the jobs that are requesting resources. The scheduler, using the information from the resource manager, selects jobs from the queue for execution.
- Rocks (http://www.rocksclusters.org/wordpress/)
- Bright Cluster Manager (http://www.brightcomputing.com/Bright-Cluster-Manager)
- xCAT (Extreme Cluster/Cloud Administration Toolkit) (http://sourceforge.net/p/xcat/wiki/Main_Page/)
Configuration Management
While booting from a central boot server can make it easier to ensure that the OS on each compute node (or, at least, each type of compute node) has an identical setup/install, there are still files which wind up being more dynamic. Some such files are the passwd/group/shadow and hosts files.
- Rsync
- Cfengine
- Chef
- Puppet
- Salt
Software Installation and Management
All Linux distros have some sort of package management tool. For Redhat/CentOS/Scientific-based clusters, this is rpm and yum. Debian has dpkg and apt.
In any case, pre-packaged software tends to assume that it is going to be installed in a specific place on the machine and that it will be the only version of that software on the machine. On a cluster, it may be necessary to look at software installation differently from a standard Linux machine:
- Install to a global filesystem
- Keep the boot image as small as possible
- Maintain multiple versions
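One way to reconcile "multiple versions" with "global filesystem" is a per-version install prefix; a sketch using an autotools-style build, where the path layout and package name are assumptions:

```shell
# Layout: /opt/apps/<name>/<version> on a globally mounted filesystem,
# so versions coexist and nothing lands in the node boot image
./configure --prefix=/opt/apps/fftw/3.3.4
make
make install
# A later version installs alongside, e.g. under /opt/apps/fftw/3.3.5
```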
Software installation and management
There are a couple of tools useful for navigating the difficulties of maintaining user environments when dealing with multiple versions of software or software in non-standard locations.
- SoftEnv (http://www.lcrc.anl.gov/info/Software/Softenv): useful for packaging the static user environment required by packages
- Modules (http://modules.sourceforge.net/): can be used to make dynamic changes to a user's environment.
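A typical Modules session, assuming the site has written modulefiles for its installs (the package name here is hypothetical):

```shell
module avail              # list software packaged as modules
module load fftw/3.3.4    # prepend this version's PATH/LD_LIBRARY_PATH
module list               # show currently loaded modules
module unload fftw/3.3.4  # undo the environment changes
```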
Resource Manager/Scheduler
- Accepts job submissions, maintains a queue of jobs
- Allocates nodes/resources and starts jobs on compute nodes
- Schedules waiting jobs
Available options:
- SGE (Sun Grid Engine)
- LSF / OpenLava (Load Sharing Facility)
- PBS (Portable Batch System): OpenPBS, Torque
- SLURM
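A minimal batch script, in PBS/Torque syntax since PBS is one of the options listed above; the job name, queue, resources, and program are placeholders:

```shell
#!/bin/bash
#PBS -N example_job
#PBS -l nodes=2:ppn=8         # 2 nodes, 8 processors per node
#PBS -l walltime=01:00:00
#PBS -q batch
cd "$PBS_O_WORKDIR"           # Torque starts jobs in $HOME by default
mpirun ./my_parallel_app
```

Submitted with qsub and monitored with qstat, the resource manager queues the job and the scheduler decides when and where it runs.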
Best Practices
Here is a quick overview of the general functions to secure a cluster:
- Risk avoidance
- Deterrence
- Prevention
- Detection
- Recovery
The priority of these will depend on your security approach.
Risk Avoidance
- Provide the minimum of services necessary
- Grant the least privileges necessary
- Install the minimum software necessary
The simpler the environment, the fewer the vectors available for attack.
Deterrence
- Limit the discoverability of the cluster
- Publish acceptable use policies

Prevention
- Fix known issues (patching)
- Configure services for minimal functionality
- Restrict user access and authority
- Document actions and changes
Detection
- Monitor the cluster
- Integrate feedback from the users
- Set alerts and automated response

Recovery
- Backups
- Documentation
- Define acceptable loss