CC Module-1 Notes

Module 1 of BIS613D discusses the evolution of distributed computing, highlighting the differences between grids and clouds, and the transition from high-performance computing (HPC) to high-throughput computing (HTC). It covers scalable computing trends, the Internet of Things (IoT), and the significance of multicore processors and GPUs in modern computing systems. The module emphasizes the need for efficiency, dependability, and adaptability in future computing architectures to meet growing demands.

Uploaded by

knpavithra.95

BIS613D Cloud Computing and Security Module-1

Module-1
Distributed System Models and Enabling Technologies:
Scalable Computing Over the Internet, Technologies for Network-Based Systems, System Models for
Distributed and Cloud Computing, Software Environments for Distributed Systems and Clouds,
Performance, Security and Energy Efficiency.
Textbook 1: Chapter 1: 1.1 to 1.5

EVOLUTION OF DISTRIBUTED COMPUTING

• Grids enable access to shared computing power and storage capacity from your desktop.
• Clouds enable access to leased computing power and storage capacity from your desktop.
• Grids are an open source technology. Resource users and providers alike can understand and
contribute to the management of their grid.
• Clouds are a proprietary technology. Only the resource provider knows exactly how their cloud
manages data, job queues, security requirements, and so on.
• The concept of grids was proposed in 1995. The Open Science Grid (OSG) started in 1995. The
EDG (European Data Grid) project began in 2001.
• In the late 1990s, Oracle and EMC offered early private cloud solutions. However, the term cloud
computing didn't gain prominence until 2007.
• Measuring system performance with high-performance computing (HPC) applications is
no longer optimal.
• The emergence of computing clouds instead demands high-throughput computing (HTC)
systems built with parallel and distributed computing technologies.
• We have to upgrade data centers using fast servers, storage systems, and high-bandwidth
networks.
• From 1950 to 1970, a handful of mainframes, including the IBM 360 and CDC 6400

1.1 SCALABLE COMPUTING OVER THE INTERNET
Instead of using a centralized computer to solve computational problems, a parallel and distributed
computing system uses multiple computers to solve large-scale problems over the Internet. Thus,
distributed computing becomes data-intensive and network-centric.

The Age of Internet Computing
The Platform Evolution
o From 1960 to 1980, lower-cost minicomputers such as the DEC PDP 11 and VAX Series
o From 1970 to 1990, we saw widespread use of personal computers built with VLSI microprocessors.
o From 1980 to 2000, massive numbers of portable computers and pervasive devices appeared in both
wired and wireless applications
o Since 1990, the use of both HPC and HTC systems hidden in clusters, grids, or Internet clouds has
proliferated


High-Performance Computing (HPC) and High-Throughput Computing (HTC) have evolved significantly,
driven by advances in clustering, P2P networks, and cloud computing.
• HPC Evolution:
o Traditional supercomputers (MPPs) are being replaced by clusters of cooperative
computers for better resource sharing.
o HPC has focused on raw speed performance, progressing from Gflops (1990s) to
Pflops (2010s).
• HTC and P2P Networks:
o HTC systems prioritize high-flux computing, emphasizing task throughput over raw
speed.
o P2P networks facilitate distributed file sharing and content delivery using globally
distributed client machines.
o HTC applications dominate areas like Internet searches and web services for millions of
users.
• Market Shift from HPC to HTC:
o HTC systems address challenges beyond speed, including cost, energy efficiency,
security, and reliability.
• Emerging Paradigms:
o Advances in virtualization have led to the rise of Internet clouds, enabling
service-oriented computing.
o Technologies like RFID, GPS, and sensors are fueling the growth of the Internet of
Things (IoT).
• Computing Model Overlaps:
o Distributed computing contrasts with centralized computing.
o Parallel computing shares concepts with distributed computing.
o Cloud computing integrates aspects of distributed, centralized, and parallel computing.


The transition from HPC to HTC marks a strategic shift in computing paradigms, focusing on
scalability, efficiency, and real-world usability over pure processing power.
Computing Paradigm Distinctions
• Centralized computing
A computing paradigm where all computer resources are centralized in a single physical
system. In this setup, processors, memory, and storage are fully shared and tightly integrated
within one operating system. Many data centers and supercomputers operate as centralized
systems, but they are also utilized in parallel, distributed, and cloud computing applications.

• Parallel computing
In parallel computing, processors are either tightly coupled with shared memory or loosely
coupled with distributed memory. Communication occurs through shared memory or message
passing. A system that performs parallel computing is a parallel computer, and the programs
running on it are called parallel programs. Writing these programs is referred to as parallel
programming.

• Distributed computing studies distributed systems, which consist of multiple autonomous
computers with private memory communicating through a network via message passing.
Programs running in such systems are called distributed programs, and writing them is known as
distributed programming.

• Cloud computing refers to a system of Internet-based resources that can be either centralized or
distributed. It uses parallel computing, distributed computing, or both, and can be established with
physical or virtualized resources over large data centers. Some regard cloud computing as a
form of utility computing or service computing. Alternatively, terms such as concurrent
computing or concurrent programming are used within the high-tech community, typically
referring to the combination of parallel and distributed computing, although interpretations
may vary among practitioners.

• Ubiquitous computing refers to computing with pervasive devices at any place and time
using wired or wireless communication. The Internet of Things (IoT) is a networked connection
of everyday objects including computers, sensors, humans, etc. The IoT is supported by Internet
clouds to achieve ubiquitous computing with any object at any place and time. Finally, the term
Internet computing is even broader and covers all computing paradigms over the Internet. This book
covers all the aforementioned computing paradigms, placing more emphasis on
distributed and cloud computing and their working systems, including the clusters, grids, P2P, and
cloud systems.

Internet of Things
The traditional Internet connects machines to machines or web pages to web pages. The concept of
the IoT was introduced in 1999 at MIT.
• The IoT refers to the networked interconnection of everyday objects, tools, devices, or computers.
One can view the IoT as a wireless network of sensors that interconnect all things in our daily life.
• It allows objects to be sensed and controlled remotely across existing network infrastructure.


Distributed System Families
Massively distributed systems, including grids, clouds, and P2P networks, focus on resource
sharing in hardware, software, and data sets. These systems emphasize parallelism and concurrency,
as demonstrated by large-scale infrastructures like the Tianhe-1A supercomputer (built in China in
2010 with over 3.2 million GPU cores).

Future HPC (High-Performance Computing) and HTC (High-Throughput Computing) systems will
require multicore and many-core processors to support large-scale parallel computing. The
effectiveness of these systems is determined by the following key design objectives:
1. Efficiency – Maximizing resource utilization for HPC and optimizing job throughput, data
access, and power efficiency for HTC.
2. Dependability – Ensuring reliability, self-management, and Quality of Service (QoS), even in
failure conditions.
3. Adaptability – Supporting large-scale job requests and virtualized resources across different
workload and service models.
4. Flexibility – Enabling HPC applications (scientific and engineering) and HTC applications
(business and cloud services) to run efficiently in distributed environments.
The future of distributed computing depends on scalable, efficient, and flexible architectures that
can meet the growing demand for computational power, throughput, and energy efficiency.

Scalable Computing Trends and New Paradigms
Scalable computing is driven by technological advancements that enable high-performance
computing (HPC) and high-throughput computing (HTC). Several trends, such as Moore’s Law
(doubling of processor speed every 18 months) and Gilder’s Law (doubling of network bandwidth
each year), have shaped modern computing. The increasing affordability of commodity hardware
has also fueled the growth of large-scale distributed systems.
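
To make the two doubling rules above concrete, the following minimal Python sketch projects the growth each rule implies. The doubling periods are the ones stated in these notes; the function name growth_factor and the chosen time spans are illustrative assumptions, not figures from the textbook.

```python
def growth_factor(years, doubling_period_years):
    """Multiplicative growth after `years` when capacity doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


MOORE_DOUBLING_YEARS = 1.5   # processor speed: every 18 months (as stated in the notes)
GILDER_DOUBLING_YEARS = 1.0  # network bandwidth: every year (as stated in the notes)

for years in (3, 6, 9):
    cpu = growth_factor(years, MOORE_DOUBLING_YEARS)
    net = growth_factor(years, GILDER_DOUBLING_YEARS)
    print(f"after {years} years: processor x{cpu:.0f}, bandwidth x{net:.0f}")
```

Under these two rules, bandwidth grows faster than processor speed, which is one way to read the shift toward network-centric computing described earlier.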

Degrees of Parallelism
Parallelism in computing has evolved through the following levels:

• Bit-Level Parallelism (BLP) – Transition from serial to word-level processing.
• Instruction-Level Parallelism (ILP) – Executing multiple instructions simultaneously
(pipelining, superscalar computing).
• Data-Level Parallelism (DLP) – SIMD (Single Instruction, Multiple Data) architectures.
• Task-Level Parallelism (TLP) – Parallel execution of independent tasks on multicore processors.
• Job-Level Parallelism (JLP) – Large-scale distributed job execution in cloud computing.
Coarse-grained parallelism builds on fine-grained parallelism, ensuring scalability in HPC and HTC
systems.
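
The difference between data-level and task-level parallelism can be sketched in a few lines of Python. This is only an illustration of the two programming patterns: a list comprehension is not real SIMD hardware, and Python threads do not guarantee true parallel CPU execution. All function names below are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(values, factor):
    """Data-level pattern: one operation applied uniformly to every data element."""
    return [v * factor for v in values]


def word_count(text):
    """An independent task: count the words in a string."""
    return len(text.split())


def checksum(data):
    """Another independent task: a toy checksum over a list of integers."""
    return sum(data) % 256


def run_independent_tasks():
    """Task-level pattern: unrelated tasks are handed to a pool of workers."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        words = pool.submit(word_count, "grids clusters clouds")
        check = pool.submit(checksum, [200, 100, 7])
        return words.result(), check.result()


print(scale([1, 2, 3], 10))
print(run_independent_tasks())
```

In the first function the parallelism comes from the data (same instruction, many elements); in the second it comes from the tasks (different instructions, no shared data).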


Innovative Applications of Distributed Systems
Parallel and distributed systems support applications in various domains:

Domain | Applications
Science & Engineering | Weather forecasting, genomic analysis
Business, education, services industry, and health care | E-commerce, banking, stock exchanges
Internet and web services, and government applications | Cybersecurity, digital governance, traffic monitoring
Mission-Critical Systems | Military, crisis management

HTC systems prioritize task throughput over raw speed, addressing challenges like cost,
energy efficiency, security, and reliability.

The Shift Toward Utility Computing
Utility computing follows a pay-per-use model where computing resources are delivered as a service.
Cloud computing extends this concept, allowing distributed applications to run on edge networks.

Challenges include:

• Efficient network processors
• Scalable storage and memory
• Virtualization middleware
• New programming models

The Hype Cycle of Emerging Technologies
New technologies follow a hype cycle, progressing through:
1. Technology Trigger – Early development and research.
2. Peak of Inflated Expectations – High expectations but unproven benefits.
3. Trough of Disillusionment – Realization of limitations.
4. Slope of Enlightenment – Gradual improvements.
5. Plateau of Productivity – Mainstream adoption.


For example, in 2010, cloud computing was moving toward mainstream adoption, while broadband over
power lines was expected to become obsolete.

The Internet of Things (IoT) and Cyber-Physical Systems (CPS)

• IoT: Interconnects everyday objects (sensors, RFID, GPS) to enable real-time tracking and
automation.
• CPS: Merges computation, communication, and control (3C) to create intelligent systems for
virtual and physical world interactions.

Both IoT and CPS will play a significant role in future cloud computing and smart infrastructure
development.

1.2 Technologies for Network-Based Systems
Advancements in multicore CPUs and multithreading technologies have played a crucial role in
the development of high-performance computing (HPC) and high-throughput computing (HTC).

Advances in CPU Processors


• Modern multicore processors integrate dual, quad, six, or more processing cores to enhance
parallelism at the instruction level (ILP) and task level (TLP).
• Processor speed growth has followed Moore’s Law, increasing from 1 MIPS (VAX 780,
1978) to 22,000 MIPS (Sun Niagara 2, 2008) and 159,000 MIPS (Intel Core i7 990x, 2011).
• Clock rates have increased from 10 MHz (Intel 286) to 4 GHz (Pentium 4) but have
stabilized due to heat and power limitations.

Multicore CPU and Many-Core GPU Architectures

• Multicore processors house multiple processing units, each with private L1 cache and
shared L2/L3 cache for efficient data access.
• Many-core GPUs (e.g., NVIDIA and AMD architectures) leverage hundreds to thousands of
cores, excelling in data-level parallelism (DLP) and graphics processing.
• Example: Sun Niagara II – Built with eight cores, each supporting eight threads, achieving a
maximum parallelism of 64 threads.

Key Trends in Processor and Network Technology
• Multicore chips continue to evolve with improved caching mechanisms and increased processing
cores per chip.
• Network speeds have improved from Ethernet (10 Mbps) to Gigabit Ethernet (1 Gbps)
and beyond 100 Gbps to support high-speed data communication.
Modern distributed computing systems rely on scalable multicore architectures and high-speed
networks to handle massive parallelism, optimize efficiency, and enhance overall performance.

Multicore CPU and Many-Core GPU Architectures
Advancements in multicore CPUs and many-core GPUs have significantly influenced modern
high-performance computing (HPC) and high-throughput computing (HTC) systems. As CPUs
approach their parallelism limits, GPUs have emerged as powerful alternatives for massive
parallelism and high computational efficiency.

Multicore CPU and Many-Core GPU Trends


• Multicore CPUs continue to evolve from tens to hundreds of cores, but they face challenges like
the memory wall problem, limiting data-level parallelism (DLP).
• Many-core GPUs, with hundreds to thousands of lightweight cores, excel in DLP and
task-level parallelism (TLP), making them ideal for massively parallel workloads.
• Hybrid architectures are emerging, combining fat CPU cores and thin GPU cores on
a single chip for optimal performance.

Multithreading Technologies in Modern CPUs
• Different microarchitectures exploit parallelism at instruction level (ILP) and thread level
(TLP):
o Superscalar Processors – Execute multiple instructions per cycle.
o Fine-Grained Multithreading – Switches between threads every cycle.
o Coarse-Grained Multithreading – Runs one thread for multiple cycles before switching.
o Simultaneous Multithreading (SMT) – Executes multiple threads in the same cycle.
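
The three multithreading policies above differ only in which thread gets to issue in each cycle. The toy simulation below shows the resulting issue order for two hypothetical threads, A and B. It is a sketch under strong assumptions: instructions are just labels, there are no stalls or dependencies, and the quantum and issue width parameters are illustrative rather than taken from any real processor.

```python
def fine_grained(threads):
    """Issue one instruction per cycle, rotating to the next thread every cycle."""
    pending = {name: list(ops) for name, ops in threads.items()}
    cycles = []
    while any(pending.values()):
        for ops in pending.values():
            if ops:
                cycles.append([ops.pop(0)])
    return cycles


def coarse_grained(threads, quantum):
    """Run one thread for up to `quantum` cycles before switching to the next."""
    pending = {name: list(ops) for name, ops in threads.items()}
    cycles = []
    while any(pending.values()):
        for ops in pending.values():
            for _ in range(min(quantum, len(ops))):
                cycles.append([ops.pop(0)])
    return cycles


def simultaneous(threads, width):
    """Issue up to `width` instructions per cycle, drawn from different threads."""
    pending = {name: list(ops) for name, ops in threads.items()}
    cycles = []
    while any(pending.values()):
        issued = []
        for ops in pending.values():
            if ops and len(issued) < width:
                issued.append(ops.pop(0))
        cycles.append(issued)
    return cycles


threads = {"A": ["A1", "A2", "A3", "A4"], "B": ["B1", "B2", "B3", "B4"]}
print(fine_grained(threads))              # alternates A and B every cycle
print(coarse_grained(threads, quantum=2)) # two cycles of A, then two of B, ...
print(simultaneous(threads, width=2))     # one A and one B instruction per cycle
```

Each inner list is one cycle. With two threads and an issue width of 2, the simultaneous policy finishes in half as many cycles as the other two in this idealized model.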

GPU Computing to Exascale and Beyond
• GPUs were initially designed for graphics acceleration but are now used for general-purpose
parallel computing (GPGPU).




• Modern GPUs (e.g., NVIDIA CUDA, Tesla, and Fermi) feature hundreds of cores, handling
thousands of concurrent threads.
• Example: The NVIDIA Fermi GPU has 512 CUDA cores and delivers 82.4 teraflops,
contributing to the performance of top supercomputers like Tianhe-1A.

GPU vs. CPU Performance and Power Efficiency
• GPUs prioritize throughput, while CPUs optimize latency using cache hierarchies.
• Power efficiency is a key advantage of GPUs – GPUs consume 1/10th of the power per
instruction compared to CPUs.
• Future Exascale Systems will require 60 Gflops/W per core, making power efficiency a
major challenge in parallel and distributed computing.
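
To see why the 60 Gflops/W target matters, the short calculation below estimates the power an exascale machine (10^18 floating-point operations per second) would draw at that efficiency. The 6 Gflops/W comparison point is a hypothetical value chosen only to show the effect of a tenfold lower efficiency; it is not a figure from the notes.

```python
EXAFLOPS = 1e18  # floating-point operations per second


def power_megawatts(flops, gflops_per_watt):
    """Power in megawatts needed to sustain `flops` at the given efficiency."""
    watts = flops / (gflops_per_watt * 1e9)
    return watts / 1e6


target = power_megawatts(EXAFLOPS, 60)  # efficiency target quoted in the notes
worse = power_megawatts(EXAFLOPS, 6)    # hypothetical: ten times less efficient
print(f"60 Gflops/W: {target:.1f} MW")
print(f" 6 Gflops/W: {worse:.1f} MW")
```

At the quoted target the machine needs on the order of tens of megawatts; a tenfold drop in efficiency pushes that into the hundreds.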


Challenges in Future Parallel and Distributed Systems
1. Energy and Power Efficiency – Reducing power consumption while increasing performance.
2. Memory and Storage Bottlenecks – Optimizing data movement to avoid bandwidth limitations.
3. Concurrency and Locality – Improving software and compiler support for parallel execution.
4. System Resiliency – Ensuring fault tolerance in large-scale computing environments.
The shift towards hybrid architectures (CPU + GPU) and the rise of power-aware computing models
will drive the next generation of HPC, HTC, and cloud computing systems.

1.2.3 Memory, Storage, and Wide-Area Networking
Memory Technology
• DRAM capacity has increased 4x every three years (from 16 KB in 1976 to 64 GB in 2011).
• Memory access speed has not kept pace, causing the memory wall problem, where CPUs
outpace memory access speeds.
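
The DRAM growth rule above can be checked against its own endpoints. The sketch below counts how many fourfold increases separate 16 KB from 64 GB and divides the elapsed years by that count. The helper name growth_steps is an illustrative assumption, and binary units (1 KB = 2^10 bytes) are assumed.

```python
import math

KB = 2 ** 10
GB = 2 ** 30


def growth_steps(start, end, factor):
    """Number of times `start` must be multiplied by `factor` to reach `end`."""
    return math.log(end / start) / math.log(factor)


steps = growth_steps(16 * KB, 64 * GB, 4)   # number of quadruplings
years_per_step = (2011 - 1976) / steps      # elapsed years per quadrupling
print(f"{steps:.1f} quadruplings, one every {years_per_step:.1f} years")
```

The endpoints work out to about 11 quadruplings over 35 years, or one roughly every 3.2 years, which is consistent with the "4x every three years" rule of thumb.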



Disks and Storage Technology
• Hard drive capacity has grown 10x every eight years, reaching 3 TB (Seagate Barracuda XT,
2011).
• Solid-State Drives (SSDs) provide significant speed improvements and durability (300,000
to 1 million write cycles per block).
• Power and cooling challenges limit large-scale storage expansion.

System-Area Interconnects & Wide-Area Networking
• Local Area Networks (LANs) connect clients and servers.
• Storage Area Networks (SANs) & Network Attached Storage (NAS) support large-scale
data storage and retrieval.
• Ethernet speeds have evolved from 10 Mbps (1979) to 100 Gbps (2011), with 1 Tbps links
expected in the future.
• High-speed networking enhances distributed computing efficiency and scalability.


1.2.4 Virtual Machines and Virtualization Middleware

Virtualization in Distributed Systems

• Traditional computing tightly couples OS and hardware, reducing flexibility.
• Virtual Machines (VMs) abstract hardware resources, allowing multiple OS instances on a
single system.

Virtual Machine Architectures
1. Native VM (Hypervisor-based) – Direct hardware access via bare-metal hypervisors (e.g.,
VMware ESXi, Xen).

Native VMs, also known as bare-metal virtualization, run directly on physical hardware
without requiring a host operating system. These VMs rely on a hypervisor (or Virtual
Machine Monitor, VMM) to manage multiple virtual instances running on a single hardware
platform.

• Runs directly on the physical machine (bare-metal).
• The hypervisor is responsible for allocating resources (CPU, memory, I/O) to virtual machines.
• Provides high performance and low overhead since it bypasses the host OS.
• Ensures strong isolation between VMs.
2. Host VM (Software-based) – Runs as an application on a host OS (e.g., VirtualBox, VMware
Workstation).

A hosted virtual machine runs as an application within an existing operating system, relying on a
host OS to provide access to hardware resources. These VMs are managed using software-based
virtualization platforms.

• Runs on top of a host operating system.
• Uses software-based virtualization techniques (binary translation, dynamic
recompilation).
• Has higher overhead compared to native VMs.
• Provides greater flexibility since it can run on general-purpose systems.


3. Hybrid VM – Uses a combination of user-mode and privileged-mode virtualization.

Hybrid VMs combine features of both native and hosted virtualization. They partially
virtualize hardware by running some components in user mode and others in privileged
mode. This architecture optimizes performance by reducing overhead while maintaining
flexibility and ease of management.

• Uses both hardware-assisted and software virtualization techniques.
• The hypervisor runs at the kernel level, but some functions rely on the host OS.
• Balances performance and flexibility for different workloads.

Virtual Machine Operations

• First, the VMs can be multiplexed between hardware machines, as shown in Figure 1.13(a).
• Second, a VM can be suspended and stored in stable storage, as shown in Figure 1.13(b).
• Third, a suspended VM can be resumed or provisioned to a new hardware platform, as shown in
Figure 1.13(c).
• Finally, a VM can be migrated from one hardware platform to another, as shown in Figure 1.13(d).

• Multiplexing – Multiple VMs share physical resources.
• Suspension & Migration – VMs can be paused, saved, or migrated across different servers.
• Provisioning – VMs can be dynamically deployed based on workload demand.
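
The four VM operations listed above can be modelled as state changes on a small record. The following is a minimal sketch and not the API of any real hypervisor: the class VirtualMachine, its fields, and the host names are hypothetical, and suspension is reduced to detaching the VM from its host.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualMachine:
    name: str
    host: Optional[str]      # hardware platform currently running the VM, if any
    state: str = "running"   # either "running" or "suspended"

    def suspend(self):
        """Pause the VM and detach it from its host (image kept in stable storage)."""
        if self.state != "running":
            raise RuntimeError(f"{self.name} is not running")
        self.state, self.host = "suspended", None

    def resume(self, host):
        """Resume a suspended VM, possibly on a different hardware platform."""
        if self.state != "suspended":
            raise RuntimeError(f"{self.name} is not suspended")
        self.state, self.host = "running", host

    def migrate(self, new_host):
        """Move a running VM from its current platform to another one."""
        if self.state != "running":
            raise RuntimeError(f"{self.name} must be running to migrate")
        self.host = new_host


def vms_on(host, vms):
    """Multiplexing: names of the VMs currently sharing one hardware platform."""
    return [vm.name for vm in vms if vm.host == host]


vms = [VirtualMachine("vm1", "hostA"), VirtualMachine("vm2", "hostA")]
vms[0].suspend()
vms[0].resume("hostB")    # resumed on a new platform
vms[1].migrate("hostB")   # moved without being suspended first
print(vms_on("hostB", vms))
```

The guards on state make the distinction explicit: resume applies only to a suspended VM, while migrate applies to one that is still running.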

Virtual Infrastructure
• Separates physical hardware from applications, enabling flexible resource management.
• Enhances server utilization from 5–15% to 60–80% (as claimed by VMware).


1.2.5 Data Center Virtualization for Cloud Computing
Data Center Growth and Cost Breakdown
• 43 million servers worldwide (2010), with utilities (power & cooling) exceeding hardware
costs after three years.
• 60% of data center costs go toward maintenance and management, emphasizing energy
efficiency over raw performance.

Low-Cost Design Philosophy
• Commodity x86 servers & Ethernet replace expensive mainframes & proprietary networking
hardware.
• Software handles fault tolerance, load balancing, and scalability, reducing infrastructure
costs.

Convergence of Technologies Enabling Cloud Computing
1. Virtualization & Multi-core Processors – Enable scalable computing.
2. Utility & Grid Computing – Provide a foundation for cloud computing.
3. SOA, Web 2.0, and Mashups – Facilitate cloud-based service integration.
4. Autonomic Computing & Data Center Automation – Improve efficiency and fault tolerance.

The Rise of Data-Intensive Computing
• Scientific research, business, and web applications generate vast amounts of data.
• Cloud computing & parallel computing address the data deluge challenge.
• MapReduce & Iterative MapReduce enable scalable data processing for big data and machine
learning applications.
• The convergence of data-intensive computing, cloud platforms, and multicore architectures
is shaping the next generation of distributed computing.
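
The MapReduce model mentioned above splits a job into a map phase, a grouping step, and a reduce phase. The single-process sketch below uses the classic word-count example to show the data flow. It is an illustration only: a real framework would run the map and reduce calls on many machines, and the function names here are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework would between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)


def word_count(documents):
    intermediate = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


print(word_count(["cloud grid", "Cloud cluster cloud"]))
```

Because each map call touches only its own document and each reduce call only its own key, both phases can be spread across nodes without coordination beyond the grouping step.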


The integration of memory, storage, networking, virtualization, and cloud data centers is
transforming distributed systems. By leveraging virtualization, scalable networking, and cloud
computing, modern infrastructures achieve higher efficiency, flexibility, and cost-effectiveness,
paving the way for future exascale computing.

1.3 SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING
• Distributed and cloud computing systems are built using large-scale, interconnected autonomous
computer nodes. These nodes are linked through Storage Area Networks (SANs), Local Area
Networks (LANs), or Wide Area Networks (WANs) in a hierarchical manner.

• Clusters: Connected by LAN switches, forming tightly coupled systems with hundreds of
machines.
• Grids: Interconnect multiple clusters via WANs, allowing resource sharing across
thousands of computers.
• P2P Networks: Form decentralized, cooperative networks with millions of nodes, used in file
sharing and content distribution.
• Cloud Computing: Operates over massive data centers, delivering on-demand computing
resources at a global scale.
These systems exhibit high scalability, enabling web-scale computing with millions of
interconnected nodes. Their technical and application characteristics vary based on factors such as
resource sharing, control mechanisms, and workload distribution.
Functionality, Applications | Computer Clusters [10,28,38] | Peer-to-Peer Networks [34,46] | Data/Computational Grids [6,18,51] | Cloud Platforms [1,9,11,12,30]
Architecture, Network Connectivity, and Size | Network of compute nodes interconnected by SAN, LAN, or WAN hierarchically | Flexible network of client machines logically connected by an overlay network | Heterogeneous clusters interconnected by high-speed network links over selected resource sites | Virtualized cluster of servers over data centers via SLA
Control and Resources Management | Homogeneous nodes with distributed control, running UNIX or Linux | Autonomous client nodes, free in and out, with self-organization | Centralized control, server-oriented with authenticated security | Dynamic resource provisioning of servers, storage, and networks
Applications and Network-centric Services | High-performance computing, search engines, and web services, etc. | Most appealing to business file sharing, content delivery, and social networking | Distributed supercomputing, global problem solving, and data center services | Upgraded web search, utility computing, and outsourced computing services
Representative Operational Systems | Google search engine, SunBlade, IBM Road Runner, Cray XT4, etc. | Gnutella, eMule, BitTorrent, Napster, KaZaA, Skype, JXTA | TeraGrid, GriPhyN, UK EGEE, D-Grid, ChinaGrid, etc. | Google App Engine, IBM Bluecloud, AWS, and Microsoft Azure


Clusters of Cooperative Computers
A computing cluster consists of interconnected stand-alone computers which work cooperatively as
a single integrated computing resource.
• In the past, clustered computer systems have demonstrated impressive results in handling
heavy workloads with large data sets.

Cluster Architecture

Server Clusters and System Models for Distributed Computing
1.3.1 Server Clusters and Interconnection Networks
Server clusters consist of multiple interconnected computers using high-bandwidth, low-latency
networks like Storage Area Networks (SANs), Local Area Networks (LANs), and InfiniBand.
These clusters are scalable, allowing thousands of nodes to be connected hierarchically.

• Clusters are connected to the Internet via a VPN gateway, whose IP address is used to locate
the cluster.
• Each node operates independently, with its own OS, creating multiple system images (MSI).
• The cluster manages shared I/O devices and disk arrays, providing efficient resource utilization.

1.3.1.2 Single-System Image (SSI)
An ideal cluster should merge multiple system images into a single-system image (SSI), where all
nodes appear as a single powerful machine.
• SSI is achieved through middleware or specialized OS support, enabling CPU, memory, and
I/O sharing across all cluster nodes.
• Clusters without SSI function as a collection of independent computers rather than a unified
system.


1.3.1.3 Hardware, Software, and Middleware Support
• Cluster nodes consist of PCs, workstations, or servers, interconnected using Gigabit Ethernet,
Myrinet, or InfiniBand.
• Linux OS is commonly used for cluster management.
• Message-passing interfaces (MPI, PVM) enable parallel execution across nodes.
• Middleware supports features like high availability (HA), distributed shared memory (DSM),
and job scheduling.
• Virtual clusters can be dynamically created using virtualization, optimizing resource allocation
on demand.
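
The message-passing style used on clusters can be illustrated without any cluster at all. In the sketch below, threads stand in for cluster nodes and in-memory queues stand in for the network. This is an assumption made purely for illustration: a real cluster program would use an MPI or PVM library, and none of the names here belong to those libraries.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """A 'node': receive a chunk of numbers, compute a partial sum, send it back."""
    chunk = inbox.get()
    outbox.put((rank, sum(chunk)))


def parallel_sum(data, workers=4):
    inboxes = [queue.Queue() for _ in range(workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(workers)
    ]
    for t in threads:
        t.start()
    for rank in range(workers):
        inboxes[rank].put(data[rank::workers])  # scatter: one message per node
    for t in threads:
        t.join()
    # gather: collect one partial result per node and reduce them
    return sum(results.get()[1] for _ in range(workers))


print(parallel_sum(list(range(101))))
```

The workers share no variables; all data moves through explicit send and receive operations, which is the property that lets the same program structure scale from one machine to many nodes.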

1.3.1.4 Major Cluster Design Issues

Features | Functional Characterization | Feasible Implementations
Availability and Support | Hardware and software support for sustained HA in cluster | Failover, failback, checkpointing, rollback recovery, nonstop OS, etc.
Hardware Fault Tolerance | Automated failure management to eliminate all single points of failure | Component redundancy, hot swapping, RAID, multiple power supplies, etc.
Single System Image (SSI) | Achieving SSI at functional level with hardware and software support, middleware, or OS extensions | Hardware mechanisms or middleware support to achieve DSM at coherent cache level
Efficient Communications | To reduce message-passing system overhead and hide latencies | Fast message passing, active messages, enhanced MPI library, etc.
Cluster-wide Job Management | Using a global job management system with better scheduling and monitoring | Application of single-job management systems such as LSF, Codine, etc.
Dynamic Load Balancing | Balancing the workload of all processing nodes along with failure recovery | Workload monitoring, process migration, job replication and gang scheduling, etc.
Scalability and Programmability | Adding more servers to a cluster or adding more clusters to a grid as the workload or data set increases | Use of scalable interconnect, performance monitoring, distributed execution environment, and better software tools

• Lack of a cluster-wide OS limits full resource sharing.
• Middleware solutions provide necessary functionalities like scalability, fault tolerance, and
job management.
• Key challenges include efficient message passing, seamless fault tolerance, high availability,
and performance scalability.
Server clusters are scalable, high-performance computing systems that utilize networked
computing nodes for parallel and distributed processing. Achieving SSI and efficient
middleware support remains a key challenge in cluster computing. Virtual clusters and cloud
computing are evolving to enhance cluster flexibility and resource management.

1.3.2 Grid Computing, Peer-to-Peer (P2P) Networks, and System Models
Grid Computing Infrastructures


Grid computing has evolved from Internet and web-based services to enable large-scale distributed
computing. It allows applications running on remote systems to interact in real time.

Computational Grids
• A grid connects distributed computing resources (workstations, servers, clusters,
supercomputers) over LANs, WANs, and the Internet.
• Used for scientific and enterprise applications, including SETI@Home and astrophysics
simulations.
• Provides an integrated resource pool, enabling shared computing, data, and information services.

Grid Families

Design Issues | Computational and Data Grids | P2P Grids
Grid Applications Reported | Distributed supercomputing, National Grid initiatives, etc. | Open grid with P2P flexibility, all resources from client machines
Representative Systems | TeraGrid built in US, ChinaGrid in China, and the e-Science grid built in UK | JXTA, FightingAID@Home, SETI@Home
Development Lessons Learned | Restricted user groups, middleware bugs, protocols to acquire resources | Unreliable user-contributed resources, limited to a few apps

• Computational and Data Grids – Used in national-scale supercomputing projects (e.g.,
TeraGrid, ChinaGrid, e-Science Grid).
• P2P Grids – Utilize client machines for open, distributed computing (e.g.,
SETI@Home, JXTA, FightingAID@Home).
• Challenges include middleware bugs, security issues, and unreliable user-contributed
resources.

Peer-to-Peer (P2P) Network Families
P2P systems eliminate central coordination, allowing client machines to act as both servers and
clients.

P2P Systems


• Decentralized architecture with self-organizing peers.
• No central authority; all nodes are independent.
• Dynamic membership – peers can join and leave freely.

1.3.3.2 Overlay Networks
• Logical connections between peers, independent of the physical network.
• Two types:
o Unstructured overlays – Randomly connected peers, requiring flooding for data retrieval
(high traffic).
o Structured overlays – Use predefined rules for routing and data lookup, improving
efficiency.
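
The traffic difference between the two overlay types can be sketched as follows. The first function floods a query through a small hypothetical peer graph and counts every forwarded message. The second maps a key to the peer responsible for it with a hash-based rule, in the spirit of distributed hash tables. The four-peer topology, the time-to-live parameter, the 16-bit identifier space, and the "first peer at or after the key" rule are all assumptions for illustration; real structured overlays also route hop by hop rather than consulting a global list of peers.

```python
import hashlib
from bisect import bisect_left


def flood_search(neighbors, holder, start, ttl):
    """Unstructured overlay: forward the query to all neighbors until the TTL runs out.

    Returns (found, messages_sent)."""
    visited = {start}
    frontier = [start]
    messages = 0
    while frontier:
        if holder in frontier:
            return True, messages
        if ttl == 0:
            break
        next_frontier = []
        for peer in frontier:
            for other in neighbors[peer]:
                messages += 1  # every forwarded copy of the query costs a message
                if other not in visited:
                    visited.add(other)
                    next_frontier.append(other)
        frontier = next_frontier
        ttl -= 1
    return False, messages


def ring_id(name, bits=16):
    """Hash a peer name or data key onto a small circular identifier space."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % (1 << bits)


def responsible_peer(key, peers):
    """Structured overlay: the key belongs to the first peer ID at or after the key ID."""
    ring = sorted((ring_id(p), p) for p in peers)
    ids = [pid for pid, _ in ring]
    index = bisect_left(ids, ring_id(key)) % len(ring)  # wrap around the ring
    return ring[index][1]


neighbors = {
    "p0": ["p1", "p2"],
    "p1": ["p0", "p3"],
    "p2": ["p0", "p3"],
    "p3": ["p1", "p2"],
}
print(flood_search(neighbors, holder="p3", start="p0", ttl=2))
print(responsible_peer("song.mp3", list(neighbors)))
```

Flooding reaches the holder only if the TTL is large enough and sends duplicate messages along the way, whereas the structured rule gives every peer the same deterministic answer for a given key.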

1.3.3.3 P2P Application Families
P2P networks serve four main application categories:

Category | Examples | Challenges
File Sharing | Napster, BitTorrent, Gnutella | Copyright issues, security concerns
Collaboration Platform | Skype, MSN, Multiplayer games | Privacy risks, spam, lack of trust
Distributed Computing | SETI@Home, Genome@Home | Security vulnerabilities, selfish nodes
Open P2P Platforms | JXTA, .NET, FightingAID@Home | Lack of standardization and security

1.3.3.4 P2P Computing Challenges
• Heterogeneity – Varying hardware, OS, and network configurations.
• Scalability – Must handle growing workloads and distributed resources efficiently.
• Data Location & Routing – Optimizing data placement for better performance.
• Fault Tolerance & Load Balancing – Peers can fail unpredictably.
• Security & Privacy – No central control means increased risk of data breaches and malware.
P2P networks offer robust and decentralized computing, but lack security and reliability, making
them suitable only for low-security applications like file sharing and collaborative tools.

Both grid computing and P2P networks provide scalable, distributed computing models. While
grids are used for structured, high-performance computing, P2P networks enable decentralized,


user-driven resource sharing. Future developments will focus on security, standardization, and
efficiency improvements.

CloudComputingovertheInternet
Cloud computing has emerged as a transformative on-demand computing paradigm, shifting
computation and data storage from desktops to large data centers. This approach enables the
virtualization of hardware, software, and data resources, allowing users to access scalable
services over the Internet.

Internet Clouds

 Cloud computing leverages virtualization to dynamically provision resources, reducing costs
and complexity.
 It offers elastic, scalable, and self-recovering computing power through server clusters and
large databases.
 The cloud can be perceived as either a centralized resource pool or a distributed computing
platform.
 Key benefits: Cost-effectiveness, flexibility, and multi-user application support.

The Cloud Landscape
Traditional computing systems suffer from high maintenance costs, poor resource utilization, and
expensive hardware upgrades. Cloud computing resolves these issues by providing on-demand
access to computing resources.


Three Major Cloud Service Models:

1. Infrastructure as a Service (IaaS)
o Provides computing infrastructure such as virtual machines (VMs), storage, and
networking.
o Users deploy and manage their applications but do not control the underlying
infrastructure.
o Examples: Amazon EC2, Google Compute Engine.
2. Platform as a Service (PaaS)
o Offers a development environment with middleware, databases, and
programming tools.
o Enables developers to build, test, and deploy applications without managing infrastructure.
o Examples: Google App Engine, Microsoft Azure, AWS Elastic Beanstalk.
3. Software as a Service (SaaS)
o Delivers software applications via web browsers.
o Users pay for access instead of purchasing software licenses.
o Examples: Google Workspace, Microsoft 365, Salesforce.

Cloud Deployment Models:
 Private Cloud – Dedicated to a single organization (e.g., corporate data centers).
 Public Cloud – Hosted by third-party providers for general use (e.g., AWS, Google Cloud).
 Managed Cloud – Operated by a third-party service provider with customized configurations.
 Hybrid Cloud – Combines public and private clouds, optimizing cost and security.

Advantages of Cloud Computing


Cloud computing provides several benefits over traditional computing paradigms, including:
1. Energy-efficient data centers in secure locations.
2. Resource sharing, optimizing utilization and handling peak loads.
3. Separation of infrastructure maintenance from application development.
4. Cost savings compared to traditional on-premise infrastructure.
5. Scalability for application development and cloud-based computing models.
6. Enhanced service and data discovery for content and service distribution.
7. Security and privacy improvements, though challenges remain.
8. Flexible service agreements and pricing models for cost-effective computing.
Cloud computing fundamentally changes how applications and services are developed, deployed,
and accessed. With virtualization, scalability, and cost efficiency, it has become the backbone of
modern Internet services and enterprise computing. Future advancements will focus on security,
resource optimization, and hybrid cloud solutions.

Software Environments for Distributed Systems and Clouds
This section introduces Service-Oriented Architecture (SOA) and other key software environments that
enable distributed and cloud computing systems. These environments define how applications,
services, and data interact within grids, clouds, and P2P networks.

Service-Oriented Architecture (SOA)
SOA enables modular, scalable, and reusable software components that communicate over a
network. It underpins web services, grids, and cloud computing environments.

Layered Architecture for Web Services and Grids
 Distributed computing builds on the OSI model, adding layers for service interfaces, workflows,
and management.
 Communication standards include:
o SOAP (Simple Object Access Protocol) – Used in web services.
o RMI (Remote Method Invocation) – Java-based communication.
o IIOP (Internet Inter-ORB Protocol) – Used in CORBA-based systems.
 Middleware tools (e.g., WebSphere MQ, Java Message Service) manage messaging, security,
and fault tolerance.

Web Services and Tools
SOA is implemented via two main approaches:
1. Web Services (SOAP-based) – Fully specified service definitions, enabling distributed OS-like
environments.
2. REST (Representational State Transfer) – Simpler, lightweight alternative for web applications
and APIs.
 Web Services provide structured, standardized communication but face challenges in protocol
agreement and efficiency.
 REST is flexible and scalable, better suited for fast-evolving environments.
 Integration of Services – Distributed systems use Remote Method Invocation (RMI) or
RPCs to link services into larger applications.
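
To make the contrast between the two approaches concrete, the sketch below builds the same request in both styles using only the Python standard library. The operation name (GetJob), the resource path (jobs), and the parameter are hypothetical; the only fixed value is the SOAP 1.1 envelope namespace. It is a minimal illustration of message shape, not a working client.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def soap_request(operation, params):
    """Wrap an operation call in a SOAP envelope: Envelope -> Body -> operation -> parameters."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def rest_request(resource, params):
    """Express the same call as an HTTP verb plus a resource URL with a query string."""
    return f"GET /{resource}?{urlencode(params)}"

if __name__ == "__main__":
    print(soap_request("GetJob", {"id": 42}))  # structured XML message
    print(rest_request("jobs", {"id": 42}))    # one short request line
```

The SOAP message carries its own structure and can be validated against a service definition, while the REST request relies on HTTP conventions, which is why it is lighter but less strictly specified.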

The Evolution of SOA

SOA has expanded from basic web services to complex multi-layered ecosystems:

 Sensor Services (SS) – Devices like ZigBee, Bluetooth, GPS, and WiFi collect raw data.
 Filter Services (FS) – Process data before feeding into computing, storage, or discovery
clouds.
 Cloud Ecosystem – Integrates compute clouds, storage clouds, and discovery clouds for
managing large-scale applications.

SOA enables data transformation from raw data → useful information → knowledge → wisdom
→ intelligent decisions.
SOA defines the foundation for web services, distributed systems, and cloud computing. By
integrating sensors, processing layers, and cloud resources, SOA provides a scalable, flexible
approach for modern computing applications. The future of distributed computing will rely on
intelligent data processing, automation, and service-driven architectures.

1.4.1.4 Grids vs. Clouds
 Grids use static resources, whereas clouds provide elastic, on-demand resources via
virtualization.
 Clouds focus on automation and scalability, while grids are better for negotiated resource
allocation.
 Hybrid models exist, such as clouds of grids, grids of clouds, and inter-cloud architectures.

1.4.2 Trends toward Distributed Operating Systems
Traditional distributed systems run independent OS instances on each node, resulting in multiple
system images. A distributed OS manages all resources coherently and efficiently across nodes.

Distributed OS Approaches (Tanenbaum's Models)
1. Network OS – Basic resource sharing via file systems (low transparency).
2. Middleware-based OS – Limited resource sharing through middleware extensions
(e.g., MOSIX for Linux clusters).
3. Truly Distributed OS – Provides single-system image (SSI) with full transparency across
resources.


Amoeba vs. DCE
 Amoeba (microkernel approach) offers a lightweight distributed OS model.
 DCE (middleware approach) extends UNIX for RPC-based distributed computing.
 MOSIX2 enables process migration across Linux-based clusters and clouds.

MOSIX2 for Linux Clusters
 Supports virtualization for seamless process migration across multiple clusters and clouds.
 Enhances parallel computing by dynamically balancing workloads across Linux nodes.

Transparency in Programming Environments
 Cloud computing separates user data, applications, OS, and hardware for flexible computing.
 Users can switch between OS platforms and cloud services without being locked into specific
applications.


Parallel and Distributed Programming Models
Distributed computing requires efficient parallel execution models to process large-scale workloads.

Model | Description | Key Features
MPI (Message-Passing Interface) | Standard for writing parallel applications on distributed systems | Explicit communication between processes via message passing
MapReduce | Web programming model for scalable data processing on large clusters | Map function generates key-value pairs; Reduce function merges values
Hadoop | Open-source framework for processing vast datasets in business and cloud applications | Distributed storage (HDFS) and MapReduce-based computing

Message-Passing Interface (MPI)
 Used for high-performance computing (HPC).
 Programs explicitly send and receive messages for inter-process communication.
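
The explicit send/receive style can be illustrated without an MPI installation. The sketch below is an in-process stand-in, not MPI itself: each "rank" is a Python thread with its own mailbox queue, and the helper names (run_ranks, partial_sum_worker) are hypothetical. It mirrors the common pattern in which every rank computes a partial result and sends it to rank 0, which receives and combines them.

```python
import queue
import threading

def run_ranks(n, worker):
    """Run n ranks as threads; each rank gets send(dest, msg) and recv() on its own mailbox."""
    mailboxes = [queue.Queue() for _ in range(n)]
    results = [None] * n

    def send(dest, msg):
        mailboxes[dest].put(msg)  # explicit message to a named destination rank

    def target(rank):
        recv = mailboxes[rank].get  # blocking receive on this rank's mailbox
        results[rank] = worker(rank, n, send, recv)

    threads = [threading.Thread(target=target, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def partial_sum_worker(rank, size, send, recv):
    """Each rank sums its own slice of 0..99; rank 0 gathers the partial sums."""
    local = sum(range(rank, 100, size))
    if rank == 0:
        total = local
        for _ in range(size - 1):
            total += recv()  # wait for one partial sum from each other rank
        return total
    send(0, local)
    return None

if __name__ == "__main__":
    print(run_ranks(4, partial_sum_worker)[0])
```

In a real MPI program the ranks would be separate processes, possibly on different nodes, and the library would carry the messages over the network; the program structure stays the same.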

MapReduce
 Highly scalable parallel model, used in big data processing and search engines.
 Splits workloads into Map (processing) and Reduce (aggregation) tasks.
 Google executes thousands of MapReduce jobs daily for large-scale data analysis.
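
A word count is the usual small example of the Map and Reduce split. The sketch below runs in a single Python process and only shows the data flow; the function names are illustrative, and a real framework would distribute the map and reduce tasks across a cluster and handle the grouping step itself.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between Map and Reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: merge all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

if __name__ == "__main__":
    print(word_count(["cloud grid cloud", "grid p2p"]))
```

Because each document is mapped independently and each key is reduced independently, both phases can run in parallel on many machines.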

Hadoop
 Open-source implementation of the MapReduce model with its own distributed file system
(HDFS), used for processing petabytes of data.
 Scalable, cost-effective, and fault-tolerant, making it well suited to cloud services.

Grid Standards and Toolkits
Grids use standardized middleware to manage resource sharing and security.

Standard | Function | Key Features
OGSA (Open Grid Services Architecture) | Defines common grid services | Supports heterogeneous computing, security policies, and resource allocation
Globus Toolkit (GT4) | Middleware for resource discovery and security | Uses PKI authentication, Kerberos, SSL, and delegation policies
IBM Grid Toolbox | Grid computing framework for AIX/Linux clusters | Supports autonomic computing and security management

 Distributed OS models are evolving, with MOSIX2 enabling process migration and resource
sharing across Linux clusters.
 Parallel programming models like MPI and MapReduce optimize large-scale computing.
 Cloud computing and grid computing continue to merge, leveraging virtualization and
elastic resource management.
 Standardized middleware (OGSA, Globus) enhances grid security, interoperability, and
automation.

Performance, Security, and Energy Efficiency
This section discusses key design principles for distributed computing systems, covering
performance metrics, scalability, system availability, fault tolerance, and energy efficiency.

Performance Metrics and Scalability Analysis
Performance is measured using metrics such as MIPS (million instructions per second), Tflops
(tera floating-point operations per second), TPS (transactions per second), and network latency.
Scalability is crucial in distributed systems and has multiple dimensions:
1. Size Scalability – Expanding system resources (e.g., processors, memory, storage) to improve
performance.
2. Software Scalability – Upgrading OS, compilers, and libraries to accommodate larger systems.
3. Application Scalability – Increasing problem size to match system capacity for cost-effectiveness.
4. Technology Scalability – Adapting to new hardware and networking technologies while ensuring
compatibility.

1.5.1.3 Scalability vs. OS Image Count
 SMP systems scale up to a few hundred processors due to hardware constraints.
 NUMA systems use multiple OS images to scale to thousands of processors.
 Clusters and clouds scale further by using virtualization.
 Grids integrate multiple clusters, supporting hundreds of OS images.
 P2P networks scale to millions of nodes with independent OS images.




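1.5.1.4 Amdahl’s Law (Fixed Workload Scaling)
 Speedup in parallel computing is limited by the sequential portion of a program.
 Speedup formula: $S = \frac{T}{\alpha T + (1 - \alpha) T / n} = \frac{1}{\alpha + (1 - \alpha)/n}$,
where T is the execution time on one processor, n is the number of processors, and α is the
fraction of the workload that is sequential.
 Even with hundreds of processors, speedup is limited if the sequential fraction α is high: as n
grows, S approaches 1/α.

Problem with Fixed Workload
 In Amdahl’s law, we have assumed the same amount of workload for both sequential and parallel
execution of the program with a fixed problem size or data set. This was called fixed-workload speedup
by Hwang and Xu [14]. To execute a fixed workload on n processors, parallel processing may lead to a system
efficiency defined as follows: $E = \frac{S}{n} = \frac{1}{\alpha n + 1 - \alpha}$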
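
A short numeric check makes the limit visible. The sketch below assumes the standard form of Amdahl's law, S = 1 / (α + (1 − α)/n), and the efficiency E = S/n; the function names and the sample values are chosen only for illustration.

```python
def amdahl_speedup(alpha, n):
    """Fixed-workload speedup on n processors with sequential fraction alpha."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

def amdahl_efficiency(alpha, n):
    """Fraction of the n processors' capacity that is actually used."""
    return amdahl_speedup(alpha, n) / n

if __name__ == "__main__":
    alpha = 0.25  # assumed sequential fraction for the example
    for n in (4, 16, 64, 256):
        print(n, round(amdahl_speedup(alpha, n), 2), round(amdahl_efficiency(alpha, n), 4))
```

With α = 0.25 the speedup can never exceed 1/α = 4, so adding processors beyond a few dozen mostly lowers efficiency rather than raising speed.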


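1.5.1.6 Gustafson’s Law (Scaled Workload Scaling)
 Instead of fixing workload size, this model scales the problem to match available
processors.
 Speedup formula: with W the original workload, α its sequential fraction, and
W' = αW + (1 − α)nW the workload scaled to n processors,
$S' = \frac{W'}{W} = \frac{\alpha W + (1 - \alpha) n W}{W} = \alpha + (1 - \alpha) n$
 This speedup is known as Gustafson’s law. By fixing the parallel execution time
at level W, the following efficiency expression is obtained:
$E' = \frac{S'}{n} = \frac{\alpha}{n} + (1 - \alpha)$
 More efficient for large clusters, as workload scales dynamically with system size.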
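
The scaled-workload case can be checked numerically in the same way. The sketch assumes the standard form of Gustafson's law, S' = α + (1 − α)n, with efficiency E' = S'/n; the function names and sample values are illustrative only.

```python
def gustafson_speedup(alpha, n):
    """Scaled-workload speedup: the parallel part of the work grows with n."""
    return alpha + (1.0 - alpha) * n

def gustafson_efficiency(alpha, n):
    """Efficiency under scaled workload; approaches (1 - alpha) for large n."""
    return gustafson_speedup(alpha, n) / n

if __name__ == "__main__":
    alpha = 0.25  # assumed sequential fraction for the example
    for n in (4, 16, 64, 256):
        print(n, gustafson_speedup(alpha, n), round(gustafson_efficiency(alpha, n), 4))
```

For α = 0.25 and n = 256 the efficiency stays near 0.75, whereas the fixed-workload model gives roughly 0.015 for the same α and n, which is why scaling the problem with the machine is the better fit for large clusters.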

1.5.2 Fault Tolerance and System Availability
 High availability (HA) is essential in clusters, grids, P2P networks, and clouds.
 System availability depends on Mean Time to Failure (MTTF) and Mean Time to Repair
(MTTR): Availability = MTTF / (MTTF + MTTR)
 Eliminating single points of failure (e.g., hardware redundancy, fault isolation) improves
availability.

 P2P networks are highly scalable but have low availability due to frequent peer failures.
 Grids and clouds offer better fault isolation and thus higher availability than
traditional clusters.
 Scalability and performance depend on resource expansion, workload distribution, and
parallelization.
 Amdahl’s Law limits speedup for fixed workloads, while Gustafson’s Law optimizes large-
scale computing.
 High availability requires redundancy, fault tolerance, and system design improvements.
 Clouds and grids balance scalability and availability better than traditional SMP or
NUMA systems.
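
The availability formula, Availability = MTTF / (MTTF + MTTR), and the effect of removing a single point of failure can be tried out with a few lines of Python. The series and redundancy formulas below assume that components fail independently, which is a simplification; the function names and the numbers are hypothetical.

```python
def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

def series_availability(component_availabilities):
    """All components are required: any single failure takes the system down."""
    result = 1.0
    for a in component_availabilities:
        result *= a
    return result

def redundant_availability(a, copies):
    """Any one of several identical copies is enough (assumes independent failures)."""
    return 1.0 - (1.0 - a) ** copies

if __name__ == "__main__":
    node = availability(mttf_hours=999, mttr_hours=1)  # one node that is down 1 hour in 1000
    print(node)
    print(series_availability([node, node]))   # two nodes, both required
    print(redundant_availability(node, 2))     # two nodes, either one suffices
```

Chaining required components lowers availability, while duplicating a component raises it, which is the reasoning behind eliminating single points of failure.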

Network Threats, Data Integrity, and Energy Efficiency
This section highlights security challenges, energy efficiency concerns, and mitigation strategies
in distributed computing systems, including clusters, grids, clouds, and P2P networks.
Network Threats and Data Integrity
Distributed systems require security measures to prevent cyberattacks, data breaches, and unauthorized
access.
Threats to Systems and Networks

 Loss of Confidentiality – Due to eavesdropping, traffic analysis, and media scavenging.
 Loss of Integrity – Caused by penetration attacks, Trojan horses, and unauthorized access.
 Loss of Availability – Denial of Service (DoS) and resource exhaustion disrupt system operation.
 Improper Authentication – Allows attackers to steal resources, modify data, and conduct
replay attacks.

Security Responsibilities
Security in cloud computing is divided among different stakeholders based on the cloud service model:

 SaaS: Cloud provider handles security, availability, and integrity.
 PaaS: Provider manages integrity and availability, while users control confidentiality.
 IaaS: Users are responsible for most security aspects, while providers ensure availability.

Copyright Protection
 Collusive piracy in P2P networks allows unauthorized file sharing.
 Content poisoning and timestamped tokens help detect piracy and protect digital rights.

System Defense Technologies
Three generations of network security have evolved:

1. Prevention-based – Access control, cryptography.
2. Detection-based – Firewalls, intrusion detection systems (IDS), Public Key
Infrastructure (PKI).
3. Intelligent response systems – AI-driven threat detection and response.

1.5.3.5 Data Protection Infrastructure
 Trust negotiation ensures secure data sharing.
 Worm containment & intrusion detection protect against cyberattacks.
 Cloud security responsibilities vary based on the service model (SaaS, PaaS, IaaS).

1.5.4 Energy Efficiency in Distributed Computing
Distributed systems must balance high performance with energy efficiency due to increasing power costs
and environmental impact.

Energy Consumption of Unused Servers
 Many servers are left powered on but idle, leading to huge energy waste.
 Global energy cost of idle servers: $3.8 billion annually, with 11.8 million tons of
CO₂ emissions.
 IT departments must identify underutilized servers to reduce waste.

Reducing Energy in Active Servers
Energy consumption can be managed across four layers (Figure 1.26):

1. Application Layer – Optimize software to balance performance and energy consumption.
2. Middleware Layer – Smart task scheduling to reduce unnecessary computations.
3. Resource Layer – Use Dynamic Power Management (DPM) and Dynamic Voltage-
Frequency Scaling (DVFS).
4. Network Layer – Develop energy-efficient routing algorithms and optimize
bandwidth usage.

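1.5.4.3 Dynamic Voltage-Frequency Scaling (DVFS)
 Reduces CPU voltage and frequency during idle (slack) times to save power.
o Formula for energy consumption in CMOS circuits:
$E = C_{eff} f v^2 t$, with $f = K \frac{(v - v_t)^2}{v}$,
where v is the supply voltage, C_eff the circuit switching capacity, K a technology-dependent
factor, v_t the threshold voltage, f the clock frequency, and t the execution time.
o Lowering voltage and frequency significantly reduces energy usage.
 Potential savings: DVFS can cut power consumption while maintaining performance.
 Energy efficiency is critical due to high costs and environmental impact.
 Techniques like DPM and DVFS can significantly reduce power consumption without
compromising performance.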
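
The effect of voltage scaling can be estimated with the usual CMOS model, E = C_eff · f · v² · t with f = K (v − v_t)² / v. The sketch below plugs in made-up constants, so the absolute numbers mean nothing; it also assumes a task with a fixed number of clock cycles, so that execution time is t = cycles / f. All names and values are illustrative assumptions.

```python
def frequency(v, k, v_t):
    """Clock frequency reachable at supply voltage v: f = K * (v - v_t)^2 / v."""
    return k * (v - v_t) ** 2 / v

def energy(c_eff, v, k, v_t, cycles):
    """Energy for a task of `cycles` clock cycles: E = C_eff * f * v^2 * t, with t = cycles / f."""
    f = frequency(v, k, v_t)
    t = cycles / f  # assumption: the task needs a fixed number of cycles
    return c_eff * f * v ** 2 * t

if __name__ == "__main__":
    C_EFF, K, V_T, CYCLES = 1e-9, 1e9, 0.3, 1e9  # hypothetical constants
    for v in (1.0, 0.9, 0.8):
        f = frequency(v, K, V_T)
        print(v, round(f / 1e6), "MHz", round(CYCLES / f, 2), "s", round(energy(C_EFF, v, K, V_T, CYCLES), 3), "J")
```

Under these assumptions the energy for a fixed task falls with the square of the voltage, but the task also runs longer, so DVFS preserves performance only when the schedule has slack to absorb the lower frequency.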
