INTRODUCTION TO
CLOUD COMPUTING
MODULE 1
CLOUD COMPUTING
• Cloud computing is the delivery of computing services—including servers,
storage, databases, networking, software, analytics, and intelligence—over
the Internet (“the cloud”) to offer faster innovation, flexible resources, and
economies of scale. You typically pay only for cloud services you use,
helping lower your operating costs, run your infrastructure more efficiently
and scale as your business needs change.
CLOUD COMPUTING
• Cloud computing is the on-demand availability of computer system resources, especially
data storage and computing power, without direct active management by the user
• Cloud computing relies on sharing of resources to achieve coherence and economies of
scale.
• The cloud aims to cut costs, and helps the users focus on their core business instead of
being impeded by IT obstacles
• The main enabling technology for cloud computing is virtualization.
• Virtualization software separates a physical computing device into one or more "virtual"
devices, each of which can be easily used and managed to perform computing tasks.
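As a rough illustration (a toy model, not a real hypervisor), virtualization can be thought of as carving one physical machine's resources into several isolated virtual machines:

```python
# Toy model of virtualization: one physical host's CPU cores and RAM
# are partitioned into isolated "virtual machines". A real hypervisor
# (e.g., KVM or Xen) does far more; this only illustrates the idea.

class PhysicalHost:
    def __init__(self, cores, ram_gb):
        self.free_cores = cores
        self.free_ram_gb = ram_gb
        self.vms = []

    def create_vm(self, name, cores, ram_gb):
        # Refuse to over-allocate the physical resources.
        if cores > self.free_cores or ram_gb > self.free_ram_gb:
            raise RuntimeError("insufficient physical resources")
        self.free_cores -= cores
        self.free_ram_gb -= ram_gb
        vm = {"name": name, "cores": cores, "ram_gb": ram_gb}
        self.vms.append(vm)
        return vm

host = PhysicalHost(cores=16, ram_gb=64)
host.create_vm("web", cores=4, ram_gb=8)
host.create_vm("db", cores=8, ram_gb=32)
```

Each "virtual device" gets its own slice of the hardware and can be managed independently, which is what makes the pooled, on-demand model of the cloud possible.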
TRADITIONAL COMPUTING –
LIMITATIONS
DIFFICULTIES FACED BY INDIVIDUAL USERS IN
TRADITIONAL COMPUTING APPROACH
Traditional Computing Scenario / Problematic facts and related questions
• Scenario: Business application package implementation also over-burdens IT enterprises with many other costs. Setting up infrastructure, installation of the OS and device drivers, and management of routers, firewalls, proxy servers etc. are all responsibilities of the enterprise in the traditional computing approach.
  Problem: Enterprises (or IT service firms) need to maintain a team of experts (a system maintenance team) to manage the whole thing. This burdens HR management and incurs recurring expenditure (salaries). Could enterprises get relief from these responsibilities? It would help them concentrate fully on the functioning of their business applications.
• Scenario: Even IT enterprises whose sole business interest is developing applications are bound to set up computing infrastructure before they start any development work.
  Problem: This is an extra burden for enterprises interested only in application development. They can outsource infrastructure management to a third party, but the cost and quality of such services vary widely. Could IT enterprises avert such difficulties?
• Scenario: Computing infrastructure requires adequate hardware procurement. This procurement is costly, and it is not a one-time investment: every few years, existing devices become outdated as more powerful devices appear.
  Problem: It becomes difficult to compete in the market with outdated hardware infrastructure. Advanced software applications also require upgraded hardware to maximize business output. Could this regular hardware-upgrade cycle be removed from an enterprise's responsibility?
• Scenario: It is not unusual to find updated versions of an application, with new releases that are more advanced and better suited to a changing business scenario.
  Problem: Adopting an updated version of an application requires effort on the subscriber's end: fresh installation and integration of components need to be done. Could subscribers be relieved of this burden of periodically upgrading applications?
• Scenario: Capacity planning of computing resources is a critical task for any organization. Appropriate planning needs time, expertise and budgetary allocation, since a low resource volume hampers application performance.
  Problem: Enterprises generally plan and procure to support the maximum business load they anticipate, but average resource demand remains far lower most of the time. This wastes resources and increases the recurring cost of business. Could capacity planning be made less critical, and resource procurement more cost-effective?
• Scenario: The resource requirements of a system may increase or decrease from time to time.
  Problem: Individual enterprises cannot contract a system so that its unutilized resources are used by some other system, which would reduce business costs. What if this were somehow possible?
• Scenario: Many enterprise computing systems run forever without stopping. Such systems host applications that require round-the-clock availability to fulfill business demand.
  Problem: When capacity expansion of such a system becomes an absolute business requirement, a system shutdown (and hence service disruption) becomes unavoidable, which may cause business losses. What if a system could be expanded without shutting it down?
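The wastage caused by provisioning for peak load can be made concrete with a small calculation (the numbers below are hypothetical, chosen only to illustrate the point):

```python
# Hypothetical numbers: capacity bought for the anticipated peak load
# vs. the average demand actually seen most of the time.
peak_load_units = 100   # provisioned capacity
avg_demand_units = 25   # typical utilization

utilization = avg_demand_units / peak_load_units   # fraction actually used
wasted_fraction = 1 - utilization                  # capacity sitting idle

# Under a pay-per-use model, the enterprise would pay roughly for the
# average demand rather than for the anticipated peak.
print(f"utilization: {utilization:.0%}, idle capacity: {wasted_fraction:.0%}")
```

With these figures, three quarters of the procured capacity is idle in a typical hour, which is exactly the inefficiency that elastic, pay-per-use provisioning targets.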
• Scenario: To work with software applications (text editors, image editors, programming tools, games etc.), users first need to procure a computing system on which these applications run.
  Problem: For general users (who do not want to experiment with computer hardware), this initial capital investment in computing infrastructure often exceeds the cost of the software applications they use! Could this large hardware investment be avoided?
• Scenario: Requirement analysis and procurement of hardware infrastructure are the user's responsibility, but the actual utilization of these resources depends on the frequency of user access and the kind of software applications run on them.
  Problem: General users are usually not computing experts. They are often misguided and procure an unnecessary volume/capacity of hardware, much of which remains unutilized, reducing the return on investment (ROI). What if users did not have to procure a fixed volume of hardware prior to its actual use or demand?
• Scenario: A hardware component may fail for many reasons, and maintenance of the hardware infrastructure is the user's responsibility.
  Problem: Time, cost and uncertainty are involved in the maintenance process. Could users get relief from these responsibilities and difficulties?
• Scenario: Computing systems (desktops, laptops etc.) procured by most users are used for only a few hours daily on average.
  Problem: Non-utilization of procured systems wastes resources relative to the total investment. What if hardware resources were available on a pay-per-use basis?
• Scenario: Software licensing costs need separate budgetary allocation. Licenses are sold for a fixed period of time (usually one year).
  Problem: If software is used 2–5 hours per day on average during the licensing period, that represents only about 8%–20% utilization of the investment. Could this cost be reduced? What if the licensing fee were paid on an hourly-usage basis?
• Scenario: Users are burdened with the installation and critical customization of software, and must troubleshoot when the software crashes.
  Problem: Professional help can be obtained for a fee, or users can troubleshoot themselves and invest more time. Could users be relieved of these responsibilities and difficulties?
• Scenario: Users need physical access to the system in order to use a personal computing system.
  Problem: Though portable devices (laptops, tablets etc.) are available, it may not be possible to carry them all the time. Could personal computing systems be accessed remotely, from any location, at any time?
• Scenario: Within a few years, hardware systems become outdated, and it becomes difficult to run advanced or new software on them.
  Problem: Users have no option but to discard the whole setup and replace it with a new one. Could there be a permanent solution to this wastage (from the users' end)?
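The 8%–20% license-utilization figure quoted above follows directly from the usage hours: 2 to 5 hours out of a 24-hour day.

```python
# Utilization of an always-paid software license used 2-5 hours per day.
hours_per_day = 24

low_usage = 2 / hours_per_day    # lower bound of daily use
high_usage = 5 / hours_per_day   # upper bound of daily use

print(f"{low_usage:.1%} to {high_usage:.1%} of the license investment is used")
```

So even a heavily used desktop license sits idle roughly four fifths of the time, which is the argument for hourly, usage-based pricing.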
• In the traditional approach, computing subscribers were always over-burdened with
many additional difficulties and costs.
• Cloud computing facilitates the delivery of computing services like any other
utility service.
• Customers can use computing service just as electricity is consumed and pay bills
at the end of the month
• The key advantage of this new model of computing is the access to any kind and
any volume of computing facilities from anywhere and anytime.
• Cloud computing has created the scope for consumers to rent a server on the
Internet and work from any place by signing in to gain access.
SYSTEM MODELS FOR DISTRIBUTED AND CLOUD
COMPUTING
• Distributed and Cloud computing systems :
• Built over a large number of autonomous computer nodes.
• Interconnected by SANs, LANs, or WANs in a hierarchical manner.
• LAN switches - connect hundreds of machines as a working cluster.
• WAN - connect many local clusters to form a very large cluster of clusters.
• A massive system with millions of computers connected to edge networks can be built this way.
• Massive systems are considered highly scalable, and can reach web-scale connectivity – physically
or logically.
• Massive systems are classified into four groups:
• Clusters
• P2P networks
• Computing grids
• Internet clouds over huge data centers
• These four system classes may involve hundreds, thousands, or even millions of
computers as participating nodes.
CLUSTERS OF COOPERATIVE COMPUTERS
• A cluster consists of interconnected stand-alone computers that work cooperatively as a single
integrated computing resource.
CLUSTER ARCHITECTURE
• A typical server cluster is built around a low-latency, high-bandwidth interconnection network.
• Network can be:
• a simple SAN
• a LAN (e.g., Ethernet)
• To build a larger cluster with more nodes, the interconnection network can be built with
multiple levels of Gigabit Ethernet, Myrinet, or InfiniBand switches.
• Through hierarchical construction using a SAN, LAN, or WAN, one can build scalable
clusters with an increasing number of nodes.
• The cluster is connected to the Internet via a virtual private network (VPN) gateway.
• The gateway IP address locates the cluster.
• Most clusters have loosely coupled node computers and their resources are managed by
their own OS.
• So most clusters have multiple system images.
• Single System Image (SSI):
• An ideal cluster should merge multiple system images into a single-system image.
• A cluster operating system or some middleware is required to support SSI at various
levels, including the sharing of CPUs, memory, and I/O across all cluster nodes.
• SSI is an illusion created by software or hardware that presents a collection of resources as
one integrated, powerful resource. SSI makes the cluster appear as a single machine to
the user.
• A cluster with multiple system images is nothing but a collection of independent
computers.
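The SSI idea can be sketched in a few lines (a toy model, not real cluster middleware): a thin layer presents several independent nodes as one aggregate resource, and callers never address an individual node.

```python
# Toy single-system-image layer: users see one "machine" whose capacity
# is the sum of all nodes; the layer, not the user, picks a node for
# each task. Real SSI middleware also shares memory, I/O, etc.
class ClusterSSI:
    def __init__(self, node_cores):
        # node_cores: mapping of node name -> number of CPU cores
        self.nodes = dict(node_cores)

    def total_cores(self):
        # The capacity of the "single machine" the user sees.
        return sum(self.nodes.values())

    def place_task(self):
        # Trivial placement policy: pick the node with the most cores.
        # The caller never learns or cares which node this is.
        return max(self.nodes, key=self.nodes.get)

cluster = ClusterSSI({"node1": 8, "node2": 16, "node3": 8})
```

A user of `cluster` sees one 32-core resource; the per-node detail is hidden behind the middleware layer, which is the essence of SSI.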
• Hardware, Software, and Middleware Support:
• Hardware:
• PCs, workstations, servers, or
• SMP
• Software:
• Special communication software such as PVM or MPI
• Network interface card in each computer node
• Most clusters run under the Linux OS.
• The computer nodes are interconnected by a high-bandwidth network (such as Gigabit
Ethernet, Myrinet, InfiniBand, etc.).
• Middleware:
• Special cluster middleware supports are needed to create SSI.
GRID COMPUTING INFRASTRUCTURES
• Grid computing can be defined as a network of computers working together to perform
a task that would be difficult for a single machine.
• All machines on that network work under the same protocol to act like a virtual
supercomputer.
• The task that they work on may include analysing huge datasets or simulating situations
which require high computing power.
• Computers on the network contribute resources like processing power and storage
capacity to the network.
GRID COMPUTING INFRASTRUCTURES
• An infrastructure that couples computers, software/middleware, special instruments, and
people and sensors together.
• Constructed across LAN, WAN, or Internet backbone networks at a regional, national, or
global scale.
• Mainly uses workstations, servers, clusters, and supercomputers.
• Personal computers, laptops, and PDAs can be used as access devices to a grid system.
• Enterprises or organizations present grids as integrated computing resources
• Computational grid built over multiple resource sites owned by different organizations.
• The resource sites offer complementary computing resources, including workstations, large
servers, a mesh of processors, and Linux clusters to satisfy a chain of computational needs.
• The grid is built across various IP broadband networks including LANs and WANs
already used by enterprises or organizations over the Internet.
• The grid is presented to users as an integrated resource pool
• At the client end - wired or wireless terminal devices.
• The grid integrates the computing, communication, contents, and transactions as rented services.
• Enterprises and consumers form the user base.
• Industrial grid platform development by IBM, Microsoft, Sun, HP, Dell, Cisco
PEER-TO-PEER NETWORK FAMILIES
• The P2P architecture offers a distributed model of networked systems.
• A P2P network is client-oriented instead of server-oriented.
• P2P systems are introduced at the physical level and overlay networks at the logical level.
• P2P Systems:
• Every node acts as both a client and a server, providing part of the system resources.
• Peer machines- client computers connected to the Internet
• All client machines act autonomously to join or leave the system freely.
• No master-slave relationship exists among the peers.
• No central coordination or central database is needed.
• No peer machine has a global view of the entire P2P system.
• The system is self-organizing with distributed control.
• Physical Network:
• The participating peers form the physical network at any time.
• Unlike the cluster or grid, a P2P network does not use a dedicated interconnection
network.
• The physical network is simply an ad hoc network formed at various Internet domains
randomly using the TCP/IP and NAI protocols
• Overlay Network:
• Based on communication or file-sharing needs, the peer IDs form an overlay network at the logical level.
• This overlay is a virtual network formed by logically mapping each physical machine to its
peer ID through a virtual mapping.
• When a new peer joins the system, its peer ID is added as a node in the overlay network and is removed from
the overlay network automatically when it leaves.
• Therefore, it is the P2P overlay network that characterizes the logical connectivity among the peers
• Two types of overlay networks:
• unstructured and structured
• An unstructured overlay network is characterized by a random graph.
• There is no fixed route to send messages or files among the nodes.
• Often, flooding is applied to send a query to all nodes in an unstructured overlay, thus resulting in heavy
network traffic and nondeterministic search results.
• Structured overlay networks follow certain connectivity topology and rules for inserting
and removing nodes (peer IDs) from the overlay graph.
• Routing mechanisms are developed to take advantage of the structured overlays.
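A minimal sketch of structured-overlay routing (a simplified, Chord-like successor lookup with hand-picked numeric IDs; real DHTs additionally hash keys and node addresses into the ID space and keep finger tables for logarithmic routing):

```python
# Simplified structured overlay: peers sit on an ID ring, and a key is
# owned by its "successor" -- the first peer ID >= the key, wrapping
# around the ring. Contrast with unstructured overlays, where a query
# must be flooded to many nodes.
def successor(peer_ids, key):
    ring = sorted(peer_ids)
    for pid in ring:
        if pid >= key:
            return pid
    return ring[0]   # wrap around the ring

peers = [10, 30, 50]
```

Because ownership follows a fixed rule, a lookup can be routed deterministically instead of flooded, which is exactly the advantage structured overlays have over unstructured ones.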
CLOUD COMPUTING OVER THE INTERNET
• Definition of Cloud Computing by IBM:
• A cloud is a pool of virtualized computer resources. A cloud can host a variety of
different workloads, including batch-style backend jobs and interactive and user-facing
applications
• That is, a cloud allows workloads to be deployed and scaled out quickly through rapid
provisioning of virtual or physical machines.
• The cloud supports redundant, self-recovering, highly scalable programming models that
allow workloads to recover from many unavoidable hardware/software failures.
• Finally, the cloud system should be able to monitor resource use in real time to enable
rebalancing of allocations when needed.
• Cloud computing applies a virtualized platform with elastic resources on demand by
provisioning hardware, software, and data sets dynamically.
• Cloud computing intends to satisfy many user applications simultaneously.
• The cloud ecosystem must be designed to be secure, trustworthy, and dependable.
INTERNET CLOUDS
SOFTWARE ENVIRONMENTS FOR DISTRIBUTED
SYSTEMS AND CLOUDS
• Service-Oriented Architecture (SOA) :
• An architectural approach in which applications make use of services available in the network.
• An application's business logic or individual functions are modularized and presented as
services for consumer/client applications.
• Loosely coupled nature - the service interface is independent of the implementation.
• Application developers or system integrators can build applications by composing one or
more services without knowing the services' underlying implementations.
• For example, a service can be implemented in either .NET or J2EE, and the application
consuming the service can run on a different platform and be written in a different language.
• There are two major roles within Service-oriented Architecture:
• Service provider: The service provider is the maintainer of the service and the
organization that makes available one or more services for others to use.
• To advertise services, the provider can publish them in a registry, together with a service
contract that specifies the nature of the service, how to use it, the requirements for the
service, and the fees charged.
• Service consumer: The service consumer can locate the service metadata in the registry
and develop the required client components to bind and use the service.
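The loose coupling described above can be sketched in a few lines (the service and class names are illustrative only): the consumer is written against the service contract, and either implementation can be plugged in without changing the consumer.

```python
# Sketch of SOA-style loose coupling: the consumer depends only on the
# service contract (the interface), never on a concrete implementation.
from abc import ABC, abstractmethod

class QuoteService(ABC):
    """The service contract published in the registry."""
    @abstractmethod
    def get_quote(self, symbol: str) -> float: ...

class DotNetQuoteService(QuoteService):
    # Stand-in for a service implemented on the .NET platform.
    def get_quote(self, symbol):
        return 101.5

class JavaQuoteService(QuoteService):
    # Stand-in for the same service implemented on J2EE.
    def get_quote(self, symbol):
        return 101.5

def consumer(service: QuoteService, symbol: str) -> float:
    # The consumer binds to the contract; it cannot tell (and does not
    # care) how the provider implemented the service.
    return service.get_quote(symbol)
```

Swapping `DotNetQuoteService` for `JavaQuoteService` requires no change to `consumer`, which is the independence of interface from implementation that SOA relies on.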
• Distributed Operating Systems:
• Tanenbaum identifies 3 approaches for distributing resource management functions in a
distributed computer system.
• The first approach is to build a network OS over a large number of heterogeneous OS
platforms. Such an OS offers the lowest transparency to users, and is essentially a
distributed file system, with independent computers relying on file sharing as a means of
communication.
• Network Operating System runs on a server and gives the server the capability to
manage data, users, groups, security, applications, and other networking functions.
• The basic purpose of the network operating system is to allow shared file and device
access among multiple computers in a network, typically a local area network (LAN), a
private network or to other networks.
• Ex: Microsoft Windows Server 2003, Microsoft Windows Server 2008, UNIX, Linux
• The second approach is to develop middleware to offer a limited degree of resource sharing, similar
to the MOSIX/OS developed for clustered systems.
• Middleware in the context of distributed applications is software that provides services beyond those
provided by the operating system to enable the various components of a distributed system to
communicate and manage data. Middleware supports and simplifies complex distributed applications
• The third approach is to develop a truly distributed OS to achieve higher use or system
transparency.
• A distributed operating system is a software layer over a collection of independent,
networked, communicating, and physically separate computational nodes.
• They handle jobs which are serviced by multiple CPUs.
• Each individual node holds a specific software subset of the global aggregate operating
system.
• Each subset is a composite of two distinct service provisioners.
• The first is a ubiquitous minimal kernel, or microkernel, that directly controls that node’s
hardware.
• Second is a higher-level collection of system management components that coordinate
the node's individual and collaborative activities.
• These components abstract microkernel functions and support user applications
• Parallel and Distributed Programming Models:
• Message-Passing Interface (MPI):
• Primary programming standard used to develop parallel and concurrent programs to run
on a distributed system.
• MPI is essentially a library of subprograms that can be called from C or FORTRAN to
write parallel programs running on a distributed system.
• It supports synchronous or asynchronous point-to-point and collective communication commands
and I/O operations in user programs for message-passing execution.
• MPI's goals are high performance, scalability, and portability.
• MPI has not been ratified by an official standards body, but it is the most widely used message-passing standard.
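MPI programs are normally written in C or Fortran (or via bindings such as mpi4py), so as a stand-in the point-to-point send/receive pattern is sketched below with Python's standard multiprocessing module. This is an analogy for the communication style, not MPI itself.

```python
# Analogy for MPI point-to-point messaging: two processes exchange a
# message over a pipe, much like MPI_Send / MPI_Recv between two ranks.
from multiprocessing import Process, Pipe

def worker(conn):
    msg = conn.recv()          # analogous to MPI_Recv from rank 0
    conn.send(msg.upper())     # analogous to MPI_Send back to rank 0
    conn.close()

def pingpong(msg):
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send(msg)       # "rank 0" sends
    reply = parent_end.recv()  # "rank 0" blocks until the reply arrives
    p.join()
    return reply
```

In real MPI the two sides are ranks of the same program launched with `mpirun`, and the library also provides collective operations (broadcast, reduce, etc.) over many ranks.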
• MapReduce:
• Web programming model for scalable data processing on large clusters over large data
sets.
• Applied mainly in web-scale search and cloud computing applications.
• The user specifies a Map function to generate a set of intermediate key/value pairs.
• Then applies a Reduce function to merge all intermediate values with the same
intermediate key.
• MapReduce is highly scalable to explore high degrees of parallelism at different job levels.
• A typical MapReduce computation process can handle terabytes of data on tens of
thousands or more client machines.
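The Map/Reduce flow just described can be sketched in plain Python (single-process; a real framework runs the map and reduce tasks in parallel across a cluster and handles the shuffle over the network):

```python
# Minimal MapReduce word count: map emits (word, 1) pairs, a shuffle
# step groups values by intermediate key, and reduce sums each group.
from collections import defaultdict

def map_fn(line):
    # Map: emit an intermediate (key, value) pair per word.
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    # Reduce: merge all values sharing the same intermediate key.
    return (word, sum(counts))

def mapreduce(lines):
    # Map phase
    intermediate = []
    for line in lines:
        intermediate.extend(map_fn(line))
    # Shuffle: group all values by intermediate key
    groups = defaultdict(list)
    for word, count in intermediate:
        groups[word].append(count)
    # Reduce phase
    return dict(reduce_fn(w, c) for w, c in groups.items())

result = mapreduce(["the cloud", "the grid the cluster"])
```

Because every map call and every reduce group is independent, the same program parallelizes naturally over thousands of machines, which is what makes the model web-scale.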
• Hadoop Library:
• The package enables users to write and run applications over vast amounts of distributed
data.
• Scalability: Users can easily scale Hadoop to store and process petabytes of data in the
web space.
• Economical: Comes with an open source version of MapReduce that minimizes
overhead in task spawning and massive data communication.
• Efficient: Processes data with a high degree of parallelism across a large number of
commodity nodes.
• Reliable: Automatically keeps multiple data copies to facilitate redeployment of
computing tasks upon unexpected system failures.
• Open Grid Services Architecture (OGSA)
• OGSA is a common standard for general public use of grid services.
• Genesis II is a realization of OGSA. Key features include a distributed
execution environment, Public Key Infrastructure (PKI) services using a local
certificate authority (CA), trust management, and security policies in grid
computing.
• Globus Toolkits and Extensions
• Globus is a middleware library
• This library implements some of the OGSA standards for resource discovery, allocation, and
security enforcement in a grid environment.
• The Globus packages support multisite mutual authentication with PKI certificates.
CLOUD COMPUTING AND SERVICE MODELS
• Public, Private, and Hybrid Clouds:
• Cloud computing has evolved from cluster, grid, and utility computing.
• Cluster and grid computing leverage the use of many computers in parallel to solve problems
of any size.
• Utility and Software as a Service (SaaS) provide computing resources as a service with the
notion of pay per use.
• Cloud computing is a high-throughput computing (HTC) paradigm whereby the infrastructure
provides the services through a large data center or server farms.
PUBLIC CLOUDS:
• A public cloud is built over the Internet and can be accessed by any user who has paid for the
service.
• Public clouds are owned by service providers and are accessible through a subscription.
• Google App Engine (GAE), Amazon Web Services (AWS), Microsoft Azure, IBM Blue Cloud etc.
• Commercial providers offer a publicly accessible remote interface for creating and managing VM
instances within their proprietary infrastructure.
• A public cloud delivers a selected set of business processes.
• The application and infrastructure services are offered on a flexible price-per-use basis.
ADVANTAGES OF PUBLIC CLOUDS
• Lower costs—no need to purchase hardware or software and you pay only for the service you use.
• No maintenance—your service provider provides the maintenance.
• Near-unlimited scalability—on-demand resources are available to meet your business needs.
• High reliability—a vast network of servers ensures against failure.
PRIVATE CLOUDS:
• A private cloud is built within the domain of an intranet owned by a single organization.
• Client owned and managed, and access is limited to the owning clients and their partners.
• NOT meant to sell capacity over the Internet through publicly accessible interfaces.
• Private clouds give local users a flexible and agile private infrastructure to run service workloads
within their administrative domains.
• A private cloud is supposed to deliver more efficient and convenient cloud services.
• It may hinder cloud standardization, but it retains greater customization and
organizational control.
ADVANTAGES OF A PRIVATE CLOUD
• More flexibility—your organisation can customise its cloud environment to meet
specific business needs.
• More control—resources are not shared with others, so higher levels of control and
privacy are possible.
• More scalability—private clouds often offer more scalability compared to on-premises
infrastructure.
HYBRID CLOUDS:
• A hybrid cloud is built with both public and private clouds.
• Private clouds can also support a hybrid cloud model by supplementing local
infrastructure with computing capacity from an external public cloud.
• The Research Compute Cloud (RC2) is a private cloud, built by IBM, that interconnects
the computing and IT resources at eight IBM Research Centers scattered throughout the
United States, Europe, and Asia.
• A hybrid cloud provides access to clients, the partner network, and third parties.
ADVANTAGES OF THE HYBRID CLOUD
• Control—your organisation can maintain a private infrastructure for sensitive assets or
workloads that require low latency.
• Flexibility—you can take advantage of additional resources in the public cloud when
you need them.
• Cost-effectiveness—with the ability to scale to the public cloud, you pay for extra
computing power only when needed.
• Ease—transitioning to the cloud does not have to be overwhelming because you can
migrate gradually—phasing in workloads over time.
SUMMARY
• Public clouds promote standardization, preserve capital investment, and offer application
flexibility.
• Private clouds attempt to achieve customization and offer higher efficiency, resiliency,
security, and privacy.
• Hybrid clouds operate in the middle, with many compromises in terms of resource
sharing.
CLOUD SERVICE MODELS:
• The services provided over the cloud can be generally categorized into three different
service models:
• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
• Software as a Service (SaaS)
• These services are available as subscription-based services in a pay-as-you-go model to
consumers.
• All three models allow users to access services over the Internet
INFRASTRUCTURE-AS-A-SERVICE (IAAS):
• This model allows users to use virtualized IT resources for computing, storage, and
networking.
• In short, the service is performed by rented cloud infrastructure.
• The user can deploy and run their applications in their chosen OS environment.
• The user does not manage or control the underlying cloud infrastructure, but has control
over the OS, storage, deployed applications, and possibly select networking components.
• This IaaS model encompasses:
• storage as a service, compute instances as a service, and communication as a service.
• Key features
• Instead of purchasing hardware outright, users pay for IaaS on demand.
• Infrastructure is scalable depending on processing and storage needs.
• Saves enterprises the costs of buying and maintaining their own hardware.
• Because data is replicated in the cloud, the risk of a single point of failure is greatly reduced.
• Enables the virtualization of administrative tasks, freeing up time for other work.
AMAZON VIRTUAL PRIVATE CLOUD (VPC)
PLATFORM AS-A-SERVICE (PAAS):
• This model provides users with a cloud environment in which they can develop, manage
and deliver applications
• Platform includes operating system and runtime library support
• An integrated computer system consisting of both hardware and software infrastructure.
• In addition to storage and other computing resources, users are able to use a suite of
prebuilt tools to develop, customize and test their own applications.
• The user application can be developed on this virtualized cloud platform using some
programming languages and software tools supported by the provider (e.g., Java, Python,
.NET).
• The user does not manage the underlying cloud infrastructure.
• Enables a collaborative software development platform for users in different parts of the
world
• Key Features:
• PaaS provides a platform with tools to test, develop and host applications in the same environment.
• Enables organizations to focus on development without having to worry about underlying
infrastructure.
• Providers manage security, operating systems, server software and backups.
• Facilitates collaborative work even if teams work remotely
SOFTWARE AS-A-SERVICE (SAAS):
• SaaS model provides software applications as a service
• Provides users with access to a vendor’s cloud-based software.
• Users do not install applications on their local devices.
• Instead, the applications reside on a remote cloud network accessed through the web or
an API.
• Through the application, users can store and analyze data and collaborate on projects.
• Example: Google Gmail and docs, Microsoft SharePoint, and the CRM software from
Salesforce.com
KEY FEATURES
• SaaS vendors provide users with software and applications via a subscription model.
• Users do not have to manage, install or upgrade software; SaaS providers manage this.
• Data is stored in the cloud, so local equipment failure does not result in loss of data.
• Use of resources can be scaled depending on service needs.
• Applications are accessible from almost any internet-connected device, from virtually
anywhere in the world.
• Figure 4.5 illustrates three cloud models at different service levels of the cloud.
• SaaS is applied at the application end using special interfaces by users or clients. At the
PaaS layer, the cloud platform must perform billing services and handle job queuing,
launching, and monitoring services.
• At the bottom layer of the IaaS services, databases, compute instances, the file system,
and storage must be provisioned to satisfy user demands.