Chapter-1: Cloud Computing Fundamentals
• Xen. The Xen hypervisor started as an open-source project and has served as a base for other
virtualization products, both commercial and open-source. It pioneered the para-
virtualization concept, in which the guest operating system, by means of a specialized
kernel, can interact with the hypervisor, thus significantly improving performance. In
addition to its open-source distribution, Xen currently forms the base of the commercial
hypervisors of a number of vendors, most notably Citrix XenServer and Oracle VM.
• KVM. The kernel-based virtual machine (KVM) is a Linux virtualization subsystem. It has
been part of the mainline Linux kernel since version 2.6.20, thus being natively supported by
several distributions. In addition, activities such as memory management and scheduling are
carried out by existing kernel features, thus making KVM simpler and smaller than hypervisors
that take control of the entire machine. KVM leverages hardware-assisted virtualization,
which improves performance and allows it to support unmodified guest operating
systems; currently, it supports several versions of Windows, Linux, and UNIX.
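KVM's reliance on hardware-assisted virtualization means the host CPU must expose Intel VT-x (the `vmx` flag) or AMD-V (the `svm` flag). A minimal sketch of that check, parsing `/proc/cpuinfo` with the standard library; the helper name is ours, not part of any KVM tooling:

```python
# Sketch: detect hardware virtualization support on Linux, as
# required by KVM (Intel VT-x -> "vmx" flag, AMD-V -> "svm" flag).
# The function name is illustrative, not part of any KVM API.

def has_virt_extensions(cpuinfo_text: str) -> bool:
    """Return True if any CPU line advertises the vmx or svm flag."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags or "svm" in flags:
                return True
    return False

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("KVM-capable CPU:", has_virt_extensions(f.read()))
    except FileNotFoundError:
        print("Not a Linux system; /proc/cpuinfo unavailable")
```

In practice the presence of the `/dev/kvm` device node is the definitive sign that the KVM kernel modules are loaded and usable.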
Enabling Technologies
Web service and service-oriented architecture, service flows and workflows, and
Web 2.0 and mash-up.
Web Service and Service Oriented Architecture:
• Web services (WS) open standards have significantly contributed to advances in the domain of
software integration. Web services can (1) glue together applications running on different
messaging product platforms, (2) enable information from one application to be made
available to others, and (3) enable internal applications to be made available over the
Internet.
• A rich WS software stack has been specified and standardized, resulting in a multitude of
technologies to describe, compose, and orchestrate services, package and transport
messages between services, publish and discover services, represent quality of service
(QoS) parameters, and ensure security in service access.
• WS standards have been created on top of existing ubiquitous technologies such as HTTP and
XML, thus providing a common mechanism for delivering services. The purpose of a SOA is to
address requirements of loosely coupled, standards-based, and protocol-independent
distributed computing.
• In a SOA, software resources are packaged as “services” that provide standard business
functionality and are independent of the state or context of other services. Services are described in
a standard definition language (WSDL) and are published and discovered through a registry (UDDI).
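To make the messaging layer concrete, the sketch below builds a minimal SOAP 1.1 envelope with Python's standard library. The `GetQuote` operation and the `example.com` namespace are invented for illustration; a real service's operations and types would come from its WSDL:

```python
# Sketch: build a minimal SOAP 1.1 envelope with the standard library.
# The "GetQuote" operation and example namespace are hypothetical.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def build_soap_request(operation: str, params: dict, service_ns: str) -> bytes:
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, xml_declaration=True, encoding="utf-8")

msg = build_soap_request("GetQuote", {"symbol": "ACME"},
                         "http://example.com/stock")
print(msg.decode())
```

An envelope like this would normally be POSTed over HTTP to the service endpoint; REST-style services skip the envelope and encode the operation in the URL and HTTP verb instead.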
• With the advent of Web 2.0, information and services may be programmatically aggregated, acting as
building blocks of complex compositions called service mashups (Web service composition);
i.e., an enterprise application that follows the SOA paradigm is a collection of services that together
perform complex business logic.
• Many service providers, such as Amazon, del.icio.us, Facebook, and Google, make
their service APIs publicly accessible using standard protocols such as SOAP and
REST [14]. Consequently, one can turn an idea into a fully functional Web application
just by gluing pieces together with a few lines of code.
• For example, ProgrammableWeb is a public repository of service APIs and mashups
currently listing thousands of APIs and mashups. Popular APIs such as Google Maps,
Flickr, YouTube, Amazon e-Commerce, and Twitter, when combined, produce a variety
of interesting solutions, from finding video game retailers to weather maps. Similarly,
Salesforce.com offers AppExchange, which enables the sharing of solutions
developed by third-party developers on top of Salesforce.com components.
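A mashup of this kind is essentially glue code chaining service calls. The sketch below stubs out two hypothetical services (a geocoder and a weather API) to show the composition pattern; the function names and data are invented:

```python
# Sketch of a mashup: compose results from two services into one view.
# The fetch functions are stubs standing in for real REST calls
# (e.g. a geocoding API and a weather API); names and data are invented.

def geocode(city: str) -> dict:
    # Stub: a real mashup would issue an HTTP GET to a geocoding API.
    return {"Berlin": {"lat": 52.52, "lon": 13.41}}[city]

def weather(lat: float, lon: float) -> dict:
    # Stub: a real mashup would call a weather API with the coordinates.
    return {"temp_c": 18, "condition": "cloudy"}

def city_weather(city: str) -> dict:
    """Glue layer: chain the two services, as a mashup editor would."""
    loc = geocode(city)
    return {"city": city, **weather(loc["lat"], loc["lon"])}

print(city_weather("Berlin"))
```

The value of the mashup lies entirely in the glue layer: each underlying API remains independent, which is precisely the loose coupling that SOA prescribes.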
• A key aspect of the grid vision realization has been building standard Web services-
based protocols that allow distributed resources to be “discovered, accessed,
allocated, monitored, accounted for, and billed for, etc., and in general managed as
a single virtual system.”
• The Open Grid Services Architecture (OGSA) addresses this need for standardization
by defining a set of core capabilities and behaviors that address key concerns in grid
systems.
• Globus Toolkit is a middleware that implements several standard Grid services and over
the years has aided the deployment of several service-oriented Grid infrastructures and
applications. An ecosystem of tools is available to interact with service grids, including
grid brokers, which facilitate user interaction with multiple middleware and implement
policies to meet QoS needs. The development of standardized protocols for several grid
computing activities has contributed, in theory, to allowing the delivery of on-demand
computing services over the Internet.
Enabling Technologies
Grid Computing
However, ensuring QoS in grids has been perceived as a difficult endeavor.
• Lack of performance isolation has prevented grid adoption in a variety of scenarios, especially in
environments where resources are oversubscribed or users are uncooperative. Activities
associated with one user or virtual organization (VO) can influence, in an uncontrollable way,
the performance perceived by other users of the same platform. Therefore, the
impossibility of enforcing QoS and guaranteeing execution time became a problem, especially
for time-critical applications.
• Another issue that has led to frustration when using grids is the availability of resources with
diverse software configurations, including disparate operating systems, libraries, compilers,
runtime environments, and so forth. At the same time, user applications would often run only in
specially customized environments.
• Consequently, a portability barrier has often been present on most grid infrastructures, inhibiting
users from adopting grids as utility computing environments.
• Virtualization technology is a natural fit for the issues that have caused frustration when using
grids, such as hosting many dissimilar software applications on a single physical platform. In
this direction, some research projects (e.g., Globus Virtual Workspaces) aimed at evolving grids to
support an additional layer to virtualize computation, storage, and network resources.
Enabling Technologies
Utility Computing
• With increasing popularity and usage, large grid installations have faced new
problems, such as excessive demands for resources coupled with strategic and
adversarial behavior by users.
• Initially, grid resource management techniques did not ensure fair and equitable
access to resources in many systems. Traditional metrics (throughput, waiting
time, and slowdown) failed to capture the more subtle requirements of users.
There were no real incentives for users to be flexible about resource
requirements or job deadlines, nor provisions to accommodate users with
urgent work.
• The reference model of Buyya et al. explains the role of each layer in an
integrated architecture. A core middleware manages physical resources
and the VMs deployed on top of them; in addition, it provides the required
features (e.g., accounting and billing) to offer multi-tenant pay-as-you-go
services.
• Cloud development environments are built on top of infrastructure services
to offer application development and deployment capabilities. At this level,
various programming models, libraries, APIs, and mashup editors enable the
creation of a range of business, Web, and scientific applications. Once
deployed in the cloud, these applications can be consumed by end users.
Layers and Types of Cloud Computing
Infrastructure as a Service
• Offering virtualized resources (computation, storage, and communication) on
demand is known as Infrastructure as a Service (IaaS). Infrastructure
services are considered to be the bottom layer of cloud computing systems.
• Amazon Web Services offers IaaS, which in the case of the EC2 service means
offering VMs with a software stack that can be customized in the same way an
ordinary physical server would be. Users are given privileges to
perform numerous activities on the server, such as starting and stopping it,
customizing it by installing software packages, attaching virtual disks to it, and
configuring access permissions and firewall rules.
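The lifecycle operations listed above can be pictured with a toy model. This is purely illustrative Python, not the EC2 API; real automation would go through an SDK or the EC2 web service interface:

```python
# Toy model of the IaaS server-lifecycle operations described above
# (start/stop, attach virtual disks). Illustration only, not the EC2
# API; real automation would use an SDK such as boto3.

class VirtualServer:
    def __init__(self, image: str):
        self.image = image          # machine image the VM boots from
        self.state = "stopped"
        self.disks: list[str] = []  # attached virtual disk volumes

    def start(self):
        self.state = "running"

    def stop(self):
        self.state = "stopped"

    def attach_disk(self, volume_id: str):
        # Mirrors attaching a virtual disk (an EBS-like volume).
        self.disks.append(volume_id)

vm = VirtualServer(image="linux-base")
vm.start()
vm.attach_disk("vol-001")
print(vm.state, vm.disks)
```

The key point is that every one of these operations is exposed remotely as a web service call, which is what turns a data center full of hypervisors into programmable infrastructure.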
Layers of Cloud Computing
Platform as a Service
• A cloud platform offers a higher level of abstraction to make the cloud easily programmable;
this is known as Platform as a Service (PaaS). It provides an Integrated Development
Environment (IDE) including data security, backup and recovery, application hosting, and
scalable architecture.
• Cloud computing has three deployment types: (a) public cloud, (b) private cloud, and
(c) hybrid cloud.
• Amazon Elastic Compute Cloud (EC2) services can be leveraged via Web
services (SOAP or REST), a Web-based AWS (Amazon Web Service)
management console, or the EC2 command line tools.
• The Amazon service provides hundreds of pre-made AMIs (Amazon
Machine Images) with a variety of operating systems (e.g., Linux, OpenSolaris,
or Windows) and pre-loaded software.
• It provides you with complete control of your computing resources and lets
you run on Amazon’s computing and infrastructure environment easily.
• It also reduces the time required to obtain and boot new server instances to
minutes, allowing capacity to be scaled up and down quickly as computing
requirements change.
• Amazon offers different instance sizes according to (a) resource needs
(small, large, and extra large), (b) high-CPU needs (medium and extra-large
high-CPU instances), and (c) high-memory needs (extra large, double extra
large, and quadruple extra large instances).
Private cloud and Infrastructure Services
A private cloud aims at providing public cloud functionality on private resources, while
maintaining control over the organization’s data and resources to meet its security and
governance requirements.
A private cloud is a highly virtualized cloud data center located inside the organization’s
firewall. It may also be a private space dedicated to the company within a cloud vendor’s data
center, designed to handle the organization’s workloads; in this case it is called a Virtual
Private Cloud (VPC). Private clouds exhibit the following characteristics:
1) Allow service provisioning and compute capability for an organization’s users in a self-
service manner.
2) Automate and provide well-managed virtualized environments.
3) Optimize computing resources and server utilization.
4) Support specific workloads.
There are many examples for vendors and frameworks that provide infrastructure as a service in
private setups. The best-known examples are Eucalyptus and OpenNebula (which will be
covered in more detail later on).
It is also important to highlight a third type of cloud setup named “hybrid cloud,” in which a
combination of private/internal and external cloud resources exists, enabling the outsourcing
of noncritical services and functions to a public cloud while keeping the critical ones internal.
A hybrid cloud’s main function is to draw on resources from a public cloud to handle sudden
spikes in demand, an approach called “cloud bursting.”
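The bursting decision can be sketched as a simple placement policy: satisfy demand from private capacity first and overflow the remainder to the public cloud. The policy below is illustrative, not a standard algorithm:

```python
# Sketch of a cloud-bursting decision: serve load from private
# capacity and overflow the remainder to a public cloud.
# The threshold policy is illustrative, not a standard algorithm.

def place_load(demand: int, private_capacity: int) -> dict:
    """Split demand (in abstract load units) between private
    capacity and public (burst) capacity."""
    private = min(demand, private_capacity)
    burst = max(0, demand - private_capacity)
    return {"private": private, "public_burst": burst}

print(place_load(80, 100))   # all handled privately
print(place_load(150, 100))  # 50 units burst to the public cloud
```

Real policies add cost and data-sensitivity constraints: noncritical, stateless workloads are the usual candidates for bursting, matching the critical/noncritical split described above.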
Types of Cloud Computing
Deployment Models
Armbrust et al. propose definitions for public cloud as a “cloud made available in a
pay-as-you-go manner to the general public” and private cloud as “internal data
center of a business or other organization, not made available to the general
public.”
• A key challenge in building a cloud infrastructure is managing physical and virtual resources,
namely servers, storage, and networks, in a holistic fashion.
• The orchestration of resources must be performed in a way to rapidly and dynamically provision
resources to applications.
• The software toolkit responsible for this orchestration is called a virtual infrastructure manager
(VIM). This type of software resembles a traditional operating system, but instead of dealing with a
single computer, it aggregates resources from multiple computers, presenting a uniform view to
users and applications. The terms “cloud operating system,” “infrastructure sharing software,” and
“virtual infrastructure engine” are also used to describe this type of toolkit. Sotomayor et al. present
two categories of tools to manage clouds:
• The first category—cloud toolkits—includes those that “expose a remote and secure
interface for creating, controlling and monitoring virtualized resources,” but do not specialize in VI
management.
• Tools in the second category—the virtual infrastructure managers—provide advanced
features such as automatic load balancing and server consolidation, but do not expose remote
cloud-like interfaces.
Cloud Management
Cloud Infrastructure Management – Features and Case studies
• Storage is typically measured as average daily amount of data stored in GB over a monthly
period.
• Bandwidth is measured by calculating the total amount of data transferred in and out of platform
service through transaction and batch processing. Generally, data transfer between services
within the same platform is free in many platforms.
• Compute is measured as the time units needed to run an instance, application, or machine to
service requests. Table 6 compares pricing for three major cloud computing platforms.
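Combining the three metrics above, a monthly bill can be computed as a sum of metered usage times unit rates. The rates below are made up for illustration; actual platform pricing varies:

```python
# Sketch of usage-based billing from the metrics described above:
# storage as average daily GB over the month, bandwidth as total
# GB transferred, compute as instance-hours. Rates are invented
# ($/GB-month, $/GB, $/instance-hour), not any provider's pricing.

def monthly_bill(daily_storage_gb, transfer_gb, compute_hours,
                 storage_rate=0.10, transfer_rate=0.09, compute_rate=0.08):
    """Return the total charge for one month, rounded to cents."""
    avg_storage = sum(daily_storage_gb) / len(daily_storage_gb)
    return round(avg_storage * storage_rate
                 + transfer_gb * transfer_rate
                 + compute_hours * compute_rate, 2)

# 30 days at 100 GB stored, 50 GB transferred out, 200 instance-hours
print(monthly_bill([100] * 30, 50, 200))  # 10.0 + 4.5 + 16.0 = 30.5
```

Note how the storage term averages daily snapshots rather than billing peak usage, matching the "average daily amount of data stored over a monthly period" definition above.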