Unit-III
Virtualization: Virtual machines,
Containers
Presented by:
Rajeshwari Patil
Assistant Professor
NMIT
Approaches to Virtualization
Virtualization is a technology that creates multiple virtual machines on a single set of
physical resources.
The concept of virtual machines existed long before cloud computing was
invented.
The technologies used to implement virtual machines can be divided into three
broad categories:
o Software emulation
o Para-virtualization
o Full virtualization
Software emulation
o Emulates a different type of computer using software.
o Useful for running programs compiled for one computer type on another.
o Steps: Emulator reads instructions, mimics target computer behavior, and executes.
o Commonly used to quickly adopt new programming languages (e.g., byte code).
o Provides flexibility and portability but introduces significant performance overhead.
o Example: Java byte code and interpreter.
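A minimal sketch of the Java example, assuming a standard JDK is installed and a source file Hello.java defining a class Hello exists: the compiler produces byte code for an imaginary machine, and the java interpreter emulates that machine in software.
# Compile the source into byte code for the imaginary byte-code machine
javac Hello.java          # produces Hello.class
# Run the program by emulating the byte-code machine in software
java Hello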
Para-virtualization:
o Runs multiple operating systems simultaneously on one computer.
o Uses a hypervisor to manage OS access to hardware resources.
o Allows native execution (directly by the processor), reducing overhead compared to
emulation.
o Requires modifying the OS to replace privileged instructions with hypervisor
calls.
o Offers faster performance but demands OS modifications.
Full Virtualization:
o Similar to para-virtualization but no OS modifications are needed.
o Allows multiple operating systems to run unaltered on the same hardware.
o Avoids the overhead of software emulation while maintaining performance.
o Provides the benefits of virtualization without requiring OS-level changes.
Difference between Para-virtualization and Full
Virtualization
o Isolation: In full virtualization, virtual machines permit execution of instructions by running an
unmodified OS in an entirely isolated way. In para-virtualization, a virtual machine does not
implement full isolation of the OS but rather provides a different API, which is used once the OS
is modified.
o Security: Full virtualization is less secure; para-virtualization is more secure than full
virtualization.
o Technique: Full virtualization uses binary translation and direct execution for its operations;
para-virtualization uses hypercalls inserted at compile time.
o Portability: Full virtualization is more portable and compatible; para-virtualization is less
portable and compatible.
o Speed: Full virtualization is slower in operation than para-virtualization; para-virtualization is
faster.
o Examples: Full virtualization includes Microsoft and Parallels systems; para-virtualization
includes Microsoft Hyper-V, Citrix Xen, etc.
Properties of Full Virtualization
The full virtualization technologies currently used to support Virtual Machines (VMs) in
cloud data centers have three key properties:
o Emulation of commercial instruction sets:
VMs mimic a standard commercial computer, including the full instruction set.
Code compiled for physical machines can run on VMs without modification.
VMs can boot and run commercial operating systems like Windows or Linux.
o Isolated facilities and operation:
VMs on the same physical server are fully isolated from each other.
Each VM perceives that it has control over physical memory, I/O devices, and the
processor.
This isolation ensures secure and independent operation for each VM.
o Efficient, low-overhead execution:
VMs offer near-native performance with minimal overhead.
Most application instructions are executed natively at full hardware speed.
Performance is surprisingly fast due to efficient virtualization techniques.
Conceptual Organization of VM Systems
VM Creation and Management:
o Software is installed on a server to allow the creation of one or more Virtual Machines (VMs).
o Each VM can be owned by a tenant who boots an operating system and runs applications
within the VM.
Hypervisor:
o The hypervisor is the key software responsible for managing and creating VMs.
o It controls the underlying hardware and ensures resource allocation.
o The hypervisor can be one of two types:
o Type 1 hypervisor: installed directly on the computer's hardware rather than on top of an
operating system.
o Type 2 hypervisor: installed on top of an operating system.
Independence of VMs:
o Each VM operates independently, isolated from other VMs on the same server.
o VMs can run separate operating systems and applications concurrently without
interference.
Conceptual Organization:
o The server runs the hypervisor, which manages multiple VMs.
o Each VM runs its own operating system and applications as if it were a
standalone machine.
Efficient Execution and Processor Privilege Levels
Software Layers Concern:
o On a VM, it appears that two software layers (the operating system and the hypervisor)
separate the app from the hardware.
o Normally, software is slower than hardware, raising questions about performance.
Direct Execution by the Processor:
o Just like in a conventional computer, an operating system loads app code into
memory and directs the processor to execute it.
o The processor executes the app code directly, at full hardware speed, without
needing to "go through" the operating system.
Privilege Levels in Processors:
o Kernel Mode: Used by the operating system, allowing execution of all instructions,
including privileged ones.
o User Mode: Used by applications, restricting them to basic instructions and
preventing unauthorized memory access.
System Calls & Mode Switching:
o When an app requests an operating system service (e.g., reading a file), the
processor switches from user mode to kernel mode to handle the system call.
o This ensures that only safe instructions are executed in user mode.
Memory Protection:
o In user mode, an app can only access its own allocated memory and perform
basic operations.
o If an app tries to access OS memory or illegal areas, the processor raises an
exception and hands control back to the operating system.
Security & Exceptions:
o The system's privilege levels protect against malicious activity, such as
unauthorized memory access or instruction execution.
o If a privileged instruction or illegal access occurs, the OS handles it through an
exception mechanism.
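On Linux, these mode switches can be observed with the strace utility, which lists the system calls an app makes; a small sketch (the exact calls shown vary by program and system):
# Each listed system call is a controlled switch from user mode to kernel mode and back
strace -e trace=openat,read,write cat /etc/hostname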
Extending Privilege to Hypervisor
A server running a hypervisor and VMs uses a similar approach to how an operating
system runs apps, but with three privilege levels:
o Hypervisor privilege level
o Operating system privilege level
o App privilege level
Only the hypervisor has the authority to create VMs and allocate memory to them.
The operating system is limited to the memory allocated to its VM and can only run
apps within that space.
The processor always executes code directly from memory, maintaining hardware
execution speed.
Levels of Trust
The three processor modes represent different levels of trust:
o Hypervisor mode: Complete trust, allowing any hardware operation.
o Kernel mode: Limited trust; the OS is restricted to prevent it from affecting other VMs
or the hypervisor.
o User mode: Least trust; apps are restricted from affecting other apps or the OS.
The trust hierarchy ensures that each entity manages only those below it
o Hypervisor: Trusted to configure hardware and ensure VM isolation.
o Operating system: Trusted to configure hardware and protect apps from each other.
If an entity exceeds its trust level, control passes up to the entity at the next higher trust
level.
Levels of Trust and I/O Devices
Managing I/O devices becomes problematic with an extra level of privilege in
virtualized environments.
On a conventional computer:
o The OS communicates with I/O devices (e.g., screen, keyboard, disk, network
interface) via a hardware bus.
o It sends requests across the bus to detect and list available I/O devices.
o Device driver software is used by the OS to control the hardware and handle
communication with the devices.
The dilemma with multiple virtual machines (VMs):
o Each VM runs its own operating system, which attempts to take control of
I/O devices via the bus.
o A hypervisor cannot allow a single OS/VM to gain exclusive control of I/O
devices because these devices must be shared across all VMs.
o This is especially important for network I/O, as all VMs need network
access, and no VM can monopolize it.
Virtual I/O Devices
VM technology handles I/O using virtual I/O devices.
When creating a VM, the hypervisor generates a set of virtual I/O devices for the
VM to use.
Virtual I/O devices are implemented by software, simulating physical devices.
When the operating system in a VM tries to access a physical I/O device via the
bus:
o This triggers a privilege violation, invoking the hypervisor.
o The hypervisor runs the relevant virtual device software.
The hypervisor mimics the response as if it came from a physical device over the
bus, making it appear normal to the operating system.
Virtual Device Details
There are two main approaches for creating virtual devices in VM technology:
o Invent a new, imaginary device
o Emulate an existing hardware device
1. Invent a New, Imaginary Device:
o Flexibility: Since virtual devices are software-based, they can have any
properties the programmer designs.
o Standard Bus Operations: Communication between the OS and the virtual
device uses standard bus protocols.
o Simplified Design: Virtual devices can have cleaner, simpler interfaces than
physical hardware, making device driver development easier.
o Improved Efficiency: Virtual devices can be optimized for performance, such as
creating an imaginary disk with larger data blocks.
2. Emulate an Existing Hardware Device:
o Driver Compatibility: Emulating existing hardware allows the use of existing
device drivers, avoiding the need to write new drivers for each operating system.
o Tedious Driver Development: Developing drivers for different OSes is complex, so
emulation simplifies this process by using existing, proven drivers.
o Challenge of Exact Behavior: The virtual device must precisely replicate the
behavior of the real hardware, responding exactly as the actual device would.
Both approaches have been used in virtual machine systems, with each having
distinct advantages and challenges.
An Example Virtual Device
Virtual disk is a type of virtual device used in data centers where storage is separate from servers, and disk I/O
occurs over the network.
The virtual disk software:
o Provides a standard disk interface to the operating system running in a virtual machine (VM).
o Communicates over the data center network to handle disk I/O.
When communicating with the OS, the virtual disk acts like a hardware device, using standard bus protocols to
receive read/write requests.
For each request:
o The virtual disk either sends data to be stored or requests a copy of data from the storage facility in the data
center.
o It specifies which VM made the request to ensure proper storage management for that VM.
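Inside a typical Linux guest on a KVM-based server, such a virtual disk simply appears as an ordinary block device; a sketch (the name /dev/vda is typical for a virtio disk but varies by platform):
# List block devices as seen by the guest OS; the virtio virtual disk usually
# appears as /dev/vda even though the data actually lives on remote storage
# reached over the data center network
lsblk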
A VM as a Digital Object
A VM is created and managed entirely by software, unlike a physical server.
The hypervisor keeps track of:
o The VM's record, including memory regions allocated to the VM.
o Virtual I/O devices created for the VM (e.g., allocated disk space).
o The current status of the VM (running, suspended, etc.).
A complete VM record allows a VM to be turned into a digital object (a set of bytes).
When a VM is suspended, the hypervisor can:
o Collect the VM’s memory segments into a special file, including code and data for the operating
system and running apps.
o Collect the virtual devices, since they consist of software as well.
This allows the entire VM (memory, OS, apps, and virtual devices) to be stored as a digital snapshot.
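With a libvirt-managed hypervisor (e.g., KVM), this idea can be exercised directly; a hedged sketch assuming a guest named guest1 managed with virsh:
# Suspend the VM and write its memory and device state to a file
virsh save guest1 /var/tmp/guest1.state
# Later (e.g., after maintenance), reload the VM from the saved file
virsh restore /var/tmp/guest1.state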
VM Migration
Treating a VM as a digital object enables significant data center operations, such as stopping
VMs, saving them, shutting down a server, replacing hardware, and reloading the VMs to
continue operation.
The key capability enabled by this is VM migration:
o A VM can be moved from one server to another by converting it into a digital object,
transferring it over the network, and resuming it on a new server.
o The hypervisors on both servers handle the migration process.
VM migration enables load balancing:
o Example: If server X has compute-intensive VMs and server Y has I/O-heavy VMs,
migration allows balancing workloads between the two servers to avoid bottlenecks.
Other uses of VM migration:
o Reduce power consumption: Migrate VMs away from servers during light loads,
then power down empty servers.
o Optimize network traffic and latency: Migrate VMs owned by a customer to the
same pod or adjacent pods, reducing cross-data-center traffic and latency.
Migration is essential for dynamic rebalancing and efficiency optimization in cloud
data centers.
Live Migration Using Three Phases
Live migration allows a VM to continue running while being moved across servers, minimizing downtime.
Challenge: Stopping a VM during migration can disrupt active network communication (e.g., database
access or file downloads).
Three-phase live migration process:
o Phase 1: Pre-copy: The entire VM memory is copied to the new server while the VM continues
running. A record is kept of any memory pages that are changed after being copied.
o Phase 2: Stop-and-copy: The VM is temporarily suspended, and only the changed (dirty) memory
pages are copied again to the new server.
o Phase 3: Post-copy: Remaining state information (e.g., register contents) is sent to the new server,
allowing the VM to resume execution.
Outcome: The VM is suspended briefly, but because most of the memory is pre-copied, the downtime is
minimal.
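With libvirt, live migration is exposed as a single command, and the hypervisors on the two servers carry out the phases internally; a sketch assuming KVM hosts with SSH access between them and a guest named guest1:
# Live-migrate guest1 from the current host to host2
virsh migrate --live guest1 qemu+ssh://host2/system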
Running a VM in an Application
Hosted hypervisor technology allows a hypervisor to run on a conventional
operating system (e.g., a user's laptop) rather than directly on server hardware.
Host operating system controls the hardware, while the hosted hypervisor runs as
an application in user space alongside other apps.
Each VM runs a guest operating system, which can differ from the host OS and
other VMs. For example, a Mac laptop can run Linux and Windows VMs
simultaneously.
How is it possible?
o The hosted hypervisor runs as an application in the user space of the host OS,
meaning it doesn't directly control the hardware but works within the host
system.
What benefit does it offer a user?
o A hosted hypervisor allows users to run different operating systems on the same
machine without needing to reboot or set up new hardware. This is useful for
tasks like testing software in different environments or learning different OS
platforms.
Is the technology useful in a cloud data center?
o While hosted hypervisors are commonly used on personal devices, they can also
have applications in cloud data centers. For example, they can be used to run
guest operating systems that manage containers, which are lightweight
environments for running applications.
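As an illustration, a hosted hypervisor such as VirtualBox is driven from the host OS like any other application; a sketch assuming VirtualBox is installed and a VM named ubuntu-guest has already been created:
# List the VMs registered with the hosted hypervisor
VBoxManage list vms
# Start the guest OS in a window alongside the host's other apps
VBoxManage startvm ubuntu-guest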
Facilities that make a hosted Hypervisor possible
Modern processors and virtualization technology make it possible for a hosted
hypervisor to run VMs smoothly by creating the illusion that each guest OS has full
control, even though it's being managed by the host OS.
When using a hosted hypervisor to run virtual machines (VMs) on a regular
computer, like a laptop, two important systems help everything work smoothly
without needing full access to the hardware.
Two facilities allow a hosted hypervisor and a guest OS to operate correctly
without high privilege:
o A processor mechanism that supports virtualization
o A way to map guest OS operations onto the host operating system’s services.
Processor Support for Virtualization
o Modern processors have specific features that allow virtualization, which means
they enable multiple operating systems to run independently on the same physical
hardware without having full access.
o The processor does this by creating a "virtual environment" for each guest OS, so
each one behaves as if it has complete control over the hardware.
o However, the guest OS is actually operating with limited privileges.
o For example:
Memory Access: The guest OS only has access to a designated portion of the
physical memory, but it is unaware of this limitation.
Hardware Access: The processor restricts the guest OS from directly
interacting with the hardware. Instead, these interactions are managed by the
hypervisor.
Mapping Guest OS Operations to Host OS Services
o The hosted hypervisor is the software that manages the interaction between the
guest OS and the host OS. The hypervisor is like a middleman, connecting
requests from the guest OS to the host OS's services.
o For example:
Network Access: When the guest OS tries to communicate over the network,
it sends the request to what it thinks is a network interface. In reality, the
hypervisor intercepts this request and sends it to the host OS, which then
processes it using the actual network hardware. The hypervisor may even give
each guest OS a unique network address, creating the illusion that each
guest has its own network card.
Shared Files and File Systems
o One powerful feature of virtualization is the ability to share files or directories
between the guest and the host OS. This allows, for example, an application
running on the guest OS to save a file that the host OS can access instantly,
without needing to send it over a network or save it on an external drive.
Multiboot vs. Hosted Hypervisor
Multiboot Setup:
o In a multiboot setup, you can install multiple operating systems on a single
computer, such as Windows, Linux, and Mac OS. However, you can only run one
of these operating systems at a time.
o When you want to switch to a different OS, you have to restart (or "reboot") the
computer and choose the other operating system from a boot menu. This
approach works, but switching between systems is slow and disruptive.
Hosted Hypervisor: A hosted hypervisor lets you run multiple operating systems
simultaneously on the same machine. Instead of rebooting, you simply open each
OS in a virtual machine (VM), which appears as a window or virtual desktop. This
allows you to quickly switch between systems without restarting the computer.
The primary advantage of a hosted hypervisor over multiboot mechanisms is that it
allows multiple operating systems to run simultaneously on the same machine. This
means a user can switch between operating systems without rebooting, enabling
seamless multitasking and efficient workflow across different OS environments.
Containers
Advantages and disadvantages of VMs
Advantages of Virtual Machines (VMs)
o Support for Multiple OS Types: VM technology allows running multiple,
different operating systems on one server.
o Hardware-Level Emulation: VMs emulate actual processor hardware closely,
enabling conventional OSs to run unchanged.
o User Flexibility: Users leasing a VM can choose any software, including the
OS, without modifications.
o Performance: VMs can execute code at near hardware speed, supporting
efficient app and OS execution.
Disadvantages of Virtual Machines (VMs)
o VM Creation Time: Booting an OS within a VM is time-consuming.
o Computational Overhead: Running multiple VMs on a server adds
computational load.
o Scheduling Overhead: Each VM’s OS runs its own scheduler, increasing
processor switching load.
o Background Process Overhead: Each VM OS has its own background
processes, increasing resource usage.
o Scalability Impact: Adding more VMs proportionally increases server
overhead and reduces efficiency.
Traditional Apps and Elasticity on Demand
Limitations of VM Technology for Short-Term Use
o Long Boot Time: Booting an OS for short-term app use is slow and inefficient.
o Unnecessary OS Resources: Running a full OS is excessive when only a single app is
needed.
o Reduced Rapid Elasticity: VM technology is less suited for quickly scaling app copies up
or down.
Alternative Approach: Process-Based Virtualization
o Concurrent Process Support: OSs support concurrent processes; a process can be created
and terminated much faster than an OS can boot.
o Efficient Scaling: Cloud providers could use process-level control to quickly start or
stop app copies on demand.
Issues with OS-Based App Isolation
Incomplete Isolation: Traditional OSs don’t fully isolate multiple tenant apps:
o Shared Network Access: Apps use the same network address, reducing privacy.
o Shared File Systems: All apps typically share one file system, risking data
overlap.
o Process Visibility: Apps may access details about other tenant processes,
reducing security.
Isolation Facilities in an Operating System
To ensure the security and privacy of users, operating systems have developed ways to keep
each user's data and activities separate:
o Memory Isolation: Most operating systems use virtual memory to create an independent
memory space for each application (process). This means an app can only access its
designated memory and can't see or change other apps’ data.
o User IDs and Ownership: Every process and file is assigned a User ID (UID) that defines
ownership, allowing only the owner or authorized users to access or modify the data,
preventing interference between users.
o Limitations in Cloud Scale: Traditional UID systems don’t scale well for cloud
environments with numerous users, prompting developers to explore new methods for
effective app isolation in the cloud.
Linux Namespaces used for Isolation
Namespaces are mechanisms in Linux that provide isolation for various aspects of
applications, allowing multiple applications to run independently on the same host.
List of seven major namespaces used with containers (a shell sketch follows the list):
PID Namespace (Isolates process IDs): Each PID namespace has its own set of
process IDs, so processes in one namespace cannot see the processes in another.
Mount Namespace (Isolates filesystem mount points): Processes in different mount
namespaces can have different views of the filesystem.
Network Namespace (Isolates network resources): Each network namespace has its
own network stack (interfaces, IP addresses, routing tables, etc.).
IPC Namespace (Isolates inter-process communication): IPC namespaces isolate
mechanisms like shared memory and semaphores, so that processes in different IPC
namespaces cannot communicate with each other directly.
UTS Namespace (Isolates hostname and domain name): With UTS namespaces, each
namespace can have its own hostname and domain name.
User Namespace (Isolates user and group IDs): This namespace allows a process to have
different user and group IDs in different namespaces, providing a way for a process to run as
a "root" user within a container without being a root user on the host system.
Cgroup Namespace (Isolates the view of control groups): Each cgroup namespace sees its
own cgroup root, hiding the host's resource-control hierarchy from processes inside a
container.
Benefits of Namespace Isolation:
o Improved security: Prevents unauthorized access between applications.
o Resource management: Allows for better control over resource allocation.
o Flexibility: Supports running multiple versions of applications or services without conflict.
Adoption in Cloud Computing: These namespace mechanisms have been widely adopted in
cloud computing, allowing for efficient multi-tenancy and enhanced isolation for applications
running in containers.
The Containers approach for Isolated Apps
Apps use operating system mechanisms to maintain isolation from each other. Each
app runs in its own environment, metaphorically surrounded by "walls" that protect it
from other apps.
The term container refers to an environment that encapsulates and protects an app
while it is running. Containers ensure that each application operates in isolation.
A server can run multiple containers simultaneously, each containing a separate app.
Conventional apps can run outside of containers on the same server without interfering
with containerized apps.
Interaction with Conventional Apps:
o While conventional, unprivileged apps cannot interfere with containerized apps,
they may still access some information about processes within containers.
o To mitigate potential information leaks, production systems typically limit the
conventional apps on a server to the control software used to manage containers.
Advantages of Containers:
o Lightweight: Containers are more lightweight than VMs because they share the host OS
kernel, which reduces overhead.
o Faster Start-Up: Containers can be started and stopped quickly, allowing for rapid scaling
and deployment.
o Resource Efficiency: Multiple containers can run on a single server without the need for
separate operating systems, optimizing resource usage.
Limitations of Containers:
o Less Isolation: While containers provide good isolation, they do not offer the same level of
separation as VMs, which can lead to security concerns if a container is compromised.
o Shared Kernel: Since containers share the host OS kernel, any vulnerabilities in the kernel
can affect all containers running on that host.
Comparison with VMs:
o VMs provide stronger isolation since each VM runs a separate operating system, whereas
containers operate under a shared OS.
o The resource overhead for VMs is higher due to the need to run multiple OS instances,
making containers more suitable for microservices and cloud-native applications.
However, VMs may be more appropriate for running legacy applications that require a
complete OS environment.
Docker Containers
Docker, a popular container technology, offers an efficient way to develop and manage
applications in cloud environments.
Docker makes container usage effective by providing:
o Easy Development Tools
o Ready-to-Use Software Registry (Docker Hub)
o Fast Setup
o Reliable and Consistent Performance
Docker Terminology and Development Tools
Docker Image: Think of an image like a template or a blueprint for an app. It’s a file
that contains everything needed for an app to run, including the code, libraries, and
dependencies (like a mini-operating system for just that app). There are two kinds
of Docker images:
o Partial Images: These are smaller, reusable building blocks or pieces of code that
handle specific tasks. You can combine them to build up to a full app.
o Container Images: This is a complete version that has everything needed to run
the specific app and is built up from partial images or base images.
Docker Container: This is a running version of the Docker image. Think of the image
as a “blueprint” and the container as the “working version.”
Layer: Each step that a developer adds to the Docker image becomes a “layer.” Imagine each
layer as a block of instructions or code.
o These layers allow developers to build on top of previous ones, gradually creating the final
app.
o This setup also makes Docker images efficient because each layer is saved individually,
so Docker only needs to update the parts that changed instead of recreating the whole
image.
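The layers of a stored image can be inspected with the docker history command; a sketch assuming the image python:3.12-slim (an illustrative name) is already present locally:
# Show the layers that make up an image, one line per build step
docker history python:3.12-slim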
Dockerfile: When you want to create a Docker image, Docker uses a special setup file called
a Dockerfile.
o It's a text file where you list all the steps and instructions Docker should follow to
create the image.
o This is where you specify the base image, which other software or libraries to install, and
any additional code to add.
Docker Build: The docker build command is then used to read the Dockerfile and follow the
steps to create the final image.
Docker Run: After building the image, a Docker command called docker run is used to run an
image as a Docker container. This command launches the image in a new container. Running it
this way tells Docker to create a safe, isolated environment where your app runs independently
of other apps on your machine.
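Putting the pieces together, a hypothetical end-to-end example (the names hello-app and app.py and the base image are illustrative, and app.py is assumed to exist in the current directory):
# Create a minimal Dockerfile in the current directory
cat > Dockerfile <<'EOF'
# Start from a small base image
FROM python:3.12-slim
# Add the app's code; each instruction adds a layer
COPY app.py /app/app.py
# Command to run when a container starts from this image
CMD ["python", "/app/app.py"]
EOF

# Build an image from the Dockerfile and give it a name
docker build -t hello-app .

# Run the image as an isolated container (removed automatically when it exits)
docker run --rm hello-app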
Docker Software Components
Docker Engine is the main software that lets us create, manage, and run containers. It has
two main components:
o The Docker Daemon (dockerd)
o The Command-Line Interface (CLI), invoked as docker.
Dockerd runs in the background and manages all things related to Docker on the machine.
This includes:
o Building images from instructions in a Dockerfile.
o Running containers based on images.
o Downloading images from online repositories like Docker Hub.
Command-Line Interface (CLI): Designed for users who want to interact directly with Docker
by typing commands into a terminal.
Dockerd has two primary ways for users to interact with it:
o RESTful Interface intended for applications: Primarily used by orchestration software to
automate the creation, management, and termination of containers. It uses HTTP for
communication.
o Command-Line Interface (CLI) intended for Humans: Designed for users who want to
interact directly with Docker by typing commands into a terminal.
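For example, the RESTful interface can be reached through dockerd's local Unix socket; a sketch assuming the default socket path and that curl is installed (the CLI issues equivalent requests on the user's behalf):
# Ask dockerd, via its REST API, for the list of running containers
curl --unix-socket /var/run/docker.sock http://localhost/containers/json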
Basic Docker Commands
Building an Image: To create an image, a programmer writes a Dockerfile that describes
how to assemble the image. The Dockerfile should be saved in the current directory. To
build the image, the user runs
docker build .
Once built, Docker stores the image, and it assigns a unique 12-character hash as the
image name, like f34cd9527ae6.
Running a Container from an Image: Once an image is ready, it can be executed as a
container. To start a container using an image’s hash (e.g., f34cd9527ae6), the command
is:
docker run <image hash>
Viewing Saved Images: Docker keeps images stored until you decide to delete them. To see a
list of all images stored, you can use:
docker images
This command will display each saved image, showing their names, creation times, and
other details.
Deleting an Image: If an image is no longer needed, it can be deleted by its ID or name using
the command:
docker rmi [image-name or image-id]
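A typical hypothetical session tying these commands together (the image name myapp is illustrative; the -t option gives the image a readable name in addition to its 12-character hash):
docker build -t myapp .    # build an image from the Dockerfile in this directory
docker images              # list stored images, including myapp
docker run --rm myapp      # run the image as a container
docker rmi myapp           # delete the image when it is no longer needed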
Base Operating Systems and Files
Items in a Dockerfile