Google App Engine and Google File System

Google App Engine (GAE) is a PaaS that enables developers to build and run web applications without managing servers, offering features like automatic scaling, load balancing, and persistent data storage. It supports languages like Java and Python and provides built-in APIs for various functionalities. Google File System (GFS) is a distributed file system designed for large data storage, featuring a master-chunk server architecture that ensures fault tolerance and high throughput.

Uploaded by

Pavithra

1. Explain the basics of the Google App Engine (GAE) infrastructure programming model.

Introduction:

Google App Engine (GAE) is a Platform as a Service (PaaS) provided by Google that
allows developers to build, deploy, and run web applications on Google’s infrastructure
without worrying about managing servers or hardware.

GAE offers a complete platform including computing power, data storage, security, and load
balancing.

Key Features of GAE:

1. Supports Programming Languages:

o Java and Python are mainly supported.

o Developers can use web frameworks like Django (Python) and Google Web
Toolkit (Java).

2. Automatic Scaling:

o GAE automatically adjusts resources like CPU and memory depending on
traffic.

o No need for manual scaling or managing servers.

3. Load Balancing:

o Distributes incoming traffic efficiently across multiple servers for high
performance.

4. Sandboxed Environment:

o Each app runs in a secure, isolated environment, which increases security and
stability.

5. Persistent Data Storage:

o The GAE Datastore stores structured data and is built on Bigtable (a NoSQL database).

o Blobstore is available for large file storage (up to 2 GB).

6. APIs and Services:

o Provides built-in APIs for:

▪ Sending emails

▪ Authenticating users via Google accounts

▪ Manipulating images, fetching URLs, etc.

7. Free and Pay-as-you-go Model:

o Free usage up to a quota.

o Charges apply only when you exceed the quota.
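
The free-quota rule in item 7 comes down to simple arithmetic: only usage above the quota is billed. The sketch below is illustrative only. The function names, the quota, and the rate are hypothetical and are not actual GAE pricing.

```python
def billable_usage(used: float, free_quota: float) -> float:
    """Return the part of the usage that exceeds the free quota (never negative)."""
    return max(0.0, used - free_quota)


def charge(used: float, free_quota: float, rate_per_unit: float) -> float:
    """Charge only for usage above the free quota, at a hypothetical flat rate."""
    return billable_usage(used, free_quota) * rate_per_unit


# Hypothetical numbers: 28 free units per day, 0.5 currency units per extra unit.
within_quota = charge(used=20.0, free_quota=28.0, rate_per_unit=0.5)
over_quota = charge(used=40.0, free_quota=28.0, rate_per_unit=0.5)
print(within_quota, over_quota)
```

In the first call the usage stays under the quota, so nothing is billed. In the second, only the 12 units above the quota are charged.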

GAE Architecture:

Component | Function
DataStore | Stores data using BigTable with support for transactions.
Application Runtime | Provides an environment to run Java/Python apps securely.
Admin Console | Used to deploy, monitor, and manage applications easily.
Google Secure Data Connector (SDC) | Provides secure access to private data from the cloud.
Local SDK | Allows developers to test apps locally before deploying to the cloud.

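Real-World Applications Associated with GAE:

• Gmail

• Google Docs

• Google Maps

• Google Earth

These applications are commonly cited alongside GAE. They run on the same Google infrastructure that GAE opens up to developers, and they scale to millions of users globally.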

Summary:

Google App Engine allows developers to focus on writing application logic while Google
handles everything else like infrastructure, scaling, and performance. It’s a powerful tool for
building reliable and scalable web applications easily.

2. Outline the architecture of Google File System (GFS).

Introduction:

Google File System (GFS) is a distributed file system created by Google to store and
manage huge amounts of data across many servers. It is mainly used for internal Google
applications like search indexing, Gmail, etc.

Key Design Goals of GFS:

• Handle very large files (hundreds of MB or GB).

• Be fault-tolerant (hardware failures are common).

• Support high throughput rather than low latency.

• Optimize for write-once, read-many usage patterns, where files are mostly appended to and then read sequentially.

GFS Architecture:

GFS uses a Master–Chunk Server model:

Component | Description
Master Server | Controls the file system. Maintains metadata such as file names, chunk locations, and namespace.
Chunk Servers | Store actual file data in chunks (default size: 64 MB). Each chunk is replicated on multiple servers (usually 3).
Clients | Request chunk locations from the master, then communicate directly with chunk servers to read/write chunks.
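
A minimal sketch of how a client might turn a byte offset into a chunk lookup, assuming the fixed 64 MB chunk size given above. The `Master` class, its methods, and the file and server names are hypothetical illustrations, not GFS's real interface.

```python
from typing import Dict, List, Tuple

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the default chunk size stated above


def locate(offset: int) -> Tuple[int, int]:
    """Translate a byte offset in a file into (chunk index, offset inside that chunk)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, CHUNK_SIZE)


class Master:
    """Toy master (hypothetical): holds metadata only, never file data."""

    def __init__(self) -> None:
        # (file name, chunk index) -> names of chunk servers holding a replica
        self.chunk_locations: Dict[Tuple[str, int], List[str]] = {}

    def register(self, name: str, index: int, servers: List[str]) -> None:
        self.chunk_locations[(name, index)] = list(servers)

    def lookup(self, name: str, offset: int) -> Tuple[int, List[str]]:
        """Return the chunk index for the offset and the servers that hold that chunk."""
        index, _ = locate(offset)
        return index, self.chunk_locations[(name, index)]


master = Master()
master.register("/logs/crawl", 3, ["cs1", "cs4", "cs7"])
index, servers = master.lookup("/logs/crawl", 200 * 1024 * 1024)
print(index, servers)
```

Once the client has the server list, it reads the data from a chunk server directly, so the master stays out of the data path.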

Data Flow in GFS (Write Operation):

1. Client → Master: Client asks the master which chunk server is the primary for the
chunk and where the other replicas are.

2. Master Response: Master tells the client which server is the primary and the list of
secondaries.

3. Client → Replicas: Client sends the data to all replicas (primary + secondaries).

4. Client → Primary: Once all servers receive the data, the client sends a write
command to the primary server.

5. Primary → Secondaries: Primary assigns a serial number to the write, applies it
locally, and forwards the command to the secondaries, which apply writes in the same
serial order.

6. All Confirm: Once all secondaries finish writing, they confirm back to the primary.

7. Primary → Client: Finally, the primary server informs the client that the write was
successful.
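
The write steps above can be sketched as a small in-memory simulation. This is a simplified illustration under stated assumptions: the class and function names are hypothetical, steps 1 and 2 (the master lookup) are taken as already done, and failures and retries are not modelled.

```python
from typing import Dict, List, Tuple


class ChunkServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.buffer: Dict[str, bytes] = {}        # data pushed but not yet written
        self.chunk: List[Tuple[int, bytes]] = []  # applied writes as (serial, data)

    def push(self, data_id: str, data: bytes) -> None:
        """Step 3: receive data from the client and hold it in a buffer."""
        self.buffer[data_id] = data

    def apply(self, data_id: str, serial: int) -> bool:
        """Write the buffered data to the chunk under the given serial number."""
        self.chunk.append((serial, self.buffer.pop(data_id)))
        return True


class Primary(ChunkServer):
    def __init__(self, name: str, secondaries: List[ChunkServer]) -> None:
        super().__init__(name)
        self.secondaries = secondaries
        self.next_serial = 0

    def write(self, data_id: str) -> bool:
        """Steps 5-7: assign a serial number, apply locally, forward, collect confirmations."""
        serial = self.next_serial
        self.next_serial += 1
        self.apply(data_id, serial)
        acks = [s.apply(data_id, serial) for s in self.secondaries]
        return all(acks)


def client_write(primary: Primary, data_id: str, data: bytes) -> bool:
    """Steps 3-4: push data to every replica, then send the write command to the primary."""
    for replica in [primary, *primary.secondaries]:
        replica.push(data_id, data)
    return primary.write(data_id)


secondaries = [ChunkServer("cs2"), ChunkServer("cs3")]
primary = Primary("cs1", secondaries)
ok = client_write(primary, "w1", b"hello")
print(ok, primary.chunk)
```

Because the primary alone hands out serial numbers, every replica applies concurrent writes in the same order.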

Key Features:

• Fault Tolerance:

o Every chunk is replicated (usually 3 times) across different servers/racks.

o Ensures data availability even if some servers fail.

• Efficient Data Management:

o Large chunk size (64 MB) helps reduce metadata size and speeds up sequential
data access.

• Master Server Role:

o Handles metadata and tells clients which chunk servers to contact.

o Doesn’t participate in actual data transfer, improving performance.

• Shadow Master:

o A backup copy of the master's state that keeps the file system readable if the primary master fails.
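
A rough calculation shows why a large chunk size keeps the master's metadata small. The 64 bytes of metadata per chunk and the 4 KB comparison block size are assumptions chosen for illustration, and the function names are hypothetical.

```python
MB = 1024 * 1024
TB = 1024 * 1024 * MB

BYTES_PER_CHUNK_ENTRY = 64  # assumed metadata cost per chunk, for illustration only


def chunk_count(total_bytes: int, chunk_size: int) -> int:
    """Number of chunks needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // chunk_size)


def metadata_bytes(total_bytes: int, chunk_size: int) -> int:
    """Approximate metadata held by the master for the given amount of data."""
    return chunk_count(total_bytes, chunk_size) * BYTES_PER_CHUNK_ENTRY


large_chunks = metadata_bytes(1 * TB, 64 * MB)  # 64 MB chunks, as in GFS
small_blocks = metadata_bytes(1 * TB, 4 * 1024)  # hypothetical 4 KB blocks
print(large_chunks // MB, "MB vs", small_blocks // MB, "MB")
```

Under these assumptions, 1 TB of data needs about 1 MB of chunk metadata with 64 MB chunks, against about 16 GB with 4 KB blocks.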

Real-World Example:

Let’s say Google Search needs to index web pages:

• The crawled data is stored in GFS as large files.

• GFS breaks the files into chunks and stores them across different servers.

• If one server fails, GFS can still fetch the data from the chunk's other replicas.
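
The failover in the last point can be sketched as a read that tries each replica in turn. This is a simplified illustration: the function, the server names, and the chunk identifier are hypothetical, and a failed server is modelled as `None`.

```python
from typing import Dict, List, Optional


def read_chunk(chunk_id: str,
               replicas: List[str],
               servers: Dict[str, Optional[Dict[str, bytes]]]) -> bytes:
    """Try each replica in turn and return the first copy that can be read.

    `servers` maps a server name to its chunk store; None models a failed server.
    """
    for name in replicas:
        store = servers.get(name)
        if store is not None and chunk_id in store:
            return store[chunk_id]
    raise IOError(f"no live replica holds chunk {chunk_id!r}")


servers = {
    "cs1": None,                              # failed server
    "cs2": {"index-0001": b"crawled pages"},
    "cs3": {"index-0001": b"crawled pages"},
}
data = read_chunk("index-0001", ["cs1", "cs2", "cs3"], servers)
print(data)
```

With three replicas, the read still succeeds even though the first server is down. It fails only when no live replica holds the chunk.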

Summary:

GFS provides a scalable, fault-tolerant, and high-performance storage system to support
Google’s massive data needs. Its architecture is simple but powerful: a central master
that holds the metadata, chunk servers that hold the data, and clients that ask the master
where chunks live and then talk to the chunk servers directly.
