AWS Database Services

The document provides an overview of various AWS database services, including Amazon RDS, Aurora, DynamoDB, DocumentDB, ElastiCache, Neptune, and Redshift, along with their best use cases. It also explains the differences between structured, unstructured, and semistructured data, and highlights the importance of relational and non-relational databases. Additionally, it covers key concepts such as OLTP vs OLAP, data indexing, and migration challenges to the cloud.

Uploaded by

ananth

AWS Database Services – Simplified Overview

Service | 🔎 What It Does (Simple) | 📚 Best Use Case
Amazon RDS | Managed relational databases (MySQL, PostgreSQL, Oracle, etc.) | Traditional business apps needing SQL DB
Amazon Aurora | High-performance, cost-effective RDS engine (MySQL/PostgreSQL compatible) | Enterprise apps needing speed + scalability
Amazon DynamoDB | NoSQL key-value database, highly scalable & fast | Real-time apps like gaming, e-commerce carts
Amazon DocumentDB | MongoDB-compatible document database with separate compute/storage | Apps using JSON-like docs (e.g., content apps)
Amazon ElastiCache | Fast in-memory store with Redis/Memcached | Boost app performance, caching user sessions
Amazon Neptune | Managed graph database; connects relationships between data | Social networks, fraud detection, recommendation
Amazon Redshift | Data warehouse for analytics; petabyte-scale storage + fast queries | BI tools, large-scale reporting and dashboards

🧠 Tips for AWS Cloud Practitioner Exam

• RDS = Relational → SQL-based apps
• Aurora = Enhanced RDS with high speed
• DynamoDB = Non-relational → Scales automatically
• DocumentDB = JSON-style storage → MongoDB-like
• ElastiCache = In-memory → Fast data access
• Neptune = Graph → Relationship-focused workloads
• Redshift = Analytics → Warehousing & querying large datasets

Three Types of Data Sources – Easy Summary

• Structured Data
o Stored in rows and columns (like Excel or SQL tables)
o Easy to analyze and perform complex searches
o Structure must follow strict rules

• Unstructured Data
o Stored as files (images, videos, PDFs, etc.)
o No fixed format or structure
o Hard to search or analyze without special tools

• Semistructured Data
o Stored in flexible formats like JSON
o Allows changing structure on the fly
o Easier to query than unstructured but not as powerful as structured data

• Structured → best for relational databases (like RDS, Aurora)
• Unstructured → stored in S3; needs tools like Athena, Macie
• Semistructured → fits DocumentDB, DynamoDB, or JSON-based processing

Feature | 📘 Structured | 🧾 Semistructured | Unstructured
💡 Format | Fixed schema (rows/columns) | Flexible schema (JSON/XML) | No defined format
Storage Option | Relational DB (RDS, Aurora) | NoSQL DB (DocumentDB, DynamoDB) | Object/File store (S3, FSx)
🔎 Analysis Power | Very strong with SQL | Moderate with custom logic | Requires special processing
🔁 Schema Flexibility | Rigid; must predefine | Flexible; changes easily | None
📄 Examples | Orders, sales, employee records | Config files, messages, JSON docs | Text files, images, videos

💾 Data Source Type | 📂 Example | 🧠 Typical Use Case | ☁️ AWS DB Services
Structured | POS data (order ID, total) | Sales reports, finance, transactions | Amazon RDS, Aurora
Semistructured | Clickstream (XML, JSON) | User behavior, app telemetry | DynamoDB, DocumentDB
Unstructured | Images, videos, PDFs | Media storage, logs | S3 (object store), use Athena/Macie to query
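
As a rough illustration of the three data types above, the sketch below uses only Python's standard library. The table name, field names, and sample values are invented for illustration, and the in-memory SQLite database is just a local stand-in for a relational service such as RDS or Aurora.

```python
import json
import sqlite3

# Structured: fixed columns, every row follows the same schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pos_orders (order_id INTEGER PRIMARY KEY, total REAL NOT NULL)")
conn.executemany("INSERT INTO pos_orders VALUES (?, ?)", [(1, 19.99), (2, 5.50)])
order_count = conn.execute("SELECT COUNT(*) FROM pos_orders").fetchone()[0]

# Semistructured: JSON records may carry different fields from one another.
clickstream = [
    json.loads('{"user": "u1", "page": "/home"}'),
    json.loads('{"user": "u2", "page": "/cart", "referrer": "email"}'),
]
pages = [event["page"] for event in clickstream]
referrers = [event.get("referrer") for event in clickstream]  # None when absent

# Unstructured: opaque bytes (e.g. an image); there are no fields to query.
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47])  # first bytes of a PNG header
size_in_bytes = len(image_bytes)
```

The structured rows can be counted with SQL, the JSON events need `.get()` to cope with optional fields, and the raw bytes only expose their size until a special tool interprets them.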

Relational Databases — Explained Simply

• ✅ Relational databases are perfect for structured data, like customer orders or employee info.
• Tables store the data, with each table focusing on a topic or entity (like Customers or Orders).
• 📄 Rows (records) = one item or entry. 📌 Example: one customer’s order.
• 📌 Columns (fields) = characteristics or attributes. 📌 Example: first name, order total.

🔗 Relationships Between Tables

🔑 Key Type | 🧠 What It Means
Primary Key | Unique ID for each row in a table (e.g., customer_id)
Foreign Key | Connects to a primary key from another table
Purpose | Builds connections between tables; creates linked records

💡 Example: A customer_id from the Customers table connects to customer_id in the Orders table, showing which customer made which order.
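
The Customers/Orders relationship can be sketched with Python's built-in `sqlite3` module. This is a minimal local illustration, not an RDS setup: the column names other than `customer_id` and all sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES Customers(customer_id),
        order_total REAL NOT NULL
    )
""")
conn.execute("INSERT INTO Customers VALUES (1, 'Ana')")
conn.execute("INSERT INTO Customers VALUES (2, 'Ben')")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(100, 1, 25.0), (101, 1, 10.0), (102, 2, 7.5)],
)

# Join through the key pair to see which customer made which order.
rows = conn.execute("""
    SELECT c.first_name, o.order_id
    FROM Orders AS o JOIN Customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (103, 999, 1.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The foreign key does two jobs here: it lets the join link each order to its customer, and it stops an order from referencing a customer that is not in the Customers table.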

☁️AWS Relational Database Services

🧩 Service | 🔎 What It Supports
Amazon RDS | Managed relational DB: MySQL, PostgreSQL, etc.
Amazon Aurora | High-performance MySQL/PostgreSQL-compatible DB

Relational Databases – Key Points

• 💾 Relational databases store structured data in tables made of rows and columns
• Each table represents an entity (like a customer or order), with columns for attributes and rows for entries
• ✅ Designed to be highly available, consistent, and support complex queries (like joins)
• 🚀 Great for transactional data, such as finance, retail, or order systems

🔄 OLTP vs OLAP – Explained Simply

🧩 Method | 📌 Purpose | 📊 Optimized For | 💡 AWS Solution
OLTP | Online Transaction Processing | Write operations | Amazon RDS, Aurora
OLAP | Online Analytical Processing | Read operations & analysis | Amazon Redshift (Data Warehouse)

• ✍️ Write operation = adding/updating data
• 🔎 Read operation = querying or analyzing data
• ⚠️ A single large database is hard to tune for heavy writes and heavy analytical reads at the same time → OLTP and OLAP workloads are usually separated for performance
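
To make the write-versus-read distinction concrete, here is a small sketch using Python's built-in `sqlite3` module as a local stand-in. The `sales` table and its values are invented; in practice the OLTP part would run on a service like RDS or Aurora and the OLAP part on a warehouse like Redshift.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, store TEXT, amount REAL)")

# OLTP-style work: many small writes, each one a short transaction.
for store, amount in [("north", 10.0), ("north", 15.0), ("south", 7.0)]:
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO sales (store, amount) VALUES (?, ?)", (store, amount))

# OLAP-style work: one read that scans and summarizes many rows.
report = conn.execute(
    "SELECT store, COUNT(*), SUM(amount) FROM sales GROUP BY store ORDER BY store"
).fetchall()
```

Each insert touches one row and finishes quickly, while the report has to read every row to build its summary, which is why the two patterns stress a database differently.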

💡 Summary Insight

• OLTP = speed for input (transactions)
• OLAP = power for reports and data insights
• AWS supports both; you choose based on your app needs

Data Indexing – Why It Matters

• Indexing = creating a shortcut so SQL queries find data faster
• Without index: full table scan (every row is checked)
• With index: query jumps straight to relevant rows
• 📌 Example: Indexing by OrderDate allows finding all orders from a given date quickly
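
The effect of an index can be observed locally with SQLite's `EXPLAIN QUERY PLAN`, as in the sketch below. The table, index name, and dates are assumptions for illustration, and the exact plan wording varies between SQLite versions, so the code only looks for the index name and the word SCAN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (order_date, total) VALUES (?, ?)",
    [(f"2024-01-{day:02d}", float(day)) for day in range(1, 31)],
)

query = "SELECT * FROM orders WHERE order_date = '2024-01-15'"

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes the access path.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_without_index = plan(query)  # mentions a SCAN of the whole table
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")
plan_with_index = plan(query)     # mentions idx_orders_date

matching = conn.execute(query).fetchall()
```

The query text is identical before and after; only the access path changes once the index on `order_date` exists.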

⚡ OLTP vs OLAP – Quick Study Table

🧩 Processing Type | 🔧 Focus | 🔎 Query Type | 📚 Example | ☁️ AWS Service
OLTP | Real-time data transactions | Short, simple queries | Bank ATM, retail POS system | Amazon RDS, Aurora
OLAP | Historical data analysis | Complex reports/queries | Business analytics dashboard | Amazon Redshift

• ✍️ OLTP = Insert, Update, Delete (write-heavy)
• 📊 OLAP = Read-heavy, summaries, deep analysis

Challenge: Migrating On-Prem Transactional DB to Cloud

• Migrating apps like an inventory control system to the cloud can be tricky.
• You need to pick the right database service that suits:
o Structured data
o Transactional access
o Scalability and reliability

✅ Solution: Use Amazon RDS

My PostgreSQL database is seeing a lot of write I/Os smaller than 4KB. We're consuming a good amount of I/O resources. What should I use?

Amazon Aurora

I need a data warehouse solution that provisions infrastructure capacity and automates ongoing administrative tasks. What should I use?

Amazon Redshift
Amazon RDS – Study Notes

🔧 What It Is

• Fully managed relational database service
• Supports MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Aurora

🎯 Why Use It

• Reduces setup and admin effort (patching, backups, provisioning)
• Scales easily and cost-effectively
• Works with analytics and machine learning services

How It Works

• Stores structured data in tables (rows = records, columns = fields)
• Supports relationships via primary and foreign keys
• Best suited to transactional workloads (OLTP); large-scale analytics (OLAP) is typically handled by Amazon Redshift

🔄 OLTP vs OLAP

Mode | Purpose | AWS Service
OLTP | Fast, short write operations (e.g., POS, ATM) | RDS, Aurora
OLAP | Complex analytics, large reads | Redshift

Setup Basics

• Choose instance type (controls CPU, RAM, I/O)
• Choose DB engine (MySQL, PostgreSQL, etc.)
• Create via Console, CLI, or API

Security Features

• Host inside a VPC for network isolation
• Use security groups + IAM for access control
• Data in transit: SSL/TLS encryption
• Data at rest: AES-256 encryption

📦 Backup & Recovery


• Multi-AZ deployments for failover
• Snapshots via Lambda → Store in S3
• DR strategy: replicate snapshots across Regions

📊 Analytics Integration

• Real-time analytics: RDS → Lambda → Kinesis → S3 → Athena → QuickSight

💸 Pricing Structure

• Instance: On-Demand (hourly) or Reserved (1–3 yrs, cheaper)
• Storage + I/O: charged per GB/month and per million requests
• Data transfer: charged across Regions; free within same Region
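
The pricing components above combine by simple addition. The sketch below shows the arithmetic only: every rate in it is a made-up placeholder, not a real AWS price, and the function name is hypothetical. Cross-Region data transfer is left out to keep the example short.

```python
# All rates below are made-up placeholders, NOT real AWS prices.
HOURLY_INSTANCE_RATE = 0.10        # USD per instance-hour (hypothetical)
STORAGE_RATE_PER_GB_MONTH = 0.12   # USD per GB-month (hypothetical)
IO_RATE_PER_MILLION = 0.20         # USD per million I/O requests (hypothetical)

def estimate_monthly_cost(hours, storage_gb, io_requests):
    """Add up instance, storage, and I/O charges for one month."""
    instance = hours * HOURLY_INSTANCE_RATE
    storage = storage_gb * STORAGE_RATE_PER_GB_MONTH
    io = (io_requests / 1_000_000) * IO_RATE_PER_MILLION
    return round(instance + storage + io, 2)

# 720 hours (30 days), 100 GB stored, 50 million I/O requests
monthly = estimate_monthly_cost(hours=720, storage_gb=100, io_requests=50_000_000)
```

With these placeholder rates the instance charge dominates (72 of the 94 total), which is the part a Reserved commitment would reduce.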

🌐 Real-World Use Cases

• Airbnb: migrated MySQL with just 15 mins downtime
• Mint by Intuit: moved 100+ instances from EC2 → RDS to reduce costs
• Genomics firm: uses PostgreSQL to analyze data for 90K+ users


What Is Amazon Aurora?

• A cloud-native relational database managed by Amazon RDS
• Compatible with MySQL and PostgreSQL
• Combines high speed & reliability of enterprise DBs with cost-effectiveness of open-source DBs

🧠 Key Features

🔧 Feature | 💡 What It Means
Log-structured storage | Fast, efficient data writing
Auto backup to S3 | Point-in-time restore without manual setup
Multi-AZ deployments | Six copies across 3 AZs for durability; automatic failover
Read replicas (up to 15) | Improves performance and handles read-heavy apps
Aurora Global Database | Single DB replicated across AWS Regions; enables fast reads + DR
Serverless mode | Auto scales based on demand; no need to manage instances

Security Practices

• Place Aurora inside a VPC for isolation
• Use security groups to control access
• IAM manages DB credentials and permissions
• Encrypt data in transit (SSL/TLS) and at rest (AES-256)

🔌 Integration with AWS Services

⚙️ Service | 💡 Use Case with Aurora
Lambda | Trigger actions or ETL processes based on DB changes
Kinesis Firehose | Stream external/public data → process → store in Aurora
Athena + QuickSight | Query logs in S3 → visualize dashboards
DMS | Migrate data from other DBs to Aurora (easy and cost-saving)
CloudWatch | Monitor Aurora DB activity and performance

💸 Pricing Model

💰 Pricing Component | 💡 Details
Instance cost | On-Demand (hourly), Reserved (1–3 years), Serverless
Storage & I/O | GB/month and per million requests
Backups | Built-in backups free; manual snapshots charged
Data transfer | Free within Region; charged across Regions

🏢 Real-World Use Cases

• Pagely: Uses Aurora Serverless for scalable WordPress hosting
• New Innovations: Migrated SQL Server → Aurora PostgreSQL; saved nearly $1M
• Dow Jones: Runs 200 TPS on Aurora cluster with cross-region replication
• BMLL Technologies: Uses Aurora PostgreSQL for fast big data analysis

🚀 AWS Console Tips to Launch Aurora


Step | Sample Choice
Engine | Aurora MySQL or Aurora PostgreSQL
Instance type | Burstable (for test) or Memory-optimized (for prod)
Availability | Enable Multi-AZ for high durability
Connectivity | Place in VPC, assign security group
Backup & restore | Automatic S3 backup enabled

Non-Relational (NoSQL) Databases – Key Points

• Often used for semistructured (JSON/XML) or unstructured (images, docs) data
• "NoSQL" really means "Not Only SQL"; many support querying with SQL-like syntax
• Store everything in one document/item, not across related tables
• Flexible schema: you don’t need to predefine all fields
• Perfect for apps that evolve quickly and scale fast (e.g., web/mobile apps)
• Trade-off: they may use eventual consistency instead of immediate (ACID) consistency

🔍 Relational vs Non-Relational – Comparison Table


⚙️ Characteristic | Relational DB | 🧾 Non-Relational DB (NoSQL)
Structure | Tables with rows & columns | Documents or key-value pairs in a single table
Data Design | Normalized schema | Denormalized, flexible schema
Optimization | Optimized for storage | Optimized for compute/performance
Query Language | SQL | Multiple (JSON-style, object queries, etc.)
Scalability | Vertical (scale-up) | Horizontal (scale-out across servers)
Use Case | OLTP/OLAP for business & analytics | OLTP for web/mobile, real-time apps
ACID Support | Yes | Sometimes (depends on NoSQL type & configuration)
Schema Change | Requires updates to table structure | Add new fields anytime, no schema update needed

📚 Example Use Case

🧩 Scenario | 🎯 Best Choice | 💡 Why
Inventory with complex relations | Relational DB (Amazon RDS, Aurora) | Structured data, join queries, stable schema
Evolving product catalog with metadata | NoSQL DB (DynamoDB, DocumentDB) | Flexible schema, fast scale, varied record formats

Document databases

Document stores keep files containing data as a series of elements. These files can be navigated using numerous languages including Python and Node.js. Each element is an instance of a person, place, thing, or event. For example, a document store may hold a series of log files from a set of servers. These log files can each contain the specifics for that system without concern for what the log files in other systems contain.

Strengths

• Flexibility
• No need to plan for a specific type of data when creating one
• Easy to scale

Weaknesses

• Sacrifice ACID compliance for flexibility
• Databases cannot query across files natively
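
The server-log example can be sketched in plain Python, with a list of JSON documents standing in for a document store. The host names and fields are invented for illustration, and this is not the DocumentDB API.

```python
import json

# Each document describes one server's log entry; fields differ per system.
documents = [
    json.loads('{"host": "web-1", "level": "ERROR", "status": 500}'),
    json.loads('{"host": "db-1", "level": "INFO", "query_ms": 42, "tags": ["slow"]}'),
]

# Adding a document with a brand-new field needs no schema change.
documents.append({"host": "cache-1", "level": "ERROR", "evicted_keys": 17})

# Queries must tolerate missing fields, so use .get() rather than indexing.
error_hosts = [doc["host"] for doc in documents if doc.get("level") == "ERROR"]
fields_seen = sorted({key for doc in documents for key in doc})
```

Three documents end up with six distinct fields between them, which shows the flexibility; the cost is that every query has to handle fields that may be absent.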

In-memory databases

In-memory databases are used for applications that require real-time access to data. Most databases have areas of data that are frequently accessed but seldom updated. Additionally, querying a database is always slower and more expensive than locating a key in a key-value pair cache. Some database queries are especially expensive to perform. By caching such query results, you pay the price of the query once and then are able to quickly retrieve the data multiple times without having to re-execute the query.

Strengths

• Support the most demanding applications requiring sub-millisecond response times
• Great for caching, gaming, and session store
• Adapt to changes in demands by scaling out and in without downtime
• Provide ultrafast (sub-millisecond latency) and inexpensive access to copies of data

Weaknesses

• Poor fit for data that is rapidly changing or is seldom accessed
• Poor fit when the application using the in-memory store has a low tolerance for stale data
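
The "pay for the query once" idea is often implemented as a cache-aside read. The sketch below uses a plain dictionary in place of ElastiCache; the function names, the 60-second TTL, and the returned values are all assumptions made for the example.

```python
import time

query_count = 0
cache = {}            # key -> (value, expiry timestamp)
TTL_SECONDS = 60      # hypothetical freshness window

def expensive_query(customer_id):
    """Stand-in for a slow database query."""
    global query_count
    query_count += 1
    return {"customer_id": customer_id, "lifetime_orders": 12}

def get_customer_summary(customer_id, now=None):
    """Cache-aside read: try the cache first, fall back to the database."""
    now = time.time() if now is None else now
    entry = cache.get(customer_id)
    if entry is not None and entry[1] > now:
        return entry[0]                      # cache hit
    value = expensive_query(customer_id)     # cache miss or expired entry
    cache[customer_id] = (value, now + TTL_SECONDS)
    return value

first = get_customer_summary(7, now=0)
second = get_customer_summary(7, now=30)    # within TTL: served from cache
third = get_customer_summary(7, now=120)    # after TTL: queried again
```

Three reads cost two database queries here. The TTL is also where the stale-data weakness shows up: between refreshes, the cache can return a value the database has since changed.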

Non-Relational Database Types – Overview Table

🧠 DB Type | 📘 Definition | ✅ Strengths | ⚠️ Weaknesses | ☁️ AWS Service Example
Key-Value | Stores data as key-value pairs without schema; values can be blobs or any format | Very flexible; simple access (no joins); easy to copy across systems | Hard to run analytics; access patterns must be predefined | Amazon DynamoDB
Document | Stores files (like JSON, XML) with elements; each file is a flexible data document | Flexible structure; easy to scale; good for varying data types | May not support ACID fully; cannot query across documents natively | Amazon DocumentDB
In-Memory | Stores frequently accessed data in RAM for ultra-fast response | Sub-millisecond latency; great for caching and gaming; scales in/out dynamically | Not ideal for data with low access; risk of stale data | Amazon ElastiCache (Redis, Memcached)
Graph | Stores data as nodes and edges representing relationships | Excellent for complex relationships; fast recommendations; good for big data mining | Not ideal for transactional use; requires specialized query languages; analytics can be slower | Amazon Neptune

💡 Quick Use Case Mapping

📂 Scenario | 📌 Best DB Type | 💬 Why It Fits
Caching user sessions | In-Memory | Fast access, frequent reads
Product catalog with JSON | Document | Flexible schema per item
E-commerce transactions | Key-Value | Simple, fast access without joins
Social media connections | Graph | Models relationships between users/posts easily
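
For the graph row, the sketch below models social connections as nodes and edges in plain Python and derives friend-of-friend recommendations. The user names and the ranking rule are assumptions for illustration; a graph database such as Neptune would express the same traversal in its own query language.

```python
from collections import defaultdict

# Edges are connections between user nodes (sample data).
edges = [("ana", "ben"), ("ana", "cy"), ("ben", "dee"), ("cy", "dee"), ("cy", "eli")]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)   # treat the relationship as mutual for this sketch

def recommend(user):
    """Suggest friends-of-friends the user is not yet connected to,
    ranked by how many mutual friends they share."""
    scores = defaultdict(int)
    for friend in graph[user]:
        for candidate in graph[friend]:
            if candidate != user and candidate not in graph[user]:
                scores[candidate] += 1
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

suggestions = recommend("ana")
```

Here "dee" ranks first because two of ana's friends know her, and "eli" follows with one mutual friend. The query walks relationships directly instead of joining tables.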

I have a gaming website that has grown so quickly that the whole site loads slowly. I need a solution that can handle this huge volume of users. What should I use?

Amazon ElastiCache

I need to quickly gather shopping cart data from my website and discard data on abandoned carts. What should I use?

Amazon DynamoDB

I am building a fraud detection app and need a database that supports near real-time detection of patterns. What should I use?

Amazon Neptune

We have a massive MongoDB database that needs to be migrated to the cloud. We need a managed service that is purpose-built for our workload. What should I use?

Amazon DocumentDB

AWS Service | 📝 Correct Description
Amazon Neptune | Stores data as nodes and the relationships between each node
Amazon DocumentDB | Stores files containing data as a series of elements (e.g., JSON documents)
Amazon DynamoDB | Stores data within a single table without a predefined schema using key-value pairs
Amazon ElastiCache | Stores data in a cache to provide databases with optimal performance for queries

AWS DATABASE MIGRATION

Common Migration Use Cases

📂 Source | ☁️ Target AWS Service
MongoDB | Amazon DocumentDB
Oracle / SQL Server | Amazon RDS or Amazon Aurora
Cassandra | Amazon DynamoDB
Teradata (data files) | Amazon Redshift (data warehouse)

🧰 Data Migration Tools – Overview

Tool | 📘 Purpose
AWS DMS | Migrate live data with minimal downtime (replication)
AWS SCT | Convert schemas + code for heterogeneous migrations
Native DB Tools | Use vendor tools for same-engine (homogeneous) migrations

🔍 Migration Types

🧩 Type | 🔄 Definition | 🧠 Tools Required
Homogeneous | Same DB engine (e.g., Oracle → Oracle) | Native DB tools, AWS DMS
Heterogeneous | Different DB engines (e.g., Oracle → MySQL) | AWS SCT → convert schema, then AWS DMS
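
The tool choice in the table above reduces to one comparison: is the source engine the same as the target? The helper below is a hypothetical sketch of that decision rule only. It does not call DMS or SCT, and its name and output strings are invented for illustration.

```python
def plan_migration(source_engine, target_engine):
    """Return the ordered tool steps for a migration, following the table above."""
    if source_engine.strip().lower() == target_engine.strip().lower():
        # Homogeneous: the schema is already compatible with the target engine.
        return ["Native DB tools or AWS DMS: move the data"]
    # Heterogeneous: convert the schema first, then move the data.
    return ["AWS SCT: convert schema and code", "AWS DMS: move the data"]

same_engine = plan_migration("Oracle", "oracle")
cross_engine = plan_migration("Oracle", "MySQL")
```

The order matters in the heterogeneous case: the converted schema has to exist on the target before the data can be loaded into it.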

🔐 Schema vs Data Migration

📦 AWS DMS | 📑 AWS SCT
Moves data only | Converts schema, keys, constraints, procedures
No foreign key migration | Identifies limitations & generates schema scripts
Live data replication | Translates views, functions, triggers between engines

🧠 Planning Tips for Exam & Projects

• Use DMS for real-time replication and cutover migrations
• Use SCT when converting schemas between engines
• For Microsoft SQL Server → RDS for SQL Server, use native SQL Server tools
• Understand schema limitations and customize before full cutover


Types of Database Architectures

1️⃣ Server-Based Architecture

• Traditional model
• Database runs on a physical or virtual server
• Requires manual setup, patching, scaling, and backups
• 📌 Common in on-prem or EC2-hosted databases
• 💡 Example: SQL Server running on Amazon EC2

2️⃣ Serverless Architecture

• No servers to manage; AWS handles it
• Scales automatically with demand
• Pay only for actual usage (read/write/storage)
• 📌 Great for unpredictable workloads
• 💡 Example: DynamoDB or Aurora Serverless

3️⃣ Managed Architecture (via Amazon RDS or Aurora)

• AWS manages compute, storage, scaling, backups
• You choose the DB engine (MySQL, PostgreSQL, etc.)
• Offers Multi-AZ, read replicas, snapshots
• 📌 Combines server efficiency with cloud automation
• 💡 Example: Amazon RDS for MySQL or Amazon Aurora PostgreSQL
