What level of AWS, Azure or GCP expertise is required?
Only a basic level is required. Cloud storage services (AWS S3, Azure Blob Storage, and Google Cloud Storage) will be used primarily for data storage, and that will be covered as part of this session.
Snowflake Purpose
Snowflake is a fully managed SaaS (software as a service) offering that provides a single platform for data warehousing, data lakes, data engineering, data science, data application development, and secure sharing and consumption of real-time / shared data. It is ideal for the following purposes:
● Data Warehouse (primary)
● Data Lake (primary)
● Data Exchange
● Data Apps
● Data Science
● Data Engineering
● Unistore (OLTP)
Relational Databases (RDBMS): Oracle, SQL Server, MySQL, PostgreSQL…
Relational databases are designed to run on a single server in order to maintain the integrity of the table
mappings and avoid the problems of distributed computing.
Machine/node => computer
Shared Disk Architecture
Scaling vertically => one big machine does all the work for you.
Scaling horizontally => thousands of machines do the work together for you.
An RDBMS can be scaled vertically but not horizontally.
Running a query on a database requires compute resources such as CPU and RAM.
If data volumes or user counts increase, we face performance issues, since there is a limit to how far a machine/computer can be scaled vertically.
There is a limit to how much one machine's CPU, RAM, and storage capacity can be increased.
Limitations of relational databases
1) Scalability, performance, and speed
2) Licensing cost and maintenance overhead
3) Concurrency issues (cannot handle a large number of users at the same time)
4) Limited/no support for semi-structured and unstructured data
5) Database failure
6) Upgrade costs
What is Big Data?
• The word "Big" in big data does not refer to data volume alone. It also refers to the fast rate at which data originates, its complex formats, and its origination from a variety of sources. The three V's of big data are Volume, Velocity, and Variety.
Hadoop architecture consists of two layers.
• MapReduce as the processing/computation layer
• Hadoop Distributed File System (HDFS) as the storage layer: all data is distributed across multiple slave nodes/machines as 64/128 MB blocks/files, while the master node manages data distribution, retrieval, replication, and metadata information
Shared nothing (both data storage and compute happen at each slave node level)
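To make the MapReduce programming model concrete, here is a minimal sketch in plain Python (not a real Hadoop job): the map step emits (word, 1) pairs and the reduce step sums them per key, just as Hadoop would across slave nodes.

from collections import defaultdict

def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in the input line.
    return [(word, 1) for word in line.split()]

def reduce_phase(pairs):
    # Reduce step: group the pairs by key (word) and sum the values.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

lines = ["big data is big", "data is distributed across nodes"]
pairs = [pair for line in lines for pair in map_phase(line)]
print(reduce_phase(pairs))
# {'big': 2, 'data': 2, 'is': 2, 'distributed': 1, 'across': 1, 'nodes': 1}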
Disadvantages of Hadoop:
➨ It is not suitable for small or real-time data applications.
➨ Joining multiple data sets is complex.
➨ Data retrieval is slow, since data must be fetched from multiple slave nodes, which involves a lot of shuffling and sorting that degrades performance.
➨ It does not have storage- or network-level encryption.
➨ Cluster management is hard: in-cluster operations such as debugging, distributing software, and collecting logs are difficult.
➨ Operating with a single master makes scaling difficult.
➨ The programming model is very restrictive.
Snowflake Architecture
HYBRID OF SHARED-DISK & SHARED-NOTHING ARCHITECTURES
Snowflake’s architecture is a hybrid of traditional shared-disk and shared-nothing database
architectures.
Similar to shared-disk architectures → Snowflake uses a central data
repository for persisted data that is accessible from all compute nodes in the
platform.
Similar to shared-nothing architectures → Snowflake processes queries using
virtual warehouses where each node in the cluster stores a portion of the entire
data set locally.
Data is stored in cloud storage and works as a shared-disk model, thereby providing simplicity in data management.
For compute, Snowflake takes advantage of the performance and scale-out benefits of a shared-nothing architecture.
This approach offers the data management simplicity of shared-disk architecture, along with the
performance and scale-out benefits of a shared-nothing architecture.
Snowflake architecture allows storage and compute to scale independently, so customers can use and
pay for storage and computation separately
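A hedged sketch of this decoupling using snowflake-connector-python: resizing a virtual warehouse changes only compute capacity (and compute cost), while the stored data is untouched. The connection parameters and warehouse name are placeholders, not values from this course.

import snowflake.connector

# Placeholder credentials -- replace with your own account details.
conn = snowflake.connector.connect(
    user="MY_USER",
    password="MY_PASSWORD",
    account="MY_ACCOUNT",
)
cur = conn.cursor()

# Scale compute up for a heavy workload; storage is unaffected.
cur.execute("ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE'")

# Scale back down afterwards so you stop paying the larger-size rate.
cur.execute("ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'XSMALL'")

cur.close()
conn.close()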
While data is a core asset for modern enterprises, technology’s ability to scale (cheaper storage) has
created a surge of big data.
Managing and storing that data has become a critical function for modern business operations.
Most enterprises are already using a cloud data platform, but many are evaluating whether a data migration might be needed in order to stay competitive.
Snowflake Architecture
Multi-Cluster Shared Data Architecture
SNOWFLAKE LAYERS
Snowflake's unique architecture consists of three key layers, all of them highly available. Pricing is also charged separately for each layer.
Each layer can scale independently: storage, compute, and services.
1) Centralized Storage → When data (structured or semi-structured) is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format in this layer.
Snowflake manages all aspects of how this data is stored: organization, file size, structure, compression, metadata, and statistics. This storage layer runs independently of compute resources.
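As an illustration of handing storage management to Snowflake, the hedged sketch below loads JSON into a VARIANT column using standard Snowflake SQL; once the COPY completes, compression, file sizing, and micro-partitioning happen in the storage layer automatically. The table name, stage name, and credentials are hypothetical.

import snowflake.connector

conn = snowflake.connector.connect(
    user="MY_USER", password="MY_PASSWORD", account="MY_ACCOUNT",
    warehouse="MY_WH", database="MY_DB", schema="PUBLIC",
)
cur = conn.cursor()

# A VARIANT column holds semi-structured data (JSON, Avro, Parquet...).
cur.execute("CREATE TABLE IF NOT EXISTS events (raw VARIANT)")

# Load files from a (hypothetical) external stage pointing at S3;
# Snowflake rewrites the data into its compressed, columnar format.
cur.execute("""
    COPY INTO events
    FROM @my_s3_stage
    FILE_FORMAT = (TYPE = 'JSON')
""")

cur.close()
conn.close()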
2) Compute → The compute layer is made up of virtual warehouses that execute the data processing tasks required for queries. Each virtual warehouse (or cluster) can access all the data in the storage layer, yet works independently, so the warehouses do not share, or compete for, compute resources.
Virtual warehouse = cluster of compute nodes/machines
Query execution is performed in the compute/processing layer using "virtual warehouses". Each virtual warehouse is an MPP (massively parallel processing) compute cluster composed of multiple compute nodes allocated by Snowflake from a cloud provider.
* Warehouse sizes range from X-Small to 6X-Large; 5XL and 6XL are in preview state.
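A hedged sketch of provisioning one of these MPP clusters; the warehouse name and settings are illustrative, not prescribed by the course. WAREHOUSE_SIZE selects the cluster size, and AUTO_SUSPEND/AUTO_RESUME stop billing from accruing while the warehouse sits idle.

import snowflake.connector

conn = snowflake.connector.connect(
    user="MY_USER", password="MY_PASSWORD", account="MY_ACCOUNT",
)
cur = conn.cursor()

# Create a small MPP cluster that suspends itself after 60 idle seconds
# and resumes automatically when the next query arrives.
cur.execute("""
    CREATE WAREHOUSE IF NOT EXISTS demo_wh
      WAREHOUSE_SIZE = 'XSMALL'
      AUTO_SUSPEND = 60
      AUTO_RESUME = TRUE
""")

cur.close()
conn.close()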
3) Cloud Services →
The overall brain of the system, this layer is a collection of services that handle query management, optimization, transactions, security and governance, metadata, and sharing and collaboration.
The cloud services layer uses ANSI SQL and coordinates the entire system. It eliminates the need for manual data warehouse management and tuning. This collection of services coordinates activities across Snowflake; it includes:
Authentication (user logins)
Infrastructure management (assigning compute resources…)
Metadata management (table structures, columns, micro-partition details)
Query parsing and optimization (query performance optimization, execution plan)
Access control (access management)
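To see the services layer at work, the hedged sketch below asks the optimizer for an execution plan with EXPLAIN (query parsing and optimization) and lists table metadata with SHOW TABLES (metadata management). Both are served largely by the cloud services layer rather than by warehouse compute; the database and table names are hypothetical.

import snowflake.connector

conn = snowflake.connector.connect(
    user="MY_USER", password="MY_PASSWORD", account="MY_ACCOUNT",
    database="MY_DB", schema="PUBLIC",
)
cur = conn.cursor()

# Query parsing and optimization: return the execution plan only.
for row in cur.execute("EXPLAIN SELECT COUNT(*) FROM events"):
    print(row)

# Metadata management: list tables straight from Snowflake's metadata.
for row in cur.execute("SHOW TABLES"):
    print(row)

cur.close()
conn.close()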
Cloud Agnostic Layer → There is also a fourth layer, known as the Cloud Agnostic Layer. It comes into play only the first time, when we choose a cloud provider.
"Cloud agnostic" generally refers to applications and workloads that can be moved seamlessly between cloud platforms.
Snowflake’s architecture allows flexibility in handling big data.
Snowflake decouples the storage and compute functions, so organizations with high storage demands but less need for CPU cycles, or vice versa, do not have to pay for an integrated bundle of both.
Users can scale up or down as needed and pay for only the resources they use.
Storage is billed by terabytes stored per month, and computation is billed on a per-second basis.
Warehouse Billing
Warehouses are billed per second, with a 60-second minimum each time a warehouse starts or resumes:
30 secs – billed as 1 minute
45 secs – billed as 1 minute
61 secs – billed as 61 secs
65 secs – billed as 65 secs
90 secs – billed as 90 secs
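The rule behind these numbers is easy to reproduce. A small sketch, assuming the standard rate of 1 credit per hour for an X-Small warehouse (larger sizes roughly double the rate at each step up):

def billed_seconds(runtime_secs):
    # Per-second billing, with a 60-second minimum per start/resume.
    return max(60, runtime_secs)

def credits_used(runtime_secs, credits_per_hour=1):
    # Convert billed seconds into credits at the warehouse's hourly rate.
    return billed_seconds(runtime_secs) * credits_per_hour / 3600

for secs in (30, 45, 61, 65, 90):
    print(f"{secs} secs -> billed as {billed_seconds(secs)} secs "
          f"({credits_used(secs):.4f} credits on X-Small)")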
Snowflake account creation