0% found this document useful (0 votes)

47 views22 pages

Azure Data Fundamentals - Study Notes

The document provides comprehensive study notes on Azure Data Fundamentals, covering core data concepts, data storage types, and data processing methodologies. It details relational and non-relational data in Azure, including various Azure services for data management and analytics. Additionally, it outlines data roles, services, and architectures relevant to data analytics within the Azure ecosystem.

Uploaded by

Kadimi Kiranraj

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

47 views22 pages

Azure Data Fundamentals - Study Notes

Uploaded by

Kadimi Kiranraj

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 22

Azure Data Fundamentals

Study Notes

Htay Aung
https://www.linkedin.com/in/htayaung

January 24, 2025

Table of Contents
Core Data Concepts ....................................................................................................................................................... 4
Structured Data ........................................................................................................................................................... 4
Semi-Structured Data................................................................................................................................................ 4
Unstructured Data ...................................................................................................................................................... 4
Data Stores .................................................................................................................................................................... 4
File Stores ...................................................................................................................................................................... 4
Databases ...................................................................................................................................................................... 6
Transactional Data Processing ............................................................................................................................... 6
Analytical Data Processing ...................................................................................................................................... 7
Common Architecture .......................................................................................................................................... 7
Data Roles ..................................................................................................................................................................... 8
Data Services ................................................................................................................................................................ 8
Relational Data in Azure ............................................................................................................................................ 11
SQL ................................................................................................................................................................................ 11
Data Objects .............................................................................................................................................................. 11
Non-Relational Data in Azure.................................................................................................................................. 12
Azure Blob Storage ................................................................................................................................................. 12
Types of blobs ...................................................................................................................................................... 12
Access tiers ............................................................................................................................................................ 13
Azure Data Lake Storage (ADLS) Gen1 ............................................................................................................ 13
Azure Data Lake Storage (ADLS) Gen2 ............................................................................................................ 13
Azure Files .................................................................................................................................................................. 13
Azure Tables............................................................................................................................................................... 14
Azure Cosmos DB .................................................................................................................................................... 14
Supported APIs .................................................................................................................................................... 15
Data Analytics in Azure .............................................................................................................................................. 16
General data analytics architecture .................................................................................................................. 16
Batch Processing ...................................................................................................................................................... 16
Stream Processing ................................................................................................................................................... 17
Real-time analytics services ................................................................................................................................. 17
Sources for stream processing ........................................................................................................................... 17
Sinks for streaming processing .......................................................................................................................... 17
Delta Lake ................................................................................................................................................................... 18
Microsoft Power BI .................................................................................................................................................. 18
Choosing for data visualization ..................................................................................................................... 18
Azure Synapse Analytics ............................................................................................................................................ 22
Resources ........................................................................................................................................................................ 22
Core Data Concepts
▪ Entities
o E.g. customers, products, sales orders, and so on
▪ Attributes
o E.g. a customer’s name, address, phone number, and so on

Structured Data
▪ Fixed schema
▪ The schema type is tubular
▪ Often stored in a database
▪ Reference multiple tables using key values – relational model

Semi-Structured Data
▪ Has some structures
▪ Allow some variation between entities
▪ Common format - JavaScript Object Notation (JSON) document
▪ E.g. Customer entity
o Some customers include one contact
o Some customers include two or more contacts
o Some customers don’t include any contacts

Unstructured Data
▪ Don’t have a specific structure
▪ Don’t have a specific semi-structure
▪ E.g. documents, images, audio, video, and binary files

Data Stores
Common data store categories:

▪ File stores
▪ Databases

File Stores
▪ Local file systems
▪ USB drives
▪ Shared file storage system

Central storage locations hosted in the cloud enable cost-effective, secure, and reliable
storage for large volumes of data.
Common file formats:

▪ Delimited text files

o Comma-separated values (CSV)
o Tab-separated values (TSV)
▪ JSON
o Attributes represented by name : value pairs and separated by comma (,)
o Attribute might contain object collection
o Objects are enclosed in braces ({…})
▪ XML
o Use tags
o Enclosed in angle-brackets (<…/>)
o Contains elements and attributes
▪ Binary Large Object (BLOB)
o Stored as binary data (1’s and 0’s)
o Common file types
▪ Images, video, audio, and application-specific documents
▪ Optimized file formats
o Common file formats
▪ Avro
• Row-based format
• Created by Apache
• Each record contains a header (data structure in the record).
Stored as JSON.
• Stored as binary information
• Good format for compressing data, minimizing storage, and
network bandwidth requirements
▪ ORC (Optimized Row Columnar format)
• Organize data into columns
• Developed by HortonWorks for optimizing read and write
operations in Apache Hive
▪ Parquet
• Columnar data format
• Created by Cloudera and X
• Contains row groups
• Data for each column is stored together in the same row group.
Each row group contains one or more chunks of data
• Includes metadata that describes the set of rows found in each
chunk
• Support efficient compression and encoding schemes
Databases
▪ Can store data and queries
▪ Relational databases
o Apply relational schema to data
o Commonly used to store and query structured data
o Tables represent entities
o Primary key – unique identifier
o Use the primary key to reference other tables. It enables a relational database to
be normalized
o Normalized – elimination of duplicate data values
o Use Structured Query Language (SQL)
o Normalization
▪ Use the least possible amount of storage
▪ Optimizes for updates, inserts, and deletes
▪ Non-relational databases
o No relational schema to data
o Often referred to as NoSQL database
o Common types
▪ Key-value databases
• Each record contains a unique key and associated value in any
format
▪ Document databases
• JSON
▪ Column family databases
• Tabular data – rows and columns
• Columns can be divided into groups (column-families)
▪ Graph databases
• Store as nodes with links to define relationships between them

Transactional Data Processing

▪ Primary function of business computing
▪ Records transactions
▪ A transaction is a small, discrete, unit of work
▪ Often referred to as Online Transactional Processing (OTP)
o Optimized for both read and write operations (CRUD)
o Enforced transactions that support ACID semantics
▪ Atomicity
Each transaction is treated as a single unit
▪ Consistency
Transactions can only take the data in the database from one valid state
to another
▪ Isolation
Concurrent transactions cannot interfere with one another
▪ Durability
When a transaction has been committed, it will remain committed.
o Typically used to support live applications – Line of Business (LOB) applications

Analytical Data Processing

▪ Typically using read-only (or read-mostly) systems that store vast volumes of historical
data or business metrics

Common Architecture

1. Data is Extracted, Transformed, and Loaded (ETL) into a data lake for analysis
2. Data is loaded into a schema of tables – typically in a Spark-based data lakehouse or a
data warehouse
3. Aggregated or loaded into an Online Analytical Processing (OLAP) model or cube
4. Data in the data lake, data warehouse, and analytical model can be queried to produce
reports, visualizations, and dashboards

Data Lake
A centralized repository that ingests and stores large volumes of data in its original form

Data scientists might work directly with data files in a data lake to explore and model data.

Data Warehouse
A centralized repository that stores structured data (database tables, Excel sheets) and semi-
structured data (XML files, webpages) for reporting and analysis

Data Analysts might query tables directly in the data warehouse to produce complex reports and
visualizations.

Data Lakehouse
A data management system that combines the benefits of data lakes and data warehouses

Online Analytical Processing (OLAP) model

- An aggregated type of data storage optimized for analytical workload.
- Enable to drill up/down to view aggregations at multiple hierarchical levels

Business users might consume pre-aggregated data in an analytical model through reports or
dashboards.

Data Roles
Role Responsibilities

Database Administrator - Manage databases

- Assign permissions to users
- Backup and restore data

Data Engineer - Manage infrastructure and processes

for data integration
- Apply data cleaning routines
- Identify data governance rules
- Implement pipelines to transfer and
transform data

Data Analyst - Explore and analyze data

- Create visualizations and charts

Data Services
▪ Azure SQL
o Azure SQL Database
▪ Fully managed PaaS database hosted in Azure
▪ PaaS
▪ For new cloud solutions or to migrate applications that have minimal
instance-level dependencies
▪ Available a Single Database or an Elastic Pool
• Single Database
o Charged per hour
o Able to specify a serverless config
• Elastic Pool
o Multiple databases can share the same resources (memory,
storage space, processing power through multiple tenancy)
o Useful if resource requirement varies over time
o Not fully compatible with on-premises SQL server
installations
o Azure SQL Managed Instance
▪ Hosted SQL Server instance with automated maintenance
▪ More administrative responsibility than Azure SQL DB
▪ PaaS
▪ For most cloud migration scenarios. Minimal changes to existing
applications
o Azure SQL VM
▪ A VM with an SQL Server installation
▪ Maximum configurability
▪ Full management responsibility
▪ IaaS
▪ To migrate or extend on-premises SQL Server solution and retain full
control of the server and database configuration
▪ Open-source databases in Azure
o Azure Database for MySQL
▪ Commonly used in Linux, Apache, MySQL, and PHP (LAMP) stack apps
▪ Automatic backups and point-in-time restore for the last 35 days
o Azure Database for MariaDB
▪ Created by original developers of MySQL
▪ Compatible with Oracle DB
▪ Automatic backups and point-in-time restore for the last 35 days
o Azure Database for PostgreSQL
▪ Hybrid relational-object DB
▪ Able to store custom data types
▪ Azure Cosmos DB
o Global-scale non-relational DB system (NoSQL)
o Support multiple APIs
o Able to store and manage data like JSON documents, key-value pairs, column
families, and graphs
▪ Azure Storage
o Store data in Blob containers – scalable, cost-effective storage for binary files
o Store data in File shares – network file shares
o Store data in Tables – key-value storage
o Data Engineers used to host data lakes – blob storage with a hierarchical
namespace that enables files to be organized in folders in a distributed system
▪ Azure Data Factory
o Able to define and schedule data pipelines to transfer and transform data
o Data Engineers used to build ETL solutions
▪ Microsoft Fabric
o Unified SaaS analytics platform based on the open and governed lakehouse
o Data ingestion and ETL
o Data lakehouse analytics
o Data warehouse analytics
o Data Science and Machine Learning
o Realtime analytics
o Data visualization
o Data governance and management
o AI-powered insights
o Apache Spark-based service
▪ Microsoft OneLake
o A single, unified, logical data lake for the whole organization
o Comes automatically with every Microsoft Fabric tenant
o Built on top of Azure Data Lake Storage Gen2
o Store data in Delta Parquet format
o Able to navigate data from Windows using OneLake file explorer
▪ Azure Databricks
o Azure-integrated version of popular Databricks platform
o Combines Apache Spark data processing platform with SQL database semantics
o Integrated Management Interface
o For large-scale data analytics
▪ Azure Stream Analytics
o Real-time stream processing engine
o Capture data stream from input, apply query to extract and manipulate data, and
write the results to output for analysis or further processing
▪ Azure Data Explorer
o Fully managed, standalone, big data analytics platform
o Offer high-performance querying of log and telemetry data
▪ Microsoft Purview
o Provide enterprise-wide data governance and discoverability
o Able to create a map of data and track data lines across multiple data sources
and systems
▪ Azure Synapse Analytics
o Enterprise analytics service
o Built on Spark
o Databricks and Spark pool run a large amount of data processing using Scala
o SQL + Spark + Data Explorer + Pipelines + ETL/ELT + Integration with Power BI,
CosmosDB, and Azure ML

▪ Data warehouse
o Use fact and dimension tables in a star/snowflake schema

Relational Data in Azure

SQL
SQL statement types

▪ Data Definition Language (DDL)

o CREATE – create a new object (e.g. table, view)
o ALTER
o DROP
o RENAME
▪ Data Control Language (DCL)
o GRANT – grant permission to perform specific actions
o DENY
o REVOKE
▪ Data Manipulation Language (DML)
o SELECT
o INSERT
o UPDATE
o DELETE

Data Objects
▪ View
o A virtual table based on the results of a SELECT query
▪ Stored Procedure
o Encapsulate programmatic logic in a database for actions
▪ Index
o Help search for data in a table
o Create a tree-based structure that the database system’s query optimizer can use
to find rows in the table quickly

Non-Relational Data in Azure

Azure Blob Storage
▪ Able to store massive amounts of unstructured data as binary large objects (blob)
▪ Application can read and write using Azure blob storage API
▪ Within a container, organize blobs in a hierarchy of virtual folders
▪ Can’t perform folder-level operations to control access or bulk operations

Types of blobs
▪ Block blobs
o Handle as a set of blocks
o Each block’s up to 4000 MiB
o Block blob can contain up to 190.7 TiB
o Maximum 5000 MiB
o Best for discrete, large, binary objects that change infrequently
▪ Page blobs
o A collection of fixed-size 512-byte pages
o Can hold up to 8 TB of data
o Azure uses page blobs to implement virtual disk storage for virtual machines.
▪ Append blobs
o Optimized Block blob to support append operations
o Can only add blocks to the end
o Doesn’t support updating or deleting existing blocks
o Each block’s size up to 4 MB
o Maximum 195 GB
Access tiers
▪ Hot
o Frequent access
o High performance
o Can be migrated to the Cool tier
o A few milliseconds for reading latency
▪ Cool
o Lower performance
o Lower storage cost than Hot tier
o Can be migrated back to the Hot tier
o A few milliseconds for reading latency
▪ Archive
o Lowest storage cost
o Increased latency to retrieve
o Stored in an offline state
o Can take hours for the data to become available
o To retrieve a blob, change the access tier to Hot or Cool. The blob will be
rehydrated.

Azure Data Lake Storage (ADLS) Gen1

▪ Historical data storage for analytical data lakes
▪ Big data analytical solution
▪ Work with structured, semi-structured, unstructured data stored in files

Azure Data Lake Storage (ADLS) Gen2

▪ Newer version of Gen1
▪ Integrated into Azure Storage
▪ Often used in the batch processing scenarios
▪ Azure Databricks can mount a distributed file system hosted in Azure Data Lake Store
Gen2
▪ To create a Gen2 file system, you must enable the Hierarchical Namespace option of the
Azure Storage account
▪ Enabling Hierarchical Namespace can’t revert to the flat namespace

Azure Files
▪ A way to create cloud-based network shares
▪ Enable to share up to 100 TB of data in a single storage account
▪ Max size of a single file is 1 TB
▪ Support up to 2000 concurrent connections per shared file
▪ 2 performance tiers
o Standard
▪ Hard disk-based
o Premium
▪ Solid-state disk-based (great throughput, higher rate)
▪ 2 common network file-sharing protocols
o Server Message Block (SMB)
o Network File System (NFS)
▪ Must use the premium tier
▪ Able to upload files to the file share, and grant access to remote users

Azure Tables
▪ NoSQL storage
▪ Key/value data items
▪ Item is represented by a row that contains columns for the data field
▪ Enable to store semi-structured data
▪ All rows must have a unique key (composed of a partition key and a row key)
▪ No concept of FK, relationships, stored procedures, views, or other objects that might be
found in a RDBMS
▪ Usually, denormalized
▪ Splits a table into partitions (grouping related rows, based on common property or
partition key)

Azure Cosmos DB
▪ NoSQL
▪ Fully managed, serverless distributed DB
▪ Support both relational and non-relational workloads
▪ Use indexes and partitioning for fast read and write performance
▪ Index – created and maintained automatically
▪ Each partition can grow up to 10 GB
▪ Can enable multi-region writes (globally distributed users can work with data in their
local replica)
▪ Highly suitable for the following scenarios
o IoT and telematics
o Retail and marketing
o Gaming
o Web and mobile applications

Supported APIs
▪ Azure Cosmos DB for NoSQL
o JSON document format
▪ Azure Cosmos DB for MongoDB
o Able to use MongoDB client libraries
▪ Azure Cosmos DB for PostgreSQL
▪ Azure Cosmos DB for Table
o Work with data in key-value tables (like Azure Table Storage)
▪ Azure Cosmos DB for Apache Cassandra
o Column-family storage structure
o Not mandatory for every row to have the same columns
▪ Azure Cosmos DB for Apache Gremlin
o Graph structure
o Entities – vertices that form nodes in a connected graph
o Nodes are connected by edges – relationships
Data Analytics in Azure
General data analytics architecture

1. Data ingestion and processing

a. Data from different sources is loaded into a data lake or a relational data
warehouse
b. ETL – data is cleaned, filtered, and restructured for analysis
c. 2 types of processing
i. Batch processing – static data
ii. Real-time processing – streaming data
2. Analytical data store
a. Relational data warehouses
b. File-system-based data lakes
c. A hybrid architecture that combines features of the data warehouse and data lake
(data lakehouses/lake databases)
3. Analytical data model
a. Aggregate data (aka cubes)
b. Encapsulates the relationships between data values and dimensional entities to
support drill-up/drill-down analysis
4. Data visualization
a. Consume data from the analytical model and from analytical stores to create the
reports, dashboards, and other visualizations

Batch Processing
▪ Multiple data records are collected and stored before being processed together in a
single operation
▪ Advantages
o Large volumes of data can be processed at a convenient time
o Can be scheduled to run
▪ Disadvantages
o Time delay between ingesting data and getting results
o All input data must be ready before a batch can be processed
o Data problems, errors, and program crashes bring the whole process to a halt

Stream Processing
▪ The source of data is constantly monitored and processed in real-time as new data
events occur
▪ Can aggregate data over time intervals

Real-time analytics services

▪ Azure Stream Analytics
o PaaS
o Can define streaming jobs that ingest data from a streaming source, apply a
perpetual query, and write results to an output
▪ Spark Structured Streaming
o Open-source library
o Enable to develop streaming solutions on Apache Spark-based service (Microsoft
Fabric and Azure Databricks)
▪ Microsoft Fabric

Sources for stream processing

▪ Azure Event Hubs
o Data ingestion service to manage queues of event data
▪ Azure IoT Hub
o Data ingestion service optimized to manage event data from IoT devices
▪ Azure Data Lake Store Gen 2
o Highly scalable storage service, often used in the batch processing scenarios
o Can be used as a source of streaming data
▪ Apache Kafka
o Open-source data ingestion solution
o Commonly used together with Apache Spark

Sinks for streaming processing

▪ Azure Event Hubs
▪ Azure Data Lake Store Gen 2, Microsoft OneLake, Azure blob storage
o Used to persist processed results as a file
▪ Azure SQL Database, Azure Databricks, Microsoft Fabric
o Used to persist processed results in a table for querying and analysis
▪ Microsoft Power BI
o Used to generate real-time data visualizations in reports and dashboards
Delta Lake
▪ Open-source storage layer that adds support for transactional consistency, schema
enforcement, and other common data warehousing features to data lake storage.
▪ Combined with Spark Structured Streaming is a good solution when needing to abstract
batch and stream processed data in a data lake behind a relational schema for SQL-
based querying and analysis

Microsoft Power BI
▪ Power BI Desktop
o Can import data from a wide range of data sources
o Combine and organize data in an analytics data model
o Create reports
▪ Power BI Service
o A cloud service in which reports can be published and interacted with by business
users
▪ Power BI Phone App
o Users can consume reports, dashboards, and apps in the Power BI service
through a web browser or mobile device

Dimensions – used to aggregate data

Fact tables – contain measures that are aggregated

Relationship – used to tie facts to dimensions

Keys – used to establish relationships

Choosing for data visualization

▪ Tables and text
o Tables – when numerous related values must be displayed
o Individual text in cards – when to show important figures or metrics

▪ Bar and column charts

o Good way to visually compare numeric values for discrete categories

▪ Line charts
o Used to compare categorized values
o When need to examine trends, often over time

▪ Pie charts
o To visually compare categorized values as proportions of a total

▪ Scatter plots
o To compare two numeric measures and identify a relationship or correlation
between them

▪ Maps
o To visually compare values for different geographic areas or locations
Azure Synapse Analytics

Linked Services
- to integrate commonly used data stores, processing platforms, and visualization tools

Synapse Analytics workspace

- an instance of the Synapse Analytics service

Two types of SQL pools

- Serverless pool (build-in) – to query file-based data in a data lake
- Dedicated SQL pool (custom) – host relational data warehouses

Data Explorer
- uses Kusto Query Language (KQL)

Azure Synapse Link

- near-real-time synchronization between operational data in Azure Cosmos DB, Azure
SQL Database, SQL Server, and Microsoft Power Platform Dataverse, and analytical
data storage

Resources
▪ Microsoft Certified: Azure Data Fundamentals
▪ Study guide for Exam DP-900: Microsoft Azure Data Fundamentals

Module 1 Notes
No ratings yet
Module 1 Notes
7 pages
Azure Data Fundamentals Guide
No ratings yet
Azure Data Fundamentals Guide
264 pages
Module 1
No ratings yet
Module 1
11 pages
DP900 Chapter1 Notes
No ratings yet
DP900 Chapter1 Notes
10 pages
DP 900 Data Fundamentals 1710103456
No ratings yet
DP 900 Data Fundamentals 1710103456
35 pages
Microsoft Azure Data Fundamentals
No ratings yet
Microsoft Azure Data Fundamentals
6 pages
Microsoft Azure Data Fundamentals
No ratings yet
Microsoft Azure Data Fundamentals
97 pages
Azure Data Concepts for Beginners
No ratings yet
Azure Data Concepts for Beginners
8 pages
Data Fundamentals DP-900 Guide
No ratings yet
Data Fundamentals DP-900 Guide
37 pages
Data Management & Analytics Guide
No ratings yet
Data Management & Analytics Guide
24 pages
Azure Data Solution Exam Prep
No ratings yet
Azure Data Solution Exam Prep
108 pages
Azure Data Fudamentals - DP 900
No ratings yet
Azure Data Fudamentals - DP 900
181 pages
Unit 1
No ratings yet
Unit 1
111 pages
Algorithm and Data Structure Lecture 1a
No ratings yet
Algorithm and Data Structure Lecture 1a
4 pages
Session - 6 - Complex Data Types
No ratings yet
Session - 6 - Complex Data Types
27 pages
Fundamental Concepts - 20.02.23 To 24.02.23 (Autosaved)
No ratings yet
Fundamental Concepts - 20.02.23 To 24.02.23 (Autosaved)
54 pages
Introduction
No ratings yet
Introduction
11 pages
Introduction and Scope of DBMS Subject
No ratings yet
Introduction and Scope of DBMS Subject
47 pages
Unit 2
No ratings yet
Unit 2
34 pages
Unit - Big - Data
No ratings yet
Unit - Big - Data
107 pages
CH 3 Data Modeling
No ratings yet
CH 3 Data Modeling
31 pages
ADBMS
No ratings yet
ADBMS
8 pages
Untitled Document.
No ratings yet
Untitled Document.
7 pages
Azure Data Engineering Complete Guide
No ratings yet
Azure Data Engineering Complete Guide
130 pages
CloudComputing DATABASE
No ratings yet
CloudComputing DATABASE
27 pages
Multimedia Database Report
No ratings yet
Multimedia Database Report
28 pages
BUAN6320 - Chapter 2 & 9
No ratings yet
BUAN6320 - Chapter 2 & 9
55 pages
Big Data & Analytics (CSE448) L1
No ratings yet
Big Data & Analytics (CSE448) L1
50 pages
Semi Structured Associative Database
No ratings yet
Semi Structured Associative Database
5 pages
Data Science Guide by Viktoriia Burko
No ratings yet
Data Science Guide by Viktoriia Burko
32 pages
Unit 4 DigitalData
No ratings yet
Unit 4 DigitalData
22 pages
Presentation1 DP900
No ratings yet
Presentation1 DP900
34 pages
Comprehensive Guide to DBMS Basics
No ratings yet
Comprehensive Guide to DBMS Basics
52 pages
CSC4404 Chap3
No ratings yet
CSC4404 Chap3
84 pages
Big Data & Analytics (CSE448) L1
No ratings yet
Big Data & Analytics (CSE448) L1
51 pages
BSC Unit 1
No ratings yet
BSC Unit 1
10 pages
DA (Unit 1)
No ratings yet
DA (Unit 1)
45 pages
SQL Notes
No ratings yet
SQL Notes
45 pages
SQL ANalyst by CT Taylor Part 4
No ratings yet
SQL ANalyst by CT Taylor Part 4
5 pages
Selected Topic: Data Modeling and Management: What Are You Thinking of When We Talk About ?
No ratings yet
Selected Topic: Data Modeling and Management: What Are You Thinking of When We Talk About ?
28 pages
Bda Unit 1 - Mam
No ratings yet
Bda Unit 1 - Mam
198 pages
Chapter Two Database System Architecture
No ratings yet
Chapter Two Database System Architecture
39 pages
Data Models
No ratings yet
Data Models
6 pages
Overview of Databases Languages and Architectures
No ratings yet
Overview of Databases Languages and Architectures
28 pages
Session 5-Azure Components
No ratings yet
Session 5-Azure Components
28 pages
Info125 Notes1
No ratings yet
Info125 Notes1
54 pages
Module 2
No ratings yet
Module 2
40 pages
Introduction to Data Structures
No ratings yet
Introduction to Data Structures
60 pages
Introduction To Data Models 677e35511a823
No ratings yet
Introduction To Data Models 677e35511a823
45 pages
DM Theory
No ratings yet
DM Theory
31 pages
Object Oriented Dabatases
No ratings yet
Object Oriented Dabatases
48 pages
NOSQL
No ratings yet
NOSQL
55 pages
Database Fundamentals by Natan Asrat
No ratings yet
Database Fundamentals by Natan Asrat
27 pages
Database System Concepts Guide
No ratings yet
Database System Concepts Guide
22 pages
Database System Concepts and Architecture
No ratings yet
Database System Concepts and Architecture
44 pages
SQL 2
No ratings yet
SQL 2
7 pages
Pyq DMDW
No ratings yet
Pyq DMDW
8 pages
Data Modeling
No ratings yet
Data Modeling
14 pages
Manual - Excel Masterclass 1 - DS7
No ratings yet
Manual - Excel Masterclass 1 - DS7
4 pages
Dbms Unit-5 Notes
100% (2)
Dbms Unit-5 Notes
27 pages
SAP OS/DB Migration Guide
No ratings yet
SAP OS/DB Migration Guide
11 pages
Karthik: Sr. AWS Data Engineer Expertise
No ratings yet
Karthik: Sr. AWS Data Engineer Expertise
8 pages
ASP.NET Lab Exercises Guide
No ratings yet
ASP.NET Lab Exercises Guide
9 pages
Answer Key (WBP)
No ratings yet
Answer Key (WBP)
14 pages
Transaction Management Database Recovery
No ratings yet
Transaction Management Database Recovery
19 pages
Index Creation
No ratings yet
Index Creation
30 pages
JMeter Setup & Testing Guide
No ratings yet
JMeter Setup & Testing Guide
6 pages
Unit 5 Java
No ratings yet
Unit 5 Java
23 pages
Model Question Paper of Data Base Management System
No ratings yet
Model Question Paper of Data Base Management System
2 pages
Data Wrangling
No ratings yet
Data Wrangling
18 pages
Fundamentals of Spatial Information Systems
No ratings yet
Fundamentals of Spatial Information Systems
12 pages
Informatica - Power Center - Lesson 2
No ratings yet
Informatica - Power Center - Lesson 2
31 pages
Core Extensions For Postgresql Performance Tuning
No ratings yet
Core Extensions For Postgresql Performance Tuning
4 pages
ABAP Standards for MIT Developers
No ratings yet
ABAP Standards for MIT Developers
36 pages
Harshit Project File
No ratings yet
Harshit Project File
10 pages
Domain Constraints in Database Management Systems
No ratings yet
Domain Constraints in Database Management Systems
2 pages
Discouraged Use of IAL Views
No ratings yet
Discouraged Use of IAL Views
4 pages
Exam Has 50 Questions
No ratings yet
Exam Has 50 Questions
20 pages
Power Bi Guidance
No ratings yet
Power Bi Guidance
1,292 pages
Scenario Overview - 20250525 - 233522 - 0000
No ratings yet
Scenario Overview - 20250525 - 233522 - 0000
3 pages
Programming in Python 4639304
50% (2)
Programming in Python 4639304
41 pages
DBMS Analysis for Engineering Students
No ratings yet
DBMS Analysis for Engineering Students
12 pages
Advancedchapter 2 2013
No ratings yet
Advancedchapter 2 2013
16 pages
Understanding Entities in DBMS
No ratings yet
Understanding Entities in DBMS
9 pages
Assignment 1A
No ratings yet
Assignment 1A
13 pages
Extended ER Model
No ratings yet
Extended ER Model
4 pages
5 6061978178002355489
No ratings yet
5 6061978178002355489
25 pages
Training Module Access
No ratings yet
Training Module Access
137 pages

Azure Data Fundamentals - Study Notes

Uploaded by

Azure Data Fundamentals - Study Notes

Uploaded by

Azure Data Fundamentals

January 24, 2025

▪ Delimited text files

Transactional Data Processing

Analytical Data Processing

Online Analytical Processing (OLAP) model

Database Administrator - Manage databases

Data Engineer - Manage infrastructure and processes

Data Analyst - Explore and analyze data

Relational Data in Azure

▪ Data Definition Language (DDL)

Non-Relational Data in Azure

Azure Data Lake Storage (ADLS) Gen1

Azure Data Lake Storage (ADLS) Gen2

1. Data ingestion and processing

Real-time analytics services

Sources for stream processing

Sinks for streaming processing

Dimensions – used to aggregate data

Fact tables – contain measures that are aggregated

Relationship – used to tie facts to dimensions

Keys – used to establish relationships

Choosing for data visualization

▪ Bar and column charts

Synapse Analytics workspace

Two types of SQL pools

Azure Synapse Link

You might also like