MANDATORY MODULES

PYTHON
  Introduction to Python Fundamentals & Setup
  Python Basics
  Control Flow
  Functions
  Data Structures
  File Handling
  Modules & Packages
  Object-Oriented Programming
  Functional Programming
  Advanced Python Features
  Python Best Practices
  Testing in Python
  Project: Command-line Application

SQL
  Introduction to SQL
  SQL Basics
  Database Fundamentals
  Database Design
  Database Performance
  Data Validation

DATA WAREHOUSE
  OLAP vs OLTP
  What is a Data Warehouse?
  Difference between Data Warehouse, Data Lake, and Data Mart
  Fact Tables
  Dimension Tables
  Slowly Changing Dimensions (SCDs)
  Types of SCDs
  Star Schema Design
  Snowflake Schema Design
  Data Warehousing Case Studies

PYSPARK & DATABRICKS
  Introduction
    Overview of Apache Spark
    Setting Up Spark Environment
    Installing Apache Spark
    Cloud Configuration for Spark
  Spark Core Concepts
    Spark Architecture Concepts
    Core Components of Spark
    Understanding Spark Execution Model
  RDDs (Resilient Distributed Datasets)
    Resilient Distributed Datasets (RDDs)
    Creating and Transforming RDDs
    Broadcast Variables and Accumulators
  Security
    Spark Security Best Practices
    User Authentication and Authorization in Spark
  Streaming & Real-Time Processing
    Spark Streaming
    Real-Time Data Processing with Spark
    Structured Streaming
    Stream Processing in Spark
  Performance & Optimization
    Performance Optimization Techniques
    Monitoring and Tuning Spark Applications
    Performance Metrics in Spark
  DataFrames & Spark SQL
    Spark DataFrame API
    Data Manipulation with DataFrames
    Spark SQL
    Executing SQL Queries in Spark
    Reading and Writing Different File Formats
    Working with CSV and JSON in Spark
    Parquet File Format and Optimizations
  Delta Lake Advanced Features
    Delta Lake Architecture: Transaction log (DeltaLog), Table versioning, and time travel
    Advanced Delta Lake Operations: Merge (upserts), delete, update, Change Data Feed (CDF), Optimize, Z-Ordering, Auto Compaction, and Vacuum
    Schema Management: Enforced schemas, schema evolution, handling corrupted and malformed records
    Concurrency and Isolation: Conflict resolution in concurrent writes
  Production-Grade Data Pipelines
    Job Scheduling: Databricks Jobs, Workflows, Task orchestration, and multi-task jobs
    Error Handling and Recovery: Retry policies, handling nulls, corrupt records, dead letters
    Monitoring and Logging: Job runs, alerts, notebook logs
  Unity Catalog
    Data Governance Concepts
    Catalogs, Schemas, Tables, Views
    Access Controls and Audits
    Managing Privileges (GRANT / REVOKE)

[Roadmap diagram, "Data Engineering Path". Labels: Python; SQL & Data Warehouse; PySpark & Databricks; Cloud Specialisation (specialise in any one cloud); Cloud Data Engineering Path; Job]

A Roadmap to a Successful Journey in Data Engineering
+91 900-038-4889 / +91 789-351-3124 www.levelupedu.net / levelupgenai.com

SPECIALISATION MODULES
CHOOSE ANY ONE CLOUD

AZURE DATA ENGINEERING
  Introduction to Cloud Computing and Microsoft Azure
  Azure Storage and Operations
  Azure Data Lake
  Azure Data Factory
  Azure Databricks
  Azure Cosmos DB
  Azure Synapse
  Azure Security
  Azure Stream Analytics
  Azure Service Fabric
  Azure Logic Apps
  Azure Key Vault
  Azure CI/CD
  Serving Layer Design and Implementation
  Work on Data using Azure Synapse Analytics (SQL Pools)
  Work on Data using Azure Synapse Analytics (Apache Spark)
  Data Exploration and Transformation in Azure Databricks
  Ingest and Load Data into the Data Warehouse
  Transform Data with Azure Data Factory or Azure Synapse Pipelines
  Optimize Query Performance with Dedicated SQL Pools in Azure Synapse Analytics
  Cosmos DB
  End-to-End Security with Azure Synapse Analytics
  Real-Time Stream Processing with Azure Stream Analytics
  Create a Stream Processing Solution with Event Hubs and Azure Databricks
  Power BI Using Its Integration with Azure Synapse Analytics
  Perform Integrated Machine Learning Processes in Azure Synapse Analytics
  Airflow
  Snowflake

AWS DATA ENGINEERING
  Introduction to Cloud Computing
  Cloud Computing Deployment Models
  Types of Cloud Computing Services
  AWS Fundamentals
  AWS Cloud Architecture Design Principles
  Databases in AWS
  Data Warehousing in AWS
  AWS Services Overview
  AWS Step Functions
  Amazon Kinesis and Data Analytics Services
  Amazon Kinesis Data Firehose
  Amazon SQS (Simple Queue Service)
  IoT in AWS and Big Data (AWS IoT Greengrass)
  AWS Data Pipeline
  AWS Big Data Storage Services (S3, Glacier, Snowball)
  NoSQL Databases (DynamoDB)
  Amazon Redshift Serverless & ML Integration
  AWS DMS & Aurora
  Amazon Athena
  AWS Big Data Processing Services (EMR, Hadoop, Hive, HBase, Spark, Presto)
  AWS Lambda in Big Data Ecosystem
  Amazon Redshift Deep Dive
  Amazon Machine Learning & Amazon SageMaker
  Amazon Elasticsearch & Logstash
  RStudio on AWS
  AWS Glue – ETL and Data Catalog
  Amazon QuickSight – Data Visualization
  Other Visualization Tools (e.g., Kibana)
  AWS Big Data Security (EMR, Redshift, KMS, STS, CloudTrail)
  Real-Time Analytics on Streaming Data
  Batch Time Analysis of Transactional Data

GCP DATA ENGINEERING
  Introduction to GCP and Data Engineering
  Google Cloud Storage
  Google Cloud Storage (Continued)
  Google BigQuery Introduction
  BigQuery Data Manipulation
  BigQuery Advanced Queries
  BigQuery Performance Optimization
  Google Cloud Dataflow Introduction
  Dataflow Pipeline Development
  Advanced Dataflow Concepts
  Google Cloud Pub/Sub
  Pub/Sub Advanced Topics
  Google Cloud Dataproc Introduction
  Spark on Dataproc
  Dataproc Advanced Topics
  Google Cloud Data Fusion Introduction
  Data Fusion Transformations
  Data Fusion Orchestration
  Data Governance in GCP
  Security in GCP
  Machine Learning on GCP
  BigQuery ML
  Data Visualization with Data Studio
  Advanced Data Studio Techniques
  DataOps Concepts
  Real-time Data Processing
  Data Lake Architecture
  GCP Networking
  Advanced GCP Networking
  GCP Logging and Monitoring
  Cost Management in GCP
  Data Migration Strategies
  Hybrid and Multi-Cloud Architectures
  BigQuery Data Security
  Building ETL Pipelines
  GCP Certifications and Career Pathways
  Data Ethics and Privacy
  GCP Case Studies
  Capstone Project Introduction
  Capstone Project Development
  Capstone Project Review
  Capstone Project Presentation
  Final Exam Preparation
  Final Exam
  Course Conclusion and Next Steps