NANODEGREE PROGRAM SYLLABUS

Data Streaming

Need Help? Speak with an Advisor: www.udacity.com/advisor


Overview
The ultimate goal of the Data Streaming Nanodegree program is to provide students with the latest skills to process data in real-time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming. A graduate of this program will be able to:

• Understand the components of data streaming systems. Ingest data in real-time using Apache Kafka and Spark, and run analysis.
• Use the Faust Stream Processing Python library to build a real-time stream-based application. Compile real-time data and run live analytics, as well as draw insights from reports generated by the streaming console.
• Learn about the Kafka ecosystem, and the types of problems each solution is designed to solve. Use the Confluent Kafka Python library for simple topic management, production, and consumption.
• Explain the components of Spark Streaming (architecture and API), integrate Apache Spark Structured Streaming and Apache Kafka, manipulate data using Spark, and understand the statistical report generated by the Structured Streaming console.

This program comprises 2 courses and 2 projects. Each project you build will be an opportunity to demonstrate what you've learned in the course, and will demonstrate to potential employers that you have skills in these areas.

Prerequisite Knowledge: Intermediate SQL, Python, and experience with ETL. Basic familiarity with
traditional batch processing and traditional service architectures is desired, but not required.

Estimated Time: 2 months at 5-10 hrs/week

Prerequisites: Intermediate SQL, Python, and experience with ETL

Flexible Learning: Self-paced, so you can learn on the schedule that works best for you.

Need Help? Discuss this program with an enrollment advisor at udacity.com/advisor.



Course 1: Foundations of Data Streaming,
and SQL & Data Modeling for the Web
The goal of this course is to build working knowledge of the tools taught throughout, including Kafka consumers, producers, and topics; Kafka Connect sources and sinks; Kafka REST Proxy for producing data over REST; data schemas with JSON and Apache Avro/Schema Registry; stream processing with the Faust Python library; and stream processing with KSQL.

Course Project: Optimize Chicago Bus and Train Availability Using Kafka

For your first project, you'll be streaming public transit status using Kafka and the Kafka ecosystem to build a stream processing application that shows the status of trains in real-time. Based on the skills you learn, you will be able to optimize the availability of buses and trains in Chicago based on streaming data. You will learn how to have your own Python code produce events, use REST Proxy to send events over HTTP, and use Kafka Connect to collect data from a Postgres database to produce streaming data from a number of sources into Kafka. Then, you will use KSQL to combine related data models into a single topic ready for consumption by the downstream Python applications, and complete a simple Python application that ingests data from the Kafka topics for analysis. Finally, you will use the Faust Python Stream Processing library to further transform train station data into a more streamlined representation: using stateful processing, this library will show whether passenger volume is increasing, decreasing, or staying steady.
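
To give a concrete flavor of the project's first step, here is a minimal, illustrative sketch (not the project starter code) of producing a JSON-encoded arrival event with the confluent-kafka Python library. The broker address, topic name, and event fields are assumptions made for this example.

    import json
    from confluent_kafka import Producer

    # Assumed local broker and illustrative topic name; the project defines its own.
    producer = Producer({"bootstrap.servers": "localhost:9092"})

    event = {"station": "clark_and_lake", "train_line": "blue", "status": "arrived"}

    # produce() is asynchronous; flush() blocks until outstanding messages are delivered.
    producer.produce("chicago.transit.arrivals", value=json.dumps(event))
    producer.flush()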

LEARNING OUTCOMES

LESSON ONE: Introduction to Stream Processing
• Describe and explain streaming data stores and stream processing
• Describe and explain real-world usages of stream processing
• Describe and explain append-only logs, events, and how stream processing differs from batch processing
• Utilize Kafka CLI tools and the Confluent Kafka Python library for topic management, production, and consumption (a consumption sketch follows this list)
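
For instance, a bare-bones consumption loop with the Confluent Kafka Python library might look like the following sketch; the broker address, consumer group, and topic name are placeholders.

    from confluent_kafka import Consumer

    # Placeholder broker, consumer group, and topic.
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "example-group",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["example-topic"])

    try:
        while True:
            message = consumer.poll(1.0)  # wait up to one second for a message
            if message is None:
                continue
            if message.error():
                print(f"consumer error: {message.error()}")
                continue
            print(message.value().decode("utf-8"))
    finally:
        consumer.close()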

LESSON TWO: Apache Kafka
• Understand Kafka architecture, topics, and configuration
• Utilize the Confluent Kafka Python library to create and configure topics
• Understand Kafka producers, consumers, and configuration
• Utilize the Confluent Kafka Python library to create and configure producers
• Utilize the Confluent Kafka Python library to create topics, manage configuration, and manage offsets (see the topic-creation sketch below)
• Describe and explain user privacy considerations
• Describe and explain performance monitoring for consumers, producers, and the cluster itself
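
As a taste of programmatic topic management, here is a hedged sketch that creates a topic with the library's AdminClient; the broker address, topic name, and configuration values are illustrative only.

    from confluent_kafka.admin import AdminClient, NewTopic

    # Placeholder broker and topic settings.
    client = AdminClient({"bootstrap.servers": "localhost:9092"})

    topic = NewTopic(
        "example-topic",
        num_partitions=3,
        replication_factor=1,
        config={"cleanup.policy": "delete", "retention.ms": "86400000"},
    )

    # create_topics() returns a dict of topic name -> future; result() raises on failure.
    for name, future in client.create_topics([topic]).items():
        future.result()
        print(f"created topic {name}")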

LESSON THREE: Data Schemas and Apache Avro
• Understand what a data schema is and the value it provides
• Understand what Apache Avro is and what value it provides
• Utilize AvroProducer and AvroConsumer in Confluent Kafka Python (see the sketch below)
• Describe and explain schema evolution and data compatibility types
• Utilize Schema Registry components in Confluent Kafka Python to manage compatibility
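
As an illustration of the AvroProducer outcome above, the sketch below registers an assumed value schema with the Schema Registry and produces a conforming record; the schema, broker address, registry URL, and topic are placeholders.

    from confluent_kafka import avro
    from confluent_kafka.avro import AvroProducer

    # Illustrative Avro value schema for a purchase event.
    value_schema = avro.loads("""
    {
      "type": "record",
      "name": "Purchase",
      "fields": [
        {"name": "username", "type": "string"},
        {"name": "amount",   "type": "int"}
      ]
    }
    """)

    producer = AvroProducer(
        {
            "bootstrap.servers": "localhost:9092",
            "schema.registry.url": "http://localhost:8081",
        },
        default_value_schema=value_schema,
    )

    # The value dict must conform to the schema registered above.
    producer.produce(topic="example.purchases", value={"username": "alice", "amount": 1200})
    producer.flush()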

LESSON FOUR: Kafka Connect and REST Proxy
• Describe and explain what problem Kafka Connect solves, and where it would be more appropriate than a traditional consumer
• Describe and explain common connectors and how they work
• Utilize the Kafka Connect FileStream and JDBC source and sink connectors
• Describe and explain what problem Kafka REST Proxy solves, and where it would be more appropriate than alternatives
• Describe, explain, and utilize the REST Proxy metadata and administrative APIs
• Describe and explain the REST Proxy consumer APIs
• Utilize the REST Proxy consumer, subscription, and offset APIs
• Describe, explain, and utilize the REST Proxy producer APIs (see the sketch below)
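
For the REST Proxy producer APIs, a minimal sketch using the requests library is shown below; the proxy address and topic name are assumptions, and the v2 JSON content type is the one covered in the lesson.

    import requests

    # Assumed local REST Proxy address and illustrative topic name.
    REST_PROXY_URL = "http://localhost:8082"

    payload = {"records": [{"value": {"station": "clark_and_lake", "status": "arrived"}}]}

    # POSTing to /topics/<name> produces the records in the request body.
    response = requests.post(
        f"{REST_PROXY_URL}/topics/example-topic",
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        json=payload,
    )
    response.raise_for_status()
    print(response.json())  # reports the partition and offset assigned to each record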

LESSON FIVE: Stream Processing Fundamentals
• Describe and explain common scenarios for stream processing, and where you would use stream versus batch processing
• Describe and explain common stream processing strategies
• Describe and explain how time and windowing work in stream processing
• Describe and explain what a stream versus a table is in stream processing, and where you would use one over the other
• Describe and explain how data storage works in stream processing applications and why it is needed

LESSON SIX: Stream Processing with Faust
• Describe and explain the Faust Stream Processing Python library, and how it fits into the ecosystem relative to solutions like Kafka Streams
• Describe and explain Faust stream-based processing
• Utilize Faust to create a stream-based application
• Describe and explain how Faust table-based processing works
• Utilize Faust to create a table-based application (see the sketch below)
• Describe and explain Faust processors and function usage
• Utilize Faust processors and functions
• Describe and explain Faust serialization and deserialization
• Utilize Faust serialization and deserialization
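
The sketch below shows the stream-and-table pattern in Faust under assumed names: the app id, broker, topic, and record fields are illustrative, and the table keeps a running rider count per station using stateful processing.

    import faust

    # Illustrative app id, broker, and topic names.
    app = faust.App("station-aggregator", broker="kafka://localhost:9092")


    class TurnstileEvent(faust.Record):
        station: str
        riders: int


    events_topic = app.topic("example.turnstile.events", value_type=TurnstileEvent)

    # A changelog-backed table holding a running rider count per station.
    rider_counts = app.Table("rider-counts", default=int)


    @app.agent(events_topic)
    async def count_riders(events):
        async for event in events:
            rider_counts[event.station] += event.riders


    if __name__ == "__main__":
        app.main()

Run as a worker (for example, python app.py worker) and the agent consumes the topic while the table maintains its state in a changelog topic.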

LESSON SEVEN: KSQL
• Describe and explain how KSQL fits into the Kafka ecosystem, and why you would choose it over a stream processing application built from scratch
• Describe and explain KSQL architecture
• Describe and explain how to create KSQL streams and tables from topics, and understand the importance of KEY and schema transformations
• Utilize KSQL to create tables and streams (see the sketch below)
• Describe and explain KSQL selection syntax
• Utilize KSQL syntax to query tables and streams
• Describe and explain KSQL windowing
• Utilize KSQL windowing within the context of table analysis
• Describe and explain KSQL grouping and aggregates
• Utilize KSQL grouping and aggregates within queries
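
As one way to exercise these outcomes programmatically, the sketch below submits assumed KSQL statements to a local KSQL server over its REST interface; the server address, topic, and stream/table names are illustrative.

    import requests

    # Assumed local KSQL server address.
    KSQL_URL = "http://localhost:8088"

    statements = """
    CREATE STREAM turnstile (station_id BIGINT, station_name VARCHAR)
        WITH (KAFKA_TOPIC='example.turnstile', VALUE_FORMAT='JSON');

    CREATE TABLE turnstile_summary AS
        SELECT station_id, COUNT(*) AS rides
        FROM turnstile
        GROUP BY station_id;
    """

    response = requests.post(
        f"{KSQL_URL}/ksql",
        headers={"Content-Type": "application/vnd.ksql.v1+json"},
        json={
            "ksql": statements,
            "streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"},
        },
    )
    response.raise_for_status()
    print(response.json())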



Course 2: Streaming API Development and
Documentation
The goal of this course is to grow your expertise in the components of streaming data systems, and to build a real-time analytics application. Specifically, you will be able to: explain the components of Spark Streaming (architecture and API), ingest streaming data into Apache Spark Structured Streaming and perform analysis, integrate Apache Spark Structured Streaming and Apache Kafka, and understand the statistical report generated by the Structured Streaming console.

Course Project: Analyze San Francisco Crime Rate with Apache Spark Streaming

In this project, you will analyze a real-world dataset of the San Francisco crime rate, extracted from Kaggle, to provide statistical analysis using Apache Spark Structured Streaming. You will be provided with a dataset, and will use a local Kafka server to produce and ingest data through Spark Structured Streaming. Then, you will use various APIs to create and execute processing logic. You will create an ETL pipeline that produces Kafka data and ingests the data through Spark. Finally, you will generate a meaningful statistical report from the data.
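
A hedged sketch of the core of such a pipeline is shown below: it reads an assumed Kafka topic as a streaming DataFrame, parses a simplified two-field JSON schema, aggregates by crime type, and writes each micro-batch to the console. The topic name and schema are illustrative, and running it requires the Spark-Kafka integration package (spark-sql-kafka) on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType

    # Simplified, illustrative schema; the real dataset has more columns.
    schema = StructType([
        StructField("original_crime_type_name", StringType(), True),
        StructField("disposition", StringType(), True),
    ])

    spark = SparkSession.builder.appName("sf-crime-stats").getOrCreate()

    # Read the (assumed) Kafka topic as a streaming DataFrame and parse the JSON payload.
    raw_stream = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "sf.crime.calls")
        .option("startingOffsets", "earliest")
        .load()
    )

    calls = (
        raw_stream.selectExpr("CAST(value AS STRING) AS value")
        .select(from_json(col("value"), schema).alias("call"))
        .select("call.*")
    )

    # Count calls by crime type and print each micro-batch to the console.
    counts = calls.groupBy("original_crime_type_name").count()

    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()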

LEARNING OUTCOMES

LESSON ONE: The Power of Spark
• Describe and explain the big data ecosystem
• Describe and explain the hardware behind big data
• Describe and explain distributed systems
• Understand when to use Spark and when not to use it

LESSON TWO: Data Wrangling with Spark
• Manipulate data using functional programming
• Manipulate data using maps and lambda functions
• Read and write data with Spark SQL and Spark DataFrames
• Manipulate data using Spark for ETL purposes

LESSON THREE: Debugging and Optimization
• Set up a Spark cluster on AWS (transition from local to distributed mode)
• Upload and retrieve data on the AWS Cloud using Jupyter Notebook
• Submit data using a Python notebook
• Read and write data using distributed data storage, Amazon S3, and HDFS
• Diagnose and correct errors, and optimize code using the Spark Web UI and accumulators

LESSON FOUR: Introduction to Spark Streaming
• Learn Apache Spark's fundamental building blocks (RDD/DataFrame/Dataset)
• Review action/transformation functions and learn how these concepts apply in streaming

LESSON FIVE: Structured Streaming APIs
• Understand the concept of lazy evaluation
• Describe different join types between streaming and static DataFrames (see the sketch below)
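
As a toy illustration of joining a streaming DataFrame with a static one, the sketch below uses Spark's built-in rate source so it runs without Kafka; in the project you would instead join the parsed Kafka stream against a static lookup file. All names here are illustrative.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import expr

    spark = SparkSession.builder.appName("stream-static-join").getOrCreate()

    # Static DataFrame: a small in-memory lookup table.
    labels = spark.createDataFrame([(0, "even"), (1, "odd")], ["parity", "label"])

    # Streaming DataFrame: the built-in rate source emits (timestamp, value) rows.
    rates = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

    # Inner join between the streaming and static DataFrames on a derived key.
    joined = rates.withColumn("parity", expr("value % 2")).join(labels, on="parity")

    query = joined.writeStream.outputMode("append").format("console").start()
    query.awaitTermination()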

LESSON SIX: Integration of Spark Streaming and Kafka
• Describe the Kafka source provider
• Describe Kafka offset management
• Describe triggers in Spark Streaming (see the sketch below)
• Describe the progress report in the Spark console and use it to analyze batches from Kafka
• Understand sample business architectures and learn how to tune them for best performance from examples
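
A brief, hedged sketch of how triggers and Kafka offsets surface in code follows; the broker, topic, rate cap, and checkpoint path are placeholders, and the Spark-Kafka integration package is assumed to be available.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-trigger-example").getOrCreate()

    stream = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "sf.crime.calls")
        .option("startingOffsets", "earliest")   # where to start when no checkpoint exists
        .option("maxOffsetsPerTrigger", 200)     # cap the records pulled per micro-batch
        .load()
    )

    # A processing-time trigger fires a micro-batch every 10 seconds; the checkpoint
    # location is where Spark persists consumed offsets between runs.
    query = (
        stream.selectExpr("CAST(value AS STRING) AS value")
        .writeStream
        .format("console")
        .trigger(processingTime="10 seconds")
        .option("checkpointLocation", "/tmp/sf-crime-checkpoint")
        .start()
    )
    query.awaitTermination()

While the query runs, query.lastProgress exposes the most recent per-batch progress report (input rates, batch duration, and Kafka offsets).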



Our Classroom Experience
REAL-WORLD PROJECTS
Build your skills through industry-relevant projects. Get
personalized feedback from our network of 900+ project
reviewers. Our simple interface makes it easy to submit
your projects as often as you need and receive unlimited
feedback on your work.

KNOWLEDGE
Find answers to your questions with Knowledge, our
proprietary wiki. Search questions asked by other students
and discover in real-time how to solve the challenges that
you encounter.

STUDENT HUB
Leverage the power of community through a simple, yet
powerful chat interface built within the classroom. Use
Student Hub to connect with your technical mentor and
fellow students in your Nanodegree program.

WORKSPACES
See your code in action. Check the output and quality of your code by running it on workspaces that are a part of our classroom.

QUIZZES
Check your understanding of concepts learned in the
program by answering simple and auto-graded quizzes.
Easily go back to the lessons to brush up on concepts
anytime you get an answer wrong.

CUSTOM STUDY PLANS


Work with a mentor to create a custom study plan to suit
your personal needs. Use this plan to keep track of your
progress toward your goal.

PROGRESS TRACKER
Stay on track to complete your Nanodegree program with
useful milestone reminders.



Learn with the Best

Ben Goldberg
STAFF ENGINEER AT SPOTHERO
In his career as an engineer, Ben Goldberg has worked in fields ranging from Computer Vision to Natural Language Processing. At SpotHero, he founded and built out their Data Engineering team, using Airflow as one of the key technologies.

Judit Lantos
SENIOR DATA ENGINEER AT NETFLIX
Currently, Judit is a Senior Data Engineer at Netflix. Formerly a Data Engineer at Split, where she worked on the statistical engine of their full-stack experimentation platform, she has also been an instructor at Insight Data Science, helping software engineers and academic coders transition to DE roles.

David Drummond
VP OF ENGINEERING AT INSIGHT
David is VP of Engineering at Insight, where he enjoys breaking down difficult concepts and helping others learn data engineering. David has a PhD in Physics from UC Riverside.

Jillian Kim
SENIOR DATA ENGINEER AT CHANGE HEALTHCARE
Jillian has worked in roles from building data analytics platforms to machine learning pipelines. Previously, she was a research engineer at Samsung focused on data analytics and ML, and now leads building pipelines at scale as a Senior Data Engineer at Change Healthcare.



All Our Nanodegree Programs Include:

EXPERIENCED PROJECT REVIEWERS


REVIEWER SERVICES

• Personalized feedback
• Unlimited submissions and feedback loops
• Practical tips and industry best practices
• Additional suggested resources to improve

INDIVIDUAL 1-ON-1 MENTORSHIP


MENTORSHIP SERVICES

• 6+ hrs of mentor support per month


• Weekly 1-on-1 personal mentor calls
• 1-on-1 mentor chats anytime
• Custom weekly learning plan focused on your
progress, goals and availability
• Daily progress tracking
• Proactive check-ins with you
• Mentors are compensated based on your
progress and success

PERSONAL CAREER SERVICES


CAREER COACHING

• Personal assistance in your job search


• Monthly 1-on-1 calls
• Personalized feedback and career guidance
• Access Udacity Talent Program used by our
network of employers to source candidates
• Advice on negotiating job offers
• Interview preparation
• Resume services
• GitHub portfolio review
• LinkedIn profile optimization



Frequently Asked Questions
PROGRAM OVERVIEW

WHY SHOULD I ENROLL?


As businesses increasingly rely on applications that produce and process data in
real-time, data streaming is an increasingly in-demand skill for data engineers.
The Data Streaming Nanodegree program will prepare you for the cutting edge of
data engineering as more and more companies look to derive live insights from
data at scale.

Students will learn how to process data in real-time by building fluency in modern
data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka
Streaming.

You’ll start by understanding the components of data streaming systems. You’ll


then build a real-time analytics application. You will also compile data and run
analytics, as well as draw insights from reports generated by the streaming
console.

WHAT JOBS WILL THIS PROGRAM PREPARE ME FOR?


This program is designed to upskill experienced Software Engineers and Data
Engineers to learn the latest advancements in data processing, sending data
records continuously to support live updating.

The projects in the Data Streaming Nanodegree program will prepare you to
develop systems and applications capable of interpreting data in real-time,
and position you for roles in all industries that require live data processing
for functions including big data, cloud computing, web personalization, fraud
detection, sensor monitoring, anomaly detection, supply chain maintenance,
location-based services, and much more.

HOW DO I KNOW IF THIS PROGRAM IS RIGHT FOR ME?


This program is intended for software engineers looking to build real-time data
processing proficiency, as well as data engineers looking to enhance their existing
skill set with the next advancement in data engineering.

ENROLLMENT AND ADMISSION

DO I NEED TO APPLY? WHAT ARE THE ADMISSION CRITERIA?


There is no application. This Nanodegree program accepts everyone,
regardless of experience and specific background.

WHAT ARE THE PREREQUISITES FOR ENROLLMENT?


The Data Streaming Nanodegree program is designed for students with
intermediate Python and SQL skills, as well as experience with ETL.

Basic familiarity with traditional batch processing and basic conceptual
familiarity with traditional service architectures is desired, but not required.

IF I DO NOT MEET THE REQUIREMENTS TO ENROLL, WHAT SHOULD I DO?


Udacity’s Programming for Data Science with Python Nanodegree program is
great preparation for the Data Engineer Nanodegree program. You’ll learn to
code with Python and SQL.

Similarly, the Data Engineering Nanodegree program is great preparation for the
Data Streaming Nanodegree program.

TUITION AND TERM OF PROGRAM

HOW IS THIS NANODEGREE PROGRAM STRUCTURED?


The Data Streaming Nanodegree program comprises content and curriculum to support two projects. We estimate that students can complete the program in two months, working five to ten hours per week.

Each project will be reviewed by the Udacity reviewer network. Feedback will be
provided, and if you do not pass the project, you will be asked to resubmit the
project until it passes.

HOW LONG IS THIS NANODEGREE PROGRAM?


Access to this Nanodegree program runs for the length of time specified in
the payment card on the Nanodegree program overview page. If you do not
graduate within that time period, you will continue learning with month to
month payments. See the Terms of Use for other policies around the terms of
access to our Nanodegree programs.

CAN I SWITCH MY START DATE? CAN I GET A REFUND?


Please see the Udacity Nanodegree program FAQs for policies on enrollment in
our programs.

SOFTWARE AND HARDWARE

WHAT SOFTWARE AND VERSIONS WILL I NEED IN THIS PROGRAM?


There are no software and version requirements to complete this Nanodegree
program. All coursework and projects can be completed via Student Workspaces
in the Udacity online classroom.
