Data Streaming
• Understand the components of data streaming systems. Ingest data in real time using Apache Kafka
and Spark, and run analysis on it.
• Use the Faust stream processing Python library to build a real-time stream-based application. Compile
real-time data, run live analytics, and draw insights from reports generated by the streaming
console.
• Learn about the Kafka ecosystem, and the types of problems each solution is designed to solve. Use
the Confluent Kafka Python library for simple topic management, production, and consumption.
• Explain the components of Spark Streaming (architecture and API), integrate Apache Spark Structured
Streaming and Apache Kafka, manipulate data using Spark, and understand the statistical report
generated by the Structured Streaming console.
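As a rough illustration of the kind of live analytics named in the outcomes above, the following is a minimal sketch of a tumbling-window count written in plain Python. It does not use Kafka, Faust, or Spark; the event shape (timestamp, key), the function name tumbling_window_counts, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (timestamp in seconds, event key).
Event = Tuple[float, str]


def tumbling_window_counts(
    events: Iterable[Event], window_seconds: float
) -> Dict[int, Counter]:
    """Count events per key in fixed, non-overlapping time windows."""
    windows: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, key in events:
        # Window 0 covers [0, window_seconds), window 1 the next span, and so on.
        window_index = int(timestamp // window_seconds)
        windows[window_index][key] += 1
    return dict(windows)


if __name__ == "__main__":
    # Made-up sample stream, for illustration only.
    sample = [(0.5, "click"), (1.2, "view"), (9.9, "click"),
              (10.0, "click"), (14.3, "view")]
    for index, counts in sorted(tumbling_window_counts(sample, 10.0).items()):
        print(index, dict(counts))
```

In a real streaming system the events would arrive continuously from a topic rather than from a list, and the framework would manage window state and late-arriving data; this sketch only shows the grouping idea.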
This program comprises two courses and two projects. Each project you build is an opportunity to
demonstrate what you’ve learned in the course and to show potential employers that you have
skills in these areas.
Prerequisite Knowledge: Intermediate SQL, Python, and experience with ETL. Basic familiarity with
traditional batch processing and traditional service architectures is desired, but not required.
LEARNING OUTCOMES
KNOWLEDGE
Find answers to your questions with Knowledge, our
proprietary wiki. Search questions asked by other students
and discover in real-time how to solve the challenges that
you encounter.
STUDENT HUB
Leverage the power of community through a simple, yet
powerful chat interface built within the classroom. Use
Student Hub to connect with your technical mentor and
fellow students in your Nanodegree program.
WORKSPACES
See your code in action. Check the output and quality of
your code by running it in workspaces that are part
of our classroom.
QUIZZES
Check your understanding of concepts learned in the
program by answering simple, auto-graded quizzes.
Easily go back to the lessons to brush up on concepts
anytime you get an answer wrong.
PROGRESS TRACKER
Stay on track to complete your Nanodegree program with
useful milestone reminders.
• Personalized feedback
• Unlimited submissions and feedback loops
• Practical tips and industry best practices
• Additional suggested resources to improve
Students will learn how to process data in real time by building fluency in modern
data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka
Streaming.
The projects in the Data Streaming Nanodegree program will prepare you to
develop systems and applications capable of interpreting data in real time,
and position you for roles in all industries that require live data processing
for functions including big data, cloud computing, web personalization, fraud
detection, sensor monitoring, anomaly detection, supply chain maintenance,
location-based services, and much more.
Similarly, the Data Engineering Nanodegree program is great preparation for the
Data Streaming Nanodegree program.
Each project will be reviewed by the Udacity reviewer network. Feedback will be
provided, and if you do not pass the project, you will be asked to resubmit the
project until it passes.
SOFTWARE AND HARDWARE