IBM DS0101EN
What is Data Science?
4/28/2024 1
Instructor Name
• MACHINE LEARNING INTERN, IBM EGYPT
• MACHINE LEARNING ENGINEER at ABC
• MACHINE LEARNING ENGINEER at ABD
• PYTHON AND DATA SCIENCE INSTRUCTOR
4/28/2024 2
• In this course, you’ll learn how to:
⚬ Define data science and its importance in today’s data-driven world.
⚬ Describe the various paths that can lead to a career in data science.
⚬ Summarize advice given by seasoned data science professionals to
data scientists who are just starting out.
• Course Modules:
⚬ Define data science & What Data Scientists Do
⚬ Data science Topics
⚬ Applications and Careers in Data Science
⚬ Data Literacy for Data Science
4/28/2024 Course Overview 3
Module I
Define data
science &
What Data
Scientists Do
4/28/2024 4
Module I
Define data science & What
Data Scientists Do
What is Data Science?
4/28/2024 5
Understanding Data Science
• Data Science is a continuous process of utilizing data to gain
insights.
• It involves validating hypotheses or models using available data.
• The goal is to uncover trends and insights hidden within datasets.
4/28/2024 6
Understanding Data Science
• Data is transformed into compelling narratives through storytelling.
• These insights drive strategic decision-making for organizations.
• It encompasses extracting and analyzing data in structured and
unstructured forms.
4/28/2024 7
Understanding Data Science
4/28/2024 8
The Essence of Data Science
• Data science explores, manipulates, and analyzes data to find answers.
• Just as other sciences study specific subjects, data science focuses on
understanding data.
• Today’s world offers an abundance of data, algorithms, and accessible tools.
• The affordability and accessibility of these resources make data science more
relevant than ever.
4/28/2024 What is Data Science? 9
The Essence of Data Science
• It's a time of unprecedented opportunity for those interested in data science.
• Data science thrives on curiosity, exploration, and leveraging available
resources.
4/28/2024
Module I
Define data science & What
Data Scientists Do
Fundamentals of Data Science
4/28/2024 11
Understanding Data Science
• Data Science encompasses significant data analysis across various sources.
• It leverages vast quantities of data from diverse sources like social media and
sales.
• Advancements in computing power enable meaningful analysis and new
discoveries.
4/28/2024 12
Understanding Data Science
• Data science aids organizations in understanding their environments
and uncovering opportunities.
• Data scientists investigate data to add value and insight to the
organization's knowledge.
• The process starts with clarifying the organization's question or
problem.
4/28/2024 Fundamentals of Data Science 13
The Data Science Process
4/28/2024 Fundamentals of Data Science 14
Module I
Define data science & What
Data Scientists Do
Real-Life Applications of Data Science
4/28/2024 15
REAL-LIFE APPLICATIONS Let's see some use cases
OF DATA SCIENCE
CHATBOT VOICE ASSISTANTS RECOMMENDATION SELF-DRIVING CARS
SYSTEM
REAL-LIFE APPLICATIONS Let's see some use cases
OF DATA SCIENCE
CHESS ADS TARGETING HEALTH CARE FRAUD DETECTION
Module I
Define data science & What
Data Scientists Do
The Many Paths to Data Science
4/28/2024 18
Evolution of Data Science Careers
• Data science was not a recognized field until around 2009-2011.
• DJ Patil and Andrew Gelman are credited with coining the term.
• Before data science, statistics was a prevalent field.
• Individuals often pursued business or other quantitative analysis
disciplines.
• Exposure to data science often occurred during academic or
professional endeavors.
• The term "data science" gained prominence in various industries
over time.
4/28/2024 The Many Paths to Data Science 19
Personal Journeys into Data Science
• Many individuals stumbled into data science through academic or professional paths.
• Backgrounds varied from engineering to business, economics, and analytics.
• Exposure to data science often occurred during higher education or internships.
• Practical applications in fields like transportation engineering introduced individuals to data
science.
• Gradual immersion in data analysis and modeling paved the way for careers in data science.
• The journey into data science showcases diverse paths and backgrounds converging into the
field.
4/28/2024 The Many Paths to Data Science 20
Module I
Define data science & What
Data Scientists Do
Advice for New Data Scientists
4/28/2024 21
Advice
FOR NEW DATA SCIENTISTS
Curiosity Being judgmental
01 Curiosity is fundamental for exploring
and understanding complex data.
02 Being judgmental helps in forming
hypotheses and initial assumptions.
Argumentativeness Flexibility
03 Argumentativeness aids in advocating for
a specific direction and learning from
04 Comfort and flexibility with analytics
platforms are valuable secondary skills.
data.
Career Development Strategies for Data Scientists
• Identify your competitive advantage and preferred industry focus.
• Tailor your analytical skills to match the needs of your chosen field.
• Acquire proficiency in industry-specific analytics platforms and tools.
• Apply your skills to real-world problems to demonstrate your
capabilities.
• Develop storytelling abilities to effectively communicate insights and
findings.
• Continuously refine and adapt your skills to stay relevant and
competitive in the field.
4/28/2024 Advice for New Data Scientists 23
Module I
Define data science & What
Data Scientists Do
Lesson Summary: Defining Data Science
4/28/2024 24
Understanding Data Science
• Data science studies data to understand the world around us.
• It uncovers insights and trends hidden within vast amounts of data.
• Recent advancements in computing power enable deeper analysis and new
knowledge.
• Data scientists play a crucial role in translating data into actionable insights.
• The process involves problem clarification, data collection, analysis, and
visualization.
• Curiosity, argumentation, and judgment are key traits for successful data
scientists.
4/28/2024 Lesson Summary: Defining Data Science 25
Developing Skills and Career Paths
• Skilled data scientists possess versatile knowledge beyond statistics and
programming.
• They come from diverse backgrounds such as economics, engineering, or
medicine.
• Mastery of data analysis tools and techniques is essential for success.
• Specialization in a particular field enhances expertise and industry relevance.
• Certification may become necessary as companies prioritize qualified
candidates.
• Future data scientists will adapt to evolving technology and changing job roles
for successful business outcomes.
4/28/2024 Lesson Summary: Defining Data Science 26
Module I
Define data science & What
Data Scientists Do
Data Science & Big Data Skills
4/28/2024 27
Data Scientist
Programming Skills
Skills Domain Communicatio
Proficiency in languages like Knowledge n
Effectively communicating
Understanding of the specific
01 Python, R, or SQL for data 03 industry or domain to
05 findings and insights to
manipulation and analysis. stakeholders through reports
interpret data in context.
and presentations.
Data
Data Analysis
Visualization Problem-
02 04 06
Ability to analyze large Creating visual Solving
Applying analytical skills to
datasets using statistical representations of data to solve complex business
methods and machine communicate insights problems using data-driven
learning algorithms. effectively. approaches.
Big Data
Volume Variety Velocity
Dealing with large volumes of data Handling diverse data types such as Processing data streams in real-time
that traditional systems cannot structured, semi-structured, and or near real-time to derive timely
handle efficiently. unstructured data. insights.
Module I
Define data science & What
Data Scientists Do
Understanding Different Types of File Formats
4/28/2024 31
Understanding File Formats
• Data professionals work with various file types and formats.
• Importance of understanding file structure, benefits, and limitations.
• Choosing suitable formats for data and performance requirements.
Covered file formats: Delimited text, XLSX, XML, PDF, JSON.
Delimited text files: Rows with values separated by delimiters like
commas or tabs.
CSVs and TSVs are common in this category and suited for
straightforward information.
4/28/2024 Understanding Different Types of File Formats 32
Overview of File Formats
XLSX:
• Microsoft Excel's Open XML format.
• Data is organized into worksheets (rows and columns).
• Supports formulas, charts, and data manipulation.
• Widely used for data analysis and reporting.
XML:
• A markup language designed to encode data.
• Both human- and machine-readable.
• Often used for data exchange between systems and applications.
• Flexible and customizable structure for storing and transmitting data.
4/28/2024 Understanding Different Types of File Formats 33
Overview of File Formats
• PDF:
• Developed by Adobe for consistent document presentation.
• Maintains formatting across different devices and platforms.
• Ideal for sharing documents where layout, fonts, and images must
remain intact.
• Commonly used for reports, contracts, and presentations.
4/28/2024 Understanding Different Types of File Formats 34
Overview of File Formats
• JSON: Text-based standard for transmitting structured data over the web.
• JSON is language-independent, easy to use, and widely compatible.
Understanding popular file formats is crucial for effective data handling and
analysis.
4/28/2024 Understanding Different Types of File Formats 35
Example of File Formats
4/28/2024 36
Q&A
4/28/2024 Q&A 37
Thank you!
4/28/2024 38