
MODULE - 2

Course Outcome

At the end of the course the student will be able to:

CO 2. Apply different techniques to Exploratory Data Analysis and the
Data Science Process
Syllabus
• Exploratory Data Analysis and the Data Science Process: Basic tools
(plots, graphs and summary statistics) of EDA, Philosophy of EDA, The
Data Science Process, Case Study: RealDirect (online real estate firm).
Three Basic Machine Learning Algorithms: Linear Regression, k-Nearest
Neighbours (k-NN), k-means.

Textbook: Doing Data Science, Cathy O’Neil and Rachel Schutt, O'Reilly
Media, Inc., 2013

Textbook 1: Chapter 2, Chapter 3


Note:

• Outliers are data points that stand out from the majority of the data
because they are significantly different or unusual compared to the rest.
What is Data Science Modelling?

• Data science modelling is a set of steps from defining the problem to
deploying the model in the real world.

• EDA is one of the important steps in the data modelling process.
EDA - Exploratory Data Analysis

• Exploratory data analysis is one of the basic and essential steps of a
data science project.
• A data scientist spends almost 70% of their work doing EDA on the
dataset.
Example of EDA:

1. Starting Point: You don't have a specific question in mind, like "Are
customers satisfied?" Instead, you're just curious about what the reviews
can tell you.
2. Exploring the Data:
1. Reading Through Reviews: You start by reading some
reviews. You notice some mention quality, some mention price,
and others mention customer service.
2. Creating Word Clouds: You create a word cloud to see which
words appear most frequently. You might find "great",
"expensive", and "helpful" are common.
3. Looking at Ratings: You plot the ratings on a histogram to see
how they are distributed. You might see that most reviews are 4
stars, but there are also a few 1-star reviews.
3. Finding the Unexpected:
1. Surprises: While exploring, you might notice that many
low ratings mention "late delivery," something you didn't
expect to be an issue.
2. Expected Findings: You see lots of mentions of "good
quality," which you expected because your product team
focused on quality.

4. New Insights: Now you realize that, apart from quality, delivery times
are a significant issue that needs addressing.
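The exploration above can be sketched in a few lines of Python. This is a minimal illustration only: the file name reviews.csv and the columns rating and text are assumptions, not part of the original example.

```python
# Hypothetical review exploration: ratings histogram and crude word counts.
import pandas as pd
import matplotlib.pyplot as plt
from collections import Counter

reviews = pd.read_csv("reviews.csv")        # assumed columns: rating, text

# Distribution of star ratings (the "histogram of ratings" step).
reviews["rating"].hist(bins=range(1, 7))
plt.xlabel("Stars")
plt.ylabel("Number of reviews")
plt.show()

# Word frequencies as a rough stand-in for a word cloud.
all_words = " ".join(reviews["text"].str.lower()).split()
print(Counter(all_words).most_common(20))

# What do the low-rated reviews mention? (the "finding the unexpected" step)
low = reviews[reviews["rating"] <= 2]
low_words = " ".join(low["text"].str.lower()).split()
print(Counter(low_words).most_common(20))
```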
What is Exploratory Data Analysis
(EDA)?

• Exploratory Data Analysis (EDA) is like having an open mind and being
flexible when looking at data.
What It involves?
• Being Open-Minded: You're ready to find things you
didn't expect, as well as things you thought might be
there.

• Curiosity: You explore the data with a sense of curiosity, without
making any assumptions beforehand.

• In simple terms, it's about being willing to see and discover anything
in the data, whether it's surprising or expected.
What is Exploratory Data Analysis
(EDA)?

“Exploratory data analysis” is an attitude, a state of flexibility, a
willingness to look for those things that we believe are not there, as
well as those we believe to be there. — John Tukey
Who Developed EDA?

• John Tukey, a mathematician at Bell Labs, developed exploratory data
analysis (EDA).

• In EDA, there is no initial hypothesis or model, and the "exploratory"
aspect means that your understanding of the problem evolves as you
analyze the data.
Historical Perspective: Bell Labs
• Bell Labs: A research laboratory established in the
1920s, known for its innovations in physics, computer
science, statistics, and mathematics.

• It has produced notable technologies like the C++ programming language
and numerous Nobel Prize winners.

• Statistics Group: Bell Labs had a highly successful and productive
statistics group, which included John Tukey.
Historical Perspective: Bell Labs
cont..
• Tukey is considered the father of EDA and contributed
to the development of the S language, which evolved
into R, the open-source statistical software.

• Bell Labs is seen as a birthplace of data science due to its
collaborative environment, where statisticians and computer scientists
worked together with large and complex datasets.
What is Exploratory Data Analysis
(EDA)?
• Exploratory Data Analysis (EDA) is an approach to
analyze the data using visual techniques.

• It is used to discover trends and patterns, or to check assumptions,
with the help of statistical summaries and graphical representations.

• It's the process of exploring and becoming familiar with the data so
you can make better decisions.
Why EDA?
• EDA is a crucial step in the data analysis process
because it helps you to
• Understand,
• Clean,
• and Prepare your data,
• Generate hypotheses,
• and Make better decisions.
• It's the foundation upon which reliable and
meaningful analysis is built.
Philosophy of Exploratory Data
Analysis
• Exploratory Data Analysis (EDA) is about understanding your data before
trying to convince others of any findings.

• Rachel, while at Google, learned from former Bell Labs statisticians that
EDA is essential even with huge datasets.

• Here’s why EDA is important:

1.Understand the Data: Gain intuition and make comparisons.

2.Sanity Check: Ensure data is in the expected format and scale.

3.Identify Issues: Find missing data or outliers.

4.Debugging: Detect errors in data logging, which helps engineers fix
them.


Here are some references to help you
understand best practices and historical
context:
1. Exploratory Data Analysis by John Tukey (Pearson)
2. The Visual Display of Quantitative Information by Edward Tufte
(Graphics Press)
3. The Elements of Graphing Data by William S. Cleveland (Hobart Press)
4. Statistical Graphics for Visualizing Multivariate Data by William G.
Jacoby (Sage)
5. “Exploratory Data Analysis for Complex Models” by Andrew Gelman
(American Statistical Association)
6. The Future of Data Analysis by John Tukey. Annals of Mathematical
Statistics, Volume 33, Number 1 (1962), 1-67.
7. Data Analysis, Exploratory by David Brillinger [8-page excerpt from
International Encyclopedia of Political Science (Sage)]
Exercise: EDA

• There are 31 datasets named nyt1.csv, nyt2.csv, …, nyt31.csv, which you
can find here:
https://github.com/oreillymedia/doing_data_science
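A possible starting point for the exercise, assuming the nyt*.csv files use the column layout from the book's examples (Age, Gender, Impressions, Clicks, Signed_In); verify the column names on the actual file before relying on them.

```python
# Sketch for exploring one day of the NYT ad data (column names assumed).
import pandas as pd

df = pd.read_csv("nyt1.csv")
print(df.columns)        # confirm the assumed columns first
print(df.describe())     # summary statistics for a quick sanity check

# Example cut: click-through rate by age group for signed-in users.
signed_in = df[df["Signed_In"] == 1].copy()
signed_in["age_group"] = pd.cut(signed_in["Age"],
                                bins=[0, 18, 24, 34, 44, 54, 64, 120])
ctr = (signed_in.groupby("age_group", observed=True)
                .apply(lambda g: g["Clicks"].sum() / max(g["Impressions"].sum(), 1)))
print(ctr)
```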
The Data Science Process
1. Raw Data Collection:

• Raw Data: This is the unprocessed, original


data that is gathered directly from various
real-world sources.

• It has not yet been cleaned or transformed


and may contain errors, duplicates, missing
values, or irrelevant information.
Sources of Raw Data: Raw data can come from a wide variety of
sources, including:
• Log Files: Records of events or transactions, such as web server
logs or application logs.
• Social Media: Data from platforms like Twitter, Facebook, or
Google+.
• Sensors: Data from IoT devices, environmental sensors, or
medical devices.
• Databases: Data stored in relational or NoSQL databases.
• Manual Entry: Data entered by humans, such as survey
responses or forms.
• Public Records: Data from public datasets, such as government
reports, academic publications, or historical records.
2. Data Processing

• Data Processing: This involves transforming raw data into a clean,
organized, and usable format.
3. Data Cleaning
• The raw data often contains noise, errors, and
inconsistencies. It needs to be cleaned and transformed
into a usable format.

• Tools: Python, shell scripts, R, SQL, or a combination of these tools
can be used for data cleaning.

• Output: The cleaned data is typically structured in a tabular format
with rows and columns, making it ready for analysis.
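A minimal pandas sketch of the kind of cleaning described above; the file name and column names are placeholders, not from the text.

```python
# Illustrative cleaning steps on a hypothetical raw file.
import pandas as pd

raw = pd.read_csv("raw_data.csv")

raw = raw.drop_duplicates()                                   # remove duplicate rows
raw["sale_date"] = pd.to_datetime(raw["sale_date"], errors="coerce")  # parse dates
raw["price"] = pd.to_numeric(raw["price"], errors="coerce")   # force numeric type
raw = raw.dropna(subset=["sale_date", "price"])               # drop unusable rows
raw = raw[raw["price"] > 0]                                   # discard zero-price records

raw.to_csv("clean_data.csv", index=False)   # tabular output, ready for EDA
```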
4. Exploratory Data Analysis
(EDA):
• Initial Examination: Conduct an initial analysis to
understand the data's characteristics and to identify any
issues such as missing values, duplicates, or outliers.

• Iterative Process: EDA is iterative, meaning you may need to go back
and clean the data further if new issues are discovered during the
analysis.
Model Building:

• Algorithm Selection: Choose a suitable algorithm based on the problem
type (e.g., classification, regression, clustering).

• Examples: Algorithms like k-nearest neighbors (k-NN), linear
regression, and Naive Bayes are common choices.

• Training: Train the model using the cleaned data to learn patterns and
relationships.
Interpretation and
Communication:
• Results Interpretation: Understand the model's
results and insights.

• Communication: Communicate findings through reports, visualizations,
presentations, or academic papers to stakeholders, such as managers or
coworkers.
Building a Data Product

• Prototype Development: Develop a prototype of the data product, like a
recommendation system, spam classifier, or search ranking algorithm.

• Deployment: Deploy the data product so that users can interact with it.
• In summary, the data science process is a
comprehensive and iterative workflow that
involves collecting, cleaning, analyzing, and
modeling data, followed by interpreting and
communicating the results.
A Data Scientist's Role in This
Process
• The process of working with data doesn't happen
automatically; it requires a data scientist or a data
science team.
• These experts make key decisions about what data to
collect and why.
• They also formulate questions, create hypotheses, and
plan how to tackle problems.
• In addition to coding and analyzing the data, they are
involved in the entire process from start to finish.
• This involvement ensures that the data collection,
cleaning, analysis, and modeling are all aligned with
the project goals.
Case Study: RealDirect

https://www.youtube.com/watch?v=JXaf2I6C0Ho
About RealDirect

• RealDirect is a real estate company focused on using data to improve
the home buying and selling process.
Purpose and Mission
• Goal: The primary goal of RealDirect is to leverage data and
technology to streamline and optimize the process of buying
and selling homes. This involves providing real-time data and
insights to homeowners and buyers, reducing commission
costs, and enhancing the overall efficiency of real estate
transactions.

• Mission: RealDirect aims to make the real estate market more
transparent and accessible by offering detailed data and actionable
advice to both buyers and sellers.
Business Model
• Subscription Service: RealDirect offers a subscription
service for home sellers at approximately $395 per month.
This subscription provides access to various selling tools
and data-driven recommendations.
• Reduced Commission: Sellers can also choose to use
RealDirect's agents, who work at a reduced commission rate
of 2%, compared to the typical 2.5% to 3%. This lower rate
is possible due to the efficiency gained from pooling data
and resources.
• Platform Features: The RealDirect platform helps manage the sale or
purchase process through various statuses (e.g., active, offer made,
showing, in contract).
Technology and Data Use
• Data Integration: RealDirect integrates various sources
of data, including publicly available information and real-
time interaction data, to offer comprehensive insights.
This includes data on property listings, sales trends, and
market conditions.
• User Interface: The platform provides an interface for
sellers with tips and recommendations on how to sell
their house effectively. It also uses interaction data to
give real-time advice on what steps to take next.
• Information Collection: The agents at RealDirect
become proficient in using tools to collect and analyze
data, which helps them stay updated on new and
relevant information.
Challenges and Legal
Considerations
• Legal Hurdles: In New York, RealDirect must comply with laws that
require housing listings to be behind a registration wall. This means
users need to register to see listings, which can be a barrier but is
similar to other platforms like Pinterest.

• Industry Resistance: RealDirect has faced pushback from traditional
real estate brokers who are unhappy with its approach to lowering
commission rates. However, these brokers often have to cooperate because
buyers can find the same listings on other platforms, leading to
transparency in the market.
Competitive Advantage
• Data-Driven Approach: By leveraging data and technology,
RealDirect offers a more efficient and transparent process for buying
and selling homes.

• Cost Savings: The reduced commission rates and subscription model make
it a cost-effective alternative for sellers.

• Comprehensive Service: RealDirect not only provides listings but also
offers detailed information about properties, such as nearby amenities,
price comparisons, and market trends.
• In summary, RealDirect was founded in 2010 in New York
City by Doug Perlson and Perry Tamarkin with the
mission to improve the real estate market using data and
technology.

• The company offers subscription services and reduced commission rates,
integrates various data sources, and faces challenges from traditional
brokers and legal requirements.
Exercise: RealDirect Data
Strategy

• You've been hired as the chief data scientist at RealDirect, a real
estate website, and need to develop a data strategy for the company.

• Here are steps to approach this task:


Step 1: Understand the Current
System
• Explore the Website: Navigate the RealDirect
site to understand how buyers and sellers use it
and how it’s organized.

• Research Questions: Identify what data should be collected, how it
would look, and how it would be used for reporting and monitoring.
Step 2: Use Auxiliary Data

• Find Market Data: Since there’s no internal data yet, use external
data from sources like GitHub’s Rolling Sales Update.

• Data Cleanup: Load and clean the data by fixing outliers, formatting
dates correctly, and ensuring numerical values are treated as such.

• Exploratory Analysis: Visualize and compare data across neighborhoods
and time to find meaningful patterns.
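One way Step 2 might look in pandas. The column names (SALE PRICE, SALE DATE, NEIGHBORHOOD) and the file name are assumptions about the rolling-sales extract; check df.columns on the real file first.

```python
# Sketch of cleanup and a neighborhood/time comparison (columns assumed).
import pandas as pd

sales = pd.read_csv("rollingsales_manhattan.csv")   # hypothetical file name

# Cleanup: strip "$" and "," from prices, parse dates, drop unusable rows.
sales["SALE PRICE"] = pd.to_numeric(
    sales["SALE PRICE"].astype(str).str.replace(r"[$,]", "", regex=True),
    errors="coerce")
sales["SALE DATE"] = pd.to_datetime(sales["SALE DATE"], errors="coerce")
sales = sales.dropna(subset=["SALE PRICE", "SALE DATE"])
sales = sales[sales["SALE PRICE"] > 100]            # crude filter for $0 "gift" sales

# Exploratory comparison: median sale price by neighborhood and month.
sales["month"] = sales["SALE DATE"].dt.to_period("M")
summary = (sales.groupby(["NEIGHBORHOOD", "month"])["SALE PRICE"]
                .median()
                .unstack("month"))
print(summary.head())
```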
Step 3: Report Findings

• Write a Report: Summarize your findings in a simple, clear report for
the CEO.

• Communication: Develop strategies to communicate effectively with
non-data scientists. Identify other people in the company you should
talk to for more information.
Step 4: Understand the Domain

• Step Out of Comfort Zone: Collecting data in a new field can provide
insights for your own work.

• Learn the Vocabulary: Real estate has specific terms. Asking questions
to understand these terms is important for grasping the problem fully.
• In simple terms, your job is to explore how RealDirect
operates, determine what data to collect and analyze,
use external data to gain insights, and communicate
your findings effectively while learning the real estate
terminology and developing best practices for data
strategy.
Note:

• Data scientist (noun): Person who is better at statistics than any
software engineer and better at software engineering than any
statistician.

— Josh Wills
Three Basic Machine Learning
Algorithms:
• Linear Regression
• k-Nearest Neighbours (k-NN)
• k-means
Linear Regression

• Linear regression is a type of machine-learning algorithm, more
specifically a supervised machine-learning algorithm.

• It learns from labelled datasets and maps the data points to the most
optimized linear function, which can be used for prediction on new
datasets.
Definition of Linear Regression
• Linear Regression is a statistical method used to
model and analyze the relationships between a
dependent variable and one or more independent
variables.

• The goal is to find the best-fitting straight line through the data
points that can be used to predict the dependent variable based on the
independent variable(s).
• The blue dots represent
actual data points,
showing the number of
hours studied and the
corresponding test
scores.
• The red line is the
regression line, which is
the best-fit straight line
that represents the
overall trend of the data.
Linear Regression cont..
• Linear regression is a method used to understand the
relationship between two things.

• Imagine you have a bunch of dots on a graph that represent how two
things change together, like hours studied and test scores.

• Linear regression draws a straight line through these dots in a way
that best shows the overall trend.

• This line can help predict one thing (like a test score) if you know
the other thing (like hours studied).

• It's like finding the best-fitting path that the dots generally follow.
Linear Regression cont..
• Equation of linear regression:
y = β0 + β1x
where:
y : Dependent variable (outcome)
x : Independent variable (predictor)
β0 : Intercept (the value of y when x = 0)
β1 : Slope (the change in y for a one-unit change in x)
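A small sketch of fitting this equation on made-up hours-studied data with scikit-learn (the numbers are invented for illustration):

```python
# Fit y = b0 + b1*x on toy hours-vs-score data.
import numpy as np
from sklearn.linear_model import LinearRegression

hours = np.array([[1], [2], [3], [4], [5], [6]])   # x: independent variable
scores = np.array([52, 58, 63, 70, 74, 81])        # y: dependent variable

model = LinearRegression().fit(hours, scores)
print("intercept (b0):", model.intercept_)
print("slope (b1):", model.coef_[0])
print("predicted score for 7 hours:", model.predict([[7]])[0])
```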
Example 1:
• A social networking site charges a monthly subscription fee of
$25.

• The site's revenue is recorded daily over two years, resulting in a
series of data points (number of users and total revenue).

• The first four data points are (1, 25), (10, 250), (100, 2500), and
(200, 5000).

• A clear relationship y = 25x is observed, indicating a linear pattern
with a coefficient of 25: each additional user adds $25 in revenue.

• The graph shows these data points as blue dots, with a red dashed line
representing the equation y = 25x, illustrating the linear relationship.
Fitting the model
• The goal is to find the optimal line that best fits the data points by
minimizing the distance between the points and the line.

• Least Squares Estimation:

• Linear regression uses a method called least squares estimation.

• The objective is to minimize the Residual Sum of Squares (RSS), which
is the sum of the squared vertical distances between the observed data
points and the predicted values on the line.
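For simple linear regression the least-squares estimates have a closed form; a minimal NumPy sketch on toy numbers, showing the RSS being minimized:

```python
# Closed-form least squares: b1 = cov(x, y) / var(x), b0 = mean(y) - b1*mean(x).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

residuals = y - (b0 + b1 * x)            # vertical distances to the fitted line
rss = np.sum(residuals ** 2)             # Residual Sum of Squares being minimized
print(b0, b1, rss)
```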
In Summary
1. Plotting Points:
• Plot the data points on a graph.
2. Drawing the Line:
• Draw a line that seems to fit the trend of the points.
3. Calculating Residuals:
• Measure the vertical distances (residuals) from each
point to the line.
4. Squaring and Summing Residuals:
• Square these residuals to avoid negative values and
sum them up to get RSS.
5. Finding the Best Fit Line:
• Adjust the line to minimize RSS.
Refer to the class notes for the values and the full solution.
The K-Nearest Neighbors (KNN)
algorithm
• The K-Nearest Neighbors (KNN) algorithm is a
supervised machine learning method employed to
tackle classification and regression problems.

• Evelyn Fix and Joseph Hodges developed this algorithm in 1951; it was
subsequently expanded by Thomas Cover.
Definition of KNN

• k-Nearest Neighbors (k-NN) is a non-parametric, lazy learning
algorithm that classifies a new data point based on the classes of its k
nearest neighbors.

• The "k" in k-NN is a user-defined constant that determines the number
of nearest neighbors to consider when making a classification or
prediction.
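A minimal illustration of k-NN classification with scikit-learn on a toy 2-D dataset (values invented):

```python
# k-NN classification; k (n_neighbors) is the user-defined constant.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1, 1], [1, 2], [2, 1],    # three points of class 0
              [6, 6], [7, 6], [6, 7]])   # three points of class 1
y = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3)   # default metric is Euclidean
knn.fit(X, y)

print(knn.predict([[2, 2], [6, 5]]))        # expected output: [0 1]
```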
Different Distance Metrics in
KNN
• Euclidean Distance
• Manhattan
• Cosine Similarity
• Jaccard Distance or Similarity
• Mahalanobis Distance
• Hamming Distance
Euclidean Distance
• This distance is the most widely used one, as it is the default metric
that the scikit-learn (sklearn) library in Python uses for K-Nearest
Neighbors.
• It is a measure of the true straight line distance between
two points in Euclidean space.
Manhattan
• This distance is also known as taxicab distance or city block
distance, because of the way it is calculated.
• The distance between two points is the sum of the
absolute differences of their Cartesian coordinates.
Cosine Similarity

• This distance metric is used mainly to calculate the similarity
between two vectors.

• It is measured by the cosine of the angle between two vectors and
determines whether two vectors are pointing in the same direction.
Jaccard Distance or Similarity
• This measures how similar two sets are.
• Imagine comparing two friend lists.
• You look at how many friends are in both lists
(intersection) and how many unique friends
are in either list (union).
• The Jaccard Similarity is the ratio of the
intersection to the union. Higher values mean
more similarity.
Mahalanobis Distance

• Similar to Euclidean distance, but it takes into account how the data
is spread out (correlation) and adjusts for different scales in the data.

• Calculate the Covariance Matrix (S): The covariance matrix represents
how each feature varies with every other feature. It helps understand
the relationships and spread of the data.

• Invert the Covariance Matrix (S⁻¹): The inverse of the covariance
matrix is used to adjust for the relationships between features.
Hamming Distance

• Used for strings of the same length.

• It counts how many positions have different characters.

• For example, comparing "cat" and "bat" gives a distance of 1 because
only the first letters are different, while "olive" and "ocean" give a
distance of 4.
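A short NumPy sketch computing several of the metrics above on small invented examples:

```python
# Hand-rolled versions of some of the distance/similarity measures above.
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

euclidean = np.sqrt(np.sum((a - b) ** 2))     # straight-line distance
manhattan = np.sum(np.abs(a - b))             # city-block / taxicab distance
cosine_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Jaccard similarity on two "friend lists": |intersection| / |union|.
friends1, friends2 = {"ann", "bob", "cara"}, {"bob", "cara", "dan", "eve"}
jaccard = len(friends1 & friends2) / len(friends1 | friends2)

# Hamming distance between equal-length strings.
hamming = sum(c1 != c2 for c1, c2 in zip("olive", "ocean"))   # -> 4

print(euclidean, manhattan, cosine_sim, jaccard, hamming)
```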
K - Means

• K-Means is a popular and straightforward clustering algorithm used in
machine learning and data analysis to partition a dataset into K
distinct, non-overlapping groups (clusters) based on the inherent
structure of the data. Here's a simple definition:
K Means Definition

• K-Means is an iterative algorithm that divides a set of data points
into K clusters, where each data point belongs to the cluster with the
nearest mean (centroid).
How K-Means Works
1.Choose K: Decide the number of clusters, K, you want
to form in the dataset.
2.Initialize Centroids: Randomly select K data points
from the dataset as the initial centroids (cluster centers).
3.Assign Clusters: Assign each data point to the nearest
centroid, forming K clusters.
4.Update Centroids: Calculate the new centroids by
taking the mean of all data points assigned to each
cluster.
5.Repeat: Repeat steps 3 and 4 until the centroids no
longer change significantly or a maximum number of
iterations is reached.
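A compact sketch of these steps in NumPy on invented 2-D data (in practice a library implementation such as sklearn.cluster.KMeans would usually be used):

```python
# Bare-bones K-Means following steps 1-5 above (toy data, K = 2).
import numpy as np

rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 1, (20, 2)),    # blob around (0, 0)
                    rng.normal(5, 1, (20, 2))])   # blob around (5, 5)

K = 2                                                            # step 1: choose K
centroids = points[rng.choice(len(points), K, replace=False)]    # step 2: init

for _ in range(100):                               # step 5: repeat until stable
    # Step 3: assign each point to its nearest centroid.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Step 4: recompute each centroid as the mean of its assigned points.
    new_centroids = np.array([points[labels == k].mean(axis=0) for k in range(K)])
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids

print(centroids)   # should land near (0, 0) and (5, 5)
```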
