Outline
Attributes and Objects
Types of Data
Data Quality
Similarity and Distance
What is Data?
A collection of data objects and their attributes

An attribute is a property or characteristic of an object
– Examples: eye color of a person, temperature, etc.
– Attribute is also known as variable, field, characteristic, dimension, or feature

A collection of attributes describes an object
– Object is also known as record, point, case, sample, entity, or instance

Example (columns are attributes, rows are objects):

Tid   Refund   Marital Status   Taxable Income   Cheat
1     Yes      Single           125K             No
2     No       Married          100K             No
3     No       Single           70K              No
4     Yes      Married          120K             No
5     No       Divorced         95K              Yes
6     No       Married          60K              No
7     Yes      Divorced         220K             No
8     No       Single           85K              Yes
9     No       Married          75K              No
10    No       Single           90K              Yes
Attribute Values
Attribute values are numbers or symbols assigned to
an attribute for a particular object
Distinction between attributes and attribute values
– Same attribute can be mapped to different attribute values
◆ Example: height can be measured in feet or meters
– Different attributes can be mapped to the same set of values
◆ Example: Attribute values for ID and age are integers
– But the properties of an attribute can be different from the properties of the values used to represent it
Attribute Types
Nominal: categories, states, or “names of things”
– Hair_color = {auburn, black, blond, brown, grey, red, white}
– marital status, occupation, ID numbers, zip codes
Binary
– Nominal attribute with only 2 states (0 and 1)
– Symmetric binary: both outcomes equally important
◆ e.g., gender
– Asymmetric binary: outcomes not equally important.
◆ e.g., medical test (positive vs. negative)
◆ Convention: assign 1 to the most important outcome (e.g., HIV positive)
Ordinal
– Values have a meaningful order (ranking) but magnitude between
successive values is not known.
– Size = {small, medium, large}, grades, army rankings
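Not part of the slides: a minimal Python sketch, assuming the pandas library, of how nominal, ordinal, and asymmetric binary attributes might be represented in code; the variable names and values are illustrative only.

```python
import pandas as pd

# Nominal: categories without any order.
hair_color = pd.Categorical(
    ["black", "blond", "red"],
    categories=["auburn", "black", "blond", "brown", "grey", "red", "white"])

# Ordinal: categories with a meaningful order, but unknown magnitudes between them.
size = pd.Categorical(
    ["small", "large", "medium"],
    categories=["small", "medium", "large"], ordered=True)

# Asymmetric binary: 1 for the more important outcome (e.g., a positive test).
test_result = pd.Series([1, 0, 0, 1])

print(size.min(), size.max())   # the order is usable; arithmetic on the codes is not meaningful
```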
Numeric Attribute Types
Interval
◆ Measured on a scale of equal-sized units
◆ Values have order
  – E.g., temperature in °C or °F, calendar dates
◆ No true zero-point

Ratio
◆ Inherent zero-point
◆ We can speak of values as being an order of magnitude larger than the unit of measurement (10 K is twice as high as 5 K)
  – E.g., length, counts, monetary quantities
https://www.graphpad.com/support/faq/what-is-the-difference-between-ordinal-interval-and-ratio-variables-why-should-i-care/
Discrete and Continuous Attributes
Discrete Attribute
– Has only a finite or countably infinite set of values
– Examples: zip codes, counts, or the set of words in a
collection of documents
– Often represented as integer variables.
– Note: binary attributes are a special case of discrete
attributes
Continuous Attribute
– Has real numbers as attribute values
– Examples: temperature, height, or weight.
– Practically, real values can only be measured and
represented using a finite number of digits.
– Continuous attributes are typically represented as floating-point variables.
Basic Statistical Descriptions of Data
Motivation
– To better understand the data: central tendency,
variation and spread
Data dispersion characteristics
– median, max, min, quantiles, outliers, variance, etc.
Numerical dimensions correspond to sorted intervals
– Data dispersion: analyzed with multiple granularities
of precision
– Boxplot or quantile analysis on sorted intervals
Dispersion analysis on computed measures
– Folding measures into numerical dimensions
– Boxplot or quantile analysis on the transformed cube
Measuring the Central Tendency
Mean (algebraic measure), sample vs. population (n is the sample size, N is the population size):

  sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$        population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

– Weighted arithmetic mean: $\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$
– Trimmed mean: chopping extreme values before averaging

Median:
– Middle value if odd number of values, or average of the middle two values otherwise
– Estimated by interpolation (for grouped data): $\mathrm{median} = L_1 + \left(\frac{n/2 - (\sum \mathrm{freq})_l}{\mathrm{freq}_{\mathrm{median}}}\right) \times \mathrm{width}$
Mode
– Value that occurs most frequently in the data
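Not part of the slides: a minimal NumPy sketch of the measures above on a small made-up sample; the data values are illustrative only.

```python
import numpy as np

x = np.array([30, 36, 47, 50, 52, 52, 56, 60, 70, 110])   # made-up sample
w = np.ones_like(x, dtype=float)                           # weights for the weighted mean

mean = x.mean()                                            # (1/n) * sum(x_i)
weighted_mean = np.average(x, weights=w)                   # sum(w_i * x_i) / sum(w_i)

k = 1                                                      # trimmed mean: chop k smallest and k largest values
trimmed_mean = np.sort(x)[k:-k].mean()

median = np.median(x)                                      # middle value, or average of the middle two
vals, counts = np.unique(x, return_counts=True)
mode = vals[counts.argmax()]                               # most frequent value

print(mean, weighted_mean, trimmed_mean, median, mode)
```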
Symmetric vs. Skewed Data
Median, mean and mode of symmetric, positively skewed, and negatively skewed data
[Figure: distributions labeled “symmetric”, “positively skewed”, and “negatively skewed”]
Measuring the Dispersion of Data
Quartiles, outliers and boxplots
– Quartiles: Q1 (25th percentile: 25% of the data lie below this point), Q3 (75th percentile)
– Inter-quartile range: IQR = Q3 – Q1
– Five number summary: min, Q1, median, Q3, max
– Boxplot: ends of the box are the quartiles; median is marked; add
whiskers, and plot outliers individually
– Outlier: usually, a value more than 1.5 × IQR above Q3 or below Q1
Variance and standard deviation (sample: s, population: σ)
– Variance (algebraic, scalable computation); the sample variance divides by n − 1, the population variance by N:

  $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$

  $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$
– Standard deviation s (or σ) is the square root of variance s2 (or σ2)
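Not part of the slides: a minimal NumPy sketch computing the five-number summary, IQR, the 1.5 × IQR outlier rule, and the sample variance and standard deviation on made-up data.

```python
import numpy as np

x = np.array([30, 36, 47, 50, 52, 52, 56, 60, 70, 110])   # made-up sample

q1, med, q3 = np.percentile(x, [25, 50, 75])
iqr = q3 - q1
five_number = (x.min(), q1, med, q3, x.max())

# Values more than 1.5 x IQR below Q1 or above Q3 are flagged as outliers.
outliers = x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]

sample_var = x.var(ddof=1)   # divides by n - 1
sample_std = x.std(ddof=1)   # square root of the sample variance

print(five_number, iqr, outliers, sample_var, sample_std)
```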
Boxplot Analysis
Five-number summary of a distribution
– Minimum, Q1, Median, Q3, Maximum
Boxplot
– Data is represented with a box
– The ends of the box are at the first and
third quartiles, i.e., the height of the box is
IQR
– The median is marked by a line within the
box
– Outliers: points beyond a specified outlier
threshold, plotted individually.
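Not part of the slides: a minimal sketch, assuming matplotlib is available, that draws the boxplot just described (box ends at Q1/Q3, line at the median, whiskers at 1.5 × IQR, outliers plotted individually); the data are synthetic.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(50, 10, 200), [120, 130]])  # synthetic data plus two outliers

plt.boxplot(data, whis=1.5)   # whis=1.5 -> whiskers extend to 1.5 x IQR; points beyond are drawn individually
plt.ylabel("value")
plt.show()
```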
Graphic Displays of Basic Statistical Descriptions
Boxplot: graphic display of five-number summary
Histogram: x-axis shows the values, y-axis represents the frequencies
Quantile plot: each value xi is paired with fi, indicating that approximately 100·fi % of the data are ≤ xi
Quantile-quantile (q-q) plot: graphs the quantiles of one
univariant distribution against the corresponding quantiles
of another
Scatter plot: each pair of values is a pair of coordinates
and plotted as points in the plane
Histogram Analysis
Histogram: graph display of tabulated frequencies, shown as bars
It shows what proportion of cases fall into each of several categories
Differs from a bar chart in that it is the area of the bar that denotes the value, not the height as in bar charts; this is a crucial distinction when the categories are not of uniform width
The categories are usually specified as non-overlapping intervals of some variable. The categories (bars) must be adjacent
[Figure: histogram with bins ranging from 10000 to 100000]
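Not part of the slides: a minimal matplotlib sketch of a histogram over adjacent, non-overlapping bins of uniform width (with uniform widths, bar height and bar area convey the same information); the income data are synthetic.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
income = rng.normal(55000, 15000, 1000)   # synthetic values

bins = np.arange(10000, 110000, 10000)    # adjacent, non-overlapping, uniform-width bins
plt.hist(income, bins=bins, edgecolor="black")
plt.xlabel("income")
plt.ylabel("frequency")
plt.show()
```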
Histogram vs. Bar Graph
[Figure: the same data shown as a histogram and as a bar graph]
Histograms Often Tell More than Boxplots
◼ The two histograms shown on the left may have the same boxplot representation
◼ The same values for: min, Q1,
median, Q3, max
◼ But they have rather different data
distributions
Quantile Plot
Displays all of the data (allowing the user to assess both
the overall behavior and unusual occurrences)
Plots quantile information
– For data xi sorted in increasing order, fi indicates that approximately 100·fi % of the data are below or equal to the value xi
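Not part of the slides: a minimal NumPy/matplotlib sketch of a quantile plot, pairing each sorted value xi with fi = (i − 0.5)/n so that roughly 100·fi % of the data lie at or below xi; the data values are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.array([47, 52, 55, 58, 60, 63, 67, 72, 80, 95])   # made-up values

x = np.sort(data)
n = len(x)
f = (np.arange(1, n + 1) - 0.5) / n   # f_i: about 100*f_i % of the data are <= x_i

plt.plot(f, x, marker="o")
plt.xlabel("f-value")
plt.ylabel("data value")
plt.show()
```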
Quantile-Quantile (Q-Q) Plot
Graphs the quantiles of one univariate distribution against the
corresponding quantiles of another
View: is there a shift in going from one distribution to another?
Example shows unit price of items sold at Branch 1 vs. Branch 2 for each
quantile. Unit prices of items sold at Branch 1 tend to be lower than those
at Branch 2.
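Not part of the slides: a minimal sketch of a two-sample q-q plot using NumPy and matplotlib; the branch prices are invented stand-ins for the example, and the dashed y = x line marks where the two distributions would coincide.

```python
import numpy as np
import matplotlib.pyplot as plt

branch1 = np.array([40, 42, 45, 47, 50, 55, 58, 60, 65, 70])   # invented unit prices
branch2 = np.array([45, 48, 52, 55, 58, 62, 66, 70, 75, 82])

q = np.linspace(0, 1, 21)                                       # quantile levels
plt.plot(np.quantile(branch1, q), np.quantile(branch2, q), marker="o")

lims = [min(branch1.min(), branch2.min()), max(branch1.max(), branch2.max())]
plt.plot(lims, lims, linestyle="--")                            # y = x reference line
plt.xlabel("Branch 1 quantiles")
plt.ylabel("Branch 2 quantiles")
plt.show()
```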
Scatter plot
Provides a first look at bivariate data to see clusters of points, outliers, etc.
Each pair of values is treated as a pair of coordinates and
plotted as points in the plane
Positively and Negatively Correlated Data
The left half of the figure is positively correlated
The right half is negatively correlated
Uncorrelated Data
Important Characteristics of Data
– Dimensionality (number of attributes)
◆ High dimensional data brings a number of challenges
– Sparsity
◆ Only presence counts
– Resolution
◆ Patterns depend on the scale
– Size
◆ Type of analysis may depend on size of data
Types of data sets
Record
– Data Matrix
– Document Data
– Transaction Data
Graph
– World Wide Web
– Molecular Structures
Ordered
– Spatial Data
– Temporal Data
– Sequential Data
– Genetic Sequence Data
Record Data
Data that consists of a collection of records, each
of which consists of a fixed set of attributes
Tid   Refund   Marital Status   Taxable Income   Cheat
1     Yes      Single           125K             No
2     No       Married          100K             No
3     No       Single           70K              No
4     Yes      Married          120K             No
5     No       Divorced         95K              Yes
6     No       Married          60K              No
7     Yes      Divorced         220K             No
8     No       Single           85K              Yes
9     No       Married          75K              No
10    No       Single           90K              Yes
Data Matrix
If data objects have the same fixed set of numeric
attributes, then the data objects can be thought of as
points in a multi-dimensional space, where each
dimension represents a distinct attribute
Such a data set can be represented by an m by n matrix,
where there are m rows, one for each object, and n
columns, one for each attribute
Projection of x Load   Projection of y Load   Distance   Load   Thickness
10.23                  5.27                   15.22      2.7    1.2
12.65                  6.25                   16.22      2.2    1.1
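Not part of the slides: the table above written as an m × n NumPy array (m = 2 objects, n = 5 numeric attributes).

```python
import numpy as np

X = np.array([
    [10.23, 5.27, 15.22, 2.7, 1.2],   # object 1
    [12.65, 6.25, 16.22, 2.2, 1.1],   # object 2
])
print(X.shape)   # (2, 5): m rows (objects), n columns (attributes)
```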
Document Data
Each document becomes a ‘term’ vector
– Each term is a component (attribute) of the vector
– The value of each component is the number of times
the corresponding term occurs in the document.
              team   coach   play   ball   score   game   win   lost   timeout   season
Document 1      3      0       5      0      2       6      0     2       0         2
Document 2      0      7       0      2      1       0      0     3       0         0
Document 3      0      1       0      0      1       2      2     0       3         0
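Not part of the slides: a minimal Python sketch building a term vector with collections.Counter; the vocabulary matches the table above, while the document text itself is made up.

```python
from collections import Counter

vocabulary = ["team", "coach", "play", "ball", "score",
              "game", "win", "lost", "timeout", "season"]

doc = "team play play score game game lost season"   # made-up document
counts = Counter(doc.split())

term_vector = [counts.get(term, 0) for term in vocabulary]
print(term_vector)   # one component per term: its frequency in the document
```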
Transaction Data
A special type of data, where
– Each transaction involves a set of items.
– For example, consider a grocery store. The set of products
purchased by a customer during one shopping trip constitute a
transaction, while the individual products that were purchased
are the items.
– Can represent transaction data as record data
TID Items
1 Bread, Coke, Milk
2 Beer, Bread
3 Beer, Coke, Diaper, Milk
4 Beer, Bread, Diaper, Milk
5 Coke, Diaper, Milk
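Not part of the slides: a minimal Python sketch that represents the transactions above as record data, with one asymmetric binary attribute per item (1 = the item appears in the transaction).

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}

items = sorted(set().union(*transactions.values()))   # the item "attributes"
for tid, basket in transactions.items():
    row = [1 if item in basket else 0 for item in items]
    print(tid, row)
```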
Graph Data
Examples: Generic graph, a molecule, and webpages
[Figures: a generic graph with labeled nodes, the benzene molecule C6H6, and linked webpages]
Ordered Data
Sequences of transactions
[Figure: a sequence of transactions; each element of the sequence is a set of items/events]
Ordered Data
Genomic sequence data
GGTTCCGCCTTCAGCCCCGCGCC
CGCAGGGCCCGCCCCGCGCCGTC
GAGAAGGGCCCGCCTGGCGGGCG
GGGGGAGGCGGGGCCGCCCGAGC
CCAACCGAGTCCGACCAGGTGCC
CCCTCTGCTCGGCCTAGACCTGA
GCTCATTAGGCGGCAGCGGACAG
GCCAAGTAGAACACGCGAAGCGC
TGGGCTGCCTGCTGCGACCAGGG
Ordered Data
Spatio-Temporal Data
[Figure: average monthly temperature of land and ocean]
Similarity and Dissimilarity Measures
Similarity measure
– Numerical measure of how alike two data objects are.
– Is higher when objects are more alike.
– Often falls in the range [0,1]
Dissimilarity measure
– Numerical measure of how different two data objects
are
– Lower when objects are more alike
– Minimum dissimilarity is often 0
– Upper limit varies
Proximity refers to a similarity or dissimilarity
Similarity/Dissimilarity for Simple Attributes
The following table shows the similarity and dissimilarity between two objects, x and y, with respect to a single, simple attribute:

Attribute Type      Dissimilarity                                      Similarity
Nominal             d = 0 if x = y, d = 1 if x ≠ y                     s = 1 if x = y, s = 0 if x ≠ y
Ordinal             d = |x − y| / (n − 1)                              s = 1 − d
                    (values mapped to integers 0 to n − 1,
                     where n is the number of values)
Interval or Ratio   d = |x − y|                                        s = −d, s = 1/(1 + d), or
                                                                       s = 1 − (d − min_d)/(max_d − min_d)
Euclidean Distance

  $\mathrm{dist}(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}$

where n is the number of dimensions (attributes) and xk and yk are, respectively, the kth attributes (components) of data objects x and y.

Standardization is necessary if scales differ.
Euclidean Distance
point   x   y
p1      0   2
p2      2   0
p3      3   1
p4      5   1

        p1      p2      p3      p4
p1      0       2.828   3.162   5.099
p2      2.828   0       1.414   3.162
p3      3.162   1.414   0       2
p4      5.099   3.162   2       0

Distance Matrix
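Not part of the slides: a NumPy sketch that recomputes the distance matrix above from the four 2-D points.

```python
import numpy as np

points = np.array([[0, 2],   # p1
                   [2, 0],   # p2
                   [3, 1],   # p3
                   [5, 1]])  # p4

diff = points[:, None, :] - points[None, :, :]   # pairwise component differences
dist = np.sqrt((diff ** 2).sum(axis=-1))         # Euclidean distance matrix
print(np.round(dist, 3))                         # matches the matrix above
```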
Minkowski Distance
Minkowski Distance is a generalization of Euclidean Distance:

  $\mathrm{dist}(\mathbf{x}, \mathbf{y}) = \left( \sum_{k=1}^{n} |x_k - y_k|^{r} \right)^{1/r}$

where r is a parameter, n is the number of dimensions (attributes), and xk and yk are, respectively, the kth attributes (components) of data objects x and y.
Minkowski Distance: Examples
r = 1. City block (Manhattan, taxicab, L1 norm) distance.
– A common example of this for binary vectors is the
Hamming distance, which is just the number of bits that are
different between two binary vectors
r = 2. Euclidean distance
r → ∞. “supremum” (L_max norm, L_∞ norm) distance.
– This is the maximum difference between any component of the vectors
Do not confuse r with n, i.e., all these distances are
defined for all numbers of dimensions.
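Not part of the slides: a small NumPy sketch of the Minkowski distance for the three cases above (r = 1, r = 2, r → ∞), checked against p1 and p3 from the earlier example.

```python
import numpy as np

def minkowski(x, y, r):
    """Minkowski distance: r=1 city block, r=2 Euclidean, r=np.inf supremum."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    if np.isinf(r):
        return np.abs(x - y).max()
    return (np.abs(x - y) ** r).sum() ** (1.0 / r)

p1, p3 = [0, 2], [3, 1]
print(minkowski(p1, p3, 1))        # 4.0   (L1)
print(minkowski(p1, p3, 2))        # 3.162 (L2)
print(minkowski(p1, p3, np.inf))   # 3.0   (L_inf)
```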
Minkowski Distance
point   x   y
p1      0   2
p2      2   0
p3      3   1
p4      5   1

L1      p1   p2   p3   p4
p1      0    4    4    6
p2      4    0    2    4
p3      4    2    0    2
p4      6    4    2    0

L2      p1      p2      p3      p4
p1      0       2.828   3.162   5.099
p2      2.828   0       1.414   3.162
p3      3.162   1.414   0       2
p4      5.099   3.162   2       0

L∞      p1   p2   p3   p4
p1      0    2    3    5
p2      2    0    1    3
p3      3    1    0    2
p4      5    3    2    0

Distance Matrix
Mahalanobis Distance
$\mathrm{mahalanobis}(\mathbf{x}, \mathbf{y}) = \left((\mathbf{x} - \mathbf{y})^{T}\, \Sigma^{-1}\, (\mathbf{x} - \mathbf{y})\right)^{0.5}$

where Σ is the covariance matrix of the data.

For the red points in the figure, the Euclidean distance is 14.7 and the Mahalanobis distance is 6.
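Not part of the slides: a minimal NumPy sketch of the Mahalanobis distance; the 2-D data and covariance values are made up, not the red points from the figure.

```python
import numpy as np

def mahalanobis(x, y, cov):
    """Mahalanobis distance between x and y for covariance matrix cov."""
    diff = np.asarray(x, float) - np.asarray(y, float)
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

rng = np.random.default_rng(0)
data = rng.multivariate_normal([0, 0], [[3.0, 2.0], [2.0, 3.0]], size=500)  # made-up data
cov = np.cov(data, rowvar=False)   # estimate the covariance matrix Sigma from the data

print(mahalanobis(data[0], data[1], cov))
```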
Common Properties of a Distance
Distances, such as the Euclidean distance, have some well-known properties:
1. d(x, y) ≥ 0 for all x and y, and d(x, y) = 0 if and only if x = y.
2. d(x, y) = d(y, x) for all x and y. (Symmetry)
3. d(x, z) ≤ d(x, y) + d(y, z) for all points x, y, and z. (Triangle Inequality)
where d(x, y) is the distance (dissimilarity) between
points (data objects), x and y.
A distance that satisfies these properties is a
metric
Common Properties of a Similarity
Similarities also have some well-known properties:
1. s(x, y) = 1 (or maximum similarity) only if x = y.
(does not always hold, e.g., cosine)
2. s(x, y) = s(y, x) for all x and y. (Symmetry)
where s(x, y) is the similarity between points (data
objects), x and y.
Similarity Between Binary Vectors
Common situation is that objects, x and y, have only
binary attributes
Compute similarities using the following quantities
f01 = the number of attributes where x was 0 and y was 1
f10 = the number of attributes where x was 1 and y was 0
f00 = the number of attributes where x was 0 and y was 0
f11 = the number of attributes where x was 1 and y was 1
Simple Matching and Jaccard Coefficients
The SMC counts both presences and absences equally; it is normally used for symmetric binary attributes
SMC = number of matches / number of attributes
= (f11 + f00) / (f01 + f10 + f11 + f00)
Jaccard Coefficient (J)
Counts only presences; it is frequently used for asymmetric binary attributes
J = number of 11 matches / number of non-zero attributes
= (f11) / (f01 + f10 + f11)
SMC versus Jaccard: Example
x = 1 0 0 0 0 0 0 0 0 0
y = 0 0 0 0 0 0 1 0 0 1
f01 = 2 (the number of attributes where x was 0 and y was 1)
f10 = 1 (the number of attributes where x was 1 and y was 0)
f00 = 7 (the number of attributes where x was 0 and y was 0)
f11 = 0 (the number of attributes where x was 1 and y was 1)
SMC = (f11 + f00) / (f01 + f10 + f11 + f00)
= (0+7) / (2+1+0+7) = 0.7
J = (f11) / (f01 + f10 + f11) = 0 / (2 + 1 + 0) = 0
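Not part of the slides: a few lines of Python that recompute the SMC and Jaccard values above from the two binary vectors.

```python
x = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]

f11 = sum(a == 1 and b == 1 for a, b in zip(x, y))
f00 = sum(a == 0 and b == 0 for a, b in zip(x, y))
f10 = sum(a == 1 and b == 0 for a, b in zip(x, y))
f01 = sum(a == 0 and b == 1 for a, b in zip(x, y))

smc = (f11 + f00) / (f01 + f10 + f11 + f00)   # 0.7
jaccard = f11 / (f01 + f10 + f11)             # 0.0
print(smc, jaccard)
```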
Cosine Similarity
If d1 and d2 are two document vectors, then
cos( d1, d2 ) = <d1,d2> / ||d1|| ||d2|| ,
where <d1,d2> indicates inner product or vector dot
product of vectors, d1 and d2, and || d || is the length of
vector d.
Example:
d1 = 3 2 0 5 0 0 0 2 0 0
d2 = 1 0 0 0 0 0 0 1 0 2
<d1, d2> = 3*1 + 2*0 + 0*0 + 5*0 + 0*0 + 0*0 + 0*0 + 2*1 + 0*0 + 0*2 = 5
|| d1 || = (3*3 + 2*2 + 0*0 + 5*5 + 0*0 + 0*0 + 0*0 + 2*2 + 0*0 + 0*0)^0.5 = (42)^0.5 = 6.481
|| d2 || = (1*1 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 1*1 + 0*0 + 2*2)^0.5 = (6)^0.5 = 2.449
cos(d1, d2 ) = 0.3150
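Not part of the slides: a short NumPy check of the cosine similarity computed above.

```python
import numpy as np

d1 = np.array([3, 2, 0, 5, 0, 0, 0, 2, 0, 0])
d2 = np.array([1, 0, 0, 0, 0, 0, 0, 1, 0, 2])

cos = d1 @ d2 / (np.linalg.norm(d1) * np.linalg.norm(d2))   # <d1,d2> / (||d1|| ||d2||)
print(round(cos, 4))   # 0.315
```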
Correlation measures the linear relationship between objects: for two data objects x and y,
corr(x, y) = covariance(x, y) / (std(x) · std(y))
Drawback of Correlation (Non-linear Data)
x = (-3, -2, -1, 0, 1, 2, 3)
y = (9, 4, 1, 0, 1, 4, 9)
yi = xi²
mean(x) = 0, mean(y) = 4
std(x) = 2.16, std(y) = 3.74
corr(x, y) = [(−3)(5) + (−2)(0) + (−1)(−3) + (0)(−4) + (1)(−3) + (2)(0) + (3)(5)] / (6 × 2.16 × 3.74)
           = 0
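Not part of the slides: a NumPy check of the example above, showing that Pearson correlation misses the perfect but non-linear relationship y = x².

```python
import numpy as np

x = np.array([-3, -2, -1, 0, 1, 2, 3])
y = x ** 2                      # (9, 4, 1, 0, 1, 4, 9)

print(np.corrcoef(x, y)[0, 1])  # ~0: no *linear* relationship is detected
```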
Correlation vs cosine vs Euclidean distance
Choice of the right proximity measure depends on the domain
What is the correct choice of proximity measure for the
following situations?
– Comparing documents using the frequencies of words
◆ Documents are considered similar if the word frequencies are similar
– Comparing the temperature in Celsius of two locations
◆ Two locations are considered similar if the temperatures are similar in
magnitude
– Comparing two time series of temperature measured in Celsius
◆ Two time series are considered similar if their “shape” is similar, i.e., they
vary in the same way over time, achieving minimums and maximums at
similar times, etc.