S27
• https://datascience.stackexchange.com/questions/11485/k-means-in-r-usage-of-nstart-parameter
Let’s do k-means clustering with “iris” data:
#Load two packages for special plot
• library(ClusterR)
• library(cluster)
#Get, check and make data
• data(iris)
• str(iris)
• iris_1 <- iris[,-5]
# Fitting K-Means clustering Model to training dataset
• set.seed(240)
• kmeans.res <- kmeans(iris_1, centers = 3, nstart = 20)
• kmeans.res
We have used k=3 as we know that there are 3 types of
flowers!
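The nstart argument asks kmeans to try several random starting configurations and keep the one with the lowest total within-cluster sum of squares. A minimal sketch of its effect, assuming the same numeric iris columns as above; the names fit_one and fit_many are illustrative and do not come from the slides:

```r
# Rebuild the numeric iris data (drop the Species column)
data(iris)
iris_1 <- iris[, -5]

# A single random start
set.seed(240)
fit_one <- kmeans(iris_1, centers = 3, nstart = 1)

# Twenty random starts; kmeans keeps the best of them
set.seed(240)
fit_many <- kmeans(iris_1, centers = 3, nstart = 20)

# Lower total within-cluster sum of squares means tighter clusters
c(one_start = fit_one$tot.withinss, twenty_starts = fit_many$tot.withinss)
```

With the seed reset before each call, the 20-start fit should be no worse than the single-start fit, which is the usual reason for setting nstart above 1.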
k-means fit:
• K-means clustering with 3 clusters of sizes 50, 62, 38
• Within cluster sum of squares by cluster:
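The printed fit can also be read piece by piece from the fitted object. The sketch below refits the same model so that it runs on its own, then pulls out the standard components of a kmeans result; comparing against the species labels is an added illustration, not something shown on the slide:

```r
# Refit the model from the slide so this block is self-contained
data(iris)
iris_1 <- iris[, -5]
set.seed(240)
kmeans.res <- kmeans(iris_1, centers = 3, nstart = 20)

kmeans.res$size          # number of observations in each cluster
kmeans.res$withinss      # within-cluster sum of squares, one value per cluster
kmeans.res$tot.withinss  # total of the per-cluster values above

# Cross-tabulate the clusters against the known species labels
table(iris$Species, kmeans.res$cluster)
```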
In Data Science, we need to use all four methods to find the best k, and then fit k-means for each of the best k's. Then select the best clustering model based on the highest R-square value.
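A rough sketch of that comparison, assuming "R-square" here means the ratio of between-cluster to total sum of squares that kmeans prints as between_SS / total_SS. The vector candidate_k is a hypothetical shortlist; in practice it would hold the k values suggested by the selection methods:

```r
data(iris)
iris_1 <- iris[, -5]

# Hypothetical shortlist of k values; replace with the k's your methods suggest
candidate_k <- c(2, 3, 4)

set.seed(240)
r_square <- sapply(candidate_k, function(k) {
  fit <- kmeans(iris_1, centers = k, nstart = 20)
  fit$betweenss / fit$totss  # share of total variation explained by the clusters
})
names(r_square) <- paste0("k=", candidate_k)
round(r_square, 3)
```

This ratio tends to rise as k grows, so it is most informative when compared only across the shortlisted k's rather than over an open-ended range.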
Question/queries?
• Next two classes:
• Final class: Projects in R