Time series analysis in neuroscience
Lecture 10. Clustering and Classification
Alexander Zhigalov / Dept. of CS, University of Helsinki and Dept. of NBE, Aalto University
Outline / overview
Section 1. Machine learning
Section 2. Clustering
Section 3. Classification
Section 4. Regression
Section 1. Machine learning
Machine learning
[Figure: overview of machine-learning techniques, from "Introducing Machine Learning", MathWorks]
Machine learning
Unsupervised learning is useful when you want to explore
your data but do not yet have a specific goal or are not sure
what information the data contains.
A supervised learning algorithm takes a known set of input
data (the training set) and known responses to the data
(output), and trains a model to generate reasonable
predictions for the response to new input data.
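As a minimal illustration (not from the course scripts; names and data are made up), the same data can go to an unsupervised or a supervised scikit-learn estimator; only the latter sees the responses y:
# sketch: unsupervised vs supervised learning (illustrative names and data)
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

X = np.random.randn(100, 2)            # input data: 100 samples, 2 features
y = (X[:, 0] > 0).astype(int)          # known responses (labels)

labels = KMeans(n_clusters=2).fit(X).labels_   # unsupervised: labels unused
model = SVC(kernel='linear').fit(X, y)         # supervised: trained on (X, y)
pred = model.predict(np.random.randn(10, 2))   # predictions for new inputs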
Section 2. Clustering
Clustering
In cluster analysis, data is partitioned into groups based on
some measure of similarity or shared characteristic.
Clusters are formed so that objects in the same cluster are
very similar and objects in different clusters are very
distinct.
K-means clustering
How it works
Partitions data into k mutually exclusive clusters.
How well a point fits into a cluster is determined by the
distance from that point to the cluster’s center.
Best used ...
• When the number of clusters is known
• For fast clustering of large data sets
Result
Cluster centers
K-means clustering (1/2)
The number of clusters (K) is equal to the
number of sources.
# create R noisy copies of each source signal (excerpt; i runs over the M sources)
X[i] = np.tile(S[i, :], (R, 1)) + np.random.randn(R, N) * SNR
# measurements: randomly permute the M*R copies
Y = X[np.random.permutation(M*R), :]
# clustering using sklearn
model = cluster.KMeans(n_clusters=K)
model.fit(Y)
# clustering outcome
labels = model.labels_
Z = model.cluster_centers_
inertia = model.inertia_
print(inertia)  # within-cluster sum-of-squares

Result: inertia = 0.0
See "L10_clustering_kmeans.py"
K-means clustering (2/2)
The number of clusters (K) is greater or less
than the number of sources.
# create R noisy copies of each source signal
X[i] = np.tile(S[i, :], (R, 1)) + np.random.randn(R, N) * SNR
# measurements: randomly permute the M*R copies
Y = X[np.random.permutation(M*R), :]
# clustering using sklearn
model = cluster.KMeans(n_clusters=K)
model.fit(Y)
# clustering outcome
labels = model.labels_
Z = model.cluster_centers_
inertia = model.inertia_
print(inertia)

Results: inertia = 603.5 and inertia = 0.0 (for the two choices of K)
See "L10_clustering_kmeans.py"
Noisy measurements (1/2)
How does it work in the presence of noise?
# create R noisy copies of each source signal
X[i] = np.tile(S[i, :], (R, 1)) + np.random.randn(R, N) * 0.5
# measurements: randomly permute the M*R copies
Y = X[np.random.permutation(M*R), :]

Result: inertia = 2550.0
See "L10_clustering_kmeans.py"
Noisy measurements (2/2)
Can the algorithm put the noise into a single
cluster?
# create R noisy copies of each source signal
X[i] = np.tile(S[i, :], (R, 1)) + np.random.randn(R, N) * 0.5
# measurements: randomly permute the M*R copies
Y = X[np.random.permutation(M*R), :]

Results: inertia = 3381.0 and inertia = 2279.0 (for the two clusterings shown)
See "L10_clustering_kmeans.py"
Hierarchical clustering
How it works
Produces nested sets of clusters by analyzing similarities
between pairs of points and grouping objects into a binary,
hierarchical tree.
Best used ...
• When you do not know in advance how many clusters
are in your data
• When you want visualization to guide your selection
Result
Dendrogram showing the hierarchical relationship
between clusters
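sklearn's AgglomerativeClustering (used on the following slides) does not draw the dendrogram itself; a minimal sketch with scipy, assuming the measurement matrix Y from the course scripts (rows are signals):
# sketch: dendrogram via scipy (Y as in "L10_clustering_hierarchical.py" is assumed)
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

LY = linkage(Y, method='ward', metric='euclidean')  # nested binary tree of merges
dendrogram(LY)  # plot the hierarchy; cut height suggests the number of clusters
plt.show()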
Hierarchical clustering (1/2)
What are the distance measures between
signals?
# pair-wise distances between signals (pdist is scipy.spatial.distance.pdist)
PX = np.zeros((MR, MR))
PX[np.triu_indices(MR, 1)] = pdist(X, 'euclidean')
# distances after permutation
PY = np.zeros((MR, MR))
PY[np.triu_indices(MR, 1)] = pdist(Y, 'euclidean')
See "L10_clustering_hierarchical.py"
Hierarchical clustering (2/2)
How does it work?
# clustering (defaults: two clusters, ward linkage)
model = cluster.AgglomerativeClustering()
model.fit(Y)
labels = model.labels_
children = model.children_  # the merges forming the binary tree
See "L10_clustering_hierarchical.py"
Noisy measurements
How does noise affect the clustering
results?
# create R noisy copies of each source signal
X[i] = np.tile(S[i, :], (R, 1)) + np.random.randn(R, N) * 0.5
# measurements: randomly permute the M*R copies
Y = X[np.random.permutation(M*R), :]
See "L10_clustering_hierarchical.py"
Hierarchical clustering 2D (1/2)
Could we cluster the covariance matrix
instead of the sources?
# covariance
CX = np.cov(X)
CY = np.cov(Y)
# clustering the rows of the covariance matrix
model = cluster.AgglomerativeClustering(n_clusters=K)
model.fit(CY)  # fit on CY, not Y, to cluster the covariance structure
labels = model.labels_
# sort the matrix by cluster label
indices = np.squeeze(np.argsort(labels))
CZ = CY[indices, :]
CZ = CZ[:, indices]
See "L10_clustering_hierarchical_2D.py"
Hierarchical clustering 2D (2/2)
Sub-optimal number of clusters …
# covariance
CX = np.cov(X)
CY = np.cov(Y)
# clustering the rows of the covariance matrix
model = cluster.AgglomerativeClustering(n_clusters=K)
model.fit(CY)
labels = model.labels_
# sort the matrix by cluster label
indices = np.squeeze(np.argsort(labels))
CZ = CY[indices, :]
CZ = CZ[:, indices]
See "L10_clustering_hierarchical_2D.py"
Gaussian Mixture Model
How it works
Partition-based clustering where data points come from
different multivariate normal distributions with certain
probabilities.
Best used ...
• When a data point might belong to more than one
cluster
• When clusters have different sizes and correlation
structures within them
Result
A model of Gaussian distributions that gives the probability of
a point being in a cluster
Gaussian Mixture Model (1/2)
Why does it work?
# generate data (two variables with different means and spreads)
Z = [np.random.randn(1, N) * 0.5 + 0.0,
     np.random.randn(1, N) * 1.25 + 4.0]
# scatter plot (joint distribution)
plt.scatter(Z[0], Z[1])
# gaussian PDF of each variable (norm is scipy.stats.norm)
b = np.linspace(-3, 10, 1000)
p0 = norm.pdf(b, np.mean(Z[0]), np.std(Z[0]))
p1 = norm.pdf(b, np.mean(Z[1]), np.std(Z[1]))
# multivariate gaussian PDF; mlab.bivariate_normal has been removed from
# matplotlib, so scipy.stats.multivariate_normal is used here instead
delta = 0.1  # grid step (value assumed; not defined in the excerpt)
x, y = np.meshgrid(np.arange(-10.0, 10.0, delta), np.arange(-10.0, 10.0, delta))
rv = multivariate_normal([np.mean(Z[0]), np.mean(Z[1])],
                         [[np.var(Z[0]), 0.0], [0.0, np.var(Z[1])]])
plt.contour(x, y, rv.pdf(np.dstack((x, y))))
See "L10_clustering_gmm.py"
Gaussian Mixture Model (2/2)
How does it work?
# fit a K-component mixture model
model = mixture.GaussianMixture(n_components=K)
model.fit(X)
# model properties
Y = model.predict(X)  # most likely cluster per point
model_mu = model.means_
model_cov = model.covariances_
See "L10_clustering_gmm.py"
Section 3. Classification
Classification
Classification techniques predict discrete responses, for
example, whether an email is genuine or spam.
Classification models are trained to classify data
into categories.
Support Vector Machine
How it works
Classifies data by finding the linear decision boundary
(hyperplane) that separates all data points of one class from
those of the other class. The best hyperplane for an SVM is
the one with the largest margin between the two classes.
Best used ...
• For data that has exactly two classes
• For high-dimensional, nonlinearly separable data
• When you need a classifier that is simple, easy to
interpret, and accurate
Result
Training/fitting transforms data and labels to coefficients,
while testing/prediction transforms data and coefficients to
labels.
Support Vector Machine
When it works
SVM works if data has exactly two classes.
Support Vector Machine (1/2)
SVM, like any other classification approach,
consists of two stages: training and testing.
# data (M channels, N samples)
X = np.random.randn(M, N)
# binary labels (get_sequence is a helper defined in the course script)
y = get_sequence(5, 0.8, N)
# induce some correlation between X and y
X = X + 2.0 * np.tile(y, (M, 1))
# training and testing datasets
L = N // 2
Y = y[:L]      # training labels
U = y[L:]      # testing labels
XY = X[:, :L]  # training data
XU = X[:, L:]  # testing data
# train classifier
model = SVC(kernel='linear')
model.fit(XY.T, Y)
See "L10_classification_svm_2_signals.py"
Support Vector Machine (2/2)
The classifier gives the coefficients, which can
be converted into a decision function.
# classifier outcome (coef_ is the public attribute)
coef = model.coef_
intercept = model.intercept_
# decision function
Z = np.zeros(N)
for i in range(N):
    Z[i] = np.dot(coef, X[:, i]) + intercept
# testing
v = U
u = model.predict(XU.T)
u = u > 0.5
# accuracy
a = np.mean(v == u)
print('accuracy: %1.2f' % a)

Result: accuracy = 99%
See "L10_classification_svm_2_signals.py"
Classification accuracy (1/2)
What if the two datasets cannot be clearly
separated?
# data
X = np.random.randn(M, N)
# binary labels
y = get_sequence(5, 0.8, N)
# induce some correlation between X and y
X = X + 2.0 * np.tile(y, (M, 1))
# training and testing datasets
L = N // 2
Y = y[:L]      # training labels
U = y[L:]      # testing labels
XY = X[:, :L]  # training data
XU = X[:, L:]  # testing data
# train classifier
model = SVC(kernel='linear')
model.fit(XY.T, Y)
See "L10_classification_svm_2_signals.py"
Classification accuracy (2/2)
The decision function looks like random noise.
# classifier outcome
coef = model.coef_
intercept = model.intercept_
# decision function
Z = np.zeros(N)
for i in range(N):
    Z[i] = np.dot(coef, X[:, i]) + intercept
# testing
v = U
u = model.predict(XU.T)
u = u > 0.5
# accuracy
a = np.mean(v == u)
print('accuracy: %1.2f' % a)

Result: accuracy = 76%
See "L10_classification_svm_2_signals.py"
Multichannel recordings (1/2)
More channels can provide a better model fit,
but overfitting may also occur.
# data
X = np.random.randn(M, N)
# binary labels
y = get_sequence(5, 0.8, N)
# induce some correlation between X and y
X = X + 2.0 * np.tile(y, (M, 1))

Result: accuracy = 100%
See "L10_classification_svm.py"
Multichannel recordings (2/2)
In case of weak correlation between data
and labels …
# data
X = np.random.randn(M, N)
# binary labels
y = get_sequence(5, 0.8, N)
# induce a weaker correlation between X and y
X = X + 0.5 * np.tile(y, (M, 1))

Result: accuracy = 67%
See "L10_classification_svm.py"
Discriminant Analysis
How it works
Discriminant analysis classifies data by finding linear
combinations of features. Discriminant analysis assumes
that different classes generate data based on Gaussian
distributions.
Best used ...
• When you need a simple model that is easy to interpret
• When you need a model that is fast to predict
Result
Coefficients
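The deck shows no code for discriminant analysis; a minimal sketch with scikit-learn, assuming the training/testing split (XY, Y, XU) from the SVM slides:
# sketch: linear discriminant analysis (XY, Y, XU from the SVM slides assumed)
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

model = LinearDiscriminantAnalysis()
model.fit(XY.T, Y)       # fit Gaussian class models with a shared covariance
u = model.predict(XU.T)  # predicted class labels
coef = model.coef_       # linear combination of features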
Logistic regression
How it works
Fits a model that can predict the probability of a binary
response belonging to one class or the other. Because of its
simplicity, logistic regression is commonly used as a starting
point for binary classification problems.
Best used ...
• When data can be clearly separated by a single, linear
boundary
• As a baseline for evaluating more complex classification
methods
Result
Coefficients
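Again no code in the deck; a minimal sketch with scikit-learn, reusing the same assumed split:
# sketch: logistic regression as a baseline binary classifier
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(XY.T, Y)             # train on labeled data
p = model.predict_proba(XU.T)  # probability of each class per sample
u = model.predict(XU.T)        # predicted binary labels
coef = model.coef_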
Section 4. Regression
Regression
Regression techniques predict continuous responses, for
example, changes in temperature or fluctuations in
electricity demand.
Linear regression
How it works
Linear regression is a statistical modeling technique used to
describe a continuous response variable as a linear function
of one or more predictor variables.
Best used ...
• When you need an algorithm that is easy to interpret
and fast to fit
• As a baseline for evaluating other, more complex,
regression models
Result
Coefficients
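A minimal sketch (illustrative data, not from the course scripts):
# sketch: linear regression of a continuous response on one predictor
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.random.randn(100, 1)               # predictor variable
y = 2.0 * x[:, 0] + np.random.randn(100)  # noisy continuous response
model = LinearRegression().fit(x, y)
print(model.coef_, model.intercept_)      # fitted coefficients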
Nonlinear regression
How it works
Nonlinear regression is a statistical modeling technique that
helps describe nonlinear relationships in experimental data.
Best used ...
• When data has strong nonlinear trends and cannot be
easily transformed into a linear space
• For fitting custom models to data
Result
Coefficients
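A minimal sketch using scipy's curve_fit to fit a custom nonlinear model (model and data are illustrative):
# sketch: nonlinear regression with a custom exponential model
import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b):  # custom model: y = a * exp(b * x)
    return a * np.exp(b * x)

x = np.linspace(0.0, 1.0, 100)
y = 1.5 * np.exp(0.8 * x) + np.random.randn(100) * 0.05
params, cov = curve_fit(f, x, y)  # fitted coefficients a, b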
SVM regression
How it works
SVM regression algorithms work like SVM classification
algorithms, but are modified to be able to predict a
continuous response.
Best used ...
• For high-dimensional data with a large number of
predictor variables
Result
Coefficients
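A minimal sketch with scikit-learn's SVR (illustrative data; a linear kernel so the coefficients are exposed):
# sketch: support vector regression for a continuous response
import numpy as np
from sklearn.svm import SVR

x = np.random.randn(200, 5)  # many predictor variables
y = x[:, 0] - 0.5 * x[:, 1] + np.random.randn(200) * 0.1
model = SVR(kernel='linear').fit(x, y)
coef = model.coef_           # coefficients (available for the linear kernel)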
Literature
• Python programming language
  - http://www.scipy-lectures.org/, see "materials/L02_ScipyLectures.pdf"
• Data analysis
  - Andreas Müller and Sarah Guido, "Introduction to Machine Learning with Python: A Guide for Data Scientists", O'Reilly Media, 2016