Machine Learning for Business

This webinar covers machine learning fundamentals including supervised and unsupervised learning. Supervised learning involves using labelled training data to develop models that can predict outputs for new data, such as classification and regression models. Unsupervised learning is used to find hidden patterns in unlabeled data through techniques like clustering.

Uploaded by

slowkimo

Machine Learning Fundamentals

Introduction

This webinar covers

• Identifying the needs and goals
• Analysing the requirements
• Gathering and preprocessing data
• Understanding how to apply machine learning in commercial settings

Lecturer: Samson Hui

IT Support for Research: https://www.polyu.edu.hk/its/researchsupport/en/
Materials on Git Repo: https://polyu.hk/OJETT
Contact Person

Timothy Yim
Senior Specialist
Information and Technology Service
timothy.yim@polyu.edu.hk
Computer vs Human

• Computers are good at

• 94893 × 1235 = 117192855, 2394 ÷ 13804 = 0.17342799…
• Fast memory
• Fast calculation
• Fast signal transmission

• Humans are good at

• Recognition

• Thinking outside the box

• Making decisions based on intelligence and life experience
Our Goal

• Develop algorithms and models so that computers can perform tasks that humans have traditionally been better at.

• With the help of high computational power and data storage, computers can hopefully outperform humans in terms of accuracy, speed and volume.
AI vs Machine Learning vs Deep Learning

• Artificial intelligence – programs and machines that solve problems like humans

• Machine learning is a subset of AI – programs that learn without being explicitly programmed

• Deep learning is a subset of machine learning – based on neural networks

Data Visualization

 Data visualization is the graphical representation of information and data.

 Charts, graphs and maps

 Key tools to tell stories

 Curating data into a form easier to understand

Data Visualization – Data Table vs Graph
Data Visualization – Classifying Iris

Problem:
• Classify three species of Iris: Setosa, Versicolour and Virginica.

Existing dataset information

• Sepal length (cm)
• Sepal width (cm)
• Petal length (cm)
• Petal width (cm)
Data Visualization – Demo with Orange
Data Preprocessing

 Data Cleaning

 Data Integration

 Data Transformation
Data Transformation

• Example: distance (1 km, 1 m, 1 cm)

• Measurements on different scales do not contribute equally to a model.

• Correlation is much more important than the raw scale.

• A data transformation process or feature scaling method is therefore needed.
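To illustrate why the unit of measurement matters, here is a minimal sketch using only the Python standard library. The data points and the helper `nearest` are hypothetical and invented for illustration. The same three items are described once with distance in kilometres and once in metres, and the nearest neighbour of the first item changes.

```python
import math

# Hypothetical items described by (distance, height).
# Height is always in metres; only the unit of the distance changes.
points_km = [(1.0, 1.70), (1.1, 1.75), (1.0, 1.20)]   # distance in km
points_m = [(d * 1000.0, h) for d, h in points_km]     # distance in m

def nearest(points, query_index):
    """Return the index of the point closest to points[query_index]."""
    query = points[query_index]
    others = [i for i in range(len(points)) if i != query_index]
    return min(others, key=lambda i: math.dist(query, points[i]))

nearest_km = nearest(points_km, 0)   # neighbour of item 0 when distance is in km
nearest_m = nearest(points_m, 0)     # neighbour of item 0 when distance is in m

print(nearest_km, nearest_m)
```

With kilometres the two features have comparable magnitudes, while with metres the distance feature dominates the Euclidean distance, so a different item is picked as the nearest one.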
Standardization

• Standardization is a widely used data transformation technique that changes feature vectors into a representation more suitable for learning algorithms.

• Transform the data to centre it by removing the mean value of each feature, then scale it by dividing non-constant features by their standard deviation.

• Therefore, the scaled data has zero mean and unit variance.

Standardization Demo with Python
Example Python Code for Standardization
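The demo code itself did not survive in this text, so the following is only a minimal sketch of standardization and not the lecturer's original code. It uses the Python standard library; the function name `standardize` and the sample values are assumptions made for illustration. Libraries such as scikit-learn offer the same transformation through `StandardScaler`.

```python
import statistics

def standardize(column):
    """Centre a feature on zero mean and scale it to unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)   # population standard deviation
    if std == 0:
        # Constant feature: only centre it, never divide by zero.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]

# Hypothetical feature values, e.g. petal lengths in cm.
petal_length = [1.4, 4.7, 6.0, 5.1, 1.5]
scaled = standardize(petal_length)

scaled_mean = statistics.fmean(scaled)   # close to 0
scaled_std = statistics.pstdev(scaled)   # close to 1
print(scaled_mean, scaled_std)
```

Constant features are only centred, which matches the rule of dividing non-constant features by their standard deviation.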
Machine Learning Algorithms

• Machine learning algorithms or models are used to make decisions or predictions from data, e.g. KNN, neural networks, SVM…

• The model is said to learn from existing data and to give outputs for new data.

• For example, predicting traffic patterns

• We will focus on machine learning algorithms in later webinars.
Validation

We need to know that our trained algorithm is working as expected.

Validation is important before we publish our machine learning program to the world.

The challenges of validation

• Limited existing data
• Past data may not represent the future

We will introduce one of the widely used validation methods for addressing these problems.
K-Fold Cross-Validation

• Cross-validation is a resampling procedure

• Used to evaluate a model on a limited data sample
• k refers to the number of folds

Scenario
• The data set is divided into k groups, e.g. 10 groups
• 9 groups of data are used to train the machine learning model and the remaining group is used for testing.
• Iterate so that each group in turn becomes the testing data set.
K-Fold Cross-Validation
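As a rough sketch of the splitting procedure, the snippet below divides sample indices into k folds with the Python standard library. The function name `k_fold_indices`, the sample count and the seed are assumptions for illustration; in practice a library helper such as scikit-learn's `KFold` would normally be used.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k groups of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 50 hypothetical samples split into 10 folds: 45 for training, 5 for testing.
splits = list(k_fold_indices(n_samples=50, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Every sample is used for testing exactly once, and the k test scores are usually averaged to estimate how the model performs on unseen data.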
Applied Machine Learning in Business
Thanks!
Machine Learning Fundamentals (Session 2)
Objectives

• Machine learning is a subset of AI

• Learning without being explicitly programmed

• Supervised learning vs unsupervised learning

• Now we are going to learn how to build a self-learning program
How Humans Learn

Imagine we are learning how to throw darts…

Our goal is to hit specific targets, e.g. bull's eye, triple 20, single 16…

[Diagram labels: Brain, Eyes (feedback), Algorithms]
Supervised Learning

• Trained on a pre-defined set of data

• Reach conclusion when given new data.

• Develop the function y = f(x), where x is the input

Supervised Learning – Classification vs Regression

• Supervised learning problems can be further grouped into classification and regression problems.

• Classification – when the output variable is a category, e.g. true or false, red or blue

• Regression – when the output variable is a real value, e.g. exchange rate, weight
Supervised Learning – K Nearest Neighbors

• K nearest neighbours is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (e.g. a distance function).

• A new case is classified by a majority vote of its neighbours

• Closeness is measured by a distance function

• If K = 1, the case is assigned to the class of its nearest neighbour

Supervised Learning – K Nearest Neighbors

• When K=3, Class B

• When K=6, Class A
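A minimal sketch of the voting rule, using only the Python standard library, is shown here. The function name `knn_predict` and the toy points are hypothetical; they were chosen so that the prediction switches from Class B to Class A between K=3 and K=6 and are not the data behind the slide's figure.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by a majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data around the query point (0, 0).
train_points = [(0.5, 0.0), (0.0, 0.6), (0.8, 0.0), (1.5, 0.0),
                (0.0, 1.6), (-1.7, 0.0), (3.0, 3.0)]
train_labels = ["B", "B", "A", "A", "A", "A", "B"]
query = (0.0, 0.0)

pred_k1 = knn_predict(train_points, train_labels, query, k=1)
pred_k3 = knn_predict(train_points, train_labels, query, k=3)
pred_k6 = knn_predict(train_points, train_labels, query, k=6)
print(pred_k1, pred_k3, pred_k6)
```

The two closest points belong to Class B, so small values of K vote for B, while the wider neighbourhood at K=6 contains four Class A points and the vote flips.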

Supervised Learning – K Nearest Neighbors

The black line is the decision boundary.
KNN – How K Influences the algorithm

• The boundary becomes smoother as the value of K increases.

• When K is 1, the algorithm overfits the boundary.

• When K is infinite (in practice, as large as the whole data set), the prediction always becomes a single class, the overall majority, which is useless…
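The effect of K can be sketched with a small hypothetical data set that contains one noisy Class B point inside the Class A region. The function `knn_predict` and all coordinates below are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(points, labels, query, k):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(zip(points, labels), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Class A on the left (5 points), class B on the right (3 points),
# plus one noisy B point at (0.5, 0.5) inside the A region.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1),
          (0.5, 0.5), (5, 5), (6, 5), (5, 6)]
labels = ["A", "A", "A", "A", "A", "B", "B", "B", "B"]

near_noise = (0.5, 0.6)    # query next to the noisy point
in_b_region = (5.5, 5.5)   # query clearly inside the B region

pred_noise_k1 = knn_predict(points, labels, near_noise, k=1)             # follows the noise
pred_noise_k3 = knn_predict(points, labels, near_noise, k=3)             # smoothed out
pred_b_k3 = knn_predict(points, labels, in_b_region, k=3)                # local vote
pred_b_all = knn_predict(points, labels, in_b_region, k=len(points))     # global majority
print(pred_noise_k1, pred_noise_k3, pred_b_k3, pred_b_all)
```

With K=1 the single noisy point decides the prediction, with K=3 its influence is outvoted, and with K equal to the size of the data set every query receives the overall majority class regardless of where it lies.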
Error Rates

Most of the time, our trained model will have errors

• Classifying the target to a wrong class
• The predicted value is not exactly equal to the real value

We calculate the error rate to evaluate the effectiveness of our trained model
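As a small sketch, the two kinds of error can be measured as follows. The function names and the sample predictions are hypothetical, and the mean absolute error is just one common choice for measuring how far predicted values lie from the real ones.

```python
def classification_error_rate(y_true, y_pred):
    """Fraction of cases assigned to the wrong class."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between predicted and real values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical classification result: one of four cases is misclassified.
labels_true = ["setosa", "virginica", "versicolour", "setosa"]
labels_pred = ["setosa", "versicolour", "versicolour", "setosa"]
error_rate = classification_error_rate(labels_true, labels_pred)

# Hypothetical regression result, e.g. predicted exchange rates.
values_true = [7.8, 7.9, 8.0]
values_pred = [7.7, 7.9, 8.2]
mae = mean_absolute_error(values_true, values_pred)

print(error_rate, mae)
```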

Bayes Error
• The lowest possible error rate for any classifier of a random outcome; it is analogous to the irreducible error.
Error Rates

In the KNN example, we fine-tune the value of K to lower the error as much as possible.

But what if we cannot improve the success rate any further and it is still bad…
Supervised Learning – Neural Network
Supervised Learning – Neural Network History

• Warren McCulloch and Walter Pitts (1943) opened the subject by creating a computational model for neural networks.

• The first functional networks with many layers were published by Ivakhnenko and Lapa in 1965.

• The basics of continuous backpropagation were derived in the context of control theory by Kelley in 1960 and by Bryson in 1961, using principles of dynamic programming.

• In 1970, a lot of research was carried out, but it stagnated because computers at that time lacked sufficient power to process useful neural networks.

• Recently, the rise of high-performance GPUs and CPUs has made multi-layer neural networks feasible, and neural networks have become popular.
Supervised Learning – Neural Network

Neural networks are computing systems vaguely inspired by the biological neural networks that constitute animal brains.

Components
• Neurons
• Input layer
• Hidden layer
• Output layer

• Connections and Weights
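To show how these components fit together, here is a minimal sketch of a forward pass through one hidden layer and one output layer. The weights, biases, layer sizes and the sigmoid activation are all hypothetical choices for illustration; a real network learns its weights from training data.

```python
import math

def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: each neuron computes a weighted sum
    of all inputs plus a bias, then applies the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hand-picked, hypothetical weights: 4 inputs -> 3 hidden neurons -> 3 outputs.
hidden_weights = [[0.5, -0.2, 0.1, 0.4],
                  [-0.3, 0.8, -0.5, 0.2],
                  [0.1, 0.1, 0.6, -0.7]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.3, -0.6, 0.9],
                  [-0.4, 0.2, 0.5],
                  [0.7, 0.1, -0.3]]
output_biases = [0.0, 0.0, 0.0]

inputs = [5.1, 3.5, 1.4, 0.2]   # input layer: four hypothetical measurements
hidden = layer(inputs, hidden_weights, hidden_biases)     # hidden layer
outputs = layer(hidden, output_weights, output_biases)    # output layer
print(outputs)
```

Each inner list of weights corresponds to the connections arriving at one neuron, so training a network amounts to adjusting these numbers until the outputs match the labelled data.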

Supervised Learning – Classifying Iris

Problem:
• Classify three species of Iris: Setosa, Versicolour and Virginica.

Existing dataset information

• Sepal length (cm)
• Sepal width (cm)
• Petal length (cm)
• Petal width (cm)
Unsupervised Learning

• Dataset without labelled responses

• Find hidden patterns

• Find grouping in data

• Usually less accurate and trustworthy

• Clustering is a common unsupervised learning technique
Clustering

• Involves the grouping of data points

• Similar properties within the same group

• Highly dissimilar properties in different groups

• Works best if the classes do not overlap

Examples:
• K-means clustering
• Hierarchical clustering
• Fuzzy c-means clustering
K-means Clustering

• Target number k – the number of centroids

• A centroid is the imaginary or real location representing the centre of a cluster

• Allocates every data point to the nearest cluster

• Keeps each cluster as compact as possible, i.e. points stay close to their centroid

K-means Clustering - Steps

1. Randomly initialize the centres for a chosen number of classes/groups

2. Assign each point to the closest centre

3. Recompute the centres as the means of their data points

4. Iterate a set number of times or until the centres no longer change much
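The four steps can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the function name `k_means`, the tolerance, the seed and the sample points are hypothetical, and initial centres are drawn from the data points themselves. Library implementations such as scikit-learn's `KMeans` add better initialization and other refinements.

```python
import math
import random

def k_means(points, k, max_iterations=100, tolerance=1e-6, seed=0):
    """Return (centres, clusters) found by the basic k-means iteration."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centres.
    centres = rng.sample(points, k)
    clusters = []
    for _ in range(max_iterations):
        # Step 2: assign each point to its closest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            closest = min(range(k), key=lambda i: math.dist(p, centres[i]))
            clusters[closest].append(p)
        # Step 3: recompute each centre as the mean of its points.
        new_centres = []
        for centre, cluster in zip(centres, clusters):
            if cluster:
                new_centres.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centres.append(centre)   # leave an empty cluster's centre in place
        # Step 4: stop once the centres have (almost) stopped moving.
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tolerance:
            break
    return centres, clusters

# Two hypothetical, well-separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centres, clusters = k_means(points, k=2)
print(sorted(centres))
```

Because the starting centres are random, different seeds can lead to different results on harder data, which is why k-means is often run several times and the most compact clustering is kept.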

Summary

Supervised learning
• Labelled data
• Develops a finely tuned function to predict from inputs
• Can be very precise, but data are harder to collect

Unsupervised learning
• Unlabelled data
• Finds hidden patterns
• Less trustworthy, but data are easier to collect
