Introduction to Artificial Intelligence,
Machine Learning, and Data Science
By Yaohang Li, Ph.D.
Department of Computer Science
Old Dominion University
yaohang@cs.odu.edu
-1-
Syllabus
• Class Web Page
– http://www.cs.odu.edu/~yaohang/cs795895
• Instructional E-Mail Addresses
– yaohang@cs.odu.edu
• Instructor: Yaohang Li
– Office phone: 757-683-7721
– Office location: 3212 E&CS
– Office hours:
• Tuesdays: 2:00PM-3:00PM
• by appointment
-2-
Projects
Participate in Kaggle Competitions
Project 1:
Project 2:
– To Be Released Later
-3-
Grading
Grading Policy
– 2 Kaggle projects: 100%
– You must participate in the Kaggle competitions to receive credit
-4-
Agenda
Artificial Intelligence
– Turing Test
– What can AI do today?
– What can’t AI do today?
Machine Learning
– Definition
– Types of Machine Learning
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement Learning
Machine Learning Framework
Data Science and Machine Learning
-5-
Artificial Intelligence
Goal of artificial intelligence
– Make machines do things that would require intelligence if done by
humans
Can a machine think?
– Vitally important
– But “thinking” is hard to define
– No simple answer of “Yes” or “No”
– A fuzzy answer
-6-
What is AI?
Artificial Intelligence
– Build machines that imitate human intelligence
1. Thinking Humanly
2. Acting Humanly
3. Thinking Rationally
4. Acting Rationally
Artificial Intelligence Applications
– Perform tasks done by a person
– But often faster and at a larger scale
-7-
Can a Machine Think?
Goal of artificial intelligence
– Make machines do things that would require intelligence if done by
humans
Can a machine think?
– Vitally important
– But “thinking” is hard to define
– No simple answer of “Yes” or “No”
– A fuzzy answer
-8-
Turing Test
Can a machine pass a behavioral test for intelligence?
– Turing Test: can a machine appear indistinguishable from a
human to an experimenter?
-9-
Eugene Goostman
Turing predicted that by the year 2000, computers would be intelligent
enough to trick humans into thinking they were human 30% of the time.
Eugene Goostman
– A chatterbot
– Portrayed as a 13-year-old Ukrainian boy
– On June 7, 2014, in a competition organized by the University of Reading, Eugene
convinced 33% of the contest judges that it was a real boy
– “Turing test has been passed for the first time”
- 10 -
Searle’s Chinese Room Experiment
Does passing the Turing test indicate that a machine has a mind?
Searle’s Chinese Room Experiment
– Locked in a room
– Input: Chinese Characters
– Processing: Rule book in English
– Output: Chinese Characters
Assuming that the output would make sense to a Chinese speaker, does
the person inside the room literally understand Chinese?
– Yes: “Strong AI”
– No: “Weak AI”
- 11 -
What can AI do today?
Playing Games
- 12 -
What CAN AI do today?
AlphaGo
– Convolutional Neural Network (CNN)
AlphaGo Zero
– Self-play reinforcement learning (the network learns by playing against itself)
Source: DeepMind - 13 -
What can AI do today?
Robotic Vehicles
– Stanley finished the 132-mile course of rough terrain to win the 2005 DARPA
Grand Challenge
– CMU’s Boss won the 2007 DARPA Urban Challenge
• Obey traffic rules and avoid pedestrians and other vehicles
- 14 -
What CAN AI do today?
Image Recognition
- 15 -
What can AI do today?
AI and Web Search
- 16 -
What can AI do today?
IBM’s Watson on Jeopardy
– Outperformed Brad Rutter and Ken Jennings in 2011
– Natural language processing, information retrieval, knowledge
representation, automated reasoning, and machine learning
- 17 -
What can AI do today?
STAR (Smart Tissue Autonomous Robot) bests human surgeon
Source: STAR - 18 -
What can AI do today?
Generating virtual arts
http://www.businessinsider.com/robot-can-paint-as-well-as-van-gogh-2015-9 - 19 -
What can AI do today?
Virtual Reality
Source: Metz and Collins, 2018 - 20 -
What can AI do today?
Collaborative Games
- 21 -
What CAN’T AI do today?
Robust Machine Translation
– Example: English to Russian
“The spirit is willing but the flesh is weak” (English original)
“The vodka is good but the meat is rotten” (after translation into Russian and back)
– Need to translate words and their meaning
- 22 -
What CAN’T AI do today?
Write an original story that can make you cry
- 23 -
What CAN’T AI do today?
Bug-free Software
- 24 -
What CAN’T AI do today?
APKFFRGGNWKMNGKRSLGI
ELIHTLGDAKLSADTEVVCG
PSITEKVVFQETKAIADNDA
WSKVEVHESRIYGGSVTNCK
ELASQHDVDGFLVGGASLKP
VDGFLHALAEGLGVDINAKH
...........
Predicting the 3-D Structure of a Protein from Its Sequence
Grand Challenge with Broad Economic and Scientific Impacts
[Figure: sequence-to-structure prediction; model accuracy scale in RMSD (Å) with markers for X-ray crystallization, NMR, native resolution, and low-resolution models with errors; “Our Goal” is marked on the scale]
- 25 -
What is Machine Learning?
“Learning is any process by which a system improves
performance by experience” (Herbert Simon)
Machine learning is the study of algorithms that (Tom
Mitchell, 1997)
– improve their performance P
– at some task T
– with experience E
– A well-defined learning task is given by <P, T, E>
- 27 -
Machine Learning Tasks
Improve on task T, with respect to performance metric P, based on
experience E
– T: Playing chess
– P: Percentage of games won against an arbitrary opponent
– E: Playing practice games against itself
– T: Recognizing hand-written words
– P: Percentage of words correctly classified
– E: Database of human-labeled images of handwritten words
– T: Driving on four-lane highways using vision sensors
– P: Average distance traveled before a human-judged error
– E: A sequence of images and steering commands recorded while observing a
human driver.
– T: Categorize email messages as spam or legitimate.
– P: Percentage of email messages correctly classified.
– E: Database of emails, some with human-given labels
- 28 -
Slide credit: R. Mooney
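The <P, T, E> view of the spam example can be sketched in a few lines. This is only an illustration under stated assumptions: the word-count classifier, the function names, and the toy emails below are invented for this sketch and are not part of the course material. It shows performance P (accuracy on held-out mail) for task T (spam vs. legitimate) changing as experience E (labeled emails) grows.

```python
from collections import Counter

def train(labeled_emails):
    """Experience E: count how often each word appears in spam vs. legitimate mail."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in labeled_emails:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    return spam_words, ham_words

def predict(model, text):
    """Task T: call a message spam if its words were seen more often in spam."""
    spam_words, ham_words = model
    words = text.lower().split()
    return sum(spam_words[w] for w in words) > sum(ham_words[w] for w in words)

def accuracy(model, labeled_emails):
    """Performance P: fraction of messages classified correctly."""
    correct = sum(predict(model, text) == is_spam for text, is_spam in labeled_emails)
    return correct / len(labeled_emails)

# Hypothetical toy data, made up for this sketch.
experience = [
    ("win free money now", True),
    ("meeting agenda for monday", False),
    ("free prize claim now", True),
    ("lunch on monday", False),
]
held_out = [
    ("claim your free prize", True),
    ("agenda for lunch meeting", False),
    ("claim a prize", True),
    ("monday meeting notes", False),
]

acc_small = accuracy(train(experience[:2]), held_out)  # less experience
acc_full = accuracy(train(experience), held_out)       # more experience
print(acc_small, acc_full)
```

On this toy data the model trained on all four emails classifies more held-out messages correctly than the model trained on two, which is the sense of "improves with experience" in the definition.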
Relation between AI, Machine Learning, and Data Science
- 29 -
Programming vs. Machine Learning
Programming
Machine Learning
- 30 -
When do we use machine learning?
Human expertise does not exist
– navigating on Mars
Humans can’t explain their expertise
– pattern recognition
Models must be customized
– personalized recommendations
Models are based on huge amounts of data
– human genome analysis
- 31 -
When do we use machine learning?
Machine learning is not always useful
– There is no need to “learn” to build a calculator
– There is no need to “learn” to calculate payroll
- 32 -
Types of Learning
Supervised (inductive) learning
– Given: training data + labels
Unsupervised learning
– Given: training data (without labels)
Semi-supervised learning
– Given: training data (not all of them have labels)
Reinforcement learning
– Rewards from previous actions
- 33 -
Supervised Learning
- Regression
Given (X1, y1), (X2, y2), ..., (Xn, yn), learn a function f(X) to
predict y given X
– y is real (regression)
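A minimal sketch of regression, assuming a single real-valued input and a straight-line model fitted by ordinary least squares. The function name `fit_line` and the sample points are made up for illustration; the slide does not prescribe a particular model.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for 1-D inputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Training pairs (X1, y1), ..., (Xn, yn); here they lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

def f(x):
    """The learned function: predicts a real-valued y for a new x."""
    return slope * x + intercept

print(slope, intercept, f(5.0))
```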
- 34 -
Supervised Learning
- Classification
Given (X1, y1), (X2, y2), ..., (Xn, yn), learn a function f(X) to
predict y given X
– y is categorical (classification)
f([image of an apple]) = “apple”
f([image of a tomato]) = “tomato”
f([image of a cow]) = “cow”
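A minimal sketch of classification, assuming each image has already been reduced to a two-number feature vector. The feature values are invented for this sketch, and the 1-nearest-neighbor rule is just one simple choice of f, not necessarily the one used in the course.

```python
import math

def nearest_neighbor(training_set, x):
    """Return the label of the training sample whose features are closest to x."""
    _, label = min(training_set, key=lambda pair: math.dist(pair[0], x))
    return label

# Hypothetical (feature vector, label) pairs; the numbers carry no real meaning.
training_set = [
    ((0.70, 0.30), "apple"),
    ((0.80, 0.35), "apple"),
    ((0.95, 0.90), "tomato"),
    ((0.90, 0.85), "tomato"),
    ((0.10, 0.10), "cow"),
    ((0.15, 0.05), "cow"),
]

def f(x):
    """The learned function: predicts a categorical label for a new sample."""
    return nearest_neighbor(training_set, x)

print(f((0.75, 0.30)), f((0.92, 0.80)), f((0.20, 0.10)))
```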
- 35 -
Unsupervised Learning
Given X1, X2, ..., Xn (without labels), discover the structure underlying X
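One way to "discover structure" without labels is clustering. The sketch below is a bare-bones k-means on 1-D points; the function name, the starting centers, and the data are assumptions made for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Unlabeled data X1, ..., Xn (made up): two groups, around 1 and around 5.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
print(centers, clusters)
```

No labels are given, yet the two groups are recovered from the distances between the points alone.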
- 36 -
Unsupervised Learning
- Human Genome Project Example
Group individuals by genetic similarity
- 37 -
Source: Human Genome Project
Unsupervised Learning
- Image Segmentation
- 38 -
Unsupervised Learning
- Distributed Representation
- 39 -
Source: GAN
Unsupervised Learning
- Outlier Detection
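A minimal sketch of outlier detection, assuming 1-D data and a rule based on the median absolute deviation (MAD). This is one common rule of thumb, not necessarily the method pictured on the slide; the threshold of 3.5 and the sample readings are assumptions.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (based on the MAD) exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical sensor readings with one value far from the rest.
readings = [10.0, 10.5, 9.8, 10.2, 9.9, 25.0, 10.1]
outliers = mad_outliers(readings)
print(outliers)
```

The median is used instead of the mean because a single extreme value shifts the mean and inflates the standard deviation, which can hide the very point being looked for.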
- 40 -
Semi-supervised Learning
In many applications, labeled data is rare
– Expensive
• Needs someone to label it
• Requires special equipment
Semi-supervised learning
– A mix of supervised and unsupervised learning
– Take advantage of unlabeled data
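A minimal sketch of how unlabeled data can help, assuming a simple self-training (pseudo-labeling) scheme with a nearest-centroid classifier on 1-D points. The scheme, the function names, and the data are assumptions for illustration, not the method used in the course.

```python
def fit_centroids(samples):
    """Nearest-centroid model: the mean of the 1-D points in each class."""
    groups = {}
    for x, label in samples:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}

def predict(centroids, x):
    """Assign x to the class with the closest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# Two labeled points and six unlabeled ones (all made up).
labeled = [(1.0, "A"), (4.0, "B")]
unlabeled = [0.5, 0.8, 1.5, 5.0, 5.5, 6.0]

# Step 1: supervised model from the labeled points only.
supervised_only = fit_centroids(labeled)
# Step 2: label the unlabeled points with that model (pseudo-labels).
pseudo_labeled = [(x, predict(supervised_only, x)) for x in unlabeled]
# Step 3: refit on labeled + pseudo-labeled data.
semi_supervised = fit_centroids(labeled + pseudo_labeled)

print(predict(supervised_only, 2.9), predict(semi_supervised, 2.9))
```

The unlabeled points pull the class centers toward where the data actually lies, so the prediction for a borderline point such as 2.9 changes after the refit.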
- 41 -
Image from analyticsvidhya.com
Semi-supervised Learning
- The Netflix Problem
- 42 -
Semi-supervised Learning
- March Madness
https://wtkr.com/2019/03/18/a-computer-helped-odu-researchers-fill-out-the-2015-march-madness-bracket-how-did-it-do/
- 43 -
Reinforcement Learning
Given a sequence of states and actions with (delayed)
rewards, output a policy
Policy
– A mapping from states to actions that tells you what to do in a given
state
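A minimal sketch of learning a policy from delayed rewards, assuming tabular Q-learning on a tiny invented environment: five cells in a row, with a reward only on reaching the last cell. The environment, the hyperparameters, and the random exploration scheme are assumptions for illustration.

```python
import random

N_STATES = 5                  # states 0..4 on a line; state 4 is the goal
ACTIONS = ("left", "right")

def step(state, action):
    """Move one cell; reward 1.0 only on reaching the goal (a delayed reward)."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Estimate action values Q(state, action) from randomly explored episodes."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# The policy maps each non-goal state to the action with the highest estimated value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The resulting `policy` dictionary is exactly the state-to-action mapping described above: in every cell it says to move toward the goal.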
- 44 -
Reinforcement Learning
- Game Playing
– Super Mario: https://www.youtube.com/watch?v=qv6UVOQ0F44
– Atari Breakout: https://www.youtube.com/watch?v=V1eYniJ0Rnk
– Robot Jump: https://www.youtube.com/watch?v=Gl3EjiVlz_4
- 45 -
Machine learning framework
Apply a prediction function to a feature representation of
the image to get the desired output
f([image of an apple]) = “apple”
f([image of a tomato]) = “tomato”
f([image of a cow]) = “cow”
- 47 -
Slide credit: L. Lazebnik
The machine learning framework
y = f(x)
– y: output
– f: prediction function
– x: sample representation
Training: given a training set of labeled samples {(x1,y1), …, (xN,yN)}, estimate
the prediction function f by minimizing the prediction error on the training
set
Testing: apply f to a never-before-seen test sample x and output the
predicted value y = f(x)
Evaluation: measure how good the learned model is
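The three phases can be sketched with a deliberately simple model, assuming 1-D samples and a threshold rule f(x) = (x > t). The rule, the function names, and the data are invented for this sketch.

```python
def train_threshold(samples):
    """Training: pick the threshold t that minimizes errors on the training set."""
    candidates = sorted(x for x, _ in samples)

    def errors(t):
        return sum((x > t) != y for x, y in samples)

    return min(candidates, key=errors)

def error_rate(t, samples):
    """Evaluation: fraction of samples where f(x) = (x > t) disagrees with the label."""
    return sum((x > t) != y for x, y in samples) / len(samples)

# Hypothetical labeled samples (x, y).
training_set = [(1.0, False), (2.0, False), (3.0, False),
                (6.0, True), (7.0, True), (8.0, True)]
test_set = [(2.5, False), (6.5, True), (4.0, False), (3.5, True)]

t = train_threshold(training_set)          # training
prediction = 5.0 > t                       # testing: apply f to an unseen sample
train_error = error_rate(t, training_set)  # evaluation on the training set
test_error = error_rate(t, test_set)       # evaluation on unseen samples
print(t, prediction, train_error, test_error)
```

The training error here is zero while the test error is not, which is why evaluation is done on samples the model has never seen.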
- 48 -
Slide modified from L. Lazebnik
Steps
Training
– Training samples + training labels → representation → training (optimization) → learned model
Evaluation
Testing
– Test image → representation → learned model → prediction
- 49 -
Modified from L. Lazebnik
Machine Learning Algorithms
Many machine learning algorithms are available
– Hundreds of new algorithms every year
Every machine learning algorithm has three components
– Representation
– Optimization (Minimization)
– Evaluation
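The three components can be seen side by side in a small sketch, assuming a linear representation, gradient descent as the optimizer, and mean squared error as the evaluation. The learning rate, step count, and data are assumptions chosen for this sketch.

```python
def predict(w, b, x):
    """Representation: a linear function of one input."""
    return w * x + b

def mse(w, b, data):
    """Evaluation: mean squared error over the data."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def gradient_descent(data, lr=0.05, steps=2000):
    """Optimization: repeatedly step against the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up data lying on y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = gradient_descent(data)
print(w, b, mse(w, b, data))
```

Swapping any one component (a different model, optimizer, or error measure) gives a different learning algorithm while the other two pieces stay the same.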
Slide modified from Pedro Domingos - 50 -
Representations of Functions
Numerical functions
– Linear regression
– Neural networks
– Support vector machines
Symbolic functions
– Decision trees
– Rules in propositional logic
– Rules in first-order predicate logic
– Rules in fuzzy logic
Instance-based functions
– Nearest-neighbor
– Case-based
Probabilistic Graphical Models
– Naïve Bayes
– Bayesian networks
– Hidden-Markov Models (HMMs)
– Probabilistic Context Free Grammars (PCFGs)
– Markov networks
Slide modified from Pedro Domingos - 51 -
Optimization Methods
Gradient descent
– Perceptron
– Backpropagation
– Stochastic Gradient Descent
Dynamic Programming
Monte Carlo
– Markov Chain Monte Carlo
– Simulated annealing
– Parallel tempering
Divide and Conquer
– Decision tree induction
– Rule learning
Evolutionary Computation
– Genetic Algorithms
– Genetic Programming
– Neuro-evolution
Slide modified from Pedro Domingos - 52 -
Evaluations
Accuracy
Precision and recall
Convergence
Error (MSE, RMSE, etc.)
Likelihood
Posterior probability
Cost / Utility
Margin
Entropy
K-L divergence
Many others
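A few of the measures listed above can be computed directly. The sketch below covers accuracy, precision, recall, and RMSE; the function names and the sample labels are made up for illustration, and the classification metrics assume at least one predicted positive and one actual positive.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for boolean labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

def rmse(y_true, y_pred):
    """Root mean squared error for real-valued predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical ground truth and predictions.
y_true = [True, True, True, False, False, False, False, True]
y_pred = [True, True, False, True, False, False, False, False]
metrics = classification_metrics(y_true, y_pred)
regression_error = rmse([2.0, 4.0], [5.0, 1.0])
print(metrics, regression_error)
```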
- 53 -
A Lot of Data
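[Figure: data produced each year, plotted on a logarithmic scale from 1 Petabyte (1 Petabyte == 1000 TB) through 1 Exabyte to 1 Zettabyte]
– Years on the axis: 2002, 2006, 2009, 2011, 2015
– Yearly values shown: 5 EB, 161 EB, 800 EB, 1.8 ZB, 8.0 ZB
– Reference points: human brain's capacity; 100 years of HD video + audio (values shown: 14 PB, 60 PB, 120 PB)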
- 54 -
Sources of Big Data
– Sensors
– Social networks
– Business transactions
– Genetics
– Physics experiments
– Biomedical
- 55 -
Kaggle
- 56 -
Now it is the BEST time
Time is Right
– Better Understanding of Human Cognition
– Recent progress in algorithms and theory
Deep Learning
Recommendation Systems
– Rapidly growing volume of data from various sources
– Available computational power
Large-scale parallel/distributed computing
– Growth of, and interest in, AI-based industries
Google, Intel, Amazon, DeepMind, etc.
- 57 -
Summary
Artificial Intelligence
– Turing Test
– What can and can’t AI do today?
Machine Learning
– Definition
– Types of Machine Learning
Machine Learning Framework
– Representation
– Optimization (Minimization)
– Evaluation
- 58 -
Others
Brief introduction to Jupyter Notebook
Brief introduction to Pandas
– jupyter notebook intro_pandas
Brief introduction to Matplotlib
– jupyter notebook introductiontoplt
- 59 -