What's Next For ML & You: Emily Fox & Carlos Guestrin
Deployment
Evaluation: Measuring quality of deployed models
Management: Choosing between deployed models
Monitoring: Tracking model quality & operations
©2015 Emily Fox & Carlos Guestrin Machine Learning Specialization
Lifecycle of ML in Production
Deployment Evaluation
Management
Monitoring
The Setup…
• 34.6M reviews
• 2.4M products
• 6.6M users
Live Data
Feedback
What happens after (initial) deployment
Deployment Evaluation
Management
Monitoring
After deployment
Model
Historical Data
Predictions
Live Data
Feedback
Learning new, alternative models
Model
Historical Data
Predictions
Model 2
Live Data
Evaluation: Predictions + Metric
• What data?
• Which metric?
Group A: Model 1, 2000 visits, 10% CTR
Group B: Model 2, 2000 visits, 30% CTR
Everybody gets Model 2
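One way to check whether the click-through-rate gap between the two groups above (2000 visits each, 10% vs. 30% CTR) is more than noise is a two-proportion z-test. The sketch below is an illustration only: the slide does not say which test to use, and the function name and the 0.05 threshold are assumptions.

```python
import math

def two_proportion_z_test(clicks_a, visits_a, clicks_b, visits_b):
    """Return (z, two-sided p-value) for H0: both groups share one CTR."""
    p_a = clicks_a / visits_a
    p_b = clicks_b / visits_b
    # Pooled CTR under the null hypothesis that the models perform the same.
    pooled = (clicks_a + clicks_b) / (visits_a + visits_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

visits = 2000                            # visits per group, from the slide
clicks_model_1 = round(0.10 * visits)    # 10% CTR
clicks_model_2 = round(0.30 * visits)    # 30% CTR

z, p_value = two_proportion_z_test(clicks_model_1, visits, clicks_model_2, visits)
# Assumed decision rule: switch everybody to Model 2 only if it is
# significantly better at the 0.05 level.
winner = "Model 2" if z > 0 and p_value < 0.05 else "Model 1"
print(winner, round(z, 2))
```

With these counts the gap is far outside what chance would produce, which is consistent with the slide's outcome of serving Model 2 to everybody.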
Open challenges:
Feature engineering/representation
1 0 0 0 5 3 0 0 1 0 0 0 0
• Bag-of-words raw counts?
• Normalize?
• tf-idf? (which version???)
• Bigrams
• Trigrams
• …
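The representation choices listed above can be compared side by side on a toy corpus. This is a minimal sketch using only the standard library: the two reviews are made up, the helper names are hypothetical, and the tf-idf formula shown (raw count times log(N / document frequency)) is just one of the several versions the slide alludes to.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Contiguous n-grams: n=1 gives words, n=2 bigrams, n=3 trigrams."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def raw_counts(text, n=1):
    """Bag-of-n-grams raw counts for one document (whitespace tokenizer)."""
    return Counter(ngrams(text.lower().split(), n))

def l2_normalize(counts):
    """Scale a count vector to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}

def tf_idf(docs_counts):
    """One tf-idf variant: count * log(N / number of docs containing term)."""
    n_docs = len(docs_counts)
    doc_freq = Counter(term for counts in docs_counts for term in counts)
    return [
        {term: c * math.log(n_docs / doc_freq[term]) for term, c in counts.items()}
        for counts in docs_counts
    ]

reviews = ["great product great price", "bad product"]  # made-up examples

unigram_counts = [raw_counts(r) for r in reviews]
bigram_counts = [raw_counts(r, n=2) for r in reviews]
normalized = [l2_normalize(c) for c in unigram_counts]
weighted = tf_idf(unigram_counts)
```

In this variant a word that appears in every document, such as "product" here, gets weight zero, while a word confined to one document keeps a positive weight.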
Open challenges:
Scaling
Data is getting big…
[Figure labels: across-channel state, multi-channel EEG data, spatial covariance model; axes: channels, time]
CPUs stopped getting faster…
[Figure: processor speed (GHz, axis from 0.01 to 10) vs. release date (1988 to 2010), with a region labeled "constant"]
ML in the context of parallel architectures
GPUs Multicore
Clusters
Clouds
Supercomputers
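$$
\begin{aligned}
\mathrm{RSS}(w_0, w_1) ={}& (\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}])^2 \\
&+ (\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}])^2 \\
&+ (\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}])^2 \\
&+ \dots \ \text{[include all houses]}
\end{aligned}
$$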
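The residual sum of squares above translates directly into code: sum the squared gap between each house's price and the line's prediction. The sketch below also fits w0 and w1 with the standard closed-form least-squares solution for a single feature. The house data are invented for illustration and the function names are hypothetical.

```python
def rss(w0, w1, sqft, price):
    """Sum over all houses of (price - [w0 + w1 * sq.ft.])^2."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(sqft, price))

def fit_line(sqft, price):
    """Closed-form minimizer of RSS for simple linear regression."""
    n = len(sqft)
    mean_x = sum(sqft) / n
    mean_y = sum(price) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sqft, price))
    sxx = sum((x - mean_x) ** 2 for x in sqft)
    w1 = sxy / sxx                # slope: dollars per square foot
    w0 = mean_y - w1 * mean_x     # intercept
    return w0, w1

# Made-up houses (square feet, sale price in dollars).
sqft = [1000.0, 1500.0, 2000.0, 2500.0]
price = [210000.0, 290000.0, 410000.0, 490000.0]

w0_hat, w1_hat = fit_line(sqft, price)
best_rss = rss(w0_hat, w1_hat, sqft, price)
```

Moving either fitted parameter away from its estimate can only increase the RSS, which is what makes the pair the least-squares fit.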
2. Regression
Case study: Predicting house prices
Concepts:
• Loss functions, bias-variance tradeoff, cross-validation, sparsity, overfitting, model selection
[Figure: price ($) vs. square feet (sq.ft.)]
3. Classification
Case study: Analyzing sentiment
Models:
• Linear classifiers (logistic regression, SVMs, perceptron)
• Kernels
• Decision trees
4. Clustering & Retrieval
Case study: Finding documents
Models:
• Nearest neighbors
• Clustering, mixtures of Gaussians
• Latent Dirichlet allocation (LDA)
[Figure labels: ENTERTAINMENT, SCIENCE]
4. Clustering & Retrieval
Case study: Finding documents
Algorithms:
• KD-trees, locality-sensitive hashing (LSH)
• K-means
• Expectation-maximization (EM)
[1 0 0 0 5 3 0 0 1 0 0 0 0] · [3 0 0 0 2 0 0 1 0 1 0 0 0] = 1*3 + 5*2 = 13
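The arithmetic above is the inner product of two word-count vectors: only positions where both documents have a nonzero count contribute. A minimal sketch follows; the cosine variant at the end is an added illustration of normalizing by vector length and does not appear on the slide.

```python
import math

def dot(a, b):
    """Inner product of two equal-length count vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

# The two count vectors from the slide, one entry per vocabulary word.
doc_a = [1, 0, 0, 0, 5, 3, 0, 0, 1, 0, 0, 0, 0]
doc_b = [3, 0, 0, 0, 2, 0, 0, 1, 0, 1, 0, 0, 0]

similarity = dot(doc_a, doc_b)   # 1*3 + 5*2 = 13

# Assumed extension: cosine similarity removes the effect of document length.
cosine = similarity / (math.sqrt(dot(doc_a, doc_a)) * math.sqrt(dot(doc_b, doc_b)))
```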
5. Matrix Factorization & Dimensionality Reduction
Case study: Recommending Products
Algorithms:
• Coordinate descent
• Eigen decomposition
• SVD
[Figure: rating matrix X ≈ product of estimated factors. Rows index movies, columns index users. Xij known for black cells, Xij unknown for white cells. Form estimates Lu and Rv.]
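The matrix completion idea above can be sketched in a few lines: learn a factor vector Lu per user and Rv per movie from the known cells, then predict an unknown cell as their inner product. This example uses stochastic gradient descent rather than the coordinate descent named on the slide, purely to keep the sketch short. The ratings, function names, and hyperparameters are all assumptions.

```python
import random

def predict(L, R, u, v):
    """Estimated rating of movie v by user u: inner product of Lu and Rv."""
    return sum(lu * rv for lu, rv in zip(L[u], R[v]))

def squared_error(L, R, ratings):
    """Total squared error over the known cells only."""
    return sum((x - predict(L, R, u, v)) ** 2 for u, v, x in ratings)

def factorize(ratings, n_users, n_movies, rank=2, epochs=500,
              lr=0.02, reg=0.01, seed=0):
    """Fit user factors L and movie factors R to known ratings with SGD."""
    rng = random.Random(seed)
    L = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    R = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_movies)]
    initial_error = squared_error(L, R, ratings)
    for _ in range(epochs):
        for u, v, x in ratings:
            err = x - predict(L, R, u, v)
            for k in range(rank):
                lu, rv = L[u][k], R[v][k]
                # Gradient step on squared error with L2 regularization.
                L[u][k] += lr * (err * rv - reg * lu)
                R[v][k] += lr * (err * lu - reg * rv)
    return L, R, initial_error

# Made-up known cells as (user, movie, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

L, R, initial_error = factorize(ratings, n_users=3, n_movies=3)
final_error = squared_error(L, R, ratings)
estimate = predict(L, R, 0, 2)   # an unknown ("white") cell
```

Storing the known cells as triples sidesteps the question of which matrix axis holds users and which holds movies.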
5. Matrix Factorization & Dimensionality Reduction
Case study: Recommending Products
Capstone project: Recommenders, Text sentiment analysis, Computer vision, Deep learning
Deploy intelligent web app