10-601 Machine Learning, Midterm Exam: Spring 2008
SOLUTIONS
• Please put your name on this cover sheet
• If you need more room to work out your answer to a question, use the back of the page
  and clearly mark on the front of the page if we are to look at what’s on the back.
• This exam is open book, open notes, no applets, no wireless communication
• There are 80 points total on the entire exam, and you have 80 minutes to
  complete it. Good luck!
           Name:
           Andrew ID:
           Q   Topic                            Max. Score   Score
           1   Short answer questions           20
           2   d-separation                     10
           3   On-line learning                 10
           4   Learning methods we studied      10
           5   A new learning method            15
           6   Naive Bayes and Decision Trees   15
               Total                            80
Part I. Short-answer questions (20 pts)
1. (4 pts) Basic probability.
    (a) You have received a shiny new coin and want to estimate the probability θ that it will
        come up heads if you flip it. A priori you assume that the most probable value of θ is 0.5.
        You then flip the coin 3 times, and it comes up heads twice. Which will be higher, your
        maximum likelihood estimate (MLE) of θ, or your maximum a posteriori probability
        (MAP) estimate of θ? Explain intuitively why, using only one sentence.
        MLE: setting a prior peaked at 0.5 will pull the MAP estimate closer to 0.5, while the MLE
        will be p̂ = 2/3.
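        To make this concrete, here is a small numerical sketch comparing the MLE with a MAP
        estimate under a Beta prior whose mode is 0.5 (the specific prior, Beta(5, 5), is an
        illustrative assumption and is not given in the problem):

```python
# MLE vs. MAP for theta after observing 2 heads in 3 flips.
heads, flips = 2, 3
alpha = beta = 5.0                 # assumed Beta(5, 5) prior, whose mode is 0.5

theta_mle = heads / flips                                        # 2/3 ~= 0.667
theta_map = (heads + alpha - 1) / (flips + alpha + beta - 1)     # 6/11 ~= 0.545
print(theta_mle, theta_map)        # the MAP estimate is pulled toward 0.5
```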
   (b) Show, using a proof or an example, that if Pr(A|B, C) = Pr(A|C), then Pr(A, B|C) =
       Pr(A|C) Pr(B|C).
          Pr(A, B|C) = Pr(A|B, C) Pr(B|C) = Pr(A|C) Pr(B|C), where the first step is the chain rule
          and the second uses the given assumption.
2. (6 pts) Bayes network structures.
    (a) Using the chain rule of probability (also called the “product rule”) it is easy to express
        the joint distribution over A, B and C, Pr(A, B, C), in several ways. For example,
          i. Pr(A, B, C) = Pr(A|B, C) Pr(C|B) Pr(B)
         ii. Pr(A, B, C) = Pr(B|A, C) Pr(C|A) Pr(A)
        Write the Bayes net structures corresponding to each of these two ways of factoring
        Pr(A, B, C) (i.e., the Bayes nets whose Conditional Probability Tables correspond to
        the factors). Indicate which is which.
         Pr(A|B, C) Pr(C|B) Pr(B): B is a parent of both C and A, and C is a parent of A only (edges B → A, B → C, C → A).
         Pr(B|A, C) Pr(C|A) Pr(A): A is a parent of both C and B, and C is a parent of B only (edges A → B, A → C, C → B).
   (b) How many distinct Bayes Network graph structures can you construct to correctly de-
       scribe the joint distribution Pr(A, B, C) assuming there are no conditional independen-
       cies? Please give the exact number, plus a one sentence explanation. (You do not need
       to show the networks).
         6: to encode no conditional independencies the graph must be fully connected, i.e., one of
         the chain-rule structures from the previous part with the roles of A, B and C permuted,
         and there are 3! = 6 such orderings.
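         As a sanity check of the count (not part of the original solution), one can orient each of
         the three possible edges over {A, B, C} in one of two directions and keep only the acyclic
         orientations:

```python
from itertools import product

# 2^3 = 8 candidate orientations of the three edges; count those that are DAGs,
# i.e., the fully connected structures that encode no conditional independencies.
nodes = ["A", "B", "C"]
pairs = [("A", "B"), ("A", "C"), ("B", "C")]

def is_dag(edges):
    # repeatedly peel off a node with no incoming edge; succeed iff all nodes go
    remaining, es = set(nodes), set(edges)
    while remaining:
        sources = [n for n in remaining
                   if not any(v == n and u in remaining for (u, v) in es)]
        if not sources:
            return False
        remaining.remove(sources[0])
    return True

complete_dags = [e for e in product(*[[(u, v), (v, u)] for (u, v) in pairs])
                 if is_dag(e)]
print(len(complete_dags))   # 6, one per topological ordering of A, B, C
```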
3. (4 pts) Statistical significance.
    (a) You have trained a logistic regression classifier to predict movie preferences from past
        movie rentals. Over the 100 training examples, it correctly classifies 90. Over a separate
        set of 50 test examples, it correctly classifies 43. Give an unbiased estimate of the true
        error of this classifier, and a 90% confidence interval around that estimate (you may give
        your confidence interval in the form of an expression).
        The thing to notice here is that using the training data results (10 incorrect out of 100)
        will yield a biased estimate, and so you may only use the test results (7 errors out of
        50). Some people confused accuracy and error (which is asked for in the question). The
        answer is the following (where Z90 = 1.64):

                       7/50 ± Z90 · √( 0.14 · (1 − 0.14) / 50 )
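        For concreteness, the interval can be evaluated numerically (a small sketch using the usual
        normal-approximation binomial interval, with z ≈ 1.64 for 90% confidence):

```python
import math

errors, n = 7, 50
z90 = 1.64                       # two-sided 90% normal critical value (approx.)
err_hat = errors / n             # unbiased point estimate of the true error: 0.14
half_width = z90 * math.sqrt(err_hat * (1 - err_hat) / n)
print(f"{err_hat:.3f} +/- {half_width:.3f}")   # roughly 0.140 +/- 0.080
```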
    (b) Show that you really know what “confidence interval” means, by describing in one
        sentence a simple experiment you could perform repeatedly, where you would expect in
        90% of the experiments to observe an accuracy that falls inside your confidence interval.
        If we repeated 100 times the experiment “collect 50 randomly drawn test examples and
        measure the fraction misclassified by the above trained classifier,” then we would expect
        the confidence intervals derived from the above expression to contain the true error in
        about 90 of these experiments.
4. (3 pts) PAC learning.
   You want to learn a classifier over examples described by 49 real-valued features, using a
   learning method that is guaranteed to always find a hyperplane that perfectly separates the
   training data (whenever it is separable). You decide to use PAC learning theory to determine
   the number m of examples that suffice to assure your learner will with probability 0.9 output
   a hypothesis with accuracy at least 0.95. Just as you are about to begin, you notice that the
   number of features is actually 99, not 49. How many examples must you use now, compared
   to m? Give a one sentence justification for your answer.
   One thing to notice here is that the hypothesis class consists of hyperplanes, so we can use
   the formula relating the dimensionality of the data to the VC dimension: VC(H) =
   dimensionality + 1. Therefore, going from 49 to 99 features doubles the VC dimension from
   50 to 100. Since the sample-complexity bound m is linear in VC(H) (as opposed to logarithmic
   in |H|), this also roughly doubles m.
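   To see the “linear in the VC dimension” point numerically, one can plug both VC dimensions
   into a standard VC-based sample-complexity bound (the Blumer et al. bound as stated in
   Mitchell's textbook; the exact constants do not matter here, only the linear dependence on
   VC(H)):

```python
import math

def vc_sample_bound(vc_dim, eps, delta):
    """Blumer et al. bound on the number of examples sufficient for a consistent
    learner to be (eps, delta)-PAC over a class of VC dimension vc_dim."""
    return (4 * math.log2(2 / delta) + 8 * vc_dim * math.log2(13 / eps)) / eps

eps, delta = 0.05, 0.1                    # accuracy >= 0.95, confidence >= 0.9
m49 = vc_sample_bound(50, eps, delta)     # hyperplanes in 49 dims: VC = 50
m99 = vc_sample_bound(100, eps, delta)    # hyperplanes in 99 dims: VC = 100
print(m99 / m49)                          # close to 2: the bound roughly doubles
```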
5. (3 pts) Bayes rule.
   Suppose that in answering a question on a true/false test, an examinee either knows the
   answer, with probability p, or she guesses with probability 1 − p. Assume that the proba-
   bility of answering a question correctly is 1 for an examinee who knows the answer and 0.5
   for the examinee who guesses. What is the probability that an examinee knew the answer
    to a question, given that she has correctly answered it?
    Classic application of Bayes rule. A common mistake was to assume p = 0.5, although this
    isn't stated in the problem. We get:
    p(knew | correct) = p(correct | knew) p(knew) / p(correct)
                      = p(correct | knew) p(knew) / [ p(correct | knew) p(knew) + p(correct | guessed) p(guessed) ].
    Notice in the denominator we had to consider the two ways she can get a question correct: by
    knowing, or by guessing. Plugging in, we get (1 · p) / (1 · p + 0.5 · (1 − p)) = 2p / (p + 1).
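    A quick numerical check of this formula (a small sketch, not part of the original solution):

```python
def p_knew_given_correct(p):
    # Bayes rule with P(correct | knew) = 1 and P(correct | guessed) = 0.5
    return (1.0 * p) / (1.0 * p + 0.5 * (1 - p))

for p in (0.25, 0.5, 0.9):
    assert abs(p_knew_given_correct(p) - 2 * p / (p + 1)) < 1e-12
    print(p, p_knew_given_correct(p))    # e.g. p = 0.5 gives 2/3
```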
Part II. d-separation (10 pts)
Consider the Bayes network above.
1. (2 pts) Is the set {W, X} d-separated from {V, Y } given {E}? Why or why not?
   No. The path W → U → V is not blocked, since U is not in the conditioning set {E}.
   The path X → Z ← Y is not blocked, since a descendant of Z (here E itself) is in the conditioning set.
2. (4 pts) Let S be a set of variables such that {W, X} is d-separated from {V, Y } given S. What
   variables must be in S and why? What variables must not be in S and why?
   In S: U, which is needed to block the path W → U → V.
   Not in S: Z and E, since Z is a collider on the path between X and Y, and conditioning on Z
   or on its descendant E would unblock that path.
3. (4 pts) Consider the subgraph containing only X, Y, Z and E. Is X ⊥ Y | E? If so, prove it.
   If not, give an example of CPTs such that X is not independent of Y given E.
   X is not independent of Y given E.
   One example: Pr(X = 1) = Pr(Y = 1) = 0.5 (independently), Z = X ∨ Y (deterministic OR), and
   E = Z. Then, given E = 1, observing Y = 0 forces X = 1, so X and Y are dependent given E
   (explaining away).
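   A brute-force verification of this example (a sketch that assumes the deterministic-OR CPTs
   described above):

```python
from itertools import product

def joint():
    # enumerate the joint under: X, Y ~ independent Bernoulli(0.5),
    # Z = X OR Y (deterministic), E = Z (deterministic)
    for x, y in product([0, 1], repeat=2):
        z = x | y
        e = z
        yield (x, y, z, e), 0.25

def cond_prob(event, given):
    num = sum(p for a, p in joint() if event(a) and given(a))
    den = sum(p for a, p in joint() if given(a))
    return num / den

# P(X=1 | E=1, Y=1) = 0.5 but P(X=1 | E=1, Y=0) = 1, so X is not independent of Y given E.
print(cond_prob(lambda a: a[0] == 1, lambda a: a[3] == 1 and a[1] == 1))
print(cond_prob(lambda a: a[0] == 1, lambda a: a[3] == 1 and a[1] == 0))
```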
Part III. On-line learning (10 pts)
Consider an on-line learner B and a teacher A. Recall that learning proceeds as follows:
• For each round t = 1, . . . , T
      – A sends B an instance xt , withholding the associated label yt
      – B predicts a class ŷt for xt , and is assessed an “error” if ŷt ≠ yt .
      – A sends B the correct label yt for xt .
1. (5 pts) Let H be a set of hypotheses, and assume that A is reasonable for H in the following
   sense: for every sequence (x1 , y1 ), . . . , (xT , yT ) presented by A to B, there exists some h∗ ∈ H
   such that ∀t, yt = h∗ (xt ). Show that A can still force the learner B to make at least d mistakes,
   where d is the VC dimension of H.
   VCdim(H) = d means there is a shattered set S of size d: for every subset S′ ⊆ S there is some
   h ∈ H whose positive predictions on S are exactly S′. A presents the elements of S to B one at
   a time; whatever B predicts, A can give the opposite label and still remain consistent with
   some h ∈ H, so B is forced to make at least d mistakes.
2. (5 pts) Suppose D = (x1 , y1 ), . . . , (xT , yT ) is a dataset where
      • each xi represents a document di , where di has no more than 100 English words (and for
        the purpose of this question there are no more than 100,000 English words);
      • each xi is a sparse vector (b1 , . . . , b100,000 ), where bw = 1 if word w is in the document
        di , and bw = 0 otherwise;
      • each yi is a label (+1 or −1) indicating the class of the document;
      • there exists some vector u such that ∀i, yi u · xi > 1
   Write down an upper bound on R, the maximum distance of an example xi from the origin,
   and m, the number of mistakes made by a perceptron before converging to a correct hypothesis
   vk , when trained on-line using data from D.
   The perceptron mistake bound is m ≤ R²/γ².
   Plugging in: each xi has at most 100 nonzero entries, each equal to 1, so R = maxi ||xi||₂ ≤ √100 = 10, and hence R² ≤ 100.
   γ = 1 (since yi u · xi > 1 for all i; it is also acceptable to make an assumption about ||u||, e.g. ||u|| = 1).
   So m ≤ 100.
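   For illustration, here is a minimal on-line perceptron that counts its mistakes on a synthetic
   separable stream (a generic sketch; the data here is made up and is not the document
   collection described above):

```python
import random

def perceptron_mistakes(stream):
    """Run the on-line perceptron on (x, y) pairs with y in {+1, -1};
    return the number of mistakes (prediction errors) made."""
    w, mistakes = None, 0
    for x, y in stream:
        if w is None:
            w = [0.0] * len(x)
        score = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if score > 0 else -1
        if y_hat != y:                               # mistake: w <- w + y * x
            mistakes += 1
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return mistakes

# Synthetic separable stream: the label is the sign of the first coordinate,
# so u = e_1 separates the data with margin 1, and R^2 <= 20 here.
random.seed(0)
data = []
for _ in range(1000):
    x = [float(random.randint(0, 1)) for _ in range(20)]
    x[0] = random.choice([1.0, -1.0])
    data.append((x, 1 if x[0] > 0 else -1))
print(perceptron_mistakes(data))                     # at most R^2 / gamma^2 = 20
```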
   Part IV. Understanding the performance of previously-studied learning methods (10 pts)
   Consider the three datasets (A), (B), and (C) above, where the circles are positive examples
(with two numeric features – the coordinate of the circle in 2D space) and the squares are negative
examples. Now consider the following four learning methods: Naive Bayes (NB), logistic regression
(LR), decision trees with pruning (DT+p), and decision trees without pruning (DT-p).
  1. (5pts) For each of the three datasets, list the learning system(s) that you think will perform
     well, and say why.
     There was some freedom here as to how many methods you listed, and in which order, etc.
     The general idea, though, is that (A) is linearly separable and should be learnable by NB, LR,
      DT-p and DT+p. (B) is not linearly separable (and so can't be learned by NB or LR), but
      can be perfectly classified with two lines, i.e., by a decision tree such as DT-p or DT+p.
      Finally, (C) is a noisy version of (B), so we need DT+p's pruning to deal with the noise
      (DT-p would overfit the noise).
  2. (5pts) For each of the three datasets, list the learning system(s) that you think will perform
     poorly, and say why.
     As above, all methods should do well on (A), NB and LR should fail (B) due to decision
     boundary form, and NB, LR and DT-p should fail (C) due to noise.
   Part V. Understanding a novel learning method: Uniform Naive Bayes (15 pts)
    Suppose you are building a Naive Bayes classifier for data containing two continuous attributes,
X1 and X2 , in addition to the boolean label Y . You decide to represent the conditional distribu-
tions p(Xi |Y ) as uniform distributions; that is, to build a Uniform Naive Bayes Classifier. Your
conditional distributions would be represented using the parameters
                      p(Xi = x | Y = yk) = 1 / (bik − aik)   for aik ≤ x ≤ bik,

where aik < bik (and of course p(Xi = x | Y = yk) = 0 when x < aik or x > bik).
  1. (5 pts) Given a set of data, how would we choose MLEs for the parameters aik and bik ?
      Take aik and bik to be the minimum and maximum observed values of Xi among the training
      examples with Y = yk. (See homework 2.1.)
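      A minimal sketch of these MLEs and the resulting classifier (illustrative code, not from the
      exam; all names and the toy data are made up):

```python
import numpy as np

def fit_uniform_nb(X, y):
    """MLE for Uniform Naive Bayes: per class k and feature i, a[k][i] and b[k][i]
    are the min and max of feature i over the training examples of class k;
    the class prior is the empirical class frequency."""
    params = {}
    for k in np.unique(y):
        Xk = X[y == k]
        params[k] = (Xk.min(axis=0), Xk.max(axis=0), len(Xk) / len(X))
    return params

def predict_uniform_nb(params, x):
    best_k, best_score = None, -np.inf
    for k, (a, b, prior) in params.items():
        if np.any(x < a) or np.any(x > b):
            score = -np.inf                 # outside the class's support: density 0
        else:
            score = np.log(prior) - np.sum(np.log(b - a))
        if score > best_score:
            best_k, best_score = k, score
    return best_k                           # None if x lies outside every support (a tie)

# toy usage
rng = np.random.default_rng(0)
X = np.vstack([rng.uniform(0, 1, (50, 2)), rng.uniform(0.8, 2, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = fit_uniform_nb(X, y)
print(predict_uniform_nb(model, np.array([0.2, 0.3])))   # most likely class 0
```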
  2. (5 pts) What will the decision surface look like? (If you like you can describe it in words).
      Two axis-aligned rectangles (the supports of the two class-conditional densities) that may
      or may not intersect. If they intersect, there can be a “tie” in the overlap region.
  3. (5 pts) Sometimes it is desirable to use a classifier to find examples with highly confident
     predictions—e.g., to rank examples x according to P (Y = +|x, θ). If you’re using a classifier
      for this, it is usually undesirable for the scores for two examples to be tied (e.g., for
      P(Y = +|x1 , θ) = P(Y = +|x2 , θ)).
     Are ties more or less likely for Uniform Naive Bayes, compared to Gaussian Naive Bayes, and
     why?
      More likely: under UNB, in the intersection of the two rectangles, as well as in any region
      outside both (where the density is 0 for both classes), the scores will be tied. Under GNB,
      ties only occur along the (linear) decision boundary.
  Part VI. Understanding a novel learning method: Naive Bayes/Decision tree
hybrids (15 pts)
    Suppose you were to augment a decision tree by adding a Naive Bayes classifier at each leaf.
That is, consider learning a decision tree of depth k (where k is smaller than the number of variables
n), where each leaf contains not a class, but a Naive Bayes classifier, which is based on the n − k
variables not appearing on the path to that leaf.
  1. (5 pts) Briefly outline a learning algorithm for this representation, assuming that k is a fixed
     parameter.
     The simplest idea was to grow a decision tree by the usual method to depth k, then learn a
     NB model over the examples in each of the leaves at depth k. More sophisticated versions of
     this method, e.g., iterating one round of decision tree growth with one round of NB learning,
     were also acceptable.
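     A rough sketch of the simple version of this algorithm, using scikit-learn for the depth-k
     tree and a Gaussian NB at each leaf (illustrative only; the class name is made up, and for
     simplicity this sketch trains each leaf's NB on all n features rather than restricting it to
     the n − k features not used on the path to that leaf):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

class TreeNBHybrid:
    """Depth-k decision tree with a Naive Bayes classifier at each leaf."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.tree = DecisionTreeClassifier(max_depth=self.k).fit(X, y)
        leaf_ids = self.tree.apply(X)             # which leaf each example lands in
        self.leaf_models = {}
        for leaf in np.unique(leaf_ids):
            mask = leaf_ids == leaf
            if len(np.unique(y[mask])) > 1:       # impure leaf: fit a NB model
                self.leaf_models[leaf] = GaussianNB().fit(X[mask], y[mask])
            else:                                 # pure leaf: constant prediction
                self.leaf_models[leaf] = y[mask][0]
        return self

    def predict(self, X):
        leaf_ids = self.tree.apply(X)
        preds = []
        for x, leaf in zip(X, leaf_ids):
            m = self.leaf_models[leaf]
            preds.append(m if np.isscalar(m) else m.predict(x.reshape(1, -1))[0])
        return np.array(preds)
```

     Usage would look like TreeNBHybrid(k=2).fit(X_train, y_train).predict(X_test) for some
     train/test split (the variable names here are hypothetical).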
  2. (5 pts) Will this result in having the same NB classifier at each leaf? Why or why not? If
     not, how will they be different?
      The NB classifiers learned at each leaf will be different because they will have been trained
      on different examples, based on the different paths the examples take through the tree and the
      corresponding conditions each example satisfies or doesn’t.
  3. (5 pts) Briefly describe a plausible means of selecting a value for k.
       Two common answers here: either a pruning-like method, based on information gain or something
       similar, or a cross-validation method, i.e., growing the tree for various values of k until
       performance on held-out data starts to fall. We didn't consider the time complexity of these
       various methods, but we did penalize for not explicitly using held-out data for cross-validation;
       e.g., choosing k so as to maximize performance directly on the test data was not acceptable.
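       For completeness, a sketch of the cross-validation style of answer, choosing k on a held-out
       validation split (a plain depth-k decision tree stands in here for the tree/NB hybrid; the
       selection procedure is the same):

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

def choose_k(X, y, candidate_ks, seed=0):
    """Pick the depth k whose model scores best on a held-out validation split
    (never on the test set)."""
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    scores = {}
    for k in candidate_ks:
        model = DecisionTreeClassifier(max_depth=k).fit(X_tr, y_tr)
        scores[k] = accuracy_score(y_val, model.predict(X_val))
    return max(scores, key=scores.get)
```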