MALLAREDDY ENGINEERING COLLEGE (AUTONOMOUS)
III B. Tech I Sem II Mid Objective Paper (MR22)
Subject: Advanced Machine Learning (C6605)
Faculty Name: Mr. Ch Satyanarayana
Branch: CSE (AIML)
Multiple Choice Questions
1. Which of the following statements is not true about pruning in a decision tree?
a) When the decision tree is created, many of the branches will reflect anomalies in the
training data due to noise
b) Overfitting happens when the learning algorithm continues to develop hypotheses
that reduce the training-set error at the cost of an increased test-set error
c) It optimises the computational efficiency
d) It reduces the classification accuracy
2. Post pruning is also known as backward pruning.
a) True
b) False
3. Which of the following statements is not true about Post pruning?
a) It begins by generating the (complete) tree and then adjusts it with the aim of
improving the classification accuracy on unseen instances
b) It begins by converting the tree to an equivalent set of rules
c) It would not overfit trees
d) It converts a complete tree to a smaller pruned one which predicts the classification
of unseen instances at least as accurately
4. Which of the following statements is not true about Reduced error pruning?
a) It is the simplest and most understandable method in decision tree pruning
b) It considers each of the decision nodes in the tree to be candidates for pruning;
pruning consists of removing the subtree rooted at that node, making it a leaf node
c) If the error rate of the new tree would be equal to or smaller than that of the original
tree and that subtree contains no subtree with the same property, then the subtree is
replaced by a leaf node
d) If the error rate of the new tree would be greater than that of the original tree and that
subtree contains no subtree with the same property, then the subtree is replaced by a leaf
node, meaning pruning is done
5. Which of the following statements is not an advantage of Reduced error pruning?
a) Linear computational complexity
b) Over pruning
c) Simplicity
d) Speed
6. Minimum error pruning is a top-down approach.
a) True
b) False
7. Which of the following statements is not a step in Minimum error pruning?
a) At each non-leaf node in the tree, calculate the expected error rate if that subtree is
pruned
b) Calculate the expected error rate for that node if the subtree is not pruned
c) If pruning the node leads to a greater expected error rate, then keep the subtree
d) If pruning the node leads to a smaller expected error rate, then do not prune it
8. Pre pruning is also known as online pruning.
a) True
b) False
9. Minimum number of objects pruning is a Post pruning technique.
a) True
b) False
10. Which of the following is not a Post pruning technique?
a) Reduced error pruning
b) Error complexity pruning
c) Minimum error pruning
d) Chi-square pruning
11. Which of the following is not a Post pruning technique?
a) Pessimistic error pruning
b) Iterative growing and pruning
c) Reduced error pruning
d) Early stopping pruning
12. Consider a data set with 3 classes in which we have observed 20 examples, of which
the greatest number, 15, are in class c. If we predict that all future examples will be in
class c, what is the expected error rate using minimum error pruning?
a) 0.304
b) 0.5
c) 0.402
d) 0.561
13. Consider a data set with 3 classes in which we have observed 20 examples, of which
the greatest number, 15, are in class c. If we predict that all future examples will be in
class c, what is the expected error rate without pruning?
a) 0.22
b) 0.17
c) 0.15
d) 0.05
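[Worked check for Q12 — a sketch assuming the Niblett-Bratko (Laplace) estimate commonly used in minimum error pruning; Q13 would additionally need the class counts at each child node, which are not listed here.]

    # Expected error at a node if we prune and predict the majority class c,
    # using E = (N - n_c + k - 1) / (N + k).
    N, n_c, k = 20, 15, 3            # examples at the node, majority-class count, number of classes
    expected_error = (N - n_c + k - 1) / (N + k)
    print(round(expected_error, 3))  # 0.304 -> option (a)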
14. Consider an example where the number of corrected misclassifications at a particular
node is n'(t) = 15.5, and the number of corrected misclassifications for its subtree is
n'(Tt) = 12. N(t), the number of training-set examples at node t, is equal to 35. Here the
tree should be pruned.
a) True
b) False
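[Worked check for Q14 — a sketch assuming Quinlan's pessimistic error pruning rule: prune when n'(t) <= n'(Tt) + SE, with SE = sqrt(n'(Tt) * (N(t) - n'(Tt)) / N(t)).]

    import math

    n_node, n_subtree, N = 15.5, 12, 35
    se = math.sqrt(n_subtree * (N - n_subtree) / N)
    print(round(se, 2))              # 2.81
    print(n_node <= n_subtree + se)  # False -> the condition fails, so the subtree is kept (not pruned)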
15. Which of the following statements is false about k-Nearest Neighbor algorithm?
a) It stores all available cases and classifies new cases based on a similarity measure
b) It has been used in statistical estimation and pattern recognition
c) It cannot be used for regression
d) The input consists of the k closest training examples in the feature space
16. Which of the following statements is not true about k-Nearest Neighbor classification?
a) The output is a class membership
b) An object is classified by a plurality vote of its neighbors
c) If k = 1, then the object is simply assigned to the class of that single nearest neighbor
d) The output is the property value for the object
17. Which of the following statements is not true about k Nearest Neighbor?
a) It belongs to the supervised learning domain
b) It has an application in data mining and intrusion detection
c) It is non-parametric
d) It is not an instance-based learning algorithm
18. Which of the following statements does not support defining k-Nearest Neighbor as a
lazy learning algorithm?
a) It defers data processing until it receives a request to classify unlabeled data
b) It replies to a request for information by combining its stored training data
c) It stores all the intermediate results
d) It discards the constructed answer
19. Which of the following statements does not support kNN being a lazy learner?
a) When it gets the training data, it does not learn and make a model
b) When it gets the training data, it just stores the data
c) It derives a discriminative function from the training data
d) It uses the training data when it actually needs to do some prediction
20. Euclidean distance and Manhattan distance are the same in the kNN algorithm for
calculating the distance.
a) True
b) False
21. What is the Manhattan distance between a data point (9, 7) and a new query instance
(3, 4)?
a) 7
b) 9
c) 3
d) 4
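[Worked check for Q21 — the Manhattan (L1) distance is the sum of absolute coordinate differences.]

    p, q = (9, 7), (3, 4)
    d = sum(abs(a - b) for a, b in zip(p, q))
    print(d)  # |9 - 3| + |7 - 4| = 6 + 3 = 9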
22. In kNN, too large a value of K has a negative impact on the data points.
a) True
b) False
23. It is good to use kNN for large data sets.
a) True
b) False
24. When we set K = 1 in the kNN algorithm, the predictions become more stable.
a) True
b) False
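[A minimal kNN classification sketch for Q15-Q24 — the toy training points below are assumed for illustration and are not part of the paper. Note that no model is built from the training data; the data are simply stored and used at prediction time, which is why kNN is called a lazy learner.]

    from collections import Counter

    def knn_predict(train, query, k=3):
        # Sort stored examples by squared Euclidean distance to the query,
        # then take a majority vote among the k nearest neighbours.
        nearest = sorted(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))[:k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

    train = [((1, 1), 'A'), ((2, 1), 'A'), ((3, 4), 'A'), ((8, 8), 'B'), ((9, 7), 'B')]
    print(knn_predict(train, (2, 2), k=3))  # 'A'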
MODULE-4
26. Any ERM rule is a successful PAC learner for hypothesis space H.
a) True
b) False
27. If distribution D assigns zero probability to instances where h is not equal to c, then the
error will be ______
a) 1
b) 0.5
c) 0
d) infinite
28. If distribution D assigns zero probability to instances where h = c, then the error will be
______
a) Cannot be determined
b) 0.5
c) 1
d) 0
29. Error strongly depends on distribution D.
a) True
b) False
30. PAC learning was introduced by ____________
a) Vapnik
b) Leslie Valiant
c) Chervonenkis
d) Reverend Thomas Bayes
31. Error is defined over the _____________
a) training set
b) test Set
c) domain set
d) cross-validation set
32. The error of h with respect to c is the probability that a randomly drawn instance will
fall into the region where _________
a) h and c disagree
b) h and c agree
c) h is greater than c but not less
d) h is lesser than c but not greater
33. When was PAC learning invented?
a) 1954
b) 1964
c) 1974
d) 1984
34. What does VC dimension do?
a) Reduces complexity of hypothesis space
b) Removes noise from dataset
c) Measures complexity of training dataset
d) Measures the complexity of hypothesis space H
35. An instance set S is given. How many dichotomies are possible?
a) 2*|S|
b) 2/|S|
c) 2^|S|
d) |S|
36. If h is a straight line, what is the maximum number of points that can be shattered?
a) 4
b) 2
c) 3
d) 5
37. What is the VC dimension of a straight line?
a) 3
b) 2
c) 4
d) 0
38. A set of 3 instances is shattered by _____ hypotheses.
a) 4
b) 8
c) 3
d) 2
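[Illustration for Q35 and Q38 — an instance set S admits 2^|S| dichotomies, so a set of 3 instances is shattered only if all 8 labelings can be realised by the hypothesis space.]

    from itertools import product

    S = ['x0', 'x1', 'x2']
    dichotomies = list(product([0, 1], repeat=len(S)))
    print(len(dichotomies))  # 2 ** 3 = 8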
39. What is the relation between VC dimension and hypothesis space H?
a) VC(H) <= |H|
b) VC(H) != log2|H|
c) VC(H) <= log2|H|
d) VC(H) > log2|H|
40. VC Dimension can be infinite.
a) True
b) False
41. Who invented VC dimension?
a) Francis Galton
b) J. Ross Quinlan
c) Leslie Valiant
d) Vapnik and Chervonenkis
42. What is the advantage of VC dimension over PAC learning?
a) VC dimension reduces complexity of training data
b) VC dimension outputs more accurate predictors
c) VC dimension can work for infinite hypothesis space
d) There is no advantage
43. If VC(H) increases, the number of training examples required (m) increases.
a) False
b) True
44. Instance space: X = the set of real numbers; Hypothesis space H: the set of intervals on
the real number line. a and b can be any constants used to represent a hypothesis. How is
a hypothesis in H represented?
a) a – b < a + b
b) a + b < x < 2(a+b)
c) a/b < x < a*b
d) a < x < b
45. S = {3.1, 5.7}. How many hypotheses are required to shatter S?
a) 2
b) 3
c) 4
d) 1
46. S = {x0, x1, x2}. This set can be shattered by hypotheses of form a < x < b, where a and
b are arbitrary constants.
a) True
b) False
47. S = {x0, x1, x2}. Hypotheses are of the form a < x < b. What is |H|?
a) infinite
b) 0
c) 2
d) 1
48. S = {x0, x1, x2}. Hypotheses are of the form a < x < b. What is VC(H)?
a) 0
b) 2
c) 1
d) infinite
49. S = {x0, x1, x2} and H is finite. What is VC(H)?
a) 1
b) 2
c) 3
d) infinite
50. S = {x0, x1, x2}. Hypotheses are straight lines. What is |H|?
a) 8
b) 3
c) 4
d) infinite
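[A sketch for Q46-Q48, assuming hypotheses of the form a < x < b. The labeling (+, -, +) of three points cannot be realised by any single interval, so 3 points are not shattered and VC(H) = 2, even though |H| itself is infinite.]

    from itertools import product

    def interval_realizes(points, labels):
        # Can some interval contain exactly the points labelled 1?
        positives = [x for x, y in zip(points, labels) if y == 1]
        if not positives:
            return True                          # an empty interval works
        lo, hi = min(positives), max(positives)  # tightest candidate interval
        return all((lo <= x <= hi) == (y == 1) for x, y in zip(points, labels))

    S = [1.0, 2.0, 3.0]
    print(all(interval_realizes(S, labels) for labels in product([0, 1], repeat=3)))  # False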
51. The algorithm is trying to find a suitable day for swimming. What is the most general
hypothesis?
a) A rainy day is a positive example
b) A sunny day is a positive example
c) No day is a positive example
d) Every day is a positive example
52. Candidate-Elimination algorithm can be described by ____________
a) just a set of candidate hypotheses
b) depends on the dataset
c) set of instances, set of candidate hypotheses
d) just a set of instances
53. How is the version space represented?
a) Least general members
b) Most general members
c) Most general and least general members
d) Arbitrary members chosen from the hypothesis space
54. Let G be the set of maximally general hypotheses. While iterating through the dataset,
when is it changed for the first time?
a) Negative example is encountered for the first time
b) Positive example is encountered for the first time
c) First example encountered, irrespective of whether it is positive or negative
d) S, the set of maximally specific hypotheses, is changed
55. Let S be the set of maximally specific hypotheses. While iterating through the dataset,
when is it changed for the first time?
a) Negative example is encountered for the first time
b) Positive example is encountered for the first time
c) First example encountered, irrespective of whether it is positive or negative
d) G, the set of maximally general hypotheses, is changed
56. S = <sunny, warm, high, same>. Training data = <sunny, warm, normal, same> => Yes
(positive example). How will S be represented after encountering this training data?
a) <sunny, warm, high, same>
b) <phi, phi, phi, phi>
c) <sunny, warm, ?, same>
d) <sunny, warm, normal, same>
57. S = <phi, phi, phi, phi>. Training data = <rainy, cold, normal, change> => No (negative
example). How will S be represented after encountering this training data?
a) <phi, phi, phi, phi>
b) <sunny, warm, high, same>
c) <rainy, cold, normal, change>
d) <?, ?, ?, ?>
58. G = <?, ?, ?, ?>. Training data = <sunny, warm, normal, same> => Yes (positive
example). How will G be represented after encountering this training data?
a) <sunny, warm, normal, same>
b) <phi, phi, phi, phi>
c) <rainy, cold, normal, change>
d) <?, ?, ?, ?>
59. G = (<sunny, ?, ?, ?> ; <?, warm, ?, ?> ; <?, ?, high, ?>). Training data = <sunny, warm,
normal, same> => Yes (positive example). How will G be represented after
encountering this training data?
a) <phi, phi, phi, phi>
b) (<sunny, ?, ?, ?> ; <?, warm, ?, ?> ; <?, ?, high, ?>)
c) (<sunny, ?, ?, ?> ; <?, warm, ?, ?>)
d) <?, ?, ?, ?>
60. It is possible that in the output, set S contains only phi.
a) False
b) True
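[A sketch for Q56 and the surrounding questions, assuming the usual attribute-wise minimal generalisation of the S boundary on a positive example in the Candidate-Elimination algorithm.]

    def generalize(s_hypothesis, positive_example):
        # Replace every attribute of S that disagrees with the positive example by '?'.
        return tuple(s if s == x else '?' for s, x in zip(s_hypothesis, positive_example))

    S = ('sunny', 'warm', 'high', 'same')
    example = ('sunny', 'warm', 'normal', 'same')  # labelled Yes (positive)
    print(generalize(S, example))                  # ('sunny', 'warm', '?', 'same')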
61. The Soft SVM assumes that the training set is linearly separable.
a) True
b) False
62. Soft SVM is an extended version of Hard SVM.
a) True
b) False
63. Linear Soft margin SVM can only be used when the training data are linearly separable.
a) True
b) False
64. Given a two-class classification problem with data points x1 = -5, x2 = 3, x3 = 5, having
class label +1 and x4 = 2 with class label -1. The problem can be solved using Soft
SVM.
a) True
b) False
65. Given a two-class classification problem with data points x1 = -5, x2 = 3, x3 = 5, having
class label +1 and x4 = 2 with class label -1. The problem can never be solved using
Hard SVM.
a) True
b) False
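[A check for Q64 and Q65 — in one dimension a linear hard-margin SVM reduces to a signed threshold, so separability can be tested by trying every threshold in both orientations; the helper below is assumed for illustration only.]

    def linearly_separable_1d(data):
        xs = sorted(x for x, _ in data)
        thresholds = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
        return any(all((sign * (x - t) > 0) == (y > 0) for x, y in data)
                   for t in thresholds for sign in (+1, -1))

    data = [(-5, +1), (3, +1), (5, +1), (2, -1)]
    print(linearly_separable_1d(data))  # False: a linear hard-margin SVM is infeasible, soft SVM still applies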
66. The SVM relies on hinge loss.
a) True
b) False
67. Which of the following statements is not true about Soft SVM classification?
a) If the data are non-separable, some tolerance to outlier data points needs to be
introduced
b) If the data are non-separable, slack variables can be added to allow misclassification
of noisy examples
c) If the data are non-separable, one slack variable greater than or equal to zero is added
for each training data point
d) The slack variable value is greater than one for the points that are on the correct side
of the margin
68. The slack variable value of the point on the decision boundary of the Soft SVM is
equal to one.
a) True
b) False
69. The slack variable value is ξi ≥ 1 for misclassified points, and 0 < ξi < 1 for points close
to the decision boundary.
a) True
b) False
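[A sketch for Q66-Q69 — in soft-margin SVM the slack of a point equals its hinge loss, xi = max(0, 1 - y * f(x)) with f(x) = w.x + b; the f(x) values below are assumed for illustration.]

    def slack(y, fx):
        return max(0.0, 1 - y * fx)

    print(slack(+1, 1.5))   # 0.0 -> outside the margin, on the correct side
    print(slack(+1, 0.5))   # 0.5 -> inside the margin but still correctly classified (0 < xi < 1)
    print(slack(+1, 0.0))   # 1.0 -> exactly on the decision boundary
    print(slack(+1, -0.5))  # 1.5 -> misclassified (xi >= 1)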
70. The bounds derived for Soft-SVM do not depend on the dimension of the instance
space.
a) True
b) False
71. A Lagrange dual of a convex optimisation problem is another convex optimisation
problem.
a) True
b) False
72. The difference between the primal and dual solutions is known as duality gap.
a) True
b) False
73. In optimisation problems, the Lagrangian is used to find only the local minima of a
function subject to certain constraints.
a) True
b) False
74. Karush–Kuhn–Tucker (KKT) conditions are second derivative tests for a solution in
nonlinear programming to be optimal, provided that some regularity conditions are
satisfied.
a) False
b) True
MODULE-5
76. Reinforcement learning is a ____
A. Prediction-based learning technique
B. Feedback-based learning technique
C. History results-based learning technique
77. How many types of feedback does reinforcement learning provide?
A. 1
B. 2
C. 3
D. 4
78. Which kind of data does reinforcement learning use?
A. Labeled data
B. Unlabelled data
C. None
D. Both
79. Reinforcement learning methods learn through ____?
A. Experience
B. Predictions
C. Analyzing the data
80. How many types of machine learning are there?
A. 2
B. 3
C. 4
D. 5
81. Which of the following is a practical example of reinforcement
learning?
A. House pricing prediction
B. Market basket analysis
C. Text classification
D. Driverless cars
82. What is an agent in reinforcement learning?
A. Agent is the situation in which rewards are being exchanged
B. Agent is the simple value in reinforcement learning.
C. An agent is an entity that explores the environment.
83. What is the environment in reinforcement learning?
A. Environment is a situation that is based on the current state
B. Environment is a situation in which an agent is present.
C. Environment is similar to feedback
D. Environment is a situation that the agent returns as a result.
84. What are actions in reinforcement learning?
A. Actions are the moves that the agent takes inside the environment.
B. Actions are the function that the environment takes.
C. Actions are the feedback that an agent provides.
85. What is the state in reinforcement learning?
A. State is a situation in which an agent is present.
B. A state is the simple value of reinforcement learning.
C. A state is a result returned by the environment after an agent takes an
action.
86. What are the rewards in reinforcement learning?
A. An agent's action is evaluated based on feedback returned from the
environment.
B. Environment gives value in return which is known as a reward.
C. A reward is a result returned by the environment after an agent takes
an action.
87. What is the Policy in reinforcement learning?
A. The agent's policy determines what environment model should be
decided
B. The agent's policy determines what action to take based on the current
state.
C. The agent's policy determines what the state reward would be.
88. Does reinforcement learning follow the concept of the hit-and-try
method?
A. Yes
B. No
89. In how many ways can you implement reinforcement learning?
A. 2
B. 3
C. 4
D. 5
90. In which of the following approaches to reinforcement learning do we
find the optimal value function?
A. Value-based
B. Policy-based
C. Model-based
91. How many types of policy-based approaches are there in reinforcement
learning?
A. 1
B. 2
C. 3
D. 4
92. In which of the following approaches to reinforcement learning is a
virtual model created for the environment?
A. Value-based
B. Policy-based
C. Model-based
93. ____ is a synonym for random and probabilistic?
A. Deterministic
B. Stochastic
94. How many elements does reinforcement learning consist of?
A. 2
B. 3
C. 4
D. 5
95. The agent's main objective is to ____ the total number of rewards for
good actions.
A. Minimize
B. Maximize
C. Null
96. Reinforcement learning is defined by the ____?
A. Policy
B. Reward Signal
C. Value Function
D. Model of the environment
97. Which element in reinforcement learning defines the behavior of the
agent?
A. Policy
B. Reward Signal
C. Value Function
D. Model of the environment
98. Can reward signals change the policy?
A. Yes
B. No
99. On which of the following elements of reinforcement learning does the
reward that an agent can expect depend?
A. Policy
B. Reward Signal
C. Value Function
D. Model of the environment
100. Which of the following elements of reinforcement learning imitates the
behavior of the environment?
A. Policy
B. Reward Signal
C. Value Function
D. Model of the environment
101. The approach in which reinforcement learning problems are solved with
the help of models is known as ____?
A. Model-based approach
B. Model-free approach
C. Model known approach
102. Who introduced the Bellman equation?
A. Richard Ernest Bellman
B. Alfonso Shimbel
C. Edsger W. Dijkstra
103. What is gamma (γ) in the Bellman equation known as?
A. Value factor
B. Discount factor
C. Environment factor
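[A sketch for Q102-Q103 — one Bellman backup, V(s) = max over a of [ R(s, a) + gamma * V(s') ], where gamma is the discount factor; the rewards and successor values below are assumed for illustration.]

    gamma = 0.9
    actions = {               # action -> (immediate reward, value of the resulting state)
        'left':  (1.0, 4.0),
        'right': (0.0, 10.0),
    }
    V_s = max(r + gamma * v_next for r, v_next in actions.values())
    print(V_s)  # 9.0: 'right' wins because the future value is discounted by gamma, not ignored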
104. How many types of reinforcement learning are there?
A. 3
B. 4
C. 2
D. 5
105. How do you represent the agent state in reinforcement learning?
A. Discount state
B. Discount factor
C. Markov state
106. P[St+1 | St] = P[St+1 | S1, ..., St]. In this condition,
what is the meaning of St?
A. State factor
B. Discount factor
C. Markov state
107. What do you mean by MDP in reinforcement learning?
A. Markov discount procedure
B. Markov discount process
C. Markov deciding procedure
D. Markov decision process
108. Why do we use MDP in reinforcement learning?
A. We use MDP to formalize the reinforcement learning problems.
B. We use MDP to predict reinforcement learning problems.
C. We use MDP to analyze the reinforcement learning problems.
109. How many tuples does MDP consist of?
A. 2
B. 3
C. 4
D. 5
110. Which of the following is a model-free, off-policy reinforcement
learning algorithm that finds the best course of action based on the
agent's current state?
A. Q-learning
B. Markov property
C. State action reward state action
D. Deep Q neural network
111. What do you mean by SARSA in reinforcement learning?
A. State action reward state action
B. State achievement rewards state action
C. State act reward achievement
D. State act reward act
112. ___ is the policy that an agent is trying to learn?
A. behavior policy
B. Target policy
C. On-policy
D. Off-policy
113. ____ is the policy which is used by an agent for action selection?
A. behavior policy
B. Target policy
C. On-policy
D. Off-policy
114. Which of the following types of policy refers to a learning algorithm in which the
same policy is improved and evaluated?
A. behavior policy
B. Target policy
C. On-policy
D. Off-policy
115. Which of the following types of policy refers to a learning algorithm that
evaluates and improves a policy that is different from the policy that is
used for action selection?
A. behavior policy
B. Target policy
C. On-policy
D. Off-policy
116. Between on-policy and off-policy, in which is the target policy
not equal to the behavior policy?
A. On-policy
B. Off-policy
117. Between on-policy and off-policy, in which is the target policy
equal to the behavior policy?
A. On-policy
B. Off-policy
118. Does Q-learning follow an on-policy or an off-policy
learning algorithm?
A. On-policy
B. Off-policy
119. Does SARSA follow an on-policy or an off-policy learning
algorithm?
A. On-policy
B. Off-policy
120. What is DQN in reinforcement learning?
A. Dynamic Q-learning network
B. Dynamic Q-neural network
C. Deep Q-neural network
121. Which of the following correctly states the difference between Q-learning
and SARSA?
A. QL directly learns the optimal policy, whereas SARSA learns a policy
that is "near" the optimal
B. SARSA directly learns the optimal policy, whereas QL learns a policy
that is "near" the optimal.
122. Which of the following gives the better final performance?
A. QL
B. SARSA
123. Which of the following is faster?
A. QL
B. SARSA
124. Is Q-learning a model-free or a model-based learning algorithm?
A. Model-free
B. Model-based
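[A sketch for Q110-Q124, assuming the standard tabular updates — Q-learning bootstraps with the greedy next action (off-policy, model-free), while SARSA bootstraps with the action actually taken next (on-policy).]

    def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
        target = r + gamma * max(Q[s_next].values())   # greedy next action: off-policy
        Q[s][a] += alpha * (target - Q[s][a])

    def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
        target = r + gamma * Q[s_next][a_next]         # behaviour's next action: on-policy
        Q[s][a] += alpha * (target - Q[s][a])

    Q = {'s0': {'a': 0.0, 'b': 0.0}, 's1': {'a': 2.0, 'b': 5.0}}
    q_learning_update(Q, 's0', 'a', r=1.0, s_next='s1')
    sarsa_update(Q, 's0', 'b', r=1.0, s_next='s1', a_next='a')
    print({k: round(v, 2) for k, v in Q['s0'].items()})  # {'a': 0.55, 'b': 0.28}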
125. Which of the following is not a Pruning technique?
a) Cost based pruning
b) Cost complexity pruning
c) Minimum error pruning
d) Maximum error pruning
FACULTY SIGNATURE HOD