AI Unit 4 QA
Machine Learning:
Supervised and unsupervised learning
Decision trees.
Statistical learning models
Learning with complete data - Naive Bayes models.
Ques 1. What are the different types of learning?
Ans : (a) Supervised Learning (b) Unsupervised Learning (c) Reinforcement Learning.
Ques 3. Define training set, positive examples, and negative examples in decision-tree learning.
Ans : These terms are used in decision-tree learning. An example consists of a vector of input attributes X and a single Boolean output y, e.g., a set of examples (X1, Y1), …, (X6, Y6). Positive examples are those in which the goal is true; negative examples are those in which the goal is false. The complete set of examples is called the training set.
Ques 4. Compare the decision-tree method with naïve Bayes learning.
Ans : (i) Naïve Bayes learning is slightly less effective than decision-tree learning on some problems.
(ii) Naïve Bayes learning works well for a wide range of applications as compared to decision trees.
(iii) Naïve Bayes scales well to very large problems: with n Boolean attributes, only 2n + 1 parameters are required.
Inductive Learning: This technique requires the use of inductive inference, a form of invalid but useful inference. We use inductive learning when we formulate a general concept after seeing a number of instances or examples of the concept, e.g., when we learn the concept of color or of sweet taste after experiencing the sensations associated with several objects.
Deductive Learning: This is performed through a sequence of deductive inference steps using known facts. From the known facts, new facts or relationships are logically derived. E.g., if we have the information that the weather is hot and humid, then we can infer that it may also rain. Another example: let P → Q and Q → R; then we can infer that P → R.
Ques 8. Write short notes on the following: (a) Statistical Learning (b) Naïve Bayes Model
Ans : (a) Statistical Learning: The key ideas in this technique are data and hypotheses. Here the data are evidence, i.e., instantiations of some or all of the random variables describing the domain. Bayesian learning calculates the probability of each hypothesis given the data and makes predictions on that basis.
Let D be the data set, with observed value d. Then the probability of each hypothesis hi is obtained by Bayes' rule:
P(hi | d) = α P(d | hi) P(hi), where α is a normalizing constant.
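As an illustration, here is a minimal Python sketch of this Bayesian update over a small discrete hypothesis space. The five hypotheses, their priors, and the candy flavors are invented for the example, not taken from the notes:

# Minimal sketch of Bayesian learning over a discrete hypothesis space.
# Each hypothesis h fixes P(lime) for candies drawn i.i.d. from a bag.
priors = {"h1": 0.1, "h2": 0.2, "h3": 0.4, "h4": 0.2, "h5": 0.1}
p_lime = {"h1": 0.0, "h2": 0.25, "h3": 0.5, "h4": 0.75, "h5": 1.0}

def posterior(data):
    """P(h | d) ∝ P(d | h) P(h) for i.i.d. observations."""
    unnorm = {}
    for h, prior in priors.items():
        likelihood = 1.0
        for candy in data:
            likelihood *= p_lime[h] if candy == "lime" else 1.0 - p_lime[h]
        unnorm[h] = likelihood * prior
    z = sum(unnorm.values())           # normalization constant (alpha = 1/z)
    return {h: v / z for h, v in unnorm.items()}

print(posterior(["lime"] * 5))         # posterior shifts toward all-lime hypotheses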
(b) Naïve Bayes Model: This is the most common Bayesian network model used in machine learning. In this model the class variable C (to be predicted) is the root and the attributes Xi are the leaves. The model is called naïve because it assumes that the attributes are conditionally independent of each other, given the class. Once the model has been trained using the maximum-likelihood technique, it can be used to classify new examples for which the class variable C is unobserved. For observed attribute values x1, x2, …, xn, the probability of each class is given by
P(C | x1, …, xn) = α P(C) ∏i P(xi | C)
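The following Python sketch shows this prediction rule on a toy spam/ham task; the class priors, attribute names, and conditional probabilities are all assumptions made for illustration:

import math

# Hedged sketch of naive Bayes prediction: P(C | x1..xn) ∝ P(C) ∏ P(xi | C).
p_class = {"spam": 0.4, "ham": 0.6}
# P(word present | class) for a few Boolean attributes (invented values)
p_attr = {
    "spam": {"offer": 0.7, "meeting": 0.1},
    "ham":  {"offer": 0.1, "meeting": 0.6},
}

def predict(observed):
    """observed: dict attribute -> True/False; returns the most probable class."""
    scores = {}
    for c, prior in p_class.items():
        log_score = math.log(prior)            # log P(C)
        for attr, present in observed.items():
            p = p_attr[c][attr]
            log_score += math.log(p if present else 1.0 - p)
        scores[c] = log_score
    return max(scores, key=scores.get)

print(predict({"offer": True, "meeting": False}))   # -> 'spam'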
Ques 9. What is learning with complete data? Explain maximum-likelihood parameter learning with a discrete model in detail.
Ans : Statistical learning methods begin with the simplest task: parameter learning with complete data. Parameter learning involves finding the numerical parameters for a probability model whose structure is fixed. E.g., in a Bayesian network, the conditional probabilities are learned for a given network structure. Data are complete when each data point contains values for every variable in the learning model.
Maximum Likelihood Parameter Learning: Suppose we buy a bag of lime and cherry candy from a new manufacturer whose lime–cherry proportions are completely unknown; that is, the fraction could be anywhere between 0 and 1. The parameter θ is the proportion of cherry candies, and the corresponding hypothesis is hθ (the proportion of limes is 1 − θ).
If we assume that all proportions are equally likely a priori, then a maximum-likelihood approach is reasonable. If we model the situation with a Bayesian network, we need just one random variable, Flavor (the flavor of a randomly chosen candy from the bag). It has values cherry and lime, where the probability of cherry is θ. Now suppose we unwrap N candies, of which c are cherries and ℓ = N − c are limes. The likelihood of this data set is:
P(d | hθ) = ∏ (j=1..N) P(dj | hθ) = θ^c (1 − θ)^ℓ
The maximum-likelihood hypothesis is given by the value of θ that maximizes this expression. Computing the log likelihood:
L(d | hθ) = log P(d | hθ) = c log θ + ℓ log(1 − θ)
(By taking logarithms, we reduce the product to a sum over the data, which is usually easier to maximize.) To find the maximum-likelihood value of θ, we differentiate L with respect to θ and set the resulting expression to zero:
dL(d | hθ)/dθ = c/θ − ℓ/(1 − θ) = 0, which gives θ = c/(c + ℓ) = c/N
The standard method for maximum-likelihood parameter learning is therefore:
1. Write down an expression for the likelihood of the data as a function of the parameter(s).
2. Write down the derivative of the log likelihood with respect to each parameter.
3. Find the parameter values such that the derivatives are zero.
For models with more than one parameter (e.g., θ, θ1, θ2), we again simplify the likelihood by taking the logarithm to obtain a sum, compute the first-order partial derivatives with respect to θ, θ1, and θ2, and equate them to zero; solving gives the maximum-likelihood values of the parameters.
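A short Python sketch of this recipe for the one-parameter candy model, using made-up counts; it also confirms numerically that the log likelihood peaks at θ = c/N:

import numpy as np

# Sketch of the ML recipe for the candy example: L(θ) = θ^c (1-θ)^ℓ, max at θ = c/N.
c, ell = 30, 70                     # observed cherries and limes (made-up counts)
theta_hat = c / (c + ell)           # closed-form ML estimate
print(theta_hat)                    # 0.3

# Numerical check: log likelihood c·log θ + ℓ·log(1-θ) peaks at the same value.
thetas = np.linspace(0.01, 0.99, 999)
log_lik = c * np.log(thetas) + ell * np.log(1 - thetas)
print(thetas[np.argmax(log_lik)])   # ≈ 0.3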
Ques 10. Write short notes on
(a) Continuous model for Maximum likelihood Estimation
(b) Learning with Hidden Variables.
(c) EM Algorithm.
Ans : (a) Continuous Model for Maximum-Likelihood Estimation: Continuous variables are very common in real-world applications, so it is important to know how to learn continuous models from data. The principles of maximum-likelihood learning are identical to those of the discrete case. Consider learning the parameters of a Gaussian density function on a single variable; that is, the data are generated as follows:
P(x) = (1 / (σ√(2π))) e^(−(x − μ)² / (2σ²))
The parameters of this model are the mean μ and the standard deviation σ. Let the observed values be x1, x2, …, xN. Then the log likelihood is
L = ∑ (j=1..N) log (1 / (σ√(2π))) e^(−(xj − μ)² / (2σ²)) = N(−log √(2π) − log σ) − ∑j (xj − μ)² / (2σ²)
Setting the first-order partial derivatives with respect to μ and σ equal to zero, we obtain
μ = (∑j xj) / N and σ = √( (∑j (xj − μ)²) / N )
That is, the maximum-likelihood value of the mean is the sample average, and the maximum-likelihood value of the standard deviation is the square root of the sample variance.
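A quick Python sketch of these closed-form estimates on synthetic data:

import numpy as np

# ML fit of a single-variable Gaussian: the ML mean is the sample average,
# the ML σ is the square root of the sample variance (dividing by N, not N-1).
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10_000)    # synthetic observations

mu_ml = x.mean()                                   # μ = (1/N) Σ x_j
sigma_ml = np.sqrt(((x - mu_ml) ** 2).mean())      # σ = sqrt((1/N) Σ (x_j - μ)²)
print(mu_ml, sigma_ml)                             # ≈ 5.0, ≈ 2.0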
(b) Learning with Hidden Variables: Many real-world problems have hidden variables (also called latent variables), which are not observable in the given data samples.
Examples: (i) In medical diagnosis, records mostly consist of symptoms, the treatment used, and the outcome of the treatment, but seldom contain a direct observation of the disease itself.
(ii) In predicting traffic congestion at office hours, a hidden variable might be an unobservable “rainy day” causing much less traffic at peak hours.
Example: Let a Bayesian network for heart disease (a hidden variable) be as given in the figure below. In figure (a), each variable has three possible values and is labeled with the number of independent parameters in its conditional distribution. Figure (b) shows the equivalent network with Heart Disease removed; note that the symptom variables are no longer conditionally independent given their parents. Latent variables can therefore dramatically reduce the number of parameters required to specify a Bayesian network, which in turn reduces the amount of data needed to learn the parameters.
(c) EM Algorithm (Expectation–Maximization Algorithm): This algorithm is used to solve the problems that arise in learning with hidden variables. The basic idea is to pretend that we know the parameters of the model and then infer the probability that each data point belongs to each component; each component is then refitted to the entire data set, with each point weighted by the probability that it belongs to that component. Expectation–maximization is the process used for clustering the data samples.
For given data, EM has the ability to predict feature values for each class on the basis of the classification of examples, by learning the theory that specifies it. It starts with a random theory and randomly classified data and then repeats two steps: compute the expected values of the hidden variables for each example, then recompute the parameters using the expected values as if they were observed values. Let X be the observed values in all examples, Z the set of all hidden variables, and θ all the parameters of the probability model (for a Gaussian mixture, θ = {μ, Σ}).
E-step: Compute the summation, i.e., the expectation of the log likelihood of the completed data with respect to P(Z = z | x, θ⁽ⁱ⁾), the posterior over the hidden variables.
M-step: Find the new values of the parameters that maximize this expected log likelihood:
θ⁽ⁱ⁺¹⁾ = argmax_θ ∑z P(Z = z | x, θ⁽ⁱ⁾) L(x, Z = z | θ)
The EM algorithm increases the log likelihood of the data at every iteration, and under certain conditions it can be proven to reach a local maximum in likelihood. In this sense EM resembles a gradient-based hill-climbing algorithm.
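Below is a minimal Python sketch of the E-step/M-step loop for a two-component, one-dimensional Gaussian mixture, a standard illustration of EM; the data set, initial guesses, and iteration count are synthetic assumptions:

import numpy as np

# Minimal EM sketch for a two-component 1-D Gaussian mixture (synthetic data).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 1, 200)])

w = np.array([0.5, 0.5])       # mixing weights
mu = np.array([-1.0, 1.0])     # initial means (a "pretended" guess)
sigma = np.array([1.0, 1.0])

def gauss(x, m, s):
    return np.exp(-(x - m) ** 2 / (2 * s ** 2)) / (s * np.sqrt(2 * np.pi))

for _ in range(50):
    # E-step: posterior responsibility of each component for each point
    r = w * gauss(x[:, None], mu, sigma)           # shape (N, 2)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters with points weighted by responsibility
    n_k = r.sum(axis=0)
    w = n_k / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)

print(w, mu, sigma)    # ≈ [0.6, 0.4], [0, 5], [1, 1]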
Ques 11. Explain the reinforcement learning technique in detail. Also mention its applications in the field of artificial intelligence.
Ans : Reinforcement Learning: This type of learning is used when an agent must learn with no teacher telling it what action to take in each circumstance.
Example 1: Consider a chess-playing agent trained by supervised learning on examples of game situations along with the best moves for those situations. The agent can also try random moves, so it can eventually build a predictive model of its environment. The issue is that without some feedback about what is good and bad, the agent has no grounds for deciding which move to select. The agent needs to know that something good has happened when it wins and that something bad has happened when it loses. This kind of feedback is called a reward, or reinforcement.
A General Learning Model of Reinforcement Learning:
Policy: This defines the learning agent's behavior at a particular time. It is a mapping from perceived states of the environment to the actions to be taken when in those states. A policy can be a simple function, a lookup table, or even a search process.
Reward Function: This is used to define the goal. It maps each perceived state–action pair of the environment to a single number, a reward, that indicates the desirability of that state. The objective is to maximize the total reward received in the long run. Reward functions may be stochastic.
Value Function: Whereas the reward function indicates what is good in an immediate sense, the value function specifies what is good in the long run. The value of a state is the total amount of reward an agent can expect to accumulate over the future, starting from that state.
Model: This represents the behavior of the environment. Models are used for planning, i.e., a way of deciding on a course of actions by considering possible future situations.
Application areas of Reinforcement learning are as mentioned below:
1) The most recent version of DeepMind's AI system for playing Go means that interest in reinforcement learning (RL) is bound to increase.
2) RL requires a lot of data, and as such, it has often been associated with domains where simulated
data is available (gameplay, robotics).
3) Automation of well-defined tasks that would benefit from sequential decision-making; RL can help automate such tasks (or at least augment a human expert).
4) Industrial automation is another promising area. It appears that RL technologies from
DeepMind helped Google significantly reduce energy consumption (HVAC) in its own data centers.
5) The use of RL can lead to training systems that provide custom instruction and materials tuned to the
needs of individual students. A group of researchers is developing RL algorithms and statistical
methods that require less data for use in future tutoring systems.
6) Many RL applications in health care mostly pertain to finding optimal treatment policies.
7) Companies collect a lot of text, and good tools that can help unlock unstructured text will find users.
8) RL has been used as a technique for automatically generating summaries from text, based on content “abstracted” from some original text document.
9) A Financial Times article described an RL-based system for optimal trade execution. The system
(dubbed “LOXM”) is being used to execute trading orders at maximum speed and at the best
possible price.
10) Many warehousing facilities used by e-commerce sites and other supermarkets use intelligent robots for sorting millions of products every day and helping to deliver the right products to the right people. Tesla's factory, for example, comprises more than 160 robots that do a major part of the work on its cars, reducing the risk of defects.
11) Reinforcement learning algorithms can be built to reduce transit time for stocking as well as retrieving products in the warehouse, optimizing space utilization and warehouse operations.
12) Reinforcement learning and optimization techniques are utilized to assess the security of electric power systems and to enhance microgrid performance. Adaptive learning methods are employed to develop control and protection schemes.
Ques 12. Discuss the various types of reinforcement learning techniques.
Ans : Reinforcement learning is of the following three types:
(a) Passive Reinforcement Learning (b) Temporal Difference Learning (c) Active Reinforcement Learning.
Passive Reinforcement Learning: In this technique the agent's policy is fixed and the task is to learn the utilities of states (or state–action pairs). If the policy is π and the state is S, the agent always executes the action π(S). The goal is to learn how good the policy is, i.e., to learn the utility function Uπ(S). The passive learning agent does not know the transition model T(S, a, S′), which specifies the probability of reaching state S′ from state S after action a; nor does it know the reward function R(S). The utility is defined to be the expected sum of discounted rewards obtained if policy π is followed:
Uπ(S) = E[ ∑ (t=0..∞) γᵗ R(Sₜ) | π, S₀ = S ], where γ is a discount factor.
Temporal Difference Learning: When a transition occurs from state S to state S′, we update Uπ(S) as
Uπ(S) ← Uπ(S) + α ( R(S) + γ Uπ(S′) − Uπ(S) ), where α is the learning-rate parameter.
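A minimal Python sketch of this update rule on one illustrative transition; the state names, reward, and step sizes are assumptions:

# Sketch of the TD(0) utility update U(S) ← U(S) + α(R(S) + γ·U(S') − U(S)),
# applied after each observed transition under the agent's fixed policy.
def td_update(U, s, s_next, reward, alpha=0.1, gamma=0.9):
    """One temporal-difference update of the utility estimate for state s."""
    u_s = U.get(s, 0.0)
    u_next = U.get(s_next, 0.0)
    U[s] = u_s + alpha * (reward + gamma * u_next - u_s)

U = {}
td_update(U, "A", "B", -0.04)   # agent observes transition A -> B, reward -0.04
print(U)                        # {'A': -0.004}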
Active Reinforcement Learning: Here the agent must also decide which actions to take, rather than simply following a fixed policy. To cope with large state spaces, function approximation is used: the compression achieved by a function approximator allows the learning agent to generalize from states it has visited to states it has not visited. E.g., an evaluation function for chess can be represented as a weighted linear function of a set of features (basis functions) f1, f2, …, fn:
Û(s) = θ1 f1(s) + θ2 f2(s) + … + θn fn(s)
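A tiny Python sketch of such a weighted linear evaluation function; the feature meanings and weight values are hypothetical:

# Weighted linear evaluation Û(s) = Σ θ_i · f_i(s), used to generalize across states.
def linear_eval(features, weights):
    """Utility estimate as a dot product of feature values and weights."""
    return sum(w * f for w, f in zip(weights, features))

features = [2.0, 0.5]    # e.g. f1 = material advantage, f2 = mobility (assumed)
weights = [1.0, 0.3]     # θ parameters, adjusted by the learning algorithm
print(linear_eval(features, weights))   # 2.15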
Ques 13. Explain decision-tree learning with an example. Mention its application areas.
Ans : Example: Consider a decision tree learned to predict whether a potential customer will respond to a direct mailing. Given this classifier, the analyst can predict the response of a potential customer (by sorting it down the tree) and understand the behavioral characteristics of the entire population of potential customers regarding direct mailing. Each node is labeled with the attribute it tests, and its branches are labeled with the attribute's corresponding values. For example, one of the paths in the figure below can be converted into the rule: “If the customer's age is less than or equal to 30, and the customer is male, then the customer will respond to the mail.”
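This single path can be written directly as code; a minimal Python sketch with assumed attribute names:

# Hand-coded sketch of the one tree path quoted above:
# "age <= 30 and gender == 'Male' -> responds".
def will_respond(age, gender):
    """Follow one path of the direct-mailing decision tree."""
    if age <= 30:
        if gender == "Male":
            return True    # leaf: customer responds to the mail
    return False           # other paths of the tree omitted in this sketch

print(will_respond(25, "Male"))   # True
print(will_respond(40, "Male"))   # False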
Application Areas of Decision Tree Learning
1) Variable selection: The number of variables that are routinely monitored in clinical settings has
increased dramatically with the introduction of electronic data storage. Many of these variables are of
marginal relevance and, thus, should probably not be included in data mining exercises.
2) Handling of missing values: A common, but incorrect, method of handling missing data is to exclude cases with missing values; this is both inefficient and runs the risk of introducing bias into the analysis. Decision tree analysis can deal with missing data in two ways: it can either classify missing values as a separate category that can be analyzed alongside the other categories, or build a decision-tree model that sets the variable with many missing values as the target variable, makes a prediction, and replaces the missing values with the predicted value.
3) Prediction: This is one of the most important uses of decision tree models. Using the tree model derived from historical data, it is easy to predict the result for future records.
4) Data manipulation: Too many categories of one categorical variable or heavily skewed continuous
data are common in medical research.
Ques 14. Write short notes on the following: (A) Regression Trees (B) Bayesian Parameter Learning.
Ans : Regression Trees: Regression trees are commonly used to solve problems where the target variable is numerical/continuous instead of discrete. Regression trees possess the following properties:
a) Leaf nodes predict the average value of all instances reaching that leaf.
b) Splitting criterion: minimize the variance of the values in each subset Si.
c) Standard Deviation Reduction: SDR(A, S) = SD(S) − ∑i ( |Si| / |S| ) · SD(Si)
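A small Python sketch of the SDR computation on made-up values:

import statistics

# SDR(A, S) = SD(S) − Σ_i (|S_i| / |S|) · SD(S_i), for a candidate split on A.
def sdr(parent, subsets):
    """Standard deviation reduction achieved by splitting `parent` into `subsets`."""
    n = len(parent)
    weighted = sum(len(s) / n * statistics.pstdev(s) for s in subsets)
    return statistics.pstdev(parent) - weighted

S = [10, 12, 30, 32, 11, 31]
split = [[10, 12, 11], [30, 32, 31]]   # a candidate split on attribute A
print(sdr(S, split))                   # large reduction -> good split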
Bayesian Parameter Learning: This learning technique treats the parameters as random variables having some prior distribution. An optimal classifier can be designed using the class-conditional densities p(x | ωi). In a typical case we merely have some vague knowledge about the situation, together with a number of training samples. Observation of the samples converts the prior into a posterior density and revises our estimates of the true values of the parameters. In Bayesian learning the posterior density function is sharpened, causing it to peak near the true values.
• We assume the priors are known: P(ωi | D) = P(ωi).
• We also assume functional independence, so each class can be treated separately. Classification then uses
P(ωi | x, D) = p(x | ωi, D) P(ωi) / ∑ (j=1..c) p(x | ωj, D) P(ωj)
• Any information we have about θ prior to collecting the samples is contained in the prior density p(θ).
• Observation of the samples converts this into a posterior density p(θ | D), which we hope is sharply peaked around the true value of θ.
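As an illustrative sketch (not from the notes), the Bernoulli/Beta case shows this sharpening concretely: with a Beta prior on a Bernoulli parameter θ, the posterior after observing counts is again a Beta density that peaks near the true value. The prior values and counts below are assumptions:

# Bayesian parameter learning for a Bernoulli θ with a conjugate Beta prior:
# posterior = Beta(a + c, b + ℓ), which sharpens as samples accumulate.
a, b = 1.0, 1.0            # Beta(1,1) = uniform prior over θ (assumed)
c, ell = 30, 70            # observed counts (e.g. cherries and limes, made up)

a_post, b_post = a + c, b + ell
posterior_mean = a_post / (a_post + b_post)
posterior_mode = (a_post - 1) / (a_post + b_post - 2)   # MAP estimate

print(posterior_mean, posterior_mode)   # ≈ 0.304, 0.300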