
UNIT III

Reasoning under uncertainty: Logics of non-monotonic reasoning - Implementation - Basic probability notation - Bayes rule - Certainty factors and rule based systems - Bayesian networks - Dempster-Shafer Theory - Fuzzy Logic.
Two Marks
1. State the conditions under which uncertainty can arise.
Ans: Uncertainty can arise because of incompleteness and incorrectness in the agent's
understanding of the properties of the environment.

2. What are the three reasons why FOL fails in medical diagnosis?
Ans: The three reasons why FOL fails in medical diagnosis are:
(i) Laziness: Too much work to list the complete set of antecedents and consequents
needed.
(ii) Theoretical ignorance: Medical science has no complete theory for the domain.
(iii) Practical ignorance: Even if we know all the rules, uncertainty arises because
some tests cannot be run on the patient's body.
3. What is the tool that is used to deal with degree of belief?
Ans: The tool that is used to deal with the degree of belief is the probability theory which
assigns a numeric degree of belief between 0 and 1.
4. For what is utility theory useful, and in what way is it related to decision theory?
Ans: Utility theory is used to represent and reason with preferences.
Decision theory is related to utility theory in that decision theory is the combination of
probability theory and utility theory.
5. What is the fundamental idea of the decision theory?
Ans: The fundamental idea of the decision theory is that an agent is rational if and only if it
chooses the action that yields the highest expected utility.

6. How are probabilities over simple and complex propositions classified?
Ans: Probabilities over simple and complex propositions are classified as
(i) Unconditional or Prior probabilities.
(ii)Conditional or Posterior probabilities.
7. State the axioms of probability.
Ans: The axioms are:

(i) All probabilities are between 0 and 1: 0 <= P(A) <= 1.

(ii) Necessarily true propositions have probability 1 and necessarily false
propositions have probability 0: P(True) = 1, P(False) = 0.

(iii) The probability of a disjunction is given by

P(A V B) = P(A) + P(B) - P(A ^ B)

From axiom (iii), letting B = ¬A:

P(A V ¬A) = P(A) + P(¬A) - P(A ^ ¬A)

P(True) = P(A) + P(¬A) - P(False)   (by logical equivalence)

1 = P(A) + P(¬A)                    (by axiom (ii))

P(¬A) = 1 - P(A)                    (by algebra)


8. What is Joint Probability Distribution?
Ans: An agent's probability assignments to all propositions in the domain (both simple and
complex) is called the Joint Probability Distribution.

9. What is the disadvantage of Bayes' rule?


Ans: It requires three terms to compute one conditional probability P(B|A):
*One conditional probability -> P(A|B)
*Two unconditional probabilities -> P(B) and P(A)
10. What is the advantage of Bayes' rule?
Ans: If the three values above are known, then the unknown fourth value P(B|A) can be computed easily.
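A minimal Python sketch of this computation, using invented illustrative numbers (the disease/test values below are assumptions, not part of the text):

# Bayes' rule: P(B|A) = P(A|B) * P(B) / P(A)
# Illustrative (assumed) values: B = "disease present", A = "test positive".
p_a_given_b = 0.9   # P(A|B), the one conditional probability
p_b = 0.01          # P(B), unconditional probability
p_a = 0.05          # P(A), unconditional probability

p_b_given_a = p_a_given_b * p_b / p_a
print(round(p_b_given_a, 2))  # 0.18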

11. What is a belief network?


Ans: A Belief network is a graph in which the following holds:
1. A set of random variables makes up the nodes of the network.
2. A set of directed links or arrows connects pairs of nodes; a link X -> Y means X has a direct
influence on Y.
3. Each node has a conditional probability table that quantifies the effects that the parents
have on the node. The parents of a node are all nodes that have arrows pointing to it.
4. The graph has no directed cycles (it is a DAG).
12. What is the task of any Probabilistic inference system?
Ans: The task of any probabilistic inference system is to compute the posterior probability
distribution for a set of query variables, given exact values for some evidence variables.

13. State the uses of belief networks.


Ans: 1. Making decisions based on probabilities in the network and on the agent's utilities.
2. Deciding which additional evidence variables should be observed in order to gain useful
information.
3. Sensitivity analysis: understanding which aspects of the model have the greatest impact.
4. Explaining the results of probabilistic inference to the user.

14. What are the two ways in which one can understand the semantics of belief
networks?
Ans: The two ways are
1. Seeing the network as a representation of the joint probability distribution
- used to understand how to construct networks.
2. Seeing it as an encoding of a collection of conditional independence statements.
- used in designing inference procedures.

15. What is Probabilistic Reasoning? Explain


Ans: Probabilistic reasoning explains how to build network models to reason under uncertainty
according to the laws of probability theory.
16. What are the disadvantages of Full joint Probability distribution?
Ans: The disadvantage is that as the interaction between the domain and the agent increases,
the number of variables also increases and the full joint distribution becomes unmanageably
large; to overcome this we use a new data structure called a Bayesian network.

17. Explain Bayesian Network.


Ans: Bayesian network is used to represent the dependencies among variables and to give a
concise specification of any full joint probability distribution.

18. What makes the nodes of the Bayesian network and how are they connected?
Ans: A set of random variables makes up the nodes of the network. Variables may be discrete
or continuous, and a set of directed links or arrows connects pairs of nodes.
19. What is Conditional Probability Table (CPT)?
Ans: The table representation of the Bayesian network is called the Conditional probability
table.
20. What is Conditioning Case?
Ans: A Conditioning case is just a possible combination of values for the parent nodes.
21. What are the Semantics of Bayesian Network?
Ans: The Semantics can be represented as the
1. Global Semantics (Full joint probability distribution)
2. Local Semantics (Conditional independent statements)
22. State the way of representing the full joint distribution.
Ans: A generic entry in the joint distribution is the probability of a conjunction of particular
assignments to each variable, such as
P(X1 = x1 ^ ... ^ Xn = xn) = P(x1, ..., xn)
23. When is a Bayesian network said to be compact?
Ans: A Bayesian network is compact when the information is complete and non-redundant.
The compactness of a Bayesian network is an example of a very general property of
locally structured systems.
24. Explain Node Ordering.
Ans: The correct order in which to add nodes is to add the root causes first, then the variables
they influence, and to continue the addition process until we reach the leaves, which have no
direct causal influence on the other variables.

25. What is Topological Semantics?


Ans: Topological semantics specifies the conditional independence relationships encoded by the
graph structure, and from these we can derive the numerical statements.

26. What is the specification for the Topological Semantics?


Ans: 1. A node is conditionally independent of its non-descendants, given its parents.
2. A node is conditionally independent of all other nodes in the network, given its
parents, children, and children's parents; this set is called the Markov blanket.
27. What are the two ways to represent Canonical Distributions?
Ans: 1.Deterministic nodes
2. Noisy-OR relation
28. What are Deterministic nodes?
Ans: Deterministic nodes are used to represent certain knowledge, by means of X = f(Parents(X))
for some function f.
29. Explain Noisy-OR Relationship.
Ans: The noisy-OR relation is a generalization of logical OR, and it is used to represent
uncertain knowledge.
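A brief Python sketch of how a noisy-OR conditional probability could be computed. The per-cause inhibition values below (for a fever with possible causes cold, flu, malaria) are assumptions used only for illustration:

def noisy_or_true(active_causes, inhibition):
    # Under noisy-OR, the effect is false only if every active (true) cause
    # is independently inhibited, so P(effect=true) = 1 - product of inhibitions.
    p_false = 1.0
    for cause in active_causes:
        p_false *= inhibition[cause]
    return 1.0 - p_false

# Assumed inhibition (noise) parameter for each parent cause.
inhibition = {"cold": 0.6, "flu": 0.2, "malaria": 0.1}
print(noisy_or_true(["cold", "flu"], inhibition))             # 1 - 0.6*0.2 = 0.88
print(noisy_or_true(["cold", "flu", "malaria"], inhibition))  # 1 - 0.6*0.2*0.1 = 0.988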
30. What are the four types of inferences?
Ans: 1. Diagnostic inferences
2. Causal inferences
3. Intercausal inferences
4. Mixed inferences

31. What are the three basic classes of algorithms for evaluating multiply connected
networks?
Ans: The three basic classes of algorithms for evaluating multiply connected networks are
clustering, conditioning methods and stochastic simulation methods.

32. What is done in clustering?

Ans: Clustering transforms the network into a probabilistically equivalent polytree by merging
offending nodes.

33. What is cutset?


Ans: A set of variables that can be instantiated to yield a polytree is called a cutset, i.e., this
method transforms the network into several simpler polytrees.
34. What is a utility function?
Ans: An agent's preferences between world states are captured by a utility function, which
assigns a single number to express the desirability of a state. Utilities are combined with the
outcome probabilities for actions to give an expected utility for each action.
35. What is the principle behind the maximum expected utility?
Ans: The principle of MEU (Maximum Expected Utility) is that a rational agent should choose
an action that maximizes the agent's expected utility.

36. Explain orderability.


Ans: Given any two states, an agent should prefer one to the other or else rate the two as
equally preferable, i.e., exactly one of (A > B), (B > A), (A ~ B) holds.

37. What is meant by multi attribute utility theory-MAUT?


Ans: Problems in which outcomes are characterized by two or more attributes are handled by
multi-attribute utility theory (MAUT). The basic approach adopted in MAUT is to identify
regularities in the preference behavior and to use representation theorems to show that an
agent with a certain kind of preference structure has a utility function.

38. What are the two types of the dominance?


Ans: The two types of dominance are strict dominance and stochastic dominance.

39. What are the two roles of decision analysis?


Ans: The two roles of decision analysis are decision makers and decision analyst.

40. What are the axioms of the utility theory?


Ans: They are orderability, transitivity, continuity, substitutability, monotonicity and
decomposability.

41. What is non – monotonic reasoning? (Apr- May’14)


Ans: Non – monotonic reasoning is one in which axioms and / or rules of inference are
extended to make it possible to reason with incomplete information. These systems preserve
the property that at any given moment, a statement is either believed to be true, believed to
be false, or not believed to be either.

42. What is meant by nonmonotonic logic?


Ans: One system that provides a basis for default reasoning is nonmonotonic logic, in which
first-order logic is augmented with a modal operator M, which is read as
"is consistent".

43. Give an example for nonmonotonic logic.

∀x,y : Related(x,y) ∧ M GetAlong(x,y) → WillDefend(x,y)


It is read as "For all x and y, if x and y are related and if the fact that x gets along with y is
consistent with everything else that is believed, then conclude that x will defend y."

44. What are the 2 kinds of nonmonotonic reasoning?


Ans: The 2 kinds of nonmonotonic reasoning are:
a) Abduction
b) Inheritance

45. What is meant by ATMS?


Ans: ATMS – Assumption based truth maintenance system. An ATMS simply labels all the
states that have been considered at the same time. An ATMS keeps track for each sentence of
which assumptions would cause the sentence to be true.

46. What is meant by JTMS? (Nov’15)


Ans: JTMS - Justification based truth maintenance system. A JTMS labels each sentence
as being IN or OUT. The maintenance of justifications allows us to move quickly from one
state to another by making a few retractions and assertions, but only one state is
represented at a time.
47. Define belief revision.
Ans: Many of the inferences drawn by a knowledge representation system will have only
default status. Some of these inferred facts will turn out to be wrong and will have to be
retracted in the face of new information. This process is called belief revision.

48. List the limitations of CWA.


Ans: The limitations of CWA are:
1. It operates on individual predicates without considering the interactions among
predicates that are defined in the knowledge base.
2. It assumes that all predicates have all their instances listed. Although in many
database applications this is true, in many knowledge based systems it is not.

49. Give the difference between ATMS and JTMS.


Ans: ATMS:
A) An ATMS simply labels all the states that have been considered at the same time.
B) An ATMS keeps track, for each sentence, of which assumptions would cause the
sentence to be true.
JTMS:
A) A JTMS labels each sentence as being IN or OUT.
B) The maintenance of justifications allows us to move quickly from one state to another
by making a few retractions and assertions, but only one state is represented at a time.

50. Define Markov Decision Process (MDP).


Ans: The specification of a sequential decision problem for a fully observable environment
with a Markovian Transition model and additive reward is called Markov Decision Process.

51. What are the 3 components that define MDP?


Ans: The 3 components that define MDP are:
 Initial state S₀

 Transition model T(s, a, s’)

 Reward function R(s)


52. What is a policy and how is it denoted?
Ans: A solution must specify what the agent should do for any state that the agent may reach.
This solution is referred to as a policy. It is denoted by π.

53. When are complex decisions made?


Ans: Complex decisions are made when the outcomes of an action are uncertain. Decisions
are taken in fully observable, partially observable and uncertain environments.

54. What does complex decisions deal with?


Ans: Complex decisions deal with sequential decision problems where the agent’s utility
depends on a sequence of decisions.

55. Define transition model.


Ans: The specification of the outcome probabilities for each action in each possible state is
called a transition model.

56. What is meant by Markovian transition?


Ans: When action 'a' is done in state 's', the probability of reaching 's'' is denoted by T(s, a,
s'). This is referred to as a Markovian transition because the probability of reaching s' from s
depends only on s and not on the environment history.

57. What is an optimal policy and how is it denoted?

Ans: An optimal policy is a policy that yields the highest expected utility. It is denoted by π*.
58. Define proper policy.
Ans: A policy that is guaranteed to reach a terminal state is called a proper policy.

59. What are the types of horizons for decision making?


Ans: The 2 types of horizons for decision making are:
 Finite Horizon

 Infinite horizon


60. What is meant by finite horizon?

Ans: A finite horizon means that there is a fixed time after which the game is over. With a finite
horizon, the optimal action in a given state could change over time. Therefore, the optimal
policy for a finite horizon is nonstationary.

61. Differentiate between finite horizon and infinite horizon.
Ans: FINITE HORIZON:
 With a finite horizon, there is a fixed deadline; the optimal action in a given state could change over time.

 The optimal policy for a finite horizon is nonstationary.

INFINITE HORIZON:
 For an infinite horizon there is no fixed deadline.

 Since the optimal policy depends only on the current state, it is stationary.
62. What is mechanism design (Inverse game theory)?
Ans: Designing a game whose solutions consist of each agent having its own rational strategy
is called mechanism design (inverse game theory). It can be used to construct intelligent
multi-agent systems that solve complex problems in a distributed way.
63. What does a mechanism consist of?
Ans: A mechanism consists of
1. A language for describing the set of allowable strategies that the agents may adopt.
2. An outcome rule G that determines the payoffs to the agents, given the strategy
profile of allowable strategies.
64. Define dominant strategy.
Ans: A dominant strategy is a strategy that dominates all others. A strategy s for player p
strongly dominates strategy s' if the outcome of s is better for p than the outcome of s', for
every choice of strategies by the other players.

65. Define weak dominance.


Ans: Strategy s weakly dominates s’ if s is better than s’ on at least one strategy profile and no
worse on any other.
66. What is Nash equilibrium?
Ans: The mathematician John Nash proved that every game has at least one equilibrium of the
type defined above; such an equilibrium is called a Nash equilibrium.

67. What is maximin technique?


Ans: For two-player zero-sum games, von Neumann developed a method for finding the
optimal mixed strategy. This is called the maximin technique.

68. What is dominant strategy equilibrium?


Ans: In the case of a two-player game, when both players have a dominant strategy, the
combination of those strategies is called a dominant strategy equilibrium.

69. What is meant by pareto optimal and pareto dominated?


Ans: An outcome is referred to as pareto optimal if there is no other outcome that all players
would prefer.
An outcome is pareto dominated by another if all players prefer the other outcome.
70. What are the 2 types of strategy?
Ans: The 2 types of strategy are:
1. Pure strategy
2. Mixed strategy.

71. Define pure strategy.


Ans: A pure strategy is a deterministic policy specifying a particular action to take in each
situation.

72. Define mixed strategy.


Ans: A mixed strategy is a randomized policy that selects particular actions according to a
specific probability distribution over actions.

73. What are the components of a game in game theory?


Ans: The components of a game in game theory are:
1. Players or agents who will make decisions.
2. Actions that the player can choose.
3. A payoff matrix that gives the utility for each combination of actions by all the
players.
74. What is a contraction?

Ans: A contraction is a function of one argument that, when applied to two different inputs,
produces two output values that are closer together, by at least some constant amount, than the
original values.

75. What are the properties of contraction?


Ans: The properties of contraction are:
 A contraction has only one fixed point; if there were two fixed points, they would not
get closer together when the function was applied, so the function would not be a contraction.

 When the function is applied to any argument, the value must get closer to the fixed
point, and hence repeated application of a contraction always reaches the fixed
point.

76. Write down Bayes' Rule. (Nov'13)(Apr-May'15)(Nov'15)
Ans: P(A|B) = P(B|A) P(A) / P(B)

77. What is meant by Fuzzy Logic? (Apr-May’14) (Nov- Dec ’14)(Nov’15)


Fuzzy logic is a form of many-valued logic in which the truth values of variables may be any real
number between 0 and 1, considered to be "fuzzy". By contrast, in Boolean logic, the truth values of
variables may only be 0 or 1, often called "crisp" values.

78. What is meant by Probabilistic Reasoning? (Nov-Dec’14)


A probabilistic model describes the world in terms of a set S of possible states - the sample space.
We don’t know the true state of the world, so we (somehow) come up with a probability
distribution over S which gives the probability of any state being the true one.
The world usually described by a set of variables or attributes. Consider the probabilistic model of
a fictitious medical expert system. The ‘world’ is described by 8 binary valued variables:
Visit to Asia? A
Tuberculosis? T
Either tub. or lung cancer? E
Lung cancer? L
Smoking? S
Bronchitis? B
Dyspnoea? D
Positive X-ray? X

79. Define MYCIN (Apr- May’15)

MYCIN was an early expert system that used artificial intelligence to identify bacteria
causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics, with
the dosage adjusted for the patient's body weight. The name derives from the antibiotics themselves,
as many antibiotics have the suffix "-mycin". The MYCIN system was also used for the diagnosis of
blood clotting diseases.
Unit III- 11 Marks

1. What is uncertainty? Explain

Uncertainty
2. What is non-monotonic reasoning? Explain? (Apr-May ’14)(Nov – Dec ’14)(Nov’15)

A non-monotonic logic is a formal logic whose consequence relation is not


monotonic. Most studied formal logics have a monotonic consequence relation, meaning that
adding a formula to a theory never produces a reduction of its set of consequences.
Intuitively, monotonicity indicates that learning a new piece of knowledge cannot reduce the
set of what is known.

A monotonic logic cannot handle various reasoning tasks such as reasoning by default
(consequences may be derived only because of lack of evidence to the contrary), abductive
reasoning (consequences are only deduced as most likely explanations), and some important
approaches to reasoning about knowledge and belief revision.
Default reasoning

An example of a default assumption is that the typical bird flies. As a result, if a


given animal is known to be a bird, and nothing else is known, it can be assumed to be
able to fly. The default assumption must however be retracted if it is later learned that the
considered animal is a penguin. This example shows that a logic that models default
reasoning should not be monotonic.

Logics formalizing default reasoning can be roughly divided in two categories:


logics able to deal with arbitrary default assumptions (default logic, defeasible
logic/defeasible reasoning/argument (logic), and answer set programming) and logics
that formalize the specific default assumption that facts that are not known to be true can
be assumed false by default (closed world assumption and circumscription).

Abductive reasoning

Abductive reasoning is the process of deriving the most likely explanations of the
known facts. An abductive logic should not be monotonic because the most likely
explanations are not necessarily correct.
For example, the most likely explanation for seeing wet grass is that it rained;
however, this explanation has to be retracted when learning that the real cause of the
grass being wet was a sprinkler. Since the old explanation (it rained) is retracted because
of the addition of a piece of knowledge (a sprinkler was active), any logic that models
explanations is non-monotonic.

Reasoning about knowledge

If a logic includes formulae that mean that something is not known, this logic should
not be monotonic. Indeed, learning something that was previously not known leads to the
removal of the formula specifying that this piece of knowledge is not known. This second
change (a removal caused by an addition) violates the condition of monotonicity. A logic for
reasoning about knowledge is autoepistemic logic.

Belief revision

Belief revision is the process of changing beliefs to accommodate a new belief that might be
inconsistent with the old ones. Under the assumption that the new belief is correct, some of the
old ones have to be retracted in order to maintain consistency. This retraction in response to
the addition of a new belief makes any logic for belief revision non-monotonic. The
belief revision approach is an alternative to paraconsistent logics, which tolerate inconsistency
rather than attempting to remove it.

3. What is probability and Basic probability notation? (Apr-May ’15)

Probability
Given the available evidence, A25 will get me there on time with probability 0.04.

(Fuzzy logic handles degree of truth, NOT uncertainty; e.g., WetGrass is true to degree 0.2.)

Probabilistic assertions summarize effects of

laziness: failure to enumerate exceptions, qualifications, etc.

ignorance: lack of relevant facts, initial conditions, etc.


Subjective or Bayesian probability:

Probabilities relate propositions to one's own state of knowledge,

e.g., P(A25 | no reported accidents) = 0.06

These are not claims of a "probabilistic tendency" in the current situation (but might be
learned from past experience of similar situations).

Probabilities of propositions change with new evidence:

e.g., P(A25 | no reported accidents, 5 a.m.) = 0.15

(Analogous to logical entailment status KB ⊨ α, not truth.)

Making decisions under uncertainty


Suppose I believe the following:

P(A25 gets me there on time | ...) = 0.04

P(A90 gets me there on time | ...) = 0.70

P(A120 gets me there on time | ...) = 0.95

P(A1440 gets me there on time | ...) = 0.9999

Which action to choose?

Depends on my preferences for missing the flight vs. airport cuisine, etc.

Utility theory is used to represent and infer preferences

Decision theory = utility theory + probability theory
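As a sketch of how such beliefs combine with preferences, the Python fragment below computes an expected utility for each plan and picks the best one; the utility numbers are invented purely for illustration (they are not given in the text):

# P(arriving on time) for each plan, taken from the figures above.
p_on_time = {"A25": 0.04, "A90": 0.70, "A120": 0.95, "A1440": 0.9999}

# Assumed utilities: arriving on time is good, but leaving absurdly early
# (e.g. A1440 = 24 hours ahead) costs a long wait at the airport.
utility_on_time = {"A25": 100, "A90": 95, "A120": 90, "A1440": 20}
utility_late = -50  # assumed utility of missing the flight

def expected_utility(plan):
    p = p_on_time[plan]
    return p * utility_on_time[plan] + (1 - p) * utility_late

print({plan: round(expected_utility(plan), 1) for plan in p_on_time})
print(max(p_on_time, key=expected_utility))  # the MEU choice under these assumptions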


Probabilistic Reasoning
Using logic to represent and reason we can represent knowledge about the world with facts
and rules, like the following ones:
bird(tweety).          % fact: Tweety is a bird
fly(X) :- bird(X).     % rule: anything that is a bird can fly
We can also use a theorem-prover to reason about the world and deduct new facts about the
world, for e.g.,
?- fly(tweety).
Yes
However, this often does not work outside of toy domains - non-tautologous certain rules are
hard to find. A way to handle knowledge representation in real problems is to extend logic by
using certainty factors. In other words, replace
IF condition THEN fact
with
IF condition with certainty x THEN fact with certainty f(x)
Unfortunately cannot really adapt logical inference to probabilistic inference, since the latter
is not context-free. Replacing rules with conditional probabilities makes inferencing simpler.
Replace smoking -> lung cancer
or
lots of conditions, smoking -> lung cancer
with
P(lung cancer | smoking) = 0.6
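Rule based systems such as MYCIN attach certainty factors to rules instead of probabilities. Below is a rough Python sketch of the standard parallel-combination rule for two rules that both lend positive support to the same conclusion (a simplification of the full MYCIN calculus, shown only to illustrate the idea):

def combine_cf(cf1, cf2):
    # Parallel combination of two positive certainty factors for the same hypothesis.
    return cf1 + cf2 * (1 - cf1)

# Two independent rules each give partial support to "lung cancer".
print(round(combine_cf(0.6, 0.4), 2))  # 0.76: combined belief is stronger than either alone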
Uncertainty is represented explicitly and quantitatively within probability theory, a
formalism that has been developed over centuries.
A probabilistic model describes the world in terms of a set S of possible states - the sample
space. We don’t know the true state of the world, so we (somehow) come up with a
probability distribution over S which gives the probability of any state being the true one.
The world usually described by a set of variables or attributes. Consider the probabilistic
model of a fictitious medical expert system. The ‘world’ is described by 8 binary valued
variables:
Visit to Asia? A
Tuberculosis? T
Either tub. or lung cancer? E
Lung cancer? L
Smoking? S
Bronchitis? B
Dyspnoea? D
Positive X-ray? X
We have 2^8 = 256 possible states or configurations and so 256 probabilities to find.
Review of Probability Theory. The primitives in probabilistic reasoning are random
variables, just as the primitives in propositional logic are propositions.
A random variable is not in fact a variable, but a function from a sample space S to
another space, often the real numbers. For example, let the random variable Sum
(representing outcome of two die throws) be defined thus:
Sum(die1, die2) = die1 +die2
Each random variable has an associated probability distribution determined by the
underlying distribution on the sample space
Continuing our example : P(Sum = 2) = 1/36,
P(Sum = 3) = 2/36, . . . , P(Sum = 12) = 1/36
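These values can be checked with a short Python enumeration of the 36 equally likely outcomes in the sample space:

from collections import Counter
from fractions import Fraction

# Sample space: all ordered pairs (die1, die2); the random variable Sum
# projects each outcome to die1 + die2.
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]
counts = Counter(d1 + d2 for d1, d2 in outcomes)

p_sum = {s: Fraction(counts[s], len(outcomes)) for s in sorted(counts)}
print(p_sum[2], p_sum[3], p_sum[12])  # 1/36, 1/18 (= 2/36), 1/36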
Consider the probabilistic model of the fictitious medical expert system mentioned before.
The sample space is described by 8 binary valued variables.
Visit to Asia? A
Tuberculosis? T
Either tub. or lung cancer? E
Lung cancer? L
Smoking? S
Bronchitis? B
Dyspnoea? D
Positive X-ray? X

There are 2^8 = 256 events in the sample space. Each event is determined by a joint
instantiation of all of the variables.

S = {(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = f),


(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = t), . . .
(A = t, T = t,E = t,L = t, S = t,B = t,D = t,X = t)}

Since S is defined in terms of joint instantiations, any distribution defined on it is called


a joint distribution. All underlying distributions will be joint distributions in this module. The
variables {A, T, E, L, S, B, D, X} are in fact random variables, which 'project' values.

L(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = f) = f


L(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = t) = f
L(A = t, T = t,E = t,L = t, S = t,B = t,D = t,X = t) = t
Each of the random variables {A, T, E, L, S, B, D, X} has its own distribution,
determined by the underlying joint distribution. This is known as the marginal distribution.
For example, the distribution for L is denoted P(L), and this distribution is defined by the
two probabilities P(L = f) and P(L = t). For example,
P(L = f)
= P(A = f, T = f, E = f, L = f, S = f, B = f, D = f, X = f)
+ P(A = f, T = f, E = f, L = f, S = f, B = f, D = f, X = t)
+ P(A = f, T = f, E = f, L = f, S = f, B = f, D = t, X = f)
+ ...
+ P(A = t, T = t, E = t, L = f, S = t, B = t, D = t, X = t)
P(L) is an example of a marginal distribution.
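The marginalisation above can be sketched in Python. To keep the listing small, the joint distribution here covers only three of the eight binary variables, and its probability values are invented for illustration:

from itertools import product

# Toy joint distribution over (L, S, B) = (lung cancer, smoking, bronchitis).
# A real model for the expert system would need 2^8 = 256 entries.
joint = {states: 0.0 for states in product([False, True], repeat=3)}
joint[(False, False, False)] = 0.40
joint[(False, True,  False)] = 0.25
joint[(False, True,  True)]  = 0.20
joint[(True,  True,  False)] = 0.10
joint[(True,  True,  True)]  = 0.05

# Marginal P(L): sum the joint over all values of the other variables.
p_l = {v: sum(p for (l, s, b), p in joint.items() if l == v) for v in (False, True)}
print({v: round(x, 2) for v, x in p_l.items()})  # {False: 0.85, True: 0.15}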

5. Explain Bayes’ Rule

Bayes' Rule and conditional independence


Wumpus World

Specifying the probability model

Observations and query


Using conditional independence

Basic insight: observations are conditionally independent of other hidden squares given
neighboring hidden squares

Manipulate query into a form where we can use this!

P(P13 | known, b) = α' P(P13) Σ over P22, P31 of P(b | known, P13, P22, P31) P(P22) P(P31)

Set 1 (P13 = true)  = 0.2 × {(1×0.2×0.2) + (1×0.8×0.2) + (1×0.2×0.8) + (0×0.8×0.8)}
                    = 0.2 × 0.36
                    = 0.072
Set 2 (P13 = false) = 0.8 × {(1×0.2×0.2) + (0×0.8×0.2) + (1×0.2×0.8) + (0×0.8×0.8)}
                    = 0.8 × 0.2
                    = 0.16
P(P13 | known, b) = α' (0.072, 0.16)
                  = (0.072/(0.072+0.16), 0.16/(0.072+0.16))
                  = (0.31, 0.689)
                  = (0.31, 0.69) (approximate)
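A quick Python check of the arithmetic above; the 0.2/0.8 pit prior and the 0/1 consistency indicators are taken directly from the two sets shown:

# Unnormalised values for P13 = true (Set 1) and P13 = false (Set 2).
set1 = 0.2 * ((1 * 0.2 * 0.2) + (1 * 0.8 * 0.2) + (1 * 0.2 * 0.8) + (0 * 0.8 * 0.8))
set2 = 0.8 * ((1 * 0.2 * 0.2) + (0 * 0.8 * 0.2) + (1 * 0.2 * 0.8) + (0 * 0.8 * 0.8))

total = set1 + set2  # normalising constant 1/alpha'
print(round(set1, 3), round(set2, 3))                  # 0.072 0.16
print(round(set1 / total, 2), round(set2 / total, 2))  # 0.31 0.69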
6. Explain Bayesian networks (Nov’13)(Nov’15)
A simple, graphical notation for conditional independence assertions and hence for compact
specification of full joint distributions

Syntax:

a set of nodes, one per variable

a directed, acyclic graph (a link means "directly influences")

a conditional distribution for each node given its parents:

In the simplest case, conditional distribution represented as a conditional probability table


(CPT) giving the distribution over Xi for each combination of parent values

Example
Topology of network encodes conditional independence assertions:

Weather is independent of the other variables

Toothache and Catch are conditionally independent given Cavity


Example
I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call.
Sometimes it's set off by minor earthquakes. Is there a burglar?

Variables: Burglar, Earthquake, Alarm, JohnCalls, MaryCalls

Network topology reflects "causal" knowledge:

 A burglar can set the alarm off



 An earthquake can set the alarm off

 The alarm can cause Mary to call

 The alarm can cause John to call









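A minimal Python sketch of this burglary network, with each node's CPT stored as a dictionary and one full joint entry evaluated using the chain rule P(x1, ..., xn) = product over i of P(xi | parents(Xi)). The CPT numbers are the usual illustrative textbook values and should be treated as assumptions:

# CPTs (illustrative values).
P_B = {True: 0.001, False: 0.999}                 # P(Burglary)
P_E = {True: 0.002, False: 0.998}                 # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,   # P(Alarm = true | Burglary, Earthquake)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                   # P(JohnCalls = true | Alarm)
P_M = {True: 0.70, False: 0.01}                   # P(MaryCalls = true | Alarm)

def joint(b, e, a, j, m):
    # Chain rule for Bayesian networks: multiply each node's CPT entry
    # given the values of its parents.
    p = P_B[b] * P_E[e]
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# P(john calls, mary calls, alarm rings, no burglary, no earthquake)
print(joint(b=False, e=False, a=True, j=True, m=True))  # roughly 0.000628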
Compactness

Constructing Bayesian networks


Need a method such that a series of locally testable assertions of conditional independence
guarantees the required global semantics

Example
Deciding conditional independence is hard in noncausal directions (Causal models and
conditional independence seem hardwired for humans!)

Assessing conditional probabilities is hard in noncausal directions. The network is less compact:


1 + 2 + 4 + 2 + 4 = 13 numbers needed

Example: Car diagnosis


Initial evidence: car won't start

Testable variables (green), "broken, so fix it" variables (orange)

Hidden variables (gray) ensure sparse structure, reduce parameters


Compact conditional distributions

Hybrid (discrete+continuous) networks

Option 1: discretization - possibly large errors, large CPTs

Option 2: finitely parameterized canonical families

1) Continuous variable, discrete+continuous parents (e.g., Cost)

2) Discrete variable, continuous parents (e.g., Buys?)

Continuous child variables


Need one conditional density function for child variable given continuous parents, for each
possible assignment to discrete parents

Most common is the linear Gaussian model, e.g.:


Mean Cost varies linearly with Harvest, and the variance is fixed. Linear variation is unreasonable
over the full range but works OK if the likely range of Harvest is narrow.

An all-continuous network with LG distributions has a full joint distribution that is a
multivariate Gaussian.
A discrete+continuous LG network is a conditional Gaussian network, i.e., a
multivariate Gaussian over all continuous variables for each combination of
discrete variable values.

Discrete variable w/ continuous parents

Inference in Bayesian networks Inference tasks

Inference by enumeration

Slightly intelligent way to sum out variables from the joint without actually
constructing its explicit representation
7. Explain Dempster - Shafer Theory in AI (Apr-May’15)(Nov’15)

The Dempster–Shafer theory (DST) is a mathematical theory of


evidence. It allows one to combine evidence from different sources and arrive
at a degree of belief (represented by a belief function) that takes into account
all the available evidence. The theory was first developed by Arthur P.
Dempster and Glenn Shafer.

In a narrow sense, the term Dempster–Shafer theory refers to the


original conception of the theory by Dempster and Shafer. However, it is more
common to use the term in the wider sense of the same general approach, as
adapted to specific kinds of situations. In particular, many authors have
proposed different rules for combining evidence, often with a view to handling
conflicts in evidence better.

Dempster–Shafer theory is a generalization of the Bayesian theory of


subjective probability; whereas the latter requires probabilities for each
question of interest, belief functions base degrees of belief (or confidence, or
trust) for one question on the probabilities for a related question.

These degrees of belief may or may not have the mathematical


properties of probabilities; how much they differ depends on how closely the
two questions are related.
Put another way, it is a way of representing epistemic plausibilities but it can
yield answers that contradict those arrived at using probability theory.

Dempster–Shafer theory is based on two ideas: obtaining degrees of


belief for one question from subjective probabilities for a related question, and
Dempster's rule for combining such degrees of belief when they are based on
independent items of evidence.
In essence, the degree of belief in a proposition depends primarily upon
the number of answers (to the related questions) containing the proposition,
and the subjective probability of each answer. Also contributing are the rules
of combination that reflect general assumptions about the data.
In this formalism a degree of belief (also referred to as a mass) is
represented as a belief function rather than a Bayesian probability
distribution. Probability values are assigned to sets of possibilities rather than
single events: their appeal rests on the fact they naturally encode evidence in
favor of propositions.
Dempster–Shafer theory assigns its masses to all of the non-empty subsets
of the entities that compose a system.
Belief and plausibility

Shafer's framework allows for belief about propositions to be represented as


intervals, bounded by two values, belief (or support) and plausibility:
belief ≤ plausibility.
Belief in a hypothesis is constituted by the sum of the masses of all sets
enclosed by it (i.e. the sum of the masses of all subsets of the
hypothesis).
It is the amount of belief that directly supports a given hypothesis at least in
part, forming a lower bound. Belief (usually denoted Bel) measures the
strength of the evidence in favor of a set of propositions. It ranges from 0
(indicating no evidence) to 1 (denoting certainty).
Plausibility is 1 minus the sum of the masses of all sets whose
intersection with the hypothesis is empty. It is an upper bound on the
possibility that the hypothesis could be true,
i.e. it “could possibly be the true state of the system” up to that value, because
there is only so much evidence that contradicts that hypothesis.
Plausibility (denoted by Pl) is defined to be Pl(s)=1-Bel(~s). It also
ranges from 0 to 1 and measures the extent to which evidence in favor of ~s
leaves room for belief in s. For example, suppose we have a belief of 0.5 and a
plausibility of 0.8 for a proposition, say “the cat in the box is dead.” This means
that we have evidence that allows us to state strongly that the proposition is
true with a confidence of 0.5. However, the evidence contrary to that
hypothesis (i.e. “the cat is alive”) only has a confidence of 0.2.
The remaining mass of 0.3 (the gap between the 0.5 supporting
evidence on the one hand, and the 0.2 contrary evidence on the other) is
“indeterminate,” meaning that the cat could either be dead or alive. This
interval represents the level of uncertainty based on the evidence in your
system.

Hypothesis Mass Belief Plausibility

Null (neither alive nor dead) 0 0 0

Alive 0.2 0.2 0.5

Dead 0.5 0.5 0.8

Either (alive or dead) 0.3 1.0 1.0


The null hypothesis is set to zero by definition (it corresponds to “no
solution”). The orthogonal hypotheses “Alive” and “Dead” have probabilities of
0.2 and 0.5, respectively. This could correspond to “Live/Dead Cat Detector”
signals, which have respective reliabilities of 0.2 and 0.5. Finally, the all-
encompassing “Either” hypothesis (which simply acknowledges there is a cat
in the box) picks up the slack so that the sum of the masses is 1.

The belief for the “Alive” and “Dead” hypotheses matches their corresponding
masses because they have no subsets; belief for “Either” consists of the sum of
all three masses (Either, Alive, and Dead) because “Alive” and “Dead” are each
subsets of “Either”. The “Alive” plausibility is 1 − m (Dead) and the “Dead”
plausibility is 1 − m (Alive). Finally, the “Either”
plausibility sums m(Alive) + m(Dead) + m(Either). The universal hypothesis
(“Either”) will always have 100% belief and plausibility —it acts as a checksum
of sorts.
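The belief and plausibility columns of the table above can be reproduced with a short Python sketch, representing each hypothesis as a set over the frame {alive, dead}:

# Mass assignment from the cat example; the empty set has mass 0 by definition.
mass = {
    frozenset(): 0.0,
    frozenset({"alive"}): 0.2,
    frozenset({"dead"}): 0.5,
    frozenset({"alive", "dead"}): 0.3,
}

def belief(hypothesis):
    # Sum of the masses of all non-empty subsets of the hypothesis.
    return sum(m for s, m in mass.items() if s and s <= hypothesis)

def plausibility(hypothesis):
    # Sum of the masses of all sets whose intersection with the hypothesis is non-empty.
    return sum(m for s, m in mass.items() if s & hypothesis)

for h in (frozenset({"alive"}), frozenset({"dead"}), frozenset({"alive", "dead"})):
    print(sorted(h), round(belief(h), 3), round(plausibility(h), 3))
# ['alive'] 0.2 0.5 ; ['dead'] 0.5 0.8 ; ['alive', 'dead'] 1.0 1.0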

Here is a somewhat more elaborate example where the behavior of belief and
plausibility begins to emerge. We're looking through a variety of detector
systems at a single faraway signal light, which can only be coloured in one of
three colours (red, yellow, or green):

Combining beliefs

Beliefs corresponding to independent pieces of information are


combined using Dempster's rule of combination, which is a generalization of the
special case of Bayes' theorem where events are independent. Note that the
probability masses from propositions that contradict each other can also be
used to obtain a measure of how much conflict there is in a system. This
measure has been used as a criterion for clustering multiple pieces of
seemingly conflicting evidence around competing hypotheses.

In addition, one of the computational advantages of the Dempster–Shafer


framework is that priors and conditionals need not be specified, unlike
Bayesian methods, which often use a symmetry (minimax error) argument to
assign prior probabilities to random variables (e.g. assigning 0.5 to binary
values for which no information is available about which is more likely).
However, any information contained in the missing priors and conditionals is
not used in the Dempster–Shafer framework unless it can be obtained
indirectly—and arguably is then available for calculation using Bayes
equations.

8. Explain Fuzzy Logic in AI.

Fuzzy logic is a form of many-valued logic or probabilistic logic; it deals with


reasoning that is approximate rather than fixed and exact. In contrast with
traditional logic theory, where binary sets have two-valued logic: true or false,
fuzzy logic variables may have a truth value that ranges in degree between 0
and 1. Fuzzy logic has been extended to handle the concept of partial truth,
where the truth value may range between completely true and completely
false. Furthermore, when linguistic variables are used, these degrees may be
managed by specific functions.

Overview

The reasoning in fuzzy logic is similar to human reasoning. It allows for


approximate values and inferences as well as incomplete or ambiguous data
(fuzzy data) as opposed to only relying on crisp data (binary yes/no choices).
Fuzzy logic is able to process incomplete data and provide approximate
solutions to problems that other methods find difficult to solve.

Degrees of truth

Fuzzy logic and probabilistic logic are mathematically similar – both have
truth values ranging between 0 and 1 – but conceptually distinct, due to
different interpretations—see interpretations of probability theory. Fuzzy
logic corresponds to "degrees of truth", while probabilistic logic corresponds
to "probability, likelihood"; as these differ, fuzzy logic and probabilistic logic
yield different models of the same real-world situations. Both degrees of truth
and probabilities range between 0 and 1 and hence may seem similar at first.
For example, let a 100 ml glass contain 30 ml of water. Then we may consider
two concepts: Empty and Full. The meaning of each of them can be
represented by a certain fuzzy set. Then one might define the glass as being 0.7
empty and 0.3 full.

Applying truth values

A basic application might characterize subranges of a continuous


variable. For instance, a temperature measurement for anti-lock brakes might
have several separate membership functions defining particular temperature
ranges needed to control the brakes properly. Each function maps the same
temperature value to a truth value in the 0 to 1 range. These truth values can
then be used to determine how the brakes should be controlled.
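A rough Python sketch of such membership functions for a temperature variable; the breakpoint temperatures below are assumptions chosen only for illustration:

def cold(t):
    # Fully cold below 10, fading linearly to 0 by 20 (assumed breakpoints).
    return max(0.0, min(1.0, (20 - t) / 10))

def hot(t):
    # 0 below 20, rising linearly to fully hot at 30 (assumed breakpoints).
    return max(0.0, min(1.0, (t - 20) / 10))

def warm(t):
    # Whatever degree of membership is left between cold and hot.
    return max(0.0, 1.0 - cold(t) - hot(t))

t = 12  # a particular temperature reading
print(round(cold(t), 2), round(warm(t), 2), round(hot(t), 2))  # 0.8 0.2 0.0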

Fuzzy logic temperature

In this image, the meanings of the expressions cold, warm, and hot are
represented by functions mapping a temperature scale. A point on that scale
has three "truth values"—one for each of the three functions. The vertical line
in the image represents a particular temperature that the three arrows (truth
values) gauge. Since the red arrow points to zero, this temperature may be
interpreted as "not hot". The orange arrow (pointing at 0.2) may describe it as
"slightly warm" and the blue arrow (pointing at 0.8) "fairly cold".
Linguistic variables

While variables in mathematics usually take numerical values, in fuzzy


logic applications, the non-numeric linguistic variables are often used to
facilitate the expression of rules and facts.

A linguistic variable such as age may have a value such as young or its
antonym old. However, the great utility of linguistic variables is that they can
be modified via linguistic hedges applied to primary terms. The linguistic
hedges can be associated with certain functions.

The most important propositional fuzzy logics are:


– Monoidal t-norm-based propositional fuzzy logic MTL is an
axiomatization of logic where conjunction is defined by a left
continuous t-norm, and implication is defined as the residuum of the t-
norm. Its models correspond to MTL-algebras that are prelinear
commutative bounded integral residuated lattices.
– Basic propositional fuzzy logic BL is an extension of MTL logic where
conjunction is defined by a continuous t-norm, and implication is also
defined as the residuum of the t-norm. Its models correspond to BL-
algebras.
– Łukasiewicz fuzzy logic is the extension of basic fuzzy logic BL where
standard conjunction is the Łukasiewicz t-norm. It has the axioms of
basic fuzzy logic plus an axiom of double negation, and its models
correspond to MV-algebras.
– Gödel fuzzy logic is the extension of basic fuzzy logic BL where
conjunction is Gödel t-norm. It has the axioms of BL plus an axiom of
idempotence of conjunction, and its models are called G-algebras.
– Product fuzzy logic is the extension of basic fuzzy logic BL where
conjunction is product t-norm. It has the axioms of BL plus another
axiom for cancellativity of conjunction, and its models are called
product algebras.
– Fuzzy logic with evaluated syntax (sometimes also called Pavelka's logic),
denoted by
EVŁ, is a further generalization of mathematical fuzzy logic. While the
above kinds of fuzzy logic have traditional syntax and many-valued
semantics, in EVŁ the syntax is also evaluated. This means that each
formula has an evaluation. The axiomatization of EVŁ stems from
Łukasiewicz fuzzy logic. A generalization of the classical Gödel
completeness theorem is provable in EVŁ.
9. Explain certainty factors and rule based systems.
QUESTION BANK & UNIVERSITY QUESTIONS

1. Consider the following piece of knowledge.


A1: Tony, Mike and John belong to the Alpine Club.
A2: Every member of the Alpine Club who is not a skier is a mountain
climber.
A3: A mountain climber does not like rain, and anyone who does not like snow
is
not a skier.
A4: Mike dislikes whatever Tony likes and likes whatever Tony dislikes.
A5: Tony likes rain and snow.
Can you conclude "There is a member of the Alpine Club who is a
mountain climber but not a skier"? Justify your answer.
2. How is knowledge represented and reasoning done for “SCRIPTS”?
3. Write short notes on:
a. Truth Maintenance.
b. Probabilistic Reasoning
4. Explain the various steps associated with the knowledge engineering
process. Discuss by applying the steps to any real world problem
5. Explain probabilistic reasoning systems in detail. (Apr-May’15)( Q. No. 3)
6. (a) Explain sequential decision problems in detail.
(b) Write in detail about Truth Maintenance.
7. Discuss the semantics of Bayesian networks in detail with an example.
(Nov'13)(Nov'15) (Q. No. 6)
8. Discuss the following in detail:
a. Multi attribute utility functions.
b. Nonmonotonic reasoning. (Apr-May’14)(Nov-Dec’14)(Nov’15)( Q. No. 2)
9. Describe the Dempster–Shafer Theory in detail. (Apr-May'15)(Nov'15) (Q. No. 7)
