Inference in First-Order Logic
Propositional vs First-Order Inference
Inference rules can be used to eliminate
quantifiers from sentences.
Using these rules, a KB can be converted to
propositional logic, allowing propositional
inference.
In short: convert first-order logic to propositional
logic, then apply propositional inference
(which we already know how to do).
Inference Rules for Quantifiers
Suppose the KB contains:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
(all greedy kings are evil)
Allows us to infer any of:
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
King(Father(John)) ∧ Greedy(Father(John)) ⇒
Evil(Father(John))
etc...
Leads to the Universal Instantiation (UI) rule:
We can infer any sentence obtained by
substituting a ground term for the variable.
ground term defn: a term without variables.
Formally (using substitutions):
If SUBST(θ, α) denotes the result of applying
the substitution θ to the sentence α, then the rule is:
∀v α ⊢ SUBST({v/g}, α)
for any variable v and ground term g.
Examples (for the previous sentences):
{ x/John }
{ x/Richard }
{ x/Father(John) }
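To make the substitution machinery concrete, here is a minimal Python sketch of SUBST. The representation is my own, not the textbook's: variables are lowercase strings, constants are capitalized strings, and compound expressions are tuples of the form (operator, argument, ...).

def is_variable(t):
    # Sketch convention: variables are lowercase strings such as 'x'.
    return isinstance(t, str) and t[:1].islower()

def subst(theta, s):
    # Apply the substitution theta (e.g. {'x': 'John'}) to sentence s,
    # chasing chained variable bindings as needed.
    if is_variable(s):
        return subst(theta, theta[s]) if s in theta else s
    if isinstance(s, tuple):             # compound: (operator, *args)
        return (s[0],) + tuple(subst(theta, a) for a in s[1:])
    return s                             # constant symbols are unchanged

# Universal Instantiation on King(x) ∧ Greedy(x) ⇒ Evil(x):
rule = ('Implies', ('And', ('King', 'x'), ('Greedy', 'x')), ('Evil', 'x'))
print(subst({'x': 'John'}, rule))
print(subst({'x': ('Father', 'John')}, rule))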
For the Existential Instantiation rule:
The variable is replaced by one new
constant symbol.
Formally: for any sentence α, variable v,
and constant symbol k not appearing
elsewhere in the KB:
∃v α ⊢ SUBST({v/k}, α)
Example:
For the sentence:
∃x Crown(x) ∧ OnHead(x,John)
we can infer:
Crown(C1) ∧ OnHead(C1, John)
but, C1 can't be in the KB anywhere else.
IOW, we can say there's some object that
satisfies the condition and we'll call that
object C1 for now.
UI can be applied over and over to produce many
different consequences. EI is applied just once;
the existential sentence can then be discarded.
Example:
We can say we need something that is going to
kill a victim, but we don't know what it is yet:
∃x Kill(x,Victim)
Once we figure out who it is, we add this to the KB:
Kill(Murderer, Victim)
and the former sentence is no longer needed.
Reduction to propositional inference
WHY IT MATTERS: with these rules, first-
order inference can be reduced to
propositional inference.
First idea:
A universally quantified sentence can be
replaced by the set of all its possible instantiations.
(example on next slide)
Example: Given the following KB:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John)
Greedy(John)
Brother(Richard,John)
Apply UI to the first sentence using all
possible ground-term substitutions:
{x/John} and {x/Richard}
Results in:
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
Result?
Now, the KB is propositional if the ground atomic
sentences are treated as propositional
symbols.
So?
Now, any propositional algos from CH07 can be
used with the KB.
The process can be generalized (see 9.5).
Results in a complete inference procedure
via propositionalization:
any entailed sentence can be proved.
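As a toy illustration of this first idea, the sketch below (reusing subst from the earlier sketch; the set of ground terms is supplied by hand here) generates every instantiation of a universally quantified rule:

from itertools import product

def propositionalize(sentence, variables, ground_terms):
    # Yield one instance of sentence per assignment of ground terms
    # to its universally quantified variables.
    for combo in product(ground_terms, repeat=len(variables)):
        yield subst(dict(zip(variables, combo)), sentence)

rule = ('Implies', ('And', ('King', 'x'), ('Greedy', 'x')), ('Evil', 'x'))
for instance in propositionalize(rule, ['x'], ['John', 'Richard']):
    print(instance)   # the John instance, then the Richard instance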
Unification and Lifting
Propositionalization is not very efficient.
Example:
Given the query Evil(x), why generate a sentence like
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)?
Doesn't it seem obvious that John is evil given:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John)
Greedy(John)
Make this obvious to the computer, and
inference speeds up.
A first-order inference rule
The substitution {x/John} solves the query Evil(x).
This is an inference.
How does it work?
We need to use the rule that greedy kings are evil.
So, find an x so that x is a king and x is
greedy.
Finally, infer x is evil.
Generally:
If there's a substitution θ making all conjuncts
of the implication premise identical to existing
KB sentences, then...
...applying θ asserts the conclusion.
Here:
θ = {x/John} makes it happen.
Instead, what if we know everyone is greedy
(∀y Greedy(y)), not just John?
John is still evil: John is a king, and
John is greedy since everyone is.
Requires multiple substitutions:
For both variables in the implication, and
For the variables in the KB.
Apply θ = {x/John, y/John} to the implication premises
King(x) and Greedy(x) and to the KB sentences
King(John) and Greedy(y);
the substitution makes the pairs identical.
Generalized Modus Ponens:
For atomic sentences pi, pi′, and q, with a
substitution θ such that
SUBST(θ, pi′) = SUBST(θ, pi) for all i:
p1′, p2′, …, pn′, (p1 ∧ p2 ∧ … ∧ pn ⇒ q)
⊢ SUBST(θ, q)
There are n+1 premises: the n atomic sentences pi′ and
the one implication.
Example:
p1′ = King(John), p1 = King(x)
p2′ = Greedy(y), p2 = Greedy(x)
θ = {x/John, y/John}
q = Evil(x), so SUBST(θ, q) = Evil(John)
Generalized Modus Ponens is a lifted version
of Modus Ponens.
Raises MP from variable-free propositional
logic to first-order logic.
Advantage: makes only the needed
substitutions for inferences to proceed.
Unification
The process of finding such substitutions is
called unification.
It is required by all first-order inference algorithms.
The UNIFY algorithm returns a unifier for two
sentences, if one exists:
UNIFY(p, q) = θ where SUBST(θ, p) = SUBST(θ, q)
Example:
What does John know?
Answers are the result of unifying all KB sentences
with Knows(John, x).
Example results:
UNIFY(Knows(John, x), Knows(John, Jane)) = {x/Jane}
UNIFY(Knows(John, x), Knows(y, Bill)) = {x/Bill, y/John}
UNIFY(Knows(John, x), Knows(y, Mother(y))) = {y/John, x/Mother(John)}
UNIFY(Knows(John, x), Knows(x, Elizabeth)) = fail
The last unification fails: x can't be John and
Elizabeth simultaneously.
But everyone knows Elizabeth:
Knows(x, Elizabeth)
Problem? x is the same variable in both sentences.
Use different variables, e.g. Knows(x17, Elizabeth).
This is called standardizing apart.
Now the pair unifies:
UNIFY(Knows(John, x), Knows(x17, Elizabeth)) = {x/Elizabeth, x17/John}
Complication: possible multiple unifiers.
UNIFY (Knows(John,x),Knows(y,z))
could return
{y/John, x/z}
or
{y/John, x/John, z/John}
Additionally, the latter could be obtained
from the first with one more substitution
{z/John}
The first unifier is more general than the
second (fewer variable restrictions).
Algorithm:
Recursively explore both expressions
simultaneously.
Expressions can be lists, variables,
constants, or compound expressions.
Fail if corresponding points in the two structures
don't match.
Failure example:
S(x) can't unify with S( S(x) ): x would have to be
bound to a term containing itself (the occur check).
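A compact Python sketch of this recursive algorithm, in the spirit of (but not identical to) the textbook's UNIFY pseudocode; it reuses is_variable from the SUBST sketch above:

def unify(x, y, theta):
    # Return a substitution dict that unifies x and y, or None on failure.
    if theta is None:
        return None                      # propagate an earlier failure
    if x == y:
        return theta
    if is_variable(x):
        return unify_var(x, y, theta)
    if is_variable(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        if not x:
            return theta
        return unify(x[1:], y[1:], unify(x[0], y[0], theta))
    return None                          # corresponding points don't match

def unify_var(var, t, theta):
    if var in theta:
        return unify(theta[var], t, theta)
    if is_variable(t) and t in theta:
        return unify(var, theta[t], theta)
    if occurs(var, t, theta):
        return None                      # occur check: x vs S(x) fails here
    new_theta = dict(theta)
    new_theta[var] = t
    return new_theta

def occurs(var, t, theta):
    # Does variable var occur inside term t under substitution theta?
    if var == t:
        return True
    if is_variable(t) and t in theta:
        return occurs(var, theta[t], theta)
    return isinstance(t, tuple) and any(occurs(var, a, theta) for a in t)

print(unify(('Knows', 'John', 'x'), ('Knows', 'y', ('Mother', 'y')), {}))
# -> {'y': 'John', 'x': ('Mother', 'y')}; chasing y gives x = Mother(John)
print(unify(('S', 'x'), ('S', ('S', 'x')), {}))   # -> None (occur check)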
Storage and retrieval
TELL and ASK need helper functions to make
this all work: STORE and FETCH,
respectively.
STORE(s) stores sentence s in the KB
FETCH(q) returns all unifiers s.t. q unifies
with a sentence in the KB.
Finding all facts that unify with a query is
called fetching.
Ex: FETCH(Knows(John, x))
How to build STORE and FETCH?
Keep all facts in a list and unify each query
against every list element.
Slow, but works.
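Continuing the sketch, a naive list-based STORE/FETCH (illustrative only; it reuses unify from above and skips the indexing optimizations the text discusses):

kb_facts = []                            # the whole KB: a flat list of facts

def store(sentence):
    kb_facts.append(sentence)

def fetch(query):
    # Return every substitution under which query unifies with a stored fact.
    results = []
    for fact in kb_facts:
        theta = unify(query, fact, {})
        if theta is not None:
            results.append(theta)
    return results

store(('Knows', 'John', 'Jane'))
store(('Knows', 'John', 'Bill'))
print(fetch(('Knows', 'John', 'x')))     # [{'x': 'Jane'}, {'x': 'Bill'}]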
See optional discussion on optimizing this
process on p328, p329 of the text.
9.3: Forward Chaining
Big idea:
Start with atomic sentences in the KB.
Repeatedly apply Modus Ponens in the
“forward direction”, adding new atomic
sentences.
Stop when there are no more possible
inferences.
Applicable to systems that need
Situation ⇒ Response
rules.
Can be implemented very efficiently
First-order definite clauses
A definite clause is a disjunction of literals of which
exactly one is positive: either an atomic sentence, or an
implication whose premise is a conjunction of positive
literals and whose conclusion is a single positive literal.
What is the forward chaining algorithm?
• Forward chaining is also known as forward deduction or forward
reasoning when using an inference engine.
• It is a form of reasoning that starts with the atomic sentences
in the knowledge base and applies inference rules (Modus Ponens)
in the forward direction to extract more data until a goal is
reached.
• It is a bottom-up approach, moving from known facts toward
conclusions.
• It draws conclusions from known facts, starting at the initial
state and working toward the goal state.
• Starting from the known facts, the forward-chaining algorithm:
• triggers every rule whose premises are satisfied,
• adding their conclusions to the known facts.
• The process repeats until the query is answered or no new facts
are generated.
Consider:
“The law says that it is a crime for an American
to sell weapons to hostile nations. The
country Nono, an enemy of America, has
some missiles, and all of its missiles were
sold to it by Colonel West, who is American.”
• Prove that West is a criminal.
• To solve the problem, we first convert the
above facts into first-order definite
clauses.
First, represent the facts in first-order logic:
“...it's a crime for an American to sell weapons to hostile
nations”:
American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒
Criminal(x) (9.3)
“Nono … has some missiles.”:
∃x Owns(Nono, x) ∧ Missile(x)
which Existential Instantiation transforms, introducing the new constant M1, into:
Owns(Nono, M1), and (9.4)
Missile(M1) (9.5)
“All of its missiles were sold to it by Colonel West”:
Missile(x) ∧ Owns(Nono,x) ⇒ Sells(West, x, Nono) (9.6)
Missiles are weapons:
Missile(x) ⇒ Weapon(x) (9.7)
An enemy of America is “hostile”
Enemy(x, America) ⇒ Hostile(x) (9.8)
“West, who is an American...”:
American(West) (9.9)
“The country Nono, an enemy of America...”
Enemy(Nono, America) (9.10)
A simple forward-chaining algorithm
American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒
Criminal(x) (9.3)
Owns(Nono, M1) (9.4)
Missile(M1) (9.5)
Missile(x) ∧ Owns(Nono,x) ⇒ Sells(West, x, Nono) (9.6)
Missile(x) ⇒ Weapon(x) (9.7)
Enemy(x, America) ⇒ Hostile(x) (9.8)
American(West) (9.9)
Enemy(Nono, America) (9.10)
(The original slides repeat this KB on several successive slides while the forward-chaining proof diagram grows beneath it: the first round of rule firings adds Weapon(M1), Sells(West, M1, Nono), and Hostile(Nono); the second round adds Criminal(West).)
The simple forward-chaining algorithm, FOL-FC-ASK:
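A rough Python sketch of this loop, reusing subst and unify from the earlier sketches (a minimal stand-in for FOL-FC-ASK, not the book's pseudocode). Rules are (premise-list, conclusion) pairs; the names are mine:

def all_matches(premises, theta, facts):
    # Yield every substitution that makes all premises match known facts.
    if not premises:
        yield theta
        return
    for fact in facts:
        theta2 = unify(subst(theta, premises[0]), fact, dict(theta))
        if theta2 is not None:
            yield from all_matches(premises[1:], theta2, facts)

def forward_chain(facts, rules):
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in all_matches(premises, {}, facts):
                c = subst(theta, conclusion)
                if c not in facts:
                    new.add(c)
        if not new:                      # no new inferences: stop
            return facts
        facts |= new

crime_rules = [
    ([('American', 'x'), ('Weapon', 'y'), ('Sells', 'x', 'y', 'z'),
      ('Hostile', 'z')], ('Criminal', 'x')),                        # (9.3)
    ([('Missile', 'x'), ('Owns', 'Nono', 'x')],
     ('Sells', 'West', 'x', 'Nono')),                               # (9.6)
    ([('Missile', 'x')], ('Weapon', 'x')),                          # (9.7)
    ([('Enemy', 'x', 'America')], ('Hostile', 'x')),                # (9.8)
]
crime_facts = [('Owns', 'Nono', 'M1'), ('Missile', 'M1'),           # (9.4), (9.5)
               ('American', 'West'), ('Enemy', 'Nono', 'America')]  # (9.9), (9.10)

print(('Criminal', 'West') in forward_chain(crime_facts, crime_rules))  # True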
Notes:
Simple, but inefficient.
Forms the basis for production systems (aka
expert systems).
Prolog is a language used to easily
implement such systems. (more on this in a
sec)
9.4: Backward Chaining
Backward chaining algos work backwards
from a goal, finding facts to support the proof.
Has some disadvantages relative to forward
chaining.
Used in logic programming, the most widely
used form of automated reasoning.
Closely connected with constraint satisfaction
problems.
A backward chaining algorithm
A backward-chaining algo, FOL-BC-ASK.
A goal is proved if the KB contains a clause of
the form:
lhs ⇒ goal
where lhs is a list of conjuncts.
American(West) is an atomic fact considered a
clause with an empty lhs.
Variable-containing queries can be proved
multiple ways.
Ex: Person(x):
{x/John}
{x/Richard}
So, FOL-BC-ASK is a generator algorithm.
A function that returns multiple times.
Each return provides a possible result.
Backward chaining is a type of AND/OR
search.
Required parts:
OR part: goal query can be proved by any
KB rule.
AND part: all lhs conjuncts must be proved.
How does FOL-BC-OR work?
Fetches all clauses that might unify with the goal.
Standardizes the clause's variables apart, using
brand-new variables.
If the clause's rhs unifies with the goal, proves
every lhs conjunct using FOL-BC-AND.
How does FOL-BC-AND work?
Proves each conjunct in turn,
keeping track of the accumulated substitution as it goes.
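A generator-style Python sketch of this OR/AND structure, reusing the helpers and the crime KB from the earlier sketches; the suffix-renaming below is a crude stand-in for proper standardizing apart:

import itertools

fresh = itertools.count()                # source of fresh variable suffixes

def rename(t, suffix):
    # Standardize apart: append suffix to every variable in a term.
    if is_variable(t):
        return t + suffix
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(a, suffix) for a in t[1:])
    return t

def bc_or(kb, goal, theta):
    # OR: try every clause (lhs, rhs) whose rhs unifies with the goal.
    for lhs, rhs in kb:
        suffix = '_' + str(next(fresh))
        theta2 = unify(rename(rhs, suffix), subst(theta, goal), dict(theta))
        if theta2 is not None:
            yield from bc_and(kb, [rename(p, suffix) for p in lhs], theta2)

def bc_and(kb, goals, theta):
    # AND: prove each conjunct in turn, threading the accumulated substitution.
    if not goals:
        yield theta
        return
    for theta2 in bc_or(kb, goals[0], theta):
        yield from bc_and(kb, goals[1:], theta2)

crime_kb = crime_rules + [([], f) for f in crime_facts]   # facts: empty lhs
for theta in bc_or(crime_kb, ('Criminal', 'x'), {}):
    print(subst(theta, 'x'))             # prints: West
    break

Each value the generators yield is one way of proving the goal, matching the multiple-returns behavior of FOL-BC-ASK described above.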
The proof tree for Criminal(West) from
sentences 9.3 thru 9.10:
Notes on Backward chaining:
It's a depth-first search
Space requirements are linear
Has problems with repeated states and
completeness.
The complete algo:
Resolution
Conjunctive normal form for first-order logic
• First-order resolution requires that sentences be in
conjunctive normal form (CNF)—that is, a conjunction of
clauses, where each clause is a disjunction of literals.
• Literals can contain variables, which are assumed to be
universally quantified. For example, the sentence
• ∀x,y,z American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒
Criminal(x) becomes, in CNF,
• ¬American(x) ∨ ¬Weapon(y) ∨ ¬Sells(x, y, z) ∨ ¬Hostile(z) ∨
Criminal(x)
• Every sentence of first-order logic can be converted into an
inferentially equivalent CNF sentence.
Conversion to CNF, illustrated by translating the sentence
“Everyone who loves all animals is loved by someone”:
• ∀ x [∀ y Animal(y) ⇒ Loves(x, y)] ⇒ [∃ y Loves(y, x)]
• Eliminate implications:
• ∀x ¬[∀y Animal(y) ⇒ Loves(x, y)] ∨ [∃y Loves(y, x)]
• ∀x ¬[∀y ¬Animal(y) ∨ Loves(x, y)] ∨ [∃y Loves(y, x)]
• Move ¬ inwards:
• In addition to the usual rules for negated connectives, we
need rules for negated quantifiers. Thus, we have
¬∀x p becomes ∃x ¬p
¬∃x p becomes ∀x ¬p
• Our sentence goes through the following transformations:
∀x [∃y ¬(¬Animal(y) ∨ Loves(x, y))] ∨ [∃y Loves(y, x)]
∀x [∃y ¬¬Animal(y) ∧ ¬Loves(x, y)] ∨ [∃y Loves(y, x)]
∀x [∃y Animal(y) ∧ ¬Loves(x, y)] ∨ [∃y Loves(y, x)]
“Either there is some animal that x doesn’t love, or (if this is not the case) someone loves x.”
• Standardize variables:
• For sentences like (∃x P(x)) ∨ (∃x Q(x)), which use the same
variable name twice, change the name of one of the
variables.
• This avoids confusion later when we drop the quantifiers.
Thus, we have
• ∀ x [∃ y Animal (y) ∧ ¬Loves(x, y)] ∨ [∃ z Loves(z, x)]
• Skolemize:
• Skolemization is the process of removing existential
quantifiers by elimination.
• In the simple case, translate ∃x P(x) into P(A), where A is a
new constant.
• However, we can’t apply Existential Instantiation to our
sentence above because it doesn’t match the pattern ∃vα;
only parts of the sentence match the pattern.
• If we blindly apply the rule to the two matching parts we
get
• ∀x [Animal(A) ∧ ¬Loves(x, A)] ∨ Loves(B, x)
• which has the wrong meaning entirely: it says that
everyone either fails to love a particular animal A or is
loved by some particular entity B.
• In fact, our original sentence allows each person to fail to
love a different animal or to be loved by a different person.
• Thus, we want the Skolem entities to depend on x:
∀x [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ Loves(G(x), x)
Here F and G are Skolem functions.
• Drop universal quantifiers:
• At this point, all remaining variables must be universally
quantified. Moreover, the sentence is equivalent to one in
which all the universal quantifiers have been moved to the
left. We can therefore drop the universal quantifiers:
• [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ Loves(G(x), x)
• Distribute ∨ over ∧:
• [Animal(F(x)) ∨ Loves(G(x), x)] ∧ [¬Loves(x, F(x)) ∨
Loves(G(x), x)]
• The sentence is now in CNF: a conjunction of two clauses.
• The resolution inference rule
• Two clauses, standardized apart so that they share no
variables, can be resolved if they contain complementary
literals: li and mj are complementary if li unifies with ¬mj.
• If UNIFY(li, ¬mj) = θ, the resolvent is the disjunction of all
the remaining literals from both clauses, with θ applied:
SUBST(θ, l1 ∨ ··· ∨ li−1 ∨ li+1 ∨ ··· ∨ lk ∨ m1 ∨ ··· ∨ mj−1 ∨ mj+1 ∨ ··· ∨ mn)
Example Proofs
9.5.4 Completeness of Resolution
9.5.5 Equality
The simplest rule, demodulation, takes a unit clause x = y and some clause α
that contains the term x, and yields a new clause formed by substituting y for x
within α. It works if the term within α unifies with x; it need not be exactly equal to
x. Note that demodulation is directional; given x = y, the x always gets replaced
with y, never vice versa. That means that demodulation can be used for
simplifying expressions using demodulators such as x + 0 = x or x × 1 = x.
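A tiny sketch of a single demodulation step under the same term representation used in the earlier sketches (reusing unify and subst; standardizing the demodulator's variables apart from the target is omitted for brevity):

def demodulate(lhs, rhs, term):
    # Rewrite term using the directed equality lhs = rhs, outermost first.
    theta = unify(lhs, term, {})
    if theta is not None:                # the (sub)term unifies with lhs...
        return subst(theta, rhs)         # ...so replace it with SUBST(θ, rhs)
    if isinstance(term, tuple):
        return (term[0],) + tuple(demodulate(lhs, rhs, a) for a in term[1:])
    return term

# Using x + 0 = x as a demodulator ('+' is just a binary functor here):
print(demodulate(('+', 'x', '0'), 'x', ('Length', ('+', ('F', 'A'), '0'))))
# -> ('Length', ('F', 'A'))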