CI Unit 3
Uncertainty in reasoning
The world is an uncertain place; the available knowledge is often imperfect, which causes uncertainty.
Reasoning must therefore be able to operate under uncertainty, and AI systems must have the ability to
reason under uncertain conditions. Uncertainty arises when there is
Incompleteness in knowledge
Inconsistencies in knowledge
Changing knowledge
Monotonic Reasoning
In monotonic reasoning, all conclusions remain valid after more information is added to the existing
knowledge.
Example
All humans are mortal
Socrates is a human
Conclusion: Socrates is mortal
In monotonic reasoning, if we enlarge a set of axioms we cannot retract any existing assertions.
Non-monotonic Reasoning
In non-monotonic reasoning, adding new information may invalidate conclusions that were drawn earlier.
i. Default reasoning
This is a very common form of non-monotonic reasoning: conclusions are drawn based on what is most
likely to be true. There are two approaches to default reasoning:
Non-monotonic logic
Default logic
Non-monotonic logic
Non-monotonic logic is predicate logic with one extension, a modal operator M, which means
“is consistent with everything we know”. The purpose of M is to allow for consistency. A way to define
consistency, using Prolog-style negation as failure, is
To show that fact p is consistent, we attempt to prove ¬p.
If we fail, we may say that p is consistent, since ¬p cannot be proved.
Example
∀x: plays_instrument(x) ∧ M manage(x) → jazz_musician(x)
This states that, for all x, if x plays an instrument and the fact that x can manage is consistent
with all other knowledge, then we can conclude that x is a jazz_musician.
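As a rough illustration of this consistency check, the following Python sketch uses a tiny hand-rolled knowledge base and a naive negation-as-failure test; the facts and helper names (consistent, infer_jazz_musician) are illustrative assumptions, not part of the original formulation.

    # What we positively know, and explicitly known negations.
    facts = {"plays_instrument(joe)"}
    negative_facts = set()

    def consistent(p):
        """M p: p is consistent if we cannot prove not-p from what we know."""
        return ("not " + p) not in negative_facts

    def infer_jazz_musician(person):
        # Rule: plays_instrument(x) AND M manage(x) -> jazz_musician(x)
        return (f"plays_instrument({person})" in facts) and consistent(f"manage({person})")

    print(infer_jazz_musician("joe"))      # True: nothing contradicts manage(joe)
    negative_facts.add("not manage(joe)")  # new information arrives
    print(infer_jazz_musician("joe"))      # False: the earlier conclusion is retracted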
Default Logic
Default logic introduces a new inference rule of the form A : B / C, where
A is known as the prerequisite,
B is the justification, and
C is the consequent.
“If A, and if it is consistent with the rest of what is known to assume B, then conclude C.”
The rule says that, given the prerequisite, the consequent may be inferred provided the justification is
consistent with the rest of the data.
Example
The rule that “birds typically fly” would be represented as
bird(x) : flies(x) / flies(x)
If x is a bird and the claim that x flies is consistent with what we know, then infer that x flies.
The idea behind non-monotonic reasoning is to reason with first-order logic, and if an inference
cannot be obtained, to use the set of default rules available within the first-order formulation.
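A minimal Python sketch of applying the default rule bird(x) : flies(x) / flies(x), assuming a small set of ground facts and a hard rule that penguins do not fly; the fact set and helper names are illustrative only.

    facts = {("bird", "tweety"), ("bird", "opus"), ("penguin", "opus")}

    def provably_not_flies(x):
        # Monotonic knowledge: penguins do not fly.
        return ("penguin", x) in facts

    def default_flies(x):
        """If x is a bird and flies(x) is consistent with what we know, conclude flies(x)."""
        prerequisite = ("bird", x) in facts
        justification_consistent = not provably_not_flies(x)
        return prerequisite and justification_consistent

    print(default_flies("tweety"))   # True  - the default applies
    print(default_flies("opus"))     # False - the justification is inconsistent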
Applying default rules
When applying default rules, it is necessary to check their justifications for consistency, not only
with the initial data but also with the consequents of any other default rules that may be applied. The
application of one rule may thus block the application of another. To solve this problem, the concept of
a default theory was introduced.
Default theory
A default theory consists of a set of premises W and a set of default rules D. An extension of the default
theory is a set of sentences E which can be derived from W by applying as many rules of D as possible
without generating an inconsistency. Each default rule in D has the form
α(x⃗) : Mβ(x⃗) / η(x⃗)
where
α(x⃗) is the prerequisite of the default rule,
Mβ(x⃗) is the consistency test (justification) of the default rule, and
η(x⃗) is the consequent of the default rule.
The rule can be read as: for all individuals x1, x2, …, xn, if α(x⃗) is believed and if each β(x⃗)
is consistent with our beliefs, then η(x⃗) may be believed.
Example
A default rule says “Typically an American adult owns a car”
American(x) ∧ Adult(x) : M((∃y) car(y) ∧ owns(x, y)) / (∃y) car(y) ∧ owns(x, y)
The rule is only invoked if we wish to know whether or not John owns a car and no answer can be
deduced from our current beliefs. The default rule is applicable if we can prove from our beliefs that
John is an American and an adult, and if believing that there is some car owned by John does not lead
to an inconsistency. If these two premises are satisfied, then the rule states that we can conclude that
John owns a car.
ii) Circumscription
Circumscription is a non-monotonic logic used to formalize common-sense assumptions. It is a
formalized rule of conjecture that can be used along with the rules of inference of first-order logic.
Circumscription involves formulating rules of thumb with “abnormality” predicates and then restricting
the extension of these predicates (circumscribing them) so that they apply only to those things to which
they are currently known to apply.
Example
The rule of thumb that “birds typically fly” is conditional. The predicate “Abnormal” signifies
abnormality with respect to flying ability.
Observe that the rule
∀x: bird(x) ∧ ¬Abnormal(x) → flies(x)
does not allow us to infer that “Tweety flies”, since we do not know that Tweety is not abnormal with
respect to flying ability. But if we add axioms that circumscribe the Abnormal predicate to the individuals
known to be abnormal, then from the fact “bird(Tweety)” the inference can be drawn.
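The effect of circumscribing the Abnormal predicate can be sketched in Python as follows; the individuals and predicate encodings are illustrative assumptions.

    birds = {"tweety", "polly"}
    known_abnormal = set()        # circumscription: Abnormal holds for exactly these individuals

    def flies(x):
        # bird(x) AND not Abnormal(x) -> flies(x), with Abnormal minimised
        return x in birds and x not in known_abnormal

    print(flies("tweety"))        # True under the circumscribed (minimal) Abnormal predicate
    known_abnormal.add("tweety")  # new knowledge: Tweety is abnormal with respect to flying
    print(flies("tweety"))        # False - the earlier conclusion is withdrawn (non-monotonic)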
FUZZY REASONING
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. FL imitates the way
humans make decisions, which involves all the intermediate possibilities between the digital values
YES and NO. In fuzzy logic, the degree of truth of a statement lies between 0 and 1.
Example: William is smart (0.8 truth)
Fuzzy logic works on levels of possibility of the input to achieve a definite output.
Fuzzy logic is useful for commercial and practical purposes.
It can control machines and consumer products.
It may not give accurate reasoning, but acceptable reasoning.
Fuzzy logic helps to deal with the uncertainty in engineering.
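As a small illustration of degrees of truth, the sketch below applies the standard Zadeh operators (min for AND, max for OR, complement for NOT) to the example above; the second statement and all numeric degrees are made-up illustrative values.

    # Degrees of truth in [0, 1] for two fuzzy statements.
    smart = {"william": 0.8, "john": 0.3}
    tall  = {"william": 0.6, "john": 0.9}

    def f_and(a, b): return min(a, b)   # fuzzy AND
    def f_or(a, b):  return max(a, b)   # fuzzy OR
    def f_not(a):    return 1.0 - a     # fuzzy NOT

    print(f_and(smart["william"], tall["william"]))  # 0.6 -> "William is smart AND tall"
    print(f_or(smart["john"], tall["john"]))         # 0.9 -> "John is smart OR tall"
    print(f_not(smart["john"]))                      # 0.7 -> "John is NOT smart"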
Fuzzy Inference Systems Architecture
It has four main parts as shown −
Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy
sets. It splits the input signal into five levels, such as −
LP: x is Large Positive
MP: x is Medium Positive
S: x is Small
MN: x is Medium Negative
LN: x is Large Negative
Knowledge Base − It stores IF-THEN rules provided by experts.
Inference Engine − It simulates the human reasoning process by making fuzzy inference on
the inputs and IF-THEN rules.
Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a
crisp value.
Algorithm
Define linguistic variables and terms.
Construct membership functions for them.
Construct knowledge base of rules.
Convert crisp input data into fuzzy sets using the membership functions. (fuzzification)
Evaluate the rules in the rule base. (inference engine)
Combine the results from each rule. (inference engine)
Convert the output data into non-fuzzy (crisp) values. (defuzzification)
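A compact Python sketch of these steps for a hypothetical air-conditioner controller follows; the membership-function shapes, the two rules and the output centroids are illustrative assumptions, not taken from the text.

    def tri(x, a, b, c):
        """Triangular membership function with feet a, c and peak b."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def infer_fan_speed(temp_c):
        # Fuzzification: crisp temperature -> degrees of "warm" and "hot".
        warm = tri(temp_c, 18, 25, 32)
        hot = tri(temp_c, 28, 40, 52)

        # Rule evaluation (inference engine):
        #   IF temperature is warm THEN fan speed is medium
        #   IF temperature is hot  THEN fan speed is high
        firing = {"medium": warm, "high": hot}

        # Combine rules and defuzzify with a weighted average of assumed
        # output-set centroids (medium = 50 % fan speed, high = 90 % fan speed).
        centroids = {"medium": 50.0, "high": 90.0}
        total = sum(firing.values())
        return 0.0 if total == 0 else sum(firing[k] * centroids[k] for k in firing) / total

    print(infer_fan_speed(30))   # a temperature in the overlap region gives a blended speed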
Logic Development
Step 1: Define linguistic variables and terms
Linguistic variables are input and output variables in the form of simple words or sentences. For room
temperature, cold, warm, hot, etc., are linguistic terms.
Temperature (t) = {very-cold, cold, warm, hot, very-hot}
Step 2: Construct membership functions
A membership function (MF) is a curve that defines how each point in the input space is mapped to
a membership value (or degree of membership) between 0 and 1.
The membership functions for the air-conditioning system's temperature variable are as shown in the figure.
TEMPORAL LOGIC
Temporal logic (TL) is used for reasoning about propositions whose truth values change over time. A
point-based model of time is defined by a precedence relation < over time instants, which may satisfy
properties such as −
Reflexivity: ∀x(x ≤ x)
Respectively, irreflexivity: ∀x¬(x < x)
Transitivity: ∀x∀y∀z(x < y ∧ y < z → x < z)
Anti-symmetry: ∀x∀y(x < y ∧ y < x → x = y)
Trichotomy: ∀x∀y(x = y ∨ x < y ∨ y < x)
Density: ∀x∀y(x < y → ∃z(x < z ∧ z < y))
No beginning: ∀x∃y(y < x)
No end: ∀x∃y(x < y)
Every instant has an immediate successor: ∀x∃y(x < y ∧ ∀z(x < z → y ≤ z))
Every instant has an immediate predecessor: ∀x∃y(y < x ∧ ∀z(z < x → z ≤ y))
An interval-based model, which takes time periods rather than instants as primitive, is suitable for
reasoning about real-world events with duration. Typical properties of the sub-period relation ⊆ and
the overlap relation O include −
Reflexivity of ⊆: ∀x(x ⊆ x)
Anti-symmetry of ⊆: ∀x∀y(x ⊆ y ∧ y ⊆ x → x = y)
Atomicity of ⊆: ∀x∃y(y ⊆ x ∧ ∀z(z ⊆ y → z = y))
Symmetry of O: ∀x∀y(x O y → y O x)
Properties of TL
1. Safety Property
It implies that “something bad must never happen”. For example, the system should not crash. A
violation of a safety property can be demonstrated by a finite-length error trace.
2. Liveness Property
It implies that “something good must eventually happen”. For example, every packet sent must
eventually be received at its destination. A violation of a liveness property can only be demonstrated
by an infinite-length error trace.
Semantics of TL
A temporal frame T = ⟨T, <⟩ defines the flow of time over which the meanings of the tense operators
are defined.
Operators in Temporal Logic
1. The temporal operators provide a syntactic approach to linear temporal logic. These temporal
operators include always, never, next, until and before, amongst others.
2. The always operator holds if its operand holds in every single cycle, whilst the never
operator holds if its operand fails to hold in every single cycle. The next operator holds if
its operand holds in the cycle that immediately follows.
3. The next operator can take a number of cycles as an argument, enclosed in square brackets,
as in:
assert always req->next[2] (grant);
This means that whenever req is true, grant must be true two cycles later.
4. The meaning of the until operator
assert always req-> next(ack until grant);
This asserts that whenever req is true, ack is true in the following cycle and ack remains
true until the first subsequent cycle in which grant is true.
5. The before operator
assert req -> next (ack before grant);
This asserts that whenever req is true, ack must be true at least once in the period starting
in the following cycle and ending in the last cycle before grant is true.
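The sketch below checks these operators over a finite trace of cycles in Python; it is a simplification of the assertion semantics (finite traces, no clocking) intended only to illustrate the definitions above.

    def always(trace, p):
        return all(cycle[p] for cycle in trace)

    def never(trace, p):
        return all(not cycle[p] for cycle in trace)

    def next_n(trace, i, p, n=1):
        """p holds n cycles after position i (False if the trace ends first)."""
        return i + n < len(trace) and trace[i + n][p]

    def until(trace, i, p, q):
        """p stays true from position i until the first later cycle in which q holds."""
        for j in range(i, len(trace)):
            if trace[j][q]:
                return True
            if not trace[j][p]:
                return False
        return False

    # assert always req -> next[2](grant): whenever req is true, grant holds two cycles later.
    trace = [{"req": True, "ack": True, "grant": False},
             {"req": False, "ack": True, "grant": False},
             {"req": False, "ack": False, "grant": True}]
    print(all(not c["req"] or next_n(trace, i, "grant", 2) for i, c in enumerate(trace)))  # True
    print(until(trace, 0, "ack", "grant"))   # ack holds until the first cycle where grant holds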
Linear Temporal Logic
Linear temporal logic (LTL) is a modal logic for reasoning about dynamic scenarios, similar to temporal
modal logic. It contains temporal operators such as next, until and always, interpreted over a linear
sequence of states.
ARTIFICIAL NEURAL NETWORKS
An artificial neural network (ANN) is a parallel computational system consisting of many simple
processing elements connected together in a specific way in order to perform a particular task. Some
important features of artificial neural networks are as follows.
(1) Artificial neural networks are extremely powerful computational devices (Universal
computers).
(2) ANNs are modeled on the basis of current brain theories, in which information is represented
by weights.
(3) ANNs have massive parallelism, which makes them very efficient.
(4) They can learn and generalize from training data so there is no need for enormous feats of
programming.
(5) Storage is fault tolerant i.e. some portions of the neural net can be removed and there will be
only a small degradation in the quality of stored data.
(6) They are particularly fault tolerant which is equivalent to the “graceful degradation” found in
biological systems.
(7) Data are naturally stored in the form of associative memory which contrasts with conventional
memory, in which data are recalled by specifying the address of that data.
(8) They are very noise tolerant, so they can cope with situations where normal symbolic systems
would have difficulty.
(9) In practice, they can do anything a symbolic/ logic system can do and more.
(10) Neural networks can extrapolate and interpolate from their stored information. The neural
networks can also be trained: special training teaches the net to look for significant features or
relationships in the data.
TYPES OF NEURAL NETWORKS
Single Layer Network
A single-layer neural network consists of a set of units organized in a layer. Each unit U_n
receives a weighted input I_j with weight W_jn. The figure shows a single-layer neural network with j
inputs and n output units.
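A minimal NumPy sketch of such a single-layer network follows; the layer sizes and the sigmoid activation are assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n_inputs, n_units = 3, 2
    W = rng.normal(scale=0.5, size=(n_inputs, n_units))  # weight W[j, n] from input j to unit n
    b = np.zeros(n_units)                                # bias of each unit

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(I):
        """Activations of all output units for input vector I."""
        return sigmoid(I @ W + b)

    print(forward(np.array([0.2, 0.7, 1.0])))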
The goal of back propagation, as with most training algorithms, is to iteratively adjust the
weights in the network to produce the desired output by minimizing the output error. The algorithm’s
goal is to solve the credit assignment problem. Back propagation is a gradient-descent approach in that it
uses the minimization of first-order derivatives to find an optimal solution. The standard back
propagation algorithm is given below.
Step1: Build a network with the chosen number of input, hidden and output units.
Step2: Initialize all the weights to low random values.
Step3: Randomly choose a single training pair.
Step4: Copy the input pattern to the input layer.
Step5: Cycle the network so that the activation from the inputs generates the activations in the
hidden and output layers.
Step6: Calculate the error derivative between the output activation and the final output.
Step7: Apply the method of back propagation to the summed products of the weights and errors in
the output layer in order to calculate the error in the hidden units.
Step8: Update the weights attached to each unit according to the error in that unit, the output
from the unit below it and the learning parameters. Repeat from Step3 until the error is sufficiently low.
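The following NumPy sketch walks through the eight steps for a network with one hidden layer, using XOR as a stand-in training task; the layer sizes, learning rate, sigmoid activation and number of iterations are all assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # training inputs
    T = np.array([[0], [1], [1], [0]], dtype=float)              # desired outputs

    # Step 1-2: build the network and initialise the weights to low random values.
    W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)
    W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)
    lr = 0.5

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for _ in range(20000):
        i = rng.integers(len(X))               # Step 3: randomly choose a training pair
        x, t = X[i:i + 1], T[i:i + 1]          # Step 4: copy the input pattern

        h = sigmoid(x @ W1 + b1)               # Step 5: forward pass through the hidden layer
        y = sigmoid(h @ W2 + b2)               #         and the output layer

        d_out = (y - t) * y * (1 - y)          # Step 6: error derivative at the output
        d_hid = (d_out @ W2.T) * h * (1 - h)   # Step 7: back-propagate to the hidden units

        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.ravel()   # Step 8: weight updates
        W1 -= lr * x.T @ d_hid; b1 -= lr * d_hid.ravel()

    # Outputs should drift towards [0, 1, 1, 0] (exact values depend on the random seed).
    print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel(), 2))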
MACHINE LEARNING
Machine learning is the systematic study of algorithms and systems that improve their
knowledge or performance (learn a model for accomplishing a task) with experience (from available
data/examples).
Examples:
Given a URL, decide whether it is a sports website or not
Given that a buyer is buying a book at an online store, suggest some related products for
that buyer
Given an ultrasound image of abdomen scan of a pregnant lady, predict the weight of
the baby
Given a CT scan image set, decide whether there is stenosis or not
Given marks of all the students in a class, assign relative grades based on statistical
distribution
Given a received mail, check whether it is SPAM
Given a speech recording, identify the emotion of the speaker
Given a DNA sequence, predict the promoter regions in that sequence
These are some examples of “intelligent tasks”: tasks that are “easy” for humans but
“extremely difficult” for a machine to achieve. Artificial Intelligence is about building systems that can
efficiently perform such “intelligent tasks”.
One of the important aspects that enable humans to perform such intelligent tasks is their ability
to learn from experiences (either supervised or unsupervised). Machine learning tasks are typically
classified into three broad categories, depending on the nature of the learning "signal" or "feedback"
available to a learning system.
(i) Supervised learning
In supervised learning, the system learns from labelled examples. Typical supervised tasks include:
o Prediction
o Classification (discrete labels)
o Regression (real values)
Prediction
Example: Price of a used car
x : car attributes
y : price
y = g(x | θ)
where θ denotes the model parameters
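A tiny Python sketch of y = g(x | θ) with a linear model for the used-car example; here x is taken to be the car's age in years and θ = (w, b), with parameter values made up purely for illustration.

    theta = {"w": -1200.0, "b": 15000.0}   # price drops ~1200 per year from a 15000 base

    def g(x, theta):
        """Linear prediction model y = g(x | theta)."""
        return theta["w"] * x + theta["b"]

    print(g(4, theta))   # predicted price of a 4-year-old car: 10200.0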
Classification
Example 1
Suppose you have a basket filled with different kinds of fruits, and your task is to arrange them
into groups. For understanding, assume the names of the fruits in the basket are already known.
You have already learned about the physical characteristics of fruits from your previous work, so arranging
the same type of fruits in one place is now easy. Your previous work is called the training data in data mining.
You learn from your training data because of the response variable; a response variable is simply the
decision variable.
Suppose you take a new fruit from the basket and observe its size, color and shape. If the size is big,
the color is red and the shape is rounded with a depression at the top, you confirm the fruit as an apple
and put it in the apple group.
When you learn from the training data and then apply that knowledge to the test data (the new fruit),
this type of learning is called Supervised Learning.
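A minimal rule-based Python sketch of this story follows: the hand-written rules stand in for the knowledge learned from the training data, and a new fruit is classified from its features. The feature encodings are illustrative assumptions.

    def classify(size, color, shape):
        # Knowledge "learned" from previous work (the training data).
        if size == "big" and color == "red" and shape == "rounded-with-depression":
            return "apple"
        if size == "small" and color == "red":
            return "cherry"
        if size == "big" and color == "green" and shape == "long-curved":
            return "banana"
        if size == "small" and color == "green":
            return "grape"
        return "unknown"

    # A new fruit taken from the basket (the test data):
    print(classify("big", "red", "rounded-with-depression"))   # -> apple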
Regression
Given example pairs of heights and weights of a set of people, find a model to predict the weight
of a person from her height.
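A short NumPy sketch of this regression task, fitting a straight line weight = w * height + b by least squares; the (height, weight) pairs are made-up illustrative data.

    import numpy as np

    heights = np.array([150, 160, 170, 180, 190], dtype=float)  # cm (illustrative data)
    weights = np.array([52, 60, 68, 77, 85], dtype=float)       # kg (illustrative data)

    w, b = np.polyfit(heights, weights, deg=1)  # least-squares fit of a degree-1 polynomial
    print(round(w * 175 + b, 1))                # predicted weight of a 175 cm tall person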
(ii) Unsupervised learning
In unsupervised learning, no labels are given to the learning algorithm, leaving it on its own to
find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in
data) or a means towards an end.
o Clustering
Example
Suppose you have a basket filled with some different types of fruits, and your task is to arrange
them into groups.
This time you do not know anything about the fruits; honestly speaking, this is the first time you
have seen them, and you have no clue about them.
So, how will you arrange them? What will you do first?
You will take a fruit and arrange the fruits by considering some physical characteristic of that
particular fruit.
Suppose you consider color. Then you will arrange them with color as the base condition, and
the groups will be something like this:
RED COLOR GROUP: apples & cherry fruits.
GREEN COLOR GROUP: bananas & grapes.
So now you will take another physical characteristic, such as size.
RED COLOR AND BIG SIZE: apple.
RED COLOR AND SMALL SIZE: cherry fruits.
GREEN COLOR AND BIG SIZE: bananas.
GREEN COLOR AND SMALL SIZE: grapes.
Job done happy ending.
Here you did not learn anything beforehand, which means there is no training data and no response variable.
This type of learning is known as unsupervised learning.
Clustering comes under unsupervised learning.
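A small k-means sketch of this grouping in Python: each fruit is described only by numeric (color, size) features and grouped into four clusters without any labels. The feature encodings (red = 0, green = 1; size in cm) are illustrative assumptions.

    import numpy as np

    fruits = np.array([
        [0, 8], [0, 9],      # big red (apples)
        [0, 2], [0, 1.5],    # small red (cherries)
        [1, 18], [1, 20],    # big green (bananas)
        [1, 2], [1, 2.5],    # small green (grapes)
    ])

    def kmeans(X, k, iters=20, seed=0):
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), k, replace=False)]
        for _ in range(iters):
            labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)  # nearest centre
            centers = np.array([X[labels == j].mean(0) if np.any(labels == j) else centers[j]
                                for j in range(k)])
        return labels

    print(kmeans(fruits, k=4))   # fruits with similar color and size end up in the same group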
Heuristic Reinforcement
NEURO-FUZZY INFERENCE
A neuro-fuzzy system is a combination of fuzzy logic and a neural network. Fuzzy logic and
neural networks are natural complementary tools in building intelligent systems. While neural networks
are low-level computational structures that perform well when dealing with raw data, fuzzy logic deals
with reasoning on a higher level, using linguistic information acquired from domain experts. However,
fuzzy systems lack the ability to learn and cannot adjust themselves to a new environment. On the other
hand, although neural networks can learn, they are opaque to the user.
Integrated neuro-fuzzy systems can combine the parallel computation and learning abilities of
neural networks with the human-like knowledge representation and explanation abilities of fuzzy
systems. As a result, neural networks become more transparent, while fuzzy systems become capable
of learning.
A neuro-fuzzy system is a neural network which is functionally equivalent to a fuzzy inference
model. It can be trained to develop IF-THEN fuzzy rules and determine membership functions for input
and output variables of the system. Expert knowledge can be incorporated into the structure of the
neuro-fuzzy system. At the same time, the connectionist structure avoids fuzzy inference, which entails
a substantial computational burden.
Neuro-fuzzy systems were created to resolve the trade-off between:
– The mapping precision & automation of Neural Networks
– The interpretability of Fuzzy Systems
Models of NFS
• Model 1: Fuzzy System → Neural Network
• Model 2: Neural Network → Fuzzy Systems
Model 1: Fuzzy System → Neural Network
Layer 2 is the fuzzification layer. Neurons in this layer represent fuzzy sets used in the antecedents of
fuzzy rules. A fuzzification neuron receives a crisp input and determines the degree to which this input
belongs to the neuron’s fuzzy set.
The activation function of a membership neuron is set to the function that specifies the neuron’s fuzzy
set. A triangular membership function can be specified by two parameters {a, b} as follows:
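One common two-parameter triangular form can be sketched as follows, assuming a is the centre of the triangle and b is its width (an assumption about the intended parameterisation):

    import numpy as np

    def triangular(x, a, b):
        """Degree of membership of crisp input x in a triangular fuzzy set (centre a, width b)."""
        return np.maximum(0.0, 1.0 - 2.0 * np.abs(x - a) / b)

    # Degree to which a crisp temperature of 22 belongs to a "warm" set centred at 25 with width 10:
    print(triangular(22.0, a=25.0, b=10.0))   # 0.4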
Layer 3 is the fuzzy rule layer. Each neuron in this layer corresponds to a single fuzzy rule. A fuzzy
rule neuron receives inputs from the fuzzification neurons that represent fuzzy sets in the rule
antecedents. For instance, neuron R1, which corresponds to Rule 1, receives inputs from neurons A1
and B1. In a neuro-fuzzy system, intersection can be implemented by the product operator.
Layer 4 is the output membership layer. Neurons in this layer represent fuzzy sets used in the
consequent of fuzzy rules. An output membership neuron combines all its inputs by using the fuzzy
operation union.
Layer 5 is the defuzzification layer. Each neuron in this layer represents a single output of the neuro-
fuzzy system. It takes the output fuzzy sets clipped by the respective integrated firing strengths and
combines them into a single fuzzy set. Neuro-fuzzy systems can apply standard defuzzification
methods, including the centroid technique. The sum-product composition calculates the crisp output as
the weighted average of the centroids of all output membership functions. For example, the crisp output
can be obtained from the weighted average of the centroids of the clipped fuzzy sets C1 and C2, as
sketched below.
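A minimal sketch of this weighted average, assuming the weights are the firing strengths of the corresponding rules; the centroid positions and firing strengths are illustrative values.

    def weighted_centroid_average(firing_strengths, centroids):
        """Crisp output as the firing-strength-weighted average of the output-set centroids."""
        num = sum(mu * a for mu, a in zip(firing_strengths, centroids))
        den = sum(firing_strengths)
        return num / den if den else 0.0

    # Clipped output sets C1 and C2 with centroids at 20 and 70, fired at 0.2 and 0.6:
    print(weighted_centroid_average([0.2, 0.6], [20.0, 70.0]))   # 57.5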
A neuro-fuzzy system is essentially a multi-layer neural network, and thus it can apply standard learning
algorithms developed for neural networks, including the back-propagation algorithm.
Applications of NFS
• Measuring opacity/transparency of water in washing machine – Hitachi, Japan
• Improving the rating of convertible bonds – Nikko Securities, Japan
• Adjusting exposure in photocopy machines – Sanyo, Japan
• Electric fan that rotates towards the user – Sanyo, Japan