Inar - Capítulo 03f - Search Algorithms
(ARTIFICIAL INTELLIGENCE)
INTRODUCTION
• Because this kind of agent operates in an uninformed environment (it has no more
information about the problem than the problem statement itself), when the agent is
presented with several options for trying to solve the problem, it does not know which
option is best: it lacks knowledge of the states that will result from selecting each
option, and consequently it may fail. The best that can be done in this kind of
situation is to select one of the actions at random.
• The successor function describes the possible actions available for an agent to
perform. For example, given a state x, the successor function returns a set of
ordered pairs of the form <action, successor>, where action is a legal action and successor
is the state that can be reached from x by applying that action.
• The action cost indicates the numeric cost of applying an action that takes the agent
from state s to state s′.
• The path cost is a function that assigns a numeric cost to each path; the agent
selects it so that it reflects its own performance measure.
• A solution to a problem is a path from the initial state to a goal state. The
quality of a solution is measured by the path cost function. An optimal solution
is the one with the lowest path cost among the available solutions.
• An abstraction is valid if it is possible to expand any abstract solution into a
solution in the more detailed world.
• An abstraction is useful if carrying out each of the actions in the solution is easier
than in the original problem. Choosing a good abstraction involves removing as much
detail as possible while keeping the abstraction valid and ensuring that the abstract
actions are easy to carry out.
ASYMPTOTIC ANALYSIS
How much faster would the algorithm run on a machine that is 10 times faster than
the current computer? 10 times faster.
If we assume C(n) = ½n(n−1), how long will the algorithm take if the amount of
input data is doubled?
C(n) = ½n(n−1) = ½n² − ½n ≈ ½n²
and
T(2n)/T(n) ≈ c_op C(2n) / c_op C(n) ≈ ½(2n)² / ½n² = 4
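The doubling argument above can be checked empirically; a minimal Python sketch, assuming the basic-operation count C(n) = ½n(n−1):

```python
# Empirical check: doubling the input size roughly quadruples the
# estimated running time when C(n) = n(n-1)/2.
def C(n):
    return n * (n - 1) / 2

for n in [10, 100, 1000]:
    print(n, C(2 * n) / C(n))   # ratio approaches 4 as n grows
```

The exact ratio is 2(2n−1)/(n−1), which tends to 4 as n grows.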
Approximate values of functions commonly used in the analysis of algorithms:
log₂ n, n, n log₂ n, n², n³, 2ⁿ, n!
Algorithms that require an exponential number of operations are practical only for
solving problems of very small size.
EFFICIENCY: BEST CASE, WORST CASE, AND AVERAGE CASE
Algorithm SequentialSearch(A[0..n-1], K)
// Searches for a given value in a given array by sequential search
// Input: An array A[0..n-1] and a search key K
// Output: Returns the index of the first element of A that matches K
//         or -1 if there are no matching elements
i <- 0
while i < n and A[i] ≠ K do
    i <- i + 1
if i < n return i
else return -1
C_worst(n) = n
C_best(n) = 1
C_avg(n) = p(n + 1)/2 + n(1 − p), where p is the probability of a successful search (0 ≤ p ≤ 1)
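As a sketch, the pseudocode above translates almost line for line into runnable Python (the function name is ours):

```python
def sequential_search(A, K):
    """Return the index of the first element of A equal to K, or -1."""
    i = 0
    while i < len(A) and A[i] != K:
        i += 1
    return i if i < len(A) else -1

print(sequential_search([3, 1, 4, 1, 5], 4))   # 2
print(sequential_search([3, 1, 4], 9))         # -1
```

In the best case (key in the first position) the loop body runs once; in the worst case (key absent) it runs n times, matching C_best and C_worst above.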
ASYMPTOTIC NOTATION AND EFFICIENCY CLASSES
• Informally:
The notation O(g(n)) is the set of all functions with an order of growth
less than or equal to that of g(n).
Examples:
n ∈ O(n²)   100n + 5 ∈ O(n²)   ½n(n−1) ∈ O(n²)
• The notation Ω(g(n)) is the set of all functions with an order of growth
greater than or equal to that of g(n).
• Examples:
n³ ∈ Ω(n²),   ½n(n−1) ∈ Ω(n²),   but 100n + 5 ∉ Ω(n²)
• The notation Θ(g(n)) is the set of all functions that have the same order
of growth as g(n).
• Example:
an² + bn + c with a > 0 is contained in Θ(n²)
COMPLEXITY AND BIG-OH
• The class of P-problems is a subset of the class of NP-problems, but there also exist
problems which are not NP.
COMPLEXITY THEORY
• If a solution to an NP-problem is known, checking it can be reduced to a single
polynomial-time verification.
• A problem is NP-complete if an algorithm for solving it can be translated into one for
solving any other NP-problem.
The traveling salesman problem is a problem in graph theory requiring the most
efficient (i.e., least total distance) Hamiltonian circuit a salesman can take through
each of n cities. No general method of solution is known, and the problem is NP-hard.
HAMILTONIAN CIRCUIT
A Hamiltonian circuit, also called a Hamiltonian cycle, Hamilton circuit, or Hamilton cycle,
is a graph cycle (i.e., closed loop) through a graph that visits each node exactly once
(Skiena 1990, p. 196). By convention, the trivial graph on a single node is considered to
possess a Hamiltonian circuit, but the connected graph on two nodes is not. A graph
possessing a Hamiltonian circuit is said to be a
Hamiltonian graph. The Hamiltonian circuit is
named after Sir William Rowan Hamilton, who
devised a puzzle in which such a path along
the polyhedron edges of a dodecahedron
was sought (the Icosian game).
ICOSIAN GAME
The Icosian game, also called the Hamiltonian game (Ball and Coxeter 1987, p. 262), is the
problem of finding a Hamiltonian circuit along the edges of a dodecahedron, i.e., a path
such that every vertex is visited a single time, no edge is visited twice, and the ending
point is the same as the starting point (left figure). The puzzle was distributed
commercially as a pegboard with holes at the nodes of the dodecahedral graph, illustrated
above (right figure). The Icosian Game was invented
in 1857 by William Rowan Hamilton. Hamilton sold
it to a London game dealer in 1859 for 25 pounds,
and the game was subsequently marketed in Europe
in a number of forms (Gardner 1957).
HAMILTONIAN GRAPH
A real-world problem is one in which the solutions proposed by people
must be verified and applied with care.
STANDARDIZED PROBLEMS (GAMES)
https://store.steampowered.com/app/1127180/Robot_Vacuum_Simulator_X/
https://store.steampowered.com/app/1766150/Robo_Vacuum_Simulator/
INITIAL STATE: Any state can be designated as the initial state.
ACTIONS: In the two-cell world we defined three actions: Suck, move Left, and move
Right. In a two-dimensional multi-cell world we need more movement actions. We
could add Upward and Downward, giving us four absolute movement actions, or we
could switch to egocentric actions, defined relative to the viewpoint of the agent—for
example, Forward, Backward, TurnRight, and TurnLeft.
TRANSITION MODEL: Suck removes any dirt from the agent’s cell; Forward moves
the agent ahead one cell in the direction it is facing, unless it hits a wall, in which case
the action has no effect. Backward moves the agent in the opposite direction, while
TurnRight and TurnLeft change the direction it is facing by 90°.
GOAL STATES: The states in which every cell is clean.
ACTION COST: Each action costs 1.
SOKOBAN PUZZLE
https://www.mathsisfun.com/games/sokoban.html
SLIDING-TILE PUZZLE
https://bijlmakers.com/sliding-puzzle/
STATES: A state description specifies the location of each of the tiles.
INITIAL STATE: Any state can be designated as the initial state. Note that a parity
property partitions the state space—any given goal can be reached from exactly half of
the possible initial states.
ACTIONS: While in the physical world it is a tile that slides, the simplest way of
describing an action is to think of the blank space moving Left, Right, Up, or Down. If
the blank is at an edge or corner then not all actions will be applicable.
TRANSITION MODEL: Maps a state and action to a resulting state; for example, if
we apply Left to the start state, the resulting state has the 5 and the blank switched.
GOAL STATE: Although any state could be the goal, we typically specify a state with
the numbers in order.
ACTION COST: Each action costs 1.
REAL-WORLD PROBLEMS
A search tree is generated from the initial state and the successor function that
defines the state space.
• Three kinds of queues are used in search algorithms:
• Priority queue: pops first the node with the minimum cost according to
some evaluation function f; used in best-first search.
• FIFO queue (first in, first out): pops first the node that was added to the
queue earliest; used in breadth-first search.
• LIFO queue (last in, first out, also called a stack): pops first the most
recently added node; used in depth-first search.
• To judge the effectiveness of a search algorithm, we must consider the search
cost, which usually depends on the time complexity but can also include a term
for memory usage.
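The three queue disciplines above can be tried out with Python's standard library (`heapq` for the priority queue, `collections.deque` for the FIFO queue); the states 'a', 'b', 'c' and the costs are made-up placeholders:

```python
import heapq
from collections import deque

# Priority queue (best-first search): pop the node with the minimum
# value of the evaluation function f.
frontier = [(5, 'a'), (2, 'b'), (9, 'c')]      # (f-value, state)
heapq.heapify(frontier)
best = heapq.heappop(frontier)                 # (2, 'b'): lowest f first

# FIFO queue (breadth-first search): pop the node added earliest.
fifo = deque(['a', 'b', 'c'])
first_in = fifo.popleft()                      # 'a'

# LIFO queue / stack (depth-first search): pop the node added last.
lifo = ['a', 'b', 'c']
last_in = lifo.pop()                           # 'c'

print(best, first_in, last_in)
```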
Based on the search problems we can classify the search algorithms into:
Uninformed search (Blind search) algorithms.
Informed search (Heuristic search) algorithms.
Search algorithms include Breadth-First Search (BFS), Depth-Limited Search, and
Best-First Search, among others.
Uninformed search algorithms do not contain any domain knowledge, such as
closeness or the location of the goal.
Uninformed search explores the search tree without any information about the
search space beyond the initial state, the operators, and the goal test, so it is
also called blind search. It examines the nodes of the tree until it achieves the
goal node.
BREADTH FIRST SEARCH (BFS)
Breadth-first search is the most common search strategy for traversing a tree
or graph. This algorithm searches breadthwise in a tree or graph, so it is
called breadth-first search.
The BFS algorithm starts searching from the root node of the tree and expands all
successor nodes at the current level before moving on to the nodes of the next level.
The breadth-first search algorithm is an example of a general-graph search
algorithm.
Breadth-first search is implemented using a FIFO queue data structure.
BREADTH FIRST SEARCH (BFS)
Advantages:
BFS will provide a solution if any solution exists.
If there is more than one solution for a given problem, then BFS will
find the minimal solution, i.e., the one that requires the fewest steps.
Disadvantages:
It requires lots of memory since each level of the tree must be saved into
memory to expand the next level.
BFS needs lots of time if the solution is far away from the root node.
BREADTH FIRST SEARCH (BFS)
That also means we can do an early goal test, checking whether a node is a
solution as soon as it is generated, rather than the late goal test that best-first
search uses, waiting until a node is popped off the queue.
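As an illustrative sketch (not the textbook's code), a BFS with an early goal test might look like this in Python; the example graph and its state names are hypothetical:

```python
from collections import deque

def breadth_first_search(start, goal, successors):
    """BFS with an early goal test: a node is checked when it is
    generated, not when it is popped off the queue."""
    if start == goal:
        return [start]
    frontier = deque([start])        # FIFO queue
    parent = {start: None}           # doubles as the reached set
    while frontier:
        state = frontier.popleft()
        for child in successors(state):
            if child in parent:      # already reached: skip
                continue
            parent[child] = state
            if child == goal:        # early goal test
                path = [child]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            frontier.append(child)
    return None                      # no solution

# Hypothetical example graph (names are illustrative only):
graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['C', 'G'], 'C': [], 'G': []}
print(breadth_first_search('S', 'G', lambda s: graph[s]))   # ['S', 'B', 'G']
```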
UNIFORM-COST SEARCH
Advantages:
Uniform cost search is optimal because at every state the path with the
least cost is chosen.
Disadvantages:
It does not care about the number of steps involved in the search and is only
concerned with path cost; because of this, the algorithm may get stuck in an
infinite loop (for example, if zero-cost actions exist).
DEPTH FIRST SEARCH (DFS)
Advantage:
DFS requires very little memory, as it only needs to store a stack of the
nodes on the path from the root node to the current node.
It takes less time than BFS to reach the goal node (if it traverses along
the right path).
Disadvantage:
There is the possibility that many states keep re-occurring, and there is no
guarantee of finding the solution.
The DFS algorithm goes deep down in the search and may sometimes enter
an infinite loop.
DEPTH FIRST SEARCH (DFS)
Completeness: the DFS search algorithm is complete within a finite state space, as it
will expand every node within a limited search tree.
Time Complexity: the time complexity of DFS is equivalent to the number of nodes
traversed by the algorithm. It is given by:
T(n) = 1 + b + b² + ... + bᵐ = O(bᵐ)
where b is the branching factor and m is the maximum depth of any node; this can be
much larger than d (the shallowest solution depth).
Space Complexity: DFS algorithm needs to store only single path from the root node,
hence space complexity of DFS is equivalent to the size of the fringe set, which is
O(bm).
Optimal: the DFS search algorithm is non-optimal, as it may take a large number of
steps or incur a high cost to reach the goal node.
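A minimal tree-like DFS sketch using an explicit stack (illustrative only; the graph and its state names are hypothetical):

```python
def depth_first_search(start, goal, successors):
    """Tree-like DFS with an explicit LIFO stack. The in-path check
    avoids cycles, but not redundant paths; memory stays O(bm)."""
    stack = [(start, [start])]
    while stack:
        state, path = stack.pop()    # most recently added node first
        if state == goal:
            return path
        for child in successors(state):
            if child not in path:    # cheap cycle check along the current path
                stack.append((child, path + [child]))
    return None

graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['C', 'G'], 'C': [], 'G': []}
print(depth_first_search('S', 'G', lambda s: graph[s]))   # ['S', 'B', 'G']
```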
DEPTH-LIMITED SEARCH
To keep depth-first search from wandering down an infinite path, it is possible to use the
depth-limited search algorithm, a version of depth-first search in which we supply a depth
limit ℓ and treat all nodes at depth ℓ as if they had no successors.
Unfortunately, if ℓ is chosen poorly the algorithm will fail to reach the solution, making it
incomplete again.
DEPTH-LIMITED SEARCH
Sometimes a good depth limit can be chosen based on knowledge of the problem. For
example, the maximum distance between any two states, known as the diameter of the
state-space graph, gives us a better depth limit, which leads to a more efficient
depth-limited search.
However, for most problems we will not know a good depth limit until we have solved
the problem.
DEPTH-LIMITED SEARCH
Depth-limited search returns
one of three different types of
values: either a solution node;
or failure, when it has
exhausted all nodes and
proved there is no solution at
any depth; or cutoff, to mean
there might be a solution at a
deeper depth than ℓ.
This is a tree-like search algorithm that does not keep track of reached states, and thus
uses much less memory than best-first search, but runs the risk of visiting the same
state multiple times on different paths. Also, if the IS-CYCLE check does not check all
cycles, then the algorithm may get caught in a loop.
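The three return values (a solution, cutoff, or failure) can be sketched recursively; this is an illustrative Python version, not the textbook's pseudocode:

```python
def depth_limited_search(state, goal, successors, limit):
    """Recursive DLS: returns a path, 'cutoff' (there may be a solution
    deeper than the limit), or 'failure' (provably no solution)."""
    if state == goal:
        return [state]
    if limit == 0:
        return 'cutoff'
    cutoff_occurred = False
    for child in successors(state):
        result = depth_limited_search(child, goal, successors, limit - 1)
        if result == 'cutoff':
            cutoff_occurred = True
        elif result != 'failure':
            return [state] + result
    return 'cutoff' if cutoff_occurred else 'failure'

# Hypothetical example graph:
graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['C', 'G'], 'C': [], 'G': []}
print(depth_limited_search('S', 'G', lambda s: graph[s], 1))   # 'cutoff'
print(depth_limited_search('S', 'G', lambda s: graph[s], 2))   # ['S', 'B', 'G']
```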
ITERATIVE DEEPENING SEARCH
The iterative deepening algorithm is a combination of DFS and BFS algorithms. This
search algorithm finds out the best depth limit and does it by gradually increasing the
limit until a goal is found.
This algorithm performs depth-first search up to a certain "depth limit", and it keeps
increasing the depth limit after each iteration until the goal node is found.
This Search algorithm combines the benefits of Breadth-first search's fast search and
depth-first search's memory efficiency.
The iterative deepening search algorithm is a useful uninformed search when the
search space is large and the depth of the goal node is unknown.
ITERATIVE DEEPENING SEARCH
Advantages:
It combines the benefits of BFS and DFS search algorithm in terms of fast
search and memory efficiency.
Disadvantages:
The main drawback of IDDFS is that it repeats all the work of the previous
phase.
Four iterations of iterative
deepening search for goal on
a binary tree, with the depth
limit varying from 0 to 3.
Note the interior nodes form a
single path.
The triangle marks the node to
expand next; green nodes with
dark outlines are on the
frontier; the very faint nodes
provably can’t be part of a
solution with this depth limit.
ITERATIVE DEEPENING SEARCH
Following tree structure is showing the
iterative deepening depth-first search. IDDFS
algorithm performs various iterations until it
does not find the goal node.
The iteration performed by the algorithm is
given as:
1st Iteration ----> A
2nd Iteration ----> A, B, C
3rd Iteration ----> A, B, D, E, C, F, G
Time Complexity: let b be the branching factor and d the depth of the shallowest
solution; the worst-case time complexity is O(bᵈ) when there is a solution, or O(bᵐ)
when there is none.
Space Complexity: the space complexity of IDDFS is O(bd) when there is a
solution, or O(bm) when there is none.
Optimal: the IDDFS algorithm is optimal if the path cost is a non-decreasing function
of the depth of the node.
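A minimal IDS sketch, wrapping a depth-limited DFS in a loop over increasing limits (illustrative only; the graph and names are hypothetical):

```python
def iterative_deepening_search(start, goal, successors, max_depth=50):
    """Run depth-limited DFS with limits 0, 1, 2, ... until a goal is
    found, combining DFS's O(bd) memory with BFS-like completeness."""
    def dls(state, path, limit):
        if state == goal:
            return path
        if limit == 0:
            return None                     # cutoff
        for child in successors(state):
            if child not in path:           # avoid cycles on the current path
                result = dls(child, path + [child], limit - 1)
                if result is not None:
                    return result
        return None

    for limit in range(max_depth + 1):
        result = dls(start, [start], limit)
        if result is not None:
            return result
    return None

graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['C', 'G'], 'C': [], 'G': []}
print(iterative_deepening_search('S', 'G', lambda s: graph[s]))   # ['S', 'B', 'G']
```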
ITERATIVE DEEPENING SEARCH
A hybrid approach runs breadth-first search until almost all the available
memory is consumed, and then runs iterative deepening from all the nodes in
the frontier.
In general, iterative deepening is the preferred uninformed search method when
the search state space is larger than can fit in memory and the depth of the
solution is not known.
BIDIRECTIONAL SEARCH
Bidirectional search simultaneously searches forward from the initial state and
backwards from the goal state(s), hoping that the two searches will meet. The
motivation is that b^(d/2) + b^(d/2) is much less than b^d.
For this to work, it is necessary to keep track of two frontiers and two tables of
reached states, and we need to be able to reason backwards: if state s′ is a
successor of s in the forward direction, then we need to know that s is a successor
of s′ in the backward direction. We have a solution when the two frontiers
collide.
BIDIRECTIONAL SEARCH
The bidirectional search algorithm runs two simultaneous searches, one from the
initial state (the forward search) and the other from the goal node (the backward
search), to find the goal node. Bidirectional search replaces one single search
graph with two small subgraphs: one starts the search from the initial
vertex and the other starts from the goal vertex. The search stops when these two
graphs intersect each other.
Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
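A sketch of the idea with two BFS frontiers and two reached tables, assuming an undirected graph so the same neighbors function serves both directions (the numeric states are hypothetical):

```python
from collections import deque

def bidirectional_bfs(start, goal, neighbors):
    """Two BFS frontiers, one from each end; stop when a state is
    reached by both searches, then join the two half-paths."""
    if start == goal:
        return [start]
    parent_f, parent_b = {start: None}, {goal: None}   # reached tables
    q_f, q_b = deque([start]), deque([goal])

    def expand(queue, own, other):
        state = queue.popleft()
        for child in neighbors(state):
            if child not in own:
                own[child] = state
                if child in other:        # the two frontiers collide
                    return child
                queue.append(child)
        return None

    while q_f and q_b:
        meet = expand(q_f, parent_f, parent_b)
        if meet is None:
            meet = expand(q_b, parent_b, parent_f)
        if meet is not None:
            # join the start->meet path with the meet->goal path
            path = [meet]
            while parent_f[path[-1]] is not None:
                path.append(parent_f[path[-1]])
            path.reverse()
            state = meet
            while parent_b[state] is not None:
                state = parent_b[state]
                path.append(state)
            return path
    return None

# Hypothetical line graph 1-2-3-4-5-6:
edges = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
print(bidirectional_bfs(1, 6, lambda s: edges[s]))   # [1, 2, 3, 4, 5, 6]
```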
BIDIRECTIONAL SEARCH
Advantages:
Bidirectional search is fast.
Bidirectional search requires less memory
Disadvantages:
Implementation of the bidirectional search tree is difficult.
In bidirectional search, one should know the goal state in advance.
BIDIRECTIONAL SEARCH
Bidirectional best-first search keeps two
frontiers and two tables of reached
states.
When a path in one frontier reaches a
state that was also reached in the other
half of the search, the two paths are
joined (by the function JOIN-NODES) to
form a solution.
The first solution we get is not
guaranteed to be the best; the function
TERMINATED determines when to stop
looking for new solutions.
BIDIRECTIONAL SEARCH
This algorithm divides one
graph/tree into two sub-graphs. It
starts traversing from node 1 in the
forward direction and starts from
goal node 16 in the backward
direction.
BIDIRECTIONAL SEARCH
With backtracking we also have the option of maintaining an efficient set data structure
for the states on the current path, allowing us to check for a cyclic path in O(1) time
rather than O(m).
DEPTH-FIRST SEARCH (BACKTRACKING)
Space complexity of O(1) means that the space required by the algorithm
to process data is constant; it does not grow with the size of the data on
which the algorithm is operating.
The heuristic method, however, might not always give the best solution, but it is
guaranteed to find a good solution in a reasonable time.
HEURISTICS
On each iteration, the node n with the lowest heuristic value is expanded,
generating all its successors, and n is placed in the closed list. The algorithm
continues until a goal state is found.
INFORMED SEARCH ALGORITHMS
Greedy best-first search is a form of best-first search that expands first the node with the
lowest h(n) value—the node that appears to be closest to the goal—on the grounds that
this is likely to lead to a solution quickly.
The evaluation function is f(n) = h(n), where h(n) is the estimated cost from node n to the goal.
The greedy best-first search algorithm always selects the path that appears best at the
moment. It combines aspects of depth-first search and breadth-first search, guided by a
heuristic function. Best-first search allows us to take the advantages of both
algorithms: at each step, it is possible to choose the most promising node. The
best-first search algorithm expands the node that is closest to the goal node, where
closeness is estimated by the heuristic function.
The greedy best-first algorithm is implemented using a priority queue.
GREEDY BEST-FIRST SEARCH
Advantages:
Best first search can switch between BFS and DFS by gaining the advantages of both
the algorithms.
This algorithm is more efficient than BFS and DFS algorithms.
Disadvantages:
It can behave as an unguided depth-first search in the worst case scenario.
It can get stuck in a loop as DFS.
This algorithm is not optimal.
GREEDY BEST-FIRST SEARCH ALGORITHM
Step 1: Place the starting node into the OPEN list.
Step 2: If the OPEN list is empty, Stop and return failure.
Step 3: Remove from the OPEN list the node n that has the lowest value of h(n), and place
it in the CLOSED list.
Step 4: Expand the node n, and generate the successors of node n.
Step 5: Check each successor of node n, and find whether any node is a goal node or not. If
any successor node is goal node, then return success and terminate the search, else proceed
to Step 6.
Step 6: For each successor node, the algorithm computes the evaluation function f(n) and
then checks whether the node is already in either the OPEN or the CLOSED list. If the
node is in neither list, add it to the OPEN list.
Step 7: Return to Step 2.
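The seven steps above can be sketched in Python with a heap as the OPEN list (illustrative only; the example graph and heuristic values are made up):

```python
import heapq

def greedy_best_first_search(start, goal, successors, h):
    """Follows the OPEN/CLOSED steps above: always expand the open
    node with the lowest heuristic value h(n)."""
    if start == goal:
        return [start]
    open_list = [(h(start), start, [start])]          # Step 1
    closed = set()
    while open_list:                                  # Step 2
        _, state, path = heapq.heappop(open_list)     # Step 3
        closed.add(state)
        for child in successors(state):               # Step 4
            if child == goal:                         # Step 5
                return path + [child]
            if child not in closed and all(child != s for _, s, _ in open_list):
                heapq.heappush(open_list, (h(child), child, path + [child]))  # Step 6
    return None                                       # failure

# Hypothetical graph and heuristic values:
graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['G'], 'C': [], 'G': []}
h = {'S': 5, 'A': 3, 'B': 2, 'C': 4, 'G': 0}
print(greedy_best_first_search('S', 'G', lambda s: graph[s], h.get))  # ['S', 'B', 'G']
```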
GREEDY BEST-FIRST SEARCH
At each iteration, each node is expanded using the evaluation function
f(n) = h(n), which is given in the table.

Node  h(n)
A     12
B     4
C     7
D     3
E     8
F     2
H     4
I     9
S     13
G     0
GREEDY BEST-FIRST SEARCH
In this search example, we use two lists, the OPEN and CLOSED lists. The
following are the iterations for traversing the above example.
Expand the nodes of S and put them in the CLOSED list:
Initialization: Open [A, B], Closed [S]
Iteration 1: Open [A], Closed [S, B]
Iteration 2: Open [E, F, A], Closed [S, B]
: Open [E, A], Closed [S, B, F]
Iteration 3: Open [I, G, E, A], Closed [S, B, F]
: Open [I, E, A], Closed [S, B, F, G]
Hence the final solution path is: S -> B -> F -> G
GREEDY BEST-FIRST SEARCH
Time Complexity: the worst-case time complexity of greedy best-first search is
O(bᵐ).
Space Complexity: the worst-case space complexity of greedy best-first search is
O(bᵐ), where m is the maximum depth of the search space.
Complete: greedy best-first search is incomplete, even if the given state space is
finite.
The values of h_SLD (the straight-line distance to the goal) cannot be computed from
the problem description itself (that is, the ACTIONS and RESULT functions). It takes a
certain amount of world knowledge to know those values.
The A* search algorithm finds the shortest path through the search space using a
heuristic function.
This search algorithm expands a smaller search tree and provides an optimal result
faster. A* is similar to UCS, except that it uses g(n) + h(n) instead of
g(n).
A* SEARCH ALGORITHM
In the A* search algorithm, we use the search heuristic as well as the cost to reach the
node. Hence we can combine both costs as follows, and this sum is called the fitness
number:
f(n) = g(n) + h(n)
At each point in the search space, only the node with the lowest value of f(n) is
expanded, and the algorithm terminates when the goal node is found.
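A minimal A* sketch; the example graph uses edge costs and h-values inferred from the worked example in this section (S→A=1, S→G=10, A→C=1, A→B=2, C→G=4, C→D=3), which should be treated as illustrative:

```python
import heapq

def a_star_search(start, goal, successors, h):
    """Expand the frontier node with the lowest f(n) = g(n) + h(n).
    With an admissible heuristic and a late goal test (at expansion,
    not at generation), the first goal popped is cost-optimal."""
    frontier = [(h(start), 0, start, [start])]        # (f, g, state, path)
    best_g = {start: 0}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        for child, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(child, float('inf')):  # better path to child
                best_g[child] = g2
                heapq.heappush(frontier, (g2 + h(child), g2, child, path + [child]))
    return None, float('inf')

graph = {'S': [('A', 1), ('G', 10)],
         'A': [('C', 1), ('B', 2)],
         'C': [('G', 4), ('D', 3)],
         'B': [], 'D': [], 'G': []}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 6, 'G': 0}
print(a_star_search('S', 'G', lambda s: graph[s], h.get))  # (['S', 'A', 'C', 'G'], 6)
```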
A* SEARCH ALGORITHM
Advantages:
The A* search algorithm performs better than the other search algorithms discussed.
A* search algorithm is optimal and complete.
This algorithm can solve very complex problems.
Disadvantages:
It does not always produce the shortest path, as it is mostly based on heuristics and
approximation.
A* search algorithm has some complexity issues.
The main drawback of A* is memory requirement as it keeps all generated nodes in
the memory, so it is not practical for various large-scale problems.
A* SEARCH ALGORITHM
Solution:
Initialization: {(S, 5)}
Iteration 1: {(S->A, 4), (S->G, 10)}
Iteration 2: {(S->A->C, 4), (S->A->B, 7), (S->G, 10)}
Iteration 3: {(S->A->C->G, 6), (S->A->C->D, 11), (S->A->B, 7), (S->G, 10)}
Iteration 4 gives the final result: S->A->C->G provides the optimal path with cost 6.
A* SEARCH ALGORITHM
A* returns the first path found and does not search all the remaining paths.
The efficiency of the A* algorithm depends on the quality of the heuristic.
The A* algorithm expands all nodes that satisfy the condition f(n) < C*.
Whether A* is cost-optimal depends on certain properties of the heuristic. A key
property is admissibility: an admissible heuristic is one that never overestimates the
cost to reach a goal. (An admissible heuristic is therefore optimistic.) With an
admissible heuristic, A* is cost-optimal.
A slightly stronger property is called consistency. A heuristic h(n) is consistent if, for
every node n and every successor n′ of n generated by an action a:
h(n) ≤ c(n, a, n′) + h(n′)
SEARCH CONTOURS
A useful way to visualize a search is to draw contours in the state space, just like the
contours in a topographic map.
Inside the contour labeled 400, all nodes have f(n) = g(n) + h(n) ≤ 400, and so on.
Then, because A* expands the frontier node of
lowest f-cost, we can see that an A* search
fans out from the start node, adding nodes
in concentric bands of increasing f-cost.
Map of Romania showing contours at f=380,
f=400, and f=420 with Arad as the start
state. Nodes inside a given contour have
costs less than or equal to the contour
value.
MONOTONIC
Monotonic: the path cost always increases as you go along a path, because action costs
are always positive.
We say "strictly monotonic" for costs that always increase, and "monotonic" for costs
that never decrease but might remain the same.
If C* is the cost of the optimal solution path, then we can say the following:
A* expands all nodes that can be reached from the initial state on a path where
every node on the path has f(n) < C*. We say these are surely expanded nodes.
A* might then expand some of the nodes right on the "goal contour" (where
f(n) = C*) before selecting a goal node.
A* expands no nodes with f(n) > C*.
A* with a consistent heuristic is optimally efficient in the sense that any algorithm that
extends search paths from the initial state, and uses the same heuristic information,
must expand all nodes that are surely expanded by A* (because any one of them could
have been part of an optimal solution).
A* SEARCH ALGORITHM
A* is efficient because it prunes away search tree nodes that are not necessary for
finding an optimal solution.
That A* search is complete, cost-optimal, and optimally efficient among all such algorithms is
rather satisfying. Unfortunately, it does not mean that A* is the answer to all our
searching needs. The catch is that for many problems, the number of nodes expanded
can be exponential in the length of the solution.
SATISFICING SEARCH
Weighted A* search weights the heuristic value more heavily, giving us the evaluation
function f(n) = g(n) + W × h(n), for some W > 1.
Weighted A* search finds a solution that is slightly costlier, but the search time is much
faster.
Weighted search focuses the contour of reached states towards a goal. That means
that fewer states are explored, but if the optimal path ever strays outside of the
weighted search’s contour (as it does in this case), then the optimal path will not be
found.
In general, if the optimal solution costs C*, a weighted A* search will find a solution
that costs somewhere between C* and W × C*; but in practice we usually get results
much closer to C* than to W × C*.
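Weighted A* differs from A* only in the evaluation function; a minimal sketch (W = 2 and the example graph, reused from the A* worked example, are illustrative assumptions):

```python
import heapq

def weighted_a_star(start, goal, successors, h, W=2.0):
    """A* with the heuristic inflated by W > 1: f(n) = g(n) + W*h(n).
    Explores fewer states; the solution cost is at most W times C*."""
    frontier = [(W * h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        for child, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(child, float('inf')):
                best_g[child] = g2
                heapq.heappush(frontier, (g2 + W * h(child), g2, child, path + [child]))
    return None, float('inf')

graph = {'S': [('A', 1), ('G', 10)],
         'A': [('C', 1), ('B', 2)],
         'C': [('G', 4), ('D', 3)],
         'B': [], 'D': [], 'G': []}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 6, 'G': 0}
path, cost = weighted_a_star('S', 'G', lambda s: graph[s], h.get, W=2.0)
print(path, cost)   # on this small example W=2 still finds the optimal path
```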
WEIGHTED A*
There are a variety of suboptimal search algorithms, which can be characterized by the
criteria for what counts as “good enough.”
In bounded suboptimal search, we look for a solution that is guaranteed to be within a
constant factor W of the optimal cost. Weighted A* provides this guarantee.
In bounded-cost search, we look for a solution whose cost is less than some constant C. And
in unbounded-cost search, we accept a solution of any cost, as long as we can find it quickly.
An example of an unbounded-cost search algorithm is Speedy search, which is a version of
greedy best-first search that uses as a heuristic the estimated number of actions required to
reach a goal, regardless of the cost of those actions. Thus, for problems where all actions
have the same cost it is the same as greedy best-first search, but when actions have different
costs, it tends to lead the search to find a solution quickly, even if it might have a high cost.
MEMORY-BOUNDED SEARCH
For many problems (such as exploring a grid), this duplication is not a concern,
because the size of the frontier is much smaller than that of the reached set, so
duplicating the states in the frontier requires a comparatively trivial amount of memory.
MEMORY-BOUNDED SEARCH
Some implementations keep a state in only one of the two places, saving a bit of space
at the cost of complicating (and perhaps slowing down) the algorithm.
Another possibility is to remove states from reached when we can prove that they are
no longer needed.
SEPARATION PROPERTY
Some problems can use the separation property, along with the prohibition of U-turn
actions, to ensure that all actions either move outwards from the frontier or onto
another frontier state. In that case, we need only check the frontier for redundant
paths, and we can eliminate the reached table.
For other problems, we can keep reference counts of the number of times a state has
been reached, and remove a state from the reached table when there are no more
ways to reach it.
BEAM SEARCH
Beam search limits the size of the frontier. The easiest approach is to keep only the k
nodes with the best f-scores, discarding any other expanded nodes.
This makes the search incomplete and suboptimal, but we can choose to make good
use of available memory, and the algorithm executes fast because it expands fewer
nodes.
For many problems it can find good near-optimal solutions.
An alternative version of beam search doesn’t keep a strict limit on the size of the
frontier but instead keeps every node whose f-score is within 𝛿 of the best f-score.
That way, when there are a few strong-scoring nodes only a few will be kept, but if
there are no strong nodes then more will be kept until a strong one emerges.
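A minimal beam-search sketch of the strict-limit variant, assuming f(n) = h(n) and a beam width k (the graph and heuristic values are made up):

```python
import heapq

def beam_search(start, goal, successors, f, k=2):
    """Level-by-level search that keeps only the k lowest-f nodes on
    the frontier, so memory is bounded but the search is incomplete."""
    frontier = [(f(start), start, [start])]
    while frontier:
        candidates = []
        for _, state, path in frontier:
            if state == goal:
                return path
            for child in successors(state):
                if child not in path:
                    candidates.append((f(child), child, path + [child]))
        frontier = heapq.nsmallest(k, candidates)   # discard the rest
    return None

# Hypothetical graph with f(n) = h(n):
graph = {'S': ['A', 'B', 'C'], 'A': ['G'], 'B': [], 'C': [], 'G': []}
h = {'S': 5, 'A': 1, 'B': 2, 'C': 3, 'G': 0}
print(beam_search('S', 'G', lambda s: graph[s], h.get, k=2))  # ['S', 'A', 'G']
```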
ITERATIVE-DEEPENING A*
Recursive best-first search attempts to mimic the operation of standard best-first search,
but using only linear space.
RBFS resembles a recursive depth-first search, but rather than continuing indefinitely
down the current path, it uses the f_limit variable to keep track of the f-value of the best
alternative path available from any ancestor of the current node. If the current node
exceeds this limit, the recursion unwinds back to the alternative path.
As the recursion unwinds, RBFS replaces the f-value of each node along the path with a
backed-up value (the best f-value of its children). In this way, RBFS remembers the
f-value of the best leaf in the forgotten subtree and can therefore decide whether it is
worth re-expanding the subtree at some later time.
Stages in an RBFS
search for the
shortest route to
Bucharest.
The f-limit value for
each recursive call is
shown on top of
each current node,
and every node is
labeled with its f-
cost.
RBFS is somewhat
more efficient than
IDA*, but still suffers
from excessive node
regeneration.
RECURSIVE BEST-FIRST SEARCH (RBFS)
SMA* proceeds just like A*, expanding the best leaf until memory is full.
At this point, it cannot add a new node to the search tree without dropping an old
one.
SMA* always drops the worst leaf node (the one with the highest f-value). Like RBFS,
SMA* then backs up the value of the forgotten node to its parent. In this way, the
ancestor of a forgotten subtree knows the quality of the best path in that subtree.
With this information, SMA* regenerates the subtree only when all other paths have
been shown to look worse than the path it has forgotten.
What if all the leaf nodes have the same f-value? To avoid selecting the same node for
deletion and expansion, SMA* expands the newest best leaf and deletes the oldest
worst leaf.
MEMORY-BOUNDED A* (MA*) & SIMPLIFIED MA* (SMA*)
SMA* is complete if there is any reachable solution, that is, if d, the depth of the
shallowest goal node, is less than the memory size (expressed in nodes).
SMA* is optimal if any optimal solution is reachable; otherwise, it returns the best
reachable solution.
In practical terms, SMA* is a fairly robust choice for finding optimal solutions,
particularly when the state space is a graph, action costs are not uniform, and node
generation is expensive compared to the overhead of maintaining the frontier and the
reached set.
The extra time required for repeated regeneration of the same nodes means that
problems that would be solvable by A* with unlimited memory become intractable for
SMA*.
BIDIRECTIONAL A* SEARCH
One way to characterize the quality of a heuristic is the effective branching factor b*. If
the total number of nodes generated by A* for a particular problem is N and the
solution depth is d, then b* is the branching factor that a uniform tree of depth d would
have to have in order to contain N + 1 nodes:
N + 1 = 1 + b* + (b*)² + ... + (b*)ᵈ
Experimental measurements of 𝑏 ∗ on a small set of problems can provide a good guide
to the heuristic’s overall usefulness. A well-designed heuristic allows fairly large
problems to be solved at reasonable computational cost.
A better way to characterize the effect of A* pruning with a given heuristic is that it
reduces the effective depth by a constant k_h compared to the true depth. This means
that the total search cost is O(b^(d−k_h)), compared to O(bᵈ) for an uninformed search.
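b* can be computed numerically by solving N + 1 = 1 + b* + (b*)² + ... + (b*)ᵈ for b*; a bisection sketch (the sample values d = 5, N = 52, giving b* ≈ 1.92, are illustrative):

```python
def effective_branching_factor(N, d, tol=1e-6):
    """Solve N + 1 = 1 + b + b^2 + ... + b^d for b by bisection;
    the root is the effective branching factor b*."""
    def tree_size(b):
        return sum(b ** i for i in range(d + 1))
    # tree_size is increasing in b; tree_size(1) = d+1 and
    # tree_size(N+1) > N+1, so the root lies in [1, N+1].
    lo, hi = 1.0, float(N + 1)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tree_size(mid) < N + 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(effective_branching_factor(52, 5), 2))   # 1.92
```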
HEURISTIC ACCURACY
From the definitions of the two heuristics, for any node n, h₂(n) ≥ h₁(n). When h₂ is
always at least as large as h₁, we say that h₂ dominates h₁.
Domination translates directly into efficiency: A* using h₂ will never expand more nodes
than A* using h₁ (except possibly in the case of breaking ties unluckily).
RELAXED PROBLEMS
A problem with fewer restrictions on the actions is called a relaxed problem. The
state-space graph of the relaxed problem is a supergraph of the original state space
because the removal of restrictions creates added edges in the graph.
Because the relaxed problem adds edges to the state-space graph, any optimal solution
in the original problem is, by definition, also a solution in the relaxed problem; but the
relaxed problem may have better solutions if the added edges provide shortcuts.
The cost of an optimal solution to a relaxed problem is an admissible heuristic for the
original problem.
Because the derived heuristic is an exact cost for the relaxed problem, it must obey the
triangle inequality and is therefore consistent.
SUBPROBLEMS
Admissible heuristics can also be derived from the solution cost of a subproblem of a
given problem.
The idea of pattern databases is to store these exact solution costs for every possible
subproblem instance.
The sum of the two costs is still a lower bound on the cost of solving the entire
problem. This is the idea behind disjoint pattern databases.
REFERENCES