Inar - Capítulo 03f - Search Algorithms

This document describes the fundamental concepts behind problem-solving agents and search. It explains that these agents use state-and-transition representations to plan paths toward a goal state. It also describes the key components of a well-defined problem, such as the initial state, the actions, the transition model, and the goal states. Finally, it introduces concepts such as asymptotic analysis and abstraction, which are useful for evaluating the performance of search algorithms.


SOLVING PROBLEMS BY SEARCHING

(ARTIFICIAL INTELLIGENCE)
INTRODUCTION

 When the correct action to take is not immediately obvious, an agent may need to
plan ahead: to consider a sequence of actions that forms a path to a goal state.

 Such an agent is called a problem-solving agent, and the computational process
it undertakes is called search.

 Problem-solving agents use atomic representations, in which states are
considered as wholes, with no internal structure visible to the problem-solving
algorithm.
REPRESENTATION OF STATES AND TRANSITIONS

 Atomic: a state (such as B or C) is a black box with no internal structure.
 Factored: a state consists of a vector of attribute values (Boolean values,
real numbers, or values from a fixed set of symbols).
 Structured: a state includes objects, each with its own attributes and
relations to other objects.
 A more expressive representation can capture, at least as concisely,
everything a less expressive one can capture.
 Example: the rules of chess can be described in one or two pages in a
structured representation (first-order logic), in hundreds of pages in a
factored representation (propositional logic), and in vastly more in an atomic
representation (finite automata).
 A representation involves mapping concepts to locations in physical memory
(whether in a computer or in a brain).
 Local representation: the mapping between concepts and memory locations is
one-to-one.
 Distributed representation: the representation of a concept is spread over
many memory locations, and each memory location is used as part of the
representation of multiple different concepts.
 Distributed representations are more robust against noise and information
loss.

 In a local representation, the mapping from a concept to its memory location
is arbitrary, so even a small transmission error can cause confusion. With a
distributed representation, each concept can be represented as a point in a
multidimensional space, and a small distortion has little effect: it merely
moves the point slightly, to a nearby point with a similar meaning.
PROBLEM-SOLVING AGENT
• A problem-solving agent is an agent that follows a goal-based model: it
decides what to do by finding sequences of actions that lead to desirable
states.

• This kind of agent adopts a goal that must be achieved or satisfied.
Adopting a goal helps the agent organize its behavior by limiting the actions
it will attempt.

• The first step in solving a problem is goal formulation, which is based on
the current situation and the agent's performance measure at that moment.
• Next comes problem formulation, the process of deciding which actions and
states to consider, given the goal established beforehand.

• These agents operate in an uninformed setting (there is no information about
the problem beyond the problem itself). If an agent faces several options for
solving the problem, it does not know which is best, because it lacks
knowledge of the states that would result from each choice, and so it may
fail. The best it can do in such a situation is pick one of the actions at
random.

• In general, an agent with several immediate options of unknown value can
decide what to do by first examining the different possible sequences of
actions that lead to states of known value, and then choosing the best option.
• The process of looking for such a sequence of actions is called search.

• A search algorithm takes a problem as input and returns a solution in the
form of a sequence of actions.

• The execution phase occurs when the actions recommended by the solution are
carried out.

• A solution to a problem is a fixed sequence of actions, so the agent pays no
attention to its percepts. An agent that ignores its percepts and simply
executes the actions (and must therefore be certain of what will happen) is
said to operate open-loop, because it breaks the loop between the agent and
its environment.
WELL-DEFINED PROBLEMS
 A problem can be defined formally by five components:
 Initial state
 Actions
 Transition model
 Goal states
 Action cost

 The initial state is the state the agent starts in.

 Actions: the actions the agent can apply.

 The transition model describes the result of performing an action.

• The state space of a problem is the set of states reachable from the initial
state. The state space forms a graph in which the nodes are states and the
arcs between nodes are actions.

• Goal states: a set of one or more states. Sometimes there is a single goal
state, sometimes a small set of alternative goal states, and sometimes the
goal is defined by a property that applies to many states (potentially an
infinite number).

• The successor function describes the possible actions available to the
agent. For example, given a state x, the successor function returns a set of
ordered pairs of the form <action, successor>, where action is a legal action
and successor is the state that can be reached from x by applying that action.
• The action cost gives the numeric cost of applying an action to go from
state s to state s'.

• A path is a sequence of states connected by a sequence of actions in a state
space.

• The goal test determines whether a state is a goal state. Sometimes there is
an explicit set of goal states, and the test simply checks whether the state
is in that set. At other times the goal is specified abstractly rather than as
an explicitly enumerated set of states.

• The path cost is a function that assigns a numeric cost to each path; the
agent chooses it to reflect its own performance measure.
• A solution to a problem is a path from the initial state to a goal state.
Solution quality is measured by the path cost function, and an optimal
solution is the one with the lowest path cost among the available solutions.

• Abstraction is the process of removing unnecessary detail from a
representation when formulating a problem.

• An abstraction is valid if any abstract solution can be expanded into a
solution in the more detailed world.

• An abstraction is useful if carrying out each of the actions in the solution
is easier than solving the original problem. Choosing a good abstraction
involves removing as much detail as possible while keeping the abstraction
valid and ensuring that the abstract actions are easy to carry out.
ASYMPTOTIC ANALYSIS

 The asymptotic behavior of a function f(n) refers to how f(n) grows as n
grows.
 We usually ignore small values of n, since we are generally interested in
estimating how slow the program will be on large inputs (as n tends to
infinity).
 A good rule of thumb is that the slower the asymptotic growth rate, the
better the algorithm, although this is not always true in practice.
 For example, a linear algorithm f(n) = d·n + k is always asymptotically
better than a quadratic one, f(n) = c·n² + q.
Why analyze performance?
 There are many other important concerns, such as usability, modularity,
security, maintainability, and so on.
 So why worry about performance? The answer is simple: you can have all of the
qualities above only if you have performance. Performance is, in effect, the
currency with which those other qualities are bought.
 Given two algorithms for a task, how do we find out which is better? A naive
approach is to implement both algorithms, run the two programs on your
computer for different inputs, and see which takes less time. There are many
problems with this approach to algorithm analysis:
 For some inputs, the first algorithm may perform better; for other inputs,
the second may.
 It might also be that the first algorithm performs better on one machine,
while the second performs better on another machine for some other inputs.
 Asymptotic analysis is the big idea that handles these problems. In
asymptotic analysis, we evaluate the performance of an algorithm in terms of
input size (we do not measure actual running time): we compute how the time
(or space) taken by an algorithm grows with the input size.
Example:
 Consider the problem of searching for a given element in a sorted array.
 One way to search is linear search (linear order of growth); the other is
binary search (logarithmic order of growth).
 To see how asymptotic analysis resolves the problems mentioned above,
suppose we run linear search on a fast computer and binary search on a slow
computer.
 For small values of the input array size n, the fast computer may take less
time. But beyond a certain input array size, binary search will definitely
start taking less time than linear search, even though it runs on the slower
machine.
 The reason is that binary search's order of growth is logarithmic in the
input size, while linear search's is linear. Machine-dependent constants can
therefore always be ignored beyond certain input sizes.
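The fast-machine/slow-machine argument can be checked numerically. The sketch below counts worst-case comparisons; the 1000x slowdown factor for the binary-search machine is an illustrative assumption, not from the text.

```python
import math

def linear_search_steps(n):
    """Worst-case comparisons of linear search on n elements."""
    return n

def binary_search_steps(n):
    """Worst-case comparisons of binary search: about log2(n) + 1."""
    return math.floor(math.log2(n)) + 1

# Fast machine runs linear search; slow machine (assumed 1000x slower
# per step) runs binary search.
for n in (10, 1000, 10**6, 10**9):
    fast = linear_search_steps(n)         # cost units on the fast machine
    slow = 1000 * binary_search_steps(n)  # 1000x penalty per step
    print(n, fast, slow, "binary wins" if slow < fast else "linear wins")
```

For n = 10 and n = 1000 the fast machine still wins, but by n = 10⁶ binary search on the slow machine is already far cheaper, illustrating why machine constants stop mattering past some input size.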
 The Invariance Principle implies that there is no particular unit in which
the theoretical efficiency of an algorithm should be expressed. Instead, we
express only the time required by the algorithm, up to a multiplicative
constant.
 By this principle, all implementations of the same algorithm share the same
characteristics, even though the multiplicative constant may change from one
implementation to another.
 Asymptotic analysis is not perfect, but it is the best approach available
for analyzing algorithms.
 For example, suppose there are two sorting algorithms that take 1000·n·log n
and 2·n·log n time, respectively, on a machine. Both are asymptotically the
same (the order of growth is n log n), so asymptotic analysis cannot tell us
which is better, since it ignores constants. Moreover, asymptotic analysis
always speaks of input sizes larger than some constant value; such large
inputs may never be given to your software, and an algorithm that is
asymptotically slower may always perform better in your particular situation.
 It is therefore possible, in some unusual cases, that you end up choosing an
algorithm that is asymptotically slower but faster for your software.
Notations
 Asymptotic notations describe the limiting behavior of a function as its
argument tends toward a particular value (often infinity), usually in terms
of simpler functions. In computational complexity theory, Big O notation is
commonly used to classify algorithms by how they respond (for example, in
their processing-time or working-space requirements) to changes in input
size.
 A family of functions that share the same asymptotic behavior is called an
order of complexity.
 In mathematics, Landau notation (also called "little o" and "big O") is a
notation for the asymptotic comparison of functions; it makes it possible to
state an asymptotic lower bound, an asymptotic upper bound, and an
asymptotically tight bound.
 The analysis itself consists of comparing the function describing the
algorithm's behavior with "reference" functions, such as:
 O(1): Constant complexity. The instructions execute a fixed number of times,
independent of the input size.
 O(log n): Logarithmic complexity. It usually appears in certain algorithms
with non-structural iteration or recursion, e.g., binary search.
 O(n): Linear complexity. A good and very common complexity. It appears when
evaluating simple loops whose inner instructions have constant complexity.
 O(n log n): Quasi-linear complexity. Found in divide-and-conquer algorithms
such as quicksort, and considered a good complexity. If n doubles, the
running time is slightly more than double.
 O(n²): Quadratic complexity. Appears in doubly nested loops. If n doubles,
the running time quadruples.
 O(n³): Cubic complexity. Usually arises in triply nested loops. If n
doubles, the running time is multiplied by eight. For large n it starts to
grow dramatically.
 O(nᵃ): Polynomial complexity (a > 3). As a grows, the program's complexity
becomes quite bad.
 O(cⁿ): Exponential complexity. Not useful in practice because of the
extremely long running time. It arises in recursive subprograms that contain
two or more internal recursive calls.
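The doubling claims above (quasi-linear slightly more than 2x, quadratic 4x, cubic 8x) can be checked directly; a quick sketch evaluating each reference function at n and 2n:

```python
import math

# Reference growth functions from the complexity classes above.
functions = {
    "n log n": lambda n: n * math.log2(n),
    "n^2":     lambda n: n ** 2,
    "n^3":     lambda n: n ** 3,
}

# Ratio f(2n)/f(n): how much the "running time" grows when n doubles.
n = 1000
for name, f in functions.items():
    print(name, f(2 * n) / f(n))
```

For n = 1000 this prints roughly 2.2 for n log n, exactly 4 for n², and exactly 8 for n³, matching the statements in the list.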
ANALYSIS OF ALGORITHM EFFICIENCY

• Space efficiency: the amount of memory the algorithm requires.

• Time efficiency: how fast the algorithm runs.

• Input size: an appropriate metric must be chosen to quantify the algorithm's
input parameters, since the input size influences how many operations the
algorithm performs. For example, the size of an input number n can be
measured by the number of bits b in its binary representation:
b = ⌊log₂ n⌋ + 1
• Units of measure for running time:
1. The number of times each of the algorithm's operations is executed.
2. The number of times the basic operations are executed (the operations that
contribute most to the running time).
COMPUTING THE RUNNING TIME
Let:
c_op = execution time of a basic operation on a given computer
C(n) = the number of times the operation is executed by the algorithm
Then T(n) ≈ c_op · C(n)

How much faster would the algorithm run on a machine that is 10 times faster
than the current computer? 10 times.
If C(n) = ½n(n−1), how long will the algorithm take if the amount of input
data is doubled?
C(n) = ½n(n−1) = ½n² − ½n ≈ ½n²
and
T(2n) / T(n) ≈ c_op·C(2n) / (c_op·C(n)) ≈ ½(2n)² / (½n²) = 4
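The ratio derived above can be verified numerically; a quick sketch:

```python
def C(n):
    """Basic-operation count C(n) = n(n-1)/2."""
    return n * (n - 1) / 2

# The ratio C(2n)/C(n) approaches 4 as n grows, so doubling the input
# roughly quadruples the running time.
for n in (10, 100, 10_000):
    print(n, C(2 * n) / C(n))
```

The exact ratio is slightly above 4 for small n (4.22 at n = 10) and converges to 4, confirming that the lower-order term −½n can be ignored asymptotically.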
Approximate values of functions important in the analysis of algorithms:

n      log₂n   n      n·log₂n   n²      n³      2ⁿ         n!
10     3.3     10¹    3.3·10¹   10²     10³     10³        3.6·10⁶
10²    6.6     10²    6.6·10²   10⁴     10⁶     1.3·10³⁰   9.3·10¹⁵⁷
10³    10      10³    1.0·10⁴   10⁶     10⁹
10⁴    13      10⁴    1.3·10⁵   10⁸     10¹²
10⁵    17      10⁵    1.7·10⁶   10¹⁰    10¹⁵
10⁶    20      10⁶    2.0·10⁷   10¹²    10¹⁸

Algorithms that require an exponential number of operations are practical
only for solving problems of very small size.
EFFICIENCY: BEST CASE, WORST CASE, AND AVERAGE CASE
Algorithm SequentialSearch(A[0..n-1], K)
// Searches for a given value in a given array by sequential search
// Input: An array A[0..n-1] and a search key K
// Output: Returns the index of the first element of A that matches K,
// or -1 if there are no matching elements
i <- 0
while i < n and A[i] ≠ K do
    i <- i + 1
if i < n return i
else return -1

C_worst(n) = n
C_best(n) = 1
C_avg(n) = p(n + 1)/2 + n(1 − p), where p is the probability of a successful
search (0 ≤ p ≤ 1)
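The pseudocode above translates directly into runnable Python, and the best and worst cases can be seen on a small array:

```python
def sequential_search(A, K):
    """Return the index of the first element of A equal to K, or -1."""
    i = 0
    n = len(A)
    while i < n and A[i] != K:
        i += 1
    return i if i < n else -1

A = [7, 3, 9, 3]
print(sequential_search(A, 9))   # 2  (found after three comparisons)
print(sequential_search(A, 7))   # 0  (best case: one comparison)
print(sequential_search(A, 5))   # -1 (worst case: all n elements examined)
```

An unsuccessful search always makes n comparisons, which is where C_worst(n) = n comes from; a key sitting in the first cell gives C_best(n) = 1.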
ASYMPTOTIC NOTATION AND EFFICIENCY CLASSES

• t(n) = running time (indicated by the basic-operation count C(n))

• g(n) = a simple function

• Informally:
O(g(n)) is the set of all functions with an order of growth less than or
equal to that of g(n).

Examples:
n ∈ O(n²), 100n + 5 ∈ O(n²), ½n(n−1) ∈ O(n²)
• Ω(g(n)) is the set of all functions with an order of growth greater than or
equal to that of g(n).

• Examples:
n³ ∈ Ω(n²), ½n(n−1) ∈ Ω(n²), but 100n + 5 ∉ Ω(n²)

• Θ(g(n)) is the set of all functions that have the same order of growth as
g(n).

• Example:
an² + bn + c with a > 0 is in Θ(n²)
COMPLEXITY AND BIG-OH

 T(n) is the complexity of an algorithm.

 O(f(n)) is defined by:
T(n) is O(f(n)) if T(n) ≤ k·f(n) for some constant k, for all n > n₀.
 This means that the growth in time (or space) complexity is bounded above by
some constant times f(n).
 Example:
 If T(n) = 2n + 2, then T(n) is O(n), using k = 3 and n₀ = 2.
 Use O() to describe the long-term behavior of algorithms:
 O(n) is always better than O(n²) as n approaches infinity.
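The witnesses in the example (k = 3, n₀ = 2) can be checked mechanically; a quick sketch:

```python
def T(n):
    """The example complexity T(n) = 2n + 2."""
    return 2 * n + 2

# Verify T(n) <= k*f(n) with k = 3, f(n) = n, for all n > n0 = 2
# (over a finite range; 2n + 2 <= 3n holds exactly when n >= 2).
k, n0 = 3, 2
assert all(T(n) <= k * n for n in range(n0 + 1, 10_000))
print("2n + 2 <= 3n holds for all n >", n0)
```

Note that the bound fails for n = 1 (T(1) = 4 > 3), which is exactly why the definition only requires the inequality beyond some threshold n₀.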
COMPLEXITY THEORY
• The theory of classifying problems based on how difficult they are to solve.

• A problem is assigned to the P (polynomial-time) class if the number of
steps needed to solve it is bounded by some power of the problem's size.

• A problem is assigned to the NP (nondeterministic polynomial-time) class if
it permits a nondeterministic solution and the number of steps needed to
verify the solution is bounded by some power of the problem's size.

• The class of P problems is a subset of the class of NP problems, but there
also exist problems that are not in NP.
COMPLEXITY THEORY
• If a solution to an NP problem is known, it can be reduced to a single
polynomial-time verification.

• A problem is NP-complete if an algorithm for solving it can be translated
into one for solving any other NP problem.

• Examples of NP-complete problems include the Hamiltonian cycle and traveling
salesman problems.

• Linear programming, long thought to be an NP problem, was shown to actually
be a P problem by L. Khachiyan in 1979. It is not known whether all
apparently hard NP problems are actually P problems.
(Figure: relation between P and NP)
TRAVELING SALESMAN PROBLEM

A problem in graph theory requiring the most efficient (i.e., least total
distance) Hamiltonian circuit a salesman can take through each of n cities.
No general method of solution is known, and the problem is NP-hard.
HAMILTONIAN CIRCUIT

A Hamiltonian circuit, also called a Hamiltonian cycle, Hamilton circuit, or
Hamilton cycle, is a graph cycle (i.e., a closed loop) through a graph that
visits each node exactly once (Skiena 1990, p. 196). By convention, the
trivial graph on a single node is considered to possess a Hamiltonian
circuit, but the connected graph on two nodes is not. A graph possessing a
Hamiltonian circuit is said to be a Hamiltonian graph. The Hamiltonian
circuit is named after Sir William Rowan Hamilton, who devised a puzzle in
which such a path along the polyhedron edges of a dodecahedron was sought
(the Icosian game).
ICOSIAN GAME

The Icosian game, also called the Hamiltonian game (Ball and Coxeter 1987,
p. 262), is the problem of finding a Hamiltonian circuit along the edges of a
dodecahedron, i.e., a path such that every vertex is visited a single time,
no edge is visited twice, and the ending point is the same as the starting
point. The puzzle was distributed commercially as a pegboard with holes at
the nodes of the dodecahedral graph. The Icosian game was invented in 1857 by
William Rowan Hamilton. Hamilton sold it to a London game dealer in 1859 for
25 pounds, and the game was subsequently marketed in Europe in a number of
forms (Gardner 1957).
HAMILTONIAN GRAPH

A Hamiltonian graph, also called a Hamilton graph, is a graph possessing a
Hamiltonian circuit. By convention, the singleton graph is considered
Hamiltonian, but the connected graph on two nodes is not. The numbers of
simple Hamiltonian graphs on n = 1, 2, ... nodes are 1, 0, 1, 3, 8, 48, 383,
6196, 177083, ... (Sloane's A003216).
PROBLEM TYPES

 Standardized problems (toy problems): illustrate or demonstrate various
problem-solving methods concisely, and can be used by researchers to compare
the efficiency of those methods.

 Real-world problems are those whose proposed solutions must be verified and
applied with care.
STANDARDIZED PROBLEMS (TOY PROBLEMS)

• Grid world: the environment consists of a two-dimensional rectangular array
of cells, in which agents move from cell to cell, one step at a time.
Usually the agent can move to any obstacle-free adjacent cell horizontally or
vertically, and occasionally diagonally. Cells can contain objects, which the
agent can pick up, push, or otherwise act upon. A wall or other impassable
obstacle in a cell prevents the agent from moving into that cell.
VACUUM WORLD

https://store.steampowered.com/app/1127180/Robot_Vacuum_Simulator_X/
https://store.steampowered.com/app/1766150/Robo_Vacuum_Simulator/
 INITIAL STATE: Any state can be designated as the initial state.
 ACTIONS: In the two-cell world we defined three actions: Suck, move Left, and move
Right. In a two-dimensional multi-cell world we need more movement actions. We
could add Upward and Downward, giving us four absolute movement actions, or we
could switch to egocentric actions, defined relative to the viewpoint of the agent—for
example, Forward, Backward, TurnRight, and TurnLeft.
 TRANSITION MODEL: Suck removes any dirt from the agent’s cell; Forward moves
the agent ahead one cell in the direction it is facing, unless it hits a wall,
in which case the action has no effect. Backward moves the agent in the
opposite direction, while TurnRight and TurnLeft change the direction it is
facing by 90°.
 GOAL STATES: The states in which every cell is clean.
 ACTION COST: Each action costs 1.
SOKOBAN PUZZLE

https://www.mathsisfun.com/games/sokoban.html
SLIDING-TILE PUZZLE

https://bijlmakers.com/sliding-puzzle/
 STATES: A state description specifies the location of each of the tiles.
 INITIAL STATE: Any state can be designated as the initial state. Note that a parity
property partitions the state space—any given goal can be reached from exactly half of
the possible initial states.
 ACTIONS: While in the physical world it is a tile that slides, the simplest way of
describing an action is to think of the blank space moving Left, Right, Up, or Down. If
the blank is at an edge or corner then not all actions will be applicable.
 TRANSITION MODEL: Maps a state and action to a resulting state; for example, if
we apply Left to the start state, the resulting state has the 5 and the blank switched.
 GOAL STATE: Although any state could be the goal, we typically specify a state with
the numbers in order.
 ACTION COST: Each action costs 1.
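The sliding-tile formulation above can be sketched in Python for the 3x3 (8-puzzle) case. The encoding below, a tuple of nine tiles read row by row with 0 as the blank, is one common illustrative choice, not prescribed by the text.

```python
# Sliding-tile (8-puzzle) actions and transition model. A state is a
# tuple of 9 tiles (0 = the blank); an action moves the blank.
MOVES = {"Left": -1, "Right": +1, "Up": -3, "Down": +3}

def actions(state):
    """Actions applicable in `state`: the blank must stay on the board."""
    b = state.index(0)
    acts = []
    if b % 3 > 0: acts.append("Left")
    if b % 3 < 2: acts.append("Right")
    if b >= 3:    acts.append("Up")
    if b < 6:     acts.append("Down")
    return acts

def result(state, action):
    """Transition model: swap the blank with the neighboring tile."""
    b = state.index(0)
    t = b + MOVES[action]
    s = list(state)
    s[b], s[t] = s[t], s[b]
    return tuple(s)

start = (1, 2, 3, 4, 0, 5, 6, 7, 8)     # blank in the center
print(actions(start))                    # ['Left', 'Right', 'Up', 'Down']
print(result(start, "Left"))             # (1, 2, 3, 0, 4, 5, 6, 7, 8)
```

With the blank in a corner, only two actions remain applicable, matching the remark that not all actions apply at edges or corners.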
REAL-WORLD PROBLEMS

 Route-finding problems
 Touring problems – Traveling Salesperson Problem (TSP)
 VLSI layout
 Robot navigation
 Automatic assembly sequencing
 Protein design
 Searching the Internet
ROUTE-FINDING PROBLEM
 STATES: Each state obviously includes a location (e.g., an airport) and the current
time. Furthermore, because the cost of an action (a flight segment) may depend on
previous segments, their fare bases, and their status as domestic or international, the
state must record extra information about these “historical” aspects.
 INITIAL STATE: The user’s home airport.
 ACTIONS: Take any flight from the current location, in any seat class, leaving after the
current time, leaving enough time for within-airport transfer if needed.
 TRANSITION MODEL: The state resulting from taking a flight will have the flight’s
destination as the new location and the flight’s arrival time as the new time.
 GOAL STATE: A destination city. Sometimes the goal can be more complex, such as
“arrive at the destination on a nonstop flight.”
 ACTION COST: A combination of monetary cost, waiting time, flight time, customs
and immigration procedures, seat quality, time of day, type of airplane, frequent-flyer
reward points, and so on.
SEARCH ALGORITHMS

 A search algorithm takes a problem as input and returns either a solution or
an indication of failure.

 Given the formulation of a problem, we need to be able to solve it; this can
be done through search.

 A search tree is generated from the initial state and the successor function
that defines the state space.

 The difference between a "search graph" and a "search tree" is that in a
graph a node can be reached by several paths, whereas in a tree it cannot.
 The root of a search tree is a search node corresponding to the initial
state.
• If the current node has been checked and is not the goal node, other states
must be considered; to do this, the current state is expanded by applying the
successor function to it, generating a set of new states.

• If the goal node has not been reached, it is necessary to keep expanding
and generating new states until either a solution is found or no states
remain.

• Which state is expanded next is determined by the search strategy.
• A state space can contain an infinite number of paths (even when the number
of states is finite), and the corresponding search tree then has an infinite
number of nodes.
• A node is a data structure with five components:
• State: the state in the state space.
• Parent node: the node in the search tree that generated this node.
• Action: the action applied to the parent node to generate this node.
• Path cost: the cost, g(n), of the path from the initial state to the node.
• Depth: the number of steps along the path from the initial state.

• A node is a bookkeeping data structure used to represent the search tree,
whereas a state corresponds to a configuration of the world.
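The five-component node structure can be sketched as a Python dataclass (the names are illustrative):

```python
from dataclasses import dataclass
from typing import Any, Optional

# The five node components: state, parent, action, path cost g(n), depth.
@dataclass
class Node:
    state: Any                        # state in the state space
    parent: Optional["Node"] = None   # node that generated this node
    action: Any = None                # action applied to the parent
    path_cost: float = 0.0            # g(n): cost from the initial state
    depth: int = 0                    # steps from the initial state

    def child(self, state, action, step_cost):
        """Create the node reached by applying `action` in this node."""
        return Node(state, self, action,
                    self.path_cost + step_cost, self.depth + 1)

root = Node("A")
n = root.child("B", "go-B", 4)
print(n.state, n.path_cost, n.depth)   # B 4.0 1
```

Following the parent pointers from any node back to the root recovers the path, which is how a solution is read off once a goal node is found.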

• The collection of nodes that have been generated but not yet expanded is
called the frontier (fringe), and each element of the frontier is called a
leaf node.
• A collection of nodes is implemented using a queue. The operations that can
be performed on a queue are:
• Make_Queue(elem1, elem2, ...): create the queue with the given elements.
• Is_Empty(queue): return true if there are no elements in the queue.
• First(queue): return the first element of the queue.
• Remove_First(queue): return First(queue) and remove it from the queue.
• Insert(element, queue): insert an element into the queue and return the
resulting queue.
• Insert_All(elements, queue): insert the set of elements into the queue and
return the resulting queue.

The result of a problem-solving algorithm can be failure or a solution; some
algorithms can get stuck in an infinite loop and never return a solution.
• A search algorithm can be evaluated along four dimensions:
• Completeness: is the algorithm guaranteed to find a solution when one
exists?
• Optimality: does the strategy find the optimal solution?
• Time complexity: how long does it take to find a solution?
• Space complexity: how much memory is needed to perform the search?

• Complexity is expressed in terms of three quantities:
• b, the branching factor, or maximum number of successors of any node
• d, the depth of the shallowest goal node
• m, the maximum length of any path in the state space.
• To be complete, a search algorithm must be systematic in the way it
explores an infinite state space, making sure it can eventually reach any
state that is connected to the initial state.

• For example, on an infinite grid, one kind of systematic search is a spiral
path that covers all cells, from those closest to the origin outward to those
farther away. Unfortunately, in an infinite state space with no solution, the
algorithm would keep searching forever, because it cannot know whether the
next state will be the goal state.

• A repeated state occurs when a state appears again in the search tree; it
can be generated by a cycle (also called a loopy path).
• Note that even with a finite number of states, the search tree can be
infinite, because there is no limit to the number of times the agent can
traverse a cycle.

• A cycle is a special case of a redundant path. A redundant path may not be
the best path, and should not be considered when searching for optimal paths.

• Three kinds of queues are used in search algorithms:
• Priority queue: pops first the node with the minimum cost according to some
evaluation function f; used in best-first search.
• FIFO queue (first in, first out): pops first the node that was added to the
queue earliest; used in breadth-first search.
• LIFO queue (last in, first out, also called a stack): pops first the most
recently added node; used in depth-first search.
• To judge the effectiveness of a search algorithm we can consider the search
cost, which typically depends on the time complexity but can also include a
term for memory usage.

• The total cost combines the search cost and the path cost of the solution
found.
TYPES OF SEARCH ALGORITHMS

 Based on the search problems we can classify the search algorithms into:
 Uninformed search (Blind search) algorithms.
 Informed search (Heuristic search) algorithms.
Search Algorithms

Uninformed / Blind:
Breadth First Search (BFS)
Uniform Cost Search
Depth First Search (DFS)
Depth Limited Search
Iterative Deepening DFS
Bidirectional Search

Informed:
Best First Search
A* Search

UNINFORMED / BLIND ALGORITHMS

 Does not contain any domain knowledge such as closeness or the location of the goal.

 It operates in a brute-force way as it only includes information about how to traverse


the tree and how to identify leaf and goal nodes.

 Uninformed search applies a way in which the search tree is searched without any
information about the search space beyond the initial state, the operators, and the
goal test, so it is also called blind search. It examines each node of the tree until it
achieves the goal node.
BREADTH FIRST SEARCH (BFS)

 Breadth-first search is the most common search strategy for traversing a tree
or graph. This algorithm searches breadthwise in a tree or graph, so it is
called breadth-first search.
 BFS algorithm starts searching from the root node of the tree and expands all
successor node at the current level before moving to nodes of next level.
 The breadth-first search algorithm is an example of a general-graph search
algorithm.
 Breadth-first search implemented using FIFO queue data structure.
BREADTH FIRST SEARCH (BFS)

 Advantages:
 BFS will provide a solution if any solution exists.
 If there is more than one solution for a given problem, then BFS will
provide the minimal solution, i.e., the one requiring the least number of steps.

 Disadvantages:
 It requires lots of memory since each level of the tree must be saved into
memory to expand the next level.
 BFS needs lots of time if the solution is far away from the root node.
BREADTH FIRST SEARCH (BFS)

 However, we can get additional efficiency with a couple of tricks. A first-in-first-


out queue will be faster than a priority queue, and will give us the correct
order of nodes: new nodes (which are always deeper than their parents) go to
the back of the queue, and old nodes, which are shallower than the new
nodes, get expanded first. In addition, reached can be a set of states rather
than a mapping from states to nodes, because once we’ve reached a state,
we can never find a better path to the state.

 That also means we can do an early goal test, checking whether a node is a
solution as soon as it is generated, rather than the late goal test that best-first
search uses, waiting until a node is popped off the queue.
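The scheme above (a FIFO frontier, reached kept as a set of states, early goal test applied when a node is generated) can be sketched as follows; the toy graph and its adjacency-function representation are assumptions for illustration:

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Breadth-first search with an early goal test.
    neighbors(state) -> iterable of successor states."""
    if start == goal:
        return [start]
    frontier = deque([start])
    reached = {start}               # a set of states, not a map to nodes
    parent = {start: None}
    while frontier:
        state = frontier.popleft()  # shallowest unexpanded node first
        for child in neighbors(state):
            if child in reached:
                continue
            parent[child] = state
            if child == goal:       # early goal test: check on generation
                path = [child]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            reached.add(child)
            frontier.append(child)
    return None                     # no solution

graph = {"S": ["A", "B"], "A": ["C"], "B": ["C", "G"], "C": ["G"], "G": []}
path = bfs("S", "G", lambda s: graph[s])  # -> ["S", "B", "G"]
```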
BREADTH FIRST SEARCH (BFS)

 The tree structure, shown the traversing of the


tree using BFS algorithm from the root node S
to goal node K.
 BFS search algorithm traverse in layers, so it
will follow the path which is shown by the
dotted arrow, and the traversed path will be:
 S-> A->B->C->D->G->H->E->F->I->K
BREADTH FIRST SEARCH (BFS)

 Time Complexity: Time Complexity of BFS algorithm can be obtained by the


number of nodes traversed in BFS until the shallowest node, where d = the depth of
the shallowest solution and b = the branching factor (successors per node):
T(b) = 1 + b + b^2 + b^3 + ... + b^d = O(b^d)
 Space Complexity: Space complexity of BFS algorithm is given by the
memory size of the frontier, which is O(b^d).
 Completeness: BFS is complete, which means if the shallowest goal node is
at some finite depth, then BFS will find a solution.
 Optimality: BFS is optimal if path cost is a non-decreasing function of the
depth of the node.
UNIFORM-COST SEARCH OR DIJKSTRA’S ALGORITHM

 When actions have different costs, an obvious choice is to use best-first


search where the evaluation function is the cost of the path from the root to
the current node.
 This is called Dijkstra’s algorithm by the theoretical computer science
community, and uniform-cost search by the AI community.
 The idea is that while breadth-first search spreads out in waves of uniform
depth (first depth 1, then depth 2, and so on), uniform-cost search spreads out
in waves of uniform path cost.
 The algorithm can be implemented as a call to BEST-FIRST-SEARCH with
PATH-COST as the evaluation function.
UNIFORM-COST SEARCH

 Uniform-cost search is a searching algorithm used for traversing a weighted


tree or graph. This algorithm comes into play when a different cost is available
for each edge. The primary goal of the uniform-cost search is to find a path to
the goal node which has the lowest cumulative cost.
 Uniform-cost search expands nodes according to their path costs from the
root node.
 It can be used to solve any graph/tree where the optimal cost is in demand.
 A uniform-cost search algorithm is implemented by the priority queue. It gives
maximum priority to the lowest cumulative cost.
 Uniform cost search is equivalent to BFS algorithm if the path cost of all
edges is the same.
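A minimal sketch of uniform-cost search with a priority queue keyed on cumulative path cost; the weighted graph and its edge costs are assumptions (the slides' figure is not reproduced here), chosen so the cheapest route is S -> A -> D -> G:

```python
import heapq

def uniform_cost_search(start, goal, edges):
    """edges(state) -> iterable of (successor, action_cost) pairs."""
    frontier = [(0, start, [start])]     # (path_cost, state, path)
    best_cost = {start: 0}
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if state == goal:                # late goal test: on pop, not generation
            return cost, path
        if cost > best_cost.get(state, float("inf")):
            continue                     # stale frontier entry, skip it
        for succ, step in edges(state):
            new_cost = cost + step
            if new_cost < best_cost.get(succ, float("inf")):
                best_cost[succ] = new_cost
                heapq.heappush(frontier, (new_cost, succ, path + [succ]))
    return None

g = {"S": [("A", 1), ("B", 5)], "A": [("D", 3)], "B": [("G", 9)],
     "D": [("G", 2)], "G": []}
cost, path = uniform_cost_search("S", "G", lambda s: g[s])  # -> (6, [S, A, D, G])
```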
UNIFORM-COST SEARCH

 Advantages:
 Uniform cost search is optimal because at every state the path with the
least cost is chosen.

 Disadvantages:
 It does not care about the number of steps involved in the search, only
about path cost; because of this, the algorithm may get stuck in an
infinite loop.
UNIFORM-COST SEARCH

 The tree structure, shown the


traversing of the tree using
Uniform-Cost Search algorithm
from the root node S to goal
node G.
 The traversed path will be:
 S-> A->D->G
UNIFORM-COST SEARCH

 Completeness: Uniform-cost search is complete: if a solution exists,

UCS will find it.
 Time Complexity: Let C* be the cost of the optimal solution, and ε the
minimum cost of a single action. Then the number of levels explored is at
most 1 + ⌊C*/ε⌋; the +1 is needed because the search starts at depth 0.
Hence, the worst-case time complexity of uniform-cost search is
O(b^(1 + ⌊C*/ε⌋)).
 Space Complexity: the same logic applies to space, so the worst-
case space complexity of uniform-cost search is also O(b^(1 + ⌊C*/ε⌋)).
 Optimal: Uniform-cost search is always optimal as it only selects a path with
the lowest path cost.
UNIFORM-COST SEARCH

 The complexity of uniform-cost search is characterized in terms of C*, the cost
of the optimal solution, and ε, a lower bound on the cost of each action, with
ε > 0. Then the algorithm’s worst-case time and space complexity is
O(b^(1 + ⌊C*/ε⌋)), which can be much greater than b^d. This is because
uniform-cost search can explore large trees of actions with low costs before
exploring paths involving a high-cost and perhaps useful action. When all action
costs are equal, b^(1 + ⌊C*/ε⌋) is just b^(d+1), and uniform-cost search is
similar to breadth-first search.
 Uniform-cost search is complete and is cost-optimal, because the first solution
it finds will have a cost that is at least as low as the cost of any other node in
the frontier. Uniform-cost search considers all paths systematically in order of
increasing cost, never getting caught going down a single infinite path
(assuming that all action costs are ≥ ε > 0).
DEPTH FIRST SEARCH (DFS)

 Depth-first search is a recursive algorithm


for traversing a tree or graph data
structure.
 It is called the depth-first search because it
starts from the root node and follows each
path to its greatest depth node before
moving to the next path.
 DFS uses a stack data structure for its
implementation.
 The process of the DFS algorithm is similar
to the BFS algorithm.
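A sketch of iterative DFS with an explicit LIFO stack; the small graph is an assumption for illustration:

```python
def dfs(start, goal, neighbors):
    """Depth-first search with an explicit stack (LIFO)."""
    stack = [(start, [start])]
    visited = set()
    while stack:
        state, path = stack.pop()        # most recently added node first
        if state == goal:
            return path
        if state in visited:
            continue
        visited.add(state)
        # Push successors in reverse so the leftmost child is expanded first.
        for child in reversed(neighbors(state)):
            if child not in visited:
                stack.append((child, path + [child]))
    return None

g = {"S": ["A", "C"], "A": ["B"], "B": [], "C": ["G"], "G": []}
path = dfs("S", "G", lambda s: g[s])  # explores S, A, B, then backtracks to C, G
```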
DEPTH FIRST SEARCH (DFS)

 Advantage:
 DFS requires much less memory, as it only needs to store a stack of the
nodes on the path from the root node to the current node.
 It can take less time to reach the goal node than the BFS algorithm (if it
traverses the right path).
 Disadvantage:
 There is the possibility that many states keep re-occurring, and there is no
guarantee of finding the solution.
 The DFS algorithm searches deep down, and sometimes it may enter
an infinite loop.
DEPTH FIRST SEARCH (DFS)

 In the search tree, we have shown the


flow of depth-first search, and it will
follow the order as:
Root node-->Left node --> right node.
 It will start searching from root node S,
and traverse A, then B, then D and E,
after traversing E, it will backtrack the
tree as E has no other successor and
still goal node is not found. After
backtracking it will traverse node C
and then G, and here it will terminate
as it found goal node.
DEPTH FIRST SEARCH (DFS)

 Completeness: DFS search algorithm is complete within finite state space as it will
expand every node within a limited search tree.
 Time Complexity: The time complexity of DFS is equivalent to the number of nodes
traversed by the algorithm. It is given by:
T(b) = 1 + b + b^2 + ... + b^m = O(b^m)
where m = maximum depth of any node, and this can be much larger than d
(the depth of the shallowest solution).
 Space Complexity: DFS algorithm needs to store only single path from the root node,
hence space complexity of DFS is equivalent to the size of the fringe set, which is
O(bm).
 Optimal: DFS search algorithm is non-optimal, as it may generate a large number of
steps or high cost to reach to the goal node.
DEPTH-LIMITED SEARCH

 To keep depth-first search from wandering down an infinite path, it is possible to use the
depth-limited search algorithm, a version of depth-first search in which we supply a depth
limit ℓ and treat all nodes at depth ℓ as if they had no successors.

 The time complexity is O(b^ℓ) and the space complexity is O(bℓ).

 Unfortunately, if a poor selection of ℓ is made the algorithm will fail to reach the
solution, making it incomplete again.
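A recursive sketch of depth-limited search, with the depth limit ℓ passed as `limit`; returning the string "cutoff" to distinguish "limit reached" from outright failure is one common convention, and the toy graph is an assumption:

```python
def depth_limited_search(state, goal, neighbors, limit):
    """Returns a path, None (failure), or "cutoff" (limit reached)."""
    if state == goal:
        return [state]
    if limit == 0:
        return "cutoff"          # a solution might exist deeper than the limit
    cutoff_seen = False
    for child in neighbors(state):
        result = depth_limited_search(child, goal, neighbors, limit - 1)
        if result == "cutoff":
            cutoff_seen = True
        elif result is not None:
            return [state] + result
    return "cutoff" if cutoff_seen else None

g = {"S": ["A"], "A": ["B"], "B": ["G"], "G": []}
nb = lambda s: g[s]
found = depth_limited_search("S", "G", nb, 3)   # goal lies at depth 3
cut = depth_limited_search("S", "G", nb, 2)     # limit too small: "cutoff"
```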
DEPTH-LIMITED SEARCH

 Since depth-limited search is a tree-like search, it cannot avoid wasting time on
redundant paths in general, but it can eliminate cycles at the cost of some
computation time. If we look only a few links up the parent chain we can catch
most cycles; longer cycles are handled by the depth limit.

 Sometimes a good depth limit can be chosen based on knowledge of the problem:
the longest shortest path between any two states, known as the diameter of the
state-space graph, gives us a better depth limit, which leads to a more efficient
depth-limited search.

 However, for most problems we will not know a good depth limit until we have solved
the problem.
DEPTH-LIMITED SEARCH
 Depth-limited search returns
one of three different types of
values: either a solution node;
or failure, when it has
exhausted all nodes and
proved there is no solution at
any depth; or cutoff, to mean
there might be a solution at a
deeper depth than ℓ.
 This is a tree-like search algorithm that does not keep track of reached states, and thus
uses much less memory than best-first search, but runs the risk of visiting the same
state multiple times on different paths. Also, if the IS-CYCLE check does not check all
cycles, then the algorithm may get caught in a loop.
ITERATIVE DEEPENING SEARCH

 The iterative deepening algorithm is a combination of DFS and BFS algorithms. This
search algorithm finds out the best depth limit and does it by gradually increasing the
limit until a goal is found.
 This algorithm performs depth-first search up to a certain "depth limit", and it keeps
increasing the depth limit after each iteration until the goal node is found.
 This Search algorithm combines the benefits of Breadth-first search's fast search and
depth-first search's memory efficiency.
 The iterative search algorithm is useful uninformed search when search space is large,
and depth of goal node is unknown.
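The iteration scheme above can be sketched as a loop over increasing depth limits; the helper `dls` and the small binary-tree graph are illustrative assumptions:

```python
def dls(state, goal, neighbors, limit):
    """Plain depth-limited DFS returning a path or None."""
    if state == goal:
        return [state]
    if limit == 0:
        return None
    for child in neighbors(state):
        result = dls(child, goal, neighbors, limit - 1)
        if result is not None:
            return [state] + result
    return None

def iterative_deepening(start, goal, neighbors, max_depth=50):
    # Re-run depth-limited search with limit 0, 1, 2, ... until the goal appears.
    for limit in range(max_depth + 1):
        result = dls(start, goal, neighbors, limit)
        if result is not None:
            return result
    return None

g = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
     "D": [], "E": [], "F": [], "G": []}
path = iterative_deepening("A", "G", lambda s: g[s])  # goal found at depth 2
```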
ITERATIVE DEEPENING SEARCH

 Advantages:
 It combines the benefits of BFS and DFS search algorithm in terms of fast
search and memory efficiency.

 Disadvantages:
 The main drawback of IDDFS is that it repeats all the work of the previous
phase.
 Four iterations of iterative
deepening search for goal on
a binary tree, with the depth
limit varying from 0 to 3.
 Note the interior nodes form a
single path.
 The triangle marks the node to
expand next; green nodes with
dark outlines are on the
frontier; the very faint nodes
provably can’t be part of a
solution with this depth limit.
ITERATIVE DEEPENING SEARCH
 Following tree structure is showing the
iterative deepening depth-first search. IDDFS
algorithm performs various iterations until it
does not find the goal node.
 The iteration performed by the algorithm is
given as:
1'st Iteration-----> A
2'nd Iteration----> A, B, C
3'rd Iteration------>A, B, D, E, C, F, G

In the third iteration, the algorithm will find the


goal node.
ITERATIVE DEEPENING SEARCH
 Completeness: This algorithm is complete if the branching factor is finite.

 Time Complexity: Let b be the branching factor and d the depth of the shallowest
solution; then the worst-case time complexity is O(b^d) when there is a solution, or
O(b^m) when there is none.

 Space Complexity: The space complexity of IDDFS will be O(bd), when there is a
solution, or O(bm) when there is none.

 Optimal: IDDFS algorithm is optimal if path cost is a non- decreasing function of the
depth of the node.
ITERATIVE DEEPENING SEARCH
 Hybrid approach that runs breadth-first search until almost all the available
memory is consumed, and then runs iterative deepening from all the nodes in
the frontier.
 In general, iterative deepening is the preferred uninformed search method when
the search state space is larger than can fit in memory and the depth of the
solution is not known.
BIDIRECTIONAL SEARCH
 Bidirectional search simultaneously searches forward from the initial state and
backwards from the goal state(s), hoping that the two searches will meet. The
motivation is that b^(d/2) + b^(d/2) is much less than b^d

(e.g., 50,000 times less when b = d = 10).

 For this to work, it is necessary to keep track of two frontiers and two tables of
reached states, and we need to be able to reason backwards: if state s′ is a
successor of s in the forward direction, then we need to know that s is a successor
of s′ in the backward direction. We have a solution when the two frontiers
collide.
BIDIRECTIONAL SEARCH
 Bidirectional search algorithm runs two simultaneous searches, one from the initial
state, called forward search, and the other from the goal node, called backward
search, to find the goal node. Bidirectional search replaces one single search
graph with two small subgraphs, in which one starts the search from the initial
vertex and the other starts from the goal vertex. The search stops when these two
graphs intersect each other.

 Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
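A sketch of bidirectional search built from two BFS frontiers with parent tables; it assumes an undirected graph, so the same `neighbors` function serves both the forward and backward directions:

```python
from collections import deque

def bidirectional_bfs(start, goal, neighbors):
    """Alternate one BFS step forward from start and one backward from the
    goal; stop as soon as the two frontiers meet."""
    if start == goal:
        return [start]
    pf, pb = {start: None}, {goal: None}     # parent maps for each search
    qf, qb = deque([start]), deque([goal])

    def build(meet):
        # Join the forward path (start..meet) and the backward path (meet..goal).
        fwd, s = [], meet
        while s is not None:
            fwd.append(s); s = pf[s]
        fwd.reverse()
        s = pb[meet]
        while s is not None:
            fwd.append(s); s = pb[s]
        return fwd

    while qf and qb:
        for q, parents, other in ((qf, pf, pb), (qb, pb, pf)):
            state = q.popleft()
            for child in neighbors(state):
                if child not in parents:
                    parents[child] = state
                    if child in other:       # the two frontiers have collided
                        return build(child)
                    q.append(child)
    return None

g = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
path = bidirectional_bfs(1, 5, lambda s: g[s])  # -> [1, 2, 4, 5]
```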
BIDIRECTIONAL SEARCH
 Advantages:
 Bidirectional search is fast.
 Bidirectional search requires less memory

 Disadvantages:
 Implementation of the bidirectional search tree is difficult.
 In bidirectional search, one should know the goal state in advance.
BIDIRECTIONAL SEARCH
 Bidirectional best-first search keeps two
frontiers and two tables of reached
states.
 When a path in one frontier reaches a
state that was also reached in the other
half of the search, the two paths are
joined (by the function JOIN-NODES) to
form a solution.
 The first solution we get is not
guaranteed to be the best; the function
TERMINATED determines when to stop
looking for new solutions.
BIDIRECTIONAL SEARCH
 This algorithm divides one
graph/tree into two sub-graphs. It
starts traversing from node 1 in the
forward direction and starts from
goal node 16 in the backward
direction.
BIDIRECTIONAL SEARCH

 Completeness: Bidirectional Search is complete if we use BFS in both


searches.

 Time Complexity: Time complexity of bidirectional search using BFS is O(b^(d/2)).

 Space Complexity: Space complexity of bidirectional search is O(b^(d/2)).

 Optimal: Bidirectional search is Optimal.


DEPTH-FIRST SEARCH (BACKTRACKING)
 A variant of depth-first search called backtracking search uses even less memory. In
backtracking, only one successor is generated at a time rather than all successors; each
partially expanded node remembers which successor to generate next.

 In addition, successors are generated by modifying the current state description


directly rather than allocating memory for a brand-new state. This reduces the memory
requirements to just one state description and a path of O(m) actions; a significant
savings over O(bm) states for depth-first search.

 With backtracking we also have the option of maintaining an efficient set data structure
for the states on the current path, allowing us to check for a cyclic path in O(1) time
rather than O(m).
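A sketch of backtracking depth-first search that keeps a single path, undoes each step on return, and uses a set for the O(1) cycle check described above; the graph is illustrative:

```python
def backtracking_search(state, goal, neighbors, path=None, on_path=None):
    """Depth-first backtracking: one path is kept and undone on return.
    on_path is a set giving an O(1) cycle check instead of scanning the path."""
    if path is None:
        path, on_path = [state], {state}
    if state == goal:
        return list(path)
    for child in neighbors(state):
        if child in on_path:             # O(1) membership test: skip cycles
            continue
        path.append(child); on_path.add(child)        # apply the action
        result = backtracking_search(child, goal, neighbors, path, on_path)
        if result is not None:
            return result
        path.pop(); on_path.discard(child)            # undo on backtrack
    return None

g = {"S": ["A", "S"], "A": ["S", "G"], "G": []}       # contains a cycle S-A-S
path = backtracking_search("S", "G", lambda s: g[s])  # -> ["S", "A", "G"]
```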
DEPTH-FIRST SEARCH (BACKTRACKING)
 Space complexity of O(1) means that the space required by the algorithm
to process data is constant; it does not grow with the size of the data on
which the algorithm is operating.

 For backtracking to work, we must be able to undo each action when we


backtrack. Backtracking is critical to the success of many problems with large
state descriptions, such as robotic assembly.
INFORMED SEARCH ALGORITHMS

 Informed search algorithms use domain knowledge. In an informed search,


problem information is available which can guide the search.
 Informed search strategies can find a solution more efficiently than an
uninformed search strategy.
 Informed search algorithm contains an array of knowledge such as how far we
are from the goal, path cost, how to reach to goal node, etc. This knowledge
help agents to explore less to the search space and find more efficiently the
goal node.
 The informed search algorithm is more useful for large search space.
 Informed search can solve much complex problem which could not be solved in
another way.
HEURISTICS

 Informed search algorithm uses the idea of heuristic, so it is also called


Heuristic search.

 Heuristics function: Heuristic is a function which is used in Informed Search,


and it finds the most promising path. It takes the current state of the agent as
its input and produces the estimation of how close agent is from the goal.

 The heuristic method, however, might not always give the best solution, but it
is guaranteed to find a good solution in reasonable time.
HEURISTICS

 A heuristic function estimates how close a state is to the goal. It is represented
by h(n), and it estimates the cost of an optimal path between the pair of states.
The value of the heuristic function is always non-negative.
 h(n) = estimated cost of the cheapest path from the state at node n to a goal state.

 Admissibility of the heuristic function is given as:

 h(n) <= h*(n)
 Here h(n) is the heuristic estimate and h*(n) is the true optimal cost to reach
the goal. Hence the heuristic cost should be less than or equal to the actual
cost; an admissible heuristic never overestimates.
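As a sanity check of the admissibility condition, one can compute the true optimal costs h*(n) (here via Dijkstra from the goal over reversed edges) and compare them with a candidate heuristic; the toy graph and h-values are assumptions:

```python
import heapq

def true_costs_to_goal(goal, edges_rev):
    """Dijkstra from the goal over reversed edges yields h*(n) for every n."""
    dist = {goal: 0}
    pq = [(0, goal)]
    while pq:
        d, s = heapq.heappop(pq)
        if d > dist.get(s, float("inf")):
            continue                     # stale entry
        for pred, w in edges_rev.get(s, []):
            if d + w < dist.get(pred, float("inf")):
                dist[pred] = d + w
                heapq.heappush(pq, (d + w, pred))
    return dist

# Toy problem: edges S-A (1), A-G (4), S-G (7); goal G.
edges_rev = {"G": [("A", 4), ("S", 7)], "A": [("S", 1)]}
h_star = true_costs_to_goal("G", edges_rev)     # true costs: S=5, A=4, G=0
h = {"S": 5, "A": 3, "G": 0}                    # a candidate heuristic
admissible = all(h[n] <= h_star[n] for n in h)  # True: never overestimates
```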
PURE HEURISTIC SEARCH

 Pure heuristic search is the simplest form of heuristic search algorithms. It


expands nodes based on their heuristic value h(n). It maintains two lists, the OPEN
and CLOSED lists. In the CLOSED list, it places those nodes which have
already been expanded, and in the OPEN list, it places nodes which have not yet
been expanded.

 On each iteration, the node n with the lowest heuristic value is expanded,
generating all its successors, and n is placed in the closed list. The algorithm
continues until a goal state is found.
INFORMED SEARCH ALGORITHMS

 Best First Search (Greedy search)


 A* Search
 Iterative Deepening A* Search (IDA*)
 Recursive Best First Search (RBFS)
 Beam Search
 Weighted A*
GREEDY BEST-FIRST SEARCH

 Greedy best-first search is a form of best-first search that expands first the node with the
lowest h(n) value—the node that appears to be closest to the goal—on the grounds that
this is likely to lead to a solution quickly.
 The evaluation function: f(n) = h(n). h(n)= estimated cost from node n to the goal.
 The greedy best-first search algorithm always selects the path which appears best at that
moment. It combines aspects of depth-first and breadth-first search, using a heuristic
function to guide the search. Best-first search lets us take the advantages of
both algorithms: at each step, it is possible to choose the most promising node. In the
greedy best-first search algorithm, we expand the node which is closest to the goal node,
where the closeness is estimated by the heuristic function.
 The greedy best-first algorithm is implemented with a priority queue.
GREEDY BEST-FIRST SEARCH
 Advantages:
 Best first search can switch between BFS and DFS by gaining the advantages of both
the algorithms.
 This algorithm is more efficient than BFS and DFS algorithms.

 Disadvantages:
 It can behave as an unguided depth-first search in the worst case scenario.
 It can get stuck in a loop as DFS.
 This algorithm is not optimal.
GREEDY BEST-FIRST SEARCH ALGORITHM
 Step 1: Place the starting node into the OPEN list.
 Step 2: If the OPEN list is empty, Stop and return failure.
 Step 3: Remove the node n, from the OPEN list which has the lowest value of h(n), and places
it in the CLOSED list.
 Step 4: Expand the node n, and generate the successors of node n.
 Step 5: Check each successor of node n, and find whether any node is a goal node or not. If
any successor node is goal node, then return success and terminate the search, else proceed
to Step 6.
 Step 6: For each successor node, algorithm checks for evaluation function f(n), and then check
if the node has been in either OPEN or CLOSED list. If the node has not been in both list,
then add it to the OPEN list.
 Step 7: Return to Step 2.
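The steps above can be sketched with a priority queue as the OPEN list and a set as the CLOSED list; the graph structure is an assumption, while the h-values follow the example table used in these slides:

```python
import heapq

def greedy_best_first(start, goal, neighbors, h):
    """Expand the node with the lowest h(n); OPEN is a priority queue on h,
    CLOSED is the set of already-expanded states."""
    open_list = [(h[start], start, [start])]
    closed = set()
    while open_list:                      # Step 2: stop on empty OPEN list
        _, state, path = heapq.heappop(open_list)
        if state == goal:
            return path
        if state in closed:
            continue
        closed.add(state)                 # Step 3: move n to CLOSED
        for child in neighbors(state):    # Step 4: generate successors
            if child not in closed:
                heapq.heappush(open_list, (h[child], child, path + [child]))
    return None

g = {"S": ["A", "B"], "A": [], "B": ["E", "F"],
     "E": [], "F": ["I", "G"], "I": [], "G": []}
h = {"S": 13, "A": 12, "B": 4, "E": 8, "F": 2, "I": 9, "G": 0}
path = greedy_best_first("S", "G", lambda s: g[s], h)  # -> S -> B -> F -> G
```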
GREEDY BEST-FIRST SEARCH
 At each iteration, the node with the lowest value of the evaluation function
f(n) = h(n) is expanded; the h-values are given in the table.

Node h(n)
A 12
B 4
C 7
D 3
E 8
F 2
H 4
I 9
S 13
G 0
GREEDY BEST-FIRST SEARCH
 In this search example, we are using two lists which
are OPEN and CLOSED Lists. Following are the
iteration for traversing the above example.
 Expand the nodes of S and put in the
CLOSED list
Initialization: Open [A, B], Closed [S]
Iteration 1: Open [A], Closed [S, B]
Iteration 2: Open [E, F, A], Closed [S, B]
: Open [E, A], Closed [S, B, F]
Iteration 3: Open [I, G, E, A], Closed [S, B, F]
: Open [I, E, A], Closed [S, B, F, G]
Hence the final solution path will be: S-> B->F-> G
GREEDY BEST-FIRST SEARCH
 Time Complexity: The worst-case time complexity of greedy best-first search is
O(b^m).

 Space Complexity: The worst-case space complexity of greedy best-first search is
O(b^m), where m is the maximum depth of the search space.

 Complete: Greedy best-first search is also incomplete, even if the given state space is
finite.

 Optimal: Greedy best first search algorithm is not optimal.


STRAIGHT-LINE DISTANCE HEURISTIC
 Straight-line distance ℎ𝑆𝐿𝐷 , is the straight line distance between two points.

 The values of hSLD cannot be computed from the problem description itself (that
is, from the ACTIONS and RESULT functions). It takes a certain amount of world
knowledge to know that hSLD is correlated with actual road distances, and is
therefore a useful heuristic.

 Example Route-finding problem (Romania)


STRAIGHT-LINE DISTANCE HEURISTIC

Arad 366 Mehadia 241
Bucharest 0 Neamt 234
Craiova 160 Oradea 380
Drobeta 242 Pitesti 100
Eforie 161 Rimnicu Vilcea 193
Fagaras 176 Sibiu 253
Giurgiu 77 Timisoara 329
Hirsova 151 Urziceni 80
Iasi 226 Vaslui 199
Lugoj 244 Zerind 374
 Stages in a greedy best-first
tree-like search for Bucharest
with the straight-line distance
heuristic hSLD.
 Nodes are labeled with their
h-values.
 Greedy best-first graph search is
complete in finite state spaces, but
not in infinite ones. The worst-
case time and space complexity is
O(|V|). With a good heuristic
function, however, the complexity
can be reduced substantially, on
certain problems reaching O(bm).
A* SEARCH ALGORITHM

 A* search is the most commonly known form of best-first search. It uses


heuristic function h(n), and cost to reach the node n from the start state g(n). It
has combined features of UCS and greedy best-first search, by which it solves
the problem efficiently.

 A* search algorithm finds the shortest path through the search space using the
heuristic function.

 This search algorithm expands a smaller search tree and provides an optimal result
faster. The A* algorithm is similar to UCS except that it uses g(n) + h(n) instead of
g(n).
A* SEARCH ALGORITHM

 In the A* search algorithm, we use the search heuristic as well as the cost to reach the
node. Hence we can combine both costs as follows, and this sum is called the fitness
number:
f(n) = g(n) + h(n)

 At each point in the search space, only the node with the lowest value of f(n) is
expanded, and the algorithm terminates when the goal node is found.
A* SEARCH ALGORITHM

 Advantages:
 The A* search algorithm often performs better than other search algorithms.
 A* search algorithm is optimal and complete.
 This algorithm can solve very complex problems.
 Disadvantages:
 It does not always produce the shortest path, as it is mostly based on heuristics and
approximation.
 A* search algorithm has some complexity issues.
 The main drawback of A* is memory requirement as it keeps all generated nodes in
the memory, so it is not practical for various large-scale problems.
A* SEARCH ALGORITHM

 Step1: Place the starting node in the OPEN list.


 Step 2: Check if the OPEN list is empty or not, if the list is empty then return failure
and stops.
 Step 3: Select the node from the OPEN list which has the smallest value of evaluation
function (g+h), if node n is goal node then return success and stop, otherwise
 Step 4: Expand node n and generate all of its successors, and put n into the closed list.
For each successor n', check whether n' is already in the OPEN or CLOSED list, if not
then compute evaluation function for n' and place into Open list.
 Step 5: Otherwise, if node n' is already in OPEN or CLOSED, attach it to
the back pointer which reflects the lowest g(n') value.
 Step 6: Return to Step 2.
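The steps above can be sketched as follows; the step costs of the example graph are reconstructed from the worked iterations shown later in these slides and are therefore assumptions:

```python
import heapq

def a_star(start, goal, edges, h):
    """A*: expand the node with the lowest f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, state, path)
    best_g = {start: 0}
    while frontier:
        f, g_cost, state, path = heapq.heappop(frontier)
        if state == goal:
            return g_cost, path
        if g_cost > best_g.get(state, float("inf")):
            continue                              # stale frontier entry
        for succ, step in edges(state):
            g2 = g_cost + step
            if g2 < best_g.get(succ, float("inf")):
                best_g[succ] = g2
                heapq.heappush(frontier, (g2 + h[succ], g2, succ, path + [succ]))
    return None

# Step costs assumed so that S->A->C->G has total cost 6, as in the example.
g = {"S": [("A", 1), ("G", 10)], "A": [("C", 1), ("B", 2)],
     "C": [("G", 4), ("D", 3)], "B": [], "D": [], "G": []}
h = {"S": 5, "A": 3, "B": 4, "C": 2, "D": 6, "G": 0}
cost, path = a_star("S", "G", lambda s: g[s], h)  # -> (6, [S, A, C, G])
```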
A* SEARCH ALGORITHM

 The heuristic value of all states is given in the table, so we will calculate
the f(n) of each state using the formula f(n) = g(n) + h(n), where g(n) is the
cost to reach any node from the start state.
 Here we will use OPEN and CLOSED lists.

Node h(n)
S 5
A 3
B 4
C 2
D 6
G 0
A* SEARCH ALGORITHM

 Solution:
Initialization: {(S, 5)}
Iteration1: {(S-> A, 4), (S-->G, 10)}
Iteration2: {(S-> A-->C, 4), (S--> A-->B, 7), (S-->G, 10)}
Iteration3: {(S-> A->C->G, 6), (S-> A->C->D, 11),
(S-> A->B, 7), (S->G, 10)}
Iteration 4 will give the final result, as S->A->C->G it
provides the optimal path with cost 6.
A* SEARCH ALGORITHM

 Complete: A* algorithm is complete as long as:


 Branching factor is finite.
 Every action cost is at least some fixed ε > 0.
 Optimal: A* search algorithm is optimal if it follows below two conditions:
 Admissible: the first condition required for optimality is that h(n) be an
admissible heuristic for A* tree search. An admissible heuristic is optimistic in
nature.
 Consistency: Second required condition is consistency for only A* graph-search.
 If the heuristic function is admissible, then A* tree search will always find the least
cost path.
A* SEARCH ALGORITHM

 Time Complexity: The time complexity of the A* search algorithm depends on the
heuristic function; in the worst case, the number of nodes expanded is exponential in
the depth of the solution d. So the time complexity is O(b^d), where b is the
branching factor.

 Space Complexity: The space complexity of the A* search algorithm is O(b^d).


 Stages in an A* search for
Bucharest. Nodes are
labeled with f = g + h.
The values are the
straight-line distances to
Bucharest taken from
each city.

A* SEARCH ALGORITHM

 A* algorithm returns the path which occurred first, and it does not search for all
remaining paths.
 The efficiency of A* algorithm depends on the quality of heuristic.
 A* algorithm expands all nodes which satisfy the condition f(n) < C*.
 Whether A* is cost-optimal depends on certain properties of the heuristic. A key
property is admissibility: an admissible heuristic is one that never overestimates the
cost to reach a goal. (An admissible heuristic is therefore optimistic.) With an
admissible heuristic, A* is cost-optimal.
 A slightly stronger property is called consistency. A heuristic h(n) is consistent if, for
every node n and every successor n′ of n generated by an action a:
h(n) ≤ c(n, a, n′) + h(n′)
SEARCH CONTOURS

 A useful way to visualize a search is to draw contours in the state space, just like the
contours in a topographic map.
 Inside the contour labeled 400, all nodes have f(n) = g(n) + h(n) ≤ 400,
and so on. Then, because A* expands the frontier node of
lowest f-cost, we can see that an A* search
fans out from the start node, adding nodes
in concentric bands of increasing f-cost.
 Map of Romania showing contours at f=380,
f=400, and f=420 with Arad as the start
state. Nodes inside a given contour have
costs less than or equal to the contour
value.
MONOTONIC

 Monotonic: the path cost always increases as you go along a path, because action costs
are always positive.

 Strictly monotonic for costs that always increase, and “monotonic” for costs that never
decrease, but might remain the same.

 The term “monotonic heuristic” is a synonym for “consistent heuristic”.


A* SEARCH ALGORITHM

 If C* is the cost of the optimal solution path, then we can say the following:
 A* expands all nodes that can be reached from the initial state on a path where
every node on the path has 𝑓 𝑛 < 𝐶 ∗ . We say these are surely expanded nodes.
 A* might then expand some of the nodes right on the “goal contour” (where
𝑓 𝑛 = 𝐶 ∗ ) before selecting a goal node.
 A* expands no nodes with 𝑓 𝑛 > 𝐶 ∗ .
 A* with a consistent heuristic is optimally efficient in the sense that any algorithm that
extends search paths from the initial state, and uses the same heuristic information,
must expand all nodes that are surely expanded by A* (because any one of them could
have been part of an optimal solution).
A* SEARCH ALGORITHM

 A* is efficient because it prunes away search tree nodes that are not necessary for
finding an optimal solution.

 That A* search is complete, cost-optimal, and optimally efficient among all such algorithms is
rather satisfying. Unfortunately, it does not mean that A* is the answer to all our
searching needs. The catch is that for many problems, the number of nodes expanded
can be exponential in the length of the solution.
SATISFICING SEARCH

 A* search has many good qualities, but it expands a lot of nodes.


 Fewer nodes can be explored (taking less time and space) if we are willing to accept
solutions that are suboptimal but good enough: what we call satisficing
solutions.
 If we allow A* search to use an inadmissible heuristic—one that may overestimate—
then we risk missing the optimal solution, but the heuristic can potentially be more
accurate, thereby reducing the number of nodes expanded.
 One such heuristic is the detour index, a multiplier applied to the straight-line distance
to account for the typical curvature of roads. A detour index of 1.3 means that if two
cities are 10 miles apart in straight-line distance, a good estimate of the best path
between them is 13 miles. For most localities, the detour index ranges between 1.2 and 1.6.
WEIGHTED A*

 Weighted A* search weights the heuristic value more heavily, giving us the evaluation
function f(n) = g(n) + W × h(n), for some W > 1.
 Weighted A* search finds a solution that is slightly costlier, but the search time is much
faster.
 Weighted search focuses the contour of reached states towards a goal. That means
that fewer states are explored, but if the optimal path ever strays outside of the
weighted search’s contour (as it does in this case), then the optimal path will not be
found.
 In general, if the optimal solution costs C*, a weighted A* search will find a solution
that costs somewhere between C* and W × C*; but in practice we usually get results
much closer to C* than to W × C*.
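A sketch of weighted A*: it is identical to plain A* except that the heuristic term in the evaluation function is multiplied by W. The graph, heuristic values, and choice of W below are illustrative assumptions:

```python
import heapq

def weighted_a_star(start, goal, neighbors, h, W=1.5):
    """Weighted A* with f(n) = g(n) + W * h(n). For W > 1 the heuristic is
    inflated, so the returned cost is at most W times the optimal cost."""
    frontier = [(W * h(start), 0, start, [start])]   # (f, g, state, path)
    reached = {start: 0}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        for succ, cost in neighbors.get(state, []):
            g2 = g + cost
            if succ not in reached or g2 < reached[succ]:
                reached[succ] = g2
                heapq.heappush(frontier, (g2 + W * h(succ), g2, succ, path + [succ]))
    return None, float("inf")

# Same illustrative toy graph as before; W = 2 focuses search toward the goal.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)]}
h = {"A": 3, "B": 2, "C": 1, "D": 0}.get
print(weighted_a_star("A", "D", graph, h, W=2.0))   # → (['A', 'B', 'C', 'D'], 4)
```

On this tiny graph the weighted search still happens to find the optimal path; the W × C* bound only says the result can never be worse than twice the optimum.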
WEIGHTED A*

(Figure: states reached by A* search compared with weighted A* search.)


BOUNDED SUBOPTIMAL & UNBOUNDED-COST SEARCH

 There are a variety of suboptimal search algorithms, which can be characterized by the
criteria for what counts as “good enough.”
 In bounded suboptimal search, we look for a solution that is guaranteed to be within a constant
factor W of the optimal cost. Weighted A* provides this guarantee.
 In bounded-cost search, we look for a solution whose cost is less than some constant C. And
in unbounded-cost search, we accept a solution of any cost, as long as we can find it quickly.
 An example of an unbounded-cost search algorithm is Speedy search, which is a version of
greedy best-first search that uses as a heuristic the estimated number of actions required to
reach a goal, regardless of the cost of those actions. Thus, for problems where all actions
have the same cost it is the same as greedy best-first search, but when actions have different
costs, it tends to lead the search to find a solution quickly, even if it might have a high cost.
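A sketch of Speedy search under these assumptions: it is greedy best-first search whose priority is an estimate of the number of remaining actions, ignoring action costs entirely. The graph and action-count estimates below are illustrative:

```python
import heapq

def speedy_search(start, goal, neighbors, h_actions):
    """Greedy best-first search ordered by h_actions, an estimate of the
    NUMBER of actions to a goal (action costs are ignored when ordering).
    Unbounded-cost: any solution is accepted, as long as it is found quickly."""
    frontier = [(h_actions(start), start, [start], 0)]   # (h, state, path, g)
    visited = {start}
    while frontier:
        _, state, path, g = heapq.heappop(frontier)
        if state == goal:
            return path, g
        for succ, cost in neighbors.get(state, []):
            if succ not in visited:
                visited.add(succ)
                heapq.heappush(frontier,
                               (h_actions(succ), succ, path + [succ], g + cost))
    return None, float("inf")

# One expensive direct edge vs. a cheap three-step route (illustrative).
graph = {"A": [("B", 1), ("D", 10)], "B": [("C", 1)], "C": [("D", 1)]}
h_actions = {"A": 1, "B": 2, "C": 1, "D": 0}.get   # estimated actions to goal
print(speedy_search("A", "D", graph, h_actions))   # → (['A', 'D'], 10)
```

On this toy graph, Speedy search takes the one-action route A→D immediately, even though its cost (10) is far above the optimal cost of 3 via A→B→C→D.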
MEMORY-BOUNDED SEARCH

 The main issue with A* is its use of memory.

 Memory is split between the frontier and the reached states:


 A state that is on the frontier is stored in two places: as a node in the frontier (so
we can decide what to expand next), and as an entry in the table of reached states
(so we know whether we have visited the state before).

 For many problems (such as exploring a grid), this duplication is not a concern,
because the size of the frontier is much smaller than that of reached, so duplicating the
states in the frontier requires a comparatively trivial amount of memory.
MEMORY-BOUNDED SEARCH

 Some implementations keep a state in only one of the two places, saving a bit of space
at the cost of complicating (and perhaps slowing down) the algorithm.

 Another possibility is to remove states from reached when we can prove that they are
no longer needed.
SEPARATION PROPERTY

 Some problems can use the separation property, along with the prohibition of U-turn
actions, to ensure that all actions either move outwards from the frontier or onto
another frontier state. In that case, we need only check the frontier for redundant
paths, and can eliminate the reached table.

 For other problems, we can keep reference counts of the number of times a state has
been reached, and remove a state from the reached table when there are no more ways
to reach it.
BEAM SEARCH

 Beam search limits the size of the frontier. The easiest approach is to keep only the
k nodes with the best f-scores, discarding any other expanded nodes.
 This makes the search incomplete and suboptimal, but we can choose k to make good
use of available memory, and the algorithm executes quickly because it expands fewer
nodes.
 For many problems it can find good near-optimal solutions.
 An alternative version of beam search doesn’t keep a strict limit on the size of the
frontier but instead keeps every node whose f-score is within 𝛿 of the best f-score.
That way, when there are a few strong-scoring nodes only a few will be kept, but if
there are no strong nodes then more will be kept until a strong one emerges.
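A sketch of the first variant (fixed beam width k), assuming an explicit graph: the whole frontier is expanded layer by layer, and only the k nodes with the best f = g + h scores survive each round. The graph and heuristic values are illustrative:

```python
import heapq

def beam_search(start, goal, neighbors, h, beam_width=2):
    """Layered best-first search whose frontier is truncated to beam_width
    nodes (ranked by f = g + h). Incomplete and suboptimal, but the memory
    used by the frontier is bounded."""
    frontier = [(h(start), 0, start, [start])]   # (f, g, state, path)
    while frontier:
        next_frontier = []
        for f, g, state, path in frontier:
            if state == goal:
                return path, g
            for succ, cost in neighbors.get(state, []):
                if succ not in path:             # avoid trivial cycles
                    g2 = g + cost
                    next_frontier.append((g2 + h(succ), g2, succ, path + [succ]))
        # Keep only the beam_width best-scoring nodes; the rest are discarded.
        frontier = heapq.nsmallest(beam_width, next_frontier)
    return None, float("inf")

# Illustrative toy graph; the optimal route A→B→C→D costs 4.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)]}
h = {"A": 3, "B": 2, "C": 1, "D": 0}.get
print(beam_search("A", "D", graph, h, beam_width=2))
```

Because discarded nodes are gone for good, the beam can commit to a suboptimal route: on this graph a narrow beam may return A→C→D at cost 5 instead of the optimal 4.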
ITERATIVE-DEEPENING A*

 Iterative-deepening A* search (IDA*) is to A* what iterative-deepening search is to
depth-first search.
 IDA* gives us the benefits of A* without the requirement to keep all reached states in
memory, at a cost of visiting some states multiple times. It is a very important and
commonly used algorithm for problems that do not fit in memory.
 In IDA* the cutoff is the f-cost (g + h); at each iteration, the cutoff value is the smallest f-
cost of any node that exceeded the cutoff on the previous iteration. In other words, each
iteration exhaustively searches an f-contour, finds a node just beyond that contour, and
uses that node’s f-cost as the next contour.
 If the optimal solution has cost C* then there can be no more than C* iterations. But for a
problem where every node has a different f-cost, each new contour might contain only
one new node, and the number of iterations could be equal to the number of states.
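The contour scheme above can be sketched directly: a depth-first search bounded by the f-cost, restarted with the smallest f that exceeded the previous cutoff. The graph and heuristic values are illustrative:

```python
import math

def ida_star(start, goal, neighbors, h):
    """Iterative-deepening A*: depth-first contours bounded by f = g + h.
    Each iteration's cutoff is the smallest f-cost that exceeded the
    previous cutoff. Only the current path is kept in memory."""
    def search(path, g, bound):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return None, f                 # beyond the contour: report new bound
        if state == goal:
            return path, g
        minimum = math.inf                 # smallest f-cost that exceeded bound
        for succ, cost in neighbors.get(state, []):
            if succ not in path:           # avoid cycles on the current path
                result, t = search(path + [succ], g + cost, bound)
                if result is not None:
                    return result, t
                minimum = min(minimum, t)
        return None, minimum

    bound = h(start)
    while True:
        result, t = search([start], 0, bound)
        if result is not None:
            return result, t
        if t == math.inf:                  # nothing left beyond any contour
            return None, math.inf
        bound = t                          # next contour

# Illustrative toy graph and heuristic.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)]}
h = {"A": 3, "B": 2, "C": 1, "D": 0}.get
print(ida_star("A", "D", graph, h))   # → (['A', 'B', 'C', 'D'], 4)
```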
RECURSIVE BEST-FIRST SEARCH (RBFS)

 Recursive best-first search attempts to mimic the operation of standard best-first search,
but using only linear space.

 RBFS resembles a recursive depth-first search, but rather than continuing indefinitely down
the current path, it uses the f_limit variable to keep track of the f-value of the best
alternative path available from any ancestor of the current node. If the current node
exceeds this limit, the recursion unwinds back to the alternative path.

 As the recursion unwinds, RBFS replaces the f-value of each node along the path with a
backed-up value: the best f-value of its children. In this way, RBFS remembers the f-value of
the best leaf in the forgotten subtree and can therefore decide whether it is worth
reexpanding the subtree at some later time.
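A compact sketch of RBFS following this description, on an illustrative explicit graph: each successor inherits max(g + h, parent f) so backed-up values persist, and the recursion unwinds whenever the best child's f-value exceeds the best alternative (assumes a solvable problem):

```python
import math

def rbfs_search(start, goal, neighbors, h):
    """Recursive best-first search using only linear space."""
    def rbfs(state, g, f, f_limit, path):
        if state == goal:
            return path, g, f
        succs = []
        for s, cost in neighbors.get(state, []):
            if s not in path:
                g2 = g + cost
                # Inherit the parent's backed-up f-value if it is larger.
                succs.append([max(g2 + h(s), f), g2, s])
        if not succs:
            return None, None, math.inf
        while True:
            succs.sort()                   # best (lowest-f) successor first
            best = succs[0]
            if best[0] > f_limit:          # unwind to the alternative path
                return None, None, best[0]
            alternative = succs[1][0] if len(succs) > 1 else math.inf
            result, cost, best[0] = rbfs(best[2], best[1], best[0],
                                         min(f_limit, alternative),
                                         path + [best[2]])
            if result is not None:
                return result, cost, best[0]

    result, cost, _ = rbfs(start, 0, h(start), math.inf, [start])
    return result, cost

# Illustrative toy graph and heuristic.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)]}
h = {"A": 3, "B": 2, "C": 1, "D": 0}.get
print(rbfs_search("A", "D", graph, h))   # → (['A', 'B', 'C', 'D'], 4)
```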
 Stages in an RBFS search for the shortest route to Bucharest.
 The f-limit value for each recursive call is shown on top of each current node, and
every node is labeled with its f-cost.
 RBFS is somewhat more efficient than IDA*, but still suffers from excessive node
regeneration.
RECURSIVE BEST-FIRST SEARCH (RBFS)

 RBFS is optimal if the heuristic function is admissible.


 Space complexity is linear in the depth of the deepest optimal solution.
 Time complexity is rather difficult to characterize: it depends both on the accuracy of
the heuristic function and on how often the best path changes as nodes are expanded.
RBFS expands nodes in order of increasing f-score, even if f is nonmonotonic.
 IDA* and RBFS suffer from using too little memory. Between iterations, IDA* retains
only a single number: the current f-cost limit. RBFS retains more information in
memory, but it uses only linear space: even if more memory were available, RBFS has
no way to make use of it. Because they forget most of what they have done, both
algorithms may end up reexploring the same states many times over.
MEMORY-BOUNDED A* (MA*) & SIMPLIFIED MA* (SMA*)

 SMA* proceeds just like A*, expanding the best leaf until memory is full.
 At this point, it cannot add a new node to the search tree without dropping an old
one.
 SMA* always drops the worst leaf node: the one with the highest f-value. Like RBFS,
SMA* then backs up the value of the forgotten node to its parent. In this way, the
ancestor of a forgotten subtree knows the quality of the best path in that subtree.
With this information, SMA* regenerates the subtree only when all other paths have
been shown to look worse than the path it has forgotten.
 What if all the leaf nodes have the same f-value? To avoid selecting the same node for
both deletion and expansion, SMA* expands the newest best leaf and deletes the oldest
worst leaf.
MEMORY-BOUNDED A* (MA*) & SIMPLIFIED MA* (SMA*)

 SMA* is complete if there is any reachable solution: that is, if d, the depth of the
shallowest goal node, is less than the memory size (expressed in nodes).
 SMA* is optimal if any optimal solution is reachable; otherwise, it returns the best
reachable solution.
 In practical terms, SMA* is a fairly robust choice for finding optimal solutions,
particularly when the state space is a graph, action costs are not uniform, and node
generation is expensive compared to the overhead of maintaining the frontier and the
reached set.
 The extra time required for repeated regeneration of the same nodes means that
problems that would be practically solvable by A* with unlimited memory become
intractable for SMA*.
BIDIRECTIONAL A* SEARCH

 Bidirectional A* search is sometimes more efficient than A* itself, sometimes not.


 Use f_F(n) = g_F(n) + h_F(n) for nodes going in the forward direction (with the initial state as
root) and f_B(n) = g_B(n) + h_B(n) for nodes in the backward direction (with a goal state as
root).
 Although both forward and backward searches are solving the same problem, they have
different evaluation functions.
 The heuristics are different depending on whether you are striving for the goal or for the
initial state. The heuristic h_F estimates the distance to the goal (or, when the problem has
multiple goal states, the distance to the closest goal) and h_B estimates the distance to the
start. This is called a front-to-end search.
 An alternative, called front-to-front search, attempts to estimate the distance to the other
frontier.
HEURISTIC ACCURACY

 One way to characterize the quality of a heuristic is the effective branching factor b*. If
the total number of nodes generated by A* for a particular problem is N and the
solution depth is d, then b* is the branching factor that a uniform tree of depth d would
have to have in order to contain N + 1 nodes; that is, N + 1 = 1 + b* + (b*)^2 + ⋯ + (b*)^d.
 Experimental measurements of b* on a small set of problems can provide a good guide
to the heuristic's overall usefulness. A well-designed heuristic allows fairly large
problems to be solved at reasonable computational cost.
 A better way to characterize the effect of A* pruning with a given heuristic h is that it
reduces the effective depth by a constant k_h compared to the true depth. This means
the total search cost is O(b^(d−k_h)), compared to O(b^d) for an uninformed search.
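Since b* is defined implicitly by N + 1 = 1 + b* + (b*)^2 + ⋯ + (b*)^d, it can be recovered numerically; a bisection sketch (the example of 52 generated nodes at depth 5 giving b* ≈ 1.92 is the standard textbook case):

```python
def effective_branching_factor(N, d, tol=1e-6):
    """Solve N + 1 = 1 + b + b^2 + ... + b^d for b by bisection.
    N: total nodes generated by A*; d: solution depth."""
    def total(b):
        return sum(b ** i for i in range(d + 1))   # 1 + b + ... + b^d
    lo, hi = 1.0, float(N)                         # total() is increasing in b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < N + 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# A solution at depth 5 using 52 generated nodes:
print(round(effective_branching_factor(52, 5), 2))   # → 1.92
```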
HEURISTIC ACCURACY

 From the definitions of the two heuristics, for any node n, h2(n) ≥ h1(n). When h2 is
always at least as good as h1, we say that h2 dominates h1.

 Domination translates directly into efficiency: A* using h2 will never expand more nodes
than A* using h1 (except possibly in the case of breaking ties unluckily).
RELAXED PROBLEMS

 A problem with fewer restrictions on the actions is called a relaxed problem. The
state-space graph of the relaxed problem is a supergraph of the original state space
because the removal of restrictions creates added edges in the graph.
 Because the relaxed problem adds edges to the state-space graph, any optimal solution
in the original problem is, by definition, also a solution in the relaxed problem; but the
relaxed problem may have better solutions if the added edges provide shortcuts.
 The cost of an optimal solution to a relaxed problem is an admissible heuristic for the
original problem.
 Because the derived heuristic is an exact cost for the relaxed problem, it must obey the
triangle inequality and is therefore consistent.
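For the 8-puzzle, the two classic relaxations yield the familiar heuristics: if a tile could move anywhere in one action, the optimal relaxed cost is the number of misplaced tiles (h1); if a tile could move to any adjacent square even when occupied, it is the sum of Manhattan distances (h2). A sketch (the sample state is illustrative):

```python
def misplaced_tiles(state, goal):
    """h1: number of tiles out of place (relaxation: a tile can move
    anywhere in one action). 0 denotes the blank and is not counted."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal):
    """h2: sum of Manhattan distances of each tile from its goal square
    (relaxation: a tile can move to any adjacent square, even if occupied)."""
    pos = {tile: divmod(i, 3) for i, tile in enumerate(goal)}  # goal coordinates
    total = 0
    for i, tile in enumerate(state):
        if tile != 0:
            r, c = divmod(i, 3)
            gr, gc = pos[tile]
            total += abs(r - gr) + abs(c - gc)
    return total

# Boards as 9-tuples, read row by row; 0 is the blank (illustrative state).
start_state = (7, 2, 4, 5, 0, 6, 8, 3, 1)
goal_state = (0, 1, 2, 3, 4, 5, 6, 7, 8)
print(misplaced_tiles(start_state, goal_state),
      manhattan(start_state, goal_state))   # → 8 18
```

Since h2(n) ≥ h1(n) for every state, h2 dominates h1, and both are admissible because each is an exact optimal cost for its relaxed problem.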
SUBPROBLEMS

 Admissible heuristics can also be derived from the solution cost of a subproblem of a
given problem.

 The idea behind pattern databases is to store these exact solution costs for every possible
subproblem instance.

 The sum of the two costs is still a lower bound on the cost of solving the entire
problem. This is the idea behind disjoint pattern databases.
REFERENCES

 Russell & Norvig, Artificial Intelligence: A Modern Approach, Fourth Ed.

 Russell & Norvig, Artificial Intelligence: A Modern Approach, Third Ed.
 Artificial Intelligence Tutorial – Javatpoint
 Análisis asintótico | Algoritmos y Estructura de Datos (luchonet.com.ar)
