Planejamento em Inteligência Artificial Capítulo 9 Heurísticas em Planejamento Leliane Nunes de Barros MAC5788 IME-USP 2005 Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Planejamento como Busca Não-determinística: uma visão conceitual O nó u representa um conjunto de planos soluções u : conjunto de todas as soluções alcançáveis de u. Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Planejadores como instâncias de Abstract-search (I) Planejamento no espaço de estados: u é uma seqüência de ações. Toda solução alcançável apartir de u contém essa seqüência como prefixo ou sufixo. Planejamento no espaço de planos: u é um conjunto de ações com vínculos causais, restrições de ordenação e restrições de unificações. Toda solução alcançável apartir de u contém todas as ações e satisfazem as restrições Algoritmo Graphplan: u é um sub-grafo de um grafo de planejamento, isto é, uma seqüência de conjuntos de ações junto com restrições de pré-condições, efeitos e exclusões mútuas. Toda solução alcançável apartir de u contém as ações em u correspondente aos níveis já resolvidos (da busca no grafo) e pelo menos uma ação de cada nível qua ainda não foi resolvido em u. Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Planejadores como instâncias de Abstract-search (II) Planejamento basead em SAT: u é um conjunto de literais valorados e o restante das cláusulas, cada uma sendo uma disjunção de literais que descrevem ações e estados. Toda solução alcançável apartir de u corresponde a uma atribuição de valores verdade ou falso aos literais não valorados tal que todas as cláusulas restantes sejam satisfeitas. Planejamento baseado em CSP: u é um conjunto de variáveis CSP e restrições com algumas variáveis com valores já atribuidos. Toda solução alcançável apartir de u inclue essas variáveis valoradas e atribuições para as outras variáveis CSP que satisfazerm as restrições. Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Planejadores como instâncias de Abstract-search (III) Planejamento no espaço de estados: u é um plano parcial Planejamento no espaço de planos: u é um plano parcial Algoritmo Graphplan: nem todas as ações em u farão parte do plano solução Planejamento basead em SAT: nem todas as ações em u farão parte do plano solução Planejamento baseado em CSP: nem todas as ações em u farão parte do plano solução Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Planejadores como instâncias de Abstract-search (III) Planejamento no espaço de estados: u é um plano parcial Planejamento no espaço de planos: u é um plano parcial Disjunctive refinement approaches: Algoritmo Graphplan: nem todas as ações em u farão parte do plano solução Planejamento basead em SAT: nem todas as ações em u farão parte do plano solução Planejamento baseado em CSP: nem todas as ações em u farão parte do plano solução Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Planejadores como instâncias de Abstract-search (IV) Refine: modifica a coleção de ações e/ou restrições associadas com um nó u Branch: gera um ou mais filhos de u que serão os nós candidatos para ser o próximo nó visitado. Cada filho v representa um sub-conjunto de soluções v u Prune: remove do conjunto de nós candidatos {u1, u2, ... , uk} alguns nós que parecem ser não promissores para a busca (por exemplo, um nó já visitado que não levou a solução) Planejadores podem: » Variar a ordem em que esses 3 procedimentos são executados » Usar diferentes mecanismos de controle, como IDS ou A* Como os planejadores que estudamos realizam esses procedimentos (páginas 195 à 197) Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Tornando Abstract-search Determinística Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Heurística de seleção de nós: resolve o problema da escolha não-determinística Suponha que para cada estado s, temos uma estimativa h(s) da distância de s para uma solução h(s) é usada para decidir qual será o próximo nó a ser expandido quanto mais informativa for h(s) menor será o número de escolhas erradas h(s) deve ainda ser facilmente computável » Heurísticas são geralmente baseadas mo princípio de relaxação de problemas: para avaliar o quão desejável é um nó consideramos um problema mais simples obtido apartir do original fazendo suposições de simplicação e relaxando restrições Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Heurística de seleção de nós Suppose we’re searching a tree in which each edge (s,s') has a cost c(s,s') If p is a path, let c(p) = sum of the edge costs For classical planning, this is the length of p For every state s, let g(s) = cost of the path from s0 to s h*(s) = least cost of all paths from s to goal nodes f*(s) = g(s) + h*(s) = least cost of all paths from s0 to goal nodes that go through s Suppose h(s) is an estimate of h*(s) Let f(s) = g(s) + h(s) » f(s) is an estimate of f*(s) h is admissible if for every state s, 0 ≤ h(s) ≤ h*(s) If h is admissible then f is a lower bound on f* Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ The A* Algorithm A* on trees: loop choose the leaf node s such that f(s) is smallest if s is a solution then return it and exit expand it (generate its children) On graphs, A* is more complicated: additional machinery to deal with multiple paths to the same node If a solution exists (and certain other conditions are satisfied), then: If h(s) is admissible, then A* is guaranteed to find an optimal solution The more “informative” the heuristic is (i.e., the closer it is to h*), the smaller the number of nodes A* expands If h(s) is within c of being admissible, then A* is guaranteed to find a solution that’s within c of optimal Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Heuristic Functions for Planning *(s,p): minimum distance from state s to a state containing p *(s,s’): minimum distance from state s to a state containing every p in s’ i(s,p) and i(s,s’), where i = 0, 1, 2, … estimates of *(s,p) and *(s,s’) 0 ignora os efeitos negativos das ações and p s, h(s) = 0(s,g), where g is the goal Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Heurística da relaxação de independência (polinomial no número de proposições e ações) Given s, can compute 0(s,p), for every proposition p: From this, can compute h(s) = 0(s,g) = p g 0(s,p) Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Heuristic Forward Search Note: this is depth-first search; thus admissibility is irrelevant Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Heuristic Backward Search Same comment as on the previous slide Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ and p s, h(s) = 0(s,g) is an inadmissible heuristic 1: like 0 except that 1(s,g) = maxp g 0(s,p) This heuristic is admissible; thus it could be used with A* It is not very informative Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ A More Informed Heuristic Instead of computing the maximum distance to each p in g, compute the maximum distance to each pair {p,q} in g: Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ More Generally, … Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Complexity of Computing the Heuristic Takes time (nk). If k = max(|g|, max{|precond(a)| : a is an action}) then computing (s,g) is as hard as solving the entire planning problem Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Get Heuristic Values from a Planning Graph The planning graph provides not only the distance estimates from s0 to every reachable proposition p but also the mutex relations Pi The Solution extraction procedure selects a set of propositions g in a layer only if no pair of elements in g is mutex. Theorem: if a pair of elements in not mutex in a layer, then it remains nonmutex in the following layers this is very close to 2(s,g) Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Get Heuristic Values from a Planning Graph Recall how GraphPlan works: loop Graph expansion: this takes polynomial time extend a “planning graph” forward from the initial state until we have achieved a necessary (but insufficient) condition for plan existence this takes exponential time Solution extraction: search backward from the goal, looking for a correct plan if we find one, then return it repeat Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Using the Planning Graph to Compute h(s) In the graph, there are alternating layers of ground literals and actions The number of “action” layers is a lower bound on the number of actions in the plan Construct a planning graph, starting at s g(s,p) = level of the first layer that “possibly achieves” p g(s,g) is very close to 2(s,g) 2(s,g) counts each action individually g(s,g) groups together the independent actions in a layer Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ The FastForward Planner Use a heuristic function similar to h(s) = g(s,g) Some ways to improve it (I’ll skip the details) Don’t want an A*-style search (takes too much memory) Instead, use a greedy procedure: until we have a solution, do expand the current state s s := the child of s for which h(s) is smallest (i.e., the child we think is closest to a solution) There are some ways to improve this (I’ll skip the details) Can’t guarantee how fast it will find a solution, or how good a solution it will find However, it works pretty well on many problems Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ FastForward in the AIPS-2000 Planning Competition FastForward did quite well In the AIPS-2000 competition, all of the planning problems were classical problems Two tracks: Fully automated planners FastForward participated in the fully automated track » It got an “outstanding performance” award Large variance in how close its plans were to optimal » However, it found them very fast compared with the other automated classical planners Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ FastForward in the AIPS-2002 Planning Competition Among the automated planners, FastForward was roughly in the middle LPG (graphplan + local search) did much better, and got a “distinguished performance of the first order” award One of thing that caused difficulty for FastForward The problems in the AIPS-2002 went beyond classical planning » Numbers, optimization, time durations Example: A domain inspired by the Hubble space telescope (a lot simpler than the real domain, of course) » A satellite needs to take observations of stars » Gather as much data as possible before running out of fuel Any amount of data gathered is a solution » Thus, FastForward always returned the null plan Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Heuristics for Plan-Space Planning How to select the next flaw to work on? Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ One Possible Heuristic Fewest Alternatives First (FAF) Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Do Others Work Better? Sometimes yes, sometimes no Limits to how good any flaw-selection heuristic can do Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Serializing and AND/OR Tree Partial plan p The search space is an AND/OR tree Goal g2 … Variable … Goal constraint ordering Operator o1 … Operator on Goal g1 Deciding what flaw to work on next = serializing this tree (turning it into a state-space tree) at each AND branch, choose a child to expand next, and delay expanding the other children Partial plan p Goal g1 Operator o1 Partial plan p1 Goal g2 … Variable … Order constraint tasks … Operator on Partial plan pn Goal g2 … Variable … Order constraint tasks Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ One Serialization Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Another Serialization Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Why Does This Matter? Different refinement strategies produce different serializations the search spaces have different numbers of nodes In the worst case, the planner will search the entire serialized search space The smaller the serialization, the more likely that the planner will be efficient One pretty good heuristic: fewest alternatives first (FAF heuristic) => choose the flaw having the smallest number of resolvers. This will limit the cost of eventual backtracks. Takes O(n) where n is the number of flaws in a partial plan. Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ How Much Difference Can the Refinement Strategy Make? Case study: build an AND/OR graph from repeated occurrences of this pattern: Example: number of levels k = 3 branching factor b = 2 Analysis: Total number of nodes in the AND/OR graph is n = Q(bk) How many nodes in the best and worst serializations? Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Case Study, Continued The best serialization contains k 2 Q(b ) nodes k k 2 The worst serialization contains Q(2 b ) nodes The size differs by an exponential factor, but the best serialization still is exponentially large To do better, need good node selection, branching, pruning Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/ Resolver selection heuristics Let Q be the set of all plans generated with all the possible reasolvers. Select a plan that has the smallest set of sug-goal whithout causal links (not very informed heuristic). A more informative heuristic: construct an AND/OR graph along regression steps defined by -1 down to some fixed level k Select the plan that minimizes the wheighted sum of: 1. The number of actions in this graph that are not in and 2. The number of subgoals remaining in its leaves that are not in the initial state Leliane Nunes de Barro. Slides extraidos de Lecture slides for Automated Planning by Dana Nau. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.0/