planning notes: offline vs online, and why deductive planning is cool
i recently started studying planning at college. i did not expect it to feel this close to the way i think about programming: specify what you want, specify what operations are allowed, then search for a sequence that makes the world look like the goal.
in this post i just want to write down the basics that helped me orient myself.
what i mean by planning
in classical ai planning you usually have:
- a set of states (facts that are true right now)
- a set of actions/operators with preconditions and effects
- an initial state $s_0$
- a goal $g$ (often a conjunction of predicates)
planning is the problem of finding a plan (a sequence, or sometimes a partial order) of actions that transforms $s_0$ into a state that satisfies $g$.
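to make that concrete, here is a minimal sketch of the whole setup in python. all names (`Action`, `apply`, the facts) are mine, just for illustration — states are sets of facts, actions carry preconditions plus add/delete effects:

```python
# a minimal STRIPS-style problem representation (illustrative, not a real library)
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset     # facts that must hold before the action
    add: frozenset     # facts the action makes true
    delete: frozenset  # facts the action makes false

def apply(state, action):
    """apply an action to a state (a frozenset of facts), if its preconditions hold."""
    assert action.pre <= state, f"{action.name} not applicable"
    return (state - action.delete) | action.add

# tiny example: one action that flips a single fact
a = Action("flip", pre=frozenset({"off"}), add=frozenset({"on"}), delete=frozenset({"off"}))
s0 = frozenset({"off"})
print(apply(s0, a))  # → frozenset({'on'})
```

a plan is then just a sequence of actions such that applying them in order from $s_0$ yields a state containing the goal facts.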
offline vs online planning
this distinction is simple but important.
offline planning (also called deliberative planning):
- you compute a plan before acting
- you assume the model is reliable (or at least stable)
- you execute the plan more or less as computed
it is a good fit when actions are predictable and you can afford thinking time.
online planning (planning while acting):
- you interleave planning and execution
- you re-plan when the world surprises you (or you only plan a short horizon)
it is a good fit when the environment changes, sensing matters, or the model is incomplete.
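the "different control loop" point can be made literal. below is a toy sketch (the `plan_fn`, `execute`, `observe` hooks are placeholders i made up): offline plans once and trusts the plan; online re-plans from fresh observations and only commits to the first step:

```python
# sketch of the two control loops; plan_fn/execute/observe are hypothetical hooks
def offline_control(plan_fn, execute, s0, goal):
    plan = plan_fn(s0, goal)   # think first...
    for action in plan:        # ...then act, trusting the model
        execute(action)

def online_control(plan_fn, execute, observe, goal, max_steps=100):
    for _ in range(max_steps):
        state = observe()      # sense the world on every iteration
        if goal <= state:
            return             # goal facts already hold
        plan = plan_fn(state, goal)
        execute(plan[0])       # commit only to the first step, then re-plan

# toy demo: the world is a set of facts; the goal is reaching {"done"}
world = {"start"}
plan_fn = lambda state, goal: [("add", "done")]
execute = lambda action: world.add(action[1])
observe = lambda: frozenset(world)

online_control(plan_fn, execute, observe, frozenset({"done"}))
print("done" in world)  # → True
```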
for me the interesting part is that both styles are the same idea, just with a different control loop.
two formulations i keep seeing: green vs kowalski
different sources present planning in different “shapes”. two that pop up a lot are:
green formulation (planning as theorem proving):
- you encode actions and change in logic
- planning becomes: prove that the goal is reachable from the initial facts
- a proof corresponds to a plan (the constructive part)
it frames planning as a logical derivation problem.
kowalski formulation (planning as logic + control):
- “algorithm = logic + control”
- you still have a declarative part (rules describing the world)
- but you make explicit the strategy that drives the search (which subgoal first, how to backtrack, what to prefer)
in practice this matches what you actually do when you implement a planner: the domain model is one thing, but search control is where everything becomes feasible or infeasible.
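a toy way to see "logic + control" in code: keep one rule base fixed and swap only the strategy that picks which open subgoal to expand next. the rule names here are invented just to show that the control function alone changes the search order:

```python
# "algorithm = logic + control": same rules, two control strategies (toy sketch)
rules = {"goal": ["sub1", "sub2"], "sub1": [], "sub2": ["sub3"], "sub3": []}

def solve(goal, pick):
    """reduce goals to subgoals; `pick` is the control: which open goal to expand next."""
    open_goals, trace = [goal], []
    while open_goals:
        g = pick(open_goals)
        open_goals.remove(g)
        trace.append(g)
        open_goals.extend(rules[g])
    return trace

depth_first = lambda gs: gs[-1]   # expand the most recently added subgoal
breadth_first = lambda gs: gs[0]  # expand the oldest open subgoal

print(solve("goal", depth_first))   # ['goal', 'sub2', 'sub3', 'sub1']
print(solve("goal", breadth_first)) # ['goal', 'sub1', 'sub2', 'sub3']
```

the declarative part (`rules`) never changes; only the control does, and with it the order (and in bigger searches, the cost) of finding a solution.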
deductive planning (the part i like the most)
deductive planning is the view where building a plan is like building a proof: you start from a goal, and you keep reducing it into subgoals until everything is supported by the initial facts.
one very concrete way to see this is partial-order planning (also called non-linear planning):
- you do not commit immediately to a total sequence
- you maintain a set of actions, ordering constraints, and causal links
- you refine the plan by resolving open preconditions
- you also have to detect and resolve threats
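concretely, a partial-order plan is just three sets: actions, ordering constraints, and causal links. a small sketch (action names chosen to match the example below; `consistent` checks one total order against the constraints):

```python
# a partial-order plan as three sets; illustrative only
from itertools import permutations

actions = {"start", "pickup(b)", "stack(b,c)", "stop"}
orderings = {("start", "pickup(b)"), ("pickup(b)", "stack(b,c)"), ("stack(b,c)", "stop")}
causal_links = {("pickup(b)", "holding(b)", "stack(b,c)")}

def consistent(order, orderings):
    """a total order is consistent if it respects every ordering constraint."""
    pos = {a: i for i, a in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in orderings)

# linearizations: the total orders compatible with the partial order
linearizations = [p for p in permutations(actions) if consistent(p, orderings)]
print(len(linearizations))  # → 1 (these constraints happen to force a single order)
```

the point of "non-linear": when the constraints do not form a chain, several linearizations remain valid, and the planner never had to choose between them.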
tiny blocks world example
i’ll use the classic blocks world because it makes the mechanics obvious.
predicates (informal):
- on(x, y): block $x$ is on block $y$
- ontable(x): block $x$ is on the table
- clear(x): nothing is on top of $x$
- holding(x): the hand is holding $x$
- handempty: the hand is not holding anything
actions (strips-like skeleton):
- pickup(x)
  - pre: handempty ∧ clear(x) ∧ ontable(x)
  - eff: holding(x) ∧ ¬handempty ∧ ¬ontable(x)
- stack(x, y)
  - pre: holding(x) ∧ clear(y)
  - eff: on(x, y) ∧ handempty ∧ clear(x) ∧ ¬holding(x) ∧ ¬clear(y)
(i am not trying to be perfectly formal here, i just want enough to follow the refinement steps.)
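the same skeleton can be written as data. this sketch mirrors the (deliberately informal) pre/eff lists above — i encode facts as tuples and each operator as a dict with `pre`, `add`, and `del` sets:

```python
# blocks-world operators as data, mirroring the informal skeleton above
def pickup(x):
    return {
        "name": f"pickup({x})",
        "pre": {("handempty",), ("clear", x), ("ontable", x)},
        "add": {("holding", x)},
        "del": {("handempty",), ("ontable", x)},
    }

def stack(x, y):
    return {
        "name": f"stack({x},{y})",
        "pre": {("holding", x), ("clear", y)},
        "add": {("on", x, y), ("handempty",), ("clear", x)},
        "del": {("holding", x), ("clear", y)},
    }

def applicable(state, op):
    return op["pre"] <= state

# an initial state where b and c sit clear on the table
s0 = {("handempty",), ("clear", "b"), ("ontable", "b"), ("clear", "c"), ("ontable", "c")}
op = pickup("b")
print(applicable(s0, op))          # → True
s1 = (s0 - op["del"]) | op["add"]
print(("holding", "b") in s1)      # → True
```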
assume a goal like:
$$g = on(b, c) \wedge on(c, a)$$
and an initial state where, among other facts, $b$ is on the table and $c$ is clear.
plan refinement, step by step
start with a minimal plan with two dummy actions:
- start: its effects represent the initial facts
- stop: its preconditions represent the goal facts
then iteratively:
1st step: support on(b, c)
- select an open precondition from the goal: on(b, c)
- choose an action whose effect achieves it: stack(b, c)
- add ordering: start < stack(b, c) < stop
- add causal link: ⟨stack(b, c), on(b, c), stop⟩
2nd step: support on(c, a)
- same idea for on(c, a)
- choose stack(c, a)
- add ordering and the causal link into stop
now the plan has two real actions, but each of them introduces new open preconditions.
3rd step: support holding(b) for stack(b, c)
stack(b, c) needs holding(b).
- select open precondition: holding(b)
- actions that can achieve it include pickup(b) (or unstack(b, y))
- if the initial state says ontable(b), then pickup(b) is the reasonable choice
- add ordering: start < pickup(b) < stack(b, c)
- add causal link: ⟨pickup(b), holding(b), stack(b, c)⟩
at this point you check if the preconditions of pickup(b) are already supported by start.
if they are, you add the corresponding causal links from start.
4th step: do the same for the other branch
repeat the same idea for stack(c, a) (it will need holding(c) and clear(a)), inserting the actions that make those preconditions true.
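putting the four steps together, the plan's bookkeeping at this point can be pictured as a set of causal links plus a small check that every goal precondition of stop is closed. this is a hypothetical snapshot of the example, not a planner:

```python
# causal links after the refinement steps above (hypothetical snapshot)
goal = {"on(b,c)", "on(c,a)"}
causal_links = {
    ("stack(b,c)", "on(b,c)", "stop"),
    ("stack(c,a)", "on(c,a)", "stop"),
    ("pickup(b)", "holding(b)", "stack(b,c)"),
    ("pickup(c)", "holding(c)", "stack(c,a)"),
}

def supported(consumer, precondition, links):
    """a precondition is closed once some causal link supplies it to its consumer."""
    return any(p == precondition and b == consumer for _, p, b in links)

print(all(supported("stop", g, causal_links) for g in goal))  # → True
```

refinement terminates when every precondition of every action in the plan (not just stop's) is supported this way, and no threats remain.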
threats: when planning becomes interesting
this is where the planner stops feeling like a clean proof and starts feeling like real search.
a typical threat pattern is:
- you have a causal link ⟨a, p, b⟩ meaning "action a establishes p for action b"
- some other action t might negate p
- and t could be ordered between a and b
then you must resolve the threat, usually by adding ordering constraints.
one concrete example:
- pickup(b) makes ¬handempty
- but pickup(a) needs handempty
so if you want both in the plan, you need to order them through an action that restores handempty (like stacking/putting down something) or constrain the ordering so the required conditions are preserved.
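the two classic resolution moves are demotion (force t before a) and promotion (force t after b). a minimal sketch, assuming we keep a map from actions to the facts they negate (the simplification here: i do not check whether existing orderings already exclude t from the a–b window):

```python
# detecting a threat to a causal link and resolving it by promotion/demotion (sketch)
def threatens(t, link, negates):
    """t threatens <a, p, b> if t is a different action that can negate p."""
    a, p, b = link
    return t not in (a, b) and p in negates.get(t, set())

def resolutions(t, link):
    """two classic fixes: demote t before a, or promote t after b."""
    a, _, b = link
    return [("order", t, a), ("order", b, t)]  # t < a (demotion) or b < t (promotion)

link = ("pickup(b)", "holding(b)", "stack(b,c)")
negates = {"stack(b,d)": {"holding(b)"}}  # hypothetical: stack(b,d) deletes holding(b)
print(threatens("stack(b,d)", link, negates))  # → True
print(resolutions("stack(b,d)", link))
```

when neither ordering is consistent with the constraints already in the plan, that is exactly the backtracking situation described next.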
and sometimes you discover something worse: a choice you made early makes a later subgoal impossible, and ordering constraints cannot fix it. then you have to backtrack and change the plan structure.
this is why i find deductive planning interesting: you are not just picking actions, you are building a consistent web of commitments.
small closing thought
for now i’m mostly trying to get comfortable with the basic mental model: goals, preconditions, effects, refinement, threats. even in toy domains, it is a nice reminder that “figuring out what to do” can be treated as a first-class computational problem.