Bridging the Modelling Gap: Examining the Expressiveness of Planning Domain Description Languages

Derek Long and Maria Fox
Department of Computer Science, University of Durham, UK

Abstract

Research in planning has advanced rapidly over recent years, spurred partly by a series of international planning competitions and the convergence on empirical evaluation that this has encouraged. One of the most important contributions made by the competition sequence is the development of a community standard modelling language, PDDL. This language has been extended from a relatively simple action description language to include constructs to express time and the use of resources and other metrically measured quantities. The traditional boundaries between planning and scheduling have become increasingly blurred, so that planning research is now regularly considered in application to problems that include management of resources, time, and the combined planning and scheduling of complex concurrent activities. In this paper we examine the extent to which PDDL2.1, the most recent development of the community standard modelling language, supports expression of realistic features of complex application domains. This examination is carried out with particular reference to two domains inspired by space applications: satellite observation planning and planetary rover exploration planning. Versions of these domains were used in the 3rd International Planning Competition (2002). The limitations of these versions with respect to the application problems, and the gap between the modelling language and the demands of these applications, are explored.

Introduction

Planning research has made dramatic strides forward in the past few years. This is well illustrated by the leap in performance between planners competing in the 1st International Planning Competition (Long 2000), in 1998, and those appearing in the 3rd competition this year (2002). Where the earlier planners were considered trail-blazing when they could solve problems requiring up to 50 steps, it is now possible to find plans involving 200 or more steps. Perhaps more importantly, this performance gain has not been focussed on the problems of purely logical state transitions that dominated planning four years ago, but has spread far beyond to include problems that involve numeric-valued state changes and actions with duration, allowing plans to be constructed to exploit concurrency. This huge jump forward has brought mainstream planning research to the point where it is possible to imagine modelling real problems and tackling them with generic planning technology.

Indeed, attempting to close the gap between the toy domains that have dominated planning research and the modelling demands of potential application areas was one of the goals of the last competition. This was approached by presenting models of domains inspired by real potential application areas and, in particular, by the application area of space missions. The problems introduced were the Satellite domain and the Rovers domain. The first is inspired by the problem of planning satellite observations and the second by planning the explorations of planetary rovers. In this paper we will concentrate primarily on the features of the Satellite domain, considering what can be expressed using the extensions of the planning language introduced for the 3rd IPC. We also consider the areas where the model falls short and identify the remaining challenges that must be confronted to bring this application, and others like it, within the modelling capabilities of planning systems.

The Satellite Domain

The problem of satellite observation scheduling was discussed as a planning problem in (Smith, Frank, & Jónsson 2000). Satellite observation planning is the following problem: one or more satellites are available in orbit, each equipped with various observation equipment. Requests are made by interested parties to have observations made using particular instruments, perhaps at particular times. Making observations will usually involve slewing the satellite in order to align the instrument making the observation with the target, possibly first calibrating the instrument using a different target. Before calibrating or otherwise using an instrument, the instrument must be powered up and, in some cases, warmed up for some time. The observation will generate data that must be stored in a limited-capacity onboard store and subsequently downlinked to a ground station. Ground station communication is an expensive commodity, since the ground station must be devoted to tracking the satellite during downlink. Typically, a satellite has limited opportunities to downlink data to any given ground station, constrained by the period during which communication is possible within the orbit of the satellite. Communication bursts must be prefaced by a hand-shaking protocol and concluded with a sign-off protocol. The planning problem arises in attempting to decide which requests to fulfil, what order to execute them in, when to communicate data and, if there is choice, which satellite to use to fulfil each request. Further choices can arise in which calibration targets to use in preparing instruments.

This domain involves elements of planning, where there is choice between alternative satellites for an observation, between ground stations for downlink or between calibration targets. These choices can lead to significantly different decisions about which observations to attempt, how to allocate those tasks to different platforms and so on. The domain also contains elements of scheduling: the observations must be fitted into the time windows during which the targets are visible, downlinks must also be placed into windows of opportunity, activities on board each satellite must be managed so that the data capacity is never exceeded and so on. In exhibiting elements of both of the traditional concerns of planning and scheduling, this problem is typical of many real applications and illustrates one of the reasons that the two research fields are increasingly convergent.

Within the structure of the Satellite domain lies a constrained bin-packing problem, in which the data capacities of the satellites represent separate bin capacities and the data packets generated by making observations represent the objects to be packed. Although the data stores can be emptied by downlinking data, the ability to do this will be constrained by the availability of downlink opportunities (windows of availability of ground stations) and downlinking might, in fact, be split across multiple contacts with different ground stations.

An important contrast with classical planning is that the goal of an instance of the problem is a complex interplay between maximizing data acquired and minimizing the cost of acquiring it, rather than a simple collection of logical propositions. This, again, is typical of real application areas: there might be certain hard constraints on what must be achieved (for example, an observation that must be made regardless of cost), but most of the requirements of a solution are expressed through the desire to optimise some function of parameters that describe the numerically measurable features of the domain.

Encoding Planning Domains

The first challenge for the planning community is to find a language and, importantly, a semantics for the language, that allows expression of the features of this problem. Within the research community, progress towards common agreement on the expressive features of a domain description language, and on its semantics, has been driven most forcefully by the sequence of planning competitions (Long 2000) (see also: www.cs.toronto.edu/aips2000 and www.dur.ac.uk/d.p.long/competition.html). The first of these motivated the development of PDDL (McDermott 2000), which formed a standard syntax for the large core of generally accepted planning language elements. The second competition led to the consolidation of this core and established the goal of striving to test the expressive power of the language with real application problems (Koehler & Schuster 2000). The third competition saw a dramatic development in the expressive power of PDDL (Fox & Long 2001b) and significant strides towards the expression of real problems.

PDDL2.1 allows expression of actions with duration and defines the way in which actions can be allowed to execute concurrently. It consolidates the use of numeric values in the context of planning domains and integrates this with the measurement of duration of actions. A semantics is supplied which is also embodied in a software tool allowing automatic validation of plans (Long & Fox 2001). An example of an action with duration is given in figure 1, in which it can be seen that constraints on application of an action can be expressed in terms of local conditions that must hold at the start or at the end of the action, or that must hold as invariants over the entire duration of the action. Duration can be expressed as a function of parameters of the action. The effects of a durative action are similarly localised to the end points of the action. In this example the sole effect occurs at the end of the action, but in other examples there can be effects at both ends. For example, turning a satellite will cause it, at the start of the activity, to be moving and to be no longer pointing at whatever angle it was previously facing, and then to stop moving and to be facing a new angle once the activity ends.

In addition to actions with duration, PDDL2.1 makes a more precise commitment to the expression of numeric values than did its predecessor. An example can be seen in figure 2, where the data capacity of a satellite is correctly managed by the effects of an action that produces data by making an observation. The apparent oddity of the repetition of the requirement that the instrument power be on throughout the action and at the end of the action is a technical issue that arises because of the distinction between open and closed intervals of time: the invariant covers the open interval between the end points, so the end point itself must be constrained separately.

A further innovation in PDDL2.1, made possible by the resolution of ambiguities in the expression of numerical quantities, is the introduction of problem-specific plan metrics. These are expressions, associated with a specific instance of a problem for a domain, that express how the final plan is to be evaluated. In particular, these expressions can state that some function of parameters of the problem is to be minimized or maximized. This function can also refer to the total time over which the plan will be executed, so it is possible to seek to minimize execution time, or to balance this against other features.
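As a hedged illustration of a problem-specific plan metric, the fragment below sketches a problem instance that balances execution time against a second quantity. It is a minimal sketch rather than a competition file: the predicates and the functions calibration_time, data-capacity, data and data-stored are those that appear in figures 1 and 2, while the object names, all numeric values, the weight 2 and the function (fuel-used) are assumptions introduced here, and the rest of the domain (for example an action that slews the satellite and increases fuel-used) is presupposed.

```pddl
; Hypothetical problem instance. Object names, numeric values and the
; function (fuel-used) are illustrative assumptions.
(define (problem satellite-metric-sketch)
  (:domain satellite)
  (:objects sat0 - satellite
            cam0 - instrument
            thermograph - mode
            star5 ground1 - direction)
  (:init (on_board cam0 sat0)
         (supports cam0 thermograph)
         (calibration_target cam0 ground1)
         (pointing sat0 ground1)
         (power_on cam0)
         (= (calibration_time cam0 ground1) 5)
         (= (data-capacity sat0) 1000)
         (= (data star5 thermograph) 120)
         (= (data-stored) 0)
         (= (fuel-used) 0))
  ; An orthodox logical goal: one observation must have been made.
  (:goal (have_image star5 thermograph))
  ; The metric states how any plan reaching the goal is to be evaluated:
  ; a weighted sum of total execution time and fuel consumed.
  (:metric minimize (+ (* 2 (total-time)) (fuel-used))))
```

Changing only the final line, for instance to `(:metric minimize (total-time))`, alters how plans for the same instance are ranked without touching the domain or the goal.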

Encoding the Satellite Domain

The features of PDDL2.1 allow a reasonable first approximation of the Satellite domain to be captured. We based an encoding on an earlier attempt by Haslum and Geffner (Haslum & Geffner 2001). Satellites equipped with instruments that must be powered up and calibrated before use can easily be described. The activity of slewing a satellite between targets and the activity of capturing images can be modelled, together with the effects of fuel consumption and of use of the data store. Importantly, it is also possible to express a plan metric that captures the desired maximization of data acquired. It is possible to use more complex expressions to capture the intention that the data be acquired as quickly as possible and as cheaply as possible, provided that these criteria can be combined into a single function. In encoding the Satellite domain we chose not to consider some of the less central activities such as warming up instruments.

    (:durative-action calibrate
      :parameters (?s - satellite ?i - instrument ?d - direction)
      ; duration depends on the instrument and the calibration target
      :duration (= ?duration (calibration_time ?i ?d))
      :condition (and (over all (on_board ?i ?s))
                      (over all (calibration_target ?i ?d))
                      (at start (pointing ?s ?d))
                      ; power is required throughout and at the end point
                      (over all (power_on ?i))
                      (at end (power_on ?i)))
      :effect (at end (calibrated ?i)))

Figure 1: Example durative action drawn from a version of the Satellite Domain.

    (:durative-action take_image
      :parameters (?s - satellite ?d - direction ?i - instrument ?m - mode)
      :duration (= ?duration 7)
      :condition (and (over all (calibrated ?i))
                      (over all (on_board ?i ?s))
                      (over all (supports ?i ?m))
                      (over all (power_on ?i))
                      (over all (pointing ?s ?d))
                      (at end (power_on ?i))
                      (over all (>= (data-capacity ?s) (data ?d ?m)))
                      (at end (>= (data-capacity ?s) (data ?d ?m))))
      ; storage is consumed at the start; the image and its data exist at the end
      :effect (and (at start (decrease (data-capacity ?s) (data ?d ?m)))
                   (at end (have_image ?d ?m))
                   (at end (increase (data-stored) (data ?d ?m)))))

Figure 2: An example of an action with numeric effects.

Using this expressive power, the 3rd IPC included a collection of variants on the Satellite domain that, in their most complex form, included constraints on data storage, time and fuel use. A particularly interesting variant, since it captures a problem form that has not been seen in previous planning benchmarks, is one in which there are no logical goals at all, but the metric expresses the requirement that plans should seek to maximize data acquired. In other variants, the goals were expressed in more orthodox style as the requirement that a certain collection of observations should have been made, but with the metric seeking minimization of time and fuel use in some combination.

To illustrate the achievements of some of the participants on variants of the Satellite domain, consider the following figures. The graphs show various performance statistics for the planners FF (Hoffmann & Nebel 2000), MIPS (Edelkamp 2002), Sapa (Do & Kambhampati 2001), LPG (Gerevini & Serina 2002), SHOP2 (Nau et al. 1999), TLPlan (Bacchus & Kabanza 2000) and TALPlanner (Kvarnstrom & Doherty 2000). The last three planners use hand-coded control knowledge, while the others are fully automatic, requiring only a description of the legal actions in the domain. The first graph, in figure 3, shows the planners working with a numeric variant of the Satellite domain, in which the satellites have constrained data storage capacity. Figure 4 shows plans produced for a temporal variant of the domain, in which actions have duration. Here the objective was to minimize the time for execution of the plan.

[Figure 3 plot not reproduced. Panel: Satellite-Numeric; x-axis: problem number; y-axis: quality. Planners shown: TLPlan (20 solved), SHOP2 (20 solved), MIPS (Plain setting) (7 solved), FF (Speed) (14 solved), LPG (Quality) (10 solved).]

Figure 3: Numeric Satellite variant, plan quality: smaller values are better plans. SHOP2 and TLPlan, both using hand-coded knowledge, generally produce better quality plans.
[Figure 4 plot not reproduced. Panel: Satellite-Time; x-axis: problem number; y-axis: quality. Planners shown: TLPlan (20 solved), TALPlanner (20 solved), SHOP2 (20 solved), MIPS (Plain setting) (19 solved), Sapa (19 solved), LPG (Quality) (19 solved).]

Figure 4: Timed Satellite variant, plan quality: smaller values are better plans. LPG, a fully automated planner, produces good plans here.

Figure 5 shows how fast these plans were produced (on an 1800MHz Athlon CPU PC, with 1GB RAM), while figure 6 shows that these plans contain up to 140 steps in some cases. Note that the planners using hand-coded controls produced plans containing more steps in general, but that these plans are often of the best, or close to best, quality. This reflects the exploitation, by the domain engineers, of certain sequences of actions that do not obey a “triangle inequality”: for some actions A, B and C, actions A and B achieve a goal that can otherwise be achieved by action C, but the sum of the costs of actions A and B is smaller than the cost of action C. The automatic planners tend to favour selecting single-action solutions to goals when they can, even though this is not always optimal. This fact indicates a weakness in the responses of fully-automated planners to plan quality metrics: the fundamental assumption that a plan with fewer steps is always to be preferred is bound into the planning strategy in such an integral way as to make it hard, or even impossible, for these planners to find optimal quality plans that break this expectation. However, this observation must be tempered by the fact that LPG, a fully automatic planner, produced some of the best quality plans to problems in the set.

The graph in figure 7 shows the performance of planners on the Complex Satellite variant, in which satellites are equipped with numeric data capacities and the objective is to minimize the time taken, given that the data must be “bin-packed” into the data stores. The last figure, figure 8, shows performance on the HardNumeric variant.
[Figure 5 plot not reproduced. Panel: Satellite-Time; x-axis: problem number; y-axis: milliseconds, on a log scale from 1 to 1e+07. Planners shown: TLPlan (20 solved), TALPlanner (20 solved), SHOP2 (20 solved), MIPS (Plain setting) (19 solved), Sapa (19 solved), LPG (Quality) (19 solved).]

Figure 5: Timed Satellite variant, planning speed: note logscale. The three planners exploiting hand-coded controls are fastest, but all of LPG’s plans are produced in under 15 minutes.

[Figure 7 plot not reproduced. Panel: Satellite-Complex; x-axis: problem number; y-axis: quality. Planners shown: TLPlan (20 solved), SHOP2 (20 solved), MIPS (Plain setting) (10 solved), Sapa (16 solved).]

Figure 7: Complex Satellite variant, plan quality: this variant involves both durative actions and numbers. Sapa and MIPS are both fully-automated planners, with Sapa showing particularly competitive quality compared with the planners using hand-coded controls.

The HardNumeric variant is unusual in demanding that the planners maximize the data acquired, while not requiring them to satisfy any significant logical goals (in many cases there are no logical goals at all, while in other cases there are some goals indicating final facings for satellites). It can be seen that the fully-automated planners do not currently tend to respond well to this. Of these, only MIPS produced plans that acquired any data, and it did not do so very effectively. The planners with hand-coded controls produce much better plans, varying in the quality of their solutions to the packing problem by only a few units.
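To make the shape of a goal-free instance concrete, the following is a minimal sketch of a HardNumeric-style problem, not one of the competition files. The functions data-capacity, data and data-stored are those of figure 2; the object names and all numbers are assumptions chosen for illustration.

```pddl
; Hypothetical instance with no logical goals. Object names and numeric
; values are illustrative assumptions.
(define (problem satellite-no-goals-sketch)
  (:domain satellite)
  (:objects sat0 - satellite
            cam0 - instrument
            infrared - mode
            star1 star2 star3 ground1 - direction)
  (:init (on_board cam0 sat0)
         (supports cam0 infrared)
         (calibration_target cam0 ground1)
         (pointing sat0 ground1)
         (power_on cam0)
         (= (calibration_time cam0 ground1) 5)
         (= (data-capacity sat0) 250)
         (= (data star1 infrared) 120)
         (= (data star2 infrared) 100)
         (= (data star3 infrared) 140)
         (= (data-stored) 0))
  ; The empty conjunction is satisfied in every state.
  (:goal (and))
  ; Only the metric distinguishes one plan from another.
  (:metric maximize (data-stored)))
```

Because the goal holds in the initial state, the empty plan is already a valid solution with a metric value of 0, so a planner must be driven by the metric to do anything at all. With these numbers the packing aspect is also visible: imaging star2 and star3 stores 240 units within the capacity of 250, whereas star1 with star3 would need 260 units and so cannot both be taken.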

[Figure 6 plot not reproduced. Panel: Satellite-Time; x-axis: problem number; y-axis: number of steps, from 0 to 160. Planners shown: TLPlan (20 solved), TALPlanner (20 solved), SHOP2 (20 solved), MIPS (Plain setting) (19 solved), Sapa (19 solved), LPG (Quality) (19 solved).]

Figure 6: Timed Satellite variant, plan length. This plot shows the size of the plans produced, placing the planning times in context. Note that the fully-automated planners typically produce plans with fewer steps.

Challenges for the Future

Encoding the Satellite domain has been a problem we have been considering, with the help of David E. Smith and Jeremy Frank at NASA Ames, for some time. There are elements of the real problem that are not to be found in any of the variants used in the competition. The most important of these is that none of the variants examines the problem of downlinking data: once data is captured it stays on board the satellite that made the observation. A second aspect that does not appear in these variants is that opportunities for capturing observations are typically constrained to certain time windows. To express both of these parts of the real problem requires a language that has the expressive power to capture exogenous events: events that occur because conditions in the world trigger them, not because the executive enacts them. Of course, many events can be triggered as a consequence of activities of an executive, but the important feature that distinguishes actions from events is that actions are transitions in world state wrought entirely by the agent, while events are transitions in world state that occur because the state of the world demands them. In principle, an executive could decide, at the instant of application of an action, to abort the action and not cause the state transition. This choice is not available at the instant of the triggering of an event. This distinction is of critical importance in modelling windows of opportunity, such as time windows during which an observation can be made, or windows during which a ground station can be contacted by a given satellite. This is because these windows do not open and close under the control of the executive (or of any of the multiple executives, in this case), but as a consequence of the uncontrolled passage of time.

[Figure 8 plot not reproduced. Panel: Satellite-HardNumeric; x-axis: problem number; y-axis: quality, from 0 to 6000. Planners shown: TLPlan (20 solved), SHOP2 (20 solved), FF (Speed) (20 solved), MIPS (20 solved).]

Figure 8: HardNumeric Satellite variant, plan quality. In this variant, exceptionally, the quality metric is reversed: good plans require higher values. Only the planners using hand-coded controls do particularly well.

A different limitation of the expressive powers of PDDL2.1 is exposed in the Rovers problem. In this problem it should be possible for a rover to recharge its batteries using solar cells concurrently with certain activities that simultaneously drain power from the batteries. The elements of PDDL2.1 used in the 3rd IPC only support discrete changes in numeric values. This makes it impossible to properly model the interaction of charging and discharging activities when they operate at the boundary of the capacity of the battery. Consider the case when the battery is fully charged and the rover then drives, in sunlight, to a distant destination. In practice the driving will consume charge continuously and the rover could recharge as it travels. However, using step functions to model charge use makes it impossible to model this situation properly. If the driving action is modelled with discrete consumption at the start of the activity then the battery will appear to have a discharged capacity that can be recharged by exploiting solar recharging during the drive. Unfortunately, if the recharge rate is sufficiently high this can allow the planner to plan to recharge the battery before the driving is complete, hence replacing the used charge before it is even consumed! This can lead to situations in which plans are constructed that could not be executed in practice (consider the situation in which the rover drives from sunlight into shade, and the planner plans to recharge the battery in the first part of the drive while sunlight is available, expecting the battery to be fully charged on arrival at the destination). Equally, if discharging is modelled as a step function at the end of the drive action then the battery cannot be recharged until the drive is over. Note that this situation cannot be adequately handled by making conservative assumptions that prevent concurrent activities from affecting the same numeric value.

PDDL2.1 contains syntactic constructs and a semantic model for the use of durative actions with continuous effects. Such actions allow one to model concurrent behaviours more accurately. However, even the task of validating plans with continuous effects is difficult (even restriction to linear continuous change and simple threshold invariant conditions on durative actions leaves a tricky problem, since the invariants can include disjunctions which complicate matters considerably; see footnote 1). Once continuous effects are considered a further, even greater, complication becomes apparent: it is often the case that the end of a durative action really represents the interaction of a continuous effect with some threshold trigger. Indeed, PDDL2.1 fails to provide any mechanism for distinguishing the underlying reason for the completion of a durative action: by deliberate action on the part of the executive or by a triggered consequence of a process initiated by earlier action of the executive. This lack of distinction is also apparent in other languages that have been developed for expressing temporal activities in planning, such as the interval-based language used in HSTS (Muscettola 1994), or the temporal extension of STRIPS used in TGP (Smith & Weld 1999).
The failure to distinguish between these alternatives is unlikely to affect the exploitation of planning technology in the short term (not least because managing any form of continuous change in planning problems is already at the boundaries of the achievements of planning technology). However, it is not hard to conceive of applications in which a system must reason about plan failure during plan execution, and this will require a model that exposes continuous changes that have been initiated by earlier actions and which will not terminate without active intervention by the executive.

Linking planning and execution is a challenge in its own right and one which is increasingly pressing. In terms of modelling, this problem demands a richer expressiveness in the plan description language. Currently, plans are modelled as time-stamped collections of actions (with durations where appropriate). Although there are situations in which one can imagine such a plan being appropriate (where the executive is a machine equipped with no sensors, required to initiate activities simply according to a timetable), many situations demand greater flexibility and responsiveness in the execution of a plan and this, in turn, requires that the executive be supplied with more information about the dependencies within a plan structure. A plan that expresses the dependencies between actions allows an executive to appropriately delay or promote activities whose antecedents have been completed unexpectedly early or late. Since the precise capabilities of an executive system will vary, the domain modelling language will have to expand to include means by which the capabilities of the executive can be captured so that a plan of appropriate structure can be constructed. This model will have to indicate the sensory capabilities of the executive, the distributed capabilities of the executive system (which might consist of multiple executive agents), the various costs associated with use of and risk to parts of the executive system and the mapping of actions to executive sub-systems. This remains an almost completely unexplored territory.

Footnote 1: The EPSRC funded Linguist project, held by the authors, is supporting ongoing work exploring this problem.
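Returning to the rover recharging example in this section, the fragment below is a minimal sketch of how the continuous-effect constructs of PDDL2.1 might be used to model concurrent driving and solar recharging. It is not taken from the competition Rovers domain: every predicate and function name (located, in_sun, charge, capacity, drive_rate, recharge_rate, drive_time, sun_window) is an assumption introduced for illustration, and the fragment presupposes a domain that declares them and requests the :durative-actions, :fluents, :continuous-effects and :duration-inequalities requirements.

```pddl
; Hypothetical Rovers-style fragment. All predicate and function names are
; illustrative assumptions.
(:durative-action drive
  :parameters (?r - rover ?from ?to - waypoint)
  :duration (= ?duration (drive_time ?from ?to))
  :condition (and (at start (located ?r ?from))
                  ; threshold invariant: the battery may never run flat
                  (over all (>= (charge ?r) 0)))
  :effect (and (at start (not (located ?r ?from)))
               (at end (located ?r ?to))
               ; continuous effect: charge drains at drive_rate per unit time
               (decrease (charge ?r) (* #t (drive_rate ?r)))))

(:durative-action recharge
  :parameters (?r - rover)
  ; the planner chooses how long to recharge, up to a hypothetical limit
  :duration (<= ?duration (sun_window ?r))
  :condition (and (over all (in_sun ?r))
                  ; threshold invariant: the battery cannot be overfilled
                  (over all (<= (charge ?r) (capacity ?r))))
  ; continuous effect: charge rises at recharge_rate per unit time
  :effect (increase (charge ?r) (* #t (recharge_rate ?r))))
```

When the two actions overlap, the charge changes at the net rate (recharge_rate minus drive_rate), so charge is never credited before it has been consumed, which is the behaviour the step-function models cannot express. The two invariants are examples of the simple threshold conditions whose validation is described above as difficult.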

Pushing the Expressiveness of PDDL2.1

PDDL2.1 is a powerful language and it is possible to use it to capture several of the behaviours that we have described above without further extension. In this subsection we will consider some of the strategies that can be exploited to achieve this. The intention is to explore what parts of the expressive power required to capture events and processes lie within PDDL2.1, in order to better understand both the potential demands of working with PDDL2.1 and also whether the constructions outlined above represent a genuine increased demand on planning systems.

We begin by considering events. The crucial characteristic of events is that a planner cannot avoid their effects by simply not choosing to place them in a plan: they are a necessary consequence of some triggering conjunction of propositions. The trigger can consist of propositions that are made true by planned activities, but it can also include the mere passage of time. This is the case, for example, for the windows of opportunity for downlinking data, where the events of the window opening and closing are triggered simply by the passage of time (the effect of orbiting).

It turns out that it is possible to model events triggered by passage of time using PDDL2.1. The technique is not entirely elegant, since it involves the construction of artificial actions to model the events, and therefore demands direct modification of the domain encoding for each separate problem instance. Nevertheless, this approach demonstrates the expressive powers of PDDL2.1. Before exploring the technique, we briefly review the intention: PDDL2.1 supports the modelling of activities with duration but, as with other common planning domain models before it, it only attempts to represent actions that are executed at the instruction of the planner. If the planner chooses not to schedule an activity, then the activity will not occur. In contrast, events are intended to be activities that will occur without the intervention of the executive and without choice on the part of the planner. To simulate this in PDDL2.1 it is necessary to create a situation in which the planner is forced to enact an action that simulates a required event, regardless of whether it is to the advantage of the planner or not, and to do so at precisely the time the event should occur.

The technique is as follows: suppose event E is required to occur at time t. A durative action, $A_E$, of length t and with the effects of the event E can be constructed which has, as an additional initial effect, the proposition $P_E$. $P_E$ is added as a new precondition to every other action in the domain, preventing the planner from attempting any other activity until $A_E$ is executed. That is, the planner can only construct plans that begin with the action $A_E$ and which, therefore, cause the effects of event E to take place at time t after the start of execution of the plan. If the event E has additional triggering conditions it is possible to make all the effects in $A_E$ that model the effects of E conditional on those trigger propositions. It will be noted that this achieves the intention, although there can be an arbitrary gap between the initial state at time 0 and the point at which the event is triggered. Since the planner cannot schedule any other activity in this gap, the dead time can be ignored.

This approach still assumes that the time at which E can trigger is fixed. Indeed, a disadvantage of this technique is that it can only be used to model events that occur at predetermined times: it is not appropriate for capturing events that might be triggered as a consequence of activities on the part of an executive. A further disadvantage is that it fails if several events must all be triggered at specific times, but each at a different time. The reason is that a single durative action can only trigger effects at the conclusion of its duration. If the effects of several different events are attached to different durative actions there is nothing that can be done (in PDDL2.1) to force these several actions to be executed from the same start time. The mechanism used to force the planner to execute the single event-simulating action, which is to give it a dummy initial effect that is required for execution of any other actions, is not powerful enough to force multiple actions to be executed simultaneously. Thus, a planner would be free to reschedule inconveniently clustered events to fit a more convenient timetable of its own, subject to the constraint that all of the triggering actions were executed before any of the other actions in the plan. This would undermine the intention behind the model of events that are activated regardless of the intervention of an executive under instruction by the planner.
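The single-event construction just described can be sketched in PDDL2.1 as follows. This is an illustration under stated assumptions rather than an encoding used in the competition: the event chosen (a downlink window opening), the time 40, the action names and the propositions (enabled) and (window_open) are all hypothetical, with (enabled) standing for the dummy proposition added by the event-simulating action. The predicates on_board and power_on are those of figure 1.

```pddl
; Event E: a downlink window opens 40 time units into the plan.
; Names and the value 40 are illustrative assumptions.
(:durative-action simulate_window_opening    ; the event-simulating action
  :parameters ()
  :duration (= ?duration 40)                 ; t, the time at which E occurs
  :condition (and)
  :effect (and (at start (enabled))          ; dummy effect unlocking the domain
               (at end (window_open))))      ; the effect of event E itself

; Every original action gains the dummy proposition as a start condition.
; A hypothetical instrument-powering action is shown as one example.
(:durative-action switch_on
  :parameters (?i - instrument ?s - satellite)
  :duration (= ?duration 2)
  :condition (and (at start (enabled))       ; added precondition
                  (over all (on_board ?i ?s)))
  :effect (at end (power_on ?i)))
```

Since (enabled) is absent from the initial state, no instance of switch_on, or of any other modified action, can start before simulate_window_opening has started, so the window opens exactly 40 time units after the first useful activity becomes possible.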
A different technique can be used, however, to successfully schedule a sequence of events at specific times. Suppose the sequence of events, E1 · · · En is to occur at times t1 · · · tn . A sequence of durative actions, A1 · · · An is created, each with a new initial effect Pdummy , which is also added as a new precondition for all of the original actions in the domain. The final effects of each Ai are the intended effects of event Ei . In addition, all Ai for i < n are given the effect (at end (not Pdummy )) and each Ai is given the positive effect Qidummy . Finally, each Ai is given an initial 0 condition Qi−1 dummy , with Qdummy added as an initial state condition to any problem instance for the domain. Each action Ai deletes Qidummy as an initial effect. The duration of Ai is set to be ti − Σj=1···i−1 tj . Careful scrutiny will reveal that the action sequence, A1 · · · An , is locked together so that each achieves the condition necessary for execution of the next, while no other actions can be executed between successive pairs of actions in the sequence, since Pdummy , which is required for all other activity, is not asserted in those gaps. The durations of the actions in the sequence then force the timetabled events to occur at the scheduled times. This construction achieves the goal of modelling sequences of events, although at the price of a rather contrived domain structure. There remains an important weakness in this model: since the planner can allow gaps between successive pairs of

event-simulation actions, even though it cannot place other actions to execute within these gaps, it is free to exploit these gaps to “swallow” time spent on durative actions that span the gaps. For example, if durative action D should not occur concurrently with events $E_1$ and $E_3$, but its duration is greater than the gap between $t_1$ and $t_3$, the planner can circumvent the problem using the plan shown in figure 9. It might be imagined that this could be solved by adding $P_{dummy}$ as an invariant for all durative actions (such as D), but this would only partially solve the problem. Although it would prevent D from being planned to span an artificially created gap in the event-simulation action sequence, it would also prevent any action from being made concurrent with events, since the proposition $P_{dummy}$ is deleted at these points and reasserted by the next action in the event-simulation sequence: an invariant cannot be satisfied by this broken chain, even if the break is infinitesimally small.

Figure 9: The planner squeezes action D between events $E_1$ and $E_3$ by leaving a gap between the end of $A_2$ and the start of $A_3$, which “swallows” time spent in D. Propositions below the actions represent effects and above represent preconditions. Notice that the action sequence $A_1$, $A_2$, $A_3$ is forced by the effects and conditions on those actions, while nothing new can occur in the gap between the end of $A_2$ and the start of $A_3$ because of the absence of $P_{dummy}$.

There is a final trick that can be applied to resolve this problem. A further durative action is added, B, with duration $t_n$. The idea is to ensure that there can be no gaps between the event-simulating actions by forcing the entire sequence to fit between the start and end points of B. This is achieved by adding the invariant condition $R_{dummy}$ to each of the actions $A_i$ and as an initial effect of B, but deleting it as a final effect of B.
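The effect of the enclosing action B can be checked with a small Python sketch. It abstracts the PDDL2.1 semantics to two rules: each event-simulating action must wait for its predecessor to end, and, when B is used, the whole chain must lie inside B. The function names and the event times 10, 25 and 40 are assumptions made for illustration, and the small separations that PDDL2.1 requires between interacting time points are ignored.

```python
from typing import List, Optional, Sequence


def chain_durations(event_times: Sequence[float]) -> List[float]:
    """Durations that make a gap-free chain A_1..A_n end at each event time."""
    durations = []
    previous = 0.0
    for time in event_times:
        if time <= previous:
            raise ValueError("event times must be positive and increasing")
        durations.append(time - previous)
        previous = time
    return durations


def chain_is_valid(starts: Sequence[float], durations: Sequence[float],
                   envelope_start: Optional[float] = None) -> bool:
    """Check proposed start times for A_1..A_n.

    Chain rule: A_(i+1) may not start before A_i has ended.
    Envelope rule (only when envelope_start is given): every A_i must
    lie inside B, whose duration is the sum of the chain durations,
    which equals the final event time t_n.
    """
    ends = [s + d for s, d in zip(starts, durations)]
    for next_start, previous_end in zip(starts[1:], ends):
        if next_start < previous_end:
            return False
    if envelope_start is not None:
        envelope_end = envelope_start + sum(durations)
        if starts[0] < envelope_start or ends[-1] > envelope_end:
            return False
    return True


durations = chain_durations([10, 25, 40])   # events at t = 10, 25, 40
gapped = [0, 10, 45]                        # gap of 20 before A_3 starts
contiguous = [0, 10, 25]

print(chain_is_valid(gapped, durations))                        # True
print(chain_is_valid(gapped, durations, envelope_start=0))      # False
print(chain_is_valid(contiguous, durations, envelope_start=0))  # True
```

Without the envelope the gapped schedule passes, which is the loophole that lets a long action span the gap. With the envelope only the contiguous schedule survives, because the chain durations already add up to the duration of B.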
Finally, to prevent B from being reapplied (and thereby re-initiating the entire event sequence), we add a condition $B_{dummy}$ as an initial condition of B, deleted as an initial effect of B and added to the initial state of any problem instance for the domain. Although this construction is quite awkward, it allows us to properly model events triggered by the passage of time in PDDL2.1, even without the use of continuous effects.

To model events that are triggered by propositional formulae (assuming that there are no continuously changing quantities), the above approach can be adapted. The idea is that proposition truth values only change at times when actions occur (including start or end points of durative actions). Therefore, we can check to determine whether an event is triggered by monitoring the context in which actions have effects that might trigger events. This can be done using conditional effects on actions that might trigger an event. The conditional effects have to flag the event trigger by deleting a condition that is required for the planner to continue execution of any other activities (just like the dummy proposition, $P_{dummy}$, in the previous construction). The condition is keyed to the event that must be executed, thereby forcing the planner to select the action that represents the effects of the event, even though the effects might not contribute usefully to the construction of the plan it is seeking. This approach can be used to simulate the triggering of events that are not time-dependent, but it is not quite so easy to prevent the planner from introducing gaps into the timeline, by planning to execute the actions that simulate the triggered events after some delay from the triggering condition. The technique described above that works for timetabled events does not work in this case, because it is impossible to predict the period into which the events and triggers must be squeezed (the triggers might not even be made true if the planner does not exploit certain actions).

We have described techniques by which PDDL2.1 can be used to describe domains in such a way that actions are forced to be selected in any valid plan, simulating the execution of events. This approach works for timetabled sequences of events, such as downlink opportunities, observation opportunities, sunrise and sunset and other predictable phenomena that are unaffected by the actions of the executive. The approach is less effective when applied to events triggered by the actions of the executive, since it appears impossible to prevent the planner from being free to exploit temporal gaps between the triggering conditions becoming true and the actual execution of the event-simulating action. Attempting to include triggering conditions that depend on continuously changing values is even more difficult, since the trigger conditions can become true at time points at which no other activity is planned, so it is no longer possible to adapt the effects of actions to handle event triggers. Thus, the limits of PDDL2.1 certainly lie at the point of events triggered by continuously changing values, if not at the point of events triggered by discrete propositional changes.
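The flagging mechanism for propositionally triggered events can be mimicked with a small state-transition sketch in Python. States are sets of propositions. The names `go` and `pending`, and the satellite propositions used in the usage lines, are hypothetical. The trigger test is evaluated on the resulting state as a shorthand for what a conditional effect would encode against the state before the action.

```python
from typing import FrozenSet, Set

GO = "go"            # hypothetical guard required by every ordinary action
PENDING = "pending"  # hypothetical flag keyed to the simulated event


def apply_action(state: FrozenSet[str], adds: Set[str], deletes: Set[str],
                 trigger: Set[str]) -> FrozenSet[str]:
    """Apply an ordinary action whose conditional effect watches `trigger`."""
    if GO not in state:
        raise RuntimeError("blocked until the pending event is executed")
    result = (state - deletes) | adds
    # Conditional effect: if this action makes the trigger newly true,
    # withdraw the guard and flag the event as pending.
    if trigger <= result and not trigger <= state:
        result = (result - {GO}) | {PENDING}
    return frozenset(result)


def apply_event(state: FrozenSet[str], adds: Set[str]) -> FrozenSet[str]:
    """Apply the event-simulating action and restore the guard."""
    if PENDING not in state:
        raise RuntimeError("the event has not been triggered")
    return frozenset((state - {PENDING}) | adds | {GO})


state = frozenset({GO, "instrument-on"})
state = apply_action(state, {"pointing-at-sun"}, set(),
                     trigger={"instrument-on", "pointing-at-sun"})
print(GO in state, PENDING in state)     # False True
state = apply_event(state, {"instrument-damaged"})
print(GO in state, PENDING in state)     # True False
```

Nothing in this sketch fixes when `apply_event` is applied after the trigger, which mirrors the weakness described above: the planner is forced to execute the event-simulating action before anything else, but not to execute it immediately.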

Beyond PDDL2.1

An extension of PDDL2.1, PDDL+ (Fox & Long 2001a), has been developed that is capable of modelling events more powerfully, capturing events whose preconditions are triggered during a plan execution. Using this expressive power, it is possible to capture, in the domain model, families of event types, parameterised by preconditions that must be satisfied to allow an instance of an event type to be triggered. These conditions can then be included in problem instances, allowing the instances to determine different numbers of events according to need, without the requirement that the domain be modified for each problem instance. PDDL+ also supports the modelling of processes, which are activities with duration that cause continuous change in the values of numeric expressions. PDDL2.1 already allows the expression of time-dependent effects, where the effects can be expressed as a function of the time that has passed since an activity began. This allows one, for example, to express the angle through which a satellite has turned during its continuous slewing from one direction to another. This feature is required to allow more sophisticated exploitation of concurrency when actions need access to the values of numerically valued expressions changing under the control of an activity, as discussed above. PDDL+ allows these behaviours (such as the example of the recharging rover, discussed previously) to be modelled as processes that progress continuously and concurrently, without requiring the processes to be encapsulated within bounded durative actions. The additional sophistication and complexity in managing arbitrary processes and events is beyond current planning technology, but it is clearly a necessary extension if domain models are to accurately capture physical systems. Although it is possible to abstract processes and encapsulate them in the form of durative actions when they do not affect values that are used by any other possibly concurrent activities, this ceases to be the case as soon as continuously changing values interact with multiple activities.
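A short worked example shows why a continuously changing value is awkward for an encoding based on action end points. The Python sketch below assumes a linear recharging process and computes when the charge reaches a threshold. The charge level, rate, threshold and the list of action time points are invented for illustration.

```python
from typing import Optional, Sequence


def crossing_time(level: float, rate: float,
                  threshold: float) -> Optional[float]:
    """Time at which a value rising linearly from `level` reaches `threshold`."""
    if level >= threshold:
        return 0.0
    if rate <= 0:
        return None  # the value never reaches the threshold
    return (threshold - level) / rate


def coincides_with_happening(time: float,
                             happenings: Sequence[float]) -> bool:
    """True if `time` is a point at which some action starts or ends."""
    return any(abs(time - h) < 1e-9 for h in happenings)


full_at = crossing_time(level=20.0, rate=2.0, threshold=80.0)
print(full_at)                                                # 30.0
print(coincides_with_happening(full_at, [0.0, 25.0, 40.0]))   # False
```

The threshold is reached at time 30, between the planned time points 25 and 40. An event keyed to that threshold therefore has no action start or end to which a conditional effect could be attached.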

Challenges for Planning Technology

The most expressive elements of PDDL2.1 and the proposed extensions in PDDL+ have yet to be confronted by planning systems. Their expressive power has not been fully explored and, although it is clear that such things as windows of opportunity can be modelled in these languages, it is not clear whether the modelling approaches required are convenient or adequate. Other features that can be captured, but perhaps not entirely conveniently, include deadlines, goals that must be achieved within certain time windows (such as an observation that must be made during a certain interval) and events that do not fall into common families for a domain (one might wish to model a problem involving a satellite or an instrument with a finite life-span, for example, or the launch of a new satellite). PDDL+ provides modelling power capable of expressing these features, but there remains a question over the convenience of the representation. This question concerns both the convenience of use of the language in modelling domains and the subsequent use of the model in planning. Testing both of these properties will require empirical evaluations, perhaps through future competitions.

The introduction of plan metrics into PDDL2.1 is a very significant step for planning systems. The complexion of problems can be entirely changed by a simple switch of the plan metric, and a good solution under one metric can be a very poor one under an alternative metric. This extension allows the modelling of the common requirement appearing in the Satellite domain that the plan should seek to maximise the data acquired. However, the expression of more complex conditional requirements is harder. For example, if one wishes to express that a collection of observations is linked, so that the value of making one is very low unless all of the collection is made, or the requirement that a certain satellite is very expensive to use except for a certain subset of requests (made by military intelligence, say), then this cannot be expressed directly in the current syntax of valid plan metrics (although encoding tricks can be used to capture them through the introduction of artificial operators).
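Both points about metrics can be made concrete with a small Python sketch. The two plans, their figures and the function names are invented for illustration and are not taken from the competition domains.

```python
from typing import Dict, FrozenSet, Iterable

# Two hypothetical plans for the same problem, scored on two quantities.
plans: Dict[str, Dict[str, float]] = {
    "hasty":   {"makespan": 40.0, "fuel": 90.0},
    "thrifty": {"makespan": 75.0, "fuel": 30.0},
}


def best(metric: str) -> str:
    """Name of the plan that minimises the chosen metric."""
    return min(plans, key=lambda name: plans[name][metric])


def linked_value(observed: Iterable[str], collection: FrozenSet[str],
                 full_value: float, partial_value: float) -> float:
    """Value of a linked collection: high only if every member is observed."""
    made = collection & set(observed)
    if made == collection:
        return full_value
    return partial_value * len(made)


print(best("makespan"), best("fuel"))     # hasty thrifty

collection = frozenset({"obs-a", "obs-b", "obs-c"})
print(linked_value({"obs-a", "obs-b"}, collection, 100.0, 1.0))           # 2.0
print(linked_value({"obs-a", "obs-b", "obs-c"}, collection, 100.0, 1.0))  # 100.0
```

The value of the linked collection is not a sum of per-observation contributions, since it jumps once the last member is observed. That is the kind of conditional requirement the text describes as not directly expressible in the metric syntax.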

Conclusions

Planning systems have made significant moves forward in the past few years in the expressive power of the languages with which they can plan. Planners now exist that can handle durative actions, exploiting concurrency, that can manage numeric-valued expressions and interesting plan quality metrics. These steps forward have brought real-world planning problems much closer for generic fully-automated planning systems. These application areas include space domains, since space applications constrain the availability of human planners due to communication lags with remote systems. Space applications often offer a further bonus: as highly engineered systems working in comparatively sparsely populated environments, it is realistic to consider planning long sequences of detailed activities. Satellite observation scheduling is an area of application which exhibits these features.

In this paper we have discussed a collection of interesting variants of the Satellite domain, which is a close approximation to the real problem of satellite observation scheduling. This problem was used in the most recent planning competition and allowed current technology to show both its strengths and its weaknesses. A more complex version of the domain, capturing more of the features of the real domain, can be expressed using PDDL+, an extension of the PDDL2.1 language used in the 3rd IPC. This extension has a semantics, but continues to pose important challenges for the future of planning.

An important subtext to much of the work presented in this paper is the role of the planning competitions in shaping and directing research in the field. As organisers of the 3rd IPC, the authors clearly consider the role of the competition an important one in the development of the research field. Nevertheless, if the competition is to remain a source of impetus for further developments of planning systems towards application, its direction and the ambitions it creates for the community must be continually re-evaluated. Apart from an exploration in greater depth of some of the features that have already been introduced into PDDL2.1, particularly the use of plan quality metrics and the potential for more complex duration constraints on actions, the adoption of features from PDDL+ is a reasonable ambition. A completely unexplored area, as far as the competitions are concerned so far, is the role of uncertainty in planning problems. Finally, if planning is to play a more central role in some of the more ambitious space missions envisioned for the future (as well as many earthbound potential applications), there is a need to consider the problem of interaction between planning and execution. The challenges in both uncertainty and execution cover the entire spectrum from representation and modelling through planning to plan validation and evaluation, and they represent an exciting path forward for the research community.

References

Bacchus, F., and Kabanza, F. 2000. Using temporal logic to express search control knowledge for planning. Artificial Intelligence 116(1-2):123–191.
Do, M., and Kambhampati, S. 2001. Sapa: a domain-independent heuristic metric temporal planner. In Proc. ECP-01.
Edelkamp, S. 2002. Mixed propositional and numeric planning in the model checking integrated planning system. In Fox, M., and Coddington, A., eds., Planning for Temporal Domains: AIPS’02 Workshop.
Fox, M., and Long, D. 2001a. PDDL+: Planning with time and metric resources. Technical report, University of Durham, UK.
Fox, M., and Long, D. 2001b. PDDL2.1: An extension to PDDL for expressing temporal planning domains. Technical report, Available at: www.dur.ac.uk/d.p.long/competition.html.
Gerevini, A., and Serina, I. 2002. LPG: A planner based on local search for planning graphs. In Proc. of 6th International Conference on AI Planning Systems (AIPS’02). AAAI Press.
Haslum, P., and Geffner, H. 2001. Heuristic planning with time and resources. In Proc. of European Conf. on Planning, Toledo.
Hoffmann, J., and Nebel, B. 2000. The FF planning system: Fast plan generation through heuristic search. Journal of AI Research 14:253–302.
Koehler, J., and Schuster, K. 2000. Elevator control as a planning problem. In Proceedings of AIPS 2000, 331–338.
Kvarnström, J., and Doherty, P. 2000. TALplanner: A temporal logic based forward chaining planner. Annals of Mathematics and Artificial Intelligence 30(1-4):119–169.
Long, D., and Fox, M. 2001. Encoding temporal planning domains and validating temporal plans. In Proceedings of 20th Workshop of UK Planning and Scheduling SIG, Edinburgh.
Long, D. 2000. The AIPS’98 Planning Competition: Competitors’ perspective. AI Magazine 21(2).
McDermott, D. 2000. The 1998 AI planning systems competition. AI Magazine 21(2).
Muscettola, N. 1994. HSTS: Integrating planning and scheduling. In Zweben, M., and Fox, M., eds., Intelligent Scheduling. Morgan Kaufmann. 169–212.
Nau, D.; Cao, Y.; Lotem, A.; and Muñoz-Avila, H. 1999. SHOP: Simple hierarchical ordered planner. In Proc. of 16th International Joint Conference on AI, 968–975. Morgan Kaufmann.
Smith, D., and Weld, D. 1999. Temporal planning with mutual exclusion reasoning. In Proceedings of IJCAI-99, Stockholm.
Smith, D.; Frank, J.; and Jónsson, A. 2000. Bridging the gap between planning and scheduling. Knowledge Engineering Review 15(1).