Dynamic Nash Task Reassignment Strategies in Multi-Team Systems

Yong Liu1, Jose B. Cruz, Jr.2, and Marwan A. Simaan1

1 Department of Electrical Engineering, University of Pittsburgh, Pittsburgh, PA 15261, [email protected], (412) 624-8099

2 Department of Electrical Engineering, The Ohio State University, Columbus, OH 43210, [email protected], (614) 292-1588

ABSTRACT

Dynamic task assignment is a critical issue in the control of dynamic multi-team systems. An interesting example of dynamic task assignment is the deployment of fighting units on each side during a military operation. An initial deployment of resources by a top commander may not always result in the most desirable outcome. A reassignment of the units to different tasks during the course of operation may then become necessary in order to achieve the required objective. In this paper, the possibilities of changing the initial assignment of certain units, and the deployment of units after completing their initial assignment to other tasks are investigated. Two typical cases requiring specific reassignment strategies are considered. Because of the complexity in optimizing multi-team dynamic systems, we consider only the two-step look-ahead Nash strategy as the approach to determine the optimal reassignment. We show that the use of reassignment strategies can improve the performance of the overall system and thus provide the team leader with a more flexible and effective control mechanism to achieve the overall desired objective.

1. Introduction

Dynamic resource allocation problems in multi-team systems involve making decisions that result in an optimal deployment of resources among the teams so as to achieve the maximum overall performance of the system. A resource allocation mechanism that allows for reallocation is important in the control of complex dynamic systems, especially when the initial deployment of resources does not always yield the most effective results. A very interesting example of a complex dynamic enterprise is a military engagement between two opposing forces, each with a leader (a top commander), and several agents (the fighting units). A typical task for the attacking force may involve destroying a specific part of a fixed, or moving, target on the defending side. The fighting units on each side may actually be teamed up and allocated specific tasks to accomplish. In that case, a problem will arise if some of the teams are able to accomplish their tasks successfully and others are not. It is therefore natural for the leader to consider reassigning those teams that are still strong after successfully finishing their tasks to join the remaining teams. In some cases, even if every team is able to complete its task on its own, the associated costs and the overall system performance may vary drastically if those teams that accomplish their tasks first are reassigned to the remaining tasks rather than left inactive afterwards. The leader may therefore consider reassigning teams that have accomplished their tasks first to cooperate with the remaining teams in order to accelerate the accomplishment of the overall mission.

In this paper, the reassignment problem in multi-team dynamic systems, specifically as encountered by a commander in a military operation, is investigated based on the model developed in [1]. We make use of the two-step moving horizon game-theoretic Nash solution considered in [2] to investigate the reassignment problem.

2. An Attrition Model for a Military Operation

We consider a military operation in which there are two opposing forces referred to as Blue and Red, respectively. The objective of the Blue force is to attack some fixed targets that are defended by the Red force. A state-space dynamic model for such a military operation has been developed in [1]. The Blue force consists of Blue Weasels (BWs) and Blue Bombers (BBs). The weasels are escort fighter planes whose purpose is to attack and suppress the Red air defenses, and the bombers are planes whose purpose is to destroy Red Fixed Targets (FTs) such as bridges, refineries, or airports. The Red force consists of Red Troops (RTs), such as tanks and mobile vehicles, and Red air Defenses (RDs) such as SAMs. In each force, the individual elements are grouped into units, and the elements in each unit are referred to as platforms. Thus a unit of BBs with ten platforms is a group of ten Blue Bombers acting as a unified entity. Each platform in a unit is carrying a certain number of weapons. Instead of considering individual weapons, we will characterize each unit by the average number of weapons per platform that it possesses. Finally, to facilitate the development of the model, we will restrict the movement of the unit to be on a two-dimensional square grid, and assume that time is discretized into steps. Thus, at each step, the unit i of type $X \in \{BB, BW, RT, RD\}$ is

fully described by four variables: its $x_i^X(k)$ and $y_i^X(k)$ coordinates, the number of platforms $p_i^X(k)$ in it, and the average number of weapons per platform that it is carrying, $w_i^X(k)$.

The control variables for each unit are chosen from a finite set of choices. These are divided into three types: (a) the relocate command $r_i^X(k) = \begin{bmatrix} a_i^X(k) \\ b_i^X(k) \end{bmatrix}$, (b) the choice of target $d_i^X(k)$, and (c) the average salvo size $c_i^X(k)$ fired. There are several constraints that restrict the choice of controls for each unit, and they are explained in [1]. The state equations for the unit i of type X can be written as:

$$\begin{bmatrix} x_i^X(k+1) \\ y_i^X(k+1) \end{bmatrix} = \begin{bmatrix} x_i^X(k) \\ y_i^X(k) \end{bmatrix} + \begin{bmatrix} a_i^X(k) \\ b_i^X(k) \end{bmatrix} \tag{1}$$

$$p_i^X(k+1) = p_i^X(k)\, A_i^X(k) \tag{2}$$

$$w_i^X(k+1) = w_i^X(k) - c_i^X(k) \tag{3}$$

Equation (1) represents the movement on the x-y grid. Equations (2) and (3) are attrition models for the number of platforms and average weapons per platform that govern the behavior of these variables as the two forces engage in a battle. The term $A_i^X(k)$ in (2) represents the percentage of platforms of type X surviving the transition from stage k to stage k+1. Since only one-on-one engagement is allowed, once the identities of the attacking and the attacked units are determined from the choice of target controls, this percentage can be expressed as

$$A_i^{BB}(k) = 1 - \sum_{j=1}^{N_{RD}} Q_{ij}^{BBRD}(k)\, P_{ij}^{BBRD}(k)\, \delta\big(\xi_i^{BB}(k), \xi_j^{RD}(k)\big)\, \delta\big(BB_i, d_j^{RD}(k)\big) - \sum_{j=1}^{N_{RT}} Q_{ij}^{BBRT}(k)\, P_{ij}^{BBRT}(k)\, \delta\big(\xi_i^{BB}(k), \xi_j^{RT}(k)\big)\, \delta\big(BB_i, d_j^{RT}(k)\big) \tag{4}$$

Expression (4) is written for the case when X represents Blue Bombers. Similar expressions can be written for the other units [1]. The term $\delta(\cdot,\cdot)$ represents the Kronecker delta, and the terms $Q_{ij}^{XY}(k)$ and $P_{ij}^{XY}(k)$ represent the engagement and attrition factors between the attacking unit (j-th unit of Y) and the unit being attacked (i-th unit of X). These two factors are determined according to the following expressions:

$$Q_{ij}^{XY}(k) = \beta_{pij}^{XY} \left(1 - e^{-\mu_{pij}^{XY}\, \frac{p_j^Y(k)}{p_i^X(k)}}\right) \tag{5}$$

and

$$P_{ij}^{XY}(k) = 1 - \left(1 - \beta_w\, PK_{ij}^{XY}\right)^{s_j^Y(k)} \tag{6}$$

In expression (5), $p_i^X(k)$ and $p_j^Y(k)$ are the number of platforms in the i-th unit of X and j-th unit of Y respectively, and $\beta_{pij}^{XY}$ represents the probability that the j-th unit of Y acquires the i-th unit of X as a target. The term $\mu_{pij}^{XY}$ is a normalizing factor that uniformly scales the units of these platforms if they are of different types. In expression (6), $\beta_w$ is a weather dependent modification factor ($0 \le \beta_w \le 1$), $PK_{ij}^{XY}$ represents the probability of kill under ideal weather conditions for a single weapon (i.e. an effective salvo size of 1) for the type of weapon used by unit j against the type of platform in unit i, and $s_j^Y(k)$ represents the average effective salvo size of the weapons fired by the j-th unit of Y that reach the i-th unit of X at time k. Mathematically, $s_j^Y(k)$ is computed according to:

$$s_j^Y(k) = \frac{c_j^Y(k)\, p_j^Y(k)}{p_i^X(k)}\, E\!\left(\frac{p_j^Y(k)}{p_i^X(k)}\right) \tag{7}$$

where $E(\cdot)$ is a factor that models the inefficiencies of scale that may exist when two forces of unequal sizes are engaged in combat and modifies the average salvo size that reaches the target accordingly. This factor was first introduced by Helmbold [3] as a modification of Lanchester's equations, and was labeled as the effective firing modification factor. In essence, this factor takes into account the fact that the larger the size of the attacking force with respect to the force being attacked, the less effective their weapons will be. In other words, $E(\cdot)$ should be a decreasing function of its argument. In our model, we will use the following expression for $E(\cdot)$ as was suggested by Helmbold [3, 4]:

$$E\!\left(\frac{p_j^Y(k)}{p_i^X(k)}\right) = \left(\mu_{pij}^{XY}\, \frac{p_j^Y(k)}{p_i^X(k)}\right)^{\omega - 1} \tag{8}$$

where the factor $0 \le \omega \le 1$ is referred to as the Weiss parameter. A more detailed explanation of the parameters in expressions (5) and (6) can be found in [1]. After deriving the state equations (1) - (3) for all units, it is possible to write them in the more compact form shown below:

$$z(k+1) = f_k\big(z(k), u^B(k), u^R(k)\big) \tag{9}$$

where

$$z(k) = \begin{bmatrix} z^{BB}(k) & z^{BW}(k) & z^{RT}(k) & z^{RD}(k) & z^{FT}(k) \end{bmatrix}^T,$$

$$z^X(k) = \begin{bmatrix} z_1^X(k) & \cdots & z_{N_X}^X(k) \end{bmatrix}^T,$$

$$u^B(k) = \begin{bmatrix} u^{BB}(k) & u^{BW}(k) \end{bmatrix}^T, \quad u^R(k) = \begin{bmatrix} u^{RT}(k) & u^{RD}(k) \end{bmatrix}^T,$$

and

$$u_i^X(k) = \begin{bmatrix} a_i^X(k) & b_i^X(k) & c_i^X(k) & d_i^X(k) \end{bmatrix}^T.$$

For each of the two forces we define at every stage k an aggregate objective function that each force wishes to maximize. These functions are in the form:

$$J^B(k) = \sum_{i=1}^{N_{BB}} \alpha_{BBi}\, \hat{p}_i^{BB}(k) + \sum_{i=1}^{N_{BW}} \alpha_{BWi}\, \hat{p}_i^{BW}(k) - \sum_{i=1}^{N_{RT}} \alpha_{RTi}\, \hat{p}_i^{RT}(k) - \sum_{i=1}^{N_{RD}} \alpha_{RDi}\, \hat{p}_i^{RD}(k) - \sum_{i=1}^{N_{FT}} \alpha_{FTi}\, \hat{p}_i^{FT}(k) \tag{10a}$$

$$J^R(k) = -\sum_{i=1}^{N_{BB}} \beta_{BBi}\, \hat{p}_i^{BB}(k) - \sum_{i=1}^{N_{BW}} \beta_{BWi}\, \hat{p}_i^{BW}(k) + \sum_{i=1}^{N_{RT}} \beta_{RTi}\, \hat{p}_i^{RT}(k) + \sum_{i=1}^{N_{RD}} \beta_{RDi}\, \hat{p}_i^{RD}(k) + \sum_{i=1}^{N_{FT}} \beta_{FTi}\, \hat{p}_i^{FT}(k) \tag{10b}$$

where $\hat{p}_i^X(k)$ is a normalized number of platforms:

$$\hat{p}_i^X(k) = \frac{p_i^X(k)}{p_i^X(0)}, \quad k = 0, 1, 2, 3, \ldots, K \tag{11}$$

The expressions in (10) are linear combinations of normalized platforms and express the objective of each force to maximize its own platforms and minimize the platforms of the opposing force. The controls at each stage k are chosen so as to maximize the above objective functions at stage k+1.

3. Nash Reassignment Strategy

When there is only one fixed target on the Red side, the Blue commander will assign the entire Blue force to that task. When the number of targets is greater than one, the commander may partition the Blue force into teams and decide which team will be assigned to which target. Let us assume that there are m distinct fixed targets, each occupying a specific location on the grid and defended by specific units of the Red force. On the Blue side, let us assume that the Blue force is divided into n teams where $n \le m$. We make this assumption because, in general, it may not make sense to have more Blue teams than Red targets. Each team would include a certain number of BWs and BBs, and is assigned to a specific target at the beginning of the battle. If n = m, the commander will have as many teams as there are targets and could assign each team to a target. If n < m, the commander will have fewer teams than targets and therefore some targets will have to be attacked only after some of the teams that have accomplished their original tasks are reassigned to these targets. In either case, each team will have a pre-assigned task. In this paper, we will focus on the following two situations that require reassignment:

Situation 1: Some teams cannot complete their pre-assigned tasks on their own and need help from other teams.

Situation 2: Some teams can complete their pre-assigned tasks on their own, but with a heavy cost in time and losses.

In both of these situations, the commander may consider reassigning a team that has completed its task to one or more of these "weaker" teams. When a team is reassigned to another task, the commander needs to know which of the above two situations that task belongs to. It is clear that the control actions taken by a Blue team that has been reassigned should now be determined based on the objective function of the team that it has joined and should continue to take into account the actions of the Red force. In other words, the solution will continue to be game-theoretic in nature. In this paper, we will maintain the Nash strategy as the approach to obtain the optimal reassignment controls for any Blue team that has been reassigned. The objective functions used in the reassignment optimization are also given by the expressions in (10). For simplicity, we will assume that the fixed targets are located near each other so that the cost of any reassigned path from one target to another can be ignored in the objective functions. In order to reduce the computational complexity in determining the controls, instead of maximizing the objective functions $J^B$ over the entire time horizon (i.e. duration of the battle), we will consider the problem where the Blue and Red forces will seek control vectors $u_k^{B*}$ and $u_k^{R*}$ at time k that will maximize the objective function at time k+1 and k+2:

$$J_{k,k+2}^B = J^B(k+1) + J^B(k+2) \tag{12a}$$

$$J_{k,k+2}^R = J^R(k+1) + J^R(k+2) \tag{12b}$$

where the right hand side terms are given by (10). We refer to this as the two-step look-ahead approach. In the two-step look-ahead Nash solution, both sides look for sequences of two consecutive controls $\{u_k^{B*}, u_{k+1}^{B*}\}$ and $\{u_k^{R*}, u_{k+1}^{R*}\}$ that will provide a Nash equilibrium for the objective functions in (12). That is:

$$J_{k,k+2}^B\big(u_k^{B*}, u_{k+1}^{B*}, u_k^{R*}, u_{k+1}^{R*}\big) \ge J_{k,k+2}^B\big(u_k^{B}, u_{k+1}^{B}, u_k^{R*}, u_{k+1}^{R*}\big) \quad \forall\, \big(u_k^B, u_{k+1}^B\big) \in U_k^B \times U_{k+1}^B \tag{13a}$$

$$J_{k,k+2}^R\big(u_k^{B*}, u_{k+1}^{B*}, u_k^{R*}, u_{k+1}^{R*}\big) \ge J_{k,k+2}^R\big(u_k^{B*}, u_{k+1}^{B*}, u_k^{R}, u_{k+1}^{R}\big) \quad \forall\, \big(u_k^R, u_{k+1}^R\big) \in U_k^R \times U_{k+1}^R \tag{13b}$$

where $U_k^B$ and $U_k^R$ are sets of all available control choices at time k for Blue and Red respectively. After such sequences of control choices are determined, only the controls at time k are actually implemented. The controls at time k+1 are obtained by considering the same problem at the next step, i.e. for performance functions of $J_{k+1,k+3}^B$ and $J_{k+1,k+3}^R$, and so on. As such, this is a two-step moving horizon Nash solution.

Since the set of all possible choices for the controls is finite, the two-step look-ahead Nash solution can be found from the corresponding bimatrix game representations. Because the objective functions in (10) depend only on platform counts and not on unit positions, the units never change their locations as the result of the optimization process. We rectify this by assigning a corridor for the Blue force, which guides the Blue units to their pre-assigned targets. The two-step look-ahead approach includes some movement dynamics, but the units still have to be guided to the vicinity of their assigned engagement areas. Clearly, whenever reassignment is necessary, the two-step look-ahead strategy enables the Blue commander to make more effective decisions in the sense that the unnecessary losses of the reassigned teams can be reduced. We will explore these characteristics and the advantages of the reassignment strategies in the following examples.

4. Illustrative Scenario

Table 1: Initial deployment for the example

Unit  Type               Location  Platform  Weapon  Max. Salvo
BB1   F4 bombers         (5,5)     7         4       1
BB2   F4 bombers         (6,10)    7         4       1
BW1   F2-E fighters      (5,5)     8         4       1
BW2   F2-E fighters      (6,10)    6         3       1
RT1   Armored vehicles   (4,5)     50        3       0.5
RD1   Fixed SAM & Radar  (2,4)     6         15/6    5/6
RD2   Fixed SAM & Radar  (2,4)     7         18/7    6/7
RD3   Fixed SAM & Radar  (3,3)     6         15/6    5/6
RD4   Fixed SAM & Radar  (3,3)     7         18/7    6/7
FT1   Building           (2,4)     10        N/A     N/A
FT2   Bridge             (3,3)     10        N/A     N/A

Based on the above model, we consider a scenario that is taking place on a 2-dimensional 10 × 10 square grid. Each square on the grid corresponds to roughly 40 × 40 miles. The Blue force consists of two groups of Blue bombers, BB1 and BB2, and two groups of Blue weasels, BW1 and BW2. The Red force includes two adjacent fixed targets, FT1 and FT2 (e.g., a refinery and a bridge), defended by four groups of Red defense units (RD1, …, RD4) and one group of Red troops (RT1). Let us consider an initial assignment, as shown in Table 1, where Blue is divided into two teams. Team 1 includes BB1 and BW1 and is assigned FT2, and Team 2 includes BB2 and BW2 and is assigned FT1. The task of a Blue team is considered accomplished when its assigned fixed target loses at least 40% of its platforms. After a task is accomplished, the corresponding team will either be reassigned or will be returned to base (located in the upper right corner of the grid). The initial states are shown in Figure 1. To illustrate the results of the Nash Reassignment Strategies based on this scenario, we will discuss two examples, corresponding to the two different situations of reassignment mentioned in Section 3. In both examples, the simulations are performed in MATLAB using Nash type two-step look-ahead moving controls.
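The two-step look-ahead moving controls mentioned above can be illustrated with a small sketch. The code below is not the authors' MATLAB implementation; it is a minimal Python illustration, under stated assumptions, of how one moving-horizon decision could be computed by enumerating the bimatrix game of expressions (12)-(13). The function names, the toy dynamics (`toy_step`), and the toy objectives are hypothetical, the same control set is assumed at times k and k+1, only pure-strategy equilibria are searched, and taking the first equilibrium found is an arbitrary tie-breaking choice.

```python
from itertools import product


def pure_nash_equilibria(payoff_blue, payoff_red):
    """Return all (row, col) index pairs that are pure-strategy Nash equilibria.

    payoff_blue[r][c] and payoff_red[r][c] are the payoffs (both maximized)
    when Blue plays row r and Red plays column c.
    """
    n_rows, n_cols = len(payoff_blue), len(payoff_blue[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Blue cannot improve by changing its row while Red keeps column c.
        blue_best = all(payoff_blue[r][c] >= payoff_blue[r2][c] for r2 in range(n_rows))
        # Red cannot improve by changing its column while Blue keeps row r.
        red_best = all(payoff_red[r][c] >= payoff_red[r][c2] for c2 in range(n_cols))
        if blue_best and red_best:
            equilibria.append((r, c))
    return equilibria


def two_step_nash(state, blue_controls, red_controls, step, j_blue, j_red):
    """One moving-horizon decision: return the (Blue, Red) controls for time k.

    Builds the bimatrix game over all two-control sequences, scores each pair
    with J(k+1) + J(k+2) as in (12), and keeps only the first control of an
    equilibrium sequence. Returns None if no pure equilibrium exists.
    """
    blue_seqs = list(product(blue_controls, repeat=2))
    red_seqs = list(product(red_controls, repeat=2))
    pay_b = [[0.0] * len(red_seqs) for _ in blue_seqs]
    pay_r = [[0.0] * len(red_seqs) for _ in blue_seqs]
    for i, (b0, b1) in enumerate(blue_seqs):
        for j, (r0, r1) in enumerate(red_seqs):
            z1 = step(state, b0, r0)   # state at k+1
            z2 = step(z1, b1, r1)      # state at k+2
            pay_b[i][j] = j_blue(z1) + j_blue(z2)
            pay_r[i][j] = j_red(z1) + j_red(z2)
    equilibria = pure_nash_equilibria(pay_b, pay_r)
    if not equilibria:
        return None
    i, j = equilibria[0]
    return blue_seqs[i][0], red_seqs[j][0]


# Hypothetical toy model: state = (blue platforms, red platforms),
# control 1 = fire, control 0 = hold. The attrition rates are made up.
def toy_step(state, u_blue, u_red):
    blue, red = state
    return (blue * (1.0 - 0.2 * u_red), red * (1.0 - 0.3 * u_blue))


def toy_j_blue(state):
    return state[0] - state[1]


def toy_j_red(state):
    return state[1] - state[0]


if __name__ == "__main__":
    print(two_step_nash((10.0, 10.0), [0, 1], [0, 1], toy_step, toy_j_blue, toy_j_red))
```

In the toy model firing is a dominant choice for both sides, so the decision returned is for both forces to fire at time k. In a full simulation this decision would be recomputed at every step, which is what makes the horizon "moving".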

Figure 1: Initial states (panel title: Theatre of Operations; marker labels give the number of platforms; legend: Blue Bombers, Blue Weasels, Red Ground Troops, Red Air Defenses, Red Fixed Targets)

EXAMPLE 1: In this example, we consider probabilities of kill for each pair of units as given in Table 2, and weighting coefficients in the objective functions of both Blue and Red force as given in Table 3.

a) At first, the simulation is performed without the possibility of reassignment. The final outcome of this simulation is shown in Figure 2. We see that Team 1 returned to base after accomplishing its task, but Team 2 exhausted all its weapons and could not accomplish its task since more than 60% of FT1's platforms remain undamaged.

b) We then performed the same simulation except that the top commander now decides to reassign Team 1, after it accomplishes its task, to join Team 2. Figure 3 shows a snapshot of how this is accomplished. We see that in the first step, upon joining Team 2, BW1 is very effective in increasing Team 2's ability to weaken the defense units around FT1. In the next step, we see that BB1 now joins in the attack of FT1. This can be clearly seen in Figure 4. At the end, FT1 is damaged to 40% and the task of Team 2 has now been accomplished with help from Team 1.
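The effect of a second unit joining an engagement follows from the attrition factors in expressions (4)-(8). The sketch below is a hedged illustration rather than the simulation code used for the paper. The platform counts and maximum salvo sizes for BW1, BW2, and RD2 are taken from Table 1, while the probabilities of kill (0.8 and 0.7) and the parameter values `beta_p`, `mu`, `beta_w`, and `omega` are illustrative assumptions. The clamp at zero in `surviving_fraction` is also an added safeguard that does not appear in expression (4).

```python
import math


def engagement_factor(p_attacker, p_target, beta_p, mu):
    """Expression (5): Q = beta_p * (1 - exp(-mu * p_attacker / p_target))."""
    return beta_p * (1.0 - math.exp(-mu * p_attacker / p_target))


def effective_salvo(c_attacker, p_attacker, p_target, mu, omega):
    """Expressions (7)-(8): s = c * ratio * (mu * ratio) ** (omega - 1)."""
    ratio = p_attacker / p_target
    return c_attacker * ratio * (mu * ratio) ** (omega - 1.0)


def attrition_factor(pk, beta_w, salvo):
    """Expression (6): P = 1 - (1 - beta_w * PK) ** s."""
    return 1.0 - (1.0 - beta_w * pk) ** salvo


def surviving_fraction(p_target, attackers, beta_p=1.0, mu=1.0, beta_w=1.0, omega=0.5):
    """Survival fraction in the style of expression (4).

    attackers is a list of (platforms, salvo size, probability of kill) for
    every unit that is co-located with the target and has selected it, i.e.
    the terms for which both Kronecker deltas in (4) equal one.
    """
    loss = 0.0
    for p_att, c_att, pk in attackers:
        q = engagement_factor(p_att, p_target, beta_p, mu)
        s = effective_salvo(c_att, p_att, p_target, mu, omega)
        loss += q * attrition_factor(pk, beta_w, s)
    # Assumption: clamp at zero so the fraction stays meaningful if losses sum above one.
    return max(0.0, 1.0 - loss)


if __name__ == "__main__":
    rd2_platforms = 7        # Table 1
    bw2 = (6, 1.0, 0.7)      # platforms and salvo from Table 1; PK assumed
    bw1 = (8, 1.0, 0.8)      # platforms and salvo from Table 1; PK assumed
    alone = surviving_fraction(rd2_platforms, [bw2])
    joined = surviving_fraction(rd2_platforms, [bw2, bw1])
    print(f"RD2 surviving fraction, BW2 alone:   {alone:.3f}")
    print(f"RD2 surviving fraction, BW2 and BW1: {joined:.3f}")
```

Under these assumed values, the surviving fraction of RD2 in one step is markedly lower when BW1 joins BW2, which is the qualitative behavior described in part (b).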


c) Figure 5 compares the remaining number of platforms in the two simulations discussed above. It is clear that reassigning Team 1 to join Team 2, after it has finished its task against FT2, not only helps Team 2 complete its task against FT1 but also saves more platforms of BB2 and BW2 in Team 2, while BB1 suffers only slightly more damage than in the simulation without reassignment.

BBs BWs RT1 RD1 RD2 RD3 RD4 FT1 FT2

1

9

0

0.6

0.5

0.4

0.5

0.4

0.3

0.5

3

BW1

0

0

0

0.8

0.8

0.8

0.8

0

0

2

BW2

0

0

0

0.7

0.7

0.7

0.7

0

0

1

RT1

0.2

0.1

0

0

0

0

0

0

0

RD1 0.7

0.3

0

0

0

0

0

0

0

RD2 0.5

0.2

0

0

0

0

0

0

0

RD3 0.5

0.15

0

0

0

0

0

0

0

0

0

0

0

0

0

Table 2: Probabilities of kill for Example 1

Figure 3: Effect of BW1 joining Team 2 in Example 1 (panel title: Theatre of Operations)

Number of Platforms 11

Blue Bombers Blue Weasels Red Ground Troops Red Air Defenses Red Fixed Targets

10

0

9

FT2

0

0

FT1

10

5

BB2

0

RD3

Number of Platforms

6

0.6

FTs

RD1

6

0.4

0

BB2

10

7

0.5

0

9

8

0.6

0

8

7

0.4

0

7

8

0.5

0

6

Blue Bombers Blue Weasels Red Ground Troops Red Air Defenses Red Fixed Targets

10

0.6

0

5

11

0

0

4

Theatre of Operations

0

0.15

3

Figure 2: Final states without reassignment in Example 1

BB1

RD4 0.6

2

9

10 9

8

8

7

7

6

6

FT1 BW1

FT2

       BB1  BB2  BW1  BW2  RT1  RD1  RD2  RD3  RD4  FTs
Blue   0.8  0.4  0.2  0.1  0.1  0.3  0.2  0.3  0.3  1.0
Red    0.7  0.7  0.4  0.3  0.1  0.7  0.5  0.5  0.5  1.0

Table 3: Weighting coefficients in the objective functions for Example 1
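To make the role of these weights concrete, the sketch below evaluates the objective functions (10a) and (10b) with the Table 3 coefficients and the initial platform counts of Table 1. It is an illustrative Python sketch, not the authors' code. The dictionary and function names are hypothetical, and applying the single "FTs" weight of 1.0 to both FT1 and FT2 is an assumption.

```python
# Initial platform counts p_i(0), from Table 1.
INITIAL = {"BB1": 7, "BB2": 7, "BW1": 8, "BW2": 6, "RT1": 50,
           "RD1": 6, "RD2": 7, "RD3": 6, "RD4": 7, "FT1": 10, "FT2": 10}

BLUE_UNITS = ("BB1", "BB2", "BW1", "BW2")

# Blue weights (alpha) and Red weights (beta), from Table 3.
# Assumption: the "FTs" column applies to FT1 and FT2 alike.
ALPHA = {"BB1": 0.8, "BB2": 0.4, "BW1": 0.2, "BW2": 0.1, "RT1": 0.1,
         "RD1": 0.3, "RD2": 0.2, "RD3": 0.3, "RD4": 0.3, "FT1": 1.0, "FT2": 1.0}
BETA = {"BB1": 0.7, "BB2": 0.7, "BW1": 0.4, "BW2": 0.3, "RT1": 0.1,
        "RD1": 0.7, "RD2": 0.5, "RD3": 0.5, "RD4": 0.5, "FT1": 1.0, "FT2": 1.0}


def normalized(platforms):
    """Expression (11): platforms divided by their initial count."""
    return {unit: platforms[unit] / INITIAL[unit] for unit in INITIAL}


def j_blue(platforms):
    """Expression (10a): Blue units count positively, Red units negatively."""
    p_hat = normalized(platforms)
    return sum((1 if u in BLUE_UNITS else -1) * ALPHA[u] * p_hat[u] for u in p_hat)


def j_red(platforms):
    """Expression (10b): Red units count positively, Blue units negatively."""
    p_hat = normalized(platforms)
    return sum((-1 if u in BLUE_UNITS else 1) * BETA[u] * p_hat[u] for u in p_hat)


if __name__ == "__main__":
    print(round(j_blue(INITIAL), 3), round(j_red(INITIAL), 3))
    damaged = dict(INITIAL, FT1=5)   # FT1 has lost half of its platforms
    print(round(j_blue(damaged), 3), round(j_red(damaged), 3))
```

At the initial state all normalized counts equal one, so the objectives reduce to signed sums of the weights: -1.7 for Blue and 2.2 for Red. Halving FT1 raises the Blue objective by 0.5 and lowers the Red objective by 0.5, reflecting the weight of 1.0 that both sides place on the fixed targets.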

Figure 4: Effect of BB1 joining Team 2 in Example 1 (panel title: Theatre of Operations)

EXAMPLE 2: In this example, we improve the probability of kill of BW1 against RD1-RD3 to 0.8 instead of 0.5, 0.4, and 0.6, and keep the remaining probabilities as shown in Table 2. We also modify the values of weighting coefficients in the objective functions to the values shown in Table 4. The reason for doing this is to enhance Team 2's ability to accomplish its task without Team 1's help.

The comparison of no reassignment and reassignment is shown in Figure 6. When there is no reassignment strategy in the simulation, we indeed see that Team 2 can now finish its task without the help of Team 1. We note, however, that it takes seven time steps for Team 2 to accomplish this task, and this may not be considered satisfactory. The top commander then decides to reassign Team 1, after it finishes its task, to join Team 2. In Figure 7, we see that upon joining Team 2, BW1 is active first, and in the end FT1 is destroyed. It is interesting to note that, during the entire period when Team 1 is reassigned, the BB1 unit remains inactive since it appears that only BW1 is needed by Team 2 to accomplish its task. Also, only five steps are now required to accomplish Team 2's task, resulting in a saving of two time steps. We note that in Figure 6, as in the first example, the choice of reassignment also saves more platforms of BB2 and BW2 in Team 2 and destroys more units of RD1 and RD2.
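The reassignment rule used in both examples can be sketched in a few lines. This is a hypothetical Python illustration of the scenario's stated rules (a task is accomplished once the assigned fixed target has lost at least 40% of its platforms, after which the team is either reassigned or returned to base). The function names and the choice to send a free team to the first unfinished target are assumptions; with only two targets, as here, that choice is unambiguous.

```python
def task_accomplished(p_now, p_initial, threshold=0.4):
    """True once the target has lost at least `threshold` of its platforms."""
    return (p_initial - p_now) / p_initial >= threshold


def reassign(assignments, platforms, initial, threshold=0.4):
    """Return updated team-to-target assignments.

    assignments maps team name -> fixed target name; platforms and initial
    map fixed target name -> current and initial platform counts.
    """
    done = {t for t in initial
            if task_accomplished(platforms[t], initial[t], threshold)}
    open_targets = [t for t in initial if t not in done]
    updated = {}
    for team, target in assignments.items():
        if target not in done:
            updated[team] = target            # keep working on the current task
        elif open_targets:
            updated[team] = open_targets[0]   # join a team that still needs help
        else:
            updated[team] = "base"            # nothing left to do
    return updated


if __name__ == "__main__":
    initial = {"FT1": 10, "FT2": 10}
    assignments = {"Team 1": "FT2", "Team 2": "FT1"}
    print(reassign(assignments, {"FT1": 9, "FT2": 5}, initial))
    print(reassign(assignments, {"FT1": 5, "FT2": 5}, initial))
```

In the first call FT2 is finished while FT1 is not, so Team 1 is sent to FT1 alongside Team 2. In the second call both targets are finished and both teams return to base.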

Figure 7: Effect of BW1 joining Team 2 in Example 2 (panel title: Theatre of Operations)

Figure 5: Comparison of the remaining platforms in Example 1 (bar chart; horizontal axis: Type of Unit; vertical axis: # of platforms; series: No Reassignment, Reassignment)

Figure 6: Comparison of the remaining platforms in Example 2 (bar chart; horizontal axis: Type of Unit; vertical axis: # of platforms; series: No Reassignment, Reassignment)

       BB1  BB2  BW1  BW2  RT1  RD1  RD2  RD3  RD4  FT
Blue   0.8  0.6  0.2  0.1  0.1  0.4  0.4  0.4  0.4  1.0
Red    0.7  0.7  0.4  0.3  0.1  0.7  0.5  0.5  0.5  1.0

Table 4: Weighting coefficients in the objective functions for Example 2

5. Conclusion

In this paper, we discussed issues related to task assignment in multi-team control problems. When a team is not able to accomplish its task, or when it can accomplish it only in an inefficient manner, the system leader may decide to reassign another team to reinforce that team's ability. We presented two examples from a military engagement game-theoretic model in which we illustrated these concepts and showed that it is possible to improve the overall system's performance using reassignment strategies.

6. Acknowledgement

This research was sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), Air Force Materiel Command, USAF, under agreement number F30602-99-2-0549. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the DARPA, the AFRL, or the U.S. Government.

7. References

[1] J. B. Cruz, Jr., M. A. Simaan, A. Gacic, H. Jiang, B. Letellier, M. Li, and Y. Liu, "Game-Theoretic Modeling and Control of Military Operations," IEEE Transactions on Aerospace and Electronic Systems, Vol. 37, No. 4, 2001, pp. 1393-1405.

[2] J. B. Cruz, Jr., M. A. Simaan, A. Gacic, and Y. Liu, "Moving Horizon Game Theoretic Approaches for Control Strategies in a Military Operation," IEEE Transactions on Aerospace and Electronic Systems, (to appear), 2002.

[3] R. L. Helmbold, "A Modification of Lanchester's Equation," Operations Research, Vol. 13, 1965, pp. 857-859.

[4] J. S. Przemieniecki, Mathematical Methods in Defense Analysis, AIAA Education Series, 3rd Edition, 2000.