
Citation Reference: V. Podgorelec, P. Kokol, Genetic Algorithm Based System for Patient Scheduling in Highly Constrained Situations, Journal of Medical Systems, Plenum Press, vol. 21, num. 6, pp. 417-427, December 1997.

Genetic Algorithm Based System for Patient Scheduling in Highly Constrained Situations

Vili Podgorelec, Peter Kokol
University of Maribor, Faculty of Electrical Engineering and Computer Science
Smetanova 17, 2000 Maribor, Slovenia
[email protected], [email protected]

Abstract
In medicine and health care there are many situations in which patients have to be scheduled on different devices and/or with different physicians or therapists. Whether it concerns preventive examinations, laboratory tests or convalescent therapies, we are always looking for an optimal schedule that finishes all the scheduled activities as soon as possible, with the least patient waiting time and maximum device utilization. Since patient scheduling is a highly complex problem, it is practically impossible to produce a high-quality schedule by hand or even with exact methods. We therefore developed a powerful automated scheduling method for highly constrained situations, based on genetic algorithms and machine learning. In this paper we present the method together with the whole process of schedule generation, the important parameters that direct the evolution, and the way the algorithm is guaranteed to produce only feasible solutions that do not break any of the required constraints. We applied the method to the problem of scheduling patients with different therapy needs on a limited number of therapeutic devices, but the algorithm can easily be modified for use in similar situations. The results are quite encouraging, and since all the solutions are feasible, the method can easily be incorporated into an interactive user interface, which can be of major importance when the scheduling of patients, and of human resources in general, is considered.

1. Introduction
The problem of constructing an automated scheduling system is known to be very complex, especially in situations where human resources are involved. It is equally well known that scheduling by hand takes a lot of effort and associated administrative work, and in many cases it is simply impossible. The application of computers to scheduling problems therefore has a long and varied history. The first generation of computer timetabling, in the early 1960s, consisted of programs that produced school timetables with the aim of fitting classes and teachers to periods. Soon afterwards numerous heuristic approaches to timetabling were introduced, including simulated annealing, constraint logic programming, linear programming and graph coloring heuristics. It soon became clear, however, that exact methods are of little use for more complex problems because of their inefficiency. Many non-exact, or soft, methods have therefore come into use; they do not guarantee optimal solutions, but they obtain reasonably good solutions in a relatively short time. What kind of solution is good enough, and how long we are prepared to wait for it, depends on the given problem, but all the research has one objective in common: to find as good a solution as possible in as short a time as possible.

Regardless of the method used, some basic rules have to be fulfilled in order to construct an effective, high-quality automated scheduling system. First of all, we have to guarantee that all obtained solutions are feasible, that is, that they fulfill all the specific constraints of the given problem. Moreover, the system has to be efficient enough to find an adequate solution within a limited time in order to remain useful in practice. Besides these basic requirements there are further, non-obligatory properties that improve the quality of a scheduling system when they are fulfilled; in the case of patient scheduling they are even indispensable.
One such property is generality, or independence of the problem. It allows the system to be used for different kinds of scheduling problems, not only for one very specific situation. A second non-obligatory property is the possibility of user interaction while the solution is being developed; when the scheduling of patients is considered, this option definitely becomes necessary. Also very useful is the ability of the system to continue the search for a solution when one (or more) of the already scheduled activities is canceled and removed from the schedule, or when the execution of scheduled activities starts almost simultaneously with the scheduling.

Although some recent scheduling methods are able to provide an adequate solution in a reasonable time, very few of them offer even one of the non-obligatory properties mentioned above. Hybrid genetic algorithms have shown some promising results lately [2-4], yet they have a number of disadvantages. They have been too problem dependent, which is exactly the opposite of the commonly known properties of genetic algorithms, and they could easily evolve to a suboptimal solution. Moreover, instead of finding the schedule itself they searched for a set of rules for producing a good schedule, which prevents the user from interacting with the system.

To overcome these weaknesses we developed a scheduling algorithm based on genetic algorithms [1,6,7,9] and machine learning [5,8]. With the introduction of actors, resources and activities the method became problem independent. Because of the diversity kept in the population and the “best survive” principle, we avoid premature convergence to a suboptimal solution. In addition, machine learning counteracts the possible negative consequences of badly chosen parameter values. With an adequate internal representation of the individuals we guarantee that all intermediate solutions are feasible, which makes it possible for the user to interact with the system. It is also very important that the user can directly influence the direction of the evolution by properly weighting the parameters that determine the quality of the evolved solution. Only when a method fulfills all of the above can it be considered a possible solution to the patient scheduling problem.

We applied the described algorithm to a problem of scheduling patients with different physical therapy needs to a limited number of therapeutic devices and a limited number of therapists.
In this case, a solution is a schedule, which we can consider to be an assignment of patients to time intervals on specific therapeutic devices that are operated by physical therapists trained for those devices. The results obtained turned out to be very promising, and because of the algorithm’s effectiveness and low consumption of computing resources it can, in our opinion, also be used for very complex scheduling problems.

The basic information on genetic algorithms and machine learning, together with the choices made to meet the requirements of our situation, is given in Section 2. Section 3 describes the generation of schedules with our system. Section 4 presents the results obtained by scheduling a test problem, after which the paper ends with some conclusions in Section 5.

2. Genetic algorithms and machine learning
Genetic algorithms are adaptive heuristic search methods that may be used to solve all kinds of complex search and optimization problems. They are based on the evolutionary ideas of natural selection and the genetic processes of biological organisms. Just as natural populations evolve according to the principles of natural selection and “survival of the fittest”, first laid down by Charles Darwin, genetic algorithms simulate this process and are thereby able to evolve solutions to real-world problems, provided those problems have been suitably encoded. They are often capable of finding optimal solutions even in the most complex search spaces, or at least they offer significant benefits over other search and optimization techniques.

The variety and complexity of learning systems make it difficult to formulate a universally accepted definition of learning. However, the common denominator of most learning systems is their capability of making structural changes to themselves over time with the intent of becoming more efficient at performing given tasks. One of the most important means of understanding the strengths and limitations of a particular learning system is a clear definition of its knowledge structures, the possible structural changes, and the legal operators for selecting and making those changes. There are several different approaches to changing knowledge structures. The simplest one is to change the parameters that influence the system's behavior; we used this approach in our scheduling system.

Let us take a look at a simple learning system model (Figure 1). Such a system is performance oriented. There are some tasks that have to be performed (in our case, schedule generation), and learning consists of both knowledge acquisition and refinement.
The system is separated into two subsystems:
• a task component whose performance-oriented behavior is to be improved; in our case it is a genetic algorithm that generates schedules, and
• a learning component charged with making appropriate structural changes; in our case we try to find optimal values of the parameters that affect the execution of the genetic algorithm.

Figure 1. A performance-oriented learning system.

Since the execution of the genetic algorithm depends on only a few parameters (three in our case), it is very important not only to set appropriate initial values but also to decide how and when those values are modified during the evolution. We therefore represented knowledge as a set of simple rules indicating whether a parameter's value should be increased or decreased and when this action has to be taken (Figure 2). Knowledge is acquired and improved on the basis of solved problems, as we observe how the solution evolves depending on the parameters' values. As the solution evolves we randomly choose rules and reward or punish them (increase or decrease the probability that a rule is executed) according to their effect. By executing different problems we can decide which rules are useful in an actual situation.

IF (generationNum>150)AND (generationNum100)
THEN IncreaseMutationProb BY 0,001% WITH PROBABILITY 62,17% ;

Figure 2. An example of a rule. Numbers in italic indicate the values that are updated through machine learning.
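The rule mechanism can be illustrated with a small sketch. It is only one plausible reading of the description: the names `Rule`, `maybe_apply` and `reinforce` are hypothetical, the rule condition is reduced to a single generation threshold, and the reward step of 0.05 is an assumed value, since the paper does not state by how much a rule's probability is changed.

```python
import random
from dataclasses import dataclass


@dataclass
class Rule:
    min_generation: int   # condition: generation number must exceed this
    delta: float          # amount added to the mutation probability
    probability: float    # chance of firing when the condition holds


def maybe_apply(rule, generation, mutation_prob, rng):
    """Return (new mutation probability, whether the rule fired)."""
    if generation > rule.min_generation and rng.random() < rule.probability:
        return mutation_prob + rule.delta, True
    return mutation_prob, False


def reinforce(rule, improved, step=0.05):
    """Reward or punish a rule; the step size is an assumption."""
    change = step if improved else -step
    rule.probability = min(1.0, max(0.0, rule.probability + change))
```

A rule that fired shortly before the best fitness improved would be passed to `reinforce(rule, True)`, and one that preceded a deterioration to `reinforce(rule, False)`.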

3. Schedule generation
Every scheduling problem can be described with three categories: activities, actors, and resources. Suppose we have a number of actors, each of them having one or more activities to perform. Each activity needs some number of resources in order to be performed. Scheduling is then the process of assigning all activities to particular time slots and particular resources. In patient scheduling, patients are the actors, therapies are the activities, and therapeutic devices and therapists are two types of resources that are necessary for a therapy to be performed.

When we construct a schedule, there are some fundamental constraints that must not be broken if the solution is to be feasible (a solution is feasible if the problem can be executed according to the given schedule). These are:
• no actor (patient) can perform more than one activity at a time,
• every resource (therapeutic device or physical therapist) can be used by only one actor at a time (it can serve only one activity at a time),
• each activity (therapy) has to be performed in one continuous time interval,
• every activity can be performed only with a specific resource, and
• some activities have to be performed in an exact time order.
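The constraints listed above can be checked mechanically once a schedule has absolute times. The following is a minimal sketch under stated assumptions: the function names, the tuple layout `(actor, resource, start, end)` and the `must_precede` argument are hypothetical, and only one resource per activity is modeled. Continuity and the fixed resource are implicit in the layout, because each activity has exactly one interval and one resource.

```python
from collections import defaultdict


def overlaps(intervals):
    """True if any two (start, end) intervals overlap."""
    ordered = sorted(intervals)
    return any(prev_end > start
               for (_, prev_end), (start, _) in zip(ordered, ordered[1:]))


def is_feasible(slots, must_precede=()):
    """slots maps activity id -> (actor, resource, start, end).

    must_precede is a sequence of (first_id, second_id) pairs that
    encode required time orderings.
    """
    by_actor = defaultdict(list)
    by_resource = defaultdict(list)
    for actor, resource, start, end in slots.values():
        by_actor[actor].append((start, end))
        by_resource[resource].append((start, end))
    if any(overlaps(v) for v in by_actor.values()):
        return False   # an actor would perform two activities at once
    if any(overlaps(v) for v in by_resource.values()):
        return False   # a resource would serve two actors at once
    # every ordered pair must finish the first before starting the second
    return all(slots[a][3] <= slots[b][2] for a, b in must_precede)
```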

A schedule that satisfies these constraints is a feasible schedule. But just because a schedule is feasible does not mean that it is good enough to be used. Many other criteria influence the quality of a schedule. Any of these criteria can be included in the evaluation function, which helps us select the fittest individuals for further evolution, as we will see when the selection operator is discussed.

3.1. Internal representation of the individuals
The internal representation of the individuals within a population has to be defined in such a way that it represents a feasible schedule and at the same time leaves enough room to define the genetic operators. It has to guarantee that all the solutions will still be feasible after the selection and mutation process, and we should be able to select the crossover point and the extent of the mutation at random.

In our case, we represent the individuals with a multi-dimensional model. The first dimension is always time, whereas the other dimensions represent the required types of resources. In the case of patient scheduling the model is three-dimensional, with the last two dimensions representing therapeutic devices and therapists. The objects in the model are the therapies that have to be scheduled. Their positions within the model show which resources they need and the time order in which they are performed. How, then, do we guarantee that all the obtained solutions are feasible? When we construct new solutions, only the time order is defined, not the absolute time intervals, and the genetic algorithm operates on this structure. Then, in a second phase, the exact time intervals are determined for all the activities, as well as possible given the time order (see the example in Figure 3).

Figure 3. An example of the internal representation of the individuals with only one resource type. First the time order is defined (1st phase), then the absolute time intervals are calculated (2nd phase). Activities with the same color belong to the same actor and therefore cannot overlap.

The internal representation of an individual is in fact a multi-dimensional array whose elements are the sequence numbers of the activities. Indexes in the first dimension represent the time order, and the indexes in the other dimensions represent the allocation of resources.
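The two-phase idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Activity` and `decode` and the dictionary layout are hypothetical, only one resource type is modeled, and ordering constraints between therapies are ignored. Each activity starts as soon as both its resource and its actor are free, so any time order decodes into a schedule without overlaps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    actor: str
    duration: int  # minutes


def decode(order, activities):
    """Phase 2: turn a time order into absolute (start, end) intervals.

    order maps each resource to its list of activity ids in time order
    (phase 1); activities maps activity id -> Activity.
    """
    resource_free = {resource: 0 for resource in order}
    actor_free = {}
    schedule = {}
    longest = max((len(seq) for seq in order.values()), default=0)
    for pos in range(longest):                 # walk along the time axis
        for resource, seq in order.items():
            if pos >= len(seq):
                continue
            act_id = seq[pos]
            act = activities[act_id]
            # earliest moment at which both resource and actor are free
            start = max(resource_free[resource],
                        actor_free.get(act.actor, 0))
            end = start + act.duration
            schedule[act_id] = (start, end)
            resource_free[resource] = end
            actor_free[act.actor] = end
    return schedule
```

Because the genetic operators only ever touch `order`, they cannot produce overlapping intervals; feasibility with respect to actors and resources is restored by `decode` every time.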

3.2. Seeding of the initial population
One of the parameters that influence the evolution in a genetic algorithm is the size of the population. The size can change during the execution of the algorithm, or it can remain constant. We used the latter approach, so the number of individuals remains the same throughout. The actual size is determined from the knowledge gathered in the learning subsystem.

Before we can start the evolution, we have to seed the initial population of individuals. They are usually generated randomly, and it is very important to create the necessary diversity. In our case, individuals are created by filling the table with the therapies in a random time order.

3.3. Selection
For the selection scheme we used a modified exponential ranking selection method. After all individuals have been evaluated, they are sorted according to their fitness score. We then replace existing individuals, from the worst to the best, with new ones created by crossover from two selected individuals that still exist from the old population. When all the individuals have been replaced, the new population is complete (mutation still remains to be applied). In fact we never replace all the individuals in the population. In this way we guarantee that every new population contains a solution at least as good as the best one in the previous population. The number of preserved individuals is the second parameter that influences the evolution. Both its initial value and its change during the execution are controlled by the learning subsystem.

For effective selection we have to define an adequate evaluation function that determines the fitness score of each individual. For this purpose we implemented a method of negative points, which the evaluation function assigns to individuals on the basis of the values of the parameters that, for the given problem, determine the quality of the obtained solution. The fewer negative points an individual has, the better its fitness score and the greater its chance of being selected for crossover.

For each specific scheduling problem there are some parameters that determine the quality of the obtained solutions. Regardless of the given problem, there are some general quality criteria:
• overall duration of all activities,
• idle time of the resources,
• overall waiting time of all actors (the time an actor has to wait between two of its activities),
• average waiting time of the actors, and
• maximum waiting time of an actor.

For all these parameters, the lower the value the better. We can therefore use their values as the negative points; the evaluation function is then simply the sum of the parameters’ values. The parameters are not all equally important, however, so we introduce weights that increase or decrease their importance. The user can select these weights and in this way direct the search. For example, if the highest weight is given to the average waiting time, the algorithm will prioritize solutions in which the actors, on average, do not wait long.
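A weighted negative-points evaluation of this kind might look like the sketch below. The function names, the tuple layout and the criterion keys are hypothetical. Two definitions are assumptions, because the paper does not spell them out: a resource's idle time is taken as the overall duration minus its busy time, and an actor's waiting time as the gaps between its first and last activity.

```python
from collections import defaultdict


def criteria(slots):
    """slots is a list of (actor, resource, start, end) tuples."""
    duration = max(end for _, _, _, end in slots)
    busy = defaultdict(int)
    by_actor = defaultdict(list)
    for actor, resource, start, end in slots:
        busy[resource] += end - start
        by_actor[actor].append((start, end))
    waits = []
    for intervals in by_actor.values():
        intervals.sort()
        span = intervals[-1][1] - intervals[0][0]
        waits.append(span - sum(e - s for s, e in intervals))
    return {
        "duration": duration,
        "idle": sum(duration - b for b in busy.values()),
        "total_wait": sum(waits),
        "avg_wait": sum(waits) / len(waits),
        "max_wait": max(waits),
    }


def negative_points(slots, weights):
    """Weighted sum of the criteria; lower means a fitter individual."""
    values = criteria(slots)
    return sum(weight * values[name] for name, weight in weights.items())
```

With weights such as `{"duration": 1, "avg_wait": 2}`, schedules in which patients wait little between therapies are favored over schedules that merely keep devices busy.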

3.4. Crossover
Once two individuals have been selected from the current population, crossover constructs a new solution from them, which is placed into the growing new population (Figure 4). Both selected individuals are divided into several parts by cutting the multi-dimensional model of the internal representation along the timeline (the first dimension) across all of the resource types. The new individual is then constructed by randomly choosing parts from both parents and putting them together.

Figure 4. An example of a two-dimensional model. Shaded fields from both parents form the new individual.
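The paper does not describe how duplicated or missing activities are avoided when parts of two parents are combined, so the following sketch swaps in a simple order-preserving fill (in the spirit of order crossover) and uses a single cut per resource rather than several parts. The function name and the dictionary layout (resource -> activity ids in time order) are hypothetical.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Combine two time orders; both parents must hold the same
    activities on each resource."""
    child = {}
    for resource, seq_a in parent_a.items():
        cut = rng.randint(0, len(seq_a))       # cut point on the timeline
        head = seq_a[:cut]                     # early part from parent A
        taken = set(head)
        # remaining activities in the order they have in parent B
        tail = [act for act in parent_b[resource] if act not in taken]
        child[resource] = head + tail
    return child
```

Every activity appears exactly once on its own resource in the child, so the child can be decoded into a schedule in the same way as its parents.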

3.5. Mutation
After the new population has been filled with individuals, we still have to apply the mutation operator (Figure 5). For all the individuals except the preserved ones, the mutation operator exchanges, with some probability, two randomly chosen activities.

Figure 5. An example of mutation. Shaded fields are exchanged.

Mutation probability is the third and last parameter that influences the evolution, and it is again determined from the knowledge gathered by the learning subsystem. Running different tests showed that the appropriate mutation probability for the described method is considerably higher than usual. This is mainly a consequence of the fact that offspring inherit more information from their parents through crossover than is usual. Preserving some number of individuals unchanged also plays an important role here.
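A swap mutation on the time-order representation can be sketched as below. The restriction to two activities on the same resource is an assumption made here so that every activity stays on its prescribed resource; the paper only says that two randomly chosen activities are exchanged. The function name and data layout are hypothetical.

```python
import random


def swap_mutation(order, rng):
    """Return a copy of order with two activities exchanged on one resource."""
    mutated = {resource: list(seq) for resource, seq in order.items()}
    candidates = [r for r, seq in mutated.items() if len(seq) >= 2]
    if not candidates:
        return mutated                  # nothing can be exchanged
    resource = rng.choice(candidates)
    i, j = rng.sample(range(len(mutated[resource])), 2)
    seq = mutated[resource]
    seq[i], seq[j] = seq[j], seq[i]
    return mutated
```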

3.6. The process of schedule generation
For every given problem, we first generate some random test problems and execute them; in this way the initial knowledge needed for an adequate setting of the parameters is acquired. The actual problem can also be executed several times to refine the gathered knowledge. The evolution process starts with the seeding of the initial population by generating a number of schedules randomly, taking care only that all individuals are feasible. Each individual is then evaluated to obtain its fitness score. According to the fitness score, better individuals have a greater chance of being selected to produce new ones by crossover. The crossover phase is repeated until the new individuals fill the complete population (with the exception of some number of the best individuals, which are preserved for the next generation). As the population size is constant, every time a new individual is created, one old individual (the one with the lowest fitness score) is eliminated. When all the new individuals have been generated, the mutation operator is applied with some probability. Next, the parameters are modified on the basis of the gathered knowledge. All the phases, from the evaluation of each solution’s fitness score to the creation of a new population, are repeated until an acceptable solution has evolved.
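One generation of the loop described in this section can be sketched as follows. The sketch uses plain exponential ranking selection with an assumed base `c = 0.9`; the paper's modifications to that scheme are not specified, and the function names and callback signatures are hypothetical. The individuals, the penalty function and the operators are passed in, so the step itself is problem independent.

```python
import random


def next_generation(population, penalty, crossover, mutate,
                    n_preserved, mutation_prob, rng, c=0.9):
    """Build a new population of the same size as the old one."""
    ranked = sorted(population, key=penalty)    # fewest negative points first
    # exponential ranking: selection weight decays geometrically with rank
    weights = [c ** rank for rank in range(len(ranked))]
    elite = ranked[:n_preserved]                # preserved unchanged
    children = []
    while len(elite) + len(children) < len(population):
        mother, father = rng.choices(ranked, weights=weights, k=2)
        children.append(crossover(mother, father, rng))
    # mutation is applied only to the newly created individuals
    children = [mutate(child, rng) if rng.random() < mutation_prob else child
                for child in children]
    return elite + children
```

Because the preserved individuals are copied unchanged, the best penalty in the new population can never be worse than in the old one.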

4. Application of the algorithm and the results
We applied the described algorithm to a problem of scheduling patients with different physical therapy needs to a limited number of therapeutic devices and a limited number of therapists. In this case, a solution is a schedule, which we can consider to be an assignment of patients to time intervals on specific therapeutic devices that are operated by therapists trained for a specific device (or for several of them). Patients are the actors, therapies are the activities, and devices and therapists are two types of resources that are necessary for a therapy to be performed. In this case, the following constraints have to be fulfilled in order to obtain a feasible solution:
• every patient can perform only one therapy at a time,
• every therapeutic device can be used by only one patient at a time,
• every therapy has to be performed in one continuous time interval and cannot be broken into several parts,
• each therapy can be performed only on a specific prescribed therapeutic device, which can be operated only by trained physical therapists, and
• some therapies have to be performed in an exact time order.

The criteria for the quality of the obtained solutions are the following:
• overall duration of all therapies,
• time when devices are idle,
• overall waiting time of patients,
• maximum waiting time of a single patient,
• average waiting time of patients who have more than one therapy, and
• average waiting time only for those patients who are actually waiting.

From these parameters we construct an evaluation function that determines the fitness score of each individual solution. This application of the described algorithm is intended for small, specialized physical therapy studios. The patients in these studios are mostly children. It is therefore more important that all the children finish their therapies as soon as possible, without waiting long between them, than that the idle time of the devices is very low. By properly weighting the parameters in the evaluation function we were able to obtain an almost optimal solution.

Another very important feature in this kind of application is the handling of so-called late cancellations. This situation occurs when one (or more) of the activities is canceled after others have already begun. For example, a patient experiences problems after a therapy and therefore cancels some of the following ones. In this case we try to reschedule the remaining activities to reduce the overall duration. This is a great problem for almost all other scheduling methods, but our algorithm performed quite well because of the diversity kept in the population.

As an example, let us take a look at the results obtained with the described algorithm for a situation with 5 different therapeutic devices and 22 patients, each of them with more than 2 prescribed therapies on average. On the basis of the knowledge gathered by the learning subsystem we set the parameters as follows: the population size was 67 individuals, the mutation probability varied from 0.003 up to 2.7 %, and in each population between 5 and 6 individuals were preserved unchanged for the next generation. There were 5 physical therapists available, each of them for one therapeutic device. One special constraint was that all patients have to perform their therapies on device number 5 before their therapies on device number 1. We were looking for solutions with the lowest possible overall duration and short waiting times between therapies. One possible solution, evolved after only 165 generations, is shown in Figure 6.

Figure 6. Results obtained with the described algorithm. Upper part shows the schedule for all therapeutic devices (number in the box means patient number); lower part shows schedules for each of the patients.

The duration of all therapies in the obtained solution is 335 minutes, which is also the shortest possible duration (see device number 2). Devices 1, 3 and 4 are idle for 10, 5 and 30 minutes respectively, but that does not affect the overall duration. The maximum overall waiting time is 15 minutes, for patient number 21, who has 4 different therapies. The average waiting time for all patients with more than one therapy is 5 minutes. Patients 1, 5 and 17, who have to perform therapies on both devices 1 and 5, are scheduled correctly (the therapy on device 5 is always scheduled before the therapy on device 1 for the same patient).

5. Conclusion
In this paper we have described a new method for patient scheduling in highly constrained situations. By using a genetic algorithm together with machine learning we have provided an effective method that finds high-quality solutions. The introduction of actors, resources and activities allows all kinds of scheduling problems to be described and solved. The modification of the genetic algorithm avoids premature convergence to a suboptimal solution and enables successful rearrangement of existing schedules. An adequate internal representation of the individuals, together with the chosen genetic operators, guarantees that all solutions are feasible and thereby makes possible the very important user interaction with the system. The ease of controlling the direction of the evolution, by adjusting the importance of the parameters that affect the quality of the final solution, makes the method even more appropriate for incorporation into an interactive scheduling tool. Because of its effectiveness and low consumption of computing resources it could also be used for very complex problems.

References
[1] Thomas Baeck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, Inc., 1996.
[2] E. K. Burke, D. G. Elliman, R. F. Weare. A Genetic Algorithm for University Timetabling. AISB Workshop on Evolutionary Computing, Leeds, 1994.
[3] E. K. Burke, D. G. Elliman, R. F. Weare. Automated Scheduling of University Exams. Proceedings of IEEE Colloquium on Resource Scheduling for Large Scale Planning Systems, Digest No. 1993/144.
[4] E. K. Burke, D. G. Elliman, R. F. Weare. A Hybrid Genetic Algorithm for Highly Constrained Timetabling Problems. Proceedings of the 6th International Conference on Genetic Algorithms (ICGA’95, Pittsburgh, USA), pp. 605-610, Morgan Kaufmann, San Francisco, CA, USA.
[5] Kenneth De Jong. Learning with Genetic Algorithms: An Overview. In Bill P. Buckles, Frederick E. Petry (eds): Genetic Algorithms. IEEE Computer Society Press, Los Alamitos, CA, USA, 1994.
[6] Stephanie Forrest. Genetic Algorithms. ACM Computing Surveys, Vol. 28, No. 1, pp. 77-80, March 1996.
[7] David E. Goldberg. Genetic and Evolutionary Algorithms Come of Age. Communications of the ACM, Vol. 37, No. 3, pp. 113-119, March 1994.
[8] David E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, Reading MA, 1989.
[9] John H. Holland. Adaptation in Natural and Artificial Systems. MIT Press, Cambridge MA, 1975.