Scheduling Jobs For Three-Stage Hybrid Flow-Shop

King Saud University College of Engineering Department of Industrial Engineering

Scheduling Jobs For Three-Stage Hybrid Flow-Shop

By

Eng. Mohammed A. Al-Ohali

Supervised By

Dr. Ibrahim Al-Harkan

Submitted In Partial Fulfillment Of The Requirements For The Degree Of Master In Industrial Engineering In The College Of Engineering, Department Of Industrial Engineering, At King Saud University

Riyadh 1427 H December 2006


Scheduling Jobs For Three-Stage Hybrid Flow-Shop By Eng. Mohammed A. Al-Ohali

Approved by:

Committee Members

Supervisor:

____________________ Dr. Ibrahim M. Al-Harkan

Examiner:

____________________ Dr. Abdulrahman Al-AAhmari

Examiner:

____________________ Dr. Abdelghani Bouras


Acknowledgement

Above all, I thank Allah Almighty for His mercy. I would like to take a moment to express my deep appreciation and gratitude to my family for their invaluable support. Moreover, I would like to express my gratitude and acknowledgment to my thesis advisors, Prof. Ahmet Bolat and Dr. Ibrahim Al-Harkan, for their guidance and patience; to Dr. Abdulrahman Al-AAhmari for his generous support throughout my master's studies; and to Dr. Abdelghani Bouras for his constructive comments.


Abstract

Although production systems with parallel duplicate machines have been well studied in the literature, relatively few works consider serial identical machines. Meanwhile, this type of system is frequently used in practice, and scheduling algorithms are needed to utilize existing systems efficiently.

This work deals with scheduling jobs in a flow-shop system that consists of three stages: one station in the first stage, two identical stations in the middle stage, and one station in the last stage. There is no buffer between machines, and the objective is to minimize the maximum completion time (Cmax). The identical stations are used to smooth out the production flow when the processing times of the second stage are longer than those of the other stages. Recently, Branch and Bound, Genetic Algorithm (GA), and mathematical modeling (using a standard solver) approaches have been proposed for systems with two stages. In this work, we extend the system to include a third stage after the duplicate stations and formulate the problem using two different models: the first employs the existing dispatching policy, and the second uses a newly created mathematical model. The new model avoids deficiencies overlooked in previously published works and guarantees finding optimal solutions in all cases. An integer programming solver (Lingo) has been used to produce benchmark solutions from the mathematical models. Additionally, a GA is developed to obtain effective solutions for practical-size problems. A thorough experimental study is performed to analyze the performance of the solution approaches.


TABLE OF CONTENTS

Page No.

Abstract .... iv
Chapter I Introduction .... 1
  1.1 Flow-Shop Scheduling And Sequencing Problems: Definition And Classification .... 4
    1.1.1 Levels Of The Sequencing And Scheduling Problems .... 5
    1.1.2 Environments Of The Sequencing And Scheduling Problem .... 6
    1.1.3 Other Classification Schemes .... 7
  1.2 Characteristics And Importance Of Duplicate Serial Stations .... 9
Chapter II Previous Work .... 11
Chapter III Problem Statement And Formulation .... 17
  3.1 Introduction .... 17
  3.2 General Assumptions .... 19
  3.3 Specific Assumptions .... 19
  3.4 Problem Formulation .... 20
    3.4.1 Problem Statement .... 20
    3.4.2 Objectives .... 21
    3.4.3 Constraints .... 21
    3.4.4 Analyzing Existing Allocation Policy .... 22
    3.4.5 Formulating The Objective Function .... 30
Chapter IV Solution Procedure .... 38
  4.1 Employing Lingo .... 39
  4.2 Premium Solver Platform .... 39
  4.3 Genetic Algorithm .... 40
    4.3.1 Overview .... 42
    4.3.2 Implementation .... 48
      4.3.2.1 Encoding And Initial Population .... 48
      4.3.2.2 Fitness Function .... 49
      4.3.2.3 Parent Choice .... 50
      4.3.2.4 Crossover Operator .... 51
      4.3.2.5 Mutation Operator .... 53
      4.3.2.6 Selection Process .... 53
      4.3.2.7 Stopping Criterion .... 54
Chapter V Computational Experiments And Analysis .... 56
  5.1 Data Generation And Evaluation Methodology .... 57
  5.2 Experiment Over Small Size Problems .... 58
    5.2.1 Efficiency Of Different Optimal Models .... 58
    5.2.2 Evaluation Of Existing Allocation Policy .... 61
    5.2.3 Evaluating The Performance Of GA .... 63
  5.3 Experiments On Large Size Problems .... 67
Chapter VI Conclusion And Future Studies .... 74
References .... 76
Appendix A Terms And Definitions .... 78
Appendix B Lingo Code For INLM1 And ILM2 .... 82
Appendix C Fortran 90 Codes .... 85
Appendix D Details Of The Statistical Computations .... 104
Appendix E Paired Comparison: t-Test .... 113

List Of Figures

Figure 1.1 An automated transfer line with serial duplicate stations without buffers .... 2
Figure 1.2 An automated transfer line with parallel duplicate stations without buffers .... 2
Figure 3.1 Gantt chart for the example with worst and best makespan .... 25
Figure 3.2 A pseudo code for computing the Cmax of a given input sequence .... 29
Figure 3.3 Schedule using greedy policy .... 31
Figure 3.4 Schedule with positive effect of allowing idle time .... 32
Figure 4.1 A typical Genetic Algorithm .... 44
Figure 4.2 A flowchart of a Genetic Algorithm .... 45
Figure 5.1 Growth rate of time requirements of the two models .... 60
Figure 5.2 Makespan of GA vs. optimal makespan for problems of 10 jobs with (1,10), (1,50) and (1,100) processing time ranges .... 65
Figure 5.3 Ninety-nine percent confidence interval for the performance of a GA (with processing time range 1-10) .... 72
Figure 5.4 Ninety-nine percent confidence interval for the performance of a GA (with processing time range 1-50) .... 72
Figure 5.5 Ninety-nine percent confidence interval for the performance of a GA (with processing time range 1-100) .... 73

List Of Tables

Table 3.1 Processing times of job j at stage k for the 10-job example problem .... 23
Table 3.2 Detailed description of the makespan computation for the example of Figure 3.1a .... 28
Table 5.1 CPU time for INLM1 (optimal results for 10-job problems) .... 59
Table 5.2 CPU time for ILM1 (optimal results for 10-job problems) .... 59
Table 5.3 Efficiency (CPUT) of the two models using LINGO .... 60
Table 5.4 Comparing effectiveness of the existing allocation policy with the optimal solution .... 62
Table 5.5 Results obtained using the existing allocation policy for 10-job problems .... 62
Table 5.6 Results obtained by GA for problems of 10 jobs .... 63
Table 5.7 Comparing the GA with the optimum solutions (10 sets in each combination) .... 64
Table 5.8 CPU time requirements of LINGO and GA .... 66
Table 5.9 Makespan results obtained by solver and GA for problems of 20 jobs .... 68
Table 5.10 Performance of GA over various problems (10 sets in each combination) .... 69
Table 5.11 Performance of PSP over various problems (10 sets in each combination) .... 71

Chapter I Introduction

Nowadays, manufacturing settings are much more complicated than before, with multiple product lines requiring many different machines. This complexity passes on to the resource management process, including production scheduling. In a manufacturing environment, job scheduling techniques can mean the difference between significant profit and debilitating loss, and between a product that ships in time to hit a market window and a product that misses that window.

As early as the 1950s, scheduling and sequencing have been considered very important tools that significantly improve the productivity, resource utilization, and profitability of production lines. In fact, scheduling and sequencing jobs have a wide variety of applications, from designing the product flow and processing orders in a manufacturing facility to modeling queues in service industries. Flow-shop scheduling problems, with their continued challenge, remain a useful subject that is still being actively researched (Al-Harkan (1997)). Some production lines require a special, sophisticated configuration and layout to process operations at a high level of performance. For example, a workstation (machine) that has a long cycle time or a high failure rate may be duplicated in order to speed up the flow of production. A three-stage transfer line with two serial duplicate stations in the middle is shown in Figure 1.1. In some situations, duplicate stations are laid out in parallel and an additional material handling system (MHS) is installed to route the products


to the duplicate stations and to merge the products after the duplicate stations. Figure 1.2 shows this type of layout, which requires extra space for the MHS. Parallel duplicate stations are more efficient than serial duplicate stations because they can be operated independently; in other words, a job completed at one of them will not be blocked before entering the following station. However, the cost of extra space and conveyors with a split-and-merge mechanism may be so high that the serial configuration is chosen instead. Another problem with the parallel layout is the possibility of losing the product sequence and the need to re-sequence the products after the duplicate stations (Savsar (1998)).

[Figure: products on a transfer conveyor pass through M/C1, then the serial duplicate stations M/C2a and M/C2b, then M/C3]

Figure 1.1 An automated transfer line with serial duplicate stations without buffers.

[Figure: products on a transfer conveyor pass through M/C1, are routed to either of the parallel duplicate stations M/C2a and M/C2b, and merge into M/C3]

Figure 1.2 An automated transfer line with parallel duplicate stations without buffers.

In some cases, this duplication needs to be in series rather than in parallel due to material handling considerations and/or space limitations. Although a parallel layout doubles the capacity of a single station, it is more expensive and requires a special design and mechanism to insert the jobs back into their original positions when the sequence of jobs needs to be maintained. A very popular example of this situation can be found in the automobile industry. We find other examples in the local industry, where refrigerator bodies are painted in serial painting rooms at the Al Babteen Factory. In such cases, workstations are laid along a straight conveyor and the jobs (products) are moved in one direction. Bolat et al. (2005) gave another application, utilized at ARAMCO, Dammam: petrochemical lorries are prepared at a single station, then the filling operations, which have longer processing times, are performed at one of the identical, serially laid stations, and finally the lorries move to an inspection station. A similar model was considered by Hall and Daganzo (1983) for the tandem toll booths of the Golden Gate Bridge. Thus, there are several applications in practice where duplicate stations are utilized in series along a flow line. It is well known that finding an optimal solution for a complete production line scheduling problem is very complex; in fact, the vast majority of scheduling problems are NP-hard, as explained by Pinedo (1995). Nevertheless, optimizing the performance of a subsystem representing the bottleneck of a whole production line, for example serial duplicate stations, is expected to result in efficient management of the whole line.


In this thesis, we study the problem of scheduling n available jobs on three serial stages with two duplicate stations in the middle, as in the system shown in Figure 1.1. A number of approaches have been applied to the addressed problem. First, the policy mentioned in Bolat et al. (2005) has been adapted to the newly defined problem (3 stages and 4 stations). Next, a more general formulation was introduced, which allows the use of recent software tools. Additionally, this formulation relaxes a previously imposed scheduling policy, which may result in suboptimal solutions. The third approach is to develop a Genetic Algorithm (GA) to produce good solutions efficiently.

Due to the promising performance of GAs on similar problems, this work has focused on showing the practicality of applying GAs to the addressed problem. In order to evaluate the performance of the algorithms, an experimental design has been constructed to run the algorithms over randomly generated data, and statistical analyses are performed to evaluate their performance.

1.1 Flow-Shop Scheduling and Sequencing Problems: Definition and Classification

Sequencing and scheduling, which are important activities in production planning and control, can be defined as the determination of the time-sequencing of customer orders (jobs) and the allocation of the available production resources (personnel, machines, tools, etc.) to accomplish the related set of operations (Conway et al. (1967)). Scheduling problems involve finding the optimal schedule under various objectives, machine environments, and job characteristics. One of the most important machine environments is the flow-shop environment. In the general flow-shop model, there is a series of m machines and n jobs, where each job has exactly m tasks. The first task of every job is done on the first machine, the second task on the second machine, and so on. Every job goes through all m machines in a unidirectional order. However, the processing time each task spends on a machine varies depending on the job the task belongs to. The precedence constraint in this model requires that, for each job, task i-1 on machine i-1 must be completed before the ith task can begin on machine i (French (1982)). In the following, we describe a number of categorizations of the sequencing and scheduling problems.
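The precedence structure just described yields a simple recurrence for the makespan of a permutation flow shop: each task starts at the later of its machine's availability and its job's previous completion. The following sketch is written here for illustration (it is not taken from the thesis):

```python
def flowshop_makespan(p):
    """Makespan of a permutation flow shop.  p[j][k] is the processing
    time of the j-th job in the sequence on machine k.  A task starts
    once both its machine and the job's previous task are finished."""
    n, m = len(p), len(p[0])
    C = [[0] * m for _ in range(n)]          # C[j][k]: completion time
    for j in range(n):
        for k in range(m):
            prev_machine = C[j - 1][k] if j > 0 else 0   # machine k frees up
            prev_task = C[j][k - 1] if k > 0 else 0      # job j's earlier task
            C[j][k] = max(prev_machine, prev_task) + p[j][k]
    return C[-1][-1]
```

Note that this classical recurrence assumes unlimited buffers between machines; the no-buffer line studied in this thesis requires the blocking logic treated in Chapter III.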

1.1.1 Levels of the Sequencing and Scheduling Problem

Sequencing and scheduling are involved in planning and controlling the decision-making processes of manufacturing and service industries at several stages. According to several researchers (Al-Harkan and Shivnan (1988), Muchnik (1992), and Morton and Pentico (1993)), sequencing and scheduling exist at several levels of the decision-making process. According to Al-Harkan (1997), these levels are as follows:
1. Long-term planning, which has a horizon of 2 to 5 years. Some examples are plant layout, plant design, and plant expansion.
2. Middle-term planning, such as production smoothing and logistics, which is done over a period of 1 to 2 years.
3. Short-term planning, which is done every 3 to 6 months. Examples include requirements planning, shop bidding, and due date setting.
4. Predictive scheduling, which is performed over a range of 2 to 6 weeks. Job-shop routing, assembly line balancing, and process batch sizing qualify as predictive.
5. Reactive scheduling, or control, which is performed every day or every three days. A few examples are hot jobs, down machines, and late material.

1.1.2 Environments of the Sequencing and Scheduling Problem

Conway et al. (1967) classified sequencing and scheduling problems according to four descriptors of the surrounding environment: the jobs and operations to be processed; the number and types of machines that comprise the shop; the disciplines that restrict the manner in which assignments can be made; and the criteria by which a schedule will be evaluated. The sequencing and scheduling environments are as follows:
1. Single machine shop: one machine and n jobs to be processed.
2. Flow shop: there are m machines in series and jobs can be processed in one of the following ways:
   a) Permutational: jobs are processed by the series of m machines in exactly the same order, or
   b) Non-permutational: jobs are processed by the series of m machines, not necessarily in the same order.
3. Job shop: each job has its own flow pattern, and a subset of the jobs can visit each machine twice or more often. Multiple entries and exits are permitted.
4. Assembly job shop: a job shop with jobs that have at least two component items and at least one assembly operation.
5. Hybrid job shop: the precedence ordering of the operations of some jobs is the same.
6. Hybrid assembly job shop: it combines the features of both the assembly and hybrid job shops.
7. Open shop: there are m machines and there is no restriction on the routing of each job through the machines. In other words, there is no specified flow pattern for any job.
8. Closed shop: it is a job shop; however, all production orders are generated as a result of inventory replenishment decisions. In other words, production is not affected by customer orders.

1.1.3 Other Classification Schemes

Typically, a schedule is developed with respect to certain objectives or goals, such as meeting due dates, minimizing flowtime and work-in-process, minimizing makespan (the completion time of the last job to leave the system), minimizing idle time, or maximizing throughput and resource utilization. The problems arising in production scheduling are difficult in the technical sense. In general, flow-shop scheduling problems are known to be combinatorial and complex (Garey and Johnson (1979)). Actual production scheduling problems involve a large number of jobs and machines subject to various constraints and objectives (Lee et al. (1993)). It is then not surprising that exact solutions, or even exact formulations, are rather unmanageable.

Various flow-shop scheduling problems have been considered, studied, and evaluated for over fifty years. These problems may be categorized into stochastic and deterministic cases depending on the nature of the problem, while the solution methods form two distinct classes: exact methods and heuristic methods. Exact methods are guaranteed to find an optimal solution if it exists, and typically provide some indication if no solution can be found. Heuristic methods offer no such guarantee, although some can bound analytically the degree of optimality of their solutions. Stochastic approaches include probabilistic operations, so they may never operate the same way twice on a given problem; deterministic methods, whether exact or heuristic, operate the same way each time for a given problem. Many hybrid methods combine the characteristics of these classes (French (1982)). Some flow-shop scheduling problems admit a simple model for which an exact solution method can be developed. Given such a problem, the exact methods find the optimal solution (and are guaranteed to find it) every time they are run. However, as constraints are added, the difficulty of solving the problem increases, and simply finding a good solution (or, in some cases, a feasible solution) becomes good enough. In addition, many methods take too long when applied to problems of significant size, e.g., a large number of jobs to be scheduled. This is particularly true for enumerative methods, which are often applicable when an analytical procedure cannot be found. For example, the branch and bound method can find the optimal sequence if it is properly designed with the right bounding scheme. However, the CPU time needed to solve problems with a large number of jobs may become very high. Because of the excessive time needed to solve a scheduling problem with exact methods, many researchers focus exclusively on heuristic approaches in an effort to find near-optimal solutions in realistic computing times.

Scheduling problems may also be classified according to various schemes: static or dynamic, single-product or multi-product, single-processor or multi-processor facilities, etc. One has to remark that in real life no two scheduling problems are the same, and thus each specific problem has to be characterized and analyzed on its own in order to achieve the best results. This thesis is concerned with a static Flow Shop Problem (FSP). A problem is static if the environment necessary for the scheduling is known and fixed over the time horizon (French (1982)). The problem addressed in this work is also deterministic, since no stochastic element is present: all jobs and machines are available, with known processing times, prior to the scheduling process.

1.2 Characteristics and Importance of Duplicate Serial Stations

The main characteristic of the problem under consideration is the presence of duplicate machines (see Figure 1.1). It should be noted that this is a real-life scheduling problem that is usually encountered in automated production industries (Inman and Leon (1994)).

The duplicate stations problem also exists in areas outside manufacturing and industrial production systems, e.g., transportation engineering, where space limitations force additional toll booths to be added in series rather than in parallel. In the literature, Hall and Daganzo (1983) discussed the problem of tandem toll booths on the Golden Gate Bridge. They investigated the effect on capacity of adding a second toll booth in series. They did not, however, consider different operating policies for the tandem booths. They assumed that there were always at least two cars waiting to enter the toll booths, and that the first two cars in line entered the tandem toll booths as a pair. Inman and Leon (1994) investigated other policies and considered the possibility that there were not always two or more jobs waiting. Other related work on duplicate stations can be found in the literature on circuit board manufacturing. Here the duplicate stations may be physically laid out in series; however, they are operated in parallel (Askin et al. (1994) and Ng (1995)).

In this thesis it is assumed that the stations are laid out in series and operated in series (Figure 1.1), i.e., the release of a job from the first station (after its completion) to the second station is only possible if the second station is free. Similarly, the release of a job from the first station (after its completion) to the third station is only possible if both the second and the third stations are free. In addition, the release of a job from the second station to the third station is only possible if the third station is free. Moreover, the release of a completed job from the second station to the fourth station is only possible if both the third and fourth stations are free. Finally, the release of a completed job from the third station to the fourth station is only possible if the fourth station is free.
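Because the stations are laid out in series with no buffers and negligible transfer times, all five release rules above reduce to a single condition: every station strictly downstream of the job, up to and including its destination, must be free. A small predicate sketch (the list-based representation of the line state is our own assumption, not the thesis's notation):

```python
def release_allowed(src, dst, occupied):
    """True if a job at station `src` may move to station `dst`.
    occupied[s] is True when station s (1..4) holds a job.  The job
    must pass through every intermediate station on the serial line,
    so all stations from src+1 up to and including dst must be free."""
    return all(not occupied[s] for s in range(src + 1, dst + 1))
```

For example, the rule "release from the second station to the fourth is only possible if both the third and fourth stations are free" is exactly `release_allowed(2, 4, occupied)`.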


The following chapter reviews previous work and discusses the different methods and approaches that have been used to solve similar cases with duplicate stations.


Chapter II Previous Work

Many researchers have looked into the efficient operation of transfer lines. Most of them have dealt with the effect of buffer capacities and equipment reliability on line performance, such as Buzacott (1972), Ignall and Silver (1977), Elsayed and Turley (1980), Groover (1982), Gershwin and Schick (1983), Savsar and Biles (1984), and El-Tamimi and Savsar (1987). Meanwhile, several other investigators have developed models for calculating the output of a two-stage transfer line. Gershwin and Schick (1983) have developed an analytical model for three-stage transfer lines with machine failures. Commault and Dallery (1990) have proposed models to determine the production rate of transfer lines without buffer storage. Hillier and So (1991) have studied the effects of machine breakdowns and inter-stage storage on the performance of production line systems. They have developed simple heuristic rules to estimate the amount of storage space required to reduce the effects of machine breakdowns. Some other studies, including Burns and Daganzo (1985), Okamura and Yamashina (1979), and Bolat et al. (1994), have addressed the issue of assembly line scheduling without considering the possibility of duplicate stations on the line.

Later, Inman and Leon (1994) have drawn attention to the analysis of serial duplicate stations on automated production lines. They have conducted a real study in an automotive assembly plant where a pair of duplicate stations is laid out in series. Duplicate stations are in general used to smooth out production if some stations are slower than the others or subject to failures more often than the others. Due to space limitations in many plants, duplicate stations are installed in series rather than in parallel. Installing parallel duplicate stations requires excessive space, the additional cost of a material handling system, and possibly re-sequencing of the products if the initial sequence is to be maintained. It is clear that the performance of a transfer line with serial duplicate stations is significantly affected by the job scheduling policy employed. However, determining the best scheduling policy is a rather complex problem that cannot be solved analytically. For that reason, Inman and Leon (1994) have simulated a complete line that includes two duplicate stations. They have assumed that the sequence of arriving jobs is fixed, i.e., the jobs are released to the stations in the order they arrive, so that the only decision to be made is the allocation of the jobs to the stations. They have also assumed that the processing time is constant. Furthermore, they have tested four different policies for operating the serial duplicate stations. The first is the "Alternating" policy, under which jobs are alternately sent to the two duplicate stations. The second policy, "Tandem," releases jobs into the duplicate stations in tandem. In other words, the only time jobs are allowed to enter the pair of duplicate stations is when both stations are empty and there are at least two jobs waiting to enter; the first job in line moves to the second duplicate station and the second job in line moves to the first. These two policies can cause throughput inefficiencies, as stated by Inman and Leon (1994). The third analyzed policy is the "Greedy" assignment policy, which assigns arriving jobs immediately to the farthest accessible station. Following this policy may cause blocking of the downstream duplicate station by the upstream one. The fourth is the "Time-Left" policy, which considers the expected processing time left on jobs that are already in the duplicate stations. This policy attempts to remedy the Greedy policy's shortcomings. Inman and Leon (1994) concluded that the Time-Left policy is optimal for simple problems; however, it may not perform well on real problems.

Savsar (1998) has investigated the effect of different scheduling policies on transfer line performance. An independent section of a line with two duplicate stations is considered, and the four scheduling policies (Alternating, Tandem, Greedy, and Time-Left) are tested by simulation under five different case experiments. It was found that the Greedy policy performed best with respect to production rate in all cases except the one where the duplicate stations were the bottleneck and had constant processing times while the other stations had random processing times. The Tandem policy resulted in the lowest production rate in all cases studied. Additionally, the differences between the policies were significantly reduced when the line was balanced.

As far as exact analytical methods are concerned, Ng (1995) has studied the problem of determining the optimal number of duplicated process tanks. The objective was to maximize the throughput for a given tank configuration of a single-hoist circuit board production line. He formulated the problem as a mixed integer program and derived the properties of the optimal solution. This problem was solved optimally to minimize the cycle length with a branch and bound algorithm by Shapiro and Nuttle (1988). Meanwhile, Ng (1995) has developed an algorithm to determine the optimal number of duplicated stations that maximizes the productivity of the system.

Savsar and Allahverdi (1999) have addressed the problem of duplicate station scheduling with respect to three objective functions: minimizing mean flowtime, makespan, and station idle time. Their work was the first analytical attempt to solve the problem. They have shown that the well-known SPT rule produces optimal solutions in some cases. Unlike Inman and Leon (1994), they have assumed that all jobs were available at time zero to be scheduled; hence, two decisions needed to be made: one was how to allocate the jobs to the stations, and the other was how to sequence the jobs. The same two decisions need to be tackled in the case considered in this study.

Recently, meta-heuristics have become quite popular over other approximate methods for solving complex combinatorial optimization problems. Meta-heuristics have been highly successful in finding optimal or near-optimal solutions to many practical scheduling and sequencing problems. Some of the applications can be found in Holland (1975), Osman and Laporte (1995), Laporte and Osman (1995), and Reeves (1995). Traditional techniques for the FSP provide exact analytical solutions to highly specific and restricted problems, or approximate solutions to fairly general classes of problems. A review of these methods is given by Graves (1981). Modern approaches to the problem have involved techniques such as simulated annealing and tabu search, with improved results; Reeves (1995) has provided a review of these methods. Genetic Algorithms (GAs) were introduced by Holland (1975), and only lately has their potential for solving combinatorial optimization problems been explored. Mott (1991) has discussed how Genetic Algorithms can be used to derive suitable schedules for a serial flow shop. Bolat et al. (2005) have provided persuasive evidence of the power of the GA to generate high-quality solutions and have also shown that the GA compares favorably with modern approaches with respect to efficiency.

Bolat et al. (2005) have considered three serial stations with the last two duplicated. They have assumed that this line segment, the duplicate stations and the preceding one, is the bottleneck, and that optimizing its performance is expected to result in efficient management of the whole line. They have proposed a Branch and Bound (BB) algorithm to obtain benchmark solutions and a Genetic Algorithm (GA) to provide very good solutions efficiently for large problems. Although they have developed a lower bounding scheme to increase its efficiency, the BB requires prohibitively long running times for solving problems with 17-18 jobs.

Al-Ohali and Bolat (2004) have reformulated the problem with two stages, where the second stage has two serial duplicate stations. They have employed solvers/optimizers in a spreadsheet environment as optimal and heuristic procedures. They have found that even the heuristic utilization of this software always produces statistically significant improvements over the earlier proposed GA within comparable CPU times.


As an extension of the previous works, this work considers an additional stage after the one with the duplicated stations. In other words, it studies the problem of determining optimal or near-optimal schedules for n available jobs on four serial stations where the middle two stations are duplicated. The objective is to minimize the total completion time, i.e., the makespan. With this extension, this work considers a more general version of the problem and gets closer to real practice. The allocation policy of the jobs to the duplicate stations in this work is the same as the policy of Inman and Leon (1994), Savsar (1998), and Bolat et al. (2005): whenever a job is ready to be processed on either one of the duplicate stations, and both stations are available, release the job to the farthest downstream station. In other words, if both stations are empty and a job has just finished at the first station, immediately release it into the last duplicate station, i.e., the third station on the line. If the first duplicate station (the second station on the line) is empty but the second duplicate station (the third station on the line) is busy, and a job has finished at the first station, immediately release it into the first duplicate station. When a job finishes its second-stage processing at the first duplicate station, immediately release it into the second duplicate station, where it is kept idle while the fourth station, in the third stage, is busy. Since there is no buffer zone between stations, this seems to be the most sensible policy that can be employed in our case. In the next chapter, we present a detailed description of the problem considered in this work.
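Under this policy, together with the no-buffer release rules, the makespan of a fixed job sequence can be computed by a single forward pass over the jobs. The following sketch is our own reading of the policy described above (the station bookkeeping is an assumption, not the thesis's formulation); it tracks, for each station, the time at which it is next vacated:

```python
def greedy_makespan(p1, p2, p3):
    """Makespan of a fixed job sequence on the line S1 -> {S2, S3} -> S4,
    where S2 and S3 are serial duplicates, under the farthest-downstream
    allocation policy with no buffers.  p1[j], p2[j], p3[j] are job j's
    processing times in stages 1, 2 and 3; all jobs are ready at time 0."""
    free1 = free2 = free3 = free4 = 0  # time each station is next vacated
    for a, b, c in zip(p1, p2, p3):
        fin1 = free1 + a               # job enters S1 as soon as it is free
        t = max(fin1, free2)           # earliest moment the job can leave S1
        if free3 <= t:                 # both duplicates free: pass through S2 into S3
            free1 = t
            enter4 = max(t + b, free4)
            free3 = enter4             # S3 is vacated when the job enters S4
        else:                          # only S2 free: process there
            free1 = t
            move = max(t + b, free3)   # slide into S3 (idle) once it is vacated
            free2 = move
            enter4 = max(move, free4)
            free3 = enter4
        free4 = enter4 + c
    return free4
```

For instance, two jobs with stage times (1, 5, 1) each would finish at time 8 under this policy, whereas a line with a single stage-2 station would need 12; the duplicate absorbs the slow stage.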


Chapter III
Problem Statement and Formulation

3.1 Introduction

In this chapter, the problem of flow-shop scheduling is formulated for three stages and four stations, where the middle two stations are duplicated (identical). The targeted objectives as well as the underlying constraints are described in the following sections. Two different formulations for solving this problem will be presented.

In a general flow-shop scheduling problem, a given set of jobs has to be scheduled on a certain number of machines or stations. The objective considered is usually the minimization of a performance measure called makespan, i.e., minimizing the completion time of the last job in the sequence.

In this thesis, a system with an automated production line that contains three stages and four stations (or machines), with two duplicate stations in the middle, is considered. We assume that n jobs are available at time zero and that each job requires three different operations (or groups of operations) in three stages on four serial stations. The two stations in the second stage are identical; thus each job is processed on either the second or the third station. Specifically, we let Pi,1 be the processing time for job i at stage 1 (first station), Pi,2 be the processing time for job i at stage 2 (at station 2 or 3), and Pi,3 be the processing time for job i at stage 3 (fourth station). We assume that the mean


processing rate (throughput) of the stations of stage 2 is substantially lower than that of station 1 and of station 4 (in the third stage). Thus, adding a duplicate (station 3) helps raise the throughput rate of the whole system closer to that of stations 1 and 4; this is the reason for duplicating the second station in stage two. We also assume that there is no buffer zone between the serially laid stations and that transfer times between stations are negligible.

Since the subsystem of four stations is the bottleneck segment of the whole system, the jobs have to be kept in the original permutation sequence until the last job leaves station 4. In other words, jobs completed at station 2 must pass through station 3 (as if processed there with zero processing time), and jobs to be processed at station 3 after station 1 must pass through station 2 (again with zero processing time).

Depending on the sequence of jobs and their processing times, some of the jobs may be blocked after station 1 and/or stations 2 and 3. Obviously, placing the duplicate stations in parallel after the first station and sending each job to the first available one in stage 2 would make the scheduling problem easier. However, as mentioned earlier, material handling requirements and/or space limitations may necessitate the serial layout of the duplicate stations. Therefore, the jobs should be sequenced and scheduled in such a way that the number of blocked jobs is minimized and, thus, the utilization of the identical stations is maximized. In order to define the problem formally, we first list all assumptions related to the operation of the system considered here.


3.2 General Assumptions

While considering scheduling problems, a number of assumptions have to be made before proceeding towards the solution methodologies. The general assumptions about the system considered in this work are as follows:

• Each job is an entity, i.e., no two operations of the same job can be processed simultaneously.

• No pre-emption: a job already started on a station must be completed before another job can start on that station; a newly arriving job cannot pre-empt the job already in process and must wait until the previous job finishes.

• Each job is processed once and only once on any station.

• The processing times of the jobs are independent of the schedule.

• No station can process more than one job simultaneously.

• Stations may be idle while waiting for the next job to be released from a previous machine in the production line.

• Stations never break down and are available throughout the scheduling period.

• All jobs are available at time zero. This implies that as soon as a job is released from station S1, the next job in the sequence can immediately be processed on S1.

• The system has no buffers between stations (zero-buffer system).


3.3 Specific Assumptions

We now present the assumptions specific to the problem considered here.

• The average production rate of the first and fourth machines is higher than that of either duplicate station; this is the reason for duplicating the second station in stage two.

• Due to space restrictions and material handling problems, a parallel layout of the middle stations is not possible.

• The transfer times between stations are negligible.

• The original sequence of jobs has to be maintained until the last job passes through the system.

3.4 Problem Formulation

The problem and all related assumptions having been described above, the problem can now be stated formally. Section 3.4.1 gives a clear problem statement. Section 3.4.2 states the objectives of this study. Section 3.4.3 lists all constraints applied to the problem. Section 3.4.4 provides analysis and formulation of the problem according to the existing allocation policy. Section 3.4.5 presents the newly developed general formulation for the objective function introduced in this thesis.

3.4.1 Problem statement

Formally, the problem considered in this thesis can be stated as follows: given a set of n jobs {J1, J2, …, Jn} together with their processing times at the three stages, denoted by Pi,1, Pi,2, and Pi,3 respectively. Stage one involves station S1; stage two involves the two duplicate stations S2 and S3; stage three involves station S4. The completion time of job i at station k is represented by Ci,k, and the variable C[i],k represents the completion time of the job in the ith position at the kth station. It is required to find a sequence of the given jobs such that the makespan Cmax (the completion time of the last job in the sequence, i.e., C[n],4) is minimized subject to certain constraints.

3.4.2 Objectives

The major objective considered is the minimization of the makespan. By minimizing this objective, the utilization of the stations is maximized and the total idle time of the stations is minimized (French 1982).

3.4.3 Constraints

There is no buffer zone between stations. Thus, the release of a job from the first station S1 (after its completion) to the second station S2 is possible only if S2 is free. Similarly, the release of a job from S1 directly to the third station S3 is possible only if both S2 and S3 are free. The release of a completed job from S2 is possible only if S3 is free. Finally, the release of a job from S3 to the third stage (station S4) is possible only if no job is being processed on S4. In summary, the departure time of a job from a station will differ from its completion time if the following station is not free at that time.


Another constraint relates to the fact that all stations are in series. Thus, every job has to pass through the third station (even if it was processed on the second station) and through the fourth station.

The job allocation policy between the duplicate stations plays a very important role in utilizing the system. Several works in the literature successfully apply, in practical applications, the greedy policy described in section 3.4.4. As defined below, this policy tries to utilize the stations in a greedy manner, i.e., it pushes jobs to the farthest available station without waiting for other opportunities.

3.4.4 Analyzing Existing Allocation Policy

The policy used to allocate jobs to the duplicate stations in this section is the same as the policy of Inman and Leon (1994), Savsar (1998), Al-Ohali and Bolat (2004), and Bolat et al. (2005). This policy is explained as follows. Whenever a job is ready to be processed on either of the duplicate stations, and both stations are available, release the job to the farthest downstream station. In other words, if both stations are idle and a job has just finished at the first station, release it to the second duplicate station (the third station on the line). If the first duplicate station (the second on the line) is idle but the second duplicate station (the third on the line) is busy when a job finishes at the first station, release the job to the first duplicate station immediately and start processing it there. When a job has just finished its second-stage processing at the first duplicate station and the second duplicate station is free, immediately release the job to the second duplicate station and keep it waiting there while the fourth station (third stage) is busy. If the second duplicate station is not free, the job waits at the first duplicate station. Since there is no buffer zone between stations, this appears to be the most sensible policy that can be employed in this case.

The above constraints impose a number of restrictions that make this problem very hard. The process of designing an objective function that can compute the value of the makespan requires a careful analysis of the system. In addition, devising good solution methodologies to obtain acceptable solutions becomes hard.

In this section, we design an algorithm for computing the value of the makespan Cmax for a given input job sequence. The accuracy of the objective function is vital to the performance of the solution approaches. The objective function derived in this section will be used by the presented approaches to evaluate a generated job sequence.

The makespan Cmax refers to the completion time of the last job in the sequence; therefore our ultimate objective in this section is to obtain the value of the completion time on station S4. In addition, D[i],k is defined as the departure time of the job in the ith position from the kth station. Notice that the departure time from a station may not equal the completion time of the job if the following station is not free at that time (because of the zero-buffer assumption).

The following example is given to expose the details of the problem under consideration, for ten jobs with the processing times given below:

Table 3.1: Processing times Pj,k of job j at stage k for the 10-job example problem.

k\j   1   2   3   4   5   6   7   8   9  10
 1    1   2   2   1   1   2   1   1   2   3
 2    7   3   5   1   1   7   3   2   9   7
 3    5   5   2   4   4   3   5   2   3   2

Figure 3.1(a) presents the implementation of the policy described above when the jobs are sequenced in the order 5,4,1,3,2,10,8,7,6,9; that is, job 5 is placed in the first position, job 4 in the second, and so on.


As can be seen from Figure 3.1(a), at time zero the completion and departure times for the job in position 1 at station S1 are easily defined because both stations S2 and S3 are idle:

C[1],1 = P[1],1    (1)
D[1],1 = C[1],1    (2)

Due to the adopted allocation policy, the job in the first position is transferred directly to the farthest free station, in this case S3:

C[1],2 = D[1],1    (3)
D[1],2 = C[1],2 = P[1],1    (4)
C[1],3 = D[1],2 + P[1],2    (5)
D[1],3 = C[1],3    (6)

Similarly, the completion and departure times for the job in position 1 at the third stage (station S4) are easily defined because station S4 is idle:

C[1],4 = D[1],3 + P[1],3    (7)
D[1],4 = C[1],4    (8)

[Figure 3.1: Gantt charts under the greedy allocation policy for (a) the sequence 5,4,1,3,2,10,8,7,6,9 and (b) the sequence 10,5,3,6,8,9,1,7,4,2.]

From then on, we must check whether stations S3 and S4 are free at the completion of a job at stations S1 and S2/S3. The allocation policy dictates that job [i] starts on station S1 as soon as job [i-1] is transferred to station S2. Thus:

C[i],1 = D[i-1],1 + P[i],1    (9)

In general, job [i] can be moved after its completion from station S1 as long as job [i-1] has been moved from station S2. Therefore, the departure time of job [i] from station S1 can be calculated as follows:

D[i],1 = max{C[i],1, D[i-1],2}    (10)

Recall that if station S3 is available at D[i],1, then job [i] is moved directly to station S3 to be processed, given that S2 is free. Therefore, the completion and departure times of job [i] at station S2 can be calculated as follows:

C[i],2 = D[i],1 + P[i],2    if D[i-1],3 > D[i],1    (11)
C[i],2 = D[i],1             otherwise               (12)

D[i],2 = max{C[i],2, D[i-1],3}    (13)

On the other hand, the completion and departure times of job [i] at station S3 depend on whether job [i] was already processed at station S2 or not. Specifically,

C[i],3 = D[i],2             if D[i-1],3 > D[i],1    (14)
C[i],3 = D[i],2 + P[i],2    otherwise               (15)

Thus:

D[i],3 = max{C[i],3, D[i-1],4}    (16)

Similarly, for the fourth station:

C[i],4 = D[i],3 + P[i],3    (17)

Finally:

Cmax = C[n],4 = D[n],4    (18)

Notice that the initial conditions are D[0],1 = D[0],2 = D[0],3 = D[0],4 = 0. According to the mentioned policy, the minimum makespan for this example is 40 units of time and is obtained when the jobs are sequenced in the order 5,4,1,3,2,10,8,7,6,9 (Figure 3.1(a)). One can easily verify that the sequence 10,5,3,6,8,9,1,7,4,2 results in Cmax = 59 units of time, which turns out to be the longest makespan (Figure 3.1(b)). The following table presents all the detailed calculations.

Table 3.2: Detailed computation of the makespan for the example of Figure 3.1(a).

[i]      1   2   3   4   5   6   7   8   9  10
i        5   4   1   3   2  10   8   7   6   9
P[i],1   1   1   1   2   2   3   1   1   2   2
P[i],2   1   1   7   5   3   7   2   3   7   9
P[i],3   4   4   5   2   5   2   2   5   3   3
C[i],1   1   2   3   5  12  15  16  23  25  28
D[i],1   1   2   3  10  12  15  22  23  26  28
C[i],2   1   2  10  10  15  22  22  26  26  37
D[i],2   1   2  10  10  15  22  22  26  26  37
C[i],3   2   3  10  15  15  22  24  26  33  37
D[i],3   2   6  10  15  17  22  24  26  33  37
C[i],4   6  10  15  17  22  24  26  31  36  40
D[i],4   6  10  15  17  22  24  26  31  36  40

To construct the general algorithm for computing the objective function value, i.e., the makespan, we present the following discussion. There is a set of N jobs {J1, J2, …, JN} to be scheduled first on station S1, then on S2 or S3, and finally on S4. Further, it is assumed that an input sequence of the jobs is given. The subscripts used refer to the position of a job in the input sequence, not to the job index itself: P[i],1, P[i],2, and P[i],3 denote the processing times of the job in the ith position of the input sequence at the first stage (station 1), the second stage (station 2 or 3), and the third stage (station 4), respectively.

The process of determining Cmax is recursive. It starts by determining the completion time C[1],4 of the job in the first position; based on this value, C[2],4 is determined, and so forth. From this recursion, a generalized algorithm to determine Cmax for any input job sequence can be constructed.


The pseudo code of the general algorithm to compute the makespan is given in Figure 3.2 below. Although this allocation policy is widely perceived as optimal by many researchers, it will be shown in the next subsection that it is suboptimal.


Algorithm_C_MAX

1.  C1 = P[1],1
2.  D1 = C1
3.  C2 = D1
4.  D2 = C2
5.  C3 = D2 + P[1],2
6.  D3 = C3
7.  C4 = D3 + P[1],3
8.  D4 = C4
9.  DO i = 2, NUM_JOBS
10.     C1 = D1 + P[i],1
11.     D1 = MAX(D2, C1)
12.     IF (D3 > D1) THEN C2 = D1 + P[i],2 ELSE C2 = D1 END IF
13.     D2 = MAX(D3, C2)
14.     IF (D3 > D1) THEN C3 = D2 ELSE C3 = D2 + P[i],2 END IF
15.     D3 = MAX(D4, C3)
16.     C4 = MAX(D4, D3) + P[i],3
17.     D4 = C4
    END DO
18. Cmax = D4

END Algorithm_C_MAX

Figure 3.2: A pseudo code for computing the Cmax of a given input sequence.
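The pseudo code above can be transcribed almost line for line into executable form. The sketch below (variable names are our own) reproduces the recursion of Equations (1)-(18); on the example of Table 3.1 it returns 40 for the best sequence and 59 for the worst one, matching the values reported in this section.

```python
def makespan(seq, P):
    """Cmax of a job sequence under the greedy allocation policy.

    seq: permutation of job indices; P[j] = (P_j1, P_j2, P_j3).
    C1..D4 hold completion/departure times of the current job at the
    four serial stations, exactly as in Algorithm_C_MAX.
    """
    p1, p2, p3 = P[seq[0]]
    C1 = p1; D1 = C1          # station S1
    C2 = D1; D2 = C2          # first job passes S2 with zero processing ...
    C3 = D2 + p2; D3 = C3     # ... and is processed on S3
    C4 = D3 + p3; D4 = C4     # station S4
    for j in seq[1:]:
        p1, p2, p3 = P[j]
        C1 = D1 + p1                      # Eq. (9)
        D1 = max(D2, C1)                  # Eq. (10)
        C2 = D1 + p2 if D3 > D1 else D1   # Eqs. (11)-(12); D3 is still the previous job's
        D2 = max(D3, C2)                  # Eq. (13)
        C3 = D2 if D3 > D1 else D2 + p2   # Eqs. (14)-(15)
        D3 = max(D4, C3)                  # Eq. (16)
        C4 = max(D4, D3) + p3             # Eq. (17)
        D4 = C4
    return D4                             # Eq. (18)

# Processing times from Table 3.1: job -> (stage 1, stage 2, stage 3)
P = {1: (1, 7, 5), 2: (2, 3, 5), 3: (2, 5, 2), 4: (1, 1, 4), 5: (1, 1, 4),
     6: (2, 7, 3), 7: (1, 3, 5), 8: (1, 2, 2), 9: (2, 9, 3), 10: (3, 7, 2)}

print(makespan([5, 4, 1, 3, 2, 10, 8, 7, 6, 9], P))   # 40 (best sequence)
print(makespan([10, 5, 3, 6, 8, 9, 1, 7, 4, 2], P))   # 59 (worst sequence)
```

Note that the sequential assignments mirror the pseudo code: both IF tests read the previous job's D3, which is only overwritten afterwards.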


3.4.5 Formulating the Objective Function

It should be clear that the previous allocation policy rules out the possibility of holding a job at a station when the next station is free. Allowing such holds may seem only to increase the completion times of intermediate jobs; however, the following solution of the same example shows that such an action can decrease Cmax.

Consider again the example of section 3.4.4 with the processing times given in Table 3.1, and assume that the jobs are sequenced in the order 4,2,3,1,10,6,8,5,7,9; that is, job 4 is placed in the first position, job 2 in the second, and so on. Figure 3.3 presents the result of applying the greedy dispatching policy described above; the blocking periods are shown with dotted rectangles, while the solid ones indicate the processing times at the stations. It is clear from Figure 3.3 that the makespan for this sequence is Cmax = 44 units of time. Next, we allow some idle periods even though one of the identical stations is idle and available. As shown in Figure 3.4, job 3 waits at station 2 until station 3 becomes available: although station 2 could process it, station 2 is used only to hold job 3 while it waits for station 3, which starts processing job 3 at time 6 and finishes at time 11. The same happens with the job in position 9: after job 7 completes at station 1, it moves to station 2, and although station 2 is available for processing, it is kept idle while job 7 waits there for station 3 to become available.

[Figure 3.3: Gantt chart for the sequence 4,2,3,1,10,6,8,5,7,9 under the greedy policy (Cmax = 44). Figure 3.4: Gantt chart for the same sequence with deliberate holding at station 2 (Cmax = 39).]

Notice that jobs 4 and 8 can be processed on either of the duplicate stations S2 or S3. The last job now leaves the whole system at time 39, which is 5 units of time less than under the previous policy. Notice also that the total blocking time is reduced from 25 to 18, i.e., 7 units of time less than with the greedy policy.

The above example indicates that it is not enough to determine the best permutation sequence; one should also determine how to allocate the jobs between the duplicate stations in the second stage.
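To illustrate this point numerically, the following brute-force sketch (our own illustration, not part of the thesis) fixes the sequence 4,2,3,1,10,6,8,5,7,9 and enumerates all 2^10 assignments of positions to the duplicate stations, timing each schedule with an earliest-start recursion consistent with the blocking constraints formulated in the remainder of this section. The best assignment beats the greedy policy's makespan of 44 for this sequence.

```python
from itertools import product

# Table 3.1 processing times: job -> (stage 1, stage 2, stage 3)
P = {1: (1, 7, 5), 2: (2, 3, 5), 3: (2, 5, 2), 4: (1, 1, 4), 5: (1, 1, 4),
     6: (2, 7, 3), 7: (1, 3, 5), 8: (1, 2, 2), 9: (2, 9, 3), 10: (3, 7, 2)}

def cmax_for_assignment(seq, Y, P):
    """Earliest-start makespan for a fixed sequence and a fixed assignment
    Y (Y[j] = 1 if the job in position j is processed by station 3, else by
    station 2). A job passes through the other duplicate station with zero
    processing time; zero buffers mean a job may leave station s only after
    the previous job has left station s+1."""
    n = len(seq)
    O = []                               # operation times at the 4 serial stations
    for j, job in enumerate(seq):
        p1, p2, p3 = P[job]
        O.append((p1, 0 if Y[j] else p2, p2 if Y[j] else 0, p3))
    D_prev = [0, 0, 0, 0]                # departures of the previous position
    for j in range(n):
        C = [0] * 4
        D = [0] * 4
        for s in range(4):
            start = max(D_prev[s], D[s - 1] if s > 0 else 0)
            C[s] = start + O[j][s]
            # blocking: cannot leave station s before the previous job left s+1
            D[s] = max(C[s], D_prev[s + 1]) if s < 3 else C[s]
        D_prev = D
    return D_prev[3]

seq = [4, 2, 3, 1, 10, 6, 8, 5, 7, 9]
best = min(cmax_for_assignment(seq, Y, P) for Y in product((0, 1), repeat=10))
print(best)   # at most 39, beating the greedy policy's 44 for this sequence
```

Exhaustive enumeration of the station assignments is only viable for small N; the mathematical models below handle the assignment and the sequencing jointly.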

Now, we propose two mathematical models with the same objective, namely minimizing the completion time of the last job at the last stage, Cmax. The first model uses nonlinear functions to distribute the jobs between the identical stations of the second stage, while the second uses additional linear constraints to avoid the complexity of the nonlinearity. A tradeoff analysis between these two alternative models is given in Chapter 5. In the following paragraphs the notation used in both models is defined; then the nonlinear model (INLM1) and the linear model (ILM2) are described in turn.

Notation used:

N      Number of jobs to be scheduled
i      Job index, i = 1, 2, …, N
j      Position index, j = 1, 2, …, N
Xi,j   Sequencing variable equal to 1 if job i is assigned to position j, and 0 otherwise
Pi,k   Processing time of job i at stage k, k = 1, 2, 3
Cs,j   Completion time of the job in the jth position at station s, s = 1, …, 4
Ds,j   Departure time of the job in the jth position at station s, s = 1, …, 4
Yj     Sequencing variable equal to 1 if the job in the jth position is processed by station 3, and 0 otherwise
ODj    Operation time required by the job in the jth position at the second stage (station 2 or 3)
Os,j   Operation time required by the job in the jth position at the sth station
D4,N   Completion time of the job in the last position at the last station, i.e., the makespan Cmax
L      Constant, set larger than any operation time (used to transform the nonlinear constraints into linear ones)

INLM1:

Minimize  D4,N                                                     (19)

Subject to

Σ(i=1..N) Xi,j = 1                    j = 1, …, N                  (20)
Σ(j=1..N) Xi,j = 1                    i = 1, …, N                  (21)
O1,j − Σ(i=1..N) Pi,1 Xi,j = 0        j = 1, …, N                  (22)
ODj − Σ(i=1..N) Pi,2 Xi,j = 0         j = 1, …, N                  (23)
O4,j − Σ(i=1..N) Pi,3 Xi,j = 0        j = 1, …, N                  (24)
O3,j − Yj · ODj = 0                   j = 1, …, N                  (25)
O3,j − ODj + O2,j = 0                 j = 1, …, N                  (26)
C1,1 = O1,1                                                        (27)
Cs,j − Ds,j−1 − Os,j ≥ 0              j = 2, …, N; s = 1, …, 4     (28)
Cs,j − Ds−1,j − Os,j ≥ 0              j = 1, …, N; s = 2, …, 4     (29)
Ds,j ≥ Cs,j                           j = 1, …, N; s = 1, …, 4     (30)
Ds,j ≥ Ds+1,j−1                       j = 2, …, N; s = 1, …, 3     (31)
Xi,j ∈ {0, 1}, Yj ∈ {0, 1}            i = 1, …, N; j = 1, …, N     (32)
Os,j, ODj, Ds,j, Cs,j ≥ 0             j = 1, …, N; s = 1, …, 4     (33)

The binary variables Xi,j determine the positions of the jobs in the sequence: every job has to be assigned to exactly one position, as stated by Equations (20) and (21). Depending on the job in the jth position, the processing time of that position at the first stage is given by Equation (22); similarly, Equation (24) determines the processing time at the third stage, and Equation (23) at the second stage. The job in position j is processed either by the first duplicate station (in which case Yj = 0 and O2,j = ODj) or by the second duplicate (in which case Yj = 1 and O3,j = ODj), as stated in Equation (25). The complementary relation in Equation (26) guarantees that O3,j = 0 in the former case and O3,j = ODj in the latter. Notice that the equations in (25) are the only nonlinear ones in the model.

Since all stations are available at time 0, the completion time of the job in the first position equals its duration, as stated by Equation (27). Similarly, a job can start processing at station s as soon as the job in the previous position has cleared that station. Equations (28) and (29) determine the completion time of the job in slot j at station s. In more detail, Equation (28) guarantees that station s is free, i.e., that the previous job (j−1) has departed, while Equation (29) guarantees that job j has departed from the previous station (s−1) and finished at station s. In other words, to minimize the idle time, and thus the makespan, the job in the jth position will be


transferred to the next station s+1 as soon as it is completed at the current station s and the job in position j−1 leaves station s+1. Equation (30) states that a job can depart only after it is completed. Equation (31) states that a job cannot depart from its current station unless the previous job (j−1) has departed from station s+1. Finally, the makespan is defined by the departure time of the last job from the last station, Cmax = D4,N = C4,N.

Although there are only N nonlinear equations (Equation (25)), they dramatically increase the complexity of the problem. Therefore, an alternative way of defining the processing times of the stations in the second stage is proposed: additional linear constraints (Equations (34) and (35)) replace Equation (25) of the previous model. The constant L is set larger than the largest operation time to ensure the correctness of Equations (34) and (35); for example, it can be set to the maximum operation time plus one. Furthermore, we have realized that some of the variables are redundant and can be omitted without affecting the results. For example, as soon as a job is completed at the fourth station it can be moved forward because there is nothing to block it; thus D4,j = C4,j for any j. The alternative mathematical model is presented below in complete form.

ILM2:

Minimize  D4,N                                                     (19)

Subject to (20)-(24), (26), (31), (32), (33), and

O2,j ≤ (1 − Yj) · L                   j = 1, …, N                  (34)
O3,j ≤ Yj · L                         j = 1, …, N                  (35)
D1,1 = O1,1                                                        (36)
D1,j − D1,j−1 − O1,j ≥ 0              j = 2, …, N                  (37)
Ds,j − Ds−1,j − Os,j ≥ 0              j = 1, …, N; s = 2, 3, 4     (38)
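As a quick sanity check on the linearization (our own illustration, not part of the thesis), one can enumerate integer pairs (O2,j, O3,j) and confirm that constraints (26), (34), and (35) admit exactly the same split of ODj between the duplicate stations as the nonlinear Equation (25), provided L exceeds the largest operation time:

```python
def feasible_splits(OD, Y, L):
    """Integer pairs (O2, O3) satisfying O2 + O3 = OD (Eq. 26),
    O2 <= (1 - Y) * L (Eq. 34), and O3 <= Y * L (Eq. 35)."""
    return [(o2, OD - o2) for o2 in range(OD + 1)
            if o2 <= (1 - Y) * L and (OD - o2) <= Y * L]

OD, L = 7, 10   # e.g. the stage-2 time of job 10 in Table 3.1; L > max operation time
print(feasible_splits(OD, 0, L))   # [(7, 0)]  -> processed by station 2
print(feasible_splits(OD, 1, L))   # [(0, 7)]  -> processed by station 3
```

If L were smaller than OD (say L = 5 here), the Y = 1 case would become infeasible, which is why L must exceed the largest operation time.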

The alternative model avoids the nonlinear constraints that are the main difficulty in solving the INLM1 mathematical model. Thus, there is a tradeoff between the two alternatives, as the computational experiments in Chapter V will show; this tradeoff depends mainly on the characteristics of the processing times.

In the next chapter, detailed solution procedures for the problem under consideration are described.


Chapter IV
Solution Procedure

This chapter presents alternative solution methods for the problem. We first solve the problem with a standard integer solver, LINGO 8.0, to obtain benchmark solutions. Since initial experiments (a pilot run) indicated that problems with more than 10 jobs may take a prohibitively long time, we propose a heuristic algorithm, a Genetic Algorithm (GA), to solve large-size problems. The performance of the GA is evaluated by comparing its results with the optimal solutions obtained by Lingo. In order to evaluate the GA on large-size problems, we also compare its results with those obtained using the Premium Solver Platform (PSP), which has proven effective and efficient in solving similar problems in previous studies.

The organization of this chapter is as follows. The next sections briefly describe Lingo and the Premium Solver Platform (PSP). Lingo is used to obtain optimal solutions, while PSP is used both to solve the problem according to the existing policy (as in previous work) and to solve large-size problems. We then present the implementation details of the heuristic approach, a Genetic Algorithm (GA), applied to this problem.


4.1 Employing Lingo

Lingo is a powerful software tool developed to harness the power of linear and nonlinear optimization for formulating and solving large problems. To use the software, one prepares a Lingo model corresponding to the mathematical models developed in Chapter 3. The Lingo models of INLM1 and ILM2 are presented in Appendix B.

4.2 Premium Solver Platform

According to Al-Ohali and Bolat (2004), the Premium Solver Platform (PSP) was chosen because of its ability to work very efficiently and effectively on permutation-type scheduling problems. It works in a spreadsheet environment that allows the user to express complex relationships. PSP was able to arrive at solutions identical to those achieved using Branch and Bound (B&B), which has been studied extensively. For large-size problems, PSP arrived at good solutions in acceptable time. In addition, it gives flexibility to balance efficiency and effectiveness, i.e., accuracy and time limits, through adjustment of available parameters such as the stopping criteria, the total CPU time allowed, and the maximum time without any change in the solution. In fact, the scheduling problem under consideration has been reformulated to suit the PSP representation requirements.

The PSP uses a variety of techniques known as preprocessing and probing, which allow the Solver to preset values for some binary integer variables based on the settings

of others, and to quickly determine whether a sub-problem needs to be solved during the B&B process. Moreover, PSP takes advantage of cliques (Special Ordered Sets) of binary integer variables to select the next node or sub-problem to be explored and the next variable to branch upon. It uses the Dual Simplex method to solve its "child" problems more quickly. Another feature is the "cutoff", which uses a known integer solution to eliminate non-optimal branches; the Solver can then speed up the solution process on new runs.

Al-Ohali and Bolat (2004) showed that the standard solver PSP can be employed efficiently. Computational experiments on randomly generated data indicated that for small-size problems (up to 18 jobs) PSP finds optimal solutions very efficiently. For large-size problems, PSP can be used as a heuristic by terminating it at a predefined CPU time limit. Al-Ohali and Bolat (2004) provided a comparative study against a GA implementation and indicated that a good improvement can be obtained over the previously known solution; in the worst case, the average improvement was significant at the 90% confidence level. PSP can also be sped up by feeding it the best makespan value found by the GA as a cutoff value, so that the Solver eliminates worse branches early.

4.3 Genetic Algorithm

The Genetic Algorithm (GA) is a well-known technique for solving hard problems involving huge search spaces. While the solution obtained using a GA is not guaranteed to be optimal, it is widely used for its practicality in dealing with complicated problems. For example, solving a 20-job scheduling problem by complete enumeration requires producing 20! ≈ 2.43×10^18 different permutations. If each permutation took 0.01 seconds to produce and evaluate, evaluating all scheduling alternatives would require hundreds of millions of years. B&B and dynamic programming could provide a more acceptable time frame, but there is a chance that such techniques degenerate into complete enumeration of all scheduling alternatives. Heuristic approaches such as the GA, on the other hand, provide a good compromise between search cost and the quality of the obtained results. Other heuristic approaches, namely Tabu Search (TS) and Simulated Annealing (SA), are also used in the literature (Laporte and Osman, 1995b). However, the GA is selected here due to its suitability to our problem: it has the characteristic of searching a large space in a parallel fashion, which helps in finding good solutions.
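The arithmetic can be checked directly (the 0.01-second-per-permutation figure is the illustrative assumption from the text):

```python
import math

perms = math.factorial(20)                  # number of possible sequences
seconds = perms * 0.01                      # at 0.01 s per permutation
years = seconds / (365.25 * 24 * 3600)      # Julian-year conversion
print(perms)                # 2432902008176640000
print(round(years / 1e6))   # roughly 771 million years
```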

In the following sections, a description of Genetic Algorithms (GAs) is presented. Subsection 4.3.1 gives an overview of the GA and describes its different parameters. Since the GA is a general-purpose iterative heuristic that may be used for solving any type of optimization problem, it needs to be tailored to the specific problem at hand. Subsection 4.3.2 describes the details of the GA implementation in this work, including the settings of the various GA parameters and the choice of operators.


4.3.1 GA Overview

The GA is a search technique that loosely emulates the process of natural evolution as a means of progressing towards better solutions. It is based on the premise that individuals are likely to improve in quality and characteristics with the passage of generations. This idea follows from the properties of the GA operators, which try to transfer the good characteristics of the individuals in the current generation to those in the next. In fact, the GA offers an analogy to the natural rule of "survival of the fittest": the better or stronger individuals are, the higher their chances of surviving, reproducing, and continuing their line by passing their good characteristics to their offspring (Holland (1975) and Goldberg (1989)).

A number of design factors affect the actual implementation of a GA, including:

1. How to create new solutions from old ones, i.e., how to generate new chromosomes from parental chromosomes (GA operators)?
2. How to select the parental chromosomes (solutions) used to create the new ones (selection method)?
3. How to evaluate the quality of a solution (fitness function)?
4. What should be the maximum number of chromosomes considered at a time (population size)?
5. When to stop the search process (stopping criteria)?


A distinguishing feature of the GA is that it works with a set of solutions, in contrast to other iterative heuristics that work on a single solution only. A high-level description of a GA implementation and its corresponding flowchart are given in Figures 4.1 and 4.2, respectively.

The algorithm starts with a set of initial valid solutions called population. These initial solutions may be generated randomly or taken from the results of a constructive algorithm. Usually, a random initial population is preferred but the choice of initial population is dependant on the problem under consideration. In some cases, it may be beneficial to construct the initial population by running a constructive algorithm. In this case, GA is used to further improve the results obtained from the constructive algorithm. A general conclusion is that the initial population can be a mix of semi-optimized solutions and random solutions. Next, all individual chromosomes in the generated population are evaluated using a fitness function. The fitness function associates a fitness value with each chromosome, to facilitate subsequent steps of the algorithm. A fitness function can be any reasonable mapping of the result of a solution with respect to the objective being optimized. The fitness value of a solution is a measure of the solution’s proximity to the optimal solution. Some of the important characteristics that should be considered when designing a fitness function are the efficiently and accuracy. Then, after generating the initial population, the algorithm enter an iterative process, each of which produces a new generation. The generation process goes through a number of major steps implementation including: parental choice, cross-over operator,


the mutation operator, evaluation and selection. The iterative process continues until a particular stopping criterion is satisfied. In the following, we give a brief description of each of these major steps.

Algorithm Genetic_Algorithm
  (Np = Population Size, Ng = Number of Generations,
   No = Number of Offspring, Pµ = Mutation Probability)
  Construct an initial population of size Np
  Evaluate the fitness of each chromosome in the initial population and record the best chromosome
  For i = 1 to Ng
      For j = 1 to No
          Choose two parents x, y with probability proportional to fitness value
          Perform crossover on x, y to generate offspring
          Apply mutation to each of the generated offspring with probability Pµ
          Evaluate the fitness of each offspring; update the best chromosome if necessary
      EndFor
      Select the new population from the old one and the newly generated offspring
  EndFor
  Return the highest-scoring chromosome in the population
End Genetic_Algorithm

Figure 4.1: A typical Genetic Algorithm.


[Flowchart summary of Figure 4.2: Start → generate initial population → evaluate initial population → select two chromosomes to mate → apply crossover operator → mutate offspring → evaluate offspring; if the population limit is not reached, select two more chromosomes to mate; otherwise select the best chromosomes from the previous and current generations; if the stopping criteria are not satisfied, repeat; otherwise print the best solution and stop.]

Figure 4.2: A flowchart of a Genetic Algorithm.

In the parental selection step, two chromosomes of the current population are selected. These chromosomes, called parents, mate to produce new offspring. The number of parents selected equals the desired number of offspring to be produced per iteration. There are various methodologies for selecting the parents; the most commonly used is the roulette wheel selection method, in which individuals are selected with a probability proportional to their fitness value. Therefore, individuals with higher fitness values are more likely to be chosen as parents for mating. After the selection step, the genetic operators, namely crossover and mutation, are applied. These operators are described below.

The crossover step accepts two individuals (parents) and generates offspring. The crossover operator ensures that the generated offspring inherit some characteristics from each parent. Crossover is applied with a certain probability called the crossover rate; a good value for this rate also depends on the specific problem. There are different crossover operators, namely simple, order, partially mapped (PMX) and cycle crossover (Al-Harkan (1997)).

The mutation operator, used to introduce new random information into the population (Reeves (1995)), is usually applied after the crossover operator. It helps produce variation in the solutions, which reduces the chance that the search process gets trapped in a local optimum. An example of a mutation operation is the swapping of two randomly selected genes of a chromosome. The importance of this operation is that it can introduce a desired characteristic into the solution that could not be introduced


by the application of the crossover operator alone. Mutation should be applied at a low rate so that the GA does not turn into a memory-less search process.

The last step in a GA procedure is the selection of individuals for the next generation. As a result of the crossover and mutation operators, the total number of chromosomes may exceed the desired population size; this step eliminates unfit chromosomes to keep the population size under control. Again, there are numerous schemes for this purpose, including roulette wheel selection; the choice depends on the specific problem at hand (Al-Harkan (1997)).

As mentioned earlier, the iterative process continues until the stopping criterion is met. The stopping criterion may consist of one or more conditions that vary in complexity. Examples include: a maximum number of iterations has been reached, an acceptable fitness value has been attained, very little improvement was recorded in the previous iterations, or a certain degree of homogeneity has been observed in the current population.

The quality of the solution obtained by a GA depends on the choice of certain parameters such as the population type, population size, crossover rate, mutation rate, crossover type and selection scheme. The next section discusses the implementation details of the Genetic Algorithm for our specific problem.


4.3.2 GA Implementation

To apply a GA successfully to a given problem, it must be tailored to the problem's specific characteristics. In this section we provide implementation details for the different steps of the GA.

4.3.2.1 Encoding and Initial Population

For a solution to be processed by a GA, it must be represented in the form of a string, or chromosome. In this thesis the encoding scheme given in Bolat et al. (2005) is adopted, and is repeated here for the reader's convenience. A chromosome is represented as a sequence of job indices. For instance, consider a set of 6 jobs {1, 2, 3, 4, 5, 6}. A possible input job sequence is {4, 2, 5, 3, 1, 6}. We encode a scheduling solution as a string of N job indices, where N denotes the number of jobs to be scheduled; the chromosome is formed by concatenating the indices of the jobs in the solution. For example, the chromosome corresponding to the above job sequence is 4-2-5-3-1-6.
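As a sketch of this encoding, a chromosome can be built as a random permutation of the job indices (illustrative Python; the function name is ours, not from the thesis):

```python
import random

def random_chromosome(n):
    """A chromosome is a permutation of the job indices 1..n."""
    jobs = list(range(1, n + 1))
    random.shuffle(jobs)          # random combination of all jobs
    return jobs

# e.g. for 6 jobs, one possible chromosome is [4, 2, 5, 3, 1, 6]
chromosome = random_chromosome(6)
assert sorted(chromosome) == [1, 2, 3, 4, 5, 6]   # every job appears exactly once
```

An initial population is then simply a list of such random permutations.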

The results obtained using a GA depend heavily on the population size, i.e., the number of chromosomes in the population. Therefore, experiments were carried out using various population sizes, including 2N chromosomes, 3N chromosomes, etc., where N is the number of jobs. Increasing the population size improves the performance of the GA, but not without increasing the algorithm's run time. There is thus a tradeoff between population size and run time, and the designer has to choose a balanced option.


In this thesis, the initial population (generation) is generated randomly. In order to exploit problem knowledge, N solutions are generated for each stage, i.e., there is a total of 3N members in the first population. The population size and the generation size have been set as functions of N; this decision was based on a number of experiments. Each chromosome is determined randomly, as a random combination of all jobs, and then inserted into the initial population.

4.3.2.2 Fitness Function

The fitness function measures the proximity of a given solution to the optimal solution; higher fitness values characterize better solutions. Since the objective considered in our problem is the minimization of the makespan, the fitness value of a chromosome is related to the makespan of its corresponding job sequence.

For the GA operators to work properly, the minimization of the makespan needs to be translated into a maximization problem. This is done by mapping the makespan value, obtained by the procedures discussed in Chapter III, into a fitness value with the following formula:


F(i) = -Cmax(i)    for all i = 1 to P

where F(i) denotes the fitness value of chromosome i, Cmax(i) denotes the makespan of solution i, and P is the population size. This mapping preserves the quantitative differences among solutions, which a multiplicative inverse function would compress. The choice of this mapping was arrived at after a number of experiments with other functions such as 1/Cmax(i). Negating the makespan turns our problem into a maximization problem, in which we seek to maximize the fitness value. This ensures the proper working of the subsequent steps of the GA, including the next step, the parent selection method.
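A minimal sketch of this mapping (illustrative Python, not code from the thesis) shows why negation is preferred over the reciprocal: it preserves the absolute differences between solutions, which the reciprocal compresses:

```python
def fitness(cmax):
    # Negation turns makespan minimization into fitness maximization
    # while preserving the differences between solutions.
    return -cmax

makespans = [34, 40, 50]
print([fitness(c) for c in makespans])           # [-34, -40, -50]
# A multiplicative inverse compresses the gaps instead:
print([round(1.0 / c, 4) for c in makespans])    # [0.0294, 0.025, 0.02]
```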

4.3.2.3 Parent Selection Method

In each generation of the GA, a certain number of offspring is created. Each offspring is the result of mating two chromosomes called parents. This means that the number of parent pairs chosen from the current population must equal the number of offspring.

The roulette wheel selection strategy is implemented for the parental selection step. It is based on the idea of stochastic sampling with replacement. In this scheme, an individual chromosome is selected with a probability that is proportional to its fitness value. The probability Pchoice(i) of choosing a chromosome ‘i’ can be given as


Pchoice(i) = (1 - (F(i) / ∑j=1,P F(j))) / (P - 1)

where the division by (P - 1) normalizes the weights so that the probabilities sum to one.

Under this scheme fitter chromosomes have a higher chance of being chosen, but individuals with low fitness values are still selected with non-zero probability. The motivation for favoring fitter chromosomes over weaker ones is the following: mating two fit chromosomes is more likely to produce fit offspring than mating two weak ones. However, choosing only the fittest individuals is a greedy approach that may lead the algorithm to a local optimum. Therefore, roulette wheel selection also permits weaker chromosomes to take part in crossover. This property ensures diversity in the population, which is known to be essential for the healthy progress of a GA towards the best points in the search space.
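A sketch of this selection rule (illustrative Python; `roulette_select` is our name, and we weight by 1 - Cmax(i)/∑Cmax so that shorter-makespan solutions get larger wheel slices):

```python
import random

def roulette_select(population, makespans):
    """Pick one chromosome with weight 1 - Cmax(i)/sum(Cmax), so that
    fitter (shorter-makespan) solutions are chosen more often, while
    weaker ones keep a non-zero chance."""
    total = sum(makespans)
    weights = [1.0 - c / total for c in makespans]
    return random.choices(population, weights=weights, k=1)[0]
```

Over many draws, a solution with makespan 10 is selected roughly nine times as often as one with makespan 90 in a two-member population.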

4.3.2.4 Crossover Operator

Crossover is the operator that causes inheritance of characteristics from one generation to the next. Many types of crossover operators have been reported in the literature, such as one-point and two-point simple crossover, order crossover, and partially mapped crossover (PMX) (Osman and Laporte (1995)). For the present problem, each gene in the chromosome representation is distinct, and this property must be preserved from generation to generation for a chromosome to represent a valid solution.


Therefore, simple crossover operators cannot be used, as they may produce infeasible solutions (Al-Harkan (1997), Bolat et al. (2005)).

Among the crossover operators guaranteed to produce feasible offspring is PMX, which avoids any duplication of genes through built-in redundancy checks. To clarify this operator, consider the following example, where a slash "|" indicates a cut point and x indicates a not-yet-determined job number.

1. Suppose the previous step selected two parents P1: 3-2-4-5-6-1 and P2: 4-2-5-1-6-3.
2. Two random cuts are selected to give P1: 3-|2-4-5-|6-1 and P2: 4-|2-5-1-|6-3.
3. Swapping the segments between the cuts produces C1: x-|2-5-1-|x-x and C2: x-|2-4-5-|x-x.
4. The remaining genes are mapped as follows:
   - Job 3 from P1 is placed in the first location of C1 without any conflict.
   - Job 6 from P1 is placed in the fifth location of C1 without any conflict.
   - Job 1 cannot be placed directly in its corresponding location in C1 because it already appears in the segment copied from P2. To resolve the conflict, the segment mapping is followed: 1 (in P2's segment) maps to 5 (in P1's segment), and 5 in turn maps to 4. Thus, job 1 in P1 is replaced by job 4 in C1.
5. This process yields C1: 3-|2-5-1-|6-4. Applying the same process to C2 gives C2: 1-|2-4-5-|6-3.

Because of its wide acceptance among researchers in this field (Al-Harkan (1997), Bolat et al. (2005)), the PMX operator will be used.
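The steps above can be sketched as follows (illustrative Python; `pmx` and its argument names are ours). Running it on the example parents with cuts after positions 1 and 4 reproduces C1 and C2:

```python
def pmx(p1, p2, cut1, cut2):
    """Partially mapped crossover: each child keeps the other parent's
    middle segment; genes outside the cuts are copied from the child's
    own parent, following the segment mapping whenever a conflict arises."""
    n = len(p1)
    c1, c2 = [None] * n, [None] * n
    c1[cut1:cut2] = p2[cut1:cut2]
    c2[cut1:cut2] = p1[cut1:cut2]

    def fill(child, parent, seg_from, seg_to):
        for i in list(range(cut1)) + list(range(cut2, n)):
            gene = parent[i]
            while gene in child[cut1:cut2]:            # duplicate: follow mapping
                gene = seg_to[seg_from.index(gene)]
            child[i] = gene

    fill(c1, p1, p2[cut1:cut2], p1[cut1:cut2])
    fill(c2, p2, p1[cut1:cut2], p2[cut1:cut2])
    return c1, c2

c1, c2 = pmx([3, 2, 4, 5, 6, 1], [4, 2, 5, 1, 6, 3], 1, 4)
print(c1, c2)   # [3, 2, 5, 1, 6, 4] [1, 2, 4, 5, 6, 3]
```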


The crossover operator is generally performed with a high probability Pc. A pilot experiment indicated that 0.9 is a good value to work with.

Another phenomenon observed is that an offspring resulting from the crossover of two dissimilar parents is sometimes identical to one of the parents. This leads to duplicate chromosomes in the population, which decreases its diversity. This is an undesirable phenomenon in a GA because it hinders the search process from exploring new regions of the search space. To prevent it, we discard any such duplicate offspring.

4.3.2.5 Mutation Operator

For each chromosome selected for the next generation, a random number rand in the range [0,1] is generated, and mutation is applied to the chromosome if rand < Pu, where Pu is the mutation probability. A pilot experiment found Pu = 0.1 to be a suitable probability. The mutation operator is implemented as a series of random pair-wise interchanges of randomly chosen genes. The number of interchanges is a function of the problem size N and a randomly generated factor between 0.03 and 0.05.
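A sketch of this operator (illustrative Python; deriving the swap count as N times the random 0.03-0.05 factor is our reading of the text above, and the minimum of one swap is our assumption):

```python
import random

def mutate(chromosome, n):
    """Pair-wise interchange mutation: swap randomly chosen pairs of genes.
    The number of swaps is derived from the problem size n and a random
    factor in [0.03, 0.05]; at least one swap is always performed."""
    n_swaps = max(1, round(n * random.uniform(0.03, 0.05)))
    mutant = chromosome[:]
    for _ in range(n_swaps):
        i, j = random.sample(range(len(mutant)), 2)
        mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant
```

The result is always a valid permutation of the original chromosome.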

4.3.2.6 Selection Process

There are a number of applicable selection schemes including:


1. Roulette wheel selection: chromosomes for the next generation are selected with probabilities proportional to their fitness values. Under this scheme, chromosomes with lower fitness values may also propagate to the next generation with small probability.
2. Competitive selection: the best P chromosomes are selected from the pool of parents and offspring. This scheme is too greedy.
3. Elitist-roulette selection: the best half of the chromosomes in the current population are selected, and the remaining P/2 chromosomes are selected by roulette wheel. With this approach, the global best over generations is always retained.
4. Elitist-random selection: the best half of the chromosomes are selected from the parents and offspring, and the remaining P/2 chromosomes are selected randomly. Again, the global best over generations is always retained.

Initial experiments suggest that elitist-random selection works best for this problem and gives good results.
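Scheme 4 can be sketched as follows (illustrative Python; the function and argument names are ours):

```python
import random

def elitist_random_select(pool, fitness, p):
    """Elitist-random survivor selection: keep the best p//2 chromosomes
    from the combined parent/offspring pool, then fill the remaining
    slots with randomly chosen members of the rest."""
    ranked = sorted(pool, key=fitness, reverse=True)
    elite = ranked[: p // 2]
    rest = random.sample(ranked[p // 2:], p - len(elite))
    return elite + rest
```

Because the elite half always survives, the global best chromosome is never lost between generations.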

4.3.2.7 Stopping Criterion

At the end of each generation, the best individual in the population is found and compared with the global (overall) best solution, which is also kept on record. If the current generation's best is better than the global best, the global best is updated. The history of the global best solution may help in deciding whether to stop or continue the algorithm. One criterion for stopping the GA is related to the improvement of the global best


over a certain number of previous generations: if there is no significant improvement, the GA stops and returns the global best solution.

Another stopping criterion is to stop the GA after a certain number of generations. Experiments were carried out with different limits, and it was concluded that the generation limit should be set as a function of N².
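The two criteria can be combined in a small sketch (illustrative Python; the function name and the `patience` parameter are ours):

```python
def should_stop(best_history, generation, max_generations, patience):
    """Stop when the generation budget (set as a function of N**2) is
    exhausted, or when the best makespan found per generation has not
    improved over the last `patience` generations."""
    if generation >= max_generations:
        return True
    if len(best_history) > patience:
        recent, earlier = best_history[-patience:], best_history[:-patience]
        if min(recent) >= min(earlier):   # no improvement lately
            return True
    return False
```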


Chapter V
Experimental Results And Computational Analysis

The objectives of this chapter fall into three major directions. The first is to study the computational behavior of the new mathematical models, namely INLM1 and ILM2. The second is to evaluate the results of the existing allocation policy against those of the optimal model. The third is to analyze the effectiveness and efficiency of the heuristic algorithms. The effectiveness study is based on the deviation of GA solutions from the optimal Cmax for small size problems; for large size problems, it is based on the improvement of the GA solutions over the results obtained by the premium solver platform (PSP). Unlike previous studies, the efficiency study described in this chapter is based on the computing time (total CPU seconds) required for the algorithm to terminate. This contrasts with previous studies, where this measure was limited to the time of finding the best solution (i.e., assuming that the algorithm knows in advance the best solution it may find). We apply a pairwise study to compare the computation times and Cmax of both models over the same problem sets. All computational experiments were carried out on a laptop with a 1.6 GHz CPU and 512 MB RAM.

The rest of this chapter is organized as follows. Section 5.1 describes the data generation and evaluation methodology used in this study. Section 5.2 provides experimental results for small size problems, including a comparison of the two mathematical models, an evaluation of the existing allocation policy, and an evaluation of the GA on small size

problems. Section 5.3 provides results over large size problems, including comparison of GA with the PSP as a benchmark.

5.1 Data Generation And Evaluation Methodology

For the completeness of this study, we used experimental data to compare INLM1 with ILM2, to evaluate the existing allocation policy, and to examine the effectiveness and efficiency of the proposed approaches. The performance of the GA is evaluated against the optimal solution for small size problems, and against the premium solver platform PSP (evolutionary solver) for larger problems. The data includes problems of size 5, 9, 10, 20, 50 and 100 jobs. Processing times are uniformly distributed and generated randomly. In addition, for each problem size, we considered three different ranges of processing times: 1-10, 1-50 and 1-100 time units for the middle stage, with 1-5, 1-25 and 1-50 time units for the first and last stages, respectively. In other words, when we refer to the processing time range 1-10, the processing times of the jobs at the first and last stages are drawn between 1 and 5 time units. Furthermore, we used ten different data sets within each of the above ranges; thus, a total of 600 problems have been solved.

In order to evaluate the existing greedy allocation policy, it will be compared with the optimal makespan for the 10-job problems. The evaluation will observe how close the results returned by the greedy policy are to the optimal results.


Since LINGO yields the optimal makespan for the small size problems (10-job problems), the results returned by the greedy policy and the GA will be compared to these optimal results; by doing so, the "effectiveness" of the GA is measured. Additionally, for larger problem sizes, where optimal results were not attainable, the makespans returned by the GA will be compared to the results obtained by PSP.

In order to perform a comprehensive evaluation of the optimal makespan computed by LINGO and of the GA, their run time requirements, i.e., their efficiency, also need to be considered. As pointed out earlier, large size problems require extremely long run times when an exact solution approach is used. The CPU time taken to reach the best solution will be used to measure the efficiency of the two proposed approaches.

5.2 Experiments Over Small Size Problems

5.2.1 Efficiency Evaluation Of The Two Optimal Models (INLM1 And ILM2)

As mentioned earlier, optimal solutions can be found only for small size problems, because of the extremely long running times for relatively large problems. Therefore, the performance of INLM1 and ILM2 is evaluated and compared for small size problems (9 and 10 jobs) using LINGO. Table 5.1 shows the results obtained by INLM1 for the 10-job problems, where the processing times of the jobs fall in the ranges shown in the leftmost column. Ten different problems were run for each processing time range (columns 1 to 10). 'Cmax' denotes the makespan of the optimal solution, whereas 'CPUT' stands for the total CPU time (in seconds) spent in reaching the solution with optimal makespan 'Cmax'.


Table 5.1: CPU Time For INLM1 (Optimal Results For 10 Jobs Problems).

Processing   Performance                            Problem Instances
Time Range   Measure         1      2      3      4      5      6      7      8      9     10
1-10         Cmax           42     39     35     41     50     38     36     39     34     40
             CPUT (sec)  37416  26376    630  13650    252  53298  80136     84  19572  39564
1-50         Cmax          187    232    199    191    171    194    178    163    189    190
             CPUT (sec)  32760   1260   2940  39102  14364  74172  50316  22176  34524  75264
1-100        Cmax          365    425    439    372    429    346    437    465    438    418
             CPUT (sec)    840  28655  48356  19194  95172  49140  12264 256326  24990 125958

Likewise, Table 5.2 shows the results obtained using ILM2 for the same set of problems.

Table 5.2: CPU Time For ILM2 (Optimal Results For 10 Jobs Problems).

Processing   Performance                     Problem Instances
Time Range   Measure        1     2     3     4     5     6     7     8     9    10
1-10         Cmax          42    39    35    41    50    38    36    39    34    40
             CPUT (sec)   886   628     4   325     4  1053  2033     2   466   942
1-50         Cmax         187   232   199   191   171   194   178   163   189   190
             CPUT (sec)   780    30    70   931   342  5891  1198   528   822  1463
1-100        Cmax         365   425   439   372   429   346   437   465   438   418
             CPUT (sec)    20    72    21   457  1888  1170   292   840   595  2843

LINGO can solve the problems with 9 jobs in as little as 4 seconds and, in the worst case, takes up to 7610 seconds to find the optimal solution. On average, INLM1 requires around 1720 seconds, whereas ILM2 requires around 120 seconds. As Table 5.3 indicates, the processing time range does not have any apparent effect on computing time. Meanwhile, the number of jobs has a significant effect and, in the worst case, the INLM1 model may require 256,326 seconds to solve a 10-job problem. The range of CPUT also gets wider, i.e.,


INLM1 takes between 84 and 256,326 seconds, whereas ILM2 takes between 2 and 5,891 seconds. As problems get larger, the difference in CPU time between models ILM2 and INLM1 grows as well. This coincides with the mathematical analysis of the two models, and is due to the linearity of ILM2's time requirements.

Table 5.3: Efficiency (CPUT) Of The Two Models Using LINGO.

No. of  Processing               Model INLM1                                 Model ILM2
jobs    Time Range      avg     std dev.     min      max        avg     std dev.    min     max
9       1-10           1976        2170       84     7610       95.4    118.3368      4     292
        1-50         1779.5      1196.7      436     4263        124    131.7945     15     390
        1-100        1463.9       937.7      478     3074      142.7    126.7895     14     384
10      1-10        27097.8    26141.41       84    80136      634.3    633.1661      2    2033
        1-50        34687.8    26178.54     1260    75264     1205.5    1709.436     30    5891
        1-100       66089.5    77219.81      840   256326      819.8    922.2639     20    2843

The effect of nonlinearity is shown in Figure 5.1, which illustrates the much larger growth rate of INLM1's time requirements compared with those of ILM2.

[Figure: CPUT (sec), from 0 to 40,000, plotted against the number of jobs (5 to 11) for INLM1 and ILM2; INLM1's curve grows far more steeply.]

Figure 5.1: Growth Rate Of Time Requirements Of The Two Models.



5.2.2 Evaluation Of Existing Allocation Policy Over Small Size Problems

In this section, the results obtained by the existing allocation policy are compared with the optimal makespans. The evaluation measures the deviation of the solutions obtained by the greedy (existing allocation) policy from the optimal results. This section also examines the effect of relaxing the problem by allowing some idle periods even when one of the identical stations is idle and available (as discussed thoroughly in Chapter III): we check whether such an effect exists and, if so, whether it is significant. We used PSP to conduct this experiment for the existing allocation policy.

We use the percentage deviation from optimality to evaluate the existing allocation policy, which offers an informative and simple comparison between the optimal results and those of the existing allocation policy. Table 5.4 shows the percentage deviations of the results from optimality, along with the minimum, maximum and average percentage deviations of the makespan for each processing time range. The table also reports t-test results at the 99% confidence level.

The critical limit for the t-test is 2.821 at the 1% significance level (alpha = 0.01) with a sample size of 10. As shown in Table 5.4, the t-test values are below this limit in all cases for problems of 5 and 9 jobs, and for 10-job problems with processing times in the range 1-10. This indicates that the existing greedy policy produces results that are


not significantly different from the optimal solutions. Complete details of the statistical computations are given in Appendix D.
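The reported t statistics are consistent with the one-sample formula t0 = mean / (std / √n) applied to the summary columns of Table 5.4 with n = 10; a quick check (illustrative Python):

```python
import math

def t_stat(mean_dev, std_dev, n):
    # One-sample t statistic testing H0: mean percentage deviation = 0
    return mean_dev / (std_dev / math.sqrt(n))

# 5-job problems, 1-10 range: mean 0.60, std 1.52, n = 10
print(round(t_stat(0.60, 1.52, 10), 2))   # 1.25, below the 2.821 critical limit
```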

Table 5.4: Comparing effectiveness of the existing allocation policy with the optimal solution (using 10 sets).

                          Percentage (%) Deviations From Optimal Solutions
No. of  Processing     average   Std. Dev.    Min     Max    t-test, t0
jobs    Time Range
5       1-10             0.60      1.52        0      1.2      1.25
        1-50             1.60      2.94        0      2.8      1.72
        1-100            2.10      3.65        0      3.8      1.82
9       1-10             0.50      1.19        0      2.2      1.32
        1-50             2.41      3.26        0      3.1      2.34
        1-100            2.81      3.28        0      3.9      2.71
10      1-10             0.69      1.1197      0      2.5      1.94
        1-50             2.325     1.634       0      5.23     4.88
        1-100            2.534     1.09        0      4        7.34

At the same time, PSP is able to arrive at its results for the 10-job problems in an average time of 3.4 seconds for the 1-10 processing time range, 5.2 seconds for the 1-50 range, and 8 seconds for the 1-100 range. This compares to average times for ILM2 (using LINGO) of 634, 1206, and 820 seconds, respectively. Table 5.5 shows the results obtained by the PSP solver using the existing allocation policy for the 10-job problems.

Table 5.5: Results Obtained Using Existing Allocation Policy For 10 Jobs Problems.

Processing   Performance                 Problem Instances
Time Range   Measure       1    2    3    4    5    6    7    8    9   10
1-10         Cmax         42   40   35   41   51   38   36   39   34   41
             CPUT         14    2    8    3    1    1    1    1    2    1
1-50         Cmax        196  240  210  196  171  198  180  166  191  193
             CPUT          2    7    2    4    8    8    6    4    6    5
1-100        Cmax        376  434  452  385  447  357  445  477  450  418
             CPUT          8    8    4   14    4    2   34    1    3    2


5.2.3 Evaluating The Performance Of The GA Over Small Size Problems

The optimal solutions of model ILM2, obtained with LINGO, will be used to benchmark the GA on the 10-job problems. The run time requirements and Cmax are the bases of this benchmarking: the Cmax comparison tests the effectiveness of the proposed GA, while the CPU run time determines its efficiency.

Table 5.6 shows the results obtained by the GA for the 10-job set of problems. 'Cmax' denotes the makespan, whereas 'CPUT' stands for the CPU time (in seconds) spent in reaching the solution with the best makespan 'Cmax'.

Table 5.6: Results Obtained By GA For Problems Of 10 Jobs.

Processing   Performance                 Problem Instances
Time Range   Measure       1    2    3    4    5    6    7    8    9   10
1-10         Cmax         42   40   35   41   51   38   36   39   34   41
             CPUT          1    2    2    0    1    5    0    0    1    1
1-50         Cmax        196  240  210  196  172  198  180  166  191  193
             CPUT          0    3    1    1    1    2    6    1    1    2
1-100        Cmax        376  434  452  386  447  357  445  477  450  418
             CPUT          1    1    2    1    0    6    2    1    1   10

For further analysis of these results, the percentage deviation from optimality is used to evaluate the GA's effectiveness. Again, this offers an informative and simple comparison between the optimal results and those obtained by the GA. Table 5.7 shows the percentage deviation of the GA results from optimality, together with the minimum, maximum and average percentage deviations of the makespan for each processing time range. These figures represent the effectiveness of the GA approach.

Table 5.7: Comparing the GA with the optimal solutions (10 sets in each combination).

                          Percentage (%) Deviations From Optimal Solutions
No. of  Processing     average   Std. Dev.    Min     Max    t-test, t0
jobs    Time Range
5       1-10             0.60      1.52       0       1.2      1.25
        1-50             1.60      2.94       0       2.8      1.72
        1-100            2.10      3.65       0       3.8      1.82
9       1-10             0.50      1.19       0       2.2      1.32
        1-50             2.41      3.26       1.9     3.1      2.34
        1-100            2.81      3.28       0       3.9      2.71
10      1-10             0.69      1.1197     0       2.5      1.94
        1-50             2.384     1.55       0.58    5.23     4.84
        1-100            2.559     1.114      0       4        7.26

As Table 5.7 presents, for five jobs, average deviations of 0.6, 1.6 and 2.1% represent an insignificant difference from optimal. The critical limit is 2.821 at the 1% significance level (alpha = 0.01) with a sample size of 10. As shown, the t-test value is below this limit in all three cases, indicating that the GA produces solutions that are not significantly different from the optimal ones. In the worst case, a GA solution is 3.8% away from the optimal.

Similarly, the t-test values for all 9-job cases are below the critical limit, with average deviations of 0.5, 2.41 and 2.81%, indicating that the GA produces results that are not significantly different from the optimal solutions.

The performance of the GA on the 10-job problems is also very promising. The average deviation for problems with the 1-10 processing time range shows an insignificant difference from optimal, since the t-test value is below the critical limit (2.821). In the remaining problems, with ranges 1-50 and 1-100, the average deviations are 2.38 and 2.56%, which is statistically significant.

Although the GA approach finds the optimal solution in only 8 cases (out of 30) over the 10-job problems, its results are generally close to the optimal ones. For the 10-job problems with makespan as the primary performance measure, the GA gives solutions within 0.69-2.56% average deviation from optimality, and within 5.23% in the worst case. The GA finds the optimal solution in 7 of the 10 problems with the 1-10 processing time range, and solves one instance optimally for the 1-100 range. Complete details of the results and calculations are given in Appendix D. Figure 5.2 shows that the GA reliably finds makespans very close to the optimum for problems of 10 jobs.

[Figure: makespan of the optimum solutions and of the GA solutions for problem instances 1 to 10; the two curves nearly coincide.]

Figure 5.2: Makespan of GA vs. optimal makespan for problems of 10 jobs with (1,10), (1,50) and (1,100) processing time ranges.


Table 5.8 presents statistical details of the CPU time required by the GA and by LINGO. The GA solves all 60 instances (the 9- and 10-job problems) within seconds. As also indicated, there are cases that LINGO solves optimally in a few seconds (2.4 sec), but as the number of jobs increases, both the average CPU time and the range of CPU times grow, i.e., some cases take a very long time.

Table 5.8: CPU time requirement of LINGO and GA (10 sets per row).

                         GA CPU time (sec)        LINGO using Model ILM2, CPU time (sec)
No. of  Processing    avg   std dev.  min  max        avg    std dev.    min     max
jobs    Time Range
9       1-10          0.7     1.2      0    1        95.4    118.3368      4     292
        1-50          1.0     1.1      0    1         124    131.7945     15     390
        1-100         1.0     1.2      0    1       142.7    126.7895     14     384
10      1-10          1.3     1.4      0    5       634.3    633.1661      2    2033
        1-50          1.8     1.6      0    6      1205.5    1709.436     30    5891
        1-100         2.5     3.1      0   10       819.8    922.2639     20    2843

Moreover, for the 10-job problems, the GA shows significant dominance in terms of CPU run time: LINGO returned the optimal makespan in an average time of 886 seconds, while the GA found near-optimal solutions in an average of 2 seconds. In other words, the GA finds near-optimal solutions in about 0.22% of the time LINGO needs to reach the optimal solutions. This demonstrates the significant advantage of the GA approach, which returns near-optimal solutions in an extremely short CPU run time.


In conclusion, the GA can arrive at results that are very close to the optimal solution in a reasonable time for small size problems, whereas LINGO produces the optimal solution at the cost of rapidly growing time requirements as the problem size increases.

5.3 Experiments On Large Size Problems

As shown in Section 5.2.1, the optimality obtained with LINGO comes at the cost of long run times that, as the problem size grows, are beyond our reach. Therefore, another benchmark is needed to evaluate the performance of the GA on larger problems. Lower bounds failed to expose the real performance of the GA on similar (two-stage) problems (Bolat et al. (2005)). Therefore, we will use PSP, which has proven its efficiency and effectiveness in previous work (Al-Ohaly and Bolat (2004)), as the benchmark. For larger problems, PSP can be utilized as a heuristic algorithm by terminating it at a predefined CPU time.

For problems with 20 jobs or more, computational experiments are performed to evaluate the performance of the GA. Pairwise comparisons are made between the makespans of the solutions obtained by the GA and those obtained by PSP over the same problem instances, and the percentage difference of the GA solutions is determined and evaluated.


Table 5.9 shows the makespan results obtained by PSP and the GA for problems of 20 jobs. The problems are categorized according to three different processing time ranges. Appendix D provides details about remaining results for problems of 50 and 100 jobs.

Table 5.9 Makespan Results Obtained By PSP And GA For Problems Of 20 Jobs.

Problem      Cmax of GA                 Cmax of PSP
Instance   1-10   1-50   1-100        1-10   1-50   1-100
   1        69    384     877          68    384     872
   2        67    397     750          67    392     750
   3        77    386     692          77    386     682
   4        70    394     654          69    389     654
   5        70    390     757          65    390     744
   6        72    299     723          72    297     704
   7        63    303     664          63    299     658
   8        67    337     677          66    337     671
   9        65    402     769          65    402     769
  10        70    367     714          70    367     697

We have solved all ten sets of each combination, a total of 120 problems, with the GA. Table 5.10 presents statistics on the best objective value, Cmax, of the GA solutions to give some idea about the characteristics of the data and the objective values of the GA schedules. The CPU time requirement depends mainly on the size of the problem, n. For 10 jobs the maximum CPU time is 1 second; for 20 jobs it is 13 seconds.

For 50 jobs the best solution is found after 149 seconds on average, with a maximum of 335 seconds, although the average running time of the algorithm is 315 seconds. This is because the algorithm cannot know in advance that it has encountered the best solution, and thus may continue searching for better solutions.


For 100 jobs the best solution is found after 1077 seconds on average, with a maximum of 1885 seconds, while the total running time averages 1926 seconds. Note that the total running time is not the time to find the best solution but the time needed for the algorithm to terminate; the time to find the best solution more accurately reflects the effectiveness of the algorithm.

Table 5.10 Performance Of GA Over Various Problems (10 Sets In Each Combination)

Number    Processing                Statistics related to Cmax
of Jobs   Time Range     Average    Std. Dev.    Minimum    Maximum
10        1-10             39.7       4.59         34         51
          1-50            194.2      19.67        166        240
          1-100           424.2      36.89        357        452
20        1-10             96         3.69         63         77
          1-50            356.9      36.93        299        402
          1-100           727.7      62.37        654        877
50        1-10            178.3       8.61        168        190
          1-50            893.4      64.24        809       1024
          1-100          1764.6      94.33       1653       1926
100       1-10            356.3      11.27        330        361
          1-50          17780.1      73.79       1634       1894
          1-100          3570.6     125.06       3416       3764

In order to make a fair comparison, we run PSP for only 1 second over the 10-job problems, 13 seconds over the 20-job problems, 315 seconds over the 50-job problems (the GA's average time-to-best being 149 seconds), and 1926 seconds over the 100-job problems (average time-to-best 1077 seconds). Then we compare the PSP and GA solutions with respect to solution quality, i.e., Cmax, the makespan of the schedules. Over the 20-job problems, PSP found better solutions for 15 sets, and for the remaining problems PSP and the GA found solutions of the same quality. With 50 jobs, the GA found a better solution in 29 cases and an inferior solution in only one case. Over 100 jobs, the GA found a better solution in all cases. Table 5.11 presents statistical details about the performance of the GA and PSP. Note that cases where the GA produces better solutions than PSP are indicated by negative values in the table; that is, the GA produces solutions with smaller makespan (the negative sign means a reduction in the GA makespan compared to PSP).

As the t-test values from the pairwise statistics indicate, the improvement of the GA over PSP increases as the problem size increases. The t-test value is below the critical limit at the 1% significance level (±2.821) for all levels of the 10- and 20-job problems; in other words, for these sizes the solutions produced by PSP and the GA are statistically indistinguishable. For the 50-job problems the performance of PSP drops: the GA produces a better solution in every case except one problem with processing time range 1-10, and the improvement obtained by the GA is statistically significant at the 99% confidence level. The GA performs similarly on the 100-job problems, reaching a better solution in all cases, so its performance is again significantly better.
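The t-test values in Table 5.11 are paired-sample statistics over the ten instances of each combination. A minimal sketch of how such a t0 value can be computed (illustrative only; the thesis applies the standard paired-sample formula):

```python
from math import sqrt

# Minimal sketch of the paired t-statistic t0 = dbar / (s_d / sqrt(n)),
# where diffs holds the per-instance percentage differences of one
# jobs/processing-time combination (n = 10 in Table 5.11). Assumes the
# differences are not all identical (otherwise s_d = 0).
def paired_t(diffs):
    n = len(diffs)
    dbar = sum(diffs) / n                               # mean difference
    s2 = sum((d - dbar) ** 2 for d in diffs) / (n - 1)  # sample variance
    return dbar / sqrt(s2 / n)
```

A value below -2.821 (the 1% one-sided critical value for 9 degrees of freedom) indicates a statistically significant improvement of the GA over PSP.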


Table 5.11 Performance Of PSP & GA Over Various Problems (10 Sets In Each Combination)

Percentage (%) improvement of GA solutions over PSP:

Number    Processing
of Jobs   Time Range    Average    Std. Dev.    Minimum    Maximum    t-test, t0
10        1-10            0          0            0          0          0
          1-50            0.058      0.1839       0          0.581      1
          1-100           0.026      0.0819       0          0.259      1
20        1-10            1.151      2.2144       0          7.143      1.644
          1-50            0.452      0.6          0          1.32       2.34
          1-100           1.05       1.37         0          2.628      2.437
50        1-10           -1.54       1.515       -3.88       1.613     -3.2
          1-50           -4.197      1.48        -6.413     -2.1       -8.92
          1-100          -3.318      0.78        -4.88      -1.856    -13.323
100       1-10           -2.586      1.59        -5.44      -0.28      -5.14
          1-50           -7.108      3.4        -16.157     -3.509     -6.6
          1-100          -6.78       3.31       -15.3       -3.67      -4.77

Figures 5.3, 5.4 and 5.5 present the 99% confidence intervals of the differences between the solutions obtained by the GA and PSP. As the number of jobs increases, the improvement of the GA solutions over the PSP solutions increases, while the processing time range has no apparent effect on the quality of the GA solutions. Note that the improvement is measured by subtracting the makespan of PSP from that of the GA and dividing the result by the makespan of the GA.
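The plotted limits can be reproduced from the averages and standard deviations of Table 5.11. The sketch below assumes a t-based interval with n = 10 instances per combination; the critical value 3.250 for t(0.005, 9) is taken from a standard t-table, and treating the limits this way is an assumption about how the figures were produced:

```python
from math import sqrt

# Sketch: 99% confidence interval for the mean percentage improvement,
# assuming a two-sided t-based interval over n = 10 instances.
def conf_interval_99(mean, std_dev, n=10, t_crit=3.250):
    half_width = t_crit * std_dev / sqrt(n)
    return (mean - half_width, mean + half_width)
```

For example, the 50-job, range 1-50 row of Table 5.11 (mean -4.197, std. dev. 1.48) yields an interval that lies entirely below zero, matching the significant improvement reported above.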


[Figure: improvement over PSP (%) versus number of jobs (10, 20, 50, 100); curves show the average and the lower and upper 99% confidence limits.]

Fig. 5.3 Ninety-Nine Percent Confidence Interval For The Performance Of The GA (with processing time range 1-10)

[Figure: improvement over PSP (%) versus number of jobs (10, 20, 50, 100); curves show the average and the lower and upper 99% confidence limits.]

Fig. 5.4 Ninety-Nine Percent Confidence Interval For The Performance Of The GA (with processing time range 1-50)


[Figure: improvement over PSP (%) versus number of jobs (10, 20, 50, 100); curves show the average and the lower and upper 99% confidence limits.]

Fig. 5.5 Ninety-Nine Percent Confidence Interval For The Performance Of The GA (with processing time range 1-100)


Chapter VI
Conclusions And Future Studies

In this work, we have considered a production flow line with three stages. The middle stage has two duplicate stations in order to increase the production rate. All stations are laid out serially, and every job must pass through all stations in the same order. The transfer time between stations is assumed negligible, and there is no buffer between stations.

We solved the problem according to the existing allocation policy, which was found to be suboptimal for minimizing the completion time (makespan). Therefore, we re-formulated the problem to permit better utilization. This is achieved by allowing some idle periods even when one of the identical stations is idle and available, in anticipation of a better opportunity later on. This helped overcome some overlooked deficiencies in the previously published policy.

We developed two mathematical models and employed the standard solver Lingo to obtain optimal solutions. We compared the optimal results with the results obtained by applying the existing allocation policy and found, on average, no significant difference in Cmax.

As larger problems require prohibitively long processing times with Lingo, we developed a GA to handle them. Since the lower bound failed to expose the real performance of the GA, we used PSP to benchmark it.


Lingo is able to find the optimal solution in seconds for small problems of up to 5 jobs, and within a reasonable amount of time for up to 9 jobs. From 10 jobs upward, the mathematical components of the model have a large effect on the running time of Lingo. As expected, the linear model runs faster than the non-linear model. The evaluation criteria include the CPU time required by each model.

The GA performed very well on small problems: overall, it found solutions within an average deviation of 2.5% from the optimum in the worst case, in 0.22% of the time needed by Lingo. On larger problems the algorithm shows significant dominance over Lingo in terms of CPU run time. In addition, it improves on the PSP solutions in terms of makespan for larger problems (more than 20 jobs).

Future studies may proceed in various directions, including:
1. Considering more than two duplicate stations in the middle stage.
2. Developing optimization criteria for the whole system (m stations) rather than one bottleneck stage of the system, thereby taking into consideration that the bottleneck stage may change as the optimization proceeds.
3. Developing problem-tailored optimal solution methods such as Dynamic Programming or a Branch and Bound algorithm.


References

1. Al-Harkan, I. M. (1997), On Merging Sequencing and Scheduling Theory with Genetic Algorithms to Solve Stochastic Job Shops, Ph.D. dissertation, University of Oklahoma.
2. Askin, R. G., Dror, M., and Vakharia, A. J. (1994), "Printed Circuit Board Family Grouping and Component Allocation for a Multi-Machine, Open-Shop Assembly Cell", Naval Research Logistics, 41, 5, 587-608.
3. Al-Ohali, M. and Bolat, A. (2004), "Two-Stage Flow-Shop Scheduling Problem", Second International Industrial Engineering Conference, IIEC-2004, Riyadh, Saudi Arabia.
4. Bolat, A., Savsar, M., and Al-Fawzan, M. A. (1994), "Algorithms for Real-Time Scheduling of Jobs on Mixed Model Assembly Lines", Computers and Operations Research, 21, 487-498.
5. Bolat, A. (1997), "Sequencing Jobs for an Automated Manufacturing Module with Buffer", European Journal of Operational Research, 96, 622-635.
6. Bolat, A., Al-Harkan, I., and Al-Harbi, B. (2005), "Flow-Shop Scheduling for Three Serial Stations with the Last Two Duplicate", Computers and Operations Research.
7. Bolat, A. and Al-Harkan, I. M. (2001), "Scheduling Algorithms for Flow-Shop with Duplicate Serial Stations", working paper.
8. Burns, L. D. and Daganzo, C. F. (1985), Assembly Line Job Sequencing Principles, Research Publication GMR-5127, General Motors Research Lab, Warren, MI.
9. Carlier, J. and Rebai, I. (1996), "Two Branch and Bound Algorithms for the Permutation Flow Shop Problem", European Journal of Operational Research, 90, 2, 238-251.
10. Conway, R. W., Maxwell, W. L., and Miller, L. W. (1967), Theory of Scheduling, Addison-Wesley, Reading, Mass.
11. Daganzo, C. F. and Blumenfeld, D. (1994), "Assembly System Design Principles and Tradeoffs", International Journal of Production Research, 32, 669-681.
12. French, S. (1982), Sequencing and Scheduling: An Introduction to the Mathematics of the Job-Shop, Ellis Horwood.
13. Gershwin, S. B. and Schick, I. C. (1983), "Modeling and Analysis of Three-Stage Transfer Lines with Unreliable Machines and Finite Buffers", Operations Research, 31, 354-380.
14. Groover, M. (1982), Automation, Production Systems and Computer Aided Manufacturing, Wiley, New York.
15. Garey, M. R. and Johnson, D. S. (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco.
16. Goldberg, D. E. (1989), Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley Publishing Company, Inc.
17. Hall, R. W. and Daganzo, C. F. (1983), "Tandem Toll Booths for the Golden Gate Bridge", Transportation Research Record, 905, 7-14.
18. Premium Solver Platform, http://www.solver.com/ (2004).
19. Holland, J. H. (1975), Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence, University of Michigan Press, MI.
20. Inman, R. R. and Leon, M. (1994), "Scheduling Duplicate Serial Stations in Transfer Lines", International Journal of Production Research, 32, 2631-2664.
21. Laporte, G. and Osman, I. H. (1995), "Routing Problems: A Bibliography", Annals of Operations Research, forthcoming.
22. Lee, C. Y., Cheng, T. C. E., and Lin, B. M. T. (1993), "Minimizing the Makespan in the 3-Machine Assembly-Type Flowshop Scheduling Problem", Management Science, 39, 5.
23. Mott, G. F. (1991), Optimizing Flowshop Scheduling Through Adaptive Genetic Algorithms, Chemistry Part II thesis, Oxford University.
24. Ng, W. C. (1995), "Determining the Optimal Number of Duplicate Process Tanks in a Single-Hoist Circuit Board Production Line", Computers and Industrial Engineering, 28, 681-688.
25. Okamura, K. and Yamashina, H. (1979), "A Heuristic Algorithm for the Assembly Line Mixed Model Sequencing Problem to Minimize the Risk of Stopping the Conveyor", International Journal of Production Research, 17, 681-688.
26. Osman, I. H. and Laporte, G. (1995), Meta-Heuristics in Combinatorial Optimization, J. C. Baltzer Science Publishers, Basel, Switzerland.
27. Pinedo, M. (1995), Scheduling: Theory, Algorithms, and Systems, Prentice Hall, New Jersey.
28. Reeves, C. R. (1995), "A Genetic Algorithm for Flowshop Sequencing", Computers & Operations Research, 22, 1, 5-13.
29. Savsar, M. and Biles, W. (1985), "Simulation Analysis of Automated Production Flow Lines", Material Flow, 2, 191-201.
30. Savsar, M. and Allahverdi, A. (1998), "Effect of Scheduling Policies on the Performance of Transfer Lines with Duplicate Stations", Production Planning & Control, 9, 7, 660-670.
31. Savsar, M. and Allahverdi, A. (1999), "Algorithms for Scheduling Jobs on Two Serial Duplicate Stations", International Transactions in Operational Research, 6, 411-422.
32. Shapiro, G. W. and Nuttle, H. L. W. (1988), "Hoist Scheduling for a PCB Electroplating Facility", IIE Transactions, 20, 2, 157-167.


Appendix A Terms and Definitions


Flow Shop
In the general flow shop model, there is a series of machines numbered 1, 2, 3, ..., m, and each job has exactly m tasks. The first task of every job is done on machine 1, the second task on machine 2, and so on. Every job goes through all m machines in a unidirectional order, although the time each task spends on a machine varies depending on the job the task belongs to. In cases where a job has fewer than m tasks, the processing times of the missing tasks are zero. The precedence constraint in this model requires that, for each job, task i-1 must be completed on machine i-1 before task i can begin on machine i.
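For the classical case with unlimited buffers between machines, the completion times implied by these precedence constraints follow a simple recursion. The sketch below is illustrative only; it does not model the no-buffer (blocking) behaviour of the system studied in this thesis, which needs additional terms:

```python
# Completion-time recursion for an m-machine permutation flow shop with
# unlimited buffers: a task starts when both its machine is free and the
# job's previous task is done. p[i][j] is the processing time of the
# i-th job in the sequence on machine j.
def completion_times(p):
    n, m = len(p), len(p[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            machine_free = c[i - 1][j] if i > 0 else 0   # machine j free
            prev_task_done = c[i][j - 1] if j > 0 else 0  # task j-1 done
            c[i][j] = max(machine_free, prev_task_done) + p[i][j]
    return c
```

The makespan of the sequence is the last entry, i.e. the completion time of the last job on the last machine.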

Makespan
The amount of time required to complete a set of activities. Let Ci denote the completion time of the ith job in a given batch of n jobs. The makespan, Cmax, is defined as

Cmax = max(C1, C2, ..., Cn),

i.e., the completion time of the last job. A common problem of interest is to minimize Cmax, the completion time of the last job to leave the system. This criterion is usually used to measure the level of utilization of the machines.

Job
A job can be made up of any number of tasks. It is easy to think of a job as making a product, and of each task as an activity that contributes to making that product, such as a paint task. A job usually has only a single task; the exceptions are the job shop and flow shop, where a job is broken down into tasks because different orders of tasks make up different schedules.


Machine/Station
A machine (workstation) is available to execute jobs and tasks. Different machine environments exist, such as a single machine, duplicate machines, or parallel machines.

Genetic Algorithm
An evolutionary algorithm in which a population of individuals is evolved using selection, crossover, and mutation. Originally devised as a model of the evolutionary principles found in nature, genetic algorithms have evolved into a stochastic, heuristic search method. A genetic algorithm may operate on any data type, with operators specific to that data type.

Allele
One of a set of possible values for a gene. In a binary string genome, the alleles are 0 and 1.

Chromosome
A set of information that encodes some of an individual's traits. In evolutionary algorithms, the term chromosome is often used to refer to a genome.

Crossover
A genetic operator that generates new individuals by combining, and possibly permuting, the genetic material of their ancestors. Typically used to create one or two offspring from two parents (sexual crossover) or a single child from a single parent (asexual crossover).
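For permutation chromosomes such as job sequences, the offspring must remain a valid permutation. A common one-point, order-preserving variant is sketched below; this is an assumed reconstruction in the spirit of the cut_point variable used in the Fortran code of Appendix C, not necessarily the exact operator of the thesis GA:

```python
# One-point order-preserving crossover for permutations: copy the head
# of parent1 up to the cut position, then append parent2's remaining
# genes in their relative order, so no job is duplicated or lost.
def one_point_order_crossover(parent1, parent2, cut):
    head = parent1[:cut]
    taken = set(head)
    tail = [g for g in parent2 if g not in taken]
    return head + tail

# e.g. one_point_order_crossover([1, 2, 3, 4, 5], [5, 4, 3, 2, 1], 2)
# -> [1, 2, 5, 4, 3]
```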


Migration
The transfer of individuals from one population to another.

Mutation
A genetic operator that modifies the genetic material of an individual.
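For permutation chromosomes, a frequently used mutation (illustrative; the thesis GA may use a different operator) is the swap mutation, which exchanges the jobs at two positions and therefore always yields another valid permutation:

```python
# Swap mutation for a permutation chromosome: exchange the genes at
# positions i and j; the result is a permutation of the same jobs.
def swap_mutation(seq, i, j):
    child = list(seq)
    child[i], child[j] = child[j], child[i]
    return child
```

In practice the two positions are drawn at random with some small mutation probability per individual; they are parameters here for reproducibility.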

Ready Time
The time at which a job becomes available for processing. For example, a job may be ready later than time 0 because it has not yet been completed in the previous shop.

Processing Time
The length of time needed to process a job or a task.

Completion Time
The time at which a job is finished.

Flow Time
The amount of time job i spends in the system: Fi = Ci - ri, where Ci is the completion time of the ith job and ri is its ready time.


Appendix B Lingo Code For INLM1 And ILM2

! MODEL INLM1;
SETS:
  JOBS / 1..10 /;
  POSITION / 1..10 /;
  STAGE / 1..3 /;
  LINK1(JOBS,POSITION): XIK;
  LINK2(JOBS,STAGE): PIJ;
  LINK3(POSITION): OK1,TK,OK2,OK3,OK4,DK1,DK2,DK3,DK4,YK;
ENDSETS
! THE OBJECTIVE;
[OBJ] MIN = DK4(10);
! THE BINARY VARIABLES;
@FOR(LINK1: @BIN(XIK));
@FOR(LINK3: @BIN(YK));
! EQUATIONS 2 AND 3;
@FOR(JOBS(K): @SUM(LINK1(I,K): XIK(I,K)) = 1);
@FOR(POSITION(I): @SUM(LINK1(I,K): XIK(I,K)) = 1);
! EQUATIONS 4 - 8;
@FOR(POSITION(K): OK1(K) - @SUM(LINK1(I,K): PIJ(I,1)*XIK(I,K)) = 0);
@FOR(POSITION(K): TK(K) - @SUM(LINK1(I,K): PIJ(I,2)*XIK(I,K)) = 0);
@FOR(POSITION(K): OK2(K) - YK(K)*TK(K) = 0);
@FOR(POSITION(K): OK3(K) - TK(K) + OK2(K) = 0);
@FOR(POSITION(K): OK4(K) - @SUM(LINK1(I,K): PIJ(I,3)*XIK(I,K)) = 0);
! EQUATIONS 9 - 12;
DK1(1) - OK1(1) = 0;
DK2(1) - DK1(1) - OK2(1) = 0;
DK3(1) - DK2(1) - OK3(1) = 0;
DK4(1) - DK3(1) - OK4(1) = 0;
! EQUATIONS 13 - 19;
@FOR(POSITION(K) | K #GE# 2: DK1(K) - DK1(K-1) - OK1(K) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK1(K) - DK2(K-1) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK2(K) - DK1(K) - OK2(K) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK2(K) - DK3(K-1) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK3(K) - DK2(K) - OK3(K) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK3(K) - DK4(K-1) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK4(K) - DK3(K) - OK4(K) = 0);
! HERE ARE THE PROCESSING TIMES IN A FILE;
DATA:
!PIJ = @FILE('5JOBSEX.TXT');
ENDDATA
END

! MODEL ILM2;
SETS:
  JOBS / 1..9 /;
  POSITION / 1..9 /;
  STAGE / 1..3 /;
  LINK1(JOBS,POSITION): XIK;
  LINK2(JOBS,STAGE): PIJ;
  LINK3(POSITION): OK1,TK,OK2,OK3,OK4,DK1,DK2,DK3,DK4,YK;
ENDSETS
! THE OBJECTIVE;
[OBJ] MIN = DK4(9);
! THE BINARY VARIABLES;
@FOR(LINK1: @BIN(XIK));
@FOR(LINK3: @BIN(YK));
! EQUATIONS 2 AND 3;
@FOR(JOBS(K): @SUM(LINK1(I,K): XIK(I,K)) = 1);
@FOR(POSITION(I): @SUM(LINK1(I,K): XIK(I,K)) = 1);
! EQUATIONS 4 - 8;
@FOR(POSITION(K): OK1(K) - @SUM(LINK1(I,K): PIJ(I,1)*XIK(I,K)) = 0);
@FOR(POSITION(K): TK(K) - @SUM(LINK1(I,K): PIJ(I,2)*XIK(I,K)) = 0);
@FOR(POSITION(K): OK2(K)=0);
@FOR(POSITION(K) | K #GE# 2: DK2(K) - DK1(K) - OK2(K) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK2(K) - DK3(K-1) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK3(K) - DK2(K) - OK3(K) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK3(K) - DK4(K-1) >= 0);
@FOR(POSITION(K) | K #GE# 2: DK4(K) - DK3(K) - OK4(K) = 0);
! HERE ARE THE PROCESSING TIMES IN A FILE;
DATA:
PIJ = ;
!PIJ = @FILE('5JOBSEX.TXT');
ENDDATA
END


Appendix C Fortran 90 Codes


C ****************************************
C Program Genetic Algorithm
C ***************************************
      USE PORTLIB
      implicit none
c***************************************************************
c VARIABLES DECLARATIONS
c***************************************************************
      integer, parameter :: MAX_JOBS = 101
      integer, parameter :: z = 5
      integer, parameter :: print = 0
c     job sequence
      integer, dimension(:), allocatable :: job_seq
      integer, dimension(:), allocatable :: spt_a
      integer, dimension(:), allocatable :: spt_b
cmy chang
      integer, dimension(:), allocatable :: spt_b3
      integer, dimension(:), allocatable :: best_seq
      integer, dimension(:), allocatable :: overall_best_seq
      integer, dimension(:), allocatable :: if_selected
c     processing times on M1 and M2/M3
      integer, dimension(:), allocatable :: a
      integer, dimension(:), allocatable :: b
c my ch
      integer, dimension(:), allocatable :: b3
      integer, dimension(:), allocatable :: c
      integer :: num_jobs, best_makespan=9999, overall_best_makespan=9999
      integer :: parent1, parent2, best_gen, fittest, duplicate
      integer :: i, j, k, n, present, tmp, gen, m, cut_point
      integer :: start_time, end_time, best_time
      real :: p, prev_portion, tot_fitness, highest_fitness
      character*46 line, fn
c***************************************************************
c GENETIC VARIABLES DECLARATIONS
c***************************************************************
      integer num_generation, pop_size, num_offspring, off
      real :: cross_rate=0.9, mut_rate=0.1
      real, dimension(:), allocatable :: portion, fitness
      character :: sel_scheme
      type chromosome_type
        integer, dimension(MAX_JOBS) :: seq
        integer :: makespan
      end type chromosome_type
      type (chromosome_type), dimension(:), allocatable :: chromo
c******************************************************************
c MAIN PROGRAM CODE STARTS HERE
c******************************************************************
      print *, 'ENTER THE NAME OF INPUT DATA FILE'
      read(*,*) fn
      open(unit = 5, file = 'c:\3STAGE\data_file\'//fn, status = 'old')
      read(5,*) line, line, line, num_jobs
      print *, num_jobs
      allocate( a (num_jobs) )
      allocate( b (num_jobs) )
cmy ch
      allocate( b3 (num_jobs) )
c     allocate( ind (num_jobs) )
      read(5,*) line, line, line
      do i = 1, num_jobs
        read(5,*) i, a(i), b(i), b3(i)
      enddo
c     print *, 'INDEX', (ind(i), i=1,num_jobs)
      print *, 'PT_M1', (a (i), i=1,num_jobs)
      print *, 'PT_M2', (b (i), i=1,num_jobs)
      print *, 'PT_M3', (b3 (i), i=1,num_jobs)
c100  print *, 'Enter the number of jobs to be scheduled'
c     read *, num_jobs
c     print *, 'Enter POPULATION SIZE for GA'
c     read *, pop_size
c     print *, 'Enter the NUMBER OF GENERATIONS for GA'
c     read *, num_generation
c     print *, 'Selection scheme (R)oulette,(C)ompetetive, (E)litist'
c     read *, sel_scheme
      pop_size = 260
      num_generation = 40000
      sel_scheme = 'e'
      open(unit=6,file='c:\3STAGE\GAOUT\'//fn//'-'//sel_scheme//
     +     '-GA.out', status='replace')
      num_offspring = pop_size
c     allocate arrays according to number of jobs
      allocate ( job_seq(num_jobs) )
      allocate ( spt_a (num_jobs) )
      allocate ( spt_b (num_jobs) )
c my cha
      allocate ( spt_b3 (num_jobs) )
      allocate ( best_seq(num_jobs) )
      allocate ( overall_best_seq(num_jobs) )
      allocate ( chromo (pop_size+num_offspring) )
      allocate ( portion (pop_size+num_offspring) )
      allocate ( fitness (pop_size+num_offspring) )
      allocate ( if_selected(pop_size+num_offspring) )
c     allocate ( a(num_jobs) )
c     allocate ( b(num_jobs) )
      allocate ( c(num_jobs) )
      do i = 1, num_jobs
        job_seq(i) = i
        spt_a(i) = i
        spt_b(i) = i
      end do
c     a(1)=5;a(2)=2;a(3)=3;a(4)=4;a(5)=8;a(6)=6;a(7)=1
c     a(8)=7;a(9)=9;a(10)=11;a(11)=10
c     b(1)=9;b(2)=7;b(3)=2;b(4)=6;b(5)=5;b(6)=4;b(7)=12
c     b(8)=1;b(9)=14;b(10)=15;b(11)=8
      print *, "a(i)'s are ", a
      print *, "b(i)'s are ", b
c my ch
      print *, "b3(i)'s are ", b3
      call get_makespan(num_jobs, job_seq, a, b, b3, c)
      print *, "sequence is", job_seq, " | Makespan = ", c(num_jobs)
c     print *, "C(i)'s are ", (c(i), i=1, num_jobs)
c     print *, ''
      do i = 1, num_jobs-1
        do j = i+1, num_jobs
          if ( a( spt_a(i) ) > a( spt_a(j) ) ) then
            tmp = spt_a(i)
            spt_a(i) = spt_a(j)
            spt_a(j) = tmp
          end if
        end do
      end do
      call get_makespan(num_jobs, spt_a, a, b, b3, c)
      print *, "SPT a(i) is", spt_a, " | Makespan = ", c(num_jobs)
c     print *, "C(i)'s are ", (c(i), i=1, num_jobs)
c     print *, ''
      do i = 1, num_jobs-1
        do j = i+1, num_jobs
          if ( b( spt_b(i) ) > b( spt_b(j) ) ) then
            tmp = spt_b(i)
            spt_b(i) = spt_b(j)
            spt_b(j) = tmp
          end if
        end do
      end do
      call get_makespan(num_jobs, spt_b, a, b, b3, c)
      print *, "SPT b(i) is", spt_b, " | Makespan = ", c(num_jobs)
c     print *, "C(i)'s are ", (c(i), i=1, num_jobs)
      print *, '-------------------------------------------------------'
      print *, ' Starting Genetic Algorithm '
      print *, '-------------------------------------------------------'
      print *, ''
      print *, 'Initial population of solutions is following'
      print *, ''
c     Initial population construction by making a series of random
c     swaps in the SPT_A sequence
      do n = 1, pop_size
        do i = 1, num_jobs
          chromo(n)%seq(i) = i
        end do
      end do
      j = int(rand(1))
      n = 1
      do while (n