A Neural-Evolutionary Model for Case-Based Planning in Real Time Strategy Games

Ben Niu, Haibo Wang, Peter H.F. Ng*, and Simon C.K. Shiu

Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, P.R. China
[email protected], {cshbwang,cshfng,csckshiu}@comp.polyu.edu.hk

* Corresponding author.

Abstract. Development of real time strategy (RTS) game AI is a challenging task because of the real-time constraints and the large search space involved in finding the best strategy. In this paper, we propose a machine learning approach based on a genetic algorithm and an artificial neural network to develop a neural-evolutionary model for case-based planning in RTS games. This model provides efficient, fair, and natural game AI for tackling RTS game problems. Experimental results are provided to support our idea. In the future, this model could be integrated with warbots in battlefields, either real or synthetic, to mimic human-like behaviors.

Keywords: Case-based planning, real time strategy (RTS) games, genetic algorithm, artificial neural network.

1 Introduction

Computer games provide an ideal platform for Artificial Intelligence (AI) research. For example, traditional techniques such as finite-state machines, A* search, and logic- and rule-based reasoning have proved extremely useful for developing game AI applications. Real Time Strategy (RTS) games, such as Warcraft (Blizzard Entertainment, Inc.) and Age of Empires (Microsoft, Inc.), rely heavily on AI and have attracted millions of users world-wide playing online over the Internet. These games provide not only a platform for entertainment but also an interactive environment for making friends, chatting, advertising, trading, and other social activities. RTS games have many attractive features, e.g., interesting 3D landscapes, attractive avatars, and virtual currencies and resources, which are produced, controlled, manipulated, or collected by the game players and third parties. However, the current architecture of game applications does not support well the utilization of user-contributed content to improve game playability. The difficulty is that the AI algorithms currently used in computer games are mostly heuristic-based, and the portability of these algorithms is quite poor. For example, in an RTS game, a team of computer-generated characters can perform well on a predefined map by following the heuristic rules specified by the game developer.

However, these characters may later fail to behave correctly on a new map created by the users because no rules exist to handle the new situation. To address this deficiency, we suggest building a machine learning component into the game so that the characters are able to learn directly from data to make decisions and take actions without using heuristics. We have built a prototype system to test this idea. The system is designed to mimic the human ability to learn and plan: people solve problems by first working hard, even exhaustively, to find 'good' solutions, and then remembering the correct answers in order to make decisions quickly in the future. The results demonstrate the feasibility and effectiveness of our technique. This paper is organized in five main sections. Section two describes the problem and its background. Section three proposes our method and algorithm. Section four presents our experimental results, and the conclusion is given in section five.

2 Problem Background and Description

The current game AI approach is something like a "God" AI, i.e., it knows all the user actions. Under this arrangement, however, players lose interest quickly. For example, when the game logic is designed and represented using finite state machines [1], which are usually based on hard-coded input parameters, players may lose interest once they discover that repeated patterns are being used. Moreover, this process requires a lot of time to fine-tune the parameters and often relies on trial-and-error. Once players discover that the game behavior is somewhat "hard-coded", they may be disappointed. Therefore, a machine learning approach may be a better choice for introducing human-like behavior. Hsieh and Sun [2] used 300 professional game players' replays to train a decision system for a single battlefield, resulting in much more human-like robot behavior in battle. Zanetti et al. [3] and Cole et al. [4] used genetic algorithms for first-person shooter game AI development. Salge et al. [5] applied machine learning techniques to a turn-based strategy game. Louis and Miles [6] developed a case-injected genetic algorithm for NPC actions in RTS games. These studies show that AI is becoming increasingly popular in RTS games. We will extend this machine learning idea and present our techniques and simulation data in the following sections.

2.1 Genetic Algorithm

The genetic algorithm (GA) was first introduced by John Henry Holland [7] in 1975. It is a search technique for finding approximate solutions based on the concept of natural evolution. GA is designed for large, non-linear, and poorly-understood search spaces in which heuristic-based search is unsuitable. In RTS games, GA can be used as a guided random walk based on natural selection to find an optimal solution. However, using GA to develop RTS game strategies also faces some difficulties. The first is how to design the GA encoding scheme for large and complex problems such as computer games, i.e., it is difficult to represent the whole battlefield using a binary DNA string. The second is how to design the fitness function for evaluating the chromosomes. The third is the real-time constraint.


2.2 Artificial Neural Network

An artificial neural network (ANN) is a mathematical or computational model inspired by biological neural networks. An ANN can be used as an adaptive learning system that changes its weights based on external information. However, ANNs have rarely been used in game development; only recently have Bourg and Seemann [8] introduced some discussion of using ANNs in games.

3 Problem Description and Proposed Approach

There are many possible game plays in RTS games; in this paper, we have chosen a base defense scenario (e.g., Tower Defense in Warcraft) as the testing bed for our technique. The situation is described as follows.

1. Two teams are created in a battlefield. One is the attack team (enemy); the other is the defense team (player).
2. The enemy must move to the base of the defense team and attack it.
3. The defense team is able to set up a number of cannons (e.g., ten or fifteen) in the battlefield to kill the enemy as they approach the base.
4. The goal is to inflict maximum casualties on the enemy, no matter which path they choose to approach the base.

3.1 Divide and Conquer

Since it is difficult to design one binary chromosome in GA to represent an entire RTS game, our concept is to divide the game play into smaller parts. This also makes the fitness function easier to formulate. Therefore, in this research work, the chromosome encoding focuses on base defense only. Similar to the concept of spatial reasoning of Forbus [9], battlefields can also be divided into smaller subfields with different numbers of canyons. In our simulation, we classify a battlefield into four different smaller subfields (or maps), as shown in Fig. 1.

Fig. 1. Four canyon layouts (1 Canyon, 2 Canyons, 3 Canyons, 4 Canyons). They represent common landscapes in RTS games.

Dividing the battlefield into smaller subfields also gives us two advantages. First, the fitness function can work more effectively. Second, open areas without canyons need not be considered; they do not affect the solution but would increase the run time because the search space increases.


3.2 Genetic Algorithm

GA requires two crucial components to be designed. The first is the binary encoding scheme, which is the genetic representation of the solution. The length required for the encoding depends on the area of the battlefield and the possible barriers. We use an example, shown in Fig. 2, to demonstrate the encoding idea. A map of 5x5 units is used, and 5 cannons are to be set in the battlefield. A white circle represents an open area where a cannon can be set (i.e., located). A black circle represents a barrier where a cannon cannot be set up. A grey circle represents a position where a cannon has been set up. The total number of bits in the encoding depends on the size of the open area usable for setting up cannons (i.e., encoding bits = total area − barrier area). In this example, it is 13 bits. Five cannons are set on the map randomly, as shown in Fig. 2. The encoding of each distribution of cannons is formulated as a chromosome.

Fig. 2. Demonstration of the encoding in GA
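To make the scheme concrete, the following minimal Python sketch reproduces the Fig. 2 example. The particular barrier layout and the helper names (usable_cells, decode, random_chromosome) are illustrative assumptions; the paper specifies only that each usable (non-barrier) cell contributes one bit, set when a cannon occupies it.

```python
import numpy as np

# Illustrative 5x5 map for the Fig. 2 example (layout assumed, not from the paper):
# False = open area (white circle), True = barrier (black circle). 12 barrier
# cells leave 13 usable cells, so the chromosome is 13 bits long
# (encoding bits = total area - barrier area).
barrier = np.array([
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
], dtype=bool)

# Usable cells in row-major order; each one maps to one chromosome bit.
usable_cells = [(r, c) for r in range(5) for c in range(5) if not barrier[r, c]]

def decode(chromosome):
    """Map a bit string to the set of cannon coordinates (grey circles)."""
    assert len(chromosome) == len(usable_cells)
    return [cell for cell, bit in zip(usable_cells, chromosome) if bit]

def random_chromosome(n_cannons=5, rng=np.random.default_rng(0)):
    """Place n_cannons cannons uniformly at random on usable cells."""
    bits = np.zeros(len(usable_cells), dtype=int)
    bits[rng.choice(len(usable_cells), size=n_cannons, replace=False)] = 1
    return bits

chrom = random_chromosome()
print(len(chrom), "bits:", chrom, "->", decode(chrom))
```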

In our simulation, a 50x50-unit map is used and 15 cannons are set on the map. The lengths of the chromosomes are shown in Table 1.

Table 1. Length of chromosome

Canyon    Chromosome length (bits)
1         50 x 50 − 440 = 2060
2         50 x 50 − 420 = 2080
3         50 x 50 − 268 = 2232
4         50 x 50 − 280 = 2220

The other crucial component is the fitness function used to evaluate the solutions. In our simulation, 50 chromosomes are chosen in each generation for better and quicker convergence of the GA process. The evaluation function measures the enemy's casualties (i.e., the damage caused to them). After the 15 cannons have been set up, the enemy tries to find the best path to attack the player's base. The enemy moves at a certain velocity (v) and receives damage while within the attack range of a cannon. Damage (hit points) is calculated using equation (1) below. The total damage is calculated by summing up all the individual
damages caused by the different cannons, and becomes the fitness value of the simulation. The higher the damage done to the enemies, the better the fitness of the cannon positions. Cannon positions are ranked by this fitness value and used to produce offspring. (1)
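Since equation (1) does not survive in this extracted text, the sketch below illustrates only one plausible reading of the description: while the enemy is inside a cannon's attack range it accrues damage, so each cannon contributes a damage rate multiplied by the time spent in range (in-range path length divided by the velocity v). The function name and the dps and attack_range values are assumptions, not the paper's formula.

```python
import numpy as np

def damage_along_path(path, cannons, v=1.0, attack_range=5.0, dps=1.0):
    """Total damage a unit takes walking `path` (list of (x, y) waypoints)
    past `cannons` (list of (x, y) positions).

    Assumed model: while the unit is within `attack_range` of a cannon it
    takes `dps` hit points per unit time; time in range is approximated by
    sampling each segment, i.e. (length in range) / v.
    """
    pts = np.asarray(path, dtype=float)
    total = 0.0
    for a, b in zip(pts[:-1], pts[1:]):
        seg_len = np.linalg.norm(b - a)
        # Sample the segment densely and accumulate time spent in range.
        n = max(2, int(seg_len * 10))
        samples = a + np.linspace(0, 1, n)[:, None] * (b - a)
        for k in np.asarray(cannons, dtype=float):
            in_range = np.linalg.norm(samples - k, axis=1) <= attack_range
            total += dps * (in_range.mean() * seg_len) / v
    return total

# Fitness of a cannon layout = damage dealt along the enemy's chosen path;
# the GA ranks chromosomes by this value (higher is better).
path = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
cannons = [(5.0, 2.0), (10.0, 5.0)]
print(damage_along_path(path, cannons))
```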

4 Testing Arrangements and Results

The simulation was run on an Intel Pentium IV 2.4 GHz machine with 1.5 GB RAM under Windows XP, using MathWorks Matlab 7.0 as the simulation tool. The result is shown in Fig. 3.

Fig. 3. Cannon distribution (Optimized by GA)

4.1 Results in Different Generations

The evaluation function is designed based on the damage done to the enemies. During the simulation, the fitness does not change significantly after 100 generations, as shown in Fig. 4, i.e., the result stabilizes. This stabilization is similar to the results of Sun et al. [10] and Yau and Teo [11].

Fig. 4. Damage done to the enemy over different generations


4.2 Using ANN to Speed Up the Process

GA is a time-consuming process. Table 2 shows the run time of GA for different numbers of canyons. The average run time is 1947 seconds, which is unacceptable for RTS games and real-world battlefields.

Table 2. Run time of using GA (Population size: 50, Generations: 150)

No. of Canyons    Run time of GA (seconds)
1                 1249.5
2                 1279.3
3                 2606.9
4                 2650.4

To overcome this problem, we suggest using an artificial neural network (ANN) to speed up the whole process. Our proposed neural-evolutionary model is given in Fig. 5. First, the given terrain information for deciding the cannon distribution is encoded as a chromosome. Successive generations of chromosomes are produced, and through fitness evaluation the best offspring (i.e., cannon distributions) emerge. These best cannon distributions are then used as inputs to train the ANN. Afterwards, the ANN can quickly and directly suggest cannon locations for new battlefields with different landscapes.

Fig. 5. Neural-Evolutionary Model
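As a concrete reference point for the GA box in Fig. 5, a minimal generation loop over bit-string chromosomes might look as follows. The truncation selection, one-point crossover, and bit-flip mutation operators, and their rates, are generic textbook choices assumed for illustration; the paper fixes only the population size (50), the generation count (150), and the bit-string encoding.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve(fitness, n_bits, pop_size=50, generations=150,
           crossover_rate=0.9, mutation_rate=0.01):
    """Generic GA over bit-string chromosomes (illustrative operators)."""
    pop = rng.integers(0, 2, size=(pop_size, n_bits))
    for _ in range(generations):
        scores = np.array([fitness(c) for c in pop])
        order = np.argsort(scores)[::-1]          # rank by fitness, best first
        parents = pop[order[:pop_size // 2]]      # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            if rng.random() < crossover_rate:     # one-point crossover
                cut = rng.integers(1, n_bits)
                a = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_bits) < mutation_rate  # bit-flip mutation
            children.append(np.where(flip, 1 - a, a))
        pop = np.array(children)
    scores = np.array([fitness(c) for c in pop])
    return pop[scores.argmax()]                   # best cannon layout found

# The winning chromosomes (one per canyon map) become ANN training cases.
best = evolve(lambda c: -abs(c.sum() - 15), n_bits=100)  # toy fitness
print(best.sum())
```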

The idea of case-based planning in our model is as follows. We use the four best solutions (cases), corresponding to the 1-, 2-, 3-, and 4-canyon maps obtained from the previous GA training, as the inputs to the ANN. For each case, every point within a certain
radius is treated as one of the inputs to the ANN. For example, Point A, as shown in Fig. 6, is one input. The 8 points around Point A are encoded: "-1" represents a barrier, "1" represents an open area, and the final digit "d" represents the distance between Point A and the player's base.

Fig. 6. Encoding of Point A
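For concreteness, this neighborhood encoding could be implemented as below. This is a minimal sketch in which the grid representation, the clockwise neighbor order, and the raw (unnormalized) distance d are our own assumptions; the paper fixes only the -1/1 convention and the final digit d.

```python
import numpy as np

def encode_point(barrier, point, base):
    """Encode a map cell as the ANN input [8 neighborhood bits, d].

    barrier: 2D bool array, True where a barrier blocks cannon placement.
    Neighbors are read clockwise from the top-left; -1 = barrier, 1 = open.
    d is the Euclidean distance from `point` to the player's base
    (whether d is normalized is not specified in the paper).
    """
    r, c = point
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    bits = [-1.0 if barrier[r + dr, c + dc] else 1.0 for dr, dc in offsets]
    d = float(np.hypot(r - base[0], c - base[1]))
    return np.array(bits + [d])

# Toy usage on an assumed 5x5 grid with two barrier cells.
grid = np.zeros((5, 5), dtype=bool)
grid[0, 0] = grid[2, 2] = True
print(encode_point(grid, point=(1, 1), base=(4, 4)))
```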

The encoding becomes [ -1 1 -1 -1 -1 -1 1 -1 d ]. The objective of the ANN learning is to approximate the best-location function we developed in (2). It is defined in such a way that the higher the value, the better the location for setting up a cannon. It is based on the relationships among the current landscape of Point A, the cannon locations previously found using GA, and the distance to the base. (2) Here (A1, A2) is the coordinate of Point A, n is the total number of cannons (n = 15 in this case), (K1, K2) is the coordinate of cannon K, and r is a parameter controlling the spread of f(Point A). The ANN is trained using back-propagation with a log-sigmoid output function. The number of hidden neurons is calculated as the square root of the encoding string's length, i.e., sqrt(9) = 3 in this example, as shown in Fig. 7.
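Equation (2) itself does not survive in this extracted text; only its parameters are described. Purely as a labeled assumption, one radial-basis form consistent with those definitions (a value that peaks at the GA-found cannon coordinates, with spread controlled by r) would be:

```latex
% Hypothetical reconstruction; the paper's actual equation (2) is not recoverable here.
f(\mathrm{Point}\,A) = \sum_{K=1}^{n} \exp\!\left( -\frac{(A_1 - K_1)^2 + (A_2 - K_2)^2}{r^2} \right)
```

Under this reading, each of the n = 15 GA-found cannon locations contributes a Gaussian bump, and a larger r spreads each bump's influence over a wider neighborhood.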

Fig. 7. ANN Structure
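A minimal sketch of the described 9-3-1 network (eight landscape bits plus the distance d as inputs, sqrt(9) = 3 hidden neurons, a log-sigmoid output) trained by back-propagation is given below. The learning rate, the choice of log-sigmoid for the hidden layer, the squared-error loss, and the toy training pair are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))

# 9 inputs -> 3 hidden -> 1 output, matching the sqrt(9) = 3 hidden-unit rule.
W1, b1 = rng.normal(0, 0.5, (3, 9)), np.zeros(3)
W2, b2 = rng.normal(0, 0.5, (1, 3)), np.zeros(1)

def forward(x):
    h = logsig(W1 @ x + b1)      # hidden activation (log-sigmoid assumed here too)
    y = logsig(W2 @ h + b2)      # log-sigmoid output, as stated in the paper
    return h, y

def train_step(x, target, lr=0.1):
    """One back-propagation step on squared error (illustrative)."""
    global W1, b1, W2, b2
    h, y = forward(x)
    dy = (y - target) * y * (1 - y)        # output delta
    dh = (W2.T @ dy) * h * (1 - h)         # hidden delta
    W2 -= lr * np.outer(dy, h); b2 -= lr * dy
    W1 -= lr * np.outer(dh, x); b1 -= lr * dh

# Toy example: the Fig. 6 encoding of Point A, with an assumed d and target.
x = np.array([-1, 1, -1, -1, -1, -1, 1, -1, 0.4])   # [8 landscape bits, d]
for _ in range(1000):
    train_step(x, target=0.9)
print(forward(x)[1])   # approaches 0.9, the assumed f(Point A) label
```

In practice the training set would be the per-point encodings of the four GA-optimized cases, with targets given by the best-location function f of equation (2).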

The training time of the ANN is 125.95 seconds. Fig. 8 shows the results obtained using GA and using ANN. The cannon distributions are similar, but the ANN gives a great improvement in run time, as shown in Table 3.


Fig. 8. Cannon distribution (Comparison between using GA and using ANN)

Table 3. Run time of ANN (Population size: 50, Generations: 150)

No. of Canyons    Run time of ANN (seconds)
1                 2.81
2                 2.89
3                 3.18
4                 3.43

4.3 Combining All Results into a Single Battlefield

After ANN training, our machine learning component becomes very useful. Fig. 9(a) shows a battlefield that is commonly found in RTS games (e.g., one with a number of canyons in the landscape). The GA and ANN results are similar, but the run time for distributing cannons in this battlefield is only 21.266 seconds (in the Pentium IV environment). The component also proves very useful in a more "real world like" battlefield, as shown in Fig. 9(b) and Fig. 10.

Fig. 9. (a) Comparison between using GA and using ANN. (b) Result of using ANN.

Fig. 10. Cannon distribution on a random battlefield using ANN

5 Conclusion and Future Work

A neural-evolutionary model for case-based planning in real time strategy games has been developed and presented in this paper. We believe that this research direction can provide efficient, fair, and natural AI development for RTS games. Base defense is only one part of our current and future evaluations. In future research, we will extend the idea and combine it with other key components of RTS games, such as resource management and battle strategies.

Acknowledgement

This project is supported by the HK Polytechnic University Grant A-PA6N.

References

1. Lent, M.V., Laird, J.: Developing an Artificial Intelligence Engine. In: GDC Proceedings (1999)
2. Hsieh, J.L., Sun, C.T.: Building a player strategy model by analyzing replays of real-time strategy games. In: Neural Networks, IJCNN 2008 (2008)
3. Zanetti, S., Rhalibi, A.E.: Machine learning techniques for FPS in Q3. In: Proceedings of the 2004 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, Singapore. ACM, New York (2004)
4. Cole, N., Louis, S.J., Miles, C.: Using a genetic algorithm to tune first-person shooter bots. In: Congress on Evolutionary Computation (2004)
5. Salge, C., Lipski, C., Mahlmann, T., Mathiak, B.: Using genetically optimized artificial intelligence to improve gameplaying fun for strategical games. In: Proceedings of the 2008 ACM SIGGRAPH Symposium on Video Games, Los Angeles, California. ACM, New York (2008)
6. Louis, S.J., Miles, C.: Playing to learn: case-injected genetic algorithms for learning to play computer games. IEEE Transactions on Evolutionary Computation 9(6), 669–681 (2005)
7. Holland, J.H.: Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. MIT Press, Cambridge (1975)
8. Bourg, D.M., Seemann, G.: AI for Game Developers. O'Reilly Media, Inc., Sebastopol (2004)
9. Forbus, K.D., Mahoney, J.V., Dill, K.: How qualitative spatial reasoning can improve strategy game AIs. IEEE Intelligent Systems 17(4), 25–30 (2002)
10. Sun, C.T., Liao, Y.H., Lu, J.Y., Zheng, F.M.: Genetic algorithm learning in game playing with multiple coaches. In: IEEE World Congress on Computational Intelligence (1994)
11. Yau, Y.J., Teo, J.: An Empirical Comparison of Non-adaptive, Adaptive and Self-Adaptive Co-evolution for Evolving Artificial Neural Network Game Players. In: 2006 IEEE Conference on Cybernetics and Intelligent Systems (2006)