TOOL WEAR MONITORING USING ARTIFICIAL NEURAL NETWORKS

Tool Wear Monitoring by Means of Artificial Neural Networks

Krzysztof Jemielniak, Warsaw Univ. of Technology, Warsaw, Poland

Summary

The paper presents the application of a multi-layer perceptron artificial neural network to tool wear monitoring in turning. To simulate factory floor conditions, six sets of cutting parameters were selected and applied in sequence. Six configurations of input parameters were tested to reveal their usability. Subsequently, the network's structure was optimised by means of an original pruning method, which makes automatic network configuration possible. The obtained results prove the effectiveness of the studied BP neural networks for the purposes of tool wear monitoring.

1. INTRODUCTION

One of the necessary preconditions for increased manufacturing automation is the implementation of tool condition monitoring systems. The approach most frequently used in such systems is the measurement of cutting forces or their derived quantities (including motor power, torque, etc.) /1, 2, 3, 9, 13/. Many such systems are available commercially /11/, but it is widely accepted that their reliability is inadequate /1, 3, 5, 13, 16/. To most specialists it is becoming increasingly obvious that the use of a single sensor signal may offer incomplete information as to the condition of a cutting tool /1, 5, 6, 8, 13/. Several different signal features describing one or many physical variables can be integrated by means of artificial neural networks /5, 6, 7/. A number of researchers have reported the application of neural network systems in tool condition monitoring, classification and prediction of tool wear and tool life. A survey by Dimla et al /5/ indicated that, among the many neural network algorithms, the multi-layer perceptron (MLP) of the feed-forward back-propagation (FF-BP) type is the most common. Dornfeld /7/ adopted an MLP network to integrate information such as AE and force in order to monitor flank wear during turning. Dimla et al /4/ used two different MLP networks for classification of tool state with respect to only two classes (worn or sharp).

Later /6/ they acknowledged that it was necessary to establish the neural network's capability to distinguish sharp, part-worn, worn and fractured tools. Choudhury et al /2/ used an optoelectronic sensor to measure the tool wear without interrupting the process. The BP model was applied for prediction of the tool flank wear. Ghasempoor et al /9/ used a combination of three different neural network inspectors to predict the wear on the flank, crater and nose of the tool, while cutting force components were used as diagnostic signals. They used a combination of static and dynamic neural networks with off-line and on-line training. Liu and Altintas /15/ presented a three-layer feed-forward neural network, using force ratio, cutting speed and feed as input variables, for predicting flank wear on a tungsten carbide cutting tool when cutting P20 mold steel. Dutta et al /8/ studied a modified BP neural network learning algorithm and its parameters, applied in tool condition monitoring to achieve faster convergence. Lee and Lee /14/ developed a network model to predict flank wear from the force ratio and its increment obtained from a tool dynamometer. They investigated the influence of training conditions (i.e. initial network weights, number of hidden neurons, mean squared error) on the network performance.

In research papers on tool condition monitoring it is a generally accepted practice to apply constant cutting conditions throughout the life of a single cutting edge /5, 15/. However, in most cases industrial practice involves a number of different feeds and depths of cut for the same tool in successive cuts. Consequently, research results tend to be rather academic and difficult to apply on the factory floor. The aim of this research project /12/ was to explore the relationship between tool wear and the cutting forces in changing conditions. A neural network approach was employed in order to cope with the stochastic characteristics of the cutting forces as well as the varied cutting conditions.

2. EXPERIMENT METHODOLOGY

The experiments were carried out on a conventional TUD-50 lathe. A CSRPR 2525 tool holder was used, fitted with an SNUN 120408 carbide insert coated with TiN-Al2O3-TiCN. A characteristic feature of this grade is a soft, cobalt-enriched substrate layer under the coating, which ensures increased resistance to interrupted cutting. On the other hand, this soft layer causes the tool's life to end abruptly once the coating has worn through. Therefore, the tool life criterion used here should not exceed VBB = 0.3 mm. Nevertheless, the tests were continued until the occurrence of a tool failure.

In order to simulate factory floor conditions, six sets of cutting parameters were selected and applied in sequence to imitate the machining of a single workpiece (i.e. a single six-cut technological cycle). Each cut lasted for 30 seconds. The sequencing scheme is shown in Fig. 1. Cutting speeds were selected in such a way as to correspond to approximately the same tool life in each cut.

Fig. 1. The cutting parameter sequence as applied in the experiment

Cutting forces were measured at selected points by means of a Kistler 9263 three-component laboratory dynamometer. Flank wear (VBB) was measured with a toolmaker's microscope after each cycle, i.e. every three minutes. The obtained values were subsequently interpolated to estimate the tool wear after each cut, assuming linear tool wear growth between measurements (see Fig. 2). The tests were continued until the occurrence of a catastrophic tool failure (CTF). The tools wore down gradually throughout the test, ending with a CTF at some point in the last cycle. The surface finish was invariably good until the CTF.
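The linear interpolation of VBB between two microscope measurements can be illustrated with a minimal sketch. The end value of 0.280 mm at 24 min is taken from Table 1 (test W5); the start value of 0.230 mm at 21 min is an assumption back-calculated from the tabulated values, and the function name is hypothetical.

```python
def interpolate_wear(t_start, vb_start, t_end, vb_end, times):
    """Linearly interpolate flank wear VBB (mm) between two measurements."""
    slope = (vb_end - vb_start) / (t_end - t_start)   # mm per minute
    return [vb_start + slope * (t - t_start) for t in times]

# End of each 30-second cut in the eighth cycle (minutes of cutting time).
cut_end_times = [21.5, 22.0, 22.5, 23.0, 23.5, 24.0]

# 0.230 mm at 21 min is an assumed measurement; 0.280 mm at 24 min is from Table 1.
wear = interpolate_wear(21.0, 0.230, 24.0, 0.280, cut_end_times)
rounded = [round(v, 3) for v in wear]
print(rounded)
```

Under this assumption the interpolated values agree with the VBB column of test W5 in Table 1 at the corresponding times.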

3. EXPERIMENT RESULTS

Two such tests – referred to as W5 and W7 – were performed. To illustrate the data format used in the experiment, Table 1 includes the results obtained in the eighth cycle. Since six sets of cutting parameters were applied in each cycle, a single test is equivalent to six conventional tool wear tests. This is evident from the variations in tool wear and the cutting forces over time obtained in the tests, as presented in Fig. 2 and 3.

Table 1. Measurements of tool wear and the cutting forces in the eighth cycle in both tests

t      vc     f       ap   |  Test W5                    |  Test W7
                           |  Fc    Ff    Fp    VBB      |  Fc    Ff    Fp    VBB
min    m/min  mm/rev  mm   |  N     N     N     mm       |  N     N     N     mm
21.50  351    0.24    1.5  |  829   505   380   0.238    |  876   493   350   0.255
22.00  417    0.17    1.5  |  646   508   376   0.247    |  677   489   348   0.260
22.50  251    0.47    1.5  |  1462  547   430   0.255    |  1440  503   420   0.265
22.67  251    0.47    1.5  |  1454  553   446   0.258    |  1436  513   424   0.267
23.00  251    0.47    3.0  |  2665  920   590   0.263    |  2713  891   563   0.270
23.25  300    0.33    1.5  |  1088  523   426   0.268    |  1114  515   408   0.273
23.50  300    0.33    1.5  |  1125  542   426   0.272    |  1100  531   423   0.275
24.00  300    0.33    3.0  |  2032  868   542   0.280    |  2066  870   548   0.280

Test W5 (Fig. 2) involved ten cycles with an overall cutting time of 30 minutes. While machining the last workpiece (i.e. during the last cycle), tool wear rose markedly to reach approximately 0.5 mm. Burrs appeared on the surface, indicating an impending catastrophic tool failure, and the test was therefore terminated.

Fig. 2. Results of test W5 (numbers refer to the cutting parameters as in Fig. 1)

Fig. 3. Results of test W7 (numbers refer to the cutting parameters as in Fig. 1)

During the last cut of the ninth cycle in test W7 (Fig. 3) a catastrophic tool failure actually occurred. It is not discernible from the course of tool wear, which was about 0.35 mm at this point, but it had a great impact on the feed and passive cutting forces alike. A marked increase of the forces, pointing to an impending CTF, was already visible in the fifth cut of the last cycle. The results of the last two cuts could be disregarded as not corresponding to normal tool wear. Nonetheless, we decided to include them and thus find out how the tool condition monitoring strategy would react to a challenge of this sort.

As could be expected based on general knowledge and previous research conducted at Warsaw University of Technology /10/, the feed (Ff) and passive (Fp) forces turned out to be strongly related to tool wear, whereas the main cutting force (Fc) remained almost unchanged throughout the test, apparently being affected only by the feed and the depth of cut. One interesting phenomenon, atypical for such cutting conditions, is the weak relationship between the feed force (Ff) and the feed itself. Here the Ff force is affected only by the depth of cut (and by tool wear). This provides an interesting opportunity to estimate tool wear in the absence of information regarding the depth of cut. The passive force depends on both the feed and the depth of cut.
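The weak dependence of Ff on the feed can be checked against Table 1. The sketch below uses the six test W5 rows of the eighth cycle with ap = 1.5 mm; the `spread` helper is illustrative only, and the small growth of VBB across these rows is ignored.

```python
# Test W5 rows of Table 1 with ap = 1.5 mm: (f in mm/rev, Fc in N, Ff in N)
rows = [(0.24, 829, 505), (0.17, 646, 508), (0.47, 1462, 547),
        (0.47, 1454, 553), (0.33, 1088, 523), (0.33, 1125, 542)]

def spread(values):
    """Ratio of the largest to the smallest value in a sequence."""
    return max(values) / min(values)

fc_spread = spread([fc for _, fc, _ in rows])   # main cutting force
ff_spread = spread([ff for _, _, ff in rows])   # feed force
print(f"Fc varies by a factor of {fc_spread:.2f}, Ff by {ff_spread:.2f}")
```

While the feed changes from 0.17 to 0.47 mm/rev, Fc more than doubles, whereas Ff changes by roughly a tenth.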

4. APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN TOOL WEAR MONITORING

4.1. Multi-layer perceptron neural network

Fig. 4 shows a three-layer MLP comprising an input (upper) layer containing n neurons, a hidden (intermediate) layer containing m neurons, and an output layer with a single neuron. The neurons in the input layer act only as buffers relaying the scaled input signals $x_i$ to the neurons in the intermediate layer. The output of the i-th neuron of the first layer can thus be formulated as:

$$ o_{1,i} = x_i\, w_{1,i} \tag{1} $$

Fig 4. The three-layer one-output neural network as used in this research

These outputs, weighted by the strengths of the respective connections $w_{2,j,i}$, are summed as inputs to the hidden layer neurons (Fig. 4):

$$ s_{2,j} = \sum_{i=0}^{n} w_{2,j,i}\, o_{1,i} \tag{2} $$

where $o_{1,0} \equiv 1$ and $w_{2,j,0}$ are the threshold values (bias).

Then, the outputs of those neurons ($o_{2,j}$) are computed according to an activation function (a sigmoid one in this example):

$$ o_{2,j}(s_{2,j}) = \frac{1}{1 + e^{-s_{2,j}}} \tag{3} $$
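Equations (1) to (3) can be sketched as a forward pass in plain Python. This is only an illustration: the function names and the list layout of the weights are assumptions, and the output neuron is assumed to use the same sigmoid with its own bias stored in `w3[0]`.

```python
import math

def sigmoid(s):
    """Sigmoid activation, Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w1, w2, w3):
    """Forward pass of a three-layer, single-output MLP.

    x  : n scaled inputs
    w1 : n input-layer weights, Eq. (1)
    w2 : m rows of n+1 hidden weights; column 0 holds the bias w2[j][0]
    w3 : m+1 output weights; w3[0] holds the bias (assumed layout)
    """
    # o1[0] = 1 is the constant signal that feeds the bias weights, Eq. (2).
    o1 = [1.0] + [xi * wi for xi, wi in zip(x, w1)]
    o2 = [1.0] + [sigmoid(sum(w * o for w, o in zip(row, o1))) for row in w2]
    o3 = sigmoid(sum(w * o for w, o in zip(w3, o2)))
    return o1, o2, o3

# With all trainable weights at zero every sigmoid returns 0.5.
o1, o2, o3 = forward([0.5, 0.2], [1.0, 1.0], [[0.0, 0.0, 0.0]], [0.0, 0.0])
print(o2, o3)
```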

The output of the last layer neuron, corresponding to tool wear, is computed in the same way. Network training is based on a back propagation algorithm using Rumelhart's generalised delta rule and cumulative weight adjustment following the presentation of all training vectors, i.e. after each iteration. A number of training vectors (i.e. patterns) is required, each consisting of the input values $x_1 \ldots x_n$ and the desired network output $z$. Thus, the training set takes the form of the following table:

$$
\begin{array}{ccccc}
x_{11}, & x_{21}, & \ldots & x_{n1}, & z_1 \\
\vdots & \vdots & & \vdots & \vdots \\
x_{1k}, & x_{2k}, & \ldots & x_{nk}, & z_k \\
\vdots & \vdots & & \vdots & \vdots \\
x_{1N}, & x_{2N}, & \ldots & x_{nN}, & z_N
\end{array}
\tag{4}
$$

Input data are presented to the network consecutively, and the desired output values ($z_k$) are compared to the actual ones ($o_{3k}$). The algorithm minimises the network error:

$$ E = \sum_{k=1}^{N} \left( z_k - o_{3k} \right)^2 \tag{5} $$

Following the presentation of a complete set of training parameters (i.e. an iteration), new weight values of the output neuron are calculated for the next iteration (l+1):

$$ (w_{3,j})_{l+1} = (w_{3,j})_{l} + \frac{\eta_1}{N} \sum_{k=1}^{N} \left( \delta_{3k}\, o_{2,j,k} \right)_k + \eta_2\, \Delta (w_{3,j})_{l-1} \tag{6} $$

where

$$ \delta_{3k} = o_{3k} \left( 1 - o_{3k} \right) \left( z_k - o_{3k} \right) \tag{7} $$

N, k – number of patterns in a training set and current pattern index
$z_k$ – target (desired) network output for the k-th pattern
$o_{3k}$ – actual output of the network for the k-th pattern
l – number of completed iterations
$\Delta (w_{3,j})_{l-1}$ – change of the weight $w_{3,j}$ in the previous iteration
$\eta_1, \eta_2$ – learning coefficient and momentum

Next, the new weight values in the hidden layer are calculated:

$$ (w_{2,j,i})_{l+1} = (w_{2,j,i})_{l} + \frac{\eta_1}{N} \sum_{k=1}^{N} \left( \delta_{2,j}\, o_{1,i} \right)_k + \eta_2\, \Delta (w_{2,j,i})_{l-1} \tag{8} $$

where

$$ \delta_{2,j} = o_{2,j} \left( 1 - o_{2,j} \right) \delta_{3}\, w_{3,j} $$

The weights of the first (input) layer are not updated. The artificial neural network system as applied in this paper was developed at the Warsaw University of Technology.
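The cumulative weight adjustment with momentum described in this section can be sketched as follows. This is an illustration under stated assumptions, not the program developed at the Warsaw University of Technology: the input-layer weights are assumed to be fixed at 1, the initial weight range, the function name `train` and the toy patterns are hypothetical, and the learning coefficient is divided by the number of patterns N.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train(patterns, m, eta1=0.9, eta2=0.6, iterations=2000, seed=0):
    """Batch back propagation with momentum for an n-m-1 sigmoid MLP.

    patterns : list of (x, z); x is a list of n scaled inputs, z lies in (0, 1)
    Returns (w2, w3, history); history holds the error E of Eq. (5) per iteration.
    """
    rng = random.Random(seed)
    n, N = len(patterns[0][0]), len(patterns)
    # Column/element 0 of each weight list is the bias (assumed layout).
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n + 1)] for _ in range(m)]
    w3 = [rng.uniform(-0.5, 0.5) for _ in range(m + 1)]
    dw2 = [[0.0] * (n + 1) for _ in range(m)]   # weight changes of the previous iteration
    dw3 = [0.0] * (m + 1)
    history = []
    for _ in range(iterations):
        g2 = [[0.0] * (n + 1) for _ in range(m)]   # sums over all patterns
        g3 = [0.0] * (m + 1)
        error = 0.0
        for x, z in patterns:
            o1 = [1.0] + list(x)                   # input weights assumed equal to 1
            o2 = [1.0] + [sigmoid(sum(w * o for w, o in zip(row, o1))) for row in w2]
            o3 = sigmoid(sum(w * o for w, o in zip(w3, o2)))
            error += (z - o3) ** 2                 # Eq. (5)
            d3 = o3 * (1.0 - o3) * (z - o3)        # Eq. (7)
            for j in range(m + 1):
                g3[j] += d3 * o2[j]
            for j in range(m):
                d2 = o2[j + 1] * (1.0 - o2[j + 1]) * d3 * w3[j + 1]
                for i in range(n + 1):
                    g2[j][i] += d2 * o1[i]
        history.append(error)
        for j in range(m + 1):                     # Eq. (6)
            dw3[j] = eta1 / N * g3[j] + eta2 * dw3[j]
            w3[j] += dw3[j]
        for j in range(m):                         # Eq. (8)
            for i in range(n + 1):
                dw2[j][i] = eta1 / N * g2[j][i] + eta2 * dw2[j][i]
                w2[j][i] += dw2[j][i]
    return w2, w3, history

toy = [([0.0], 0.2), ([1.0], 0.8)]   # hypothetical scaled training patterns
w2, w3, history = train(toy, m=2)
print(history[0], history[-1])
```

The weights change only once per iteration, after all patterns have been presented, which is what distinguishes this cumulative scheme from pattern-by-pattern updating.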

4.2. Efficacy of neural network inputs

A trained neural network forms a specific process model. In the case of tool wear monitoring this model is functionally similar to a statistical correlation. However, a neural network requires no specific formula for the correlation function. The network itself extracts the required information contained in the input signals and correlates it with the output signal. The result of neural network learning tells us whether there is a sufficient correlation between the output (the tool wear in this case) and the input or, to put it another way, whether the input signal includes enough information to estimate the wear of a tool. The result of neural network testing, on the other hand, tells us how general the model generated by the network in its learning process is; in other words, whether it can be extended to data other than those used in the learning process. Testing data should therefore be derived from a separate experiment. The practice of dividing the results of a single test into training and testing sets (as is occasionally the case in the literature on the subject) is unacceptable.

For the practical application of tool condition monitoring it is necessary to install a suitable sensor (a cutting force sensor in this case). The particular kind of sensor matters. The laboratory sensor used in this research cannot be applied in factory floor conditions because it lacks overload protection, but several industrial cutting force sensors are commercially available, whose prices depend largely on the number of measurable force components. In exploring the possible application of neural networks and cutting force measurements in tool condition monitoring it is therefore worthwhile to scale down the full range of data available in laboratory conditions, and to consider limiting the available input in order to explore the possibility of using less expensive sensors and eliminating the influence of random (i.e. unknown) changes in the depth of cut. This research focused on finding limited input signal configurations that could be used for practical solutions.

The network input configurations selected for research purposes are presented in Table 2. In each network the intermediate layer included ten neurons, and all networks were trained with a learning coefficient of 0.9 and a momentum of 0.6 over 200,000 iterations. As the network was being trained using the W5 test results, each iteration was followed by a network test using the W7 test results. The results of these tests had no influence on the training and only served to evaluate the network's generalising capability throughout the experiment.

Fig. 5 presents the variations of the root mean square error and the maximum error in learning and testing. As can be seen, the average learning error el (controlling the training process) as well as the maximum learning error ml follow a similar pattern of change for all the networks, being the highest in network 2. The courses of the testing errors, however, are markedly different. It has to be borne in mind that those errors are more important, since they determine a network's capability to estimate the wear of a tool other than that used for network training. Particular attention should be drawn here to the decidedly worst results obtained by network 2, which were based only on the measurements of cutting forces (to the exclusion of any information regarding the cutting parameters). Networks 1 and 5 were much slower in gaining the ability to generalise. Nonetheless, after 20,000 iterations the results obtained by all networks (with the exception of network 2) were similar.

In order to facilitate the evaluation of the numerical values of the errors, they have been included in Table 3. The lowest values in each group have been highlighted. The best learning results were recorded for network 5, and the results of network 1 were nearly as good. In terms of testing, the best results were obtained by network 3, and only slightly inferior ones were recorded for network 5. It is worth bearing in mind, however, that while the potential of network 3 reached its limits (after ca. 100,000 iterations the errors would decrease no longer), the testing errors of networks 1 and 5 kept going down with further training. For instance, after another 100,000 iterations the testing errors of network 1 were et = 0.0366 and mt = 0.1629. Disregarding network 2, the average testing error did not exceed 0.04 mm, which is equivalent to some 13% of the tool life criterion (VBB = 0.3 mm); not a bad result at all. The maximum errors were higher, reaching up to 0.2 mm, which can be worrying. Therefore, the problem calls for a more careful analysis, and will be addressed later on.

In the meantime, it is worth pointing out that a number of papers on the application of neural networks in tool condition monitoring do not go beyond reporting the obtained learning and testing errors. This does not seem to be the correct approach. What matters here is not so much the bare values of the average or maximum errors as the context in which these occur. In fact, the important thing is to estimate the exact moment at which a tool loses its cutting ability, that is to say, to accurately determine the end of a tool's life.
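The average and maximum errors reported for Fig. 5 and Table 3 can be computed along the following lines. This is a sketch that assumes the average error is a root mean square over all patterns; the function name and the sample wear values are hypothetical.

```python
import math

def error_stats(estimated, actual):
    """Return (RMS error, maximum absolute error) in the units of the inputs."""
    diffs = [e - a for e, a in zip(estimated, actual)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return rms, max(abs(d) for d in diffs)

# Hypothetical estimated and measured VBB values in mm.
rms, worst = error_stats([0.10, 0.22, 0.31], [0.12, 0.20, 0.30])

# An average testing error of 0.04 mm relative to the tool life criterion of 0.3 mm.
relative = 0.04 / 0.3
print(f"rms={rms:.4f} mm, max={worst:.3f} mm, relative={relative:.0%}")
```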

Table 2. Configurations of neural network inputs

Net  f   ap  Ff  Fp  Fc  Comments
1    X   X   X   X   X   Full network with all inputs. This can be assumed to be an excessive demand. A three-component cutting force sensor is required for this application.
2            X   X   X   An attempt at examining whether the information contained in the cutting force signals is sufficient to take into account the influence of cutting parameters. A three-component cutting force sensor is required for this application.
3    X       X       X   Ff depends almost exclusively on ap and VBB, whereas Fc depends almost exclusively on ap and f. This offers an opportunity to extract the cutting depth value from Fc and f, and thus to eliminate the influence of ap on Ff. This configuration is particularly suited for unknown changes of ap (rough machining).
4    X   X   X           When the depth of cut and the feed value are known they can be used as net inputs together with Ff, thus letting the network eliminate their influence on Ff and tool wear estimation.
5    X   X   X   X       As above, but the information included in Ff is backed up by that contained in Fp.
6    X   X   X       X   For a 2-component force sensor, when the changes of ap are known.
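The input configurations of Table 2 can be written down as a small lookup table, which makes the sensor requirement of each network explicit. The dictionary and helper names below are illustrative assumptions.

```python
FORCES = {"Ff", "Fp", "Fc"}

# Inputs of each network as listed in Table 2.
NETWORK_INPUTS = {
    1: ["f", "ap", "Ff", "Fp", "Fc"],
    2: ["Ff", "Fp", "Fc"],
    3: ["f", "Ff", "Fc"],
    4: ["f", "ap", "Ff"],
    5: ["f", "ap", "Ff", "Fp"],
    6: ["f", "ap", "Ff", "Fc"],
}

def force_components(net):
    """Number of cutting force components the sensor has to measure."""
    return sum(1 for name in NETWORK_INPUTS[net] if name in FORCES)

def needs_depth_of_cut(net):
    """True if the depth of cut ap must be known in advance."""
    return "ap" in NETWORK_INPUTS[net]

for net in sorted(NETWORK_INPUTS):
    print(net, force_components(net), needs_depth_of_cut(net))
```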

Fig. 5. Variations of errors in the tested networks in the course of training.

Table 3. Learning and testing errors of the studied networks

     input signals        learning errors     testing errors
net  f   ap  Ff  Fp  Fc   el       ml         et       mt
1    X   X   X   X   X    0.0113   0.0257*    0.0380   0.1751
2            X   X   X    0.0161   0.0553     0.0509   0.2740
3    X       X       X    0.0144   0.0329     0.0362*  0.1726*
4    X   X   X            0.0121   0.0278     0.0390   0.2043
5    X   X   X   X        0.0112*  0.0257*    0.0393   0.1873
6    X   X   X       X    0.0123   0.0260     0.0390   0.2020

* lowest value in the column

Fig. 6 presents the difference between the network outputs (VBB values as estimated by the networks) and the actual flank wear values in test W7 (which was used for testing purposes). As can be seen, up until approximately the 24th minute the errors of all the networks are confined within a band of ±0.05 mm. It was only when the tool nose was cut off (resulting in a pronounced increase in the Ff and Fp forces) that the errors in the estimation of tool wear rose sharply. However, the networks are hardly to blame here. Quite the contrary: it must be acknowledged that once the tool had lost its cutting ability, the networks' estimates reflected the tool condition better than the measured VBB flank wear value did.

Fig. 6. Tool wear estimation errors

To sum up, Fig. 7 presents the courses of actual and estimated tool wear in both tests. This is an auto-test in the case of test W5, as it was this test that was used for network training.

Fig. 7. Actual and estimated tool wear in both tests (excluding the results of Network 2)

Let us return now to the analysis of the results presented in Table 3. Network 1, which was fed with all available inputs, achieved good results and would be fit for application but for the aforementioned fact that it is excessive: it requires the most expensive sensor and learns slowly. Network 2 has to be rejected as it gives the worst results and offers nothing in return, since feed value information is always available anyway. The next two networks, 3 and 4, both reflect the same idea: tool wear information is implicit in the variations of the feed force. In order to extract it, it is necessary to eliminate the influence of the feed and of the depth of cut, that is to say, they have to be taken into account. In both networks the feed value is provided directly. In network 3 the information on the depth of cut can be derived from the value of the cutting force, which makes it possible to apply the network where the depth of cut changes randomly. In network 4, ap is provided directly, which makes it possible to use a single-component sensor. As far as testing errors are concerned, network 3 was the best performer, but it was the worst (excepting network 2) in terms of learning errors. Nonetheless, as the latter were significantly lower than the former, network 3 was more balanced than the others, which is a point in its favour. In network 5, more information is provided at the input, which produces the best learning results. After all, the Fp component also contains information on tool wear. Network 6 is an extension of network 3 with the added input of ap, or of network 4 with the added input of Fc. The information on the depth of cut is doubled here. However, the results obtained by this network are not clearly superior to those obtained by networks 3 and 4.

In conclusion, it must be remarked here that a variety of neural networks are feasible for application. All that is needed is merely the information to be provided to and processed by the networks.

4.3. Optimisation of neural network structure

The neural networks studied in the previous subsection had different inputs, but each had the same number of neurons in the hidden layer. This number (10 neurons) was assumed arbitrarily. The program used in this project allows for an automatic selection of the number of neurons. It starts with the selected default number of neurons, and then the network is trained according to given learning criteria. During this training, whenever the average learning error (el) drops below the defined threshold, the program executes a so-called neuron pruning operation. The program analyses the variations of neuron outputs (maximum minus minimum output value) in the hidden layer throughout the presentation of the entire testing data set. Low output variation means that the output value of a given neuron bears little relationship to the input data, and so the neuron is redundant. Thus, the program is able to identify the neuron with the lowest degree of variability and (if the variation is below the level of an operator-defined pruning parameter) to remove it. The average output value of the removed neuron is added to the bias of the output neuron (w3,0, see Fig. 4). The training is then resumed.

In the presented experiment a pruning parameter of 0.25 was defined, and each subsequent pruning took place after at least 1000 further iterations, when the average learning error el had dropped below 0.02. Fig. 8 presents the average errors during the automatic structure selection for network 3. The first neuron was removed after 33,300 iterations (as soon as the learning error was small enough), and the pruning process was complete less than 10,000 iterations later, with only two neurons remaining in the hidden layer.
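A single pruning step of the kind described above might look as follows. This is a sketch, not the author's program: the function name and data layout are hypothetical, and the removed neuron's average output is assumed to be folded into the output bias after weighting by its output connection, so that the mean network input to the output neuron is preserved.

```python
def prune_once(hidden_outputs, w3, pruning_parameter=0.25):
    """Try to remove the hidden neuron with the lowest output variation.

    hidden_outputs : one list per presented pattern with the m hidden-neuron outputs
    w3             : output weights; w3[0] is the bias, w3[j + 1] belongs to neuron j
    Returns (index of the removed neuron or None, new output weights).
    """
    m = len(w3) - 1
    columns = [[row[j] for row in hidden_outputs] for j in range(m)]
    variation = [max(c) - min(c) for c in columns]      # max minus min output
    j = min(range(m), key=variation.__getitem__)
    if variation[j] >= pruning_parameter:
        return None, list(w3)                           # nothing is redundant enough
    mean_out = sum(columns[j]) / len(columns[j])
    new_w3 = list(w3)
    new_w3[0] += new_w3[j + 1] * mean_out               # assumed: weighted average goes to the bias
    del new_w3[j + 1]
    return j, new_w3

# Neuron 0 varies strongly with the input, neuron 1 is nearly constant.
outputs = [[0.1, 0.50], [0.9, 0.52], [0.5, 0.51]]
removed, new_w3 = prune_once(outputs, [0.2, 1.0, 2.0])
print(removed, new_w3)
```

A caller would also have to delete the hidden-layer weight row of the removed neuron before resuming training.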

Fig. 8. Automatic configuration of network 3.

The structures of networks 4, 5, and 6 were optimised in the same way, and the letter A (standing for “automatically configured”) was added to their designations. Their results, presented in Table 4, were not much inferior to those obtained by full networks. However, only networks 3 and 4 were pruned down significantly. As in the case of the ten-neuron networks discussed above, network 3A turned out to be more balanced than the others and had the lowest margin of testing errors.

Table 4. Results of the automatic selection of the number of neurons in the hidden layer

     input signals              learning errors     testing errors
net  f   ap  Ff  Fp  Fc   n2    el       ml         et       mt
3A   X       X       X    2     0.0160   0.0379     0.0367   0.1677
4A   X   X   X            3     0.0124   0.0289     0.0393   0.2043
5A   X   X   X   X        8     0.0113   0.0268     0.0398   0.1897
6A   X   X   X       X    6     0.0122   0.0260     0.0383   0.1946

5. CONCLUSIONS

An analysis of the presented results confirms that the multi-layer perceptron neural networks studied in this paper are very suitable for automatic tool wear estimation in changing cutting conditions. They are robust mathematical processing devices capable of nonlinear approximation of a multi-dimensional correlation between tool wear, the cutting conditions and the values describing the cutting process (in this case, the cutting forces) without the necessity to assume a particular model for this correlation. Tool wear estimation errors in this method are of an acceptable extent and they result not so much from any imperfections of the networks themselves as from the scattered nature of the cutting force values caused by heterogeneous work material and the natural stochastic characteristics of the tool wear process. The neuron pruning algorithm presented here has proved its efficacy in automatically selecting the minimum number of neurons in the hidden layer.

REFERENCES

1. Byrne G., Dornfeld D., Inasaki I., Ketteler G., König W., Teti R., Tool condition monitoring – the status of research and industrial application, CIRP Annals 44, 1995.
2. Choudhury S.K., Jain V.K., Rama Rao Ch.V.V., On-line monitoring of tool wear in turning using a neural network, Int. J. Mach. Tools & Manufacturing, 1999.
3. Dimla D.E., Sensor signals for tool-wear monitoring in metal cutting operations – a review of methods, Int. J. Mach. Tools & Manufacturing, 2000.
4. Dimla D.E., Lister P.M., Leighton N.J., Automatic tool state identification in a metal turning operation using MLP neural networks and multivariate process parameters, Int. J. Mach. Tools & Manufacturing, 1998.
5. Dimla D.E., Lister P.M., Leighton N.J., Neural network solutions to the tool condition monitoring problem in metal cutting – a critical review of methods, Int. J. Mach. Tools & Manufacturing, 1997.
6. Dimla D.E., Lister P.M., On-line metal cutting tool condition monitoring. II: tool-state classification using multi-layer perceptron neural network, Int. J. Mach. Tools & Manufacturing, 2000.
7. Dornfeld D.A., Neural network sensor fusion for tool condition monitoring, CIRP Annals, 1990.
8. Dutta R.K., Paul S., Chattopadhyay A.B., Applicability of the modified back-propagation algorithm in tool condition monitoring for faster convergence, J. of Materials Processing Technology, 2000.
9. Ghasempoor A., Jeswiet J., Moore T.N., Real time implementation of on-line tool condition monitoring in turning, Int. J. Mach. Tools & Manufacturing, 1999.
10. Jemielniak K., Kwiatkowski L., Wrzosek P., Diagnosis of Tool Wear Based on Cutting Forces and Acoustic Emission Measurements as Inputs to a Neural Network, J. of Intelligent Manufacturing, 1998.
11. Jemielniak K., Commercial Tool Condition Monitoring Systems, Int. J. of Advanced Manufacturing Technology, 1999.
12. Jemielniak K. et al., Utilisation of acoustic emission and cutting forces signals for tool condition monitoring in turning, Project KBN 7T07TD03410, Warsaw University of Technology, 1998.
13. Ketteler G., Analysis of Requirements for Monitoring Systems, Proc. of the Second Int. Workshop on Intelligent Manufacturing Systems, Leuven, Belgium, 1999.
14. Lee J.H., Lee S.J., One-step-ahead prediction of flank wear using cutting force, Int. J. Mach. Tools & Manufacturing, 1999.
15. Liu Q., Altintas Y., On-line monitoring of flank wear in turning with multilayered feedforward neural network, Int. J. Mach. Tools & Manufacturing, 1999.
16. O'Donnel G., Young P., Kelly K., Byrne G., Towards the improvement of tool condition monitoring systems in the manufacturing environment, J. of Materials Processing Technology, 2001.