Application of neural network and adaptive neuro-fuzzy inference

Indian J. Anim. Res., 49 (5) 2015 : 671-679 Print ISSN:0367-6722 / Online ISSN:0976-0555

AGRICULTURAL RESEARCH COMMUNICATION CENTRE

www.arccjournals.com/www.ijaronline.in

Application of neural network and adaptive neuro-fuzzy inference system to predict subclinical mastitis in dairy cattle
Nazira M. Mammadova1 and Ismail Keskin*
Department of Animal Science, Faculty of Agriculture, Selcuk University, 42075, Konya, Turkey.
Received: 03-12-2014 Accepted: 04-02-2015

DOI: 10.18805/ijar.5581

ABSTRACT
Mastitis is an important problem in dairy herds, and artificial intelligence (AI) is a possible means of detecting subclinical mastitis in Holstein cows milked with automatic milking systems. Mastitis alerts were generated via ANN and ANFIS models using lactation rank (current lactation number), milk yield, electrical conductivity, average milking duration and season as input data. The output variable was the somatic cell count obtained from milk samples collected monthly throughout the 15 months of the sampling period. Cattle were judged healthy or infected based on somatic cell counts. This study examined the ANN and ANFIS AI methodologies in detail, constructed and tested models for each, and chose optimal methods based on that examination. The two mastitis detection models were evaluated as to sensitivity, specificity and error rate. The ANN model yielded 80% sensitivity, 91% specificity and 64% error, and the ANFIS model 55%, 91% and 35%, respectively. These results suggest that the ANN model is a better predictor of subclinical mastitis than the ANFIS model, based on a Z-test (a hypothesis test for the difference between rates). AI models such as these are useful tools in the development of mastitis detection models. Prediction error rates can be decreased through the use of more informative parameters.

Key words: Adaptive neuro-fuzzy inference system, Artificial Intelligence, Artificial neural network, Dairy cattle, Neuro-Fuzzy Inference System, Subclinical mastitis.

INTRODUCTION
Mastitis is an inflammation of the mammary gland, typically caused by microorganisms, usually bacteria, that invade the udder, multiply and produce toxins harmful to the mammary gland; it causes substantial economic loss in the form of decreased milk yields (Osteras et al., 1999; Wilson et al., 2004; Schroeder, 2012). The udder reacts against mastitis to destroy the bacteria causing the infection, neutralize toxins and repair milk secretion tissues.
Mastitis types are classified according to the strength of the udder’s reaction. Subclinical mastitis is a form in which there is no swelling of the gland or observable abnormality of the milk, although there are changes in the milk that can be detected by special tests like the California Mastitis Test (CMT) and somatic cell count (Batra, 1986; Sargeant et al., 1998). Most mastitis events are subclinical, representing the hidden part of the iceberg. In the absence of udder health control programs, 50% of cows in a given herd will be infected with subclinical mastitis. On average, two udder quarters of infected cows are positive for subclinical mastitis (Tekeli, 2005). Owing to the increased permeability of mastitis-infected udder tissue, researchers have observed changes in the ionic composition of milk, marked by decreased levels of certain minerals and increased levels of Na and Cl, resulting in increased electrical conductivity (Nielen, 1992; Ilie et al., 2010). Some studies have reported that electrical conductivity exceeding 5.5 mS/cm could indicate subclinical mastitis (Ilie et al., 2010; Norberg et al., 2004; Bruckmaier et al., 2004; Janzekovic, 2009). Today, enterprises use computerized herd management systems that automatically record such traits as milk yield, milk flow rate and electrical conductivity, and such programs may alert users to the presence of mastitis in cows based on extreme deviations in electrical conductivity. However, these alarms are so frequently wrong that the practice of using electrical conductivity alone to predict mastitis seems unviable (Atasever and Erdem, 2008).

*Corresponding author’s e-mail: [email protected]. 1 Department of Animal Science, Faculty of Agriculture, Siirt University, 56100, Siirt, Turkey. Abbreviations: AI, artificial intelligence; ANN, artificial neural network; ANFIS, adaptive neuro-fuzzy inference system; SCC, somatic cell count; MD, mastitis detection; EC, electrical conductivity; S, season; LR, lactation rank; MY, daily milk yield; TP, true positive; TN, true negative; FN, false negative; FP, false positive; AMD, average duration of milking.


INDIAN JOURNAL OF ANIMAL RESEARCH

The SCC of milk samples has gained wide recognition as an aid in the detection of subclinical mastitis (Batra, 1986). Somatic cell count (SCC) depends on such factors as cow age, breed, lactation rank, milk yield, udder anatomical and physiological features, stress, season, nutrition, shelter conditions, milking technique and the presence of mastitis (Raubertas and Shook, 1982; Jones et al., 1984; Göncü and Özkütük, 2002; Seker et al., 2000; Cedden et al., 2002; Risvanli and Kalkan, 2002; Uzmay et al., 2003; Bademkıran et al., 2005; Eyduran et al., 2005; Kul et al., 2006; Porcionato et al., 2010). Some studies have reported that increased SCC is negatively related to milk yield (Raubertas and Shook, 1982; Jones et al., 1984; Eyduran et al., 2005). AI methods, used in a wide range of technological applications, have recently seen increased use in animal husbandry. By applying such artificial intelligence concepts as fuzzy logic (FL), artificial neural networks (ANNs) and adaptive neuro-fuzzy inference systems (ANFIS), researchers have sought to achieve early detection of mastitis (Yang et al., 2000; Cavero et al., 2006; Krieter et al., 2007). In Turkey, annual economic losses caused by mastitis are estimated at around $41.5 million, despite the fact that each $1 spent on effective mastitis control programs returns $5 to the producer (Tekeli, 2005). Knowledge of mastitis-causing factors and a willingness to take the necessary measures are considered the most important steps in overcoming the barriers to high-quality milk production and in protecting animal welfare (Atasever and Erdem, 2008). This study investigates the performance of two AI methods, ANN and ANFIS, for early detection of subclinical mastitis in situations where detection would be difficult using conventional methods, like the California Mastitis Test (CMT) and somatic cell count.
MATERIALS AND METHODS
In this study, data collected at the Karyem Private Dairy Farm from February 2010 to April 2011 were used to develop and validate FL, ANN and ANFIS models for classifying dairy cows into healthy and mastitis-infected groups. Animal subjects used in the study consisted of 170 Holstein Friesian cows raised in the Karyem Agricultural Enterprise, situated in the Karapinar district (37° 47’ North, 33° 35’ East and 994 m altitude) of the Konya province. During the sampling period, cows removed from milking for calving were replaced with newly joined cows. The Karyem Agricultural Enterprise uses a professional herd management system that immediately

records all events and uses this information to predict future problems to allow for the highest levels of profitability. The program allows animal-specific information to be entered by the user or to be automatically registered by the system. As animals are milked using the automatic milking system, the system automatically collects such data as milk yield, average milking duration (min/cow), and milk electrical conductivity. Cows were housed in a freestall barn and milked twice daily (03:00 to 06:00 and 15:00 to 18:00) in a 2 x 12 side-closed milking parlour (Afikim, AFIFARM). For this study, milk samples were collected from the automatic milking system in 50 cc tubes once a month (from February 2010 until April 2011) during the 03:00 to 06:00 milking with the help of a sampling apparatus (a bottle installed on the cow’s milking system, into which milk drips during milking). Milk samples were collected with all due care, given that the primary goal was to calculate milk SCC. The experiment was carried out according to Selçuk University Faculty of Agriculture Guidelines. In the laboratory, SCC was determined under a microscope from samples spread over slides and stained, using the lane counting method (Torlak, 2005). Counts were done twice in order to increase the reliability of the SCC figures, and the average of the two counts was taken. The udder health of the cow was determined via the somatic cell count. Two cases were distinguished: 1) a somatic cell count of less than 225 000 cells/ml is the ‘healthy’ case; 2) a somatic cell count of more than 225 000 cells/ml is the ‘subclinical mastitis’ case.

Artificial neural network (ANN): An artificial neural network (ANN) can be thought of as a computer program that produces a recognized predetermined output based on changes in certain inputs and outputs of the system. ANN is used to address problems commonly encountered in pattern, sample, and character recognition and similar processes.
An electrical model of the human neural network constitutes the basis of ANN. Data, or inputs to the neural network, are multiplied by a weight assigned to each input and then added. If the total value exceeds a certain limit value, that value is transmitted to the output. The problem is to arrange the input weights so as to obtain the desired output. There are many ways to adjust input weights. Figure 1 shows a simple neural network model. Here, Xi is an input value of the network, Wi is the weight of the corresponding input, Σ is the weighted sum of the input values, and the final block shows the limit function. In general, ANNs use multi-layer models. This kind of model has an input layer, a hidden layer and an output layer (Allahverdi,


Volume 49 Issue 5 (October 2015)

(FP) observations. Undetected subclinical mastitis is “False Negative” (FN), and if there are no alerts, observations outside the subclinical mastitis period are classified as “True Negative” (TN). The sensitivity represents the percentage of correctly detected samples with subclinical mastitis out of all samples with subclinical mastitis:

$$\text{Sensitivity (\%)} = \frac{TP}{TP + FN} \times 100$$

FIG 1. Neural network sample

2002). ANN is an information processing system composed of a large number of non-linear and tightly interconnected processors or neurons organized as layers and connected to the other layers with weights.

The specificity indicates the percentage of correctly identified healthy samples out of all healthy samples:

$$\text{Specificity (\%)} = \frac{TN}{TN + FP} \times 100$$

The main advantage of the ANN technique over traditional methods is that it does not require clear, obvious mathematical information concerning the complex state of the investigated problem. The most general definition of ANN is that it is a complex system, designed to emulate the neurons of the human brain and formed by artificial neurons connected to one another via various connection geometries. The features which distinguish ANN from other methods include its capacities for learning, generalization and parallel processing. These features allow for such advantages as speed, fault tolerance and efficiency.
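The single-neuron model described for Figure 1 (inputs multiplied by weights, summed, then compared with a limit value) can be illustrated with a minimal Python sketch. The function name, the example inputs and weights, and the choice of a hard 0/1 step as the limit function are all assumptions made for illustration; they are not taken from the study.

```python
def neuron_output(inputs, weights, threshold):
    """Weighted sum of the inputs passed through a hard-limit (step) function.

    The 0/1 step is an assumed form of the 'limit function'; the study
    does not specify which activation was used.
    """
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


# Hypothetical normalized inputs and weights, for illustration only.
x = [0.2, 0.8, 0.5]
w = [0.4, 0.9, -0.3]
print(neuron_output(x, w, threshold=0.5))
```

Training a network amounts to adjusting the values in `w` until outputs like this one agree with the observed classes.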

The error rate represents the percentage of samples wrongly detected as subclinical mastitis out of all samples on which an alarm was produced:

Adaptive Neuro-Fuzzy Inference System (ANFIS): Both the ANN and FL techniques are based on the working logic of the human brain. Whereas FL imitates the human brain’s ability to draw conclusions, ANNs emulate the brain’s physical structure. Both techniques require a mathematical model to control the system, and they can thus be used to model both linear and complex non-linear systems.

RESULTS AND DISCUSSION
Analysis of the results revealed that, across 6 lactation ranks (LR) and 4 seasons (S) (spring, summer, autumn and winter), the daily milk yield (MY) of Holstein cows varied from 14.29±1.680 to 34.20±2.430 kg/day; electrical conductivity (EC) from 3.99±0.097 to 4.50±0.322 mS/cm; average milking duration (AMD) from 4.38±0.317 to 9.04±1.430 min/cow; and SCC from 29488 to 228026 cells/ml (Table 1).

The Adaptive Neuro-Fuzzy Inference System (ANFIS) is an artificial intelligence technique resulting from a synthesis of the ANN and FL systems. To capitalize on FL’s capability to process uncertain information and ANN’s learning ability, these two technologies may be combined in a variety of ways (Jacceh, 2003). At the same time, the combination of FL and ANN compensates for certain disadvantages of each method. ANFIS is composed of four layers: the fuzzification layer, the hidden layer, the function layer and the defuzzification layer (Jaos, 1998). In this study, correctly assigned subclinical mastitis cases in the subclinical mastitis prediction model are classified as “True Positive” (TP). Mastitis alerts outside the current subclinical mastitis period are “False Positive”

$$\text{Error (\%)} = \frac{FP}{FP + TP} \times 100$$

To compare the sensitivity, specificity and error rates obtained as a result of implementing the AI techniques (ANN and ANFIS), a Z-test (the hypothesis control for the difference between rates) was used (Kesici and Kocabas, 2007).
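A Z-test for the difference between two rates can be sketched as follows. This is a generic pooled two-proportion test written with the Python standard library; whether the authors used the pooled form is an assumption, and the counts in the example are hypothetical.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion Z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical example: 16 of 20 versus 11 of 20 cases detected.
z, p = two_proportion_z(16, 20, 11, 20)
print(f"z={z:.3f}, p={p:.3f}")
```

The difference between the two rates is judged significant when the p-value falls below the chosen level, for example 0.05.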

TABLE 1. Trait averages.
TRAITS                               MIN            MAX
Milk Yield (kg/day)                  14.29±1.680    34.20±2.430
Electrical Conductivity (mS/cm)      3.99±0.097     4.50±0.322
Average Milking Duration (min/cow)   4.38±0.317     9.04±1.430
Somatic Cell Count (cells/ml)        29488          228026

Results from applying the ANN and ANFIS methods to the obtained data to predict early-stage subclinical mastitis in Holstein cows are given below.

Artificial Neural Network: A Multi Layer Feed Forward Neural Network with 5 input-layer neurons, 5 hidden-layer neurons and one output-layer neuron was trained (Figure 2) using the Matlab 7 software package. In practice,


the real-world data collected during the 15-month sampling period were used. Milk sample SCC counts were used as the MD output data for classification in training. Output data MD were classified as healthy or

FIG 2. Sample ANN model for subclinical mastitis detection

FIG 3. Sample regression coefficient in ANN prediction

FIG 4. Sample training state in ANN (panel titles: Gradient = 0.011257, at epoch 10000; Validation Checks = 0, at epoch 10000)

subclinical mastitis infected, with the assigned values of 0 and 1, respectively. All input values used in the ANNs were normalized using the following formula:

$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
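The min-max normalization applied to the ANN inputs rescales each trait to the range 0 to 1. A minimal Python sketch follows; the function name and the sample electrical conductivity values are hypothetical, and returning zeros for a constant column is a choice made here to avoid division by zero, not something stated in the study.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed handling of a constant column: map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical electrical conductivity readings in mS/cm.
ec = [3.99, 4.20, 4.50]
print(min_max_normalize(ec))
```

Each input trait would be normalized separately, so that traits measured on very different scales contribute comparably during training.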

Of the 346 data samples collected from the 170 cows in the study, 61 were determined to indicate subclinical mastitis. The 346 data samples collected on the sampling days were divided as follows: 311 (90%) training, 35 (10%) testing; 260 (75%) training, 86 (25%) testing; 242 (70%) training, 104 (30%) testing; and 208 (60%) training, 138 (40%) testing. The ANNs were trained before the system was run with the test data. Values estimated by the system were compared with actual observed values. The training and testing stages left no data unused. The 346 data samples were shuffled to obtain 5 distinct orderings of the data set. Each data set was subjected to the training–testing procedures, the percentages of correct responses were calculated, and the averages of the percentages of these operations for each data set were estimated (Figures 3–5).
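The shuffling and training/testing partitions described above can be reproduced in outline with the Python sketch below. The function name, the use of integer placeholders for samples, the seed values and the round-half-up rule for the training count are assumptions; the study does not say how the shuffling or rounding was done.

```python
import math
import random


def shuffled_split(samples, train_fraction, seed):
    """Shuffle a copy of the samples and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    # Round half up so that 75% of 346 (259.5) gives 260 training samples.
    n_train = math.floor(len(shuffled) * train_fraction + 0.5)
    return shuffled[:n_train], shuffled[n_train:]


samples = range(346)  # placeholders standing in for the 346 data samples
for fraction in (0.90, 0.75, 0.70, 0.60):
    train, test = shuffled_split(samples, fraction, seed=1)
    print(f"{fraction:.0%}: {len(train)} training, {len(test)} testing")
```

Repeating the call with five different seeds would give five distinct orderings, each of which can then be trained and tested at the four split ratios.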

FIG 5. Sample performance in ANN

A total of 10000 epochs was used for each data set in training and testing, yielding an error rate of 0.001. These results are presented in Table 2. As shown in Table 2, in the 3rd test, which yielded the best ANN training, the MD output value was observed as 91%. The sensitivity, specificity and error rates of the ANN model were 80%, 91% and 64%, respectively.


TABLE 2: Test results

Adaptive Neuro-Fuzzy Inference System (ANFIS): For the ANFIS application, the MATLAB Toolbox ANFIS interface was used. A total of 5 inputs (LR, MY, EC, AMD, S) were prepared for training (60%, 70%, 75%, and 90% of all data) and MD output data were entered into the system for model training (Figure 6). Inputs were fuzzified using the FL technique. As with the FL model, the inputs were divided into membership functions as follows: LR=3, MY=4, EC=3, AMD=3 and S=4. Rules and an inference system were prepared by the system using the artificial neural networks technique. The epoch number was set to 20: the error value changed little with an epoch number of 50, so a value of 20 was deemed appropriate. The Sugeno inference method was used here as a mining method. The Sugeno inference method is similar to the Mamdani inference method used in the fuzzy

FIG 6. Scheme of ANFIS model in MATLAB Toolbox

expert systems, except that the output data membership functions can be either constant or linear in the Sugeno method. The membership function of the outputs in the established model was defined as constant. The output value z_i obtained for each rule was multiplied by the weight w_i of the running rule. For example, if LR=1, MY=30 l, EC=4.5 mS/cm, AMD=8 min, and S=3, the running weight w_i is equal to

$$w_i = \mathrm{AND}\bigl(F_1(1), F_2(30), F_3(4.5), F_4(8), F_5(3)\bigr)$$

Here, F_1, ..., F_5 are the membership functions of the inputs LR, MY, EC, AMD, and S, and AND denotes the fuzzy conjunction applied to the rule's antecedents. The final output of the system is a weighted average of all rule outputs:

$$\text{Final output} = \frac{\sum_{i=1}^{N} w_i z_i}{\sum_{i=1}^{N} w_i}$$
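The weighted average that forms the final output of a Sugeno system with constant rule outputs can be computed as in the Python sketch below. The function name, the three example rules and their weights are hypothetical; raising an error when no rule fires is a choice made for this sketch.

```python
def sugeno_output(weights, rule_outputs):
    """Weighted average of constant rule outputs z_i with rule weights w_i."""
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumed handling: no rule fired, so the average is undefined.
        raise ValueError("no rule fired; the weighted average is undefined")
    weighted_sum = sum(w * z for w, z in zip(weights, rule_outputs))
    return weighted_sum / total_weight


# Hypothetical rule weights and constant outputs
# (0 = healthy, 1 = subclinical mastitis).
w = [0.2, 0.6, 0.2]
z = [0.0, 1.0, 0.0]
print(sugeno_output(w, z))
```

Because the output is a continuous value between the rule constants, a cut-off is still needed to turn it into a healthy or subclinical mastitis label.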


Figure 7 shows graphically the error value at the end of the 20th training epoch. Table 4 shows all error values obtained at the end of implementation of the ANFIS model on the 5 data sets. Figures 8 and 9 show the estimation error of the system for the 10% and 40% test inputs. As with the implementation of the ANN model, the predicted values for the 5 data sets were compared to the actual observed values. The percentage of correct values was calculated; for example, 86 test inputs yielding 71 correct outputs gave a success rate of 83%. After this procedure was performed for each output, the arithmetic average of the percentages was estimated (Table 3). The operations resulted in unused data in the training and testing stages. A Z-test comparing the specificity of the ANN and ANFIS models found no statistically significant difference between the two. In terms of the sensitivity and error rates, the Z-test found the ANN method to be superior at a statistically significant level (P