Acta Diabetol (2003) 40:S9–S14 DOI 10.1007/s00592-003-0018-x

© Springer-Verlag 2003

R. Linder • E.I. Mohamed • A. De Lorenzo • S.J. Pöppl

The capabilities of artificial neural networks in body composition research

Abstract When estimating in vivo body composition or combining such estimates with other results, multiple variables must be taken into account (e.g. binary attributes such as gender or continuous attributes such as most biosignals). Standard statistical models, such as logistic regression and multivariate analysis, presume well-defined distributions (e.g. the normal distribution); they also presume independence among all inputs and purely linear relationships, yet these requirements are rarely met in real life. As an alternative to these models, artificial neural networks can be used. In the present work, we describe the pre-processing and multivariate analysis of data using neural network techniques, providing examples from the medical field and making comparisons with classic statistical approaches. We also address the criticisms raised regarding neural network techniques and discuss their potential improvement.

Key words Artificial neural network • Perceptron • Feed-forward • Modular network • Leave one out • Approximation • Classification in Medicine (ACMD)

R. Linder ()
Institute of Medical Informatics, University of Lübeck
Ratzeburger Allee 160, D-23538 Lübeck, Germany
E-mail: [email protected]

R. Linder • S.J. Pöppl
Institute of Medical Informatics, University of Lübeck, Germany

E.I. Mohamed • A. De Lorenzo
Division of Human Nutrition, Faculty of Medicine and Surgery, Tor Vergata University, Rome, Italy

E.I. Mohamed
Department of Biophysics, Medical Research Institute, University of Alexandria, Egypt

A. De Lorenzo
Scientific Institute “S. Lucia”, Rome, Italy

Introduction

Neural networks (also termed “neural nets” or “connectionist models”) are a series of non-linear, interconnected mathematical equations, which loosely resemble biological neuronal systems and are used to calculate an output variable on the basis of independent input variables. Neural network analysis is an outgrowth of artificial intelligence, yet it differs from expert systems in that it is not rule-based, with pre-programmed constraints, rules, or conditions. Instead, neural networks “learn”, progressively developing meaningful, reliable relationships between input and output variables. Unlike classic statistical models and correlative methods, neural networks consist of multiple indirect interconnections between input and output variables and employ non-linear mathematical equations and statistical techniques to successively minimise the variance between actual and predicted outputs. This eventually yields a model that can subsequently be applied to an independent data set, in turn producing predicted outputs that reliably correspond to the actual observed values.
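The core idea of successively minimising the variance between actual and predicted outputs can be sketched in a few lines of Python. This is only an illustration, not the networks used in this paper: a single linear neuron and plain stochastic gradient descent stand in for a full network, and the data are invented (generated from the rule y = 2x + 1, which the model must recover from examples alone).

```python
# Illustrative sketch: iteratively adjusting weights to minimise the
# squared error between actual and predicted outputs. A single linear
# neuron stands in for a full network; the data are made up.

# Training pairs generated from the rule y = 2x + 1 (unknown to the model)
data = [(x, 2 * x + 1) for x in [0.0, 0.5, 1.0, 1.5, 2.0]]

w, b, lr = 0.0, 0.0, 0.05   # weight, bias, learning rate
for _ in range(2000):       # repeated passes over the training data
    for x, y in data:
        pred = w * x + b
        err = pred - y
        # Gradient of the squared error (pred - y)^2 w.r.t. w and b
        w -= lr * err * x
        b -= lr * err

print(round(w, 2), round(b, 2))  # close to the true values 2 and 1
```

After training, the learned parameters approximate the underlying rule; in a real neural network the same principle is applied to many weights across several layers at once.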

The history of neural networks

The history of artificial neural networks (ANN) dates back to as early as 1890, when the American psychologist William James developed a model to explain the capabilities of the brain in making associations. He speculated that “when two brain processes are active together or in immediate succession, one of them, on reoccurring, tends to propagate its excitement into the other”. In other words, as stated by the famous Hebbian learning rule formulated in 1949: “The connections between two neurons will increase if they are active simultaneously”. In 1958, an ANN known as the “perceptron” was developed by Rosenblatt, marking the beginning of the processing of information in only one direction by so-called feed-forward networks. In 1960, the perceptron was applied for the first time, yet 9 years later Minsky and Papert published a deeply discouraging analysis of the perceptron and its limitations. It later became clear that the perceptron needed one or more hidden layers to overcome these limitations. In 1986, Rumelhart et al. developed a powerful learning algorithm that could also handle hidden neurons [1], representing the starting point for the triumphant progress of ANN technology. Not least in medicine, ANN are used for supporting decisions in diagnosis, classification, early detection, prognosis, and quality control. As revealed by a Medline search for the MeSH term “Neural Networks, Computer”, there is a growing interest in ANN, with more than 4,000 papers dealing with this topic, most of which focus on feed-forward networks.
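Rosenblatt's perceptron learning rule can be illustrated with a minimal sketch (variable names and the toy task are our own): weights are nudged toward the input whenever the prediction is wrong, echoing the Hebbian idea that co-active units strengthen their connection.

```python
# Minimal sketch of Rosenblatt's perceptron learning rule (1958).
# A threshold unit is trained on a linearly separable toy problem.

def predict(weights, bias, x):
    """Threshold unit: fires (1) if the weighted sum exceeds 0."""
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0

def train_perceptron(samples, lr=0.1, epochs=50):
    """samples: list of (input_vector, target) pairs with targets 0/1."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            # On a miss, move the weights toward (or away from) the input
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Logical AND is linearly separable, so a single perceptron can learn it
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(data)
print([predict(weights, bias, x) for x, _ in data])  # → [0, 0, 0, 1]
```

Minsky and Papert's criticism applies exactly here: replace the targets with those of XOR, which is not linearly separable, and no choice of weights lets a single perceptron succeed; one or more hidden layers are needed.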

How do feed-forward networks work?

Feed-forward networks generally consist of three layers of artificial neurons. Data are entered in the input layer and further processed in the hidden and output layers. ANN use non-linear mathematical equations to successively develop meaningful relationships between input and output variables through a learning process, which consists of a “training phase” and a “recall phase”. In the training phase, the relationships between the different input variables and the output variable(s) are established through adaptations of the weight factors assigned to the interconnections between the layers of artificial neurons. This adaptation is based on rules that are set in the learning algorithm. At the end of the training phase, the weight factors are fixed. In the recall phase, data from patterns not previously interpreted by the network are entered, and an output is calculated based on the above-mentioned, and now fixed, weight factors. Fig. 1 presents a diagram of an ANN containing six input neurons (in this case, attributes of schoolchildren) and one output neuron (prediction of obesity). Usually, the output neuron produces a so-called activity between 0 and 1, with the specific value representing in this case either “obesity” (activity ≥0.5) or “no obesity” (activity <0.5).
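The recall phase of a network like the one in Fig. 1 can be sketched as a single forward pass. All weights and input values below are invented for illustration; in a real application the weights would come from the training phase, and the six inputs would be actual (normalised) attributes of a schoolchild.

```python
import math

# Hedged sketch of the recall phase of a small feed-forward network:
# six inputs -> one hidden layer of two neurons -> one output neuron.
# All weights and inputs are made up for illustration only.

def sigmoid(z):
    """Logistic activation, squashing any sum into an activity in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input layer -> hidden layer -> output neuron."""
    hidden = [sigmoid(b + sum(w * xi for w, xi in zip(ws, x)))
              for ws, b in zip(w_hidden, b_hidden)]
    return sigmoid(b_out + sum(w * h for w, h in zip(w_out, hidden)))

# Six illustrative inputs (e.g. normalised attributes of a schoolchild)
x = [0.8, 0.2, 1.0, 0.5, 0.1, 0.9]

# Fixed, made-up weights, standing in for the result of training
w_hidden = [[0.4, -0.2, 0.7, 0.1, -0.5, 0.3],
            [-0.3, 0.6, 0.2, -0.1, 0.4, -0.6]]
b_hidden = [0.1, -0.2]
w_out, b_out = [0.9, -0.7], 0.05

activity = forward(x, w_hidden, b_hidden, w_out, b_out)
label = "obesity" if activity >= 0.5 else "no obesity"
print(round(activity, 3), label)
```

Because the weights are fixed at recall time, the same input pattern always yields the same activity; the 0.5 threshold then converts the continuous activity into the binary classification described above.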