Artificial Neural Networks Theory and Applications

ARTIFICIAL NEURAL NETWORKS - THEORY AND APPLICATIONS

BALAJI J, Roll No. 16201261, Research Fellow, Department of Aerospace Engineering, Indian Institute of Technology, Kanpur - 208 016.

THE FOUR PARADIGMS OF ARTIFICIAL INTELLIGENCE

 Cognitive Model of the Brain (Machine Learning): Artificial Neural Networks, Support Vector Machines
 Laws of Thought (Syllogism): Boolean, Fuzzy, Propositional and Predicate Logics
 'The Turing Test': Natural Language Processing, Knowledge Representation, Automated Reasoning, Machine Learning, Robotics, Computer Vision
 Rational Agent: Problem Solving by Search, Autonomous Systems

Artificial Neural Networks - Theory and Applications

ARTIFICIAL NEURAL NETWORKS

 A machine that is designed to model the way in which the brain performs a particular task or function of interest; the network is usually implemented by using electronic components or is simulated in software on a digital computer.

 A massively parallel distributed processor made up of simple processing units that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
  • Knowledge is acquired by the network from its environment through a learning process.
  • Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.


CHARACTERISTICS OF NEURAL NETWORKS

 Nonlinearity
 Input–Output Mapping
 Adaptivity
 Evidential Response
 Contextual Information
 Fault Tolerance
 VLSI Implementability
 Uniformity of Analysis and Design
 Neurobiological Analogy


THE BIOLOGICAL AND ARTIFICIAL NEURON

Biological Neural Network    Artificial Neural Network
Cell Body                    Neurons
Dendrite                     Weights or interconnections
Soma                         Net input
Axon                         Output

[Figure: an artificial neuron with inputs p1, p2, p3, weights w1, w2, w3, a bias b (on a fixed input of 1), a summing junction feeding the activation function f, and output a.]

a = f(p1 w1 + p2 w2 + p3 w3 + b) = f(Σ pi wi + b)


ARTIFICIAL NEURAL NETWORK (ANN)

 An artificial neuron is characterized by:
  • Architecture (connection between neurons)
  • Activation function (determines the decision boundary)
  • Training or learning (determining weights on the connections)


TYPES OF ACTIVATION FUNCTION
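The slide's figure of activation-function shapes is not reproduced here; the commonly used choices can be sketched as follows (ReLU is a later addition not on the original slide, included for completeness):

```python
import numpy as np

def threshold(v):
    """Heaviside hard-limiter: 1 if v >= 0, else 0."""
    return np.where(v >= 0, 1.0, 0.0)

def sigmoid(v):
    """Logistic function: smooth, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def tanh(v):
    """Hyperbolic tangent: smooth, output in (-1, 1)."""
    return np.tanh(v)

def relu(v):
    """Rectified linear unit: max(0, v)."""
    return np.maximum(0.0, v)
```

Each function maps the induced local field v to the neuron's output; the smooth, differentiable ones (sigmoid, tanh) are the ones required later for backpropagation.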


NEURAL NETWORKS VIEWED AS DIRECTED GRAPHS

 Rule 1. A signal flows along a link only in the direction defined by the arrow on the link.
 Rule 2. A node signal equals the algebraic sum of all signals entering the pertinent node via the incoming links.
 Rule 3. The signal at a node is transmitted to each outgoing link originating from that node, with the transmission being entirely independent of the transfer functions of the outgoing links.


SIGNAL FLOW GRAPH OF ANN

A neural network is a directed graph consisting of nodes with interconnecting synaptic and activation links and is characterized by four properties:
1. Each neuron is represented by a set of linear synaptic links, an externally applied bias, and a possibly nonlinear activation link.
2. The synaptic links of a neuron weight their respective input signals.
3. The weighted sum of the input signals defines the induced local field of the neuron in question.
4. The activation link squashes the induced local field of the neuron to produce an output.

FEEDBACK

 Feedback is said to exist in a dynamic system whenever the output of an element in the system influences in part the input applied to that particular element, thereby giving rise to one or more closed paths for the transmission of signals around the system.
 Feedback plays a major role in the study of a special class of neural networks known as recurrent networks.


NETWORK ARCHITECTURES

 The manner in which the neurons of a neural network are structured is intimately linked with the learning algorithm used to train the network.
 Feedforward Neural Networks - the signal flow is strictly from the input layer to the output layer only.
 Single-Layer Feedforward Networks:
  • "single-layer" refers to the output layer of computation nodes.
  • The input layer of source nodes is not counted because no computation is performed there.


NETWORK ARCHITECTURES

 A Multilayer Feedforward Neural Network distinguishes itself by the presence of one or more hidden layers, whose computation nodes are correspondingly called hidden neurons or hidden units.
 The term "hidden" refers to that part of the neural network which is not seen directly from either the input or the output of the network.
 The function of hidden neurons is to intervene between the external input and the network output in some useful manner.
 By adding one or more hidden layers, the network is enabled to extract higher-order statistics from its input.

NETWORK ARCHITECTURES

 A recurrent neural network distinguishes itself from a feedforward neural network in that it has at least one feedback loop. In the examples considered here, there are no self-feedback loops in the network.
 The feedback loops involve the use of particular branches composed of unit-time delay elements, which result in a nonlinear dynamic behaviour.

[Figures: Recurrent network with no self-feedback loops and no hidden neurons. Recurrent network with hidden neurons.]


KNOWLEDGE REPRESENTATION

 Knowledge refers to stored information or models used by a person or machine to interpret, predict, and appropriately respond to the outside world.
 Knowledge of the world consists of two kinds of information:
  • The known world state, represented by facts about what is and what has been known; this form of knowledge is referred to as prior information.
  • Observations (measurements) of the world, obtained by means of sensors designed to probe the environment in which the neural network is supposed to operate.
 Primary characteristics of knowledge representation:
  • What information is actually made explicit; and
  • How the information is physically encoded for subsequent use.
 A major task for a neural network is to learn a model of the world in which it is embedded, and to maintain the model sufficiently consistently with the real world so as to achieve the specified goals of the application of interest.

RULES OF KNOWLEDGE REPRESENTATION

 Rule 1. Similar inputs (i.e., patterns drawn from similar classes) should usually produce similar representations inside the network, and should therefore be classified as belonging to the same class.
 Rule 2. Items to be categorized as separate classes should be given widely different representations in the network.
 Rule 3. If a particular feature is important, then there should be a large number of neurons involved in the representation of that item in the network.
 Rule 4. Prior information and invariances should be built into the design of a neural network whenever they are available, so as to simplify the network design by its not having to learn them.


LEARNING PROCESSES  Just as there are different ways in which we ourselves learn from our own surrounding environments, so it is with neural networks.  ANNs can model the learning process by adjusting the weighted connections found between neurons in the network .  This effectively emulates the strengthening and weakening of the synaptic connections found in the biological brain.  This strengthening and weakening of the connections is what enables the network to learn.  The learning processes through which neural networks function can be categorized as follows: learning with a teacher and learning without a teacher.  By the same token, the latter form of learning may be subcategorized into unsupervised learning and reinforcement learning. Artificial Neural Networks - Theory and Applications

LEARNING PROCESSES

Learning with a Teacher - also referred to as supervised learning.
1. The teacher and the neural network are both exposed to a training vector (i.e., an example) drawn from the same environment.
2. By virtue of built-in knowledge, the teacher is able to provide the neural network with a desired response for that training vector.
3. The network parameters are adjusted under the combined influence of the training vector and the error signal.
4. This adjustment is carried out iteratively in a step-by-step fashion with the aim of eventually making the neural network emulate the teacher.

LEARNING PROCESSES

Learning without a Teacher - Reinforcement Learning
1. The learning of an input–output mapping is performed through continued interaction with the environment in order to minimize a scalar index of performance.
2. The critic converts a primary reinforcement signal received from the environment into a higher-quality reinforcement signal called the heuristic reinforcement signal.
3. The system observes a temporal sequence of stimuli also received from the environment, which eventually result in the generation of the heuristic reinforcement signal.

LEARNING PROCESSES

Learning without a Teacher - Unsupervised Learning
1. Provision is made for a task-independent measure of the quality of the representation that the network is required to learn, and the free parameters of the network are optimized with respect to that measure.
2. To perform unsupervised learning, a competitive-learning rule is used.
3. The competitive layer consists of neurons that compete with each other for the "opportunity" to respond to features contained in the input data.
4. In such a strategy, the neuron with the greatest total input "wins" the competition and turns on; all the other neurons in the network then switch off.
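The winner-take-all behaviour in points 3 and 4 can be sketched as follows; the weight matrix, input pattern and learning rate below are assumed values for illustration:

```python
import numpy as np

def compete(W, x):
    """Winner-take-all: the neuron with the greatest total input turns on;
    all other neurons in the layer switch off."""
    net = W @ x                    # total input to each competitive neuron
    out = np.zeros(W.shape[0])
    out[np.argmax(net)] = 1.0
    return out

def competitive_update(W, x, lr=0.1):
    """A standard competitive-learning rule: only the winning neuron moves
    its weight vector toward the current input pattern."""
    i = int(np.argmax(W @ x))
    W = W.copy()
    W[i] += lr * (x - W[i])
    return W

W = np.array([[1.0, 0.0], [0.0, 1.0]])   # assumed initial weights (2 neurons)
x = np.array([0.9, 0.1])                 # example input pattern
y = compete(W, x)                        # neuron 0 wins for this input
W_new = competitive_update(W, x)
```

Repeated over many inputs, the winners' weight vectors drift toward clusters in the data, which is the task-independent representation the layer learns.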

LEARNING TASKS

Pattern Association - association is known to be a prominent feature of the human brain.
 An associative memory is a brainlike distributed memory that learns by association.
 In autoassociation, a neural network is required to store a set of patterns (vectors) by repeatedly presenting them to the network.
 In heteroassociation, an arbitrary set of input patterns is paired with another arbitrary set of output patterns.
 Two phases are involved in the operation of an associative memory: the storage phase and the recall phase.
 Autoassociation involves the use of unsupervised learning, whereas the type of learning involved in heteroassociation is supervised.

LEARNING TASKS

Pattern Recognition - the process whereby a received pattern/signal is assigned to one of a prescribed number of classes.
 During the training session, the network is repeatedly presented with a set of input patterns along with the category to which each particular pattern belongs.
 Later, the network is presented with a new pattern that has not been seen before, but which belongs to the same population of patterns used to train the network.
 The network is able to identify the class of that particular pattern because of the information it has extracted from the training data.
 Pattern recognition performed by a neural network is statistical in nature, with the patterns being represented by points in a multidimensional decision space.

LEARNING TASKS

Function Approximation - the ability of a neural network to approximate an unknown input–output mapping.
 Consider a nonlinear input–output mapping described by the functional relationship
  𝒅 = 𝒇(𝒙)
 where the vector x is the input and the vector d is the output. The vector-valued function f(·) is assumed to be unknown.
 To make up for the lack of knowledge about the function f(·), we are given the set of labelled examples:
  {(𝒙ᵢ, 𝒅ᵢ)}, i = 1, ..., N
 The requirement is to design a neural network that approximates the unknown function f(·) such that the function F(·) describing the input–output mapping actually realized by the network is close enough to f(·) in a Euclidean sense over all inputs, as shown by
  ‖𝑭(𝒙) − 𝒇(𝒙)‖ < ε for all 𝒙
 where ε is a small positive number.
 Provided that the size of the training sample is large enough and the network is equipped with an adequate number of free parameters, then the approximation error can be made small enough for the task.

LEARNING TASKS

Function Approximation - System Identification
 Let 𝒅 = 𝒇(𝒙) describe the input–output relation of an unknown memoryless multiple-input, multiple-output (MIMO) system.
 Let the vector 𝒚ᵢ denote the actual output of the neural network produced in response to an input vector 𝒙ᵢ.
 The difference between 𝒅ᵢ (associated with 𝒙ᵢ) and the network output 𝒚ᵢ provides the error signal vector 𝒆ᵢ.
 This error signal is, in turn, used to adjust the free parameters of the network to minimize the squared difference between the outputs of the unknown system and the neural network in a statistical sense, computed over the entire training sample.

LEARNING TASKS

Function Approximation - Inverse Modelling
 Let 𝒅 = 𝒇(𝒙) describe the input–output relation of an unknown memoryless MIMO system.
 An inverse model produces the vector x in response to the vector d. The inverse system may thus be described as 𝒙 = 𝒇⁻¹(𝒅).


LEARNING TASKS

Control
 Indirect learning - the Jacobian of the plant is estimated and used in the error-correction learning algorithm for computing the adjustments to the free parameters of the neural controller.
 Direct learning - the absolute values of the Jacobian are given a distributed representation in the free parameters of the neural controller.


ROSENBLATT’S PERCEPTRON

 The simplest form of a neural network, used for the classification of patterns said to be linearly separable.
 It consists of a single neuron with adjustable synaptic weights and bias.
 The perceptron is built around a nonlinear neuron, namely, the McCulloch–Pitts model of a neuron.


THE PERCEPTRON CONVERGENCE ALGORITHM

Illustration of the hyperplane as decision boundary for a two-dimensional, two-class pattern-classification problem.

(a) A pair of linearly separable patterns. (b) A pair of non-linearly separable patterns.
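The convergence procedure amounts to error-correction updates on misclassified patterns; a minimal sketch on an assumed linearly separable toy problem (AND-like targets, chosen for illustration):

```python
import numpy as np

def train_perceptron(X, d, lr=1.0, epochs=100):
    """Rosenblatt's perceptron rule: on each misclassified pattern, shift the
    hyperplane via w <- w + lr*t*x and b <- b + lr*t (targets t in {-1, +1})."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, d):
            y = 1.0 if np.dot(w, x) + b >= 0 else -1.0   # hard limiter
            if y != t:
                w += lr * t * x
                b += lr * t
                errors += 1
        if errors == 0:     # converged: every pattern is classified correctly
            break
    return w, b

# AND-like, linearly separable toy data (assumed for illustration).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([-1.0, -1.0, -1.0, 1.0])
w, b = train_perceptron(X, d)
```

Because these patterns are linearly separable, the perceptron convergence theorem guarantees the loop terminates; on non-separable patterns such as those in figure (b), it never would.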

THE LEAST-MEAN-SQUARE ALGORITHM

 The least-mean-square (LMS) algorithm is configured to minimize the instantaneous value of the cost function
  ℇ(w) = (1/2) e²(n)
 Differentiating ℇ(w) with respect to the weight vector w yields
  ∂ℇ(w)/∂w = e(n) ∂e(n)/∂w
 Since the error signal is e(n) = d(n) − xᵀ(n)w(n), we have ∂e(n)/∂w = −x(n). Hence,
  ∂ℇ(w)/∂w = −x(n) e(n)
 The instantaneous estimate of the gradient vector is therefore
  ĝ(n) = −x(n) e(n)
 and the resulting LMS weight update is
  ŵ(n + 1) = ŵ(n) + η x(n) e(n)
 where η is the learning-rate parameter.
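A minimal sketch of this update rule, here used to recover an assumed known linear map from noiseless synthetic data (the data and learning rate are illustrative values):

```python
import numpy as np

def lms(X, d, eta=0.02, epochs=100):
    """Least-mean-square (Widrow-Hoff) adaptation of a linear neuron."""
    w = np.zeros(X.shape[1])
    for x, targets in [(x, t) for _ in range(epochs) for x, t in zip(X, d)]:
        pass  # (placeholder removed below; loop written explicitly)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, d):
            e = target - np.dot(w, x)   # error signal e(n)
            w += eta * x * e            # w(n+1) = w(n) + eta * x(n) * e(n)
    return w

# Generate data from an assumed 'true' weight vector, then recover it.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
w_true = np.array([1.0, -2.0, 0.5])
d = X @ w_true
w = lms(X, d)
```

With noiseless data and a small learning rate the sequential updates drive the error toward zero, so the learned weights approach the generating weights.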


MULTILAYER PERCEPTRONS

 The model of each neuron in the network includes a nonlinear activation function that is differentiable.
 The network contains one or more layers that are hidden from both the input and output nodes.
 The network exhibits a high degree of connectivity, the extent of which is determined by the synaptic weights of the network.


TRAINING OF MULTILAYER PERCEPTRONS

 Forward Phase - the synaptic weights of the network are fixed and the input signal is propagated through the network, layer by layer, until it reaches the output (function signal).
 Backward Phase - an error signal is produced by comparing the output of the network with a desired response.
  • The resulting error signal is propagated through the network, again layer by layer, in the backward direction.
  • Successive adjustments are made to the synaptic weights of the network.

BACK PROPAGATION ALGORITHM

1. Pick the synaptic weights and thresholds.
2. Present the network an epoch of training examples.
3. Compute the induced local fields and function signals of the network by proceeding forward through the network, layer by layer.
4. Compute the error signal.
5. Compute the local gradients of the network and adjust the synaptic weights of the network in layer l according to the generalized delta rule.
6. Iterate the forward and backward computations under points 3 to 5 by presenting new epochs of training examples to the network until the chosen stopping criterion is met.
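The steps above can be sketched compactly for a one-hidden-layer network of sigmoid units trained on XOR; the network size, learning rate and epoch count are assumed illustrative values:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_mlp(X, D, hidden=4, eta=0.5, epochs=10000, seed=0):
    """Batch backpropagation for a one-hidden-layer MLP with sigmoid units."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, D.shape[1])); b2 = np.zeros(D.shape[1])
    for _ in range(epochs):
        # Forward phase: propagate function signals layer by layer (steps 2-3).
        h = sigmoid(X @ W1 + b1)
        y = sigmoid(h @ W2 + b2)
        # Backward phase: error signal and local gradients (steps 4-5),
        # weight adjustments via the generalized delta rule.
        d2 = (D - y) * y * (1.0 - y)        # output-layer local gradients
        d1 = (d2 @ W2.T) * h * (1.0 - h)    # hidden-layer local gradients
        W2 += eta * h.T @ d2; b2 += eta * d2.sum(axis=0)
        W1 += eta * X.T @ d1; b1 += eta * d1.sum(axis=0)
    return W1, b1, W2, b2

# XOR is not linearly separable, so it genuinely needs the hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)
W1, b1, W2, b2 = train_mlp(X, D)
```

Iterating the forward and backward sweeps (step 6) drives the mean-squared error down until the stopping criterion, here a fixed epoch count, is met.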

APPLICATIONS OF ANN

 Classification
  • In marketing: consumer spending pattern classification
  • In defense: radar and sonar image classification
  • In agriculture & fishing: fruit and catch grading
  • In medicine: ultrasound and electrocardiogram image classification, EEGs, medical diagnosis
 Recognition and identification
  • In general computing and telecommunications: speech, vision and handwriting recognition
  • In finance: signature verification and bank note verification
 Assessment
  • In engineering: product inspection monitoring and control
  • In defense: target tracking
  • In security: motion detection, surveillance image analysis and fingerprint matching
 Forecasting and prediction
  • In finance: foreign exchange rate and stock market forecasting
  • In agriculture: crop yield forecasting
  • In marketing: sales forecasting
  • In meteorology: weather prediction

CASE STUDY

AIRCRAFT SYSTEM IDENTIFICATION USING ARTIFICIAL NEURAL NETWORKS

Kenton Kirkpatrick, Jim May Jr. and John Valasek, "Aircraft System Identification Using Artificial Neural Networks", 51st AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition, 07-10 January 2013, Grapevine (Dallas/Ft. Worth Region), Texas, U.S.A.

SYSTEM IDENTIFICATION

 Identification of a linear model for aircraft systems requires experimentally determined data, including state response to control inputs and excitation of modes.
 Separate linear models are generally determined for longitudinal and lateral/directional modes.
 Linear models are needed to analyse stability and determine control policies.
 Identifying a linear model requires determining:
  • State matrix A
  • Control matrix B
  • Output matrix C
  • Carry-through matrix D

 𝑥ₖ₊₁ = 𝐴𝑥ₖ + 𝐵𝑢ₖ
 𝑦ₖ = 𝐶𝑥ₖ + 𝐷𝑢ₖ

[Figure: Commander 700 aircraft]

LONGITUDINAL LINEAR MODEL

 Longitudinal motion covers motion that occurs in the pitching plane.
 Includes forward velocity, vertical velocity (or angle-of-attack), pitch angle, and pitch rate.
 Controls include elevator deflection and thrust.

 𝑥 = [𝑢  𝛼  𝑞  𝜃]ᵀ
 𝑢 = [𝛿𝑒  𝛿𝑇]ᵀ


LATERAL/DIRECTIONAL LINEAR MODEL

 Lateral/directional motion covers motion that occurs in the rolling and yawing planes.
 Includes side velocity (or side-slip angle), roll rate, yaw rate, roll angle and heading angle.
 Controls include aileron and rudder deflections.

 𝑥 = [𝛽  𝑝  𝑟  𝜙  𝜓]ᵀ
 𝑢 = [𝛿𝑎  𝛿𝑟]ᵀ


ANNSID  Ar tificial Neural Network System Identification (ANNSID) uses backpropagation to determine A and B matrices  ANN Requirements:  No hidden layers. Only input and output layers.  Must use linear threshold function (i.e., no threshold).  No bias inputs to nodes.


ANNSID  The state-space model of the dynamics of flight  Uses experimental data to learn state prediction  A and B matrices are discrete


LONGITUDINAL EXAMPLE  ANNSID for identifying longitudinal linear model of C700  All initial conditions are zero  Experimentally determined response for training network


LONGITUDINAL EXAMPLE

[Figures: input data and output data time histories]

LATERAL - DIRECTIONAL EXAMPLE  ANNSID for identifying longitudinal linear model of C700  All initial conditions are zero  Experimentally determined response for training network


LATERAL-DIRECTIONAL EXAMPLE

[Figures: input data and output data time histories]

CONCLUSIONS

 Accurate aircraft linear models for longitudinal and lateral/directional motion can be determined using an artificial neural network.
 Works well on inputs not used in training.
 The network must be restricted for the network weights to be equivalent to the A and B matrices:
  • No hidden layers
  • Linear threshold
  • No bias input
  • Inputs of current state and control; outputs of next state
 ANNSID is able to learn accurate models quickly (< 8 seconds CPU time for the scenarios tested).


REFERENCES

 Dennis J. Linse and Robert F. Stengel, "Identification of Aerodynamic Coefficients Using Computational Neural Networks", Journal of Guidance, Control, and Dynamics, Vol. 16, No. 6, November-December 1993.
 Joshua Harris et al., "Aircraft System Identification Using Artificial Neural Networks with Flight Test Data", 2016 International Conference on Unmanned Aircraft Systems (ICUAS), Arlington, VA, U.S.A., June 2016.
 Kenton Kirkpatrick, Jim May Jr. and John Valasek, "Aircraft System Identification Using Artificial Neural Networks", 51st AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition, Texas, U.S.A., January 2013.
 Randal W. Beard and Timothy W. McLain, "Small Unmanned Aircraft: Theory and Practice", Princeton University Press, 2012.
 Simon Haykin, "Neural Networks and Learning Machines", Pearson, 3rd Edition, 2008.
 Simon Haykin, "Neural Networks - A Comprehensive Foundation", Pearson, 2nd Edition, 1999.
 Stuart Russell and Peter Norvig, "Artificial Intelligence: A Modern Approach", Pearson, 3rd Edition, 2009.
