Safety Criteria and Safety Lifecycle for Artificial Neural Networks

Zeshan Kurd, Tim Kelly and Jim Austin
Department of Computer Science, University of York, York, YO10 5DD, UK
Phone: +44-1904-433388, Fax: +44-1904-432767
Email: {zeshan.kurd, tim.kelly, jim.austin}@cs.york.ac.uk

ABSTRACT: There are many performance-based techniques that aim to improve the safety of neural networks for safety-critical applications. However, many of these techniques provide inadequate forms of the safety arguments required for safety assurance. As a result, neural networks are typically restricted to advisory roles in safety-related applications. Neural networks are appealing because of their ability to operate in unpredictable and changing environments. It is therefore desirable to certify them for highly-dependable roles in safety-critical systems. This paper outlines safety criteria which, if enforced, would contribute to justifying the safety of neural networks. The criteria are a set of safety requirements on the behaviour of neural networks. A potential neural network model, based upon representing knowledge in symbolic form, is also outlined. The paper then presents a safety lifecycle for artificial neural networks. This lifecycle focuses on managing the behaviour represented by neural networks and contributes to providing acceptable forms of safety assurance.

KEYWORDS: safety, critical, neural, network, criteria, lifecycle, argument, hybrid, symbolic, knowledge.

INTRODUCTION

Artificial neural networks (ANNs) are used in many safety-related applications within industry. Typical applications within the aerospace industry include the use of ANNs in flight control systems [1]. Other applications within medicine involve ANNs for the diagnosis of certain diseases [2]. A wide-ranging review of applications of ANNs in safety-related industries can be found in a UK HSE report [3]. There are many reasons why industries find ANNs appealing. Most of these reasons relate to the functional benefits associated with ANNs, which may include:

• The ability to learn: This is useful for problems whose intentionally complete algorithmic specification cannot be determined at the initial stages of development. ANNs are also used when there is little understanding of the relationship between input and output patterns. The neural network uses learning algorithms and training sets to learn new features associated with the desired function.
• Dealing with novel inputs: Generalisation to novel inputs using pre-learned samples for comparison.
• Operational performance: By exploiting the generalisation ability, the neural network can outperform other methods, particularly in areas of pattern recognition.
• Computational efficiency: Neural networks are often faster and more memory-efficient than other methods.

Although neural networks are used in many safety-related applications, they share a common problem: ANNs are typically restricted to advisory roles. In other words, the ANN does not have the final decision in situations where there is a risk of severe consequences. Current safety standards have extremely limited recommendations for using artificial intelligence in safety-critical systems. One example is IEC 61508-7 [4], where neural networks may be used as a safety bag. The safety bag is an external monitor that ensures the system does not enter an unsafe state. This may protect against residual specification and implementation faults which may adversely affect safety. However, application of ANNs in safety-critical systems is allowed only for the lowest level of safety integrity (SIL1). The principal reason why ANNs are restricted to advisory roles is the continued absence of acceptable forms of safety assurance provided through safety argumentation.

THE PROBLEM

Given the potential benefits of neural networks in many applications, it is desirable to utilise them for highly-dependable roles. A highly-dependable role may include situations where an ‘unsafe’ output from a neural network may cause severe consequences. The main challenge is to be able to generate satisfactory forms of safety argumentation. There are many existing approaches for developing ANNs in safety-critical systems. One particular approach uses diverse neural networks [5],[6]. This technique attempts to overcome the difficulty of using a single ANN to cover some target function. To make the problem easier, an ensemble of ANNs is created in which each member is developed differently but for the same problem. Each network is varied using a diverse set of techniques such as different training sets or learning algorithms. Although very low levels of error were achieved, suitable safety arguments were not provided. One prime reason was that each ANN could only be analysed as a black box. This limitation is common to many other approaches, including verification and validation of ANNs [7],[8]. Many of these approaches have been reviewed in [9]. The problem is to deal with issues, particularly those related to learning and generalisation, in order to obtain desired (white-box) forms of analysis. The scope of this paper is to provide a summary of the work on the safety criteria [10],[11], a suitable ANN model [12],[13] and the safety lifecycle [13].

SAFETY CRITICAL SYSTEMS AND SAFETY ARGUMENTATION

Safety-critical systems are employed in many areas such as the transport industries [1], medicine [7] and defence [14]. Systems need to be carefully developed if they are to have some direct influence on the safety of the user and the public. Safety-critical systems are concerned with preventing incorrect operation that may lead to fatal or severe consequences. In other words, safety-critical systems must not directly or indirectly contribute to the occurrence of a hazardous system state. A system-level hazard is a condition that is potentially dangerous to man, society and the environment. Potentially hazardous states can be prevented through safety processes that aim to identify, analyse, control and mitigate hazards within the system. In general, systems can never be described as totally safe [15], whether they involve humans or computers. However, systems have been shown to be acceptably safe for a given role [16]. Through analysis aided and assured by safety processes, failures can be detected or controlled, with the risk of failure assured to a tolerable level [16]. The software safety lifecycle specifies where certain safety processes should be performed throughout the development of software systems. Within the software context, a hazard is a software-level condition that could give rise to a system-level hazard. The following is an outline of some of the major processes performed during the software safety lifecycle:

• Hazard Identification: A major activity at the start of the software lifecycle. This requires an element of in-depth knowledge about the system and inquisitively explores possible hazards. It may require consultation of a checklist of known hazards specific to the type of application (possibly from an initial hazard list).
• Functional Hazard Analysis (FHA): Analyses the risk, or the severity and probability, of potential accidents for each identified hazard. This is performed during the specification and design stages.
• Preliminary System Safety Analysis (PSSA): The purpose of this phase is twofold: to ensure that the proposed design will adhere to the safety requirements, and to refine the safety requirements and help guide the design process.
• System Safety Analysis (SSA): This process is performed at implementation, testing and other stages of development. Its main purpose is to gain evidence from these development stages to provide assurance that the safety requirements have been achieved.
• Safety Case: This final phase delivers a comprehensible and defensible argument that the software is acceptably safe to use in a given context. It is presented with the delivery and commissioning of the final software system.

The intentions of these safety processes are well established. They can be found in current safety standards such as ARP 4761 [17]. Current approaches for improving the safety of ANNs have been found to be inappropriate for safety-critical systems [9]. They lack meaningful safety argumentation, particularly about the functional properties of ANNs. Many neglect issues concerned with tackling typical safety concerns (such as hazard identification). The challenge is to devise a suitable ANN model that has the ability to provide analytical safety arguments whilst maintaining some of the desirable features outlined in the introduction of this paper. Generated safety arguments must then be organised and presented using the safety case.

TYPES OF SAFETY ARGUMENTS

Safety arguments for ANNs must be comparable in strength of assurance to those found for conventional software. There are two main types of safety arguments currently used for software systems, known as process-based and product-based arguments. Process-based arguments provide assurance purely on the fact that a certain process has been carried out. Previous work [18] attempted to devise process certification requirements for ANNs in safety-critical systems. This is a good example of how process-based arguments were over-emphasised. For example, many of the requirements dealt with issues such as team management and implementation (coding) issues (using formal methods). However, there was no clear reasoning on how the functional behaviour of the ANN had been made ‘safer’ (in terms of hazard mitigation and control). The role of analytical tools is highly important and will involve hazard analysis for identifying potential hazards. Some factors which prevent this type of analysis in ANNs can be demonstrated in typical monolithic neural networks. These distribute their function among the weights in a fashion that makes interpretation difficult, resulting in black-box views and pedagogical approaches to analysis [19]. The ability to argue more about the functional behaviour can be provided using product-based arguments. These are evidence-based and can show how safety issues, such as identifying potential hazards, are tackled. Product-based arguments are required to deliver certification based upon analytical methods. Process-based arguments are generally considered weak forms of assurance [20], and current practices and standards are working towards excluding them. The types of arguments suitable for ANNs will be product-based, with process-based arguments used only where a clear and defensible argument can be made about how they tackle typical safety concerns.

SAFETY CRITERIA FOR ARTIFICIAL NEURAL NETWORKS

Analytical product-based arguments about the functional behaviour of ANNs can be provided through the satisfaction of the safety criteria [10]. The safety criteria are a set of high-level goals devised by analysing aspects of current safety standards and the behaviour of ANNs (factors affecting safety). They define minimum behavioural properties which must be enforced for safety-critical contexts. Most of the criteria require suitable white-box style arguments to be completely satisfied. The safety criteria are presented in Figure 1 in the form of the Goal Structuring Notation (GSN) [21], which is commonly used for composing safety case patterns. The boxes illustrate goals or sub-goals which need to be fulfilled. Rounded boxes denote the context in which the corresponding goal is stated. The rhomboid represents a strategy to achieve goals. Diamonds underneath goals symbolise that further development is required, leading to supporting arguments and evidence. Figure 1 is best read using a top-down approach. The safety criteria consist of a top goal labelled G1. If G1 is achieved then the neural network can be considered ‘safe’ to perform some specified function in safety-critical contexts (particularly highly-dependable roles). To make the context of this goal clearer, consider the following contexts:

• Context C1 defines the specific neural network model being used.
• Context C2 requires ANNs to be used for a particular problem when conventional software methods are inappropriate (utilising the benefits of ANNs outlined in the introduction).
• Context C3 requires that ‘acceptably safe’ is determined by the manner in which the criteria are satisfied. This is related to the types of arguments used (such as product-based, white-box, analytical styles).

[Figure 1 is a GSN goal structure. The top goal G1 ("Neural network is acceptably safe to perform a specified function within the safety-critical context") is stated in contexts C1 (neural network model definition), C2 (use of the network in the safety-critical context must ensure specific requirements are met) and C3 (‘acceptably safe’ will be determined by the satisfaction of the safety criteria). Strategy S1 (argument over key safety criteria) decomposes G1 into sub-goals G2 ("Pattern matching functions for the neural network have been correctly mapped", in context C4: the function may partially or completely satisfy the target function), G3 ("Observable behaviour of the neural network must be predictable and repeatable", in context C5: known and unknown inputs), G4 ("The neural network tolerates faults in its inputs", in context C6: a fault is classified as an input that lies outside the specified input set) and G5 ("The neural network does not create hazardous outputs", in context C7: a hazardous output is defined as an output outside a specified set or target function).]

Figure 1: Preliminary Safety Criteria for Artificial Neural Networks

The strategy S1 will attempt to generate safety arguments from the sub-goals (which form the criteria) to fulfil G1 and provide a justification of each criterion. The goals G2 to G5 represent the safety criteria and are outlined below:

• Criterion G2: Pattern matching functions for the neural network have been correctly mapped.
  - This criterion provides assurance that the neural network represents the desired function.
  - The ‘function’ of the ANN may be considered as input-output mappings.
  - ‘Correct’ refers to whether the input-output mappings fall within the desired or target function.
  - Context C4 indicates partial or complete satisfaction of the desired function.
    - This is a more realistic condition if all hazards have been identified and mitigated for the subset.
    - It is suitable for applications where analysis is unable to determine whether total representation of the entire desired function has been achieved.
    - Previous work on dealing with and refining partial specifications for neural networks [22] may apply.
  - Forms of sub-goals or strategies for arguing G2 may involve using analytical methods such as decompositional approaches [23]. This will help analyse the function performed by the ANN and help present a white-box view of the network.



• Criterion G3: Observable behaviour of the neural network must be predictable and repeatable.
  - This criterion provides assurance that safety is maintained during ANN learning and training.
  - The ‘observable behaviour’ of the network means the input and output mappings that take place (and not the weights on every connection).
  - The term ‘repeatable’ provides assurance that any previous valid (or safe) mapping or output does not become flawed during learning. This is concerned with issues surrounding the ‘forgetting’ of previously learnt samples.
  - Potential safety arguments may be concerned with providing evidence that learning is controlled (through behavioural constraints identified by hazard analysis).



• Criterion G4: The neural network tolerates faults in its inputs.
  - This allows the safety of the network to be assured for all input conditions.
  - Possible flawed inputs may include samples that do not represent the target function.
  - The required satisfaction of this goal may depend on the application context (if assurance can be provided that inputs will not be flawed).
  - Possible safety arguments to achieve this goal may involve detecting and suppressing flawed inputs (a minimal guard sketch follows this list).

• Criterion G5: The neural network does not create hazardous outputs.
  - This is similar to G2 but focuses more upon the network output.
  - It provides assurance that the output is not hazardous regardless of the integrity of the input.
  - Possible forms of arguments may involve black-box approaches.
  - Solutions may involve output monitors or bounds (see the guard sketch below).
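As an illustration only, the following minimal sketch shows how a runtime guard in the spirit of criteria G4 and G5 might be structured. The valid input region, output bounds and fallback behaviour are assumptions for the example (in practice they would come from hazard analysis), not values prescribed by the criteria.

```python
import numpy as np

class SafetyGuard:
    """Illustrative runtime guard for criteria G4 (input faults) and G5 (hazardous outputs).

    The valid input region and output bounds are assumed to come from hazard
    analysis of the application; the values used below are placeholders.
    """

    def __init__(self, input_low, input_high, output_low, output_high):
        self.input_low = np.asarray(input_low)
        self.input_high = np.asarray(input_high)
        self.output_low = np.asarray(output_low)
        self.output_high = np.asarray(output_high)

    def input_is_valid(self, x):
        # G4 / context C6: a fault is an input lying outside the specified input set.
        x = np.asarray(x)
        return bool(np.all(x >= self.input_low) and np.all(x <= self.input_high))

    def output_is_safe(self, y):
        # G5 / context C7: a hazardous output is one outside the specified set.
        y = np.asarray(y)
        return bool(np.all(y >= self.output_low) and np.all(y <= self.output_high))

    def filter(self, x, ann_forward, fallback):
        """Suppress flawed inputs and reject hazardous outputs.

        `ann_forward` is the network's input-output mapping; `fallback` is an
        assumed application-specific safe action (e.g. a conservative default).
        """
        if not self.input_is_valid(x):
            return fallback          # suppress the flawed input (G4)
        y = ann_forward(x)
        if not self.output_is_safe(y):
            return fallback          # reject the hazardous output (G5)
        return y


# Example usage with placeholder bounds and a dummy network.
guard = SafetyGuard(input_low=[0.0, 0.0], input_high=[1.0, 1.0],
                    output_low=[0.0], output_high=[1.0])
dummy_ann = lambda x: np.array([0.5 * (x[0] + x[1])])
print(guard.filter([0.2, 0.9], dummy_ann, fallback=np.array([0.0])))
```

Such a guard supports only black-box style arguments (as noted for G5); the white-box arguments sought in this paper rest on analysing the represented rules themselves.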

The safety criteria are intended to be applicable to most types of neural networks. However, certain ANN models may be susceptible to new types of faults. In this case, appropriate criteria can be added to deal with these new faults. This allows the safety criteria to apply to neural networks that have different characteristics from typical ANNs (typical ANNs include multi-layer perceptrons).

CRITERIA ‘COMPLETENESS’

A fault is a deficiency or imperfection in the system. The focus of the safety criteria is to identify all possible types of faults leading to failures, and the types of failures that can occur. Failure modes are the effects by which failures are observed. The safety criteria can be argued in terms of the different modes of failure that they tackle. In the following justification some non-functional properties are also highlighted (including flow anomalies, spatial and temporal resource usage, etc.). A full list of functional and non-functional properties can be found in [11]. Here is a summary of some of the main faults and failure modes tackled by each criterion (an illustrative check for the resource-usage failure modes follows these listings):

Criterion G2 Faults:
1. One or more weights in the neural network are faulty.
   • The weights in the network are faulty (wrong value) such that the contribution of those weights to the creation of the final output leads to a function other than that desired.
2. The topology of the network is faulty.
   • The topology and structure (layers, number of neurons, arrangement and relations) do not allow learning of the desired function.
3. Activation functions are faulty.
   • Activations occur when they should not, or vice versa.
4. The connections are faulty.
   • In terms of existence and placement.

Criterion G2 Failure Modes:
• Given an input pattern within some valid area of the data space, the network output is not the desired output and can potentially lead to a hazard (during deployment).

Criterion G3 Faults:
1. One or more weights have changed during learning such that the network output changes from a ‘safe’ to a potentially hazardous signal.
   • This fault is associated with problems of ‘repeatability’ during learning (while deployed).
2. A neuron has been added and the memory limitation has been exceeded (non-functional).
3. Internal properties have changed such that the response time has exceeded an acceptable limit (non-functional).
4. Unsuitable learning algorithms and ANN structure result in training samples not being learnt.

Criterion G3 Failure Modes:
1. The network output is potentially hazardous for some input pattern, given that the network output was ‘non-hazardous’ at a previous network state (result of fault 1).
2. Memory overflow (result of fault 2).
   • Resource usage (space) – the network has attempted to exhibit architectural growth beyond memory capacity and has caused critical failure or infringed all other criteria.
3. Time-out problem given an input pattern (result of fault 3).
   • Resource usage (time) – the network has attempted to exhibit a change δ in response time, from the input vector entering the network to the output vector exiting the network, beyond an acceptable limit τ (i.e. where δ > τ).
4. Time-out problem given training samples (result of fault 4).
   • Resource usage (time) – the learning phase of the network has not been able to learn all required samples in the given available time, for example in a classification problem.

Criterion G4 Faults:
• The input vector is not within some valid area of the data space (flawed input sample).

Criterion G5 Faults:
• Given any input pattern, the network has a fault (black-box approach).

Criterion G4 & G5 Failure Modes:
• The network output is potentially hazardous, as a result of a flaw in the inputs or some fault in the neural network.
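To make the non-functional failure modes for G3 concrete, the following minimal sketch (not part of the original criteria) checks the response-time and growth limits described above; the limits, the timing mechanism and the measure of network size are assumptions chosen for illustration.

```python
import time

class ResourceMonitor:
    """Illustrative checks for the G3 non-functional failure modes:
    response time exceeding an acceptable limit (delta > tau) and
    architectural growth beyond an assumed memory budget."""

    def __init__(self, tau_seconds, max_neurons):
        self.tau = tau_seconds          # acceptable response-time limit tau (assumed)
        self.max_neurons = max_neurons  # assumed bound on network growth

    def timed_response(self, ann_forward, x):
        # Measure delta: input vector entering the network to output vector exiting it.
        start = time.perf_counter()
        y = ann_forward(x)
        delta = time.perf_counter() - start
        return y, delta, (delta <= self.tau)

    def growth_within_budget(self, neuron_count):
        # Reject architectural growth that would exceed the memory budget.
        return neuron_count <= self.max_neurons


# Example usage with placeholder values.
monitor = ResourceMonitor(tau_seconds=0.01, max_neurons=500)
output, delta, within_limit = monitor.timed_response(lambda x: [sum(x)], [0.1, 0.2, 0.3])
print(within_limit, monitor.growth_within_budget(neuron_count=120))
```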

A SUITABLE ANN MODEL

The previous section described how the criteria try to deal with many problems associated with typical ANNs. One of the key methodologies for satisfying the criteria is to avoid pedagogical (black-box) styles of analysis and to find ways of producing white-box style arguments. However, when attempting to generate white-box analytical arguments it is essential, at the same time, to preserve the ability to learn (outlined in the introduction). One particular model that has the potential to generate the desired safety arguments is the well-established ‘hybrid’ ANN [24]. The term ‘hybrid’ in this case refers to representing symbolic information within a neural network structure.

Figure 2: Framework for combining symbolic and neural paradigms [25]

Consider the example illustrated in Figure 2. This diagram is divided into three columns:

• Knowledge / Data: All references to symbolic data and training data are kept here. Symbolic knowledge may be in the form of logic or if-then rules.
• Process: This column encapsulates the different processes that take place. This may involve translation algorithms which can convert rules into network neurons and weights (and vice versa). It also encapsulates learning algorithms.
• Neural Network: This holds some of the major states of the ANN. Typically this may include ANNs with initial conditions or post-learning states.

The typical manner in which ‘hybrid’ ANNs are used is described within the columns of Figure 2. Suppose we want to use an ANN to solve some problem. Domain experts may attempt to gather knowledge and represent it in the form of rules. These rules may not be complete and may be incorrect as a result of insufficient prior knowledge. This set of initial knowledge is then processed using a rules-to-network algorithm which inserts it into a suitable structure. The structure of the ANN (such as topology, weights and connections) is determined by the insertion process. Once insertion is complete, the result is an initial neural network that represents all the initial symbolic knowledge. The network then goes through a process of learning using training data. The aim is to evolve or refine the rules given variations within the data domain. The rules may expand in number or may be refinements of existing ones. Once training is complete, the rules are extracted (using suitable rule extraction algorithms). The final set of rules may be different from the initial set and correspond to changes in the data. There are many advantages to this approach. The ‘hybrid’ ANN has been shown to outperform many all-symbolic techniques in terms of generalisation and the rules that it embodies [24]. In terms of safety, the ‘hybrid’ ANN offers the potential for analysis using a decompositional approach. This is facilitated by the ordered manner in which rules are represented within the ANN. It also provides the potential for transparency, or white-box style analysis. This can be achieved through rule extraction algorithms, which are much easier to implement than for conventional neural networks. These advantages can result in potentially strong analytical arguments [12]. However, the learning process still needs to be controlled using appropriate mechanisms if it is to be used whilst the ANN is deployed. So far, work is being performed to identify a suitable ‘hybrid’ ANN model [12].
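As an illustration of the rules-to-network step, the sketch below follows the broad idea behind knowledge-based neural networks such as [24]: each if-then rule becomes a neuron whose incoming weights encode its antecedents, with a bias chosen so the neuron fires only when the antecedents are satisfied. The weight magnitude, rule format and single-layer layout here are simplifying assumptions for the example, not the specific algorithm used in this work.

```python
import numpy as np

# Assumed rule format: consequent <- conjunction of (possibly negated) antecedents.
# Placeholder initial knowledge from a domain expert.
FEATURES = ["a", "b", "c"]
RULES = [
    {"consequent": "alarm", "antecedents": [("a", True), ("b", True)]},  # alarm <- a AND b
    {"consequent": "alarm", "antecedents": [("c", False)]},              # alarm <- NOT c
]
OMEGA = 4.0  # assumed weight magnitude for inserted knowledge


def insert_rules(rules, features, omega=OMEGA):
    """Translate if-then rules into one hidden layer of weights and biases.

    Each rule becomes one hidden unit: +omega for a positive antecedent,
    -omega for a negated one, and a bias that makes the unit act as an AND gate.
    """
    weights = np.zeros((len(rules), len(features)))
    biases = np.zeros(len(rules))
    for i, rule in enumerate(rules):
        n_positive = 0
        for feature, positive in rule["antecedents"]:
            j = features.index(feature)
            weights[i, j] = omega if positive else -omega
            n_positive += int(positive)
        # The unit fires only when all antecedents hold (threshold just below n_positive).
        biases[i] = -omega * (n_positive - 0.5)
    return weights, biases


def forward(x, weights, biases):
    """Sigmoid activation of the rule units; learning would later refine these weights."""
    return 1.0 / (1.0 + np.exp(-(weights @ x + biases)))


w, b = insert_rules(RULES, FEATURES)
print(forward(np.array([1.0, 1.0, 0.0]), w, b))  # both rules satisfied -> activations near 1
print(forward(np.array([0.0, 1.0, 1.0]), w, b))  # neither satisfied -> activations near 0
```

After training, a matching rule extraction step (for example, thresholding the refined weights back into antecedents) would recover the modified rule set for symbolic-level analysis.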

SAFETY LIFECYCLE FOR ‘HYBRID’ NEURAL NETWORKS

The current software development lifecycle is inadequate for artificial neural networks. For example, the specification, design, implementation and testing phases do not support the manner in which ANNs are developed, including the extra activities involved (such as data processing). With the unique development lifecycle of ‘hybrid’ ANNs it is not obvious where the required safety processes should be initiated. Moreover, adaptations of existing safety processes may need to be made to deal with symbolic and neural representations. There are very few existing ANN development lifecycles for safety-related applications [26],[27]. One particular lifecycle [26] is directly intended for safety-critical applications; however, there are several problems associated with its approach. One problem is that it relies on determining the specification and behaviour at the initial phase of development. This is not practical, since the prime motivation for ANN learning is to determine the behaviour given very limited initial data. Another problem is that it over-emphasises control over non-functional properties. Typical examples of non-functional properties include usage of spatial or temporal resources. Instead, the focus should be upon constraining functional properties such as the learning or the behaviour of the ANN. Documenting each development phase (process-based arguments) is generally regarded as a weak argument for providing assurance for conventional software [20]. However, such arguments are used extensively throughout the ANN lifecycle of [26]. Figure 3 illustrates a development and safety lifecycle for ‘hybrid’ ANNs in the form of a ‘W’ model [13]. This diagram is divided into three main levels:

1. Symbolic Level: This level is associated only with symbolic information. It is separated from the neural learning paradigm and deals with analysis in terms of symbolic knowledge. Typical examples may include the gathering and processing of initial knowledge. Other uses may involve evaluating extracted knowledge gathered post-learning.
2. Translation Level: This is where symbolic knowledge and neural architectures are combined or separated. This is achieved through processes such as rule insertion [25] or extraction algorithms [23]. All transitions from the symbolic level to the neural learning level involve rule insertion algorithms. All transitions from the neural learning level to the symbolic level involve rule extraction algorithms.
3. Neural Learning Level: This level uses neural learning to modify and refine symbolic knowledge. Neural learning is performed by using specific training samples along with suitable learning algorithms [25],[28],[29].

Figure 3: Safety Lifecycle for Hybrid Neural Networks

The following points follow the ‘W’ model and outline the major stages in the development lifecycle (a sketch of the two-tier learning constraint follows this list):

• Determination of requirements: These requirements describe in informal terms the problem to be solved. Whereas requirements for conventional software are intentionally complete, the requirements for ANNs are intentionally incomplete. As is typical for real-world problems, ‘incompleteness’ may be a result of insufficient data or knowledge prior to development.
• Sub-initial knowledge: All known knowledge is translated into logical rules (by domain experts).
• Initial knowledge: Sub-initial knowledge is converted into symbolic forms compatible for translation into the network structure. A compatible symbolic language has been defined in [24].
• Dynamic Learning: Once the initial symbolic knowledge has been inserted or translated into a suitable ANN, a two-tier learning process commences. This uses suitable learning algorithms [12] to refine the initial symbolic knowledge and add new rules. It uses some training data set and attempts to modify the network to reduce error in the output. This may result in topological changes to the network (adding new hidden neurons to represent new rules).
• Refined Symbolic Knowledge: Knowledge refined by dynamic learning is extracted using appropriate extraction algorithms. This results in a new set of rules which can be analysed at the symbolic level.
• Static Learning: Refined symbolic knowledge may be modified by domain experts and re-inserted into the ANN structure. It then goes through a process of static learning. This further refines the knowledge in the ANN but does not allow topological or architectural changes. Instead, the learning process concentrates on each rule (such as adding, removing and inverting antecedents). This learning process can be performed during deployment whilst adhering to the safety criteria [12].
• Knowledge Extraction: This can be performed at any time during static learning to obtain a modified rule set.
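The distinction between dynamic and static learning can be pictured as a constraint on which modifications the learning process is allowed to make. The sketch below is purely illustrative and assumes a simple set-of-antecedents rule representation; it is not the learning algorithm of [12], only an indication of how static learning could forbid topological change while still allowing antecedents to be adjusted.

```python
# Illustrative constraint on the two-tier learning process: dynamic learning may add
# hidden neurons (new rules); static learning may only modify antecedents of existing rules.

class HybridAnnState:
    def __init__(self, rules):
        self.rules = list(rules)  # each rule is a set of (feature, positive?) antecedents

    def add_rule(self, rule, mode):
        if mode == "static":
            # Topological/architectural change is forbidden during static learning.
            raise ValueError("static learning may not add new rules (no topology change)")
        self.rules.append(set(rule))          # dynamic learning: new hidden neuron / rule

    def modify_antecedents(self, rule_index, add=(), remove=()):
        # Allowed in both modes: add, remove or invert antecedents of an existing rule.
        self.rules[rule_index] |= set(add)
        self.rules[rule_index] -= set(remove)


# Example usage with placeholder rules.
state = HybridAnnState([{("a", True), ("b", True)}])
state.add_rule({("c", False)}, mode="dynamic")          # permitted: dynamic learning
state.modify_antecedents(0, add={("c", True)})          # permitted: static refinement
try:
    state.add_rule({("d", True)}, mode="static")        # rejected: would change topology
except ValueError as err:
    print(err)
```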

Given the vastly different approach to developing ‘hybrid’ ANNs as compared with software, it is not obvious where each safety process should be carried out. The aim of the processes is to identify, analyse, control and mitigate potential hazards (and associated faults). Although a detailed description of each process is beyond the scope of this paper, an outline is presented below.

Preliminary hazard identification (PHI) deals with the initial symbolic knowledge and is used to determine the initial conditions of the network. PHI is performed over a relatively meaningful representation (rules) and attempts to understand potential hazards in real-world terms. In general, it attempts to understand the problem as much as possible, including identification of potential hazards, and to generate possible system-level hazards (using a black-box approach). This may result in a set of rules partially fulfilling the desired function. It utilises knowledge gathered from domain experts, empirical data and other sources.

The next step is to perform Functional Hazard Analysis (FHA) over the sub-initial symbolic knowledge. FHA is a predictive, systematic, white-box style technique used to understand how the symbolic knowledge can lead to hazards. FHA also builds an understanding of the criticality of certain rules and can facilitate ‘targeted’ training. For example, specific training sets may be devised to focus on certain important rules or areas of concern. FHA may also make rule assertions to prevent specific hazards. The result of FHA at this stage is the initial symbolic knowledge that can be translated. FHA is also performed during dynamic learning, between training sets. FHA now uses an evaluative approach to analyse rules modified by dynamic learning (through rule extraction). It determines the existence of potential hazards and will do the following (a sketch of this evaluate-and-retrain loop is given after the list):

1. Manually modify the rule set: through manual assertions to mitigate or control identified hazards.
2. Initiate further training: devise new training sets to reflect desired directions of learning. This can also be viewed as ‘guiding’ the design process, and can be iterated until the desired performance is achieved (determined through analytical techniques).
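The following minimal sketch shows one way the evaluative FHA loop during dynamic learning might be organised; the hazard check, correction, training-set generation and extraction steps are placeholder functions standing in for the analyses described above, not an implementation of them.

```python
# Illustrative outline of the FHA-driven loop during dynamic learning:
# extract rules, analyse them for potential hazards, then either assert
# corrections manually or devise further targeted training.

def fha_guided_training(network, extract_rules, analyse_hazards,
                        assert_corrections, devise_training_set, train,
                        max_iterations=10):
    for _ in range(max_iterations):
        rules = extract_rules(network)                  # rule extraction between training sets
        hazards = analyse_hazards(rules)                # evaluative FHA over the extracted rules
        if not hazards:
            return network, rules                       # desired (partial-but-safe) function reached
        network = assert_corrections(network, hazards)  # 1. manual rule assertions
        targeted_data = devise_training_set(hazards)    # 2. further 'targeted' training
        network = train(network, targeted_data)
    raise RuntimeError("desired performance not achieved within the iteration budget")
```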

It is important to remember that the function (or rule set represented by the ANN) may only partially satisfy the desired function. This is a realistic assumption, since it may be infeasible to determine whether totality of the desired function has been achieved (supporting the motivation for neural learning). Certification can be provided based upon this ‘partial-but-safe’ approach. Although the approach so far deals with mitigating hazards associated with the rule set and improving performance, it must also control future learning. FHA is performed again at the post-dynamic-learning stage. This now uses an exploratory safety analysis technique to identify how rules may be allowably refined during static learning. FHA is based upon the pre-consideration of possible adaptations of rules. This involves analysing the symbolic knowledge of the ANN and determining how potential hazards may be introduced (if parts of rules are modified). Measures to mitigate potential hazards are then incorporated into the network by exploiting knowledge-insertion algorithms to translate rule conditions into network weights. The safety lifecycle for ‘hybrid’ ANNs respects the development methodologies of the ANN model whilst considering the safety criteria. It adapts software safety processes to deal with particular safety concerns. The important advantage is that this lifecycle contributes to generating acceptable product-based analytical safety arguments [13].

POTENTIAL SAFETY ARGUMENTS

Many potential safety arguments can be generated using a suitable ‘hybrid’ ANN and safety lifecycle. Although a full assessment is beyond the scope of this paper, some of the main arguments for the ‘hybrid’ ANN model are as follows:

• Analysing the rules represented by the network can contribute to satisfying criterion G2 (provision of function).
• The two-tier learning model allows working towards the desired function whilst potentially controlling learning (criterion G3).
• Translation algorithms and data representation greatly contribute to white-box style analysis over the ANN.

Some potential safety arguments arising from the use of the safety lifecycle are as follows:

• PHI can provide assurance that potential hazards have been identified in the initial set of knowledge.
• Hazard analysis will provide the strongest form of safety arguments. One aspect is guiding training based upon the criticality of rules and desired directions of learning.
• Hazard analysis can also determine faults that could possibly lead to hazards.
• Hazards within rules represented in the network may be identified using rule extraction.
• Hazard analysis can provide assurance that all hazards have been identified and mitigated.
• Hazard analysis (FHA) can also provide assurance for learning that takes place post-certification. This is performed by analysing knowledge in the ANN and incorporating required constraints.

These potential contributions to safety arguments are only a few of the possible types. The main emphasis is that black-box arguments do not need to be relied upon for certifying neural networks. The types and forms of safety arguments presented in this paper are more acceptable and comparable to those found for software safety assurance. Whilst providing suitable forms of safety assurance, the ‘hybrid’ ANN also maintains the ability to learn post-certification. Neural learning is one of the major motivations for adopting the ANN approach. By preserving the learning capability, some of the main motivations are maintained without compromising on safety.

CONCLUSION

The absence of analytical certification methods has typically restricted ANNs to advisory roles in safety-related systems. Many of the existing techniques for improving the safety of ANNs in safety-critical systems do not provide the necessary forms of safety arguments. This work has attempted to overcome typical problems by first outlining safety criteria for the functional behaviour of neural networks. These have been devised by considering some of the major faults associated with ANNs. The paper then presented a potential model, known as the ‘hybrid’ ANN, which combines symbolic and neural paradigms. The advantage of this approach is a more analysable and controllable representation. The ‘hybrid’ ANN also tackles the challenge of maintaining the advantages gained from learning by incorporating the necessary controls and constraints. The safety lifecycle provides a framework for developing ANNs in safety-critical applications and focuses on identifying, analysing, controlling and mitigating hazards. The approach for utilising ‘hybrid’ ANNs is very flexible. For example, other domain-specific base functions (such as image filters for image processing applications [30]) may be used instead of rules. Appropriate translation algorithms will also need to be derived for other domain-specific base functions, without the need to modify the safety lifecycle. Some of the main potential safety arguments resulting from these approaches have also been discussed. These potential safety arguments demonstrate the potential for using ANNs in highly-dependable roles in safety-critical systems.

REFERENCES

[1] E. N. Johnson, A. J. Calise, and J. E. Corban, "Adaptive Guidance and Control for Autonomous Launch Vehicles," Georgia Institute of Technology & Guided Systems Technologies, Inc., 2000.
[2] M. Sordo, H. Buxton, and D. Watson, "A Hybrid Approach to Breast Cancer Diagnosis," School of Cognitive and Computing Sciences, University of Sussex, 2001.
[3] P. Lisboa, "Industrial use of safety-related artificial neural networks," Health & Safety Executive 327, 2001.
[4] IEC, "61508: Functional Safety of Electrical / Electronic / Programmable Electronic Safety-Related Systems," International Electrotechnical Commission, 1999.
[5] D. Partridge, "Engineering Multiversion Reliability in Neural Networks - Producing Dependable Systems," ERA Technology Ltd 97-0365, 1997.
[6] A. J. C. Sharkey, N. E. Sharkey, and O. C. Gopinath, "Diversity, Neural Nets and Safety Critical Applications," Computer Science, University of Sheffield, 1995.
[7] I. Nabney, "Validation of Neural Network Medical Systems," Workshop on Regulatory Issues in Medical Decision Support, October 2001.
[8] R. E. Saeks, C. J. Cox, W. J. Sefic, and L. P. Graviss, "Verification and Validation of a Neural Network Flight Control System," Accurate Automation Corporation, USA, 1997.
[9] Z. Kurd, "Artificial Neural Networks in Safety-critical Applications," First Year Dissertation, Department of Computer Science, University of York, 2002.
[10] Z. Kurd and T. P. Kelly, "Establishing Safety Criteria for Artificial Neural Networks," to appear in Seventh International Conference on Knowledge-Based Intelligent Information & Engineering Systems (KES'03), Oxford, UK, 2003.
[11] Z. Kurd and T. P. Kelly, "Establishing Safety Criteria for Artificial Neural Networks," Department of Computer Science, University of York, York, Internal Report, 2002.
[12] Z. Kurd and T. P. Kelly, "Proposal for Developing Safety-critical Artificial Neural Networks," Department of Computer Science, University of York, York, Internal Report, January 2003.
[13] Z. Kurd and T. P. Kelly, "Safety Lifecycle for Developing Safety-critical Artificial Neural Networks," to appear in 22nd International Conference on Computer Safety, Reliability and Security (SAFECOMP'03), 23-26 September 2003.
[14] DARPA, DARPA Neural Network Study, AFCEA International Press, 1988.
[15] N. Storey, Safety Critical Computer Systems, Harlow: Addison-Wesley, 1996.
[16] R. Bell and D. Reinert, "Risk and system integrity concepts for safety-related control systems," Microprocessors and Microsystems, vol. 17, pp. 13-15, 1993.
[17] SAE, "ARP 4761 - Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment," The Society of Automotive Engineers, December 1996.
[18] D. F. Bedford, G. Morgan, and J. Austin, "Requirements for a Standard Certifying the use of Artificial Neural Networks in Safety Critical Applications," presented at the Proceedings of the International Conference on Artificial Neural Networks, 1996.
[19] C. C. Klimasauskas, "Neural nets tell why," Dr. Dobb's Journal, pp. 16-24, April 1991.
[20] R. A. Weaver, J. A. McDermid, and T. P. Kelly, "Software Safety Arguments: Towards a Systematic Categorisation of Evidence," presented at the International System Safety Conference, Denver, CO, 2002.
[21] T. P. Kelly, "Arguing Safety - A Systematic Approach to Managing Safety Cases," Ph.D. Thesis, Department of Computer Science, University of York, 1998.
[22] W. Wen, J. Callahan, and M. Napolitano, "Towards Developing Verifiable Neural Network Controller," Department of Aerospace Engineering, NASA/WVU Software Research Laboratory, West Virginia University, 1996.
[23] R. Andrews, J. Diederich, and A. Tickle, "A survey and critique of techniques for extracting rules from trained artificial neural networks," Neurocomputing Research Centre, Queensland University of Technology, 1995.
[24] G. Towell and J. W. Shavlik, "Knowledge-Based Artificial Neural Networks," Artificial Intelligence, pp. 119-165, 1994.
[25] J. W. Shavlik, "A Framework for Combining Symbolic and Neural Learning," Computer Science Department, University of Wisconsin, Madison, Tech. Rep. 1123, 1992.
[26] D. M. Rodvold, "A Software Development Process Model for Artificial Neural Networks in Critical Applications," presented at the Proceedings of the 1999 International Conference on Neural Networks (IJCNN'99), Washington D.C., July 1999.
[27] I. Nabney, M. J. S. Paven, R. C. Eldridge, and C. Lee, "Practical Assessment of Neural Network Applications," Aston University & Lloyd's Register, UK, 2000.
[28] I. Taha and J. Ghosh, "A Hybrid Intelligent Architecture and Its Application to Water Reservoir Control," Department of Electrical and Computer Engineering, University of Texas, 1995.
[29] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[30] B. J. Zwaag and L. Spaanenburg, "Analysis of Neural Networks in Terms of Domain Functions," presented at the 3rd IEEE Benelux Signal Processing Symposium (SPS-2002), Leuven, Belgium, 2002.