
Artificial Neural Network Based Intrusion Detection Techniques

Mohamed Abdel-Azim1, Abdel-Fatah Ibrahim1 & Mohamed Awad2

1 Communications Engineering Department, Faculty of Engineering, Mansoura University
2 SCADA & Telecommunication Dept. Head, GASCO, Egypt

Abstract- Intrusion detection is the art of detecting computer abuse and any attempt to break into networks. As a field of research, it must continuously evolve to keep up with new types of attacks, new adversaries and the ever-changing environment of the Internet. To render networks more secure, intrusion-detection systems aim to recognize attacks within the constraints of two major performance considerations: a high detection rate and a low false-alarm rate. It is also not enough to detect already-known intrusions; yet-unseen attacks, or variations of known ones, present a real challenge in the design of these systems. Intrusion-detection systems are firmly entrenched on the security front, but the exact role they can play, and what their deployment entails, must be clear to security planners. They implement a solution to a multifaceted problem, and to be efficient in a given environment they may require a combination of detection methods.

I. INTRODUCTION

In recent years the size of the Internet and the volume of its traffic have grown steadily. This expansion, and the general increase in computerization, has been accompanied by a rise in computer misuse and attacks on networks. No software will ever be perfect, and even minor bugs can leave systems vulnerable to costly malicious damage. Preventing such crime entirely is impossible, so monitoring and detection are resorted to as the best alternative line of defense. The implementation of this process, called intrusion detection, is the art of detecting computer misuse and any attempt to break into networks. It is performed with the aid of dedicated software and hardware systems operating on security logs, audit data or behavior observations. Important new ways to handle these data were found, firstly, in data warehousing, which keeps them ordered and accessible, and, secondly, in data mining, which extracts new hidden meanings from them, thus increasing their value as a resource. Intrusion-detection systems (IDSs) also need to process very large amounts of audit data and are mostly based on hand-crafted attack patterns developed by manual encoding of expert knowledge [1].

A. What is Intrusion Detection?

With the increase of attacks on computers and networks in recent years, improved and essentially automated surveillance has become a necessary addition to IT security. Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of intrusions [1]. Intrusions are defined as attempts to compromise the confidentiality, integrity and/or availability of a computer or network, or to bypass its security mechanisms. They are caused by attackers accessing a system from the Internet, by authorized users of the system who attempt to gain additional privileges for which they are not authorized, and by authorized users who misuse the privileges given to them.
An IDS is the software and hardware implementation that automates this monitoring and analysis process.

B. Main Benefits and Characteristics

The main benefits of an IDS include: (i) detecting attacks and other security violations that have not been prevented by other primary protection techniques (firewalls, authentication, etc.); (ii) preventing problem behaviors by increasing the perceived risk of discovery and punishment for those who would attack or otherwise abuse the system; (iii) detecting probes (to gain unauthorized access to a network) and port scanning, in time to deal with an attack; (iv) presenting traces of intrusions, allowing improved diagnosis, recovery and corrective measures after an attack; (v) documenting the existing threat from inside and outside a system, permitting security management to assess risk realistically and adapt its security strategy in response; and (vi) acting as quality control for security design and implementation, highlighting deficiencies or errors before serious incidents occur [1-4]. Much work based on various mechanisms has been done to implement these features, so that over 150 commercial, freeware and shareware IDS products are now available to users. To facilitate evaluation of these solutions, Purdue University's IDS research project put forward a list of characteristics of a good system: (i) it must run continually without human supervision, and be reliable enough to run in the background of the system being observed; however, it should not be a "black box", that is, its internal workings should be examinable from outside; (ii) it must be fault tolerant, in the sense that it must survive a system crash without losing its knowledge base at restart; (iii) on a similar note, it must resist subversion, and may monitor itself to ensure that it has not been subverted; (iv) it must impose minimal overhead on the system, since a system that slows a computer to a crawl will simply not be used; (v) it must observe deviations from normal behavior; (vi) it must be easily tailored to the system in question, since every system has a different usage pattern and the defense mechanism should adapt easily to these patterns; (vii) it must cope with changing system behavior over time as new applications are added, since the system profile will change and the IDS must be able to adapt; and (viii) it must be difficult to fool [6-8].

C. Distinguishing between the Major Types of IDS

In the field of intrusion detection, D. Denning's work from the mid-80s [2] is considered seminal, providing basics that still hold today. Intrusion detection is a relatively new but complex system function, and integrating it into an existing security infrastructure brings many challenges [3]. To add to the confusion, a wide range of IDS types are either already available in numerous product forms or currently an area of active research [4, 5, 6]. The major types of IDS are characterized by their different approaches to monitoring and analysis.

To help distinguish between different types of IDS, a general process model [7] can be defined in terms of three fundamental functional components, as shown in Fig.1: (i) event information, drawn from different sources and used to determine whether an intrusion has taken place; this information can be extracted from different levels of the system, with network, host and application monitoring being the most common; (ii) analysis, the part of the IDS that actually organizes and makes sense of the events derived from the information sources, deciding when those events indicate that intrusions are occurring or have already taken place; the most common analysis approaches are misuse detection and anomaly detection; and (iii) response: once the IDS has obtained event information and analyzed it for symptoms of an attack, it generates a response. There are two main types of response: (i) passive responses provide information in the form of alarms or notifications to administrators, who may then decide on appropriate action; and (ii) active responses involve the collection of supplementary information or even a change of the environment to block an attacker's activity.

[Figure: event information from a source (host, network or application) → analysis (misuse or anomaly) → response (action taken on detection); input → activity → output.]

Fig.1 Process model of intrusion detection
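The three-stage model of Fig.1 can be sketched in a few lines of Python. The component names, record layout and the trivial signature check below are illustrative assumptions for this sketch only; they are not an interface defined in the paper.

```python
# Sketch of the Fig.1 process model: event source -> analysis -> response.
# Names and the payload layout are illustrative assumptions, not from the paper.
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Event:
    source: str    # "host", "network" or "application"
    payload: dict  # raw audit/traffic attributes

def analyze(event: Event, signatures: set) -> bool:
    """Misuse-style analysis: flag the event if a known signature matches."""
    return event.payload.get("signature") in signatures

def respond(event: Event, is_intrusion: bool) -> str:
    """Passive response: produce an alarm message for the administrator."""
    return f"ALERT from {event.source}" if is_intrusion else "ok"

def ids_pipeline(events: Iterable[Event], signatures: set) -> list:
    """Run each event through analysis and collect the responses."""
    return [respond(e, analyze(e, signatures)) for e in events]
```

An anomaly-based analysis stage would replace `analyze` with a comparison against a learned profile of normal behavior; the surrounding pipeline is unchanged.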

D. Network-Based vs. Host-Based IDSs

A network-based IDS monitors network traffic by capturing and analyzing network packets. The placement of network intrusion-detection sensors is critical in reducing identified risks. A sensor placed outside the network, beyond the protection of access-control devices such as firewalls or routers, can report more than one placed inside the network. An internal sensor should, however, detect events indicating possibly serious security breaches originating inside the network. Advantages of network-based IDSs: (i) a few strategically placed sensors can monitor a whole network; (ii) the deployment of these systems has little impact on the existing network; (iii) they have little effect on normal network operation and are relatively easy to upgrade; and (iv) they are robust in the face of attacks and can be made invisible to attackers. On the other hand, their disadvantages are: (i) during peak-traffic periods some packets may go unprocessed and attacks undetected; (ii) if the network is switched, the integrated switching system rarely provides efficient universal monitoring of its ports, so the IDS cannot monitor all traffic; (iii) encrypted information cannot be analyzed; (iv) attack attempts may be detected, but hosts must usually then be investigated manually to determine whether they were penetrated and damage was caused; and (v) attacks involving fragmentation of packets can cause these IDSs to crash. A host-based IDS monitors the network traffic of a particular host and some system events on the host itself. One may be installed on each host, or simply on some chosen critical ones within a network. Advantages of host-based IDSs are: (i) some local events on hosts can only be detected by this type of IDS; (ii) raw data are available for analysis in non-encrypted form; (iii) they are unaffected when the network is switched; and (iv) software integrity checks can be used in the detection of certain types of attack (e.g. Trojan horses). Their disadvantages are: (i) they are more complex to manage; (ii) they may be disabled if the host is attacked and compromised; (iii) they are not suitable for network attacks involving distributed scans and probes; (iv) they can be disabled by overload attacks (e.g. denial of service); (v) for large amounts of information to be processed, local storage may be necessary; and (vi) they use the host's own computing resources, at a cost to performance. Application-based IDSs are a special subset of host-based IDSs, which can perform more detailed analysis of events occurring within a software application. Systems may combine both host-based and network-based features to produce a hybrid IDS.

E. Real-Time Operation

Real-time IDSs operate on continuous inputs from information sources. This mode is predominantly used in network-based IDSs, which work on network traffic streams. When results are yielded quickly, the IDS may take action to influence the progress of a recognized attack. Real-time operation can also be achieved in host-based IDSs, but in multi-host systems this implies significant intercommunication overheads (typically involving software agents to extract higher-level knowledge from raw data).

II. OVERVIEW OF DETECTION TECHNIQUES

There are two main techniques used to analyze events and to detect attacks: misuse and anomaly detection.

A. Misuse Detection

Misuse detection depends on the pre-recording, or prior representation, of specific patterns of intrusion, allowing any matches to them in current activity to be reported. Patterns corresponding to known attacks are called signatures, giving rise to the term 'signature-based detection'. These systems are not unlike virus-detection systems; they can detect many known attack patterns, and even variations thereof, but are likely to miss new attack methods.
Regular updates with previously unseen attack signatures are necessary.

B. Anomaly Detection

Anomaly detection identifies abnormal behavior. It requires the prior construction of profiles for the normal behavior of users, hosts or network connections; historical data are therefore collected over a period of normal operation. Sensors then monitor current event data and use a variety of measures to distinguish abnormal from normal activity. These systems are prone to false alarms, since user behavior may be inconsistent and threshold levels remain difficult to fine-tune. Maintenance of profiles is also a significant overhead, but these systems are potentially able to detect novel attacks without specific knowledge of their details. It is essential that the normal data used for characterization are free from attacks.

C. Performance Indices

Important measures of the efficiency of network IDSs are the false-alarm rates: (i) the percentage of time-consuming false positives registered (normal data falsely detected as an intrusion), and (ii) the percentage of the more dangerous false negatives (intrusions falsely classified as normal data). Here, false negatives denote computer attacks that have escaped detection; false intrusion alarms are always referred to as false positives. The attack detection rate alone, however, only indicates the types of attacks that the IDS may detect. Such measurements do not indicate the human workload required to analyze false alarms generated by normal background traffic. False-alarm rates in the hundreds per day make a system almost unusable, even with high detection accuracy, because detections and alerts must be verified, requiring security analysts to spend many hours each day dismissing false alarms [13]. Low false-alarm rates combined with high detection rates mean the detection outputs can be trusted.

III. DATA COLLECTION

A. DARPA IDS Evaluation Datasets

The Defense Advanced Research Projects Agency (DARPA) intrusion-detection evaluation datasets were the original source of the data most directly relevant to this work. For the 1998 DARPA datasets, 7 weeks of training data were accumulated from the multi-system testbed, representing basically normal operation spiced with a series of automatically or manually launched attacks. A further 2 weeks of test data were collected, containing additional new and novel intrusions.

B. Dataset Description

The 'Knowledge Discovery and Data Mining (KDD) Cup 1999 Data' are the datasets that were issued for use in the KDD '99 Classifier-Learning Competition. These sets of training and test data were made available by Stolfo and Lee [15] and consist of a pre-processed version of the 1998 DARPA evaluation data. This team's IDS had performed particularly well in the Intrusion-Detection Evaluation Program of that year, using data mining even as a preprocessing stage to extract characteristic intrusion features from raw TCP/IP audit data [14].
The original raw training data were about 4 GBytes of compressed binary tcpdump data obtained from the first seven weeks of network traffic at MIT. These were pre-processed with the feature-construction framework MADAM-ID to produce about 5×10^6 connection records. A connection is defined as a sequence of TCP packets starting and ending at well-defined times, between which data flow to and from a source IP address to a destination IP address under some well-defined protocol. Each connection is labeled either as 'normal' or with the name of its specific attack type. A connection record consists of about 100 bytes. Ten percent of the complementary 2 weeks of test data were likewise preprocessed, yielding just under half a million further connection records. For the information of contestants, it was stressed that these test data were not drawn from the same probability distribution as the training data and that they included specific attack types not found in the training data. The full set of labeled test data, with some two million records, was not included in this dataset.

IV. ATTACK CATEGORIZATION

A. The Four Main Categories

The simulated attacks were classified according to the actions and goals of the attacker. Each attack type falls into one of the four following main categories: (i) Denial-of-service (DoS) attacks have the goal of limiting or denying service(s) provided to a user, computer or network; a common tactic is to severely overload the targeted system (e.g. a SYN flood). (ii) Probing or surveillance attacks have the goal of gaining knowledge of the existence or configuration of a computer system or network; port scans or sweeps of a given IP-address range are typical of this category (e.g. IPsweep). (iii) Remote-to-Local (R2L) attacks have the goal of gaining local access to a computer or network to which the attacker previously had only remote access; examples are attempts to gain control of a user account. (iv) User-to-Root (U2R) attacks have the goal of gaining root or super-user access on a particular computer or system on which the attacker previously had only user-level access; these are attempts by a non-privileged user to gain administrative privileges (e.g. Eject). A total of 24 attack types were included in the training data.

V. KDD FEATURES

In the KDD '99 data, the initial features extracted for a connection record include the basic features of an individual TCP connection, such as its duration, protocol type, number of bytes transferred, and the flag indicating the normal or error status of the connection [9]. These intrinsic features provide information for general network-traffic analysis purposes [10-12]. Since most DoS and Probe attacks involve sending many connections to the same host(s) at the same time, they can exhibit frequent sequential patterns that differ from normal traffic. To capture these patterns, the same-host features examine all other connections in the previous 2 seconds that had the same destination as the current connection. Similarly, the same-service features examine all other connections in the previous 2 seconds that had the same service as the current connection. These temporal and statistical characteristics are referred to as time-based traffic features.
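As an illustration, the same-host "count" feature described above can be sketched as follows. The connection-record layout (timestamp, destination host) is an assumption made for the sketch; a production implementation would maintain a sliding window rather than this quadratic rescan.

```python
# Sketch of a time-based traffic feature (the KDD "count" feature): for each
# connection, count the earlier connections within the preceding 2 seconds
# that share its destination host. Record layout is assumed for illustration.
def count_same_host(connections, window=2.0):
    """connections: list of (timestamp, dst_host) tuples, sorted by timestamp.
    Returns one count per connection."""
    counts = []
    for i, (t, dst) in enumerate(connections):
        n = 0
        for t_prev, dst_prev in connections[:i]:
            if t - t_prev <= window and dst_prev == dst:
                n += 1
        counts.append(n)
    return counts
```

The same-service features are computed identically with the service field in place of the destination host, and the host-based variants replace the 2-second time window with a window of the last 100 connections.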
There are several Probe attacks that use a much longer interval than 2 seconds (e.g. one minute) when scanning hosts or ports. For these, a mirror set of host-based traffic features was constructed based on a window of 100 connections. The R2L and U2R attacks are embedded in the data portions of the TCP packets and may involve only a single connection. To detect these, content features of an individual connection were constructed using domain knowledge. These features suggest whether the data contain suspicious behavior, such as the number of failed logins, whether login was successful, whether the user logged in as root, whether a root shell was obtained, etc. In total, there are 42 features (including the attack type) in each connection record, most of them taking continuous values. The individual features are listed and briefly described in Table 1.

VI. IDS CLASSIFICATION TECHNIQUES

Classifiers are tools that partition sets of data into different classes on the basis of specified features in those data. In this section we give a brief description of artificial-neural-network-based classifiers. Neural networks are a uniquely powerful tool for multi-class classification, especially in applications where formal analysis would be very difficult or even impossible, such as pattern recognition, nonlinear system identification and control. Provided the neural network has been given sufficient time to train, the property of generalization ensures that the network will be able to classify patterns that have never been seen before. The accuracy of such classifications, however, depends on a variety of parameters, ranging from the architecture of the actual neural network to the training algorithm of choice [16].

A. Multilayer Perceptron (MLP)

The MLP is a layered feed-forward network typically trained with static back-propagation. These networks have found their way into countless applications requiring static pattern classification. Their main advantage is that they are easy to use and can approximate any input/output map. The key disadvantages are that they train slowly and require a lot of training data (typically three times more training samples than network weights).

B. Generalized Feed-Forward (GFF)

GFF networks are a generalization of the MLP in which connections can jump over one or more layers. In theory, an MLP can solve any problem that a generalized feed-forward network can solve. In practice, however, generalized feed-forward networks often solve the problem much more efficiently. A classic example is the two-spiral problem: without describing the problem in detail, it suffices to say that a standard MLP requires hundreds of times more training epochs than a generalized feed-forward network containing the same number of processing elements.

C. Modular Neural Network (MNN)

Modular feed-forward networks are a special class of MLP. These networks process their input using several parallel MLPs and then recombine the results. This tends to create structure within the topology, which fosters specialization of function in each sub-module. In contrast to the MLP, modular networks do not have full interconnectivity between their layers; therefore, a smaller number of weights is required for a network of the same size. This tends to speed up training and reduce the number of required training exemplars. There are many ways to segment an MLP into modules, and it is unclear how best to design the modular topology based on the data. There are no guarantees that each module specializes its training on a unique portion of the data. Four modular topologies were implemented.

D. Jordan/Elman Network (JAN)

Jordan-Elman networks extend the multilayer perceptron with context units, which are processing elements that remember past activity. Context units provide the network with the ability to extract temporal information from the data. In the Elman network, the activity of the first hidden-layer elements is copied to the context units, while the Jordan network copies the output of the network. Networks that feed the input and the last hidden layer to the context units are also available. Four topologies were realized.

VII. RESULTS

A. Data Preprocessing and Simulation Tools

The attributes in the KDD datasets take all forms, continuous, discrete and symbolic, with significantly varying resolutions and ranges. Most pattern-classification methods are not able to process data in such a format, so preprocessing was required before pattern-classification models could be built. The KDD dataset has a very large number of duplicate records; for the purpose of training the different classifier models, these duplicates were removed from the datasets. The original 10% labeled training dataset of 494,021 records was reduced to 145,587 records after removing duplicates. Symbolic features such as protocol_type (3 different symbols), service (70 different symbols) and flag (11 different symbols) were mapped to integer values ranging from 0 to N-1, where N is the number of symbols. Attack names (such as buffer_overflow, guess_passwd, etc.) and normal labels were: (i) mapped to integer values ranging from 0 to 21 (22 attack names), with 22 for normal; (ii) mapped to one of five classes, 0 for Normal, 1 for Probe, 2 for DoS, 3 for U2R and 4 for R2L; and (iii) mapped to "attack" and "normal" respectively. 80% of the dataset was used for training and the rest for testing.

B. Results of the First Experiment

First, we implemented a 23-class IDS to classify each intrusion as one of the learned attacks, so the number of neurons in the output layer of all nets was 23; see Tables 2, 5, 8 and 11.

C. Results of the Second Experiment

In this experiment we implemented a 5-class IDS, to classify each connection as belonging to one of the 4 known intrusion classes (Probe, DoS, U2R and R2L) or to Normal. The number of neurons in the output layer of all nets was therefore chosen to be 5; see Tables 3, 6, 9 and 12.

D. Results of the Third Experiment

Finally, we implemented a 2-class IDS, to classify all intrusions as belonging to the "attack" class and Normal as belonging to the "normal" class. The number of neurons in the output layer of all nets was therefore chosen to be 2; see Tables 4, 7, 10 and 13. In general, a confusion matrix is used to measure the performance of the implemented artificial-neural-network-based classifiers. A confusion matrix (CM) is defined by associating the classes as labels for the rows and columns of a square matrix: for example, in the 5-class KDD dataset there are five classes (Normal, PRB, DoS, U2R and R2L), and the matrix therefore has dimensions 5×5. An entry at row i and column j, CM(i, j), represents the number of patterns originally belonging to class i that were identified as members of class j; off-diagonal entries are thus misclassifications.

VIII. CONCLUSION

The results showed that: (i) all nets are perfect in 2-class classification; (ii) in 5-class classification, Normal and DoS are perfectly classified (except for DoS in the Jordan-Elman network, where it is poor) and PRB is perfect; (iii) in the 23-class topologies, performance is generally poor, but Normal, Neptune and teardrop are perfectly detected; (iv) GFF 41×50×50×23 gives the best results for Normal, Neptune and teardrop (41×28×23×23×23 is slightly better, but it is a more complicated architecture, with 3 hidden layers); (v) in general, using neural networks with two hidden layers is recommended; (vi) in the 23-class case, just one more attack is detected in one of the 4 MNN topologies; and (vii) in the 23-class case, just one more attack is detected in one of the 4 JAN topologies.

REFERENCES

[1] R. Power, "2001 CSI/FBI Computer Crime and Security Survey," Computer Security Issues and Trends, Vol. 7, No. 1, Spring 2001.
[2] D. E. Denning, "An Intrusion-Detection Model," IEEE Trans. on Software Engineering, Vol. SE-13, No. 2, Feb. 1987, pp. 222-232.

[3] FAQ: Network Intrusion Detection Systems, Version 0.8.3, March 21, 2000, http://www.robertgraham.com/pubs/network-intrusion-detection.html#1.11 (accessed 18.4.02).
[4] Michael Sobirey's Intrusion Detection Systems page, http://www-rnks.informatik.tu-cottbus.de/~sobirey/ids.html (accessed 18.4.02).
[5] Published papers on intrusion detection, Columbia University, NY, http://www.cs.columbia.edu/ids/publications/ (accessed 18.4.02).
[6] CERIAS, Intrusion Detection Research, Purdue University.
[7] "NIST Special Publication on Intrusion Detection Systems," SP 800-31, Computer Security Resource Center (CSRC), National Institute of Standards and Technology (NIST), Nov. 2001, p. 15.
[8] Kristopher Kendall, "A Database of Computer Attacks for the Evaluation of Intrusion Detection Systems," Master's Thesis, Massachusetts Institute of Technology, June 1999.
[9] Eleazar Eskin, Andrew Arnold, Michael Prerau, Leonid Portnoy and Salvatore Stolfo, "A Geometric Framework for Unsupervised Anomaly Detection: Detecting Intrusions in Unlabeled Data," in Data Mining for Security Applications, Kluwer, 2002.
[10] Wenke Lee, Sal Stolfo and Kui Mok, "A Data Mining Framework for Building Intrusion Detection Models," Proceedings of the 1999 IEEE Symposium on Security and Privacy, Oakland, CA, May 1999.

[11] Wenke Lee, Sal Stolfo and Kui Mok, "Adaptive Intrusion Detection: A Data Mining Approach," Artificial Intelligence Review, Kluwer Academic Publishers, 14(6), pp. 533-567, December 2000.
[12] Wenke Lee, "A Data Mining Framework for Constructing Features and Models for Intrusion Detection Systems," Ph.D. dissertation, Computer Science, Columbia University, New York, NY, 1999.
[13] Joshua W. Haines, Richard P. Lippmann, David J. Fried, Eushiuan Tran, Steve Boswell and Marc A. Zissman, "1999 DARPA Intrusion Detection System Evaluation: Design and Procedures," MIT Lincoln Laboratory Technical Report, Feb. 2001.
[14] University of California, Irvine, KDD Cup 1999 Data, 5th International Conference on Knowledge Discovery and Data Mining, http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html (accessed 30.04.02).
[15] Task description for the KDD '99 Classifier-Learning Contest, http://kdd.ics.uci.edu/databases/kddcup99/task.htm (accessed 3.5.02).
[16] R. Beghdad, "Critical study of neural networks in detecting intrusions," Computers & Security, Elsevier, pp. 1-8, 2008.

TABLE.1 CONNECTION FEATURES, KDD '99 DATA

Basic features of individual TCP connections:
  duration          continuous  length (number of seconds) of the connection
  protocol_type     symbolic    type of the protocol, e.g. tcp, udp, etc.
  service           symbolic    network service on the destination, e.g. http, telnet, etc.
  flag              symbolic    normal or error status of the connection
  src_bytes         continuous  number of data bytes from source to destination
  dst_bytes         continuous  number of data bytes from destination to source
  land              symbolic    1 if connection is from/to the same host/port; 0 otherwise
  wrong_fragment    continuous  number of "wrong" fragments
  urgent            continuous  number of urgent packets

Traffic features computed using a two-second time window (* = same-host connections, ** = same-service connections):
  count *               continuous  no. of connections to the same host as the current connection in the past 2 secs
  serror_rate *         continuous  % of connections that have "SYN" errors
  rerror_rate *         continuous  % of connections that have "REJ" errors
  same_srv_rate *       continuous  % of connections to the same service
  diff_srv_rate *       continuous  % of connections to different services
  srv_count **          continuous  no. of connections to the same service as the current connection in the past 2 secs
  srv_serror_rate **    continuous  % of connections that have "SYN" errors
  srv_rerror_rate **    continuous  % of connections that have "REJ" errors
  srv_diff_host_rate ** continuous  % of connections to different hosts

Traffic features computed using a hundred-connection window (* = same-host connections, ** = same-service connections):
  dst_host_count *               continuous  no. of connections to the same host as the current connection
  dst_host_serror_rate *         continuous  % of connections that have "SYN" errors
  dst_host_rerror_rate *         continuous  % of connections that have "REJ" errors
  dst_host_same_srv_rate *       continuous  % of connections to the same service
  dst_host_diff_srv_rate *       continuous  % of connections to different services
  dst_host_srv_count **          continuous  no. of connections to the same service as the current connection
  dst_host_srv_serror_rate **    continuous  % of connections that have "SYN" errors
  dst_host_srv_rerror_rate **    continuous  % of connections that have "REJ" errors
  dst_host_srv_diff_host_rate ** continuous  % of connections to different hosts
  dst_host_same_src_port_rate ** continuous  % of connections to the same source port

Content features within a connection suggested by domain knowledge:
  hot                 continuous  number of "hot" indicators
  num_failed_logins   continuous  number of failed login attempts
  logged_in           symbolic    1 if successfully logged in; 0 otherwise
  num_compromised     continuous  number of "compromised" conditions
  root_shell          continuous  1 if root shell is obtained; 0 otherwise
  su_attempted        continuous  1 if "su root" command attempted; 0 otherwise
  num_root            continuous  number of "root" accesses
  num_file_creations  continuous  number of file creation operations
  num_shells          continuous  number of shell prompts
  num_access_files    continuous  number of operations on access control files
  num_outbound_cmds   continuous  number of outbound commands
  is_host_login       symbolic    1 if the login belongs to the "hot" list; 0 otherwise
  is_guest_login      symbolic    1 if the login is a "guest" login; 0 otherwise

Attack type / normal:
  class               symbolic    normal traffic or specific attack type

Table.2 Results of MLP using experiment #1

Classes           41×50×23   41×50×50×23   41×28×23×23×23
Normal             99.99%      99.99%        99.99%
Buffer_overflow    00.00%      00.00%        00.00%
loadmodule         00.00%      00.00%        00.00%
Perl               00.00%      00.00%        00.00%
Neptune           100.00%      99.99%       100.00%
Smurf              00.00%      00.00%        00.00%
Guess_passwd       00.00%      00.00%        00.00%
Pod                00.00%      00.00%        00.00%
Teardrop           99.41%      99.99%         2.30%
Portsweep          87.62%      71.32%        89.71%
Ipsweep            00.00%      00.00%        00.00%
Land               00.00%      00.00%        00.00%
ftp_write          00.00%      00.00%        00.00%
Back               00.00%      00.00%        00.00%
Imap               00.00%      00.00%        00.00%
Satan              90.12%      79.00%        00.00%
Phf                00.00%      00.00%        00.00%
Nmap               00.00%      00.00%        00.00%
Multihop           00.00%      00.00%        00.00%
Warezmaster        00.00%      00.00%        00.00%
Warezclient        00.00%      29.82%        00.00%
Spy                00.00%      00.00%        00.00%
Rootkit            00.00%      00.00%        00.00%

Table.3 Results of MLP using experiment #2

Classes   41×50×5   41×50×50×5   41×28×23×23×5
Normal     99.95%     98.80%       99.77%
DoS        97.78%     98.86%       97.78%
PRB        92.03%     94.42%       92.03%
R2L        00.00%     79.46%       83.97%
U2R        00.00%     00.00%       00.00%

Classes Normal Attack

Table.4 Results of MLP using experiment #3 Architecture 41×50×2 41×50×50×2 41×28×23×23×2 99.82% 99.85% 99.73% 98.12% 98.97% 98.43%
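The percentages in these tables appear to be per-class detection rates: for each class, the share of test records of that class that the network labels as that class, which is why rare classes can score 00.00% while overall traffic is still classified well. A minimal sketch of that metric (the function name is ours, not the paper's):

```python
def per_class_detection_rate(true_labels, predicted_labels):
    """Percentage of records of each true class that were predicted as that class."""
    totals, hits = {}, {}
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == guess:
            hits[truth] = hits.get(truth, 0) + 1
    # 100 * correctly-labelled records of a class / all records of that class
    return {cls: 100.0 * hits.get(cls, 0) / count for cls, count in totals.items()}

rates = per_class_detection_rate(
    ["normal", "normal", "neptune", "smurf"],
    ["normal", "neptune", "neptune", "normal"],
)
```

Note that this rate says nothing about false alarms; a complete evaluation pairs it with the false-positive rate on normal traffic.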

Table.5 Results of GFF using experiment #1

Classes          Architecture
                 41×50×23   41×50×50×23   41×28×23×23×23
Normal           99.87%     99.92%        99.94%
Buffer_overflow  00.00%     00.00%        00.00%
loadmodule       00.00%     00.00%        00.00%
Perl             00.00%     00.00%        00.00%
Neptune          99.99%     99.99%        100.00%
Smurf            00.00%     00.00%        00.00%
Guess_passwd     00.00%     00.00%        00.00%
Pod              00.00%     00.00%        00.00%
Teardrop         00.00%     99.27%        99.14%
Portsweep        00.00%     00.00%        83.42%
Ipsweep          00.00%     00.00%        00.00%
Land             00.00%     00.00%        00.00%
ftp_write        00.00%     00.00%        00.00%
Back             00.00%     00.00%        00.00%
Imap             00.00%     00.00%        00.00%
Satan            83.54%     00.00%        00.00%
Phf              00.00%     00.00%        00.00%
Nmap             10.13%     00.00%        00.00%
Multihop         00.00%     00.00%        00.00%
Warezmaster      00.00%     00.00%        00.00%
Warezclient      53.53%     32.03%        07.39%
Spy              00.00%     00.00%        00.00%
Rootkit          00.00%     00.00%        00.00%

Table.6 Results of GFF using experiment #2

Classes   Architecture
          41×50×5   41×50×50×5   41×28×23×23×5
Normal    99.97%    98.96%       99.79%
DoS       97.66%    98.22%       97.70%
PRB       58.09%    00.00%       00.00%
R2L       00.00%    00.00%       81.76%
U2R       00.00%    00.00%       00.00%

Table.7 Results of GFF using experiment #3

Classes   Architecture
          41×50×2   41×50×50×2   41×28×23×23×2
Normal    99.81%    99.71%       99.73%
Attack    98.11%    98.96%       98.00%

Table.8 Results of MNN using experiment #1

Architecture 41×25U×23 / 25L:
Classes          (1)       (2)       (3)       (4)
Normal           99.96%    99.85%    99.96%    99.85%
Buffer_overflow  00.00%    00.00%    00.00%    00.00%
loadmodule       00.00%    00.00%    00.00%    00.00%
Perl             00.00%    00.00%    00.00%    00.00%
Neptune          99.99%    99.98%    99.99%    99.98%
Smurf            79.82%    01.10%    79.82%    01.10%
Guess_passwd     00.00%    00.00%    00.00%    00.00%
Pod              00.00%    00.00%    00.00%    00.00%
Teardrop         00.00%    00.00%    00.00%    00.00%
Portsweep        86.32%    88.68%    86.32%    88.68%
Ipsweep          00.00%    00.00%    00.00%    00.00%
Land             00.00%    00.00%    00.00%    00.00%
ftp_write        00.00%    00.00%    00.00%    00.00%
Back             00.00%    00.00%    00.00%    00.00%
Imap             00.00%    00.00%    00.00%    00.00%
Satan            00.00%    78.23%    00.00%    78.23%
Phf              00.00%    00.00%    00.00%    00.00%
Nmap             00.00%    00.00%    00.00%    00.00%
Multihop         00.00%    00.00%    00.00%    00.00%
Warezmaster      00.00%    00.00%    00.00%    00.00%
Warezclient      00.00%    72.00%    00.00%    72.00%
Spy              00.00%    00.00%    00.00%    00.00%
Rootkit          00.00%    00.00%    00.00%    00.00%

Architecture 41×25U×25U×23 / 25L×25L:
Classes          (1)       (2)       (3)       (4)
Normal           99.83%    99.87%    99.96%    99.94%
Buffer_overflow  00.00%    00.00%    00.00%    00.00%
loadmodule       00.00%    00.00%    00.00%    00.00%
Perl             00.00%    00.00%    00.00%    00.00%
Neptune          100.00%   99.99%    100.00%   99.99%
Smurf            81.41%    95.48%    66.00%    92.11%
Guess_passwd     00.00%    00.00%    00.00%    00.00%
Pod              00.00%    00.00%    00.00%    00.00%
Teardrop         99.27%    00.00%    00.00%    00.00%
Portsweep        86.05%    87.50%    00.00%    94.47%
Ipsweep          80.49%    88.02%    81.06%    95.45%
Land             00.00%    00.00%    00.00%    00.00%
ftp_write        00.00%    00.00%    00.00%    00.00%
Back             00.00%    00.00%    00.10%    01.38%
Imap             00.00%    00.00%    00.00%    66.67%
Satan            66.96%    00.00%    79.67%    00.00%
Phf              00.00%    00.00%    00.00%    00.00%
Nmap             00.00%    00.00%    00.00%    00.00%
Multihop         00.00%    00.00%    00.00%    14.29%
Warezmaster      00.00%    00.00%    00.00%    00.00%
Warezclient      90.03%    32.03%    00.00%    00.00%
Spy              00.00%    00.00%    00.00%    00.00%
Rootkit          00.00%    00.00%    00.00%    00.00%

Architecture 41×28U×23U×23U×23 / 28L×23L×23L:
Classes          (1)       (2)       (3)       (4)
Normal           99.98%    99.94%    99.93%    99.98%
Buffer_overflow  00.00%    00.00%    00.00%    00.00%
loadmodule       00.00%    00.00%    00.00%    00.00%
Perl             00.00%    00.00%    00.00%    00.00%
Neptune          100.00%   99.99%    99.98%    100.00%
Smurf            00.00%    00.00%    89.25%    00.00%
Guess_passwd     00.00%    00.00%    00.00%    00.00%
Pod              00.00%    00.00%    18.82%    00.00%
Teardrop         00.00%    96.58%    99.27%    00.00%
Portsweep        01.32%    52.89%    18.16%    01.32%
Ipsweep          00.00%    00.00%    78.98%    00.00%
Land             00.00%    00.00%    00.00%    00.00%
ftp_write        00.00%    00.00%    00.00%    00.00%
Back             00.00%    00.00%    00.00%    00.00%
Imap             00.00%    00.00%    00.00%    00.00%
Satan            55.58%    77.24%    00.00%    55.58%
Phf              00.00%    00.00%    00.00%    00.00%
Nmap             00.00%    00.63%    18.99%    00.00%
Multihop         00.00%    00.00%    00.00%    00.00%
Warezmaster      00.00%    00.00%    00.00%    00.00%
Warezclient      00.00%    29.45%    00.00%    00.00%
Spy              00.00%    00.00%    00.00%    00.00%
Rootkit          00.00%    00.00%    00.00%    00.00%

Table.9 Results of MNN using experiment #2

Architecture 41×25U×5 / 25L:
Classes   (1)       (2)       (3)       (4)
Normal    99.93%    99.86%    99.93%    99.86%
DoS       97.48%    97.53%    97.48%    97.53%
PRB       84.47%    57.94%    84.47%    57.94%
R2L       00.10%    00.00%    00.10%    00.00%
U2R       00.00%    00.00%    00.00%    00.00%

Architecture 41×25U×25U×5 / 25L×25L:
Classes   (1)       (2)       (3)       (4)
Normal    99.79%    99.95%    99.91%    99.94%
DoS       97.71%    97.73%    97.86%    97.74%
PRB       81.53%    86.61%    00.00%    00.00%
R2L       81.06%    00.00%    00.00%    00.00%
U2R       00.00%    37.14%    02.86%    08.57%

Architecture 41×28U×23U×23U×5 / 28L×23L×23L:
Classes   (1)       (2)       (3)       (4)
Normal    99.97%    99.97%    99.96%    99.88%
DoS       95.26%    95.26%    96.18%    97.32%
PRB       53.88%    53.88%    75.49%    57.38%
R2L       00.00%    00.00%    00.00%    00.20%
U2R       00.00%    00.00%    00.00%    00.00%

Table.10 Results of MNN using experiment #3

Architecture 41×25U×2 / 25L:
Classes   (1)       (2)       (3)       (4)
Normal    99.50%    99.58%    99.50%    99.58%
Attack    96.69%    96.66%    99.69%    96.66%

Architecture 41×25U×25U×2 / 25L×25L:
Classes   (1)       (2)       (3)       (4)
Normal    99.70%    99.71%    99.70%    99.75%
Attack    97.20%    97.25%    97.48%    96.74%

Architecture 41×28U×23U×23U×2 / 28L×23L×23L:
Classes   (1)       (2)       (3)       (4)
Normal    99.93%    99.60%    99.90%    99.93%
Attack    90.26%    90.49%    94.71%    90.26%

Table.11 Results of JAN using experiment #1

Architecture 41×25U×23 / 25L:
Classes          (1)       (2)       (3)       (4)
Normal           99.92%    99.95%    99.92%    99.91%
Buffer_overflow  00.00%    00.00%    00.00%    00.00%
loadmodule       00.00%    00.00%    00.00%    00.00%
Perl             00.00%    00.00%    00.00%    00.00%
Neptune          99.98%    99.99%    99.99%    99.99%
Smurf            97.81%    00.00%    00.00%    98.46%
Guess_passwd     00.00%    00.00%    00.00%    00.00%
Pod              18.28%    00.00%    00.00%    00.00%
Teardrop         00.00%    00.00%    99.02%    00.00%
Portsweep        87.89%    00.00%    00.00%    88.16%
Ipsweep          00.00%    31.63%    62.69%    00.00%
Land             00.00%    00.00%    00.00%    00.00%
ftp_write        00.00%    00.00%    00.00%    00.00%
Back             00.00%    00.00%    00.00%    00.00%
Imap             00.00%    00.00%    00.00%    00.00%
Satan            00.00%    00.00%    80.11%    82.10%
Phf              00.00%    00.00%    00.00%    00.00%
Nmap             00.00%    00.00%    00.00%    00.00%
Multihop         00.00%    00.00%    00.00%    00.00%
Warezmaster      00.00%    00.00%    00.00%    00.00%
Warezclient      90.71%    00.00%    00.00%    87.68%
Spy              00.00%    00.00%    00.00%    00.00%
Rootkit          00.00%    00.00%    00.00%    00.00%

Architecture 41×25U×25U×23 / 25L×25L:
Classes          (1)       (2)       (3)       (4)
Normal           99.96%    99.98%    99.91%    99.97%
Buffer_overflow  00.00%    00.00%    00.00%    00.00%
loadmodule       00.00%    00.00%    00.00%    00.00%
Perl             00.00%    00.00%    00.00%    00.00%
Neptune          99.96%    99.97%    99.99%    100.00%
Smurf            00.00%    00.00%    00.00%    76.10%
Guess_passwd     00.00%    00.00%    00.00%    00.00%
Pod              00.00%    00.00%    00.00%    00.00%
Teardrop         97.43%    98.17%    97.80%    98.41%
Portsweep        76.58%    00.00%    81.32%    79.21%
Ipsweep          75.00%    23.48%    00.00%    78.03%
Land             00.00%    00.00%    00.00%    00.00%
ftp_write        00.00%    00.00%    00.00%    00.00%
Back             00.00%    00.00%    00.00%    00.00%
Imap             00.00%    00.00%    00.00%    00.00%
Satan            78.01%    00.00%    78.67%    79.00%
Phf              00.00%    00.00%    00.00%    00.00%
Nmap             00.00%    00.00%    00.00%    00.00%
Multihop         00.00%    00.00%    00.00%    00.00%
Warezmaster      00.00%    00.00%    00.00%    00.00%
Warezclient      80.96%    00.00%    00.00%    31.69%
Spy              00.00%    00.00%    00.00%    00.00%
Rootkit          00.00%    00.00%    00.00%    00.00%

Architecture 41×28U×23U×23U×23 / 28L×23L×23L:
Classes          (1)       (2)       (3)       (4)
Normal           99.98%    99.98%    99.99%    99.97%
Buffer_overflow  00.00%    00.00%    00.00%    00.00%
loadmodule       00.00%    00.00%    00.00%    00.00%
Perl             00.00%    00.00%    00.00%    00.00%
Neptune          99.97%    99.97%    99.99%    99.99%
Smurf            00.00%    00.00%    00.00%    00.00%
Guess_passwd     00.00%    00.00%    00.00%    00.00%
Pod              00.00%    00.00%    00.00%    00.00%
Teardrop         00.00%    00.00%    00.00%    85.09%
Portsweep        00.00%    00.00%    00.00%    00.00%
Ipsweep          08.90%    08.90%    00.00%    88.07%
Land             00.00%    00.00%    00.00%    00.00%
ftp_write        00.00%    00.00%    00.00%    00.00%
Back             00.00%    00.00%    00.00%    00.00%
Imap             00.00%    00.00%    00.00%    00.00%
Satan            00.00%    00.00%    00.00%    00.00%
Phf              00.00%    00.00%    00.00%    00.00%
Nmap             00.00%    00.00%    00.00%    00.00%
Multihop         00.00%    00.00%    00.00%    00.00%
Warezmaster      00.00%    00.00%    00.00%    00.00%
Warezclient      00.00%    00.00%    00.00%    06.61%
Spy              00.00%    00.00%    00.00%    00.00%
Rootkit          00.00%    00.00%    00.00%    00.00%

Table.12 Results of JAN using experiment #2

Architecture 41×50×5:
Classes   (1)       (2)       (3)       (4)
Normal    99.95%    99.83%    99.71%    99.88%
DoS       00.00%    00.00%    00.00%    00.00%
PRB       97.62%    97.83%    97.58%    97.77%
R2L       00.00%    00.00%    80.46%    00.00%
U2R       91.88%    91.12%    88.43%    92.03%

Architecture 41×50×50×5:
Classes   (1)       (2)       (3)       (4)
Normal    99.91%    99.85%    26.77%    99.64%
DoS       00.00%    00.00%    00.00%    00.00%
PRB       98.35%    97.70%    99.87%    99.67%
R2L       82.46%    81.96%    83.37%    79.96%
U2R       89.14%    64.18%    67.94%    93.81%

Architecture 41×28×23×23×5:
Classes   (1)       (2)       (3)       (4)
Normal    99.97%    99.95%    99.97%    99.95%
DoS       05.71%    00.00%    00.00%    05.71%
PRB       96.45%    97.43%    97.17%    97.71%
R2L       29.26%    27.86%    00.00%    00.00%
U2R       79.86%    63.01%    65.55%    63.88%

Table.13 Results of JAN using experiment #3

Architecture 41×50×2:
Classes   (1)       (2)       (3)       (4)
Normal    99.82%    99.31%    99.31%    99.49%
Attack    98.12%    99.08%    99.08%    98.61%

Architecture 41×50×50×2:
Classes   (1)       (2)       (3)       (4)
Normal    99.84%    99.84%    97.32%    99.62%
Attack    98.36%    98.36%    99.42%    99.12%

Architecture 41×28×23×23×2:
Classes   (1)       (2)       (3)       (4)
Normal    99.68%    99.18%    99.75%    99.81%
Attack    97.02%    97.46%    96.77%    96.95%
