A Review on Application of Machine Learning and Deep Learning for Intrusion Detection

DOI : 10.17577/IJERTCONV10IS04040


Anns Issac

B-tech student Computer Science and Engineering Mangalam College of Engineering

(Affiliated by A P J Abdul Kalam University) Kerala, India

Aswathy S

B-tech student Computer Science and Engineering Mangalam College of Engineering

(Affiliated by A P J Abdul Kalam University) Kerala, India

Aswathy Reghu

B-tech student Computer Science and Engineering Mangalam College of Engineering

(Affiliated by A P J Abdul Kalam University) Kerala, India

Jinu P Sainudeen

Assistant professor Computer Science and Engineering Mangalam College of Engineering

(Affiliated by A P J Abdul Kalam University) Kerala, India

Abstract: Malicious cyber-attacks can hide within the large volume of normal data in imbalanced network traffic. A Network Intrusion Detection System (NIDS) must detect them accurately and in time. Machine learning and deep learning methods can be used for intrusion detection in imbalanced network traffic. This paper surveys intrusion detection systems (IDS) based on machine learning and deep learning methods. The proposed system adds an ensemble learning approach to improve accuracy.

Keywords: Intrusion detection system (IDS), Deep Learning (DL)

  1. INTRODUCTION

    Advances in Internet and communication technologies over the last decade have made network security an important area of research. Intrusion detection systems (IDS) help secure the network and all related assets in cyberspace, and they must detect intrusions even when network traffic is imbalanced.

    Network intrusion detection systems (NIDS) are placed at a point in the network and inspect traffic from all devices on it. A NIDS monitors the traffic passing through the subnet and compares it with a collection of known attacks. If an attack is identified or anomalous behaviour is observed, an alert can be sent to the administrator. For example, a NIDS can be installed on the subnet where the firewall is located to see whether anyone is trying to break through the firewall.

    Host intrusion detection systems (HIDS) run on independent hosts (devices) on the network. A HIDS monitors only the incoming and outgoing packets of its device and alerts the administrator if suspicious or malicious activity is detected. It takes a snapshot of the existing system files and compares it with the previous snapshot. If system files have been edited or deleted, an alert is sent to the administrator for investigation. HIDS are typically used on mission-critical machines whose layout is not expected to change.

    Protocol-based intrusion detection systems (PIDS) comprise a system or agent that resides consistently at the front end of a server and controls and interprets the protocol between the user or device and the server. A PIDS attempts to protect the web server by regularly monitoring the HTTPS stream and accepting the associated HTTP protocol. Because the HTTPS traffic is decrypted at this interface before it enters the web presentation layer, the system must reside there to inspect it.

    Application protocol-based intrusion detection systems (APIDS) comprise a system or agent that typically resides within a group of servers. Intrusions are identified by monitoring and interpreting communication on application-specific protocols. For example, an APIDS explicitly monitors the middleware as it executes transactions with the web server's database.

    Hybrid intrusion detection systems combine two or more intrusion detection approaches. Host agent or system data is combined with network information to build a complete view of the network system. A hybrid IDS is more effective than the other types of IDS. Prelude is an example of a hybrid IDS.

  2. RELATED WORKS

    A. Machine Learning Based Intrusion Detection System

    An intrusion detection system investigates malicious activity that occurs in a network or system. Intrusion detection software or hardware scans a system or network for suspicious activity. As computers become more connected, intrusion detection becomes increasingly important for network security. Various machine learning techniques and statistical methodologies have been used to build different types of intrusion detection systems. The performance of an IDS is determined by its accuracy, and recent studies have used various techniques to improve it. The main task of an IDS is to analyze large amounts of network traffic data, which calls for a well-organized classification methodology. The reviewed work addresses this with two machine learning techniques, Support Vector Machine (SVM) and Naive Bayes. The NSL-KDD dataset is used for evaluation: the two classifiers are compared, and their accuracy and misclassification rate are calculated.
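The two measures used in this comparison can be computed directly from the true and predicted labels. The following is a minimal sketch, not the reviewed authors' code; the label lists are hypothetical and stand in for the output of any classifier such as SVM or Naive Bayes.

```python
def accuracy_and_misclassification(y_true, y_pred):
    """Return (accuracy, misclassification_rate) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    # The misclassification rate is the complement of accuracy
    return accuracy, 1.0 - accuracy


# Hypothetical labels, for illustration only
y_true = ["normal", "attack", "normal", "attack", "normal"]
y_pred = ["normal", "attack", "attack", "attack", "normal"]
acc, err = accuracy_and_misclassification(y_true, y_pred)
```

Here four of the five predictions match, so `acc` is 0.8 and `err` is 0.2 (up to floating-point rounding).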

    Fig. 1. Block Diagram

    B. Deep Learning Approach for Intelligent Intrusion Detection System

      Machine learning techniques are frequently used to build intrusion detection systems (IDS) that identify and classify cyber-attacks at the network and host level in a timely and autonomous manner. However, because attacks evolve constantly and occur in high volumes, a scalable solution is needed. Various malware datasets are publicly available for further investigation by the cyber security community. This study investigates a deep neural network (DNN), a form of deep learning model, to build a flexible and effective IDS for detecting and classifying unforeseen and unpredictable cyber-attacks. Because network behaviour changes continually and attacks evolve quickly, datasets generated over time using static and dynamic approaches must be analyzed. Such analysis helps identify the best algorithm for predicting future cyber-attacks. The study comprehensively evaluates DNNs and other conventional machine learning classifiers on a variety of publicly available benchmark malware datasets. The network parameters and network topologies of the DNNs are chosen through hyperparameter selection on the KDDCup 99 dataset. By passing the IDS data through several hidden layers, the DNN model learns an abstract, high-dimensional feature representation. Experimental testing confirms that DNNs outperformed traditional machine learning classifiers in every case examined. Finally, the authors present scale-hybrid-IDS-AlertNet, a highly scalable hybrid DNN framework that can be used in real time to monitor network traffic and host-level events and to raise alerts about probable cyber-attacks proactively. The framework's performance could be improved further by adding a module that monitors DNS and BGP events in the network, and its execution time could be improved by adding more nodes to the existing cluster. A limitation is that the proposed system does not give detailed information on the structure and characteristics of the malware. Overall, performance could be improved further by training complex DNN architectures on advanced hardware using a distributed approach.

      Fig 2. Neural Network

    C. I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems

      Network-based Intrusion Detection Systems (NIDSs) identify malicious activities by analyzing network traffic. NIDSs are trained on samples of benign and intrusive network traffic. Depending on the number of available instances, training samples fall into either majority or minority classes. NIDSs trained on such skewed data are more likely to make incorrect predictions for minority attack types, resulting in undetected or misclassified intrusions. Previous studies addressed class imbalance with data-level approaches that increase minority class samples or decrease majority class samples in the training set. Data-level balancing indirectly improves NIDS performance, but it cannot identify attacks for which only limited training data is available. This study proposes Improved Siam-IDS (I-SiamIDS), a two-layer ensemble that resolves class imbalance at the algorithm level: it identifies both majority and minority classes without any data-level balancing. The first layer uses an ensemble of binary eXtreme Gradient Boosting (b-XGBoost), Siamese Neural Network (Siamese-NN) and Deep Neural Network (DNN) for hierarchical filtration of input samples to identify attacks. The second layer uses a multi-class eXtreme Gradient Boosting classifier (m-XGBoost) to classify the attacks into multiple attack classes. Because many types of intrusions occur in a network, an efficient NIDS must be able to identify all of them despite the class imbalance in network traffic.

    D. Service-Aware Two-Level Partitioning for Machine Learning-Based Network Intrusion Detection With High Performance and High Scalability

      A network intrusion detection system (NIDS) is a critical cyber security tool. Machine learning-based NIDSs have recently received a lot of attention as many machine learning algorithms have been developed. However, existing NIDSs are limited in generality because they were designed around specific characteristics obtained from analyzing partial datasets. Furthermore, real NIDS datasets exhibit a considerably uneven ratio of normal to abnormal data. This leads to the minority class problem, which must be addressed to produce robust and trustworthy NIDSs that perform well in a variety of contexts. The reviewed paper introduces a technique based on service-aware dataset partitioning that scales to large and rapidly growing network data while also helping the classifier improve classification accuracy and speed. It proposes a two-level partitioning algorithm that combines agglomerative and divisive partitioning to minimize the size of the attack types belonging to each partition. Unlike most existing studies, the authors used very large imbalanced datasets, CIC-IDS2017 and Kyoto2016, to analyze the minority class problem. The technique was tested on Kyoto2016, a well-known severely imbalanced dataset, and compared with existing state-of-the-art approaches using multiple classification algorithms and parameters. The results showed that the method can categorize network traffic quickly and accurately even with large, imbalanced datasets. The authors conclude that it can relieve the serious issues that imbalanced datasets cause for modern machine learning-based NIDS solutions.

    E. Intrusion Detection of Imbalanced Network Traffic Based on Machine Learning and Deep Learning

      Malicious cyber-attacks can often hide within the large volume of normal data in imbalanced network traffic, yet a Network Intrusion Detection System (NIDS) must remain accurate and timely. The paper studies machine learning and deep learning for intrusion detection in imbalanced network traffic. It uses the DSSTE algorithm together with the RF, SVM, LSTM, AlexNet and Mini-VGGNet classifiers. First, the Edited Nearest Neighbour (ENN) algorithm divides the imbalanced training set into a difficult set and an easy set. Next, the K-Means algorithm compresses the majority samples in the difficult set to reduce their number. Then the continuous attributes of the minority samples in the difficult set are zoomed in and out, and new samples are synthesized to increase the minority count. The procedure balances the original training set and provides targeted data augmentation for the minority classes that need it, so the classifier can better learn the differences between classes during training and achieve higher classification accuracy.
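The first step, dividing the training set into an easy set and a difficult set, can be sketched as follows. This is a simplified illustration under stated assumptions, not the reviewed implementation: a sample is treated as difficult when the majority label among its k nearest neighbours differs from its own label, distances are Euclidean, and the toy data and the function name `enn_split` are hypothetical.

```python
import math
from collections import Counter


def enn_split(X, y, k=3):
    """Split sample indices into (easy, difficult) using an ENN-style rule."""
    easy, difficult = [], []
    for i, xi in enumerate(X):
        # Distances from sample i to every other sample, nearest first
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        neighbour_labels = [y[j] for _, j in dists[:k]]
        majority_label = Counter(neighbour_labels).most_common(1)[0][0]
        # Assumption: disagreement with the neighbourhood marks a difficult sample
        (easy if majority_label == y[i] else difficult).append(i)
    return easy, difficult


# Hypothetical 2-D traffic features: one attack sample sits inside the normal cluster
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
y = ["normal", "normal", "normal", "normal", "attack",
     "attack", "attack", "attack", "attack"]
easy, difficult = enn_split(X, y, k=3)
```

In this toy example only the attack sample at index 4 lands in the difficult set, because all of its nearest neighbours are normal traffic.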

      Fig 3. The overall framework of network ID model

    F. Real-Time Intrusion Detection in Wireless Network: A Deep Learning-Based Intelligent Mechanism

    With the advancement of wireless network technology, the number of cyber-attacks has increased substantially, posing a serious danger to Wireless Local Area Network (WLAN) security. Traditional intrusion detection technology has been a popular research topic for many years, but it may not offer good real-time detection performance, so a method that detects threats in a timely manner is critical. The reviewed paper uses a Conditional Deep Belief Network (CDBN)-based intrusion detection mechanism to recognize attack features and detect wireless network intrusions in real time. To limit the impact of dataset imbalance and data redundancy on detection accuracy, a window-based instance selection algorithm called "SamSelect" undersamples the majority class, and a Stacked Contractive Auto-Encoder (SCAE) reduces the dimensionality of the data samples. The experiments show that CDBN integrates effectively with "SamSelect" and SCAE, and that the mechanism achieves high detection speed and accuracy, with an average detection time of 1.14 ms and a detection accuracy of 0.974. The scheme, based on an improved Deep Belief Network, is designed mainly for wireless network intrusion detection.

    Fig 4. Overview of the proposed detection mechanism

    G. A Hybrid Intrusion Detection System Based on Scalable K-Means+ Random Forest and Deep Learning

    Digital assets face a variety of network security threats, so many kinds of security equipment are used to protect them. An IDS is one such kind of equipment, but it is of little use if its alerts are not timely or its accuracy does not meet requirements. For that reason, the intrusion detection model is combined with machine learning and deep learning. In the reviewed paper, k-means and random forest (RF) are used for binary classification to quickly separate normal events from attack events, and both algorithms are implemented as distributed computations on the Spark platform. Adaptive synthetic sampling is adopted to deal with the imbalanced dataset. The NSL-KDD and CIC-IDS2017 datasets are used for evaluation. The results show a better true positive rate (TPR) for attack events, faster data pre-processing, and potentially less training time. The accuracy of the model is 85.24% on the NSL-KDD dataset and 94.91% on the CIC-IDS2017 dataset, which indicates that attack events can be detected accurately. The distributed computing platform enables fast pre-processing of the intrusion detection dataset, and attack events are separated from normal events by integrating distributed k-means with the RF algorithm.

    Fig 5. Intrusion detection framework

    H. PCCN: Parallel Cross Convolutional Neural Network for Abnormal Network Traffic Flows Detection in Multi-Class Imbalanced Network Traffic Flows

    Using deep learning to detect network attack behaviour is an important research issue in network security, and detecting multi-class imbalanced abnormal traffic remains difficult. To improve the detection of imbalanced abnormal flows, the reviewed work proposes a deep learning-based intrusion detection network called the parallel cross convolutional neural network (PCCN). By combining the flow features obtained from two branch convolutional neural networks (CNNs), PCCN learns flow features better from less data, which improves the detection results for imbalanced abnormal flows. An enhanced feature extraction approach for the original flow is developed to extract multi-class flow features at the same time; it reduces the number of superfluous parts in network learning and speeds up network convergence. Four upgraded variants of the PCCN structure are also presented to meet the real-time intrusion detection requirements of today's big data computing. These variants attain nearly the same detection results as PCCN with a significantly shorter detection time. According to high-order evaluation measures, PCCN outperforms standard machine learning algorithms by a large margin, and it also exceeds the existing hierarchical network model in overall accuracy.

    Fig 6. Parallel cross convolutional neural network

    I. Increasing the Performance of Machine Learning-Based IDSs on an Imbalanced and Up-to-Date Dataset

    Because of the widespread use of the internet in recent years, the number of networked computers in daily life has increased. Server flaws allow hackers to infiltrate systems using not only known attack types but also new ones that are more sophisticated and difficult to detect. An IDS trained with machine learning on a pre-collected dataset is one of the most desired defenses against them. However, the datasets commonly used were gathered over a short period in a few distinct networks and generally do not contain up-to-date information. Furthermore, they are imbalanced and do not hold enough data to cover all forms of attack. These imbalanced and outdated datasets harm the efficiency of current IDSs, especially for attack types that are rarely encountered. The reviewed study proposes six machine learning-based IDSs using the K Nearest Neighbor, Random Forest, Gradient Boosting, AdaBoost, Decision Tree, and Linear Discriminant Analysis algorithms. An up-to-date security dataset, CSE-CIC-IDS2018, is used instead of older, heavily studied datasets to build a more realistic IDS. Because the chosen dataset is also imbalanced, the imbalance ratio is reduced with the Synthetic Minority Oversampling Technique (SMOTE), a synthetic data generation model, to improve efficiency across attack types and to reduce missed intrusions and false alarms. The technique generates data for the minor classes and raises their counts to the average class size. According to the experimental results, the approach significantly boosts the detection rate for rarely occurring intrusions.
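The core idea of SMOTE, creating a synthetic minority sample on the line segment between a minority sample and one of its nearest minority neighbours, can be sketched in a few lines. This is a minimal illustration rather than the reviewed study's pipeline; the function name `smote`, the parameter defaults, and the sample points are hypothetical, and a production system would more likely use an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (the base itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical minority-class samples with two continuous features
minority_samples = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.3)]
synthetic = smote(minority_samples, n_new=5)
```

Every synthetic point lies between two real minority samples, so the new data stays inside the region the minority class already occupies.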

    J. Federated Mimic Learning for Privacy Preserving Intrusion Detection

    Internet of Things (IoT) devices are prone to attack because of the limitations of their privacy and security features. These attacks range from exploiting backdoors to disrupting the devices' communication network. Intrusion Detection Systems (IDS) play an important role in ensuring the confidentiality of information and the security of IoT devices against these attacks. Recently, deep learning-based IDS techniques have become more prominent because of their high accuracy. However, conventional deep learning techniques endanger user privacy because user data is transferred to a central server. Federated Learning (FL) is a popular privacy-preserving decentralized learning method. FL trains models locally on the devices and transfers the local models, rather than the sensitive data, to a central server. However, FL can suffer from reverse engineering attacks that extract information about user data from the model. To overcome the reverse engineering problem, mimic learning is another way of preserving privacy in an ML-based IDS. The reviewed study proposes an ML-based IDS for IoT devices that uses federated mimic learning to maintain user privacy. The work considers three cases: first, a deep learning model is used in the IDS; then federated learning is applied; finally, a federated mimic learning approach is implemented that incorporates both teacher mimic learning and student mimic learning. The results show that federated mimic learning preserves privacy while providing detection accuracy comparable to the deep learning models, reaching 98.11% overall and 98.118% with federated teacher mimic learning (FTML).
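The aggregation step of federated learning, where the central server combines local models instead of receiving raw data, can be sketched as a weighted average of parameters. This covers only the federated part and not the mimic learning stage; weighting each device by its number of local samples is an assumption of this sketch, and the parameter vectors are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of per-client parameter vectors (FedAvg-style aggregation)."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical local models from three devices, each a flat list of parameters
local_models = [[0.0, 1.0], [1.0, 1.0], [2.0, 4.0]]
local_sizes = [10, 10, 20]  # number of local training samples per device
global_model = federated_average(local_models, local_sizes)
# global_model == [1.25, 2.5]
```

Only the parameter lists leave the devices; the training samples that produced them stay local.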

    K. IoT Intrusion Detection System Using Deep Learning and Enhanced Transient Search Optimization

    Major advances in communications, cloud computing, and the Internet of Things (IoT) have brought significant security challenges. Cyber-attacks are also growing rapidly because current security measures do not provide effective solutions. Recently, a variety of solutions based on artificial intelligence (AI) have been proposed for security applications, including intrusion detection. The reviewed paper proposes an efficient AI-based intrusion detection system (IDS) for IoT systems that combines deep learning with metaheuristic (MH) optimization algorithms, which have proven effective in solving complex engineering problems. Convolutional neural networks (CNNs) are used as a feature extraction method to obtain relevant features from the input data. A new feature selection method is then developed using a variant of the transient search optimization (TSO) algorithm, called TSODE, which incorporates the operators of the differential evolution (DE) algorithm. The DE operators improve the balance between the exploitation and exploration phases of the standard TSO search and help avoid its shortcomings, such as becoming trapped in local optima. The method is evaluated on the public datasets KDDCup-99, NSL-KDD, BoT-IoT, and CICIDS-2017 and is reported to be more accurate than several existing methods.

    L. Intrusion Detection System Using Machine Learning Techniques

    The rapid growth in the use of computer networks makes it harder to maintain network availability, integrity, and confidentiality. Network administrators therefore use a variety of intrusion detection systems (IDS) that help monitor network traffic for unauthorized and dangerous activities. An intrusion is a violation of security policy with malicious intent. An intrusion detection system monitors traffic flowing through computer systems to search for malicious activity and known threats, and sends alerts when it detects them. Malicious activity detection is of two types. The first is misuse or signature detection, where the IDS collects information, analyses it, and compares it with attack signatures stored in a large database. The second is anomaly detection, which treats any deviation from normal behaviour as malicious.

    The reviewed paper presents an overview of work on developing effective IDSs using single, hybrid, and ensemble machine learning (ML) classifiers, tested on seven different datasets. The results of the various works are discussed and compared, providing a clear direction for future work. The advent of machine learning introduced new intrusion detection techniques, and researchers have adopted different types of classifiers when building intrusion detection models. The paper covers research papers on the use of machine learning classifiers in intrusion detection systems published from 2015 to 2020. Among the models used in those papers, ensemble and hybrid classifiers outperformed single classifiers, achieving better predictive accuracy and detection rate.

    M. Deep Learning Methodology Integrating Autoencoder and One-class SVM for DDoS Attack Detection on SDNs

      Software-Defined Networking (SDN) makes it possible to collect network traffic information and manage networks continuously, which facilitates building robust and secure networks. Recently, several machine learning (ML) and deep learning (DL) intrusion detection approaches have been proposed to protect SDN networks. Most of them rely on supervised learning, which requires well-labelled and well-balanced datasets for training. Preparing such datasets is time-consuming and requires significant human expertise, and these methods cope poorly with imbalanced and unlabelled datasets. The reviewed paper proposes a hybrid DL method that combines a stacked autoencoder with a One-class Support Vector Machine (SAE-1SVM) to detect Distributed Denial of Service (DDoS) attacks. Test results show that the algorithm achieves an average accuracy of 99.35% with a small set of flow features, and that it significantly reduces processing time while maintaining a high detection rate. In short, SAE-1SVM works well with imbalanced and unlabelled datasets and achieves high detection accuracy.

    N. Machine Learning Algorithms in Intrusion Detection and Classification

      The complexity of information systems is rapidly increasing. New approaches that exploit the information inherent in networks are used to frame and execute threats and attacks. Information is continually changing because of differences between sensitive domains. There are three types of users: administrators, server managers, and those who require access. Threats such as denial of service attacks and intrusions make it necessary to protect information devices. An intrusion is a serious threat in which unauthorized access is gained to data or a legitimate network by exploiting legitimate users' identities or any of the network's back doors and vulnerabilities. Intrusion Detection Systems (IDS) are systems that detect intrusions at various stages. The goal of the reviewed study is to use rule-based approaches and learning-based algorithms for intrusion detection and classification to increase the efficiency of IDSs. Algorithms such as Neural Networks (NN), Random Forest, and SVM are applied, and standard datasets such as KDDCup 99 are used to test the output of the rule-based approaches and machine learning algorithms.

    O. Decision Tree-Based Rule Derivation for Intrusion Detection in Safety-Critical Automotive Systems

    Intrusion Detection Systems (IDSs) are becoming more common in safety-critical systems such as connected cars. Because the behaviour and effectiveness of countermeasures must be evaluated before they can be approved, IDS decisions must be traceable, and the IDS must also run efficiently on resource-constrained embedded systems. These constraints make it harder to apply Machine Learning (ML) methods in IDS design. The reviewed work presents a method that uses machine learning to build rules for a rule-based IDS such as Snort, which simplifies the time-consuming and complex process of establishing a rule set. Rules are constructed from decision trees, and experts can use them to create a rule set for a specific safety-critical use case. Long short-term memory approaches are also applied to overcome the limited availability of training data, a typical challenge in safety-critical systems. The implementation and evaluation demonstrate that the approach is practical for developing customized IDS rules for such systems.
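Deriving traceable rules from a decision tree amounts to collecting the conditions along each root-to-leaf path. The sketch below illustrates that idea only; the tree, the feature names and the thresholds are hypothetical, and the output is a readable condition string rather than actual Snort rule syntax.

```python
def extract_rules(node, conditions=()):
    """Walk a decision tree and return one (conditions, label) rule per leaf."""
    if "label" in node:  # leaf node: the path so far is a complete rule
        return [(list(conditions), node["label"])]
    feat, thr = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + (f"{feat} <= {thr}",))
    right = extract_rules(node["right"], conditions + (f"{feat} > {thr}",))
    return left + right


# Hypothetical tree; feature names and thresholds are illustrative only
tree = {
    "feature": "msg_rate", "threshold": 100,
    "left": {"label": "normal"},
    "right": {
        "feature": "payload_len", "threshold": 8,
        "left": {"label": "normal"},
        "right": {"label": "attack"},
    },
}
rules = extract_rules(tree)
attack_rules = [" AND ".join(conds) for conds, label in rules if label == "attack"]
# attack_rules == ["msg_rate > 100 AND payload_len > 8"]
```

Because each rule is an explicit conjunction of threshold tests, an expert can review or adjust it before it goes into the rule set, which supports the traceability requirement described above.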

  3. PROBLEM DEFINITION

    Network scale and real-time traffic have become larger and more complex with the rapid development and widespread application of 5G, IoT, cloud computing, and other technologies. Cyber-attacks have also grown in complexity and variety, posing significant security challenges in cyberspace. In real cyberspace, normal activities dominate, so most traffic data is normal traffic and only a small share is malicious, which results in a high class imbalance. Intrusion detection is under tremendous pressure in such highly imbalanced and redundant network traffic data, so the class imbalance problem in network traffic must be tackled.
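The degree of imbalance can be quantified as the ratio between the largest and the smallest class. The following short sketch shows the calculation; the class names and counts are hypothetical and are not taken from any of the datasets discussed here.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical traffic labels: normal traffic dominates, attacks are rare
labels = ["normal"] * 95 + ["dos"] * 4 + ["probe"] * 1
ratio = imbalance_ratio(labels)
# ratio == 95.0
```

A ratio close to 1 indicates a balanced dataset, while large values such as this one indicate that a classifier could reach high accuracy by predicting only the majority class.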

  4. PROPOSED METHODOLOGY

    1. Preprocessing

      Preprocessing prepares the raw data and makes it suitable for a machine learning model. It is the first and a crucial step in creating a machine learning model. A dataset generally contains noise and missing values, and may be in a format that cannot be used directly by machine learning models. Data preprocessing cleans the data and makes it suitable for the model, which also increases the model's accuracy and efficiency.

      The preprocessing steps performed on our dataset are removing null rows and columns and removing duplicate values.
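The two cleaning steps can be sketched on a list of records as shown below. This is a minimal illustration under stated assumptions: records are dictionaries with the same keys, a missing value is `None` or an empty string, and the field names are hypothetical.

```python
def is_missing(value):
    """Treat None and empty strings as missing values."""
    return value is None or value == ""


def preprocess(rows):
    """Drop all-empty columns, rows with missing values, and duplicate rows."""
    if not rows:
        return []
    columns = list(rows[0].keys())
    # A column that is missing in every row carries no information
    kept_cols = [c for c in columns if not all(is_missing(r.get(c)) for r in rows)]
    seen, cleaned = set(), []
    for row in rows:
        record = tuple(row.get(c) for c in kept_cols)
        if any(is_missing(v) for v in record):
            continue  # null row: at least one remaining value is missing
        if record in seen:
            continue  # exact duplicate of an earlier row
        seen.add(record)
        cleaned.append(dict(zip(kept_cols, record)))
    return cleaned


# Hypothetical records, for illustration only
raw = [
    {"duration": 1, "proto": "tcp", "label": "normal", "unused": None},
    {"duration": 1, "proto": "tcp", "label": "normal", "unused": None},     # duplicate
    {"duration": None, "proto": "udp", "label": "attack", "unused": None},  # missing value
    {"duration": 3, "proto": "udp", "label": "attack", "unused": None},
]
clean = preprocess(raw)
```

Of the four hypothetical records, two remain after cleaning, and the all-empty `unused` column is dropped.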

    2. Data balancing

      Data balancing transforms a dataset so that it contains an equal or almost equal number of samples from the positive and negative classes. If the samples of one class outnumber those of the other, the data is skewed in favour of that class. Data balancing is performed using the Difficult Set Sampling Technique (DSSTE) algorithm. The steps of data balancing are:

      • Use Edited Nearest Neighbour to divide the dataset into a near-neighbor set and a far-neighbor set

      • Use K neighbors to scale the dataset

      • Create the new training set
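The scaling step can be illustrated by producing zoomed-in and zoomed-out copies of the minority samples. The sketch below is an assumption-laden illustration, not the exact DSSTE procedure: the scaling factors 1 - 1/n and 1 + 1/n for n from K to 2K - 1, the function name `zoom_minority`, and the sample values are all choices made for this example.

```python
def zoom_minority(samples, continuous_idx, k):
    """Return zoomed copies of each minority sample (continuous attributes only)."""
    augmented = []
    for n in range(k, 2 * k):
        # Assumed scaling scheme: shrink and enlarge by a factor tied to n
        for factor in (1 - 1 / n, 1 + 1 / n):
            for s in samples:
                augmented.append(tuple(
                    v * factor if i in continuous_idx else v
                    for i, v in enumerate(s)
                ))
    return augmented


# Hypothetical minority sample: attribute 0 is continuous, attribute 1 is discrete
minority = [(10.0, 1)]
zoomed = zoom_minority(minority, continuous_idx={0}, k=2)
```

Discrete attributes are left untouched, so only the continuous features vary between the original sample and its synthesized copies. The new training set would then combine the easy set, the compressed majority samples, and these augmented minority samples.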

    3. Classification using machine learning algorithm

      The balanced training set is used to train the LSTM, AlexNet and Mini-VGGNet classifiers to perform network intrusion detection. The dataset is loaded, its features and labels are given as input to these algorithms, and training is carried out. After training, the models are saved.

    4. Testing

    The trained model is tested using input intrusion data. The input data is preprocessed as in the preprocessing step, and the preprocessed test data is fed into the trained models. Each model predicts the output for the given features based on its training.

    Step 1. Pre-processing

    • Remove null rows and columns

    • Remove duplicate values

    • Spliting data into training and testing set

      Step 2. Data balancing

    • Edited nearest neighbour to divide dataset in to near-neighbor set and far-neighbor

    • K neighbors to scale the dataset

    • Create new training set

      Step 3. Long Short-Term Memory classifier

    • Create LSTM Classifier model

    • Training the model using training set

    • Saving the LSTM model

      Step 4. AlexNet classifier

    • Create AlexNet Classifier model

    • Training the model using training set

    • Saving the AlexNet model

      Step 5. Mini-VGGNet classifier

    • Create Mini-VGGNet Classifier model

    • Training the model using training set

    • Saving the Mini-VGGNet model

      Step 6. Prediction

    • Loading saved LSTM, AlexNet, and Mini-VGGNet models

    • Load data

    • Preprocess as in Step 1

    • Predict data using each loaded model

    • Predict data using Ensemble model

    • Displaying Output

      Step 7. Ensemble Learning

    • Create Ensemble learning Voting classifier by using created classifiers

    • Saving the Ensemble model

      Step 8. Prediction

    • Loading saved Ensemble model

    • Load data

    • Preprocess as in Step 1

    • Predict data using Ensemble model

    • Displaying Output

    Fig.7 System architecture
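The voting classifier of Step 7 can be sketched as hard (majority) voting over the labels returned by the individual classifiers. The text does not say whether hard or soft voting is used, so hard voting is an assumption here, and the three stub functions are hypothetical stand-ins for the trained LSTM, AlexNet, and Mini-VGGNet models.

```python
from collections import Counter
from typing import Callable, List, Sequence

Classifier = Callable[[Sequence[float]], str]


def majority_vote(classifiers: List[Classifier], features: Sequence[float]) -> str:
    """Return the label predicted by most classifiers (hard voting)."""
    votes = Counter(clf(features) for clf in classifiers)
    # On a tie, most_common keeps first-seen order, so the label voted
    # for by the earliest classifier in the list wins.
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for the trained models (not real networks).
def lstm_stub(x: Sequence[float]) -> str:
    return "attack" if x[0] > 0.5 else "normal"


def alexnet_stub(x: Sequence[float]) -> str:
    return "attack" if x[1] > 0.5 else "normal"


def minivgg_stub(x: Sequence[float]) -> str:
    return "attack" if sum(x) > 1.0 else "normal"


ensemble = [lstm_stub, alexnet_stub, minivgg_stub]
print(majority_vote(ensemble, [0.9, 0.2]))  # prints "attack" (two of three stubs agree)
```

With an odd number of classifiers and two classes, a tie cannot occur, which is one practical reason to combine exactly three models.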

  5. COMPARATIVE STUDY

    From the survey we can understand that each system achieves a different accuracy, because each is built on different concepts and its output depends on them. Each system has its own merits and demerits. Machine learning and deep learning techniques are the ones mostly used for detecting intrusions in the cyber world. Nowadays the security of cyberspace is essential; many techniques exist, but we need the best among them, and for that a comparative study follows.

    Table 1. A brief comparison of existing systems

    | SL NO | TITLE | KEY CONCEPT | MERIT | DEMERIT | RESULT | FUTURE SCOPE |
    |---|---|---|---|---|---|---|
    | 1 | Real-Time Intrusion Detection in Wireless Network: A Deep Learning-Based Intelligent Mechanism | CDBN (Conditional Deep Belief Network), SamSelect window-based instance selection algorithm, stacked contractive auto-encoder algorithm | It can effectively detect potential attacks and achieve high accuracy; it has high detection speed and accuracy | Cannot apply this mechanism in a big data environment; cannot detect a wider range of cyber security attacks | Detection accuracy is 0.974 | Extend the work by studying an efficient detection method in a big data environment and how to apply the proposed mechanism to detecting a wider range of cyber security attacks |
    | 2 | A Hybrid Intrusion Detection System Based on Scalable K-Means+ Random Forest and Deep Learning | ML and DL; for binary classification, k-means and random forest, with distributed computing of these algorithms on the Spark platform; convolutional network, long short-term memory | Better TPR for a number of attack events; faster data pre-processing speed and potentially less training time | This model is not yet implemented in an actual application | NSL-KDD = 85.24%; CIC-IDS2017 = 99.91% | In future, expand to an actual application to verify the efficiency of the model |
    | 3 | Increasing the Performance of Machine Learning-Based IDSs on an Imbalanced and Up-to-Date Dataset | K-Nearest Neighbour, Random Forest, Gradient Boosting, AdaBoost, Decision Tree and Linear Discriminant Analysis algorithms, Synthetic Minority Over-sampling Technique | Increases the detection rate for rarely encountered intrusions | Lower efficiency in this model | Accuracy of the model increases by between 4.01% and 30.59% | In future a DL algorithm should be used, and it is expected that the efficiency of the system will increase |
    | 4 | PCCN: Parallel Cross Convolutional Neural Network for Abnormal Network Traffic Flows Detection in Multi-Class Imbalanced Network Traffic Flows | PCCN, CNN | Better performance in overall accuracy; good generalization ability for various features | Cannot detect the novelty of a flow | | Use a DL algorithm to detect the novelty of a flow and improve the detection performance |
    | 5 | I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems | Improved Siam-IDS (2-layer ensemble), Binary eXtreme Gradient Boosting (b-XGBoost), Siamese NN, Deep NN | Incoming data are filtered multiple times through different classifiers, which minimizes the chance of malicious traffic going undetected | Computational cost of the proposed I-SiamIDS in terms of execution time | I-SiamIDS attained higher Accuracy, Recall, Precision, F1 scores and AUC values as compared to the five algorithms in consideration | |
    | 6 | Machine learning algorithm in intrusion detection and classification | IDS rule-based technique, learning-based algorithm, NN, RF, SVM algorithm | By this, the accuracy of the SVM is better than the others | | SVM = 94%, NN = 84%, RF = 82.9% | |
    | 7 | Decision Tree-Based Rule Derivation for Intrusion Detection in Safety-Critical Automotive Systems | ML, rule-based IDS like Snort, Decision tree, LSTM | Feasibility of the approach to derive specific IDS rules for such systems | Not yet demonstrated on different data sets | | Intend to demonstrate the adaptability of the solution to different types of data sets |
    | 8 | Intrusion Detection of Imbalanced Network Traffic Based on Machine Learning and Deep Learning | NIDS, DSSTE, Edited Nearest Neighbour (ENN) algorithm, RF, LSTM, SVM, XGBoost, AlexNet, Mini-VGG | Tackles the class imbalance problem and balances the data | Limited in DL; cannot perform automatic feature extraction | DSSTE algorithm outperforms | Use only a DL model for feature extraction and model training |
    | 9 | Deep Learning-Based Intrusion Detection Systems: A Systematic Review | IDS, DL, Auto-encoder, Recurrent Neural Network, Boltzmann machine, CNN | Deep learning can process raw input data and also provides support for unsupervised, supervised and semi-supervised learning methods | | | |
    | 10 | Machine Learning Based Intrusion Detection System | SVM, Naïve Bayes technique | Machine learning techniques play an essential role | Deals with a limited amount of data | | Deal with a large volume of data; a hybrid multilevel model will be constructed to improve the accuracy |
    | 11 | A Deep Learning Approach Combining Autoencoder with One-class SVM for DDoS Attack Detection in SDNs | One-class SVM, SDN, DDoS | Provides more robust and secure networks | It cannot deal with imbalanced and unlabelled datasets | 99.35% | Will test the suggested model in a real SDN testbed for further feature analysis and network threat detection |
    | 12 | Service-Aware Two-Level Partitioning for Machine Learning-Based Network Intrusion Detection With High Performance and High Scalability | Network, intrusion detection | Provides high scalability and flexibility for rapidly growing data | The ratio between data classes is imbalanced, which causes the minority class problem | 90% accuracy | It can be extended in the future to support any additional classification problems that have minority class concerns, such as unreliable telecommunication customers, fraud detection, or picture classification, such as medical diagnosis |
    | 13 | Intrusion Detection System using Machine Learning Techniques | Intrusion detection system, ML, security | It detects malicious activity | Difficult to manage the information for every host | 58.33% accuracy | |
    | 14 | IoT Intrusion Detection System Using Deep Learning and Enhanced Transient Search Optimization | Feature selection, feature extraction, Transient search optimization algorithm (TSODE) | It provides high performance measures and optimization of data | The MH optimizer cannot work with different data sets | 96% accuracy | Furthermore, the TSODE's versatility allows it to be used in a variety of applications. Image processing, cloud and fog processing, scheduling computations, parameter estimations, and so on are examples of jobs that can be optimised |
    | 15 | Federated Mimic Learning for Privacy Preserving Intrusion Detection | Federated learning, mimic learning, intrusion detection | Ensures privacy and provides accuracy | Due to its high connectivity, IoT has high risk | 98.11% accuracy | |

  6. CONCLUSION AND FUTURE SCOPE

    Intrusion detection systems are an important part of the security of today's information technology-based enterprises. It is a difficult task to provide an efficient and high-performance IDS approach that can deal with a wide range of security attacks. Deep learning approaches have recently been shown to be effective at solving intrusion detection challenges, and various deep learning-based IDS strategies have been published. Deep learning is a subset of machine learning techniques that employ multiple layers to perform nonlinear processing and learn multiple levels of data representation.

From this study we found that deep learning performs better than conventional machine learning techniques for intrusion detection. Malicious cyber-attacks can lurk in enormous amounts of legitimate data in unbalanced network traffic. Such attacks use a high level of stealth and obfuscation in cyberspace, making it difficult for Network Intrusion Detection Systems (NIDS) to ensure detection accuracy and timeliness. To address the problem of class imbalance, the proposed system uses the Difficult Set Sampling Technique (DSSTE) algorithm. First, the imbalanced training set is partitioned into a difficult set and an easy set using the Edited Nearest Neighbor (ENN) algorithm. Then, to reduce the majority, the K-Means algorithm is applied to compress the majority samples in the difficult set. Next, the continuous attributes of the minority samples in the difficult set are zoomed in and out to synthesize new samples and increase the minority number. Finally, the augmented samples are combined with the easy set, the compressed majority samples of the difficult set, and the minority samples of the difficult set to create a new training set. The algorithm reduces the imbalance of the original training set and provides targeted data augmentation for the minority class that needs to be learned. It allows the classifier to better learn the differences between classes in the training stage and increases classification accuracy. Among the surveyed approaches, deep learning techniques were the most effective for detecting intrusions in imbalanced network traffic. In future, we can extend our work to intrusion prevention as well, so that if an attacker tries to intrude, an alert is raised and the intrusion is also prevented.
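The zoom step on minority samples can be illustrated as follows. This is only a sketch: the function name `zoom_augment`, the explicit list of scaling factors, and the default factors `(0.9, 1.1)` are assumptions for the example and are not taken from the DSSTE algorithm's published parameters.

```python
from typing import List, Sequence


def zoom_augment(
    sample: Sequence[float],
    continuous_idx: Sequence[int],
    factors: Sequence[float] = (0.9, 1.1),
) -> List[List[float]]:
    """Synthesize new minority samples by scaling continuous attributes.

    Each factor yields one synthetic copy of the sample: attributes listed in
    continuous_idx are multiplied by the factor (zoomed in or out), and all
    other attributes (for example discrete flags) are copied unchanged.
    """
    synthetic: List[List[float]] = []
    for f in factors:
        new = list(sample)
        for i in continuous_idx:
            new[i] = sample[i] * f
        synthetic.append(new)
    return synthetic
```

For example, `zoom_augment([10.0, 1.0, 200.0], [0, 2], factors=(0.5, 2.0))` yields two synthetic samples whose first and third attributes are halved and doubled while the second attribute stays fixed.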

REFERENCES

[1] Dr. K. Sundarakantham. Machine Learning Based Intrusion Detection System. Proceedings of the Third International Conference on Trends in Electronics and Informatics (ICOEI 2019), IEEE Xplore.

[2] Yong Zhang. PCCN: Parallel Cross Convolutional Neural Network for Abnormal Network Traffic Flows Detection in Multi-Class Imbalanced Network Traffic Flows. Digital Object Identifier 10.1109/ACCESS.2019.2933165.

[3] Gozde Karatas, Onder Demir, and Ozgur Koray Sahingoz. Increasing the Performance of Machine Learning-Based IDSs on an Imbalanced and Up-to-Date Dataset. Digital Object Identifier 10.1109/ACCESS.2020.3048198.

[4] Liqun Yang, Jianqiang Li, Liang Yin, Zhonghao Sun, Yufei Zhao, and Zhoujun Li. Real-Time Intrusion Detection in Wireless Network: A Deep Learning-Based Intelligent Mechanism. Digital Object Identifier 10.1109/ACCESS.2020.3019973.

[5] Noor Ali Al-Athba Al-Marri, Bekir S. Ciftler, and Mohamed M. Abdallah. Federated Mimic Learning for Privacy Preserving Intrusion Detection. 2020 IEEE International Black Sea Conference on Communications and Networking.

[6] Lan Liu. Intrusion Detection of Imbalanced Network Traffic Based on Machine Learning and Deep Learning. Date of publication December 30, 2020, date of current version January 13, 2021.

[7] Usman Shuaibu Musa, Megha Chhabra, Aniso Ali, and Mandeep Kaur. Intrusion Detection System using Machine Learning Techniques: A Review. International Conference on Smart Electronics and Communication (ICOSEC 2020), IEEE Xplore Part Number: CFP20V90-ART; ISBN: 978-1-7281-5461-9.

[8] Punam Bedi. I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems. Springer Science+Business Media, LLC, part of Springer Nature 2020.

[9] Yeongje Uhm and Wooguil Pak. Service-Aware Two-Level Partitioning for Machine Learning-Based Network Intrusion Detection with High Performance and High Scalability. Date of publication January 4, 2021, date of current version January 12, 2021.

[10] Guna Sekhar Sajja, Malik Mustafa, and Dr. R. Ponnusamy. Machine Learning Algorithms in Intrusion Detection and Classification. Annals of R.S.C.B., ISSN: 1583-6258, Vol. 25, Issue 6.

[11] Lotfi Mhamdi, Desmond McLernon, and Fadi El-moussa. A Deep Learning Approach Combining Autoencoder with One-class SVM for DDoS Attack Detection in SDNs. IEEE Xplore.

[12] Chao Liu, Zhaojun Gu, and Jialiang Wang. A Hybrid Intrusion Detection System Based on Scalable K-Means+ Random Forest and Deep Learning. Digital Object Identifier 10.1109/ACCESS.2021.3082147.

[13] Jan Lansky, Saqib Ali, and Mokhtar Mohammadi. Deep Learning-Based Intrusion Detection Systems: A Systematic Review. Creative Commons Attribution 4.0 License, Volume 9, 2021.

[14] Abdul-Aziz Fatani, Mohamed Abd Elaziz, and Abdelghani Dahou. IoT Intrusion Detection System Using Deep Learning and Enhanced Transient Search Optimization. IEEE Access, vol. 9, 2021.

[15] Lucas Buschlinger, Sanat Sarda, and Christoph Krauß. Decision Tree-Based Rule Derivation for Intrusion Detection in Safety-Critical Automotive Systems. Original IEEE publication.
