Systematic investigation of machine learning techniques for network intrusion detection

September 15, 2022

Author: Jacob Martin

Introduction

Network security has become a critical research area owing to the growing interest in, and advances in, communication and internet technologies over the past ten years. It relies on tools such as firewalls, antivirus software, and intrusion detection systems (IDS) to safeguard a network and all of its connected assets in cyberspace. Among these, the network-based intrusion detection system (NIDS) provides the needed protection by continuously monitoring network traffic for hostile and suspicious activity.

Researchers have investigated machine learning (ML) and deep learning (DL) approaches to meet the requirements of an effective IDS. ML and DL both fall under the broad heading of artificial intelligence (AI), and their main goal is to extract meaningful information from large volumes of data. The tremendous growth in network traffic and the related security risks has made it extremely difficult for NIDS to detect malicious intrusions effectively (Ahmad et al., 2021).

The study of DL approaches for NIDS is still in its early stages, and there is considerable room to explore these techniques for detecting network intruders effectively. This paper therefore focuses on recent developments in ML- and DL-based NIDS in order to give a comprehensive overview of current trends in the area.

ML algorithms for NIDS

Decision tree

A decision tree (DT) is one of the fundamental supervised machine learning techniques. It applies a series of decisions (rules) to the dataset for both classification and prediction. The model has the structure of a typical tree, with nodes, branches, and leaves. Each internal node represents a test on a feature, each branch an outcome of that test, and each leaf a predicted class. CART, ID3, and C4.5 are the three most popular DT models. Many more sophisticated learning algorithms, including Random Forest (RF) and XGBoost, are built from multiple decision trees.
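To illustrate how a single tree node can choose its rule, the following is a minimal sketch of a CART-style split search that uses Gini impurity. The feature name (conn_rate), the toy values, and the helper functions gini and best_split are assumptions made for illustration only; they are not taken from any of the surveyed systems.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' rule.

    If no threshold improves on the unsplit impurity, threshold is None.
    """
    best = (None, gini(labels))
    n = len(values)
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: connections per second and the traffic label.
conn_rate = [2, 3, 5, 40, 55, 60]
label = ["normal", "normal", "normal", "attack", "attack", "attack"]

threshold, impurity = best_split(conn_rate, label)
print(f"split on conn_rate <= {threshold}, weighted Gini = {impurity}")
```

A full tree would apply the same search recursively to each side of the split and across all features until the leaves are pure or a depth limit is reached.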

Artificial neural network

An ANN consists of processing units called neurons (nodes) and the connections that link them. The nodes are arranged into an input layer, one or more hidden layers, and an output layer. The backpropagation algorithm is used for the ANN's learning process. The fundamental benefit of an ANN is its ability to perform nonlinear modelling by training on large datasets.
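The layer structure and the backpropagation step can be sketched in a few lines. The example below is a minimal illustration, assuming a single hidden layer, sigmoid activations, squared-error loss, and plain per-sample gradient descent. The two input features (a normalised packet rate and a failed-login ratio), the toy samples, and all hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return (hidden activations, output) for one input vector."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    out = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, out


def train(samples, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network with backpropagation."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in samples:
            hidden, out = forward((w1, b1, w2, b2), x)
            total += (out - y) ** 2
            # Backward pass: gradient of the squared error at the output unit.
            d_out = 2.0 * (out - y) * out * (1.0 - out)
            for j in range(n_hidden):
                # Propagate the error to hidden unit j before updating w2[j].
                d_hid = d_out * w2[j] * hidden[j] * (1.0 - hidden[j])
                w2[j] -= lr * d_out * hidden[j]
                b1[j] -= lr * d_hid
                for i in range(n_in):
                    w1[j][i] -= lr * d_hid * x[i]
            b2 -= lr * d_out
        losses.append(total / len(samples))
    return (w1, b1, w2, b2), losses


# Hypothetical samples: [normalised packet rate, failed-login ratio] -> 1 = attack.
samples = [([0.1, 0.0], 0), ([0.2, 0.1], 0), ([0.9, 0.8], 1), ([0.8, 0.9], 1)]
params, losses = train(samples)
_, p_attack = forward(params, [0.85, 0.85])
_, p_normal = forward(params, [0.15, 0.05])
print(f"mean loss: first epoch {losses[0]:.4f}, last epoch {losses[-1]:.4f}")
print(f"attack-like score {p_attack:.3f}, normal-like score {p_normal:.3f}")
```

Practical NIDS models use many more features, deeper architectures, and a dedicated numerical library, but the forward and backward passes follow the same pattern.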

Ensemble methods

The fundamental idea of ensemble methods is to learn collaboratively so as to benefit from the strengths of different classifiers, since every classifier has its advantages and disadvantages. Some classifiers may be effective at spotting a particular kind of attack but perform poorly against other attack types. An ensemble approach trains many weak classifiers and combines their outputs, typically through a voting technique, into a stronger classifier (Salih et al., 2021).
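The voting step can be sketched as follows. This is a minimal illustration of hard (majority) voting, assuming three hand-written threshold rules in place of trained classifiers. The rule functions, the flow field names, and the thresholds are all hypothetical.

```python
from collections import Counter


# Three hypothetical weak classifiers, each looking at a single flow statistic.
def rate_rule(flow):
    return "attack" if flow["packets_per_s"] > 1000 else "normal"


def login_rule(flow):
    return "attack" if flow["failed_logins"] >= 5 else "normal"


def port_rule(flow):
    return "attack" if flow["distinct_ports"] > 100 else "normal"


def majority_vote(classifiers, flow):
    """Return the label predicted by most classifiers (hard voting)."""
    votes = Counter(clf(flow) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [rate_rule, login_rule, port_rule]

port_scan = {"packets_per_s": 1500, "failed_logins": 0, "distinct_ports": 400}
benign = {"packets_per_s": 40, "failed_logins": 1, "distinct_ports": 3}

print(majority_vote(ensemble, port_scan))  # two of three rules vote "attack"
print(majority_vote(ensemble, benign))     # all three rules vote "normal"
```

An odd number of voters avoids ties in the two-class case. Weighted or soft voting, where each classifier contributes a probability instead of a label, is a common alternative.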

Research challenges

Unavailability of a systematic dataset

This study brought to light the absence of an up-to-date dataset that reflects novel attacks on contemporary networks. The systematic creation of such a dataset, with sufficient examples of practically all attack types, is one of the open research problems for IDS. The dataset should be updated regularly to reflect the most recent intrusion instances and made publicly available to aid the research community.

Low performance in real-world environment

The effectiveness of IDS in a real-world setting is another research challenge. Most of the proposed approaches are examined and validated in a lab setting using publicly available datasets, so their performance on live network traffic is not guaranteed (Imrana et al., 2021).

Future trends

Efficient NIDS framework

The IDS framework should frequently update the attack characteristics in its dataset, and the model should continue to be trained on the updated definitions so that it can learn new features. In the long run, this will help the IDS model detect zero-day threats more accurately and reduce false alarms.
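One simple way to picture continued training is an incremental model whose state is updated batch by batch instead of being rebuilt from scratch. The sketch below uses a nearest-centroid classifier with running means purely as an illustration; it is not a method proposed in the surveyed work. The class name IncrementalCentroidModel, the two features, and the labels are hypothetical.

```python
import math


class IncrementalCentroidModel:
    """Nearest-centroid classifier that can absorb new labelled batches."""

    def __init__(self):
        self.centroids = {}  # label -> running mean of its feature vectors
        self.counts = {}     # label -> number of samples seen so far

    def partial_fit(self, samples):
        """Update the running means with a batch of (features, label) pairs."""
        for features, label in samples:
            if label not in self.centroids:
                # First sample of a previously unseen class.
                self.centroids[label] = list(features)
                self.counts[label] = 1
                continue
            self.counts[label] += 1
            n = self.counts[label]
            mean = self.centroids[label]
            for i, value in enumerate(features):
                mean[i] += (value - mean[i]) / n

    def predict(self, features):
        """Return the label whose centroid is closest to the feature vector."""
        return min(self.centroids,
                   key=lambda lbl: math.dist(features, self.centroids[lbl]))


model = IncrementalCentroidModel()

# Initial training batch with the attack types known at deployment time.
model.partial_fit([
    ([0.1, 0.1], "normal"), ([0.2, 0.1], "normal"),
    ([0.9, 0.2], "dos"), ([0.8, 0.1], "dos"),
])
before = model.predict([0.15, 0.85])

# A later batch adds examples of a newly observed attack type.
model.partial_fit([([0.2, 0.9], "scan"), ([0.1, 0.8], "scan")])
after = model.predict([0.15, 0.85])

print(before, "->", after)
```

Before the update, the unseen pattern is assigned to the closest known class; after the update, it is assigned to the newly added class. A production framework would also need to validate new labels and guard against forgetting older attack patterns.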

Solution to complex models

When only the essential features are selected, detection accuracy can remain almost as high as when the full set of features is used. As a result, the model becomes less complex and requires less computing power at run time.
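A filter-style feature selection step can be sketched as follows, assuming that features are ranked by the absolute Pearson correlation between each feature column and the binary attack label. The column names, the toy values, and the helper select_top_k are hypothetical, and correlation ranking is only one of several possible selection criteria.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def select_top_k(columns, labels, k):
    """Rank features by |correlation with the label| and keep the best k."""
    scores = {name: abs(pearson(values, labels))
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical flow statistics; labels: 0 = normal, 1 = attack.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "packets_per_s": [10, 12, 11, 90, 95, 88],
    "failed_logins": [0, 1, 0, 4, 6, 5],
    "ttl": [64, 64, 64, 64, 64, 64],
    "payload_len": [200, 800, 500, 300, 700, 400],
}

selected, scores = select_top_k(columns, labels, k=2)
print(selected)
```

On this toy data the constant ttl column and the weakly related payload_len column are dropped. Whether accuracy is preserved on a real dataset has to be checked by retraining the detector on the reduced feature set.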

Efficient NIDS for cyber-physical systems

An effective and intelligent NIDS is needed to identify intrusions within networks that support unmanned aerial vehicles (UAVs). The use of AI in NIDS for UAV-enabled systems is a promising study area, but it needs additional exploration and development.

Conclusions

To give new researchers access to the most recent information, trends, and advancements in the area, this paper presents a thorough review of network intrusion detection systems based on ML and DL methodologies. Relevant publications in the area of AI-based NIDS were selected using a systematic methodology. Future work in this area may focus on proposing an effective NIDS framework with less complicated DL algorithms and detection mechanisms. Building on this knowledge, we plan to create a cutting-edge, portable, and effective machine learning-based NIDS that can successfully identify network intruders.