Abstract
Cyberattacks have become one of the most significant security threats of recent years. Understanding such attacks is imperative, and analyzing different kinds of cyberattack datasets helps in building accurate intrusion detection models. This paper analyzes many of the available cyberattack datasets and compares them across the domains in which they are used to detect and predict cyberattacks: Internet of Things (IoT) traffic, network traffic, cyber-physical systems, and web traffic. It provides an overview of each dataset, together with the machine learning approaches that have been applied to it. From this survey, researchers and cybersecurity professionals can derive a convenient classification of these datasets and their uses, based on a review of recent papers in the field. The paper also presents the types of machine learning used in intrusion detection and anomaly detection systems. These techniques include deep learning models, random forests, support vector machines, and other commonly applied methods, each with its own advantages and limitations in the context of cyberattack prediction and detection. The paper also considers factors such as the specific techniques and tools used, the types of attacks covered, and the accuracy achieved. Of a total of 85 papers, 34 were selected for review. This survey is intended as a tool for improving knowledge of the current state of cyberattack detection and prediction techniques.
Recommended Citation
Al-zubidi, Azhar F.; Farhan, Alaa Kadhim; and El-Kenawy, El-Sayed M. (2024) "Surveying Machine Learning in Cyberattack Datasets: A Comprehensive Analysis," Journal of Soft Computing and Computer Applications: Vol. 1: Iss. 1, Article 1000.
Available at: https://jscca.uotechnology.edu.iq/jscca/vol1/iss1/1