Dataset Information

Flow-Data Gathering Using NetFlow Sensors for Fitting Malicious-Traffic Detection Models.


ABSTRACT: Advanced persistent threats (APTs) are a growing concern in cybersecurity. Many companies and governments have reported incidents related to these threats. Throughout the life cycle of an APT, network attacks are among the most commonly used techniques for gaining access. Tools based on machine learning are effective in detecting these attacks. However, researchers often have trouble finding suitable datasets for fitting their models. The problem is even harder when flow data are required. In this paper, we describe a framework for gathering flow datasets using a NetFlow sensor. We also present the Docker-based framework for gathering netflow data (DOROTHEA), a Docker-based solution that implements this framework. The tool aims to make it easy to generate taggable network traffic and so build suitable datasets for fitting classification models. To demonstrate that datasets gathered with DOROTHEA can be used for fitting classification models for malicious-traffic detection, several models were built using the model evaluator (MoEv), a general-purpose tool for training machine-learning algorithms. In the experiments, four models obtained detection rates higher than 93%, which supports the validity of the datasets gathered with the tool.
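
ILLUSTRATION: The abstract refers to taggable traffic and to detection rates without defining either. The sketch below is one possible reading, not the authors' implementation: it assumes flows are tagged as malicious when their source address belongs to a known attack generator, and that the detection rate is the fraction of malicious flows a model flags. The `Flow` fields, `tag_flows`, `detection_rate`, and the IP addresses are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    # Hypothetical subset of NetFlow-style fields, chosen for illustration.
    src_ip: str
    dst_ip: str
    dst_port: int
    packets: int
    n_bytes: int


def tag_flows(flows, attacker_ips):
    """Label a flow 1 (malicious) if its source is a known attacker, else 0.

    Assumption: in a controlled environment the attack generators are known,
    so the source address alone is enough to tag a flow.
    """
    return [1 if f.src_ip in attacker_ips else 0 for f in flows]


def detection_rate(y_true, y_pred):
    """Fraction of malicious flows (label 1) that were predicted malicious.

    Assumption: "detection rate" is read here as recall on the malicious
    class; the abstract does not give the exact definition.
    """
    malicious = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    if not malicious:
        raise ValueError("y_true contains no malicious flows")
    return sum(1 for _, p in malicious if p == 1) / len(malicious)


# Three made-up flows; 10.0.0.5 plays the role of the attack generator.
flows = [
    Flow("10.0.0.5", "10.0.0.9", 22, 40, 2400),
    Flow("10.0.0.7", "10.0.0.9", 80, 12, 9000),
    Flow("10.0.0.5", "10.0.0.9", 23, 38, 2280),
]
labels = tag_flows(flows, attacker_ips={"10.0.0.5"})

# Made-up predictions from some classifier that misses the third flow.
predictions = [1, 0, 0]
rate = detection_rate(labels, predictions)
```

With these made-up values, one of the two malicious flows is flagged, so `rate` is 0.5. In practice the predictions would come from a model fitted on the tagged flows rather than from a hand-written list.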

SUBMITTER: Campazas-Vega A 

PROVIDER: S-EPMC7766632 | biostudies-literature | 2020 Dec

REPOSITORIES: biostudies-literature

Publications

Flow-Data Gathering Using NetFlow Sensors for Fitting Malicious-Traffic Detection Models.

Campazas-Vega Adrián, Crespo-Martínez Ignacio Samuel, Guerrero-Higueras Ángel Manuel, Fernández-Llamas Camino

Sensors (Basel, Switzerland), 2020-12-18, issue 24


Similar Datasets

| S-EPMC9097970 | biostudies-literature
| S-EPMC9158227 | biostudies-literature
| S-EPMC4879632 | biostudies-literature
| S-EPMC5933773 | biostudies-literature
| S-EPMC5995745 | biostudies-literature
| S-EPMC6812335 | biostudies-literature
| S-EPMC8437788 | biostudies-literature
| S-EPMC6275108 | biostudies-literature
| S-EPMC6928227 | biostudies-literature
| S-EPMC2922975 | biostudies-literature