INTRUSION DETECTION AND ATTACK CLASSIFICATION WITH FEATURE SELECTION: A COMPARISON OF UNSUPERVISED MACHINE LEARNING ALGORITHMS
Mälardalen University, School of Innovation, Design and Engineering.
2022 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
Abstract [en]

Intrusion detection is the process of identifying malicious activity in network traffic. When a cyberattack is detected, countermeasures can be taken to minimize its damage, which makes intrusion detection valuable for network security. One way to detect malicious traffic is with a machine learning algorithm. Supervised machine learning algorithms are commonly used, but they require labeled data, which are often unavailable. Unsupervised machine learning algorithms provide an alternative that does not require labeled data, and they are the focus of this thesis. Another challenge is that network traffic data often contain many features. Reducing the number of features can speed up the algorithms applied to the data, and removing redundant features may improve their accuracy. The process of selecting a subset of features is known as feature selection and is explored in this work.

This thesis compares the unsupervised algorithms K-means, Mini Batch K-means, Gaussian Mixture Model, DBSCAN, BIRCH, Isolation Forest, and One-Class Support Vector Machine on the intrusion detection dataset UNSW-NB15. The algorithms are evaluated in two separate experiments designed to measure their clustering and classification ability. For comparison, three supervised algorithms are included in the experiments: K-Nearest Neighbors, Random Forest, and Support Vector Machine. The experiments are performed both with all features and with a feature subset selected by a Genetic Algorithm. Among the unsupervised algorithms, Gaussian Mixture Model performs best for clustering, while BIRCH and Mini Batch K-means perform best for classification. The supervised algorithms outperform the unsupervised ones in all of the experiments. Additionally, feature selection is found to improve the performance of the unsupervised algorithms.
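
The abstract does not describe how the Genetic Algorithm is configured, so the following is only a generic sketch of genetic feature selection, not the thesis's method. Each individual is a bitmask over the features (1 = keep), and the population evolves through tournament selection, single-point crossover, bit-flip mutation, and elitism. The function name `genetic_feature_selection`, all parameter values, and the toy fitness function are assumptions made for illustration. In a real experiment the fitness would presumably score an algorithm on the selected feature subset.

```python
import random


def genetic_feature_selection(n_features, fitness, pop_size=30, generations=40,
                              mutation_rate=0.05, seed=0):
    """Search for a feature bitmask (1 = keep) that maximizes `fitness`.

    Illustrative sketch only; requires n_features >= 2.
    """
    rng = random.Random(seed)
    # Start from random bitmasks, one bit per feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        def tournament():
            # Pick the fitter of two randomly drawn individuals.
            a, b = rng.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b

        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Toy fitness (an assumption, not from the thesis): features 0, 3 and 7 are
# treated as informative, and every selected feature carries a small cost.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


best_mask = genetic_feature_selection(n_features=10, fitness=toy_fitness)
selected = [i for i, bit in enumerate(best_mask) if bit]
print("selected feature indices:", selected)
```

The per-feature cost in the toy fitness stands in for the goal of a smaller feature set. Without such a term, nothing would discourage the search from keeping every feature.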

Place, publisher, year, edition, pages
2022. p. 26
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-59077
OAI: oai:DiVA.org:mdh-59077
DiVA, id: diva2:1670140
Subject / course
Computer Science
Supervisors
Examiners
Available from: 2022-06-20 Created: 2022-06-15 Last updated: 2022-06-20 Bibliographically approved

Open Access in DiVA

No full text in DiVA
