A4 Refereed article in a conference publication

Classifying Web Exploits with Topic Modeling




Authors Jukka Ruohonen

Editors A Min Tjoa, Roland R. Wagner

Conference name International Workshop on Database and Expert Systems Applications

Publication year 2017

Book title Proceedings of the 28th International Workshop on Database and Expert Systems Applications (DEXA), 2017

First page 93

Last page 97

Number of pages 5

ISBN 978-1-5386-2207-0

eISBN 978-1-5386-1051-0

ISSN 1529-4188

DOI https://doi.org/10.1109/DEXA.2017.35

Web address http://ieeexplore.ieee.org/document/8049693/

Self-archived copy’s web address https://arxiv.org/abs/1710.05561


Abstract

This short empirical paper investigates how well topic modeling and database meta-data characteristics can classify web and other proof-of-concept (PoC) exploits for publicly disclosed software vulnerabilities. Using a dataset of over 36 thousand PoC exploits, the empirical experiment obtains an accuracy rate near 0.9. Text mining and topic modeling contribute significantly to this classification performance. In addition to these empirical results, the paper contributes to the research tradition of enhancing software vulnerability information with text mining, and it also provides a few scholarly observations about the potential for semi-automatic classification of exploits in the existing tracking infrastructures.


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




