Toward validation of textual information retrieval techniques for software weaknesses
Authors: Jukka Ruohonen, Ville Leppänen
Editors: Mourad Elloumi, Michael Granitzer, Abdelkader Hameurlain, Christin Seifert, Benno Stein, A Min Tjoa, Roland Wagner
Conference: International Conference on Database and Expert Systems Applications
Publisher: Springer Verlag
Year: 2018
Series: Communications in Computer and Information Science
Host publication: Database and Expert Systems Applications: DEXA 2018 International Workshops, BDMICS, BIOKDD, and TIR, Regensburg, Germany, September 3–6, 2018, Proceedings
Volume: 903
First page: 265
Last page: 277
Number of pages: 13
ISBN (print): 978-3-319-99132-0
ISBN (electronic): 978-3-319-99133-7
ISSN: 1865-0929
DOI: https://doi.org/10.1007/978-3-319-99133-7_22
This paper presents a preliminary validation of common textual
information retrieval techniques for mapping unstructured software
vulnerability information to distinct software weaknesses. The
validation is carried out with a dataset compiled from four software
repositories tracked in the Snyk vulnerability database. According to
the results, the information retrieval techniques used perform
unsatisfactorily compared to regular expression searches. Although the
results vary from one repository to another, the preliminary validation
presented indicates that explicit referencing of vulnerability and
weakness identifiers is preferable for concrete vulnerability tracking.
Such referencing allows the use of keyword-based searches, which
currently seem to yield more consistent results than information
retrieval techniques. However, further validation work is required to
improve the precision of the techniques.
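
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the specific retrieval techniques evaluated, so TF-IDF weighting with cosine similarity is used here only as a common stand-in. The weakness descriptions, the function names, and the sample report text are all hypothetical, and the descriptions are abbreviated rather than official CWE wording.

```python
import math
import re
from collections import Counter

# Explicit weakness identifiers such as "CWE-79" (keyword-style search).
CWE_PATTERN = re.compile(r"\bCWE-(\d+)\b", re.IGNORECASE)

# Hypothetical, abbreviated weakness descriptions (not the official CWE texts).
WEAKNESSES = {
    "CWE-79": "cross-site scripting improper neutralization of input "
              "during web page generation",
    "CWE-89": "sql injection improper neutralization of special elements "
              "in an sql command",
    "CWE-22": "path traversal improper limitation of a pathname to a "
              "restricted directory",
}


def find_cwe_ids(text):
    """Return explicitly referenced CWE identifiers in order of appearance."""
    return [f"CWE-{num}" for num in CWE_PATTERN.findall(text)]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF weight dictionary per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed inverse document frequency, so every weight stays positive.
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in tf}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_weaknesses(report):
    """Rank the hypothetical weaknesses by textual similarity to a report."""
    ids = list(WEAKNESSES)
    docs = [tokenize(WEAKNESSES[i]) for i in ids] + [tokenize(report)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scores = [(i, cosine(query, v)) for i, v in zip(ids, vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(find_cwe_ids("Fixes XSS in the template helper (CWE-79)."))
    print(rank_weaknesses("SQL injection in the query builder")[0][0])
```

The regular expression search either finds an explicit identifier or returns nothing, which makes its output easy to verify. The similarity ranking always returns an ordering, even for text that shares little vocabulary with any weakness description, so its output still has to be validated against known mappings.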