Toward validation of textual information retrieval techniques for software weaknesses




Jukka Ruohonen, Ville Leppänen

Mourad Elloumi, Michael Granitzer, Abdelkader Hameurlain, Christin Seifert, Benno Stein, A Min Tjoa, Roland Wagner

International Conference on Database and Expert Systems Applications

Publisher: Springer Verlag

2018

Communications in Computer and Information Science

Database and Expert Systems Applications: DEXA 2018 International Workshops, BDMICS, BIOKDD, and TIR, Regensburg, Germany, September 3–6, 2018, Proceedings

903

265

277

13

978-3-319-99132-0

978-3-319-99133-7

1865-0929

DOI: https://doi.org/10.1007/978-3-319-99133-7_22



This paper presents a preliminary validation of common textual
information retrieval techniques for mapping unstructured software
vulnerability information to distinct software weaknesses. The
validation is carried out with a dataset compiled from four software
repositories tracked in the Snyk vulnerability database. According to
the results, the information retrieval techniques used perform
unsatisfactorily compared to regular expression searches. Although the
results vary from one repository to another, the preliminary validation
presented indicates that explicit referencing of vulnerability and
weakness identifiers is preferable for concrete vulnerability tracking.
Such referencing allows the use of keyword-based searches, which
currently seem to yield more consistent results than information
retrieval techniques. However, further validation work is required to
improve the precision of the techniques.
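
The two approaches contrasted in the abstract can be illustrated with a
minimal sketch. It is not the authors' implementation: the function names
are hypothetical, TF-IDF with cosine similarity is assumed here as one
common stand-in for a textual information retrieval technique, and the two
weakness descriptions are abbreviated from the public CWE names for
illustration only.

```python
import math
import re
from collections import Counter

# Keyword-style search: match explicit identifiers such as "CWE-79".
CWE_PATTERN = re.compile(r"\bCWE-(\d+)\b", re.IGNORECASE)


def find_cwe_ids(text):
    """Return the distinct CWE identifiers explicitly referenced in text."""
    ids = {f"CWE-{int(num)}" for num in CWE_PATTERN.findall(text)}
    return sorted(ids, key=lambda s: int(s[4:]))


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) with a smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_weaknesses(report, weaknesses):
    """Rank weakness identifiers by textual similarity to a report."""
    ids = list(weaknesses)
    vectors = tfidf_vectors([report] + [weaknesses[i] for i in ids])
    scores = [(i, cosine(vectors[0], v)) for i, v in zip(ids, vectors[1:])]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Illustrative data only (assumption): abbreviated CWE descriptions.
WEAKNESSES = {
    "CWE-79": "improper neutralization of input during web page generation "
              "cross-site scripting",
    "CWE-89": "improper neutralization of special elements used in an sql "
              "command sql injection",
}

if __name__ == "__main__":
    print(find_cwe_ids("Fixes XSS (CWE-79) and cwe-89; see CWE-79"))
    report = "SQL injection in the query builder allows arbitrary SQL commands"
    print(rank_weaknesses(report, WEAKNESSES))
```

The regular expression only succeeds when a report cites an identifier
explicitly, whereas the similarity ranking always returns a guess whose
quality depends on vocabulary overlap.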


