A4 Refereed article in a conference publication

LLM-Assisted Codebook Development for Cybersecurity Interviews with Enhanced Accuracy and Reduced Hallucination




Authors: Adeseye, Aisvarya; Isoaho, Jouni; Virtanen, Seppo; Mohammad, Tahir

Editors: N/A

Conference name: International Conference on AI in Cybersecurity

Publication year: 2026

Book title: 2026 IEEE 5th International Conference on AI in Cybersecurity (ICAIC)

First page: 1

Last page: 6

ISBN: 978-1-6654-7762-8

eISBN: 978-1-6654-7761-1

DOI: https://doi.org/10.1109/ICAIC67076.2026.11395872

Publication's open availability at the time of reporting: No Open Access

Publication channel's open availability: No Open Access publication channel

Web address: https://ieeexplore.ieee.org/document/11395872


Abstract

Beyond what numerical data captures, qualitative cybersecurity interviews reveal human behaviors, lived experiences, trust perceptions, and decision-making patterns. However, current manual and software-assisted coding is slow, difficult to scale, and subjective when distinguishing expert from non-expert perspectives. The recent development of Large Language Models (LLMs) makes them useful for qualitative analysis, but larger models remain costly despite lower hallucination rates, while smaller alternatives are cheaper but less reliable. A codebook plays an essential role in structuring themes and interpreting qualitative data transparently and consistently. Therefore, this study proposes an LLM-assisted architecture to generate traceable and hierarchically structured codebooks from cybersecurity interviews. Five techniques were grouped into three areas: accuracy improvement, hallucination reduction, and reduction of context memory usage. These techniques were applied to measure performance, reliability, and coding quality across seven LLMs of various parameter sizes. The architecture produced an accurate codebook that improved coding reliability by up to 75% for non-experts and 35% for experts when compared to baseline manual extraction. Reducing contextual memory use increased processing efficiency by over 40%, enabling even 1B–3B models to run effectively. Hallucination dropped by 82%, which demonstrates that trustworthy qualitative codes can be generated by small and mid-sized LLMs.
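To make the abstract's notions of "traceable" codes and "hallucination reduction" concrete, the following is a minimal illustrative sketch, not the paper's actual architecture: it assumes a hierarchical codebook entry (the `Code` class and `grounded` check are hypothetical names invented here) and filters out any code whose supporting quote does not appear verbatim in the interview transcript.

```python
from dataclasses import dataclass, field

# Hypothetical sketch only -- class and function names are assumptions,
# not taken from the published paper.

@dataclass
class Code:
    label: str                # theme or sub-theme name
    quote: str                # verbatim transcript excerpt supporting the code
    children: list = field(default_factory=list)  # sub-codes (hierarchy)

def grounded(code: Code, transcript: str) -> bool:
    """Accept a code only if its supporting quote (and every child's quote)
    appears verbatim in the transcript -- one simple way to reject
    hallucinated codes while keeping each code traceable to its source."""
    return code.quote in transcript and all(
        grounded(child, transcript) for child in code.children
    )

transcript = "I never reuse passwords at work, but I do at home."

theme = Code(
    "Password hygiene", "reuse passwords",
    children=[Code("Context-dependent behavior", "but I do at home")],
)

print(grounded(theme, transcript))  # True: every quote traces back to the source
```

A fabricated code whose quote is absent from the transcript would fail the same check, which is the intuition behind grounding-based hallucination filtering.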



Last updated on 24/02/2026 08:08:04 AM