A4 Refereed article in a conference publication
Iterative Verification and Batch Processing for Enhancing Accuracy and Confidence Computation in LLM-Based Phishing Email Detection
Authors: Adeseye, Aisvarya; Isoaho, Jouni
Editors: N/A
Conference name: International Conference on AI in Cybersecurity
Publication year: 2026
Book title: 2026 IEEE 5th International Conference on AI in Cybersecurity (ICAIC)
First page: 1
Last page: 7
ISBN: 978-1-6654-7762-8
eISBN: 978-1-6654-7761-1
DOI: https://doi.org/10.1109/ICAIC67076.2026.11395814
Publication's open availability at the time of reporting: No Open Access
Publication channel's open availability: No Open Access publication channel
Web address: https://ieeexplore.ieee.org/document/11395814
Large Language Models (LLMs) are preferred over traditional machine learning because they analyze unstructured email text without requiring extensive feature engineering, large labeled datasets, or specialized domain expertise. However, their outputs can be unstable or hallucinated, so iterative verification prompting is required to refine and validate responses, especially for smaller local models. Confidence scoring is important for decision-making, yet direct confidence estimation (DCE) by LLMs is unreliable; therefore, a structured confidence metric computation is required to improve accuracy and trustworthiness. Consequently, this study proposes an iterative verification and batch processing framework integrated with a Weighted Factor-Based Confidence Computation (WFBCC) approach. Iterative verification prompting repeatedly validates LLM outputs to remove unstable or hallucinated results, improving per-email classification reliability. In contrast, batch processing analyzes emails in smaller groups to limit memory consumption and maintain inference stability. WFBCC was implemented by independently scoring eight phishing indicators via the LLMs and combining these scores with predefined weights to generate transparent, class-conditional confidence values. The proposed framework was evaluated on a dataset of 2,000 emails using GPT-5.1 and LLaMA models (8B, 3B, and 1B). Classification accuracy reached 99.9% for GPT-5.1, 97.3% for LLaMA-8B, 97.15% for LLaMA-3B, and 96.95% for LLaMA-1B, representing an accuracy gain of approximately 32% to 78% over the baseline. Batch processing contributed more to accuracy improvements than verification prompting alone, while combining both yielded the strongest results. Additionally, the WFBCC method consistently outperformed DCE across all confidence levels, with the largest gains observed in smaller LLMs.
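The weighted combination at the heart of WFBCC can be sketched as follows. This is a minimal illustration only: the abstract states that eight phishing indicators are scored independently by the LLM and combined with predefined weights, but the specific indicator names, weight values, and scores below are hypothetical assumptions, not taken from the paper.

```python
# Illustrative sketch of a Weighted Factor-Based Confidence Computation (WFBCC).
# The paper combines LLM-assigned scores for eight phishing indicators using
# predefined weights; the indicators, weights, and scores here are hypothetical.

# Hypothetical per-indicator scores in [0, 1], as an LLM might assign them.
indicator_scores = {
    "urgent_language": 0.9,
    "suspicious_links": 0.8,
    "sender_mismatch": 0.7,
    "credential_request": 1.0,
    "generic_greeting": 0.6,
    "spelling_errors": 0.4,
    "spoofed_branding": 0.8,
    "attachment_risk": 0.3,
}

# Hypothetical predefined weights, normalized to sum to 1.0.
weights = {
    "urgent_language": 0.15,
    "suspicious_links": 0.20,
    "sender_mismatch": 0.15,
    "credential_request": 0.20,
    "generic_greeting": 0.05,
    "spelling_errors": 0.05,
    "spoofed_branding": 0.15,
    "attachment_risk": 0.05,
}

def wfbcc_confidence(scores: dict, weights: dict) -> float:
    """Combine per-indicator scores into one confidence value in [0, 1]."""
    return sum(weights[k] * scores[k] for k in weights)

confidence = wfbcc_confidence(indicator_scores, weights)
print(round(confidence, 3))  # → 0.785
```

Because each indicator score and its weight are visible, the resulting confidence value is transparent and auditable, in contrast to an opaque direct confidence estimate emitted by the LLM itself.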