A4 Refereed article in a conference publication

Benchmarking Large Language Models for Reputational Risk Assessment in Supply Chains: A Case Study

Authors: Davoodi, Laleh; Salimi, Sima; Jyote, Abul Khair; Mezei, Jozsef; Ginter, Filip

Editors: Potter, Andrew; Pawar, Kulwant S.; Kalverkamp, Matthias; Rogers, Helen

Conference name: International Symposium on Logistics

Publication year: 2025

Book title: Proceedings of the 29th International Symposium on Logistics (2025): Embedding Circularity in Supply Chains

First page: 272

Last page: 279

ISBN: 978-0-85358-354-7

Publication's open availability at the time of reporting: No Open Access

Publication channel's open availability: No Open Access publication channel

Web address: https://www.islconf.org/29th-isl-wiesbaden-germany-2025/

Abstract

Risk management is essential for maintaining efficient supply chain operations amid uncertainties, as disruptions can lead to severe financial and operational consequences. While traditional supply chain risk management (SCRM) strategies focus on mitigating operational risks, reputational risk remains an overlooked yet critical factor influencing supply chain resilience. Recent advancements in machine learning (ML) and large language models (LLMs) have significantly improved risk identification, inventory management, and supply chain transparency. Studies have demonstrated the effectiveness of models such as GPT-4, GPT-3.5-Turbo, and fine-tuned RoBERTa in analysing supply chain risks using prompt engineering and retrieval-augmented generation (RAG). However, existing ML studies lack a dedicated focus on reputational risk in supply chains, particularly in real-time news article analysis.

Additionally, there is no systematic benchmarking of different LLMs in SCRM applications under identical conditions. Emerging LLMs such as DeepSeek and LLaMA3-70B also remain underexplored in SCRM contexts, especially given the rapid advancements in the field and the recent emergence of highly capable open models. This study addresses these gaps by exploring an LLM-driven framework for reputational risk assessment using news article analysis and conducting the first comparative evaluation of multiple LLMs for supply chain risk detection. The findings aim to enhance AI-driven risk management strategies and provide insights into the effectiveness of LLMs in improving supply chain resilience. By systematically evaluating LLMs for reputational risk assessment in supply chains, the study also establishes a foundation for future research, giving researchers a clear starting point for exploring different LLMs in risk management.

Funding information in the publication:
This research was conducted as part of the AI-SIM project, which is funded by Business Finland. We sincerely appreciate the support provided by our research collaborators and funding partners.