Risk Detection in E-commerce with LLMs: Annotation Challenges and Lessons from Real-World Business News
Authors: Davoodi, Laleh; Salimi, Sima; Ginter, Filip; Lorentz, Harri
Editors: Achilleos, Achilleas; Forti, Stefano; Papadopoulos, George Angelos; Pappas, Ilias
Conference: IFIP Conference on e-Business, eServices, and e-Society
Publisher: Springer Science and Business Media Deutschland GmbH
Year: 2025
Series: Lecture Notes in Computer Science
Book title: Pervasive Digital Services for People’s Well-Being, Inclusion and Sustainable Development : 24th IFIP WG 6.11 Conference on e-Business, e-Services and e-Society, I3E 2025, Limassol, Cyprus, September 9–11, 2025, Proceedings
Volume: 16079
First page: 146
Last page: 160
ISBN: 978-3-032-06163-8
eISBN: 978-3-032-06164-5
ISSN: 0302-9743
eISSN: 1611-3349
DOI: https://doi.org/10.1007/978-3-032-06164-5_11
URL: https://link.springer.com/chapter/10.1007/978-3-032-06164-5_11
Research portal URL: https://research.utu.fi/converis/portal/detail/Publication/505553667
The growing complexity of e-commerce supply chains has amplified the need for effective risk monitoring systems. While Large Language Models (LLMs) have demonstrated potential in various domains, their application to real-world risk detection in e-commerce remains underexplored. This study introduces a novel, manually annotated dataset of 121 business news articles covering five major e-commerce-related steel companies (ArcelorMittal, Tata Steel, POSCO, NLMK, and ThyssenKrupp), annotated using the Cambridge Risk Taxonomy. We evaluate the performance of two advanced LLMs in detecting and classifying risks across multiple categories using few-shot prompting and semantic similarity-based example selection. Our results show that LLMs can approximate human annotation with moderate micro F1-scores and high coverage, though challenges remain in recognizing complex Geopolitical risks and avoiding overgeneralization. The findings provide actionable insights into the potential and limitations of LLMs for automated, domain-aware risk monitoring, laying the groundwork for future applications in supply chain risk management.
We gratefully acknowledge the financial support provided by Business Finland, as this research was carried out within the framework of the AI-SIM project.