Enhancing disease clustering through symptom-based analysis and large language model interpretations
Authors: Onojete, Efe; Ibeke, Ebuka; Ezenkwu, Chinedu Pascal; Iwendi, Celestine; Ben Dhaou, Imed
Publisher: Springer Nature
Publication year: 2025
Journal: Scientific Reports
Article number: 36651
Volume: 15
ISSN: 2045-2322
DOI: https://doi.org/10.1038/s41598-025-20382-2
Web address: https://doi.org/10.1038/s41598-025-20382-2
Record URL: https://research.utu.fi/converis/portal/detail/Publication/505458384
Humans face various diseases that are mainly caused by environmental conditions and living habits. These diseases exhibit several symptoms and can be related to one another through the symptoms they share. Identifying and interpreting these symptom-based groups of diseases can aid in developing treatment plans for a new disease outbreak. This research explores the intersection of machine learning and healthcare, focusing on enhancing disease classification through symptom-based cluster analysis. By leveraging unsupervised machine learning algorithms, patterns and relationships within diverse symptom datasets were identified, revealing novel associations and subtypes in disease manifestation. The integration of a Large Language Model (LLM), specifically OpenAI’s Generative Pretrained Transformer (GPT), played a pivotal role in interpreting and communicating the complex outputs of the machine learning process. The results indicated a significant improvement in defining distinct clusters based on the relationship between diseases and symptoms, with GPT-4o providing simplified explanations that bridge the gap between machine-generated insights and healthcare professionals’ understanding. The study’s findings offer a deeper understanding of the distinctive features characterising the different clusters of diseases generated by the machine learning models.
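As a rough illustration of the pipeline the abstract describes, the sketch below groups diseases by symptom overlap and then builds a plain-language prompt that could be passed to an LLM. The abstract does not name the clustering algorithm, the distance measure, or the prompt wording, so everything here is an assumption made for illustration: the toy disease-to-symptom table, the Jaccard distance, the average-linkage agglomerative procedure, and the `build_prompt` helper are all hypothetical, and no LLM is actually called.

```python
from itertools import combinations

# Toy disease -> symptom sets (illustrative only, NOT data from the study).
DISEASES = {
    "influenza": {"fever", "cough", "fatigue", "body ache"},
    "common cold": {"cough", "sneezing", "sore throat", "fatigue"},
    "bronchitis": {"cough", "fatigue", "chest discomfort", "fever"},
    "gastroenteritis": {"nausea", "vomiting", "diarrhoea", "fever"},
    "food poisoning": {"nausea", "vomiting", "diarrhoea", "abdominal pain"},
}


def jaccard_distance(a, b):
    """One minus (shared symptoms / all symptoms); 0 means identical sets."""
    return 1.0 - len(a & b) / len(a | b)


def cluster_distance(c1, c2):
    """Average-linkage distance between two clusters of disease names."""
    pairs = [(x, y) for x in c1 for y in c2]
    total = sum(jaccard_distance(DISEASES[x], DISEASES[y]) for x, y in pairs)
    return total / len(pairs)


def agglomerate(names, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [frozenset([n]) for n in names]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def build_prompt(cluster):
    """Compose a request for a simplified explanation (hypothetical wording)."""
    shared = set.intersection(*(DISEASES[d] for d in cluster))
    return (
        f"Diseases in this cluster: {', '.join(sorted(cluster))}. "
        f"Symptoms shared by all: {', '.join(sorted(shared)) or 'none'}. "
        "Explain in simple terms what characterises this group."
    )


clusters = agglomerate(sorted(DISEASES), n_clusters=2)
for cluster in clusters:
    print(build_prompt(cluster))
```

On this toy table the respiratory conditions and the gastrointestinal conditions end up in separate clusters. In a real setting the prompt text would be sent to a model such as GPT-4o and the number of clusters would be chosen from the data rather than fixed in advance.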
Funding: There was no funding to complete this research.