A1 Refereed original research article in a scientific journal
GPT-4V shows human-like social perceptual capabilities at phenomenological and neural levels
Authors: Santavirta, Severi; Wu, Yuhang; Suominen, Lauri; Nummenmaa, Lauri
Publisher: MIT Press
Publication year: 2025
Journal: Imaging neuroscience
Article number: IMAG.a.134
Volume: 3
eISSN: 2837-6056
DOI: https://doi.org/10.1162/IMAG.a.134
Web address: https://doi.org/10.1162/imag.a.134
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/500007546
Humans navigate the social world by rapidly perceiving social features from other people and their interactions. Recently, large language models (LLMs) have achieved high-level visual capabilities for detailed object and scene content recognition and description. This raises the question of whether LLMs can infer complex social information from images and videos, and whether the high-dimensional structure of their feature annotations aligns with that of humans. We collected evaluations for 138 social features from GPT-4V for images (N = 468) and videos (N = 234) derived from social movie scenes. These evaluations were compared with human evaluations (N = 2,254). The comparisons established that GPT-4V can achieve human-like capabilities at annotating individual social features. The GPT-4V social feature annotations also show a structural representation similar to the human social perceptual structure (i.e., a similar correlation matrix over all social feature annotations). Finally, we modeled hemodynamic responses (N = 97) to viewing socioemotional movie clips with feature annotations by human observers and GPT-4V. These results demonstrated that GPT-4V-based stimulus models can reveal a social perceptual network in the human brain that is highly similar to the one revealed by stimulus models based on human annotations. These human-like annotation capabilities of LLMs could have a wide range of real-life applications, from health care to business, and could open exciting new avenues for psychological and neuroscientific research.
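The structural comparison mentioned in the abstract (a similar correlation matrix over all social feature annotations) can be illustrated with a minimal sketch. It assumes that similarity is measured as the Pearson correlation between the off-diagonal entries of two feature-by-feature correlation matrices. The record does not state the authors' exact procedure, so the function names, the synthetic ratings, and the choice of Pearson correlation are all illustrative assumptions.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_feature_correlations(ratings):
    """Upper triangle of the feature-by-feature correlation matrix.

    ratings: list of rows, one per stimulus, each a list of feature ratings.
    """
    columns = list(zip(*ratings))
    pairs = combinations(range(len(columns)), 2)
    return [pearson(columns[i], columns[j]) for i, j in pairs]


def structure_similarity(ratings_a, ratings_b):
    """Correlate the pairwise feature correlations of two annotation sets."""
    return pearson(
        pairwise_feature_correlations(ratings_a),
        pairwise_feature_correlations(ratings_b),
    )


# Synthetic stand-in data (hypothetical, not the study's annotations):
# two annotators rate the same stimuli with shared structure plus noise.
rng = random.Random(0)
loadings = [1.0, -1.0, 0.8, -0.6, 0.4, 0.2, -0.3, 0.9]
latent = [rng.gauss(0, 1) for _ in range(200)]


def simulate(noise_sd):
    """Ratings driven by one latent dimension plus independent noise."""
    return [[z * w + rng.gauss(0, noise_sd) for w in loadings] for z in latent]


human = simulate(0.5)
model = simulate(0.5)
unrelated = [[rng.gauss(0, 1) for _ in loadings] for _ in latent]

print("shared structure:", structure_similarity(human, model))
print("no shared structure:", structure_similarity(human, unrelated))
```

In this toy setup, the two annotation sets that share an underlying structure should yield a much higher similarity than the unrelated set. The actual study compares 138 features across human and GPT-4V annotations.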
Downloadable publication: This is an electronic reprint of the original article.
Funding information in the publication:
The study was supported by the European Research Council (advanced grant #101141656), Jane and Aatos Erkko Foundation, and Sigrid och Ane Gyllenberg's Stiftelse grants to L.N. In this project, S.S. was supported by grants from Turku University Foundation, Alfred Kordelin Foundation, and Finnish Governmental Research Funding for Turku University Hospital and for the Western Finland collaborative area. We would like to express our gratitude to Haoming Zhong, a graduate student at the National Key Laboratory for Novel Software Technology, Nanjing University, for his valuable insights, inspiration, and programming assistance on this project.