A4 Peer-reviewed article in conference proceedings
Contrastive Language-Entity Pre-training for Richer Knowledge Graph Embedding
Authors: Papaluca Andrea; Krefl Daniel; Lensky Artem; Suominen Hanna
Editors: Wallraven, Christian; Liu, Cheng-Lin; Ross, Arun
Conference name: International Conference on Pattern Recognition and Artificial Intelligence
Publisher: Springer Nature Singapore
Year of publication: 2025
Journal: Lecture Notes in Computer Science
Title of the edited volume: Pattern Recognition and Artificial Intelligence: 4th International Conference, ICPRAI 2024, Jeju Island, South Korea, July 03–06, 2024, Proceedings, Part I
Journal name in the database: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14892
First page: 233
Last page: 246
ISBN: 978-981-97-8701-2
eISBN: 978-981-97-8702-9
ISSN: 0302-9743
eISSN: 1611-3349
DOI: https://doi.org/10.1007/978-981-97-8702-9_16
URL: https://doi.org/10.1007/978-981-97-8702-9_16
In this work, we propose a pre-training procedure that aligns a graph encoder and a text encoder to learn a common multi-modal graph-text embedding space. The alignment is obtained by training the model to predict the correct associations between Knowledge Graph nodes and their corresponding textual descriptions. We test the procedure with two popular Knowledge Bases: Wikidata and YAGO. Our results indicate that this pre-training allows for link prediction without any additional fine-tuning. Furthermore, we demonstrate that a graph encoder pre-trained on the description-matching task achieves improved link prediction performance after fine-tuning, without requiring node descriptions as additional inputs. The code used in the experiments is available on GitHub (https://github.com/BrunoLiegiBastonLiegi/CLEP) under the MIT license to encourage further work.
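To illustrate the description-matching objective summarised in the abstract, the following minimal Python (PyTorch) sketch aligns node embeddings from a graph encoder with description embeddings from a text encoder via a symmetric, CLIP-style contrastive loss over a batch of matched node-description pairs. The class name, projection heads, dimensions, and temperature handling are illustrative assumptions for exposition, not the authors' exact implementation; the linked repository contains the actual code.

# Hedged sketch of a CLIP-style contrastive alignment between a graph encoder
# and a text encoder. All names and dimensions below are assumptions for
# illustration, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveAligner(nn.Module):
    def __init__(self, node_dim: int, text_dim: int, shared_dim: int = 256):
        super().__init__()
        # Projection heads mapping each modality into a shared embedding space.
        self.node_proj = nn.Linear(node_dim, shared_dim)
        self.text_proj = nn.Linear(text_dim, shared_dim)
        # Learnable temperature for scaling the similarity logits.
        self.log_temp = nn.Parameter(torch.tensor(0.0))

    def forward(self, node_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # node_emb: (B, node_dim) graph-encoder outputs for B entities.
        # text_emb: (B, text_dim) text-encoder outputs for their descriptions.
        z_node = F.normalize(self.node_proj(node_emb), dim=-1)
        z_text = F.normalize(self.text_proj(text_emb), dim=-1)
        # Pairwise cosine similarities; the matching description of node i
        # sits on the diagonal of the (B, B) logit matrix.
        logits = z_node @ z_text.t() * self.log_temp.exp()
        targets = torch.arange(logits.size(0), device=logits.device)
        # Symmetric cross-entropy: node-to-description and description-to-node.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Toy batch of random tensors standing in for encoder outputs.
    model = ContrastiveAligner(node_dim=128, text_dim=768)
    loss = model(torch.randn(8, 128), torch.randn(8, 768))
    loss.backward()
    print(f"contrastive loss: {loss.item():.4f}")

In this sketch, a node embedding aligned this way can be compared against candidate description (or entity) embeddings by cosine similarity, which is how a contrastively pre-trained space can support link prediction without further fine-tuning, in the spirit of the result the abstract reports.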