Template-free Data-to-Text Generation of Finnish Sports News




Jenna Kanerva, Samuel Rönnqvist, Riina Kekki, Tapio Salakoski, Filip Ginter

Mareike Hartmann, Barbara Plank

Nordic Conference on Computational Linguistics

Linköping

2019

Linköping Electronic Conference Proceedings

Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30–October 2, Turku, Finland

NEALT Proceedings Series

42

242

252

978-91-7929-995-8

https://www.aclweb.org/anthology/W19-6125/

https://research.utu.fi/converis/portal/detail/Publication/44407121



News articles such as sports game reports
are often thought to closely follow the underlying game statistics, but in practice
they contain a notable amount of background knowledge, interpretation, insight
into the game, and quotes that are not
present in the official statistics. This
poses a challenge for automated data-to-text news generation with real-world news
corpora as training data. We report on
the development of a corpus of Finnish
ice hockey news, edited to be suitable
for training end-to-end news generation
methods, and demonstrate generated text that journalists judged to be relatively close to a viable product. The new dataset and system source
code are available for research purposes.

