A1 Peer-reviewed original article in a scientific journal
Using Statistical Models of Morphology in the Search for Optimal Units of Representation in the Human Mental Lexicon
Authors: Virpioja S, Lehtonen M, Hultén A, Kivikari H, Salmelin R, Lagus K
Publisher: Wiley Blackwell
Publication year: 2018
Journal: Cognitive Science
Journal name in database: Cognitive Science
Volume: 42
Issue: 3
Number of pages: 35
ISSN: 0364-0213
DOI: https://doi.org/10.1111/cogs.12576
Web address: https://researchportal.helsinki.fi/en/publications/5bf65051-f839-400a-bb12-e8d6eefb16d5
Abstract
Determining optimal units of representing morphologically complex words in the mental lexicon is a central question in psycholinguistics. Here, we utilize advances in computational sciences to study human morphological processing using statistical models of morphology, particularly the unsupervised Morfessor model that works on the principle of optimization. The aim was to see what kind of model structure corresponds best to human word recognition costs for multimorphemic Finnish nouns: a model incorporating units resembling linguistically defined morphemes, a whole-word model, or a model that seeks an optimal balance between these two extremes. Our results showed that human word recognition was predicted best by a combination of two models: a model that decomposes words at some morpheme boundaries while keeping others unsegmented, and a whole-word model. The results support dual-route models that assume that both decomposed and full-form representations are utilized to optimally process complex words within the mental lexicon.