A4 Peer-reviewed article in conference proceedings

To Compress or not to Compress? A Finite-State Approach to Nen Verbal Morphology




Authors: Saliha Muradoglu, Nicholas Evans, Hanna Suominen

Established conference name: Annual Meeting of the Association for Computational Linguistics

Publication year: 2020

Title of parent publication: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

First page: 207

Last page: 213

Number of pages: 7

ISBN: 978-1-952148-03-3

DOI: https://doi.org/10.18653/v1/2020.acl-srw.28


Abstract

This paper describes the development of a verbal morphological parser
for an under-resourced Papuan language, Nen. Nen verbal morphology is
particularly complex, with a transitive verb taking up to 1,740 unique
features. The structural properties exhibited by Nen verbs raise
interesting choices for analysis. Here we compare two possible methods
of analysis: ‘Chunking’ and decomposition. ‘Chunking’ refers to the
concept of collating morphological segments into one, whereas the
decomposition model follows a more classical linguistic approach. Both
models are built using the Finite-State Transducer toolkit foma. The
resultant architecture shows differences in size and structural clarity.
While the ‘Chunking’ model is under half the size of the full
decomposed counterpart, the decomposition model displays higher structural
order. In this paper, we describe the challenges encountered when
modelling a language exhibiting distributed exponence and present the
first morphological analyser for Nen, with an overall accuracy of 80.3%.
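
The contrast between the two analyses can be illustrated with a minimal sketch. It is written in plain Python, not foma, and is not the authors' implementation. All morphemes and tags are hypothetical placeholders, not Nen data. The decomposed lexicon keeps one slot per suffix, while the chunked lexicon collapses the two suffix slots into single entries with a fused tag. Both accept the same surface forms and differ only in how finely each form is analysed.

```python
from itertools import product

# Hypothetical toy morphemes (placeholders, NOT real Nen data).
PREFIXES = {"p1": "PFX1", "p2": "PFX2"}
STEM = ("stem", "STEM")
SUFFIX_A = {"a": "A1", "e": "A2"}   # first suffix slot
SUFFIX_B = {"n": "B1", "m": "B2"}   # second suffix slot


def decomposed():
    """Decomposition: every morpheme slot is listed and tagged separately."""
    forms = {}
    for (p, pt), (a, at), (b, bt) in product(
        PREFIXES.items(), SUFFIX_A.items(), SUFFIX_B.items()
    ):
        forms[p + STEM[0] + a + b] = [pt, STEM[1], at, bt]
    return forms


# Chunking: the two suffix slots are collated into one lexicon entry each,
# carrying a single fused tag instead of two separate ones.
SUFFIX_CHUNKS = {
    a + b: f"{at}.{bt}"
    for (a, at), (b, bt) in product(SUFFIX_A.items(), SUFFIX_B.items())
}


def chunked():
    """Chunking: prefix + stem + one pre-combined suffix chunk."""
    forms = {}
    for (p, pt), (c, ct) in product(PREFIXES.items(), SUFFIX_CHUNKS.items()):
        forms[p + STEM[0] + c] = [pt, STEM[1], ct]
    return forms


if __name__ == "__main__":
    d, c = decomposed(), chunked()
    # Same surface forms, different analysis granularity.
    print(sorted(d) == sorted(c))
    print(d["p1steman"], c["p1steman"])
```

The toy says nothing about transducer size. The size difference reported in the abstract concerns the compiled foma networks, which this sketch does not model.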



Last updated on 2024-11-26 at 19:43