A1 Refereed original research article in a scientific journal
Lects in Helsinki Finnish - a probabilistic component modeling approach
Authors: Kuparinen, Olli; Peltonen, Jaakko; Mustanoja, Liisa; Leino, Unni; Santaharju, Jenni
Publisher: CAMBRIDGE UNIV PRESS
Publishing place: CAMBRIDGE
Publication year: 2021
Journal: Language Variation and Change
Journal name in source: LANGUAGE VARIATION AND CHANGE
Journal acronym: LANG VAR CHANGE
Article number: PII S0954394521000041
Volume: 33
Issue: 1
First page: 1
Last page: 26
Number of pages: 26
ISSN: 0954-3945
eISSN: 1469-8021
DOI: https://doi.org/10.1017/S0954394521000041
Abstract
This article examines Finnish lects spoken in Helsinki from the 1970s to the 2010s with a probabilistic model called Latent Dirichlet Allocation. The model searches for underlying components based on the linguistic features used in the interviews. Several coherent lects were discovered as components in the data, which counters the results of previous studies that report only weak covariation between features that are assumed to be present in the same lect. The speakers, however, are not categorical in their linguistic behavior and tend to use more than one lect in their speech. This implies that the lects should not be considered in parallel with seemingly uniform linguistic systems such as languages, but as partial systems that constitute a network.