A1 Refereed original research article in a scientific journal

Covariance matrix estimation for left-censored data




Authors: Maiju Pesonen, Henri Pesonen, Jaakko Nevalainen

Publisher: ELSEVIER SCIENCE BV

Publication year: 2015

Journal: Computational Statistics and Data Analysis

Journal name in source: COMPUTATIONAL STATISTICS & DATA ANALYSIS

Journal acronym: COMPUT STAT DATA AN

Volume: 92

First page: 13

Last page: 25

Number of pages: 13

ISSN: 0167-9473

DOI: https://doi.org/10.1016/j.csda.2015.06.005


Abstract

Multivariate methods often rely on a sample covariance matrix. The conventional estimators of a covariance matrix require complete data vectors on all subjects, an assumption that frequently cannot be met. For example, in many fields of the life sciences that use modern measuring technology, such as mass spectrometry, left-censored values caused by denoising the data are a commonplace phenomenon. Left-censored values are low-level concentrations that are considered too imprecise to be reported as a single number but are known to lie somewhere between zero and the laboratory's lower limit of detection. Maximum likelihood-based covariance matrix estimators that allow for the presence of left-censored values, without substituting them with a constant or ignoring them completely, are considered. The presented estimators use all the available information efficiently and thus, based on simulation studies, produce the least biased estimates when compared with commonly used competing estimators. As the genuine maximum likelihood estimate can be computed quickly only in low dimensions, it is suggested to estimate the covariance matrix element-wise and then adjust the resulting matrix to achieve positive semi-definiteness. It is shown that the new approach decreases computation times substantially while still producing accurate estimates. Finally, as an example, a left-censored data set of toxic chemicals is explored.
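
The idea of keeping left-censored values in the likelihood, rather than substituting a constant, can be illustrated with a minimal univariate sketch. It is not the article's estimator: it assumes a normal model for a single variable (for instance a log-transformed concentration), which the abstract does not state, and it uses a simple compass search instead of a proper optimizer. The function names `censored_loglik` and `fit_censored_normal` and the example data are hypothetical. Each fully observed value contributes its density, and each censored value contributes the probability of falling below the detection limit.

```python
import math


def norm_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def norm_logcdf(x, mu, sigma):
    """Log of P(X <= x) for X ~ N(mu, sigma^2); -inf if it underflows."""
    z = (x - mu) / sigma
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(p) if p > 0.0 else float("-inf")


def censored_loglik(observed, n_censored, lod, mu, sigma):
    """Log-likelihood with n_censored values known only to lie below lod."""
    ll = sum(norm_logpdf(x, mu, sigma) for x in observed)
    if n_censored > 0:
        ll += n_censored * norm_logcdf(lod, mu, sigma)
    return ll


def fit_censored_normal(observed, n_censored, lod, rounds=200):
    """Crude compass search over (mu, log sigma).

    Assumes at least two distinct observed values. Starts from the mean
    and standard deviation of the observed values alone.
    """
    n = len(observed)
    mu = sum(observed) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in observed) / n)
    log_s = math.log(sigma)
    step_mu, step_ls = sigma, 0.5
    best = censored_loglik(observed, n_censored, lod, mu, sigma)
    for _ in range(rounds):
        improved = False
        moves = ((step_mu, 0.0), (-step_mu, 0.0), (0.0, step_ls), (0.0, -step_ls))
        for d_mu, d_ls in moves:
            cand = censored_loglik(
                observed, n_censored, lod, mu + d_mu, math.exp(log_s + d_ls)
            )
            if cand > best:
                best, mu, log_s, improved = cand, mu + d_mu, log_s + d_ls, True
        if not improved:
            # No direction helps at this scale, so refine the step sizes.
            step_mu /= 2.0
            step_ls /= 2.0
    return mu, math.exp(log_s)


# Hypothetical data: six measured values and four values below the limit 0.7.
values = [1.2, 0.8, 1.5, 2.0, 0.9, 1.1]
mu_hat, sigma_hat = fit_censored_normal(values, n_censored=4, lod=0.7)
print("mean of observed values only:", sum(values) / len(values))
print("censored-likelihood estimates:", mu_hat, sigma_hat)
```

In this sketch the estimated mean falls below the mean of the observed values, because the censored observations pull the fit towards the region under the detection limit. The multivariate, element-wise estimation and the positive semi-definiteness adjustment described in the abstract are not covered here.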




Last updated on 2024-11-26 at 10:48