A1 Refereed original research article in a scientific journal
MicFunPred: A conserved approach to predict functional profiles from 16S rRNA gene sequence data
Authors: Mongad DS, Chavan NS, Narwade NP, Dixit K, Shouche YS, Dhotre DP
Publisher: ACADEMIC PRESS INC ELSEVIER SCIENCE
Publication year: 2021
Journal: Genomics
Journal name in source: GENOMICS
Journal acronym: GENOMICS
Volume: 113
Issue: 6
First page: 3635
Last page: 3643
Number of pages: 9
ISSN: 0888-7543
DOI: https://doi.org/10.1016/j.ygeno.2021.08.016
Abstract
16S rRNA gene amplicon sequencing is a popular technique that provides accurate characterization of microbial taxonomic abundances but does not provide any functional information. Several tools are available to predict functional profiles based on 16S rRNA gene sequence data that use different genome databases and approaches. As variable regions of the partially sequenced 16S rRNA gene cannot resolve taxonomy accurately beyond the genus level, these tools may give inflated results. Here, we developed 'MicFunPred', which uses a novel approach to derive imputed metagenomes based on a set of core genes only, thereby minimizing false-positive predictions. On simulated datasets, MicFunPred showed the lowest False Positive Rate (FPR) with a mean Spearman's correlation of 0.89 (SD = 0.03), while on seven real datasets the mean correlation was 0.75 (SD = 0.08). MicFunPred was found to be faster, with low computational requirements, and performed better than or comparably to other tools.
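The idea of restricting prediction to core genes can be illustrated with a minimal sketch. This is not MicFunPred's implementation: the function names, the `min_fraction` threshold, the use of a mean copy number, and the toy data are all assumptions made for illustration. The sketch treats a gene as "core" for a genus when it occurs in at least a chosen fraction of that genus's genomes, weights core genes by genus abundance, and computes a false positive rate against a reference profile.

```python
from collections import defaultdict


def core_genes(genomes, min_fraction=1.0):
    """Return {gene: mean copy number} for genes found in at least
    `min_fraction` of the genomes of one genus (hypothetical threshold)."""
    n = len(genomes)
    present = defaultdict(int)
    total = defaultdict(float)
    for genome in genomes:
        for gene, copies in genome.items():
            if copies > 0:
                present[gene] += 1
                total[gene] += copies
    # Mean copy number is taken over the genomes that carry the gene.
    return {g: total[g] / present[g] for g in present if present[g] / n >= min_fraction}


def impute_metagenome(genus_abundance, genus_genomes, min_fraction=1.0):
    """Sum abundance-weighted core-gene copy numbers across genera."""
    profile = defaultdict(float)
    for genus, abundance in genus_abundance.items():
        genomes = genus_genomes.get(genus, [])
        for gene, copies in core_genes(genomes, min_fraction).items():
            profile[gene] += abundance * copies
    return dict(profile)


def false_positive_rate(predicted, reference, all_genes):
    """FPR = FP / (FP + TN) over genes absent from the reference."""
    fp = sum(1 for g in all_genes if predicted.get(g, 0) > 0 and reference.get(g, 0) == 0)
    tn = sum(1 for g in all_genes if predicted.get(g, 0) == 0 and reference.get(g, 0) == 0)
    return fp / (fp + tn) if fp + tn else 0.0


# Toy data (invented): per-genus genomes as {gene: copy number}.
genus_genomes = {
    "GenusA": [{"K1": 1, "K2": 2}, {"K1": 1, "K3": 1}],
    "GenusB": [{"K2": 1}, {"K2": 3}],
}
genus_abundance = {"GenusA": 10, "GenusB": 5}
reference = {"K1": 8, "K2": 12}
all_genes = ["K1", "K2", "K3", "K4"]

strict = impute_metagenome(genus_abundance, genus_genomes, min_fraction=1.0)
loose = impute_metagenome(genus_abundance, genus_genomes, min_fraction=0.5)
fpr_strict = false_positive_rate(strict, reference, all_genes)
fpr_loose = false_positive_rate(loose, reference, all_genes)
```

In this toy case the stricter threshold drops the genes that only some GenusA genomes carry, so fewer genes absent from the reference are predicted; that is the trade-off the abstract describes as minimizing false-positive predictions.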