Refereed journal article or data article (A1)

Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining




List of Authors: Sanna M. Kreula, Suwisa Kaewphan, Filip Ginter, Patrik R. Jones

Publication year: 2018

Journal: PeerJ

Journal acronym: PeerJ

Article number: 29844966

Volume number: 6

Issue number: e4806

Number of pages: 26

ISSN: 2167-8359

eISSN: 2167-8359

DOI: http://dx.doi.org/10.7717/peerj.4806

URL: https://peerj.com/articles/4806/

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/32074991


Abstract

The increasing move towards open access full-text scientific literature
enhances our ability to utilize advanced text-mining methods to
construct information-rich networks that no human will be able to grasp
simply from 'reading the literature'. The utility of text-mining for
well-studied species is obvious, though the utility for less-studied
species, or those with no prior track record at all, is not clear. Here
we present a concept for how advanced text-mining can be used to create
information-rich networks even for less well-studied species and apply
it to generate an open-access gene-gene association network resource for
Synechocystis sp. PCC 6803, a representative model organism for
cyanobacteria and the first case study for the methodology. By merging the
text-mining network with networks generated from species-specific
experimental data, network integration was used to enhance the accuracy
of predicting novel interactions that are biologically relevant. A
rule-based algorithm (filter) was constructed in order to automate the
search for novel candidate genes with a high degree of likely
association to known target genes by (1) ignoring established
relationships from the existing literature, as they are already 'known',
and (2) demanding multiple independent lines of evidence for every novel and
potentially relevant relationship. Using selected case studies, we
demonstrate the utility of the network resource and filter to (i)
discover novel candidate associations between different genes or
proteins in the network, and (ii)
rapidly evaluate the potential role of any one particular gene or
protein. The full network is provided as an open-source resource.
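
The two filter rules described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name, the network names, the gene identifiers, and the `min_sources` threshold are all hypothetical, and the sketch assumes that each non-literature network counts as one independent source of evidence.

```python
from collections import defaultdict


def find_novel_candidates(networks, literature_network, targets, min_sources=2):
    """Return {(candidate, target): [supporting networks]} for novel pairs.

    Assumptions (not taken from the article): each network is a list of
    undirected gene pairs, and each non-literature network counts as one
    independent source of evidence.
    """
    # Rule 1: pairs already connected in the literature network are "known".
    known = {frozenset(edge) for edge in networks[literature_network]}

    support = defaultdict(set)  # (candidate, target) -> names of supporting networks
    for name, edges in networks.items():
        if name == literature_network:
            continue
        for a, b in edges:
            if frozenset((a, b)) in known:
                continue  # established relationship, ignore it
            # Edges are undirected, so check both orientations.
            for candidate, target in ((a, b), (b, a)):
                if target in targets and candidate not in targets:
                    support[(candidate, target)].add(name)

    # Rule 2: keep only pairs supported by several independent networks.
    return {
        pair: sorted(sources)
        for pair, sources in support.items()
        if len(sources) >= min_sources
    }


# Toy data with placeholder gene names (purely illustrative).
networks = {
    "text_mining": [("geneA", "geneB")],
    "coexpression": [("geneA", "geneB"), ("geneA", "geneC"), ("geneD", "geneA")],
    "protein_interaction": [("geneC", "geneA"), ("geneA", "geneB")],
}

candidates = find_novel_candidates(networks, "text_mining", targets={"geneA"})
print(candidates)
```

In this toy input, the geneA-geneB pair is discarded because it already appears in the literature-derived network, geneD is discarded because only one network supports it, and geneC is retained because two separate networks link it to the target.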


Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.




Last updated on 2022-07-04 at 16:55