Dataset Information

Detection of gene annotations and protein-protein interaction associated disorders through transitive relationships between integrated annotations.


ABSTRACT: Increasingly large amounts of heterogeneous and valuable controlled biomolecular annotations are available, but they are far from exhaustive and are scattered across many databases. Several annotation integration and prediction approaches have been proposed, but these issues are still unsolved. We previously created a Genomic and Proteomic Knowledge Base (GPKB) that efficiently integrates many distributed biomolecular annotation and interaction data of several organisms, including 32,956,102 gene annotations, 273,522,470 protein annotations and 277,095 protein-protein interactions (PPIs). By comprehensively leveraging transitive relationships defined by the numerous association data integrated in GPKB, we developed a software procedure that effectively detects and supplements consistent biomolecular annotations not present in the integrated sources. Following a set of defined logic rules, it does so only when the semantic type of the data and of their relationships, as well as the cardinality of the relationships, allow identifying annotations that are compliant with molecular biology. Thanks to the controlled consistency and quality enforced on data integrated in GPKB, and to the procedures used to avoid error propagation during their automatic processing, we could reliably identify many annotations, which we integrated in GPKB. They comprise 3,144 gene to pathway and 21,942 gene to biological function annotations of many organisms, and 1,027 candidate associations between 317 genetic disorders and 782 human PPIs. Overall estimated recall and precision of our approach were 90.56% and 96.61%, respectively. Co-functional evaluation of genes with known function showed high functional similarity between genes with newly detected and known annotation to the same pathway; also considering the newly detected gene functional annotations enhanced this functional similarity, which resembled the one existing between genes known to be annotated to the same pathway.
Strong evidence was also found in the literature for the candidate associations detected between the Cystic fibrosis disorder and the PPIs between the CFTR_HUMAN, DERL1_HUMAN, RNF5_HUMAN, AHSA1_HUMAN and GOPC_HUMAN proteins, and between the CHIP_HUMAN and HSP7C_HUMAN proteins. Although the identified gene annotations and PPI-genetic disorder candidate associations require biological validation, our approach intrinsically provides their in silico evidence based on available data. Public availability within the GPKB (http://www.bioinformatics.deib.polimi.it/GPKB/) of all identified and integrated annotations offers a valuable resource fostering new biomedical-molecular knowledge discoveries.
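
The idea of detecting annotations through transitive relationships, restricted by relationship cardinality, can be illustrated with a minimal Python sketch. This is not the GPKB procedure: the function name `infer_transitive`, the specific cardinality rule (an intermediate entity must map back to exactly one source entity), and the identifiers `GENE_X`, `GENE_Y`, `PROT_SHARED`, `pathway:A` and `pathway:Z` are all assumptions made for illustration only.

```python
from collections import defaultdict


def infer_transitive(a_to_b, b_to_c, known_a_to_c, unambiguous_only=True):
    """Infer (a, c) pairs implied by a->b and b->c that are not already known.

    Assumed rule (illustrative, not the published one): when
    unambiguous_only is True, a b entity linked to more than one a entity
    is skipped, so an annotation is never propagated through an ambiguous
    relationship.
    """
    # Index which source entities each intermediate entity maps back to.
    sources_of_b = defaultdict(set)
    for a, b in a_to_b:
        sources_of_b[b].add(a)

    # Index the targets reachable from each intermediate entity.
    targets_of_b = defaultdict(set)
    for b, c in b_to_c:
        targets_of_b[b].add(c)

    inferred = set()
    for a, b in a_to_b:
        if unambiguous_only and len(sources_of_b[b]) != 1:
            continue  # cardinality check failed: skip to avoid error propagation
        for c in targets_of_b[b]:
            if (a, c) not in known_a_to_c:
                inferred.add((a, c))
    return inferred


# Hypothetical toy data: gene -> protein and protein -> pathway associations.
gene_to_protein = {
    ("CFTR", "CFTR_HUMAN"),
    ("GENE_X", "PROT_SHARED"),
    ("GENE_Y", "PROT_SHARED"),
}
protein_to_pathway = {
    ("CFTR_HUMAN", "pathway:A"),
    ("PROT_SHARED", "pathway:Z"),
}

strict = infer_transitive(gene_to_protein, protein_to_pathway, set())
loose = infer_transitive(gene_to_protein, protein_to_pathway, set(),
                         unambiguous_only=False)
print(sorted(strict))
print(sorted(loose))
```

In this toy setting the strict mode yields only the gene to pathway pair reached through an unambiguous protein, while the loose mode also propagates through the shared protein; the trade-off between the two mirrors the precision-oriented filtering described in the abstract, without claiming to reproduce it.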

SUBMITTER: Masseroli M 

PROVIDER: S-EPMC4460591 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Detection of gene annotations and protein-protein interaction associated disorders through transitive relationships between integrated annotations.

Marco Masseroli, Arif Canakoglu, Massimiliano Quigliatti

BMC Genomics, 2015-06-01


Similar Datasets

| S-EPMC7739483 | biostudies-literature
| S-EPMC1941744 | biostudies-literature
2019-09-09 | GSE133948 | GEO
| S-EPMC1449908 | biostudies-other
| S-EPMC3823071 | biostudies-other
| S-EPMC4466863 | biostudies-literature
| S-EPMC506829 | biostudies-literature
| S-EPMC3380739 | biostudies-literature
| S-EPMC2837030 | biostudies-literature
| S-EPMC4239042 | biostudies-literature