Dataset Information

Confronting the catalytic dark matter encoded by sequenced genomes.


ABSTRACT: The post-genomic era has provided researchers with a deluge of protein sequences. However, a significant fraction of the proteins encoded by sequenced genomes remains without an identified function. Here, we aim at determining how many enzymes of uncertain or unknown function are still present in the Saccharomyces cerevisiae and human proteomes. Using information available in the Swiss-Prot, BRENDA and KEGG databases in combination with a Hidden Markov Model-based method, we estimate that >600 yeast and 2000 human proteins (>30% of their proteins of unknown function) are enzymes whose precise function(s) remain(s) to be determined. This illustrates the impressive scale of the 'unknown enzyme problem'. We extensively review classical biochemical as well as more recent systematic experimental and computational approaches that can be used to support enzyme function discovery research. Finally, we discuss the possible roles of the elusive catalysts in light of recent developments in the fields of enzymology and metabolism as well as the significance of the unknown enzyme problem in the context of metabolic modeling, metabolic engineering and rare disease research.

SUBMITTER: Ellens KW 

PROVIDER: S-EPMC5714238 | biostudies-other | 2017 Nov

REPOSITORIES: biostudies-other

Publications

Confronting the catalytic dark matter encoded by sequenced genomes.

Kenneth W Ellens, Nils Christian, Charandeep Singh, Venkata P Satagopam, Patrick May, Carole L Linster

Nucleic Acids Research, 2017 Nov 1, issue 20



Similar Datasets

| S-EPMC6567227 | biostudies-literature
| S-EPMC2709905 | biostudies-literature
| S-EPMC7767330 | biostudies-literature
| S-EPMC4533152 | biostudies-literature
| PRJEB72101 | ENA
| S-EPMC5799589 | biostudies-other
| S-EPMC2936541 | biostudies-literature
| S-EPMC471550 | biostudies-literature
| S-EPMC2974192 | biostudies-literature
| S-EPMC5395668 | biostudies-literature