
Dataset Information


The y-ome defines the 35% of Escherichia coli genes that lack experimental evidence of function.


ABSTRACT: Experimental studies of Escherichia coli K-12 MG1655 often implicate poorly annotated genes in cellular phenotypes. However, we lack a systematic understanding of these genes. How many are there? What information is available for them? And what features do they share that could explain the gap in our understanding? Efforts to build predictive, whole-cell models of E. coli inevitably face this knowledge gap. We approached these questions systematically by assembling annotations from the knowledge bases EcoCyc, EcoGene, UniProt and RegulonDB. We identified the genes that lack experimental evidence of function (the 'y-ome') which include 1600 of 4623 unique genes (34.6%), of which 111 have absolutely no evidence of function. An additional 220 genes (4.7%) are pseudogenes or phantom genes. y-ome genes tend to have lower expression levels and are enriched in the termination region of the E. coli chromosome. Where evidence is available for y-ome genes, it most often points to them being membrane proteins and transporters. We resolve the misconception that a gene in E. coli whose primary name starts with 'y' is unannotated, and we discuss the value of the y-ome for systematic improvement of E. coli knowledge bases and its extension to other organisms.

SUBMITTER: Ghatak S 

PROVIDER: S-EPMC6412132 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature


Publications

The y-ome defines the 35% of Escherichia coli genes that lack experimental evidence of function.

Ghatak S, King ZA, Sastry A, Palsson BO

Nucleic Acids Research, 2019 Mar 1, issue 5



Similar Datasets

| S-EPMC135234 | biostudies-literature
| S-EPMC3811779 | biostudies-literature
| S-EPMC93575 | biostudies-literature
| S-EPMC6928379 | biostudies-literature
| S-EPMC4059744 | biostudies-literature
| S-EPMC9380521 | biostudies-literature
| S-EPMC8253772 | biostudies-literature
| PRJNA1180887 | ENA
| S-EPMC212566 | biostudies-other
| S-EPMC1950902 | biostudies-literature