
Dataset Information


A revised annotation and comparative analysis of Helicobacter pylori genomes.


ABSTRACT: Huge amounts of genomic information are currently being generated. Therefore, biologists require structured, exhaustive and comparative databases. The PyloriGene database (http://genolist.pasteur.fr/PyloriGene) was developed to respond to these needs, by integrating and connecting the information generated during the sequencing of two distinct strains of Helicobacter pylori. This led to the need for a general annotation consensus, as the physical and functional annotations of the two strains differed significantly in some cases. A revised functional classification system was created to accommodate the existing data and to harmonize the classification of coding sequences (CDS) into functional categories. The annotation of the two complete genomes was revised in the light of new data, allowing us to reduce the percentage of hypothetical proteins from approximately 40 to 33%. This resulted in the reassignment of functions for 108 CDS (approximately 7% of all CDS). Interestingly, the functions of only approximately 13% of CDS (222 out of 1658 CDS) were annotated as a result of work done directly on H. pylori genes. Finally, comparison of the two published genomes revealed a significant amount of size variation between corresponding (orthologous) CDS. Most of these size variations were due to natural polymorphisms, although other sources of variation were identified, such as pseudogenes, new genes potentially regulated by a slipped-strand mispairing mechanism, or frame-shifts. Of these differences, 113 were due to different start codon assignments, a common problem when constructing physical annotations.

SUBMITTER: Boneca IG 

PROVIDER: S-EPMC152854 | biostudies-literature | 2003 Mar

REPOSITORIES: biostudies-literature


Publications

A revised annotation and comparative analysis of Helicobacter pylori genomes.

Boneca IG, de Reuse H, Epinat JC, Pupin M, Labigne A, Moszer I

Nucleic Acids Research, 2003-03-01, issue 6



Similar Datasets

| S-EPMC3091624 | biostudies-literature
| S-EPMC4280198 | biostudies-literature
| S-EPMC4316009 | biostudies-literature
| S-EPMC4860318 | biostudies-literature
| S-EPMC101716 | biostudies-literature
| S-EPMC3695873 | biostudies-literature
| S-EPMC8733867 | biostudies-literature
2021-02-01 | GSE165787 | GEO
2016-08-01 | GSE76321 | GEO
| S-EPMC30209 | biostudies-literature