
Dataset Information


The global landscape of sequence diversity.


ABSTRACT:

Background

Systematic comparisons between genomic sequence datasets have revealed a wide spectrum of sequence specificity, from sequences that are highly conserved to those that are specific to individual species. Owing to the limited number of fully sequenced eukaryotic genomes, analyses of this spectrum have largely focused on prokaryotes. Combining existing genomic datasets with the partial genomes of 193 eukaryotes derived from collections of expressed sequence tags, we performed a quantitative analysis of the sequence specificity spectrum to provide a global view of the origins and extent of sequence diversity across the three domains of life.

Results

Comparisons with prokaryotic datasets reveal a greater genetic diversity within eukaryotes that may be related to differences in modes of genetic inheritance. Mapping this diversity within a phylogenetic framework revealed that the majority of sequences are either highly conserved or specific to the species or taxon from which they derive. Between these two extremes, several evolutionary landmarks consisting of large numbers of sequences conserved within specific taxonomic groups were identified. For example, 8% of sequences derived from metazoan species are specific to, and conserved within, the metazoan lineage. Many of these sequences likely mediate metazoan-specific functions, such as cell-cell communication and differentiation.

Conclusion

Through the use of partial genome datasets, this study provides a unique perspective on sequence conservation across the three domains of life. The provision of taxon-restricted sequences should prove valuable for future computational and biochemical analyses aimed at understanding evolutionary and functional relationships.

SUBMITTER: Peregrin-Alvarez JM 

PROVIDER: S-EPMC2258180 | biostudies-literature | 2007

REPOSITORIES: biostudies-literature


Publications

The global landscape of sequence diversity.

Peregrín-Alvarez, José Manuel; Parkinson, John

Genome Biology | 2007-01-01 | 11



Similar Datasets

| S-EPMC10079069 | biostudies-literature
| S-EPMC5761093 | biostudies-literature
2023-02-01 | PXD035592 | Pride
2023-02-01 | PXD035670 | Pride
2023-02-01 | PXD035776 | Pride
2023-02-01 | PXD035536 | Pride
2023-02-01 | PXD035431 | Pride
| S-EPMC3119061 | biostudies-other
| S-EPMC9879099 | biostudies-literature
| S-EPMC10199016 | biostudies-literature