Dataset Information

Drosophila genomic sequence annotation using the BLOCKS+ database.


ABSTRACT: A simple and general homology-based method for gene finding was applied to the 2.9-Mb Drosophila melanogaster Adh region, the target sequence of the Genome Annotation Assessment Project (GASP). Each strand of the entire sequence was used as query of the BLOCKS+ database of conserved regions of proteins. This led to functional assignments for more than one-third of the genes and two-thirds of the transposons. Considering the enormous size of the query, the fact that only two false-positive matches were reported emphasizes the high selectivity of protein family-based methods for gene finding. We used the search results to improve BLOCKS+ by identifying compositionally biased blocks. Our results confirm that protein family databases can be used effectively in automated sequence annotation efforts.

SUBMITTER: Henikoff JG 

PROVIDER: S-EPMC310867 | biostudies-literature | 2000 Apr

REPOSITORIES: biostudies-literature

Publications

Drosophila genomic sequence annotation using the BLOCKS+ database.

Henikoff JG, Henikoff S

Genome Research, 2000 Apr 1, issue 4


Similar Datasets

| S-EPMC102392 | biostudies-literature
| S-EPMC1287883 | biostudies-literature
| S-EPMC151188 | biostudies-literature
| S-EPMC3834799 | biostudies-other
| S-EPMC2808907 | biostudies-literature
| S-EPMC5325239 | biostudies-literature
| S-EPMC329128 | biostudies-literature
| S-EPMC6323909 | biostudies-other
| S-EPMC2873954 | biostudies-literature
| S-EPMC5753378 | biostudies-literature