
Dataset Information


A quick guide for student-driven community genome annotation.


ABSTRACT: High-quality gene models are necessary to expand the molecular and genetic tools available for a target organism, but these are available for only a handful of model organisms that have undergone extensive curation and experimental validation over the course of many years. The majority of gene models present in biological databases today have been identified in draft genome assemblies using automated annotation pipelines that are frequently based on orthologs from distantly related model organisms and usually have minor or major errors. Manual curation is time-consuming and often requires substantial expertise, but is instrumental in improving gene model structure and identification. Manual annotation may seem to be a daunting and cost-prohibitive task for small research communities, but involving undergraduates in community genome annotation consortiums can be mutually beneficial for both education and improved genomic resources. We outline a workflow for efficient manual annotation driven by a team of primarily undergraduate annotators. This model can be scaled to large teams and includes quality control processes through incremental evaluation. Moreover, it gives students an opportunity to increase their understanding of genome biology and to participate in scientific research in collaboration with peers and senior researchers at multiple institutions.

SUBMITTER: Hosmani PS 

PROVIDER: S-EPMC6447164 | biostudies-literature | 2019 Apr

REPOSITORIES: biostudies-literature



Similar Datasets

| S-EPMC6206466 | biostudies-other
| S-EPMC8170692 | biostudies-literature
| S-EPMC2610131 | biostudies-literature
| S-EPMC5033207 | biostudies-literature
| S-EPMC3313959 | biostudies-literature
| S-EPMC7272004 | biostudies-literature
| S-EPMC2151081 | biostudies-other
| S-EPMC10177377 | biostudies-literature
| S-EPMC545604 | biostudies-literature
| S-EPMC8785237 | biostudies-literature