Dataset Information

Tracking and coordinating an international curation effort for the CCDS Project.

ABSTRACT: The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a 'gold standard' definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines.

DATABASE URL: http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi

SUBMITTER: Harte RA

PROVIDER: S-EPMC3308164 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

Tracking and coordinating an international curation effort for the CCDS Project.

Rachel A Harte, Catherine M Farrell, Jane E Loveland, Marie-Marthe Suner, Laurens Wilming, Bronwen Aken, Daniel Barrell, Adam Frankish, Craig Wallin, Steve Searle, Mark Diekhans, Jennifer Harrow, Kim D Pruitt

Database: the journal of biological databases and curation, 2012-03-20

PMID: 22434842

Similar Datasets

Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation. | S-EPMC5753299 | biostudies-literature

How to get the most out of your curation effort. | S-EPMC2678295 | biostudies-literature

Curation of an international drug proprietary names dataset. | S-EPMC8703051 | biostudies-literature

The Gene Curation Coalition: A global effort to harmonize gene-disease evidence resources. | S-EPMC7613247 | biostudies-literature

Genetics. Was the Human Genome Project worth the effort? | S-EPMC2582021 | biostudies-literature

Accessible data curation and analytics for international-scale citizen science datasets. | S-EPMC8608807 | biostudies-literature

Participation of a coordinating center pharmacy in a multicenter international study. | S-EPMC6188656 | biostudies-literature

Transcription profiling of CEPH individuals from the International HapMap Project | 2007-06-22 | E-GEOD-2552 | biostudies-arrayexpress

The MIntAct project--IntAct as a common curation platform for 11 molecular interaction databases. | S-EPMC3965093 | biostudies-literature