Dataset Information

Navigating the dynamic landscape of long noncoding RNA and protein-coding gene annotations in GENCODE.


ABSTRACT: Our understanding of the transcriptional potential of the genome and its functional consequences has changed significantly in the last decade. This change has been driven largely by improvements in technology, which have made it possible to annotate, and in many cases functionally characterize, a number of novel gene loci in the human genome. Keeping pace with advancements in this dynamic environment and systematically annotating a compendium of genes and transcripts is a formidable task. Of the many databases that have attempted to systematically annotate the genome, GENCODE has emerged as one of the largest and most popular compendia of human genome annotations. Our analysis of the various versions of GENCODE revealed constant updating of transcripts for both protein-coding genes and long noncoding RNAs (lncRNAs), leading to conflicting annotations. GENCODE version 24 annotates 4.18 % of the human genome as transcribed, an increase of 1.58 % over its first version. Of the 251,614 transcripts annotated across GENCODE versions, only 21.7 % were annotated consistently. We also found that the GENCODE consortium categorized transcripts into 70 biotypes, of which only 17 remained stable throughout. In this report, we review the impact of this dynamicity on gene annotations, specifically lncRNA annotations, in GENCODE over the years. Our analysis suggests significant dynamism in gene annotations, reflecting the evolution of, and consensus on, gene nomenclature. While progressive changes in annotations and timely release of updates make the resource reliable for the community, the dynamicity of each release poses unique challenges to its users. Taking cues from other experiments with bio-curation, we propose potential avenues and methods to bridge the gap.

SUBMITTER: Jalali S 

PROVIDER: S-EPMC5084464 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature

Publications

Navigating the dynamic landscape of long noncoding RNA and protein-coding gene annotations in GENCODE.

Jalali Saakshi S, Gandhi Shrey S, Scaria Vinod V

Human genomics 20161028 1



Similar Datasets

| S-EPMC6735957 | biostudies-literature
| S-EPMC5525255 | biostudies-literature
| S-EPMC3431493 | biostudies-literature
| 2693518 | ecrin-mdr-crc
| S-EPMC4215180 | biostudies-literature
| S-EPMC6879443 | biostudies-literature
| S-EPMC5658400 | biostudies-literature
| S-EPMC7083320 | biostudies-literature
| S-EPMC5668523 | biostudies-literature
| 2330122 | ecrin-mdr-crc