
Dataset Information


Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts.


ABSTRACT: It is a challenge to classify transcripts as protein-coding or non-coding, especially those reconstructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, Coding-Non-Coding Index (CNCI), which profiles adjoining nucleotide triplets to effectively distinguish protein-coding and non-coding sequences independent of known annotations. CNCI is effective for classifying incomplete transcripts and sense-antisense pairs. The implementation of CNCI offered highly accurate classification of transcripts assembled from whole-transcriptome sequencing data in a cross-species manner, which demonstrated gene evolutionary divergence between vertebrates and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci.
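As a rough illustration of what "profiling adjoining nucleotide triplets" could mean in practice, the sketch below counts pairs of adjacent, non-overlapping triplets in one reading frame of a sequence. It is not the CNCI implementation: the function names, the `frame` parameter, and the choice to normalize counts into frequencies are assumptions made for this example, and the scoring and classification steps of the actual tool are not shown.

```python
from collections import Counter


def adjoining_triplet_counts(seq, frame=0):
    """Count pairs of adjacent, non-overlapping triplets in one reading frame.

    Hypothetical helper for illustration only; not part of the CNCI software.
    """
    # Normalize case and treat RNA input (U) as DNA (T).
    seq = seq.upper().replace("U", "T")
    # Split into complete triplets starting at the chosen frame offset.
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Pair each triplet with the one immediately following it.
    return Counter(zip(triplets, triplets[1:]))


def adjoining_triplet_frequencies(seq, frame=0):
    """Convert the pair counts into relative frequencies (empty dict if no pairs)."""
    counts = adjoining_triplet_counts(seq, frame)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    # Toy sequence chosen for the example; not taken from the dataset.
    for pair, freq in adjoining_triplet_frequencies("ATGGCCATGGCC").items():
        print(pair, round(freq, 3))
```

Profiles like these, computed separately for known coding and non-coding sequences, are the kind of annotation-independent signal a composition-based classifier could compare; how CNCI itself weights and scores them is described in the publication, not here.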

SUBMITTER: Sun L 

PROVIDER: S-EPMC3783192 | biostudies-literature | 2013 Sep

REPOSITORIES: biostudies-literature


Publications

Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts.

Sun Liang, Luo Haitao, Bu Dechao, Zhao Guoguang, Yu Kuntao, Zhang Changhai, Liu Yuanning, Chen Runsheng, Zhao Yi

Nucleic Acids Research, 2013-07-27, issue 17



Similar Datasets

| S-EPMC6602462 | biostudies-literature
| S-EPMC6954391 | biostudies-literature
| S-EPMC5648457 | biostudies-literature
| S-EPMC5223071 | biostudies-literature
| S-EPMC7432689 | biostudies-literature
| S-EPMC4882039 | biostudies-literature
| S-EPMC3951366 | biostudies-literature
| S-EPMC5333057 | biostudies-literature
| S-EPMC5364679 | biostudies-literature
2020-01-09 | PXD014553 | Pride