ABSTRACT: Background
RNA sequencing (RNA-seq) measures gene expression levels and permits splicing analysis. Many existing aligners can map millions of sequencing reads onto a reference genome. For reads that can be mapped to multiple positions along the reference genome (multireads), these aligners either assign them to a location at random or discard them altogether. Either choice can bias downstream analyses. Meanwhile, challenges remain in aligning reads that span splice junctions. Existing splicing-aware aligners that rely on read counts to identify junction sites are inevitably affected by sequencing depth.
Results
The distance between the aligned positions of paired-end (PE) reads, or between the two parts of a spliced read, depends on the experimental protocol and on gene structure. We propose a new method that uses an empirical geometric-tail (GT) distribution of intron lengths to make a rational choice in multiread selection and splice-site detection, based on the aligned distances from PE and spliced reads.
Conclusions
GT models that combine the sequence similarity from alignment with the probability given by the length distribution can accurately determine the location of both multireads and spliced reads.
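As an illustration only, the idea of weighting candidate locations by a length distribution might be sketched as below. The sketch assumes that a geometric-tail distribution uses empirical frequencies for distances below a cutoff and a geometric decay from the cutoff onward, and that a candidate is scored by adding the log of its alignment similarity to the log of its distance probability. The function names, the cutoff value, the candidate tuple layout, and the scoring rule are all hypothetical and are not taken from the published method.

```python
import math
from collections import Counter


def fit_geometric_tail(lengths, threshold):
    """Return prob(d): empirical below `threshold`, geometric decay from it on.

    Hypothetical helper; the cutoff and the fitting rule are assumptions.
    """
    n = len(lengths)
    counts = Counter(lengths)
    # Empirical head: observed relative frequencies for short distances.
    head = {d: c / n for d, c in counts.items() if d < threshold}
    tail = [d for d in lengths if d >= threshold]
    tail_mass = len(tail) / n
    # Maximum-likelihood geometric parameter for offsets (d - threshold) >= 0.
    mean_offset = sum(d - threshold for d in tail) / len(tail) if tail else 0.0
    p = 1.0 / (1.0 + mean_offset)

    def prob(d):
        if d < threshold:
            return head.get(d, 0.0)
        return tail_mass * p * (1.0 - p) ** (d - threshold)

    return prob


def pick_alignment(candidates, prob, floor=1e-12):
    """Pick the (position, similarity, distance) tuple with the best log-score.

    Assumed score: log(similarity) + log(prob(distance)), floored to avoid log(0).
    """
    def score(candidate):
        _, similarity, distance = candidate
        return math.log(max(similarity, floor)) + math.log(max(prob(distance), floor))

    return max(candidates, key=score)


# Toy distances (made-up numbers, for illustration only).
observed = [80, 90, 90, 100, 500, 1500, 2500]
gt_prob = fit_geometric_tail(observed, threshold=1000)

# Two equally similar candidate locations for one multiread.
candidates = [("chr1:1000", 0.98, 90), ("chr7:5000", 0.98, 40000)]
best = pick_alignment(candidates, gt_prob)
```

With equal sequence similarity, the candidate whose implied distance is more probable under the fitted distribution is preferred, which is the kind of tie-breaking the abstract describes for multireads.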
SUBMITTER: Lou SK
PROVIDER: S-EPMC3226252 | biostudies-literature | 2011
REPOSITORIES: biostudies-literature
Lou Shao-Ke SK, Li Jing-Woei JW, Qin Hao H, Yim Aldrin Kay-Yuen AK, Lo Leung-Yau LY, Ni Bing B, Leung Kwong-Sak KS, Tsui Stephen Kwok-Wing SK, Chan Ting-Fung TF
BMC Bioinformatics, 2011-07-27