Dataset Information

COSINE: non-seeding method for mapping long noisy sequences.


ABSTRACT: Third generation sequencing (TGS) technologies are highly promising, but their long, noisy reads are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases from the similarity between the distributions of their short k-mers (k = 3-4) along the sequences. Results on simulated and real data show that COSINE achieves high sensitivity and specificity across a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.
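
The idea of comparing two stretches of sequence through their short k-mer distributions can be illustrated with a minimal sketch. This is not the COSINE implementation: the abstract does not specify the similarity measure, so the sketch assumes cosine similarity between overlapping k-mer count vectors, and the function names `kmer_counts` and `kmer_similarity` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_similarity(a: str, b: str, k: int = 3) -> float:
    """Cosine similarity of the k-mer count vectors of a and b.

    Assumption: cosine similarity stands in for the paper's
    context-similarity measure, which the abstract does not define.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Only k-mers present in both sequences contribute to the dot product.
    dot = sum(ca[m] * cb[m] for m in ca.keys() & cb.keys())
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    # Sequences shorter than k have no k-mers; treat them as dissimilar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    reference = "ACGTACGTTGCAACGT"
    noisy_read = "ACGTACCTTGCAAGGT"  # same stretch with two substitutions
    unrelated = "TTTTTTTTTTTTTTTT"
    print(kmer_similarity(reference, noisy_read, k=3))
    print(kmer_similarity(reference, unrelated, k=3))
```

Because each substitution only disturbs the few k-mers that overlap it, a noisy copy of a stretch keeps most of its k-mer profile, which is the intuition behind using short k-mers (k = 3-4) rather than long exact seeds.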

SUBMITTER: Afshar PT 

PROVIDER: S-EPMC5737678 | biostudies-literature | 2017 Aug

REPOSITORIES: biostudies-literature

Publications

COSINE: non-seeding method for mapping long noisy sequences.

Afshar, Pegah Tootoonchi; Wong, Wing Hung

Nucleic Acids Research, 2017-08-01, issue 14


Similar Datasets

| S-EPMC4937194 | biostudies-literature
| S-EPMC7879691 | biostudies-literature
| S-EPMC3256161 | biostudies-literature
| S-EPMC11320709 | biostudies-literature
| S-EPMC10959152 | biostudies-literature
| S-EPMC3138081 | biostudies-literature
| S-EPMC10510034 | biostudies-literature
| S-EPMC6547545 | biostudies-literature
| S-EPMC8106442 | biostudies-literature
| S-EPMC9117619 | biostudies-literature