
Dataset Information


An optimistic protein assembly from sequence reads salvaged an uncharacterized segment of mouse picobirnavirus.


ABSTRACT: Advances in Next Generation Sequencing technologies have enabled the generation of millions of sequences from microorganisms. However, distinguishing the sequence of a novel species from sequencing errors remains a technical challenge when the novel species is highly divergent from the closest known species. To solve such a problem, we developed a new method called Optimistic Protein Assembly from Reads (OPAR). This method is based on the assumption that protein sequences could be more conserved than the nucleotide sequences encoding them. By taking advantage of metagenomics, bioinformatics and conventional Sanger sequencing, our method successfully identified all coding regions of the mouse picobirnavirus for the first time. The salvaged sequences indicated that segment 1 of this virus was more divergent from its homologues in other Picobirnaviridae species than segment 2. For this reason, only segment 2 of mouse picobirnavirus has been detected in previous studies. OPAR web tool is available at http://bioinformatics.czc.hokudai.ac.jp/opar/.
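The assumption stated in the abstract, that protein sequences can be more conserved than the nucleotide sequences encoding them, can be illustrated with a small sketch. This is not the OPAR implementation, which the abstract does not describe in detail. The two nucleotide sequences below are invented toy examples, and the helper names `translate` and `identity` are hypothetical. The only external fact used is the standard genetic code.

```python
# Toy illustration only: NOT the OPAR algorithm. Sequences are invented.
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA string in reading frame 0, ignoring a trailing partial codon."""
    usable = len(nucleotides) - len(nucleotides) % 3
    return "".join(CODON_TABLE[nucleotides[i:i + 3]] for i in range(0, usable, 3))


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical reference and a divergent variant that differs only by
# synonymous codon substitutions.
reference = "ATGGCTCTGAGCCGT"
variant = "ATGGCACTTTCTAGA"

nucleotide_identity = identity(reference, variant)
protein_identity = identity(translate(reference), translate(variant))

print(f"nucleotide identity: {nucleotide_identity:.2f}")
print(f"protein identity:    {protein_identity:.2f}")
```

In this toy case the two sequences disagree at many nucleotide positions yet translate to the same peptide, which is the kind of signal a protein-level comparison could retain when a nucleotide-level comparison against known species would not.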

SUBMITTER: Gonzalez G 

PROVIDER: S-EPMC5223137 | biostudies-literature | 2017 Jan

REPOSITORIES: biostudies-literature


Publications

An optimistic protein assembly from sequence reads salvaged an uncharacterized segment of mouse picobirnavirus.

Gabriel Gonzalez, Michihito Sasaki, Lucy Burkitt-Gray, Tomonori Kamiya, Noriko M. Tsuji, Hirofumi Sawa, Kimihito Ito

Scientific Reports, 2017-01-10



Similar Datasets

| S-EPMC3092772 | biostudies-literature
| S-EPMC3372223 | biostudies-literature
| S-EPMC2848820 | biostudies-literature
| S-EPMC3592409 | biostudies-literature
2024-02-08 | PXD049451 | iProX
| S-EPMC3728768 | biostudies-literature
| S-EPMC4498113 | biostudies-literature
| S-EPMC6052005 | biostudies-literature
| S-EPMC6316005 | biostudies-literature
| S-EPMC4899208 | biostudies-literature