ABSTRACT:
Summary: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that are repetitive or hard to sequence prevent the accumulation of larger scaffolds, and create errors such as misorientations and mislocalizations. Thus, complementary methods for determining the orientation and positioning of fragments are important for finishing assemblies. Strand-seq is a method for determining template strand inheritance in single cells, information that can be used to determine relative genomic distance and orientation between scaffolds, and find errors within them. We present contiBAIT, an R/Bioconductor package which uses Strand-seq data to repair and improve existing assemblies.
Availability and implementation: contiBAIT is available on Bioconductor. Source files available from GitHub.
Contact: koneill@bcgsc.ca or mark.hills@stemcell.com.
Supplementary information: Supplementary data are available at Bioinformatics online.
SUBMITTER: O'Neill K
PROVIDER: S-EPMC5860061 | biostudies-literature | 2017 Sep
REPOSITORIES: biostudies-literature
O'Neill Kieran, Hills Mark, Gottlieb Mike, Borkowski Matthew, Karsan Aly, Lansdorp Peter M
Bioinformatics (Oxford, England), 2017-09-01, issue 17