Dataset Information

Haplotype and minimum-chimerism consensus determination using short sequence data.

ABSTRACT:

Background

Assembling haplotypes given sequence data derived from a single individual is a well-studied problem, but only recently has haplotype assembly been considered for population-sampled data. We discuss a software tool called Hapler, which is designed specifically for low-diversity, low-coverage data such as ecological samples derived from natural populations. Because such data may contain errors as well as ambiguous haplotype information, we developed methods that increase confidence in these assemblies. Hapler also reconstructs full consensus sequences while minimizing and identifying possible chimeric points.

Results

Experiments on simulated data indicate that Hapler is effective at assembling haplotypes from gene-sized alignments of short reads. Further, in our tests Hapler-generated consensus sequences are less chimeric than the alternative consensus approaches of majority vote and viral quasispecies estimation regardless of error rate, read length, or population haplotype bias.
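
To illustrate why a majority-vote consensus can be chimeric on population-sampled reads, here is a minimal sketch. It is not Hapler's algorithm; the function name, the two haplotypes, and the six reads are hypothetical toy data chosen only to show the effect. Each variant column is decided independently, so the consensus can combine alleles from different haplotypes when sampling is uneven.

```python
from collections import Counter

def majority_vote_consensus(reads, length):
    """Per-column majority vote over aligned reads given as (start, sequence) pairs."""
    columns = [Counter() for _ in range(length)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            columns[start + offset][base] += 1
    # Columns with no coverage are reported as "N".
    return "".join(c.most_common(1)[0][0] if c else "N" for c in columns)

# Hypothetical population with two haplotypes that differ at columns 1 and 4.
haplotypes = {"H1": "ACGTA", "H2": "AGGTC"}

# Hypothetical low-coverage reads: H1 is over-sampled on the left, H2 on the right.
reads = [
    (0, "ACG"), (0, "ACG"), (0, "AGG"),  # two reads from H1, one from H2
    (2, "GTC"), (2, "GTC"), (2, "GTA"),  # two reads from H2, one from H1
]

consensus = majority_vote_consensus(reads, length=5)
print(consensus)                          # ACGTC
print(consensus in haplotypes.values())   # False: the consensus matches neither haplotype
```

In this toy case the consensus takes the H1 allele at column 1 and the H2 allele at column 4, which is the kind of phasing error that a haplotype-aware consensus is meant to reduce and flag.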

Conclusions

The analysis of genetically diverse sequence data is increasingly common, particularly in the field of ecoinformatics, where transcriptome sequencing of natural populations is a cost-effective alternative to genome sequencing. For such studies, it is important to consider and identify haplotype diversity. Hapler provides robust haplotype information and identifies possible phasing errors in consensus sequences, providing valuable information for population studies and downstream usage of resulting assemblies.

SUBMITTER: O'Neil ST

PROVIDER: S-EPMC3394418 | biostudies-literature | 2012 Apr

REPOSITORIES: biostudies-literature

Publications

Haplotype and minimum-chimerism consensus determination using short sequence data.

O'Neil ST, Emrich SJ

BMC Genomics, 2012 Apr 12

PMID: 22537299

Similar Datasets

Protein structure determination using metagenome sequence data. | S-EPMC5493203 | biostudies-literature

Phylogenetic inference from homologous sequence data: minimum topological assumption, strict mutational compatibility consensus tree as the ultimate solution. | S-EPMC1409768 | biostudies-literature

Minimum error correction-based haplotype assembly: Considerations for long read data. | S-EPMC7292361 | biostudies-literature

Family-Based Haplotype Estimation and Allele Dosage Correction for Polyploids Using Short Sequence Reads. | S-EPMC6477055 | biostudies-literature

Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data. | S-EPMC4860591 | biostudies-literature

An improved approach for reconstructing consensus repeats from short sequence reads. | S-EPMC6101065 | biostudies-literature

Minimum and optimal numbers of psychiatric beds: expert consensus using a Delphi process. | S-EPMC8780043 | biostudies-literature

Haplotype Counting for Sensitive Chimerism Testing: Potential for Early Leukemia Relapse Detection. | S-EPMC5707182 | biostudies-literature

Haplotype sorting using human fosmid clone end-sequence pairs. | S-EPMC2593576 | biostudies-literature

Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR. | S-EPMC3543183 | biostudies-literature