
Dataset Information


Accelerating the alignment processing speed of the comprehensive end-to-end whole-genome bisulfite sequencing pipeline, wg-blimp.


ABSTRACT: Analyzing whole-genome bisulfite and related sequencing datasets is time-intensive, both because the raw input sequencing files are large and complex and because the lengthy read alignment step must correct for the genome-wide conversion of all unmethylated Cs to Ts. The objective of this study was to modify the read alignment algorithm of the whole-genome bisulfite sequencing methylation analysis pipeline (wg-blimp) to shorten the time required to complete this phase while retaining overall read alignment accuracy. Here, we report an update to the recently published wg-blimp pipeline, achieved by replacing the bwa-meth aligner with the faster gemBS aligner. This change yields a more than 7-fold acceleration in sample processing speed when scaled to larger publicly available FASTQ datasets containing 80-160 million reads, while maintaining nearly identical accuracy of properly mapped reads compared with the previous pipeline. The modified pipeline merges the speed and accuracy of the gemBS aligner with the comprehensive analysis and data visualization assets of wg-blimp, providing a significantly accelerated workflow that produces high-quality data much more rapidly without compromising read accuracy. This gain comes at the expense of increased RAM requirements of up to 48 GB.
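The correction for C-to-T conversion mentioned in the abstract can be illustrated with a small sketch. Bisulfite-aware aligners commonly handle it by converting both the reads and the reference in silico, so that converted and unconverted cytosines no longer cause mismatches, and then recovering methylation state from the original read. The Python below is only a toy illustration of that idea under simplifying assumptions (ungapped, forward-strand comparison of equal-length strings); it is not the code of bwa-meth, gemBS, or wg-blimp, and the function names `bisulfite_convert` and `methylation_calls` are hypothetical.

```python
def bisulfite_convert(seq: str, strand: str = "forward") -> str:
    """Fully convert a sequence in silico.

    forward: every C becomes T; reverse: every G becomes A.
    """
    seq = seq.upper()
    if strand == "forward":
        return seq.replace("C", "T")
    if strand == "reverse":
        return seq.replace("G", "A")
    raise ValueError("strand must be 'forward' or 'reverse'")


def methylation_calls(read: str, reference: str) -> list[tuple[int, str]]:
    """Call methylation at reference C positions (toy, ungapped comparison).

    A C retained in the read is called methylated; a T at a reference C
    is called unmethylated. Other bases at that position are ignored.
    """
    calls = []
    for pos, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls.append((pos, "methylated"))
        elif read_base == "T":
            calls.append((pos, "unmethylated"))
    return calls


# Made-up example sequences, for illustration only.
reference = "ACGTCCGA"
read = "ACGTTTGA"  # bisulfite-treated read: Cs at positions 4 and 5 converted

# After in silico conversion, read and reference agree exactly.
print(bisulfite_convert(read) == bisulfite_convert(reference))
print(methylation_calls(read, reference))
```

The point of the sketch is that matching is done in a reduced three-letter alphabet, which is part of why this alignment step is more expensive than ordinary short-read alignment.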

SUBMITTER: Lehle JD 

PROVIDER: S-EPMC10329742 | biostudies-literature | 2023

REPOSITORIES: biostudies-literature


Publications

Accelerating the alignment processing speed of the comprehensive end-to-end whole-genome bisulfite sequencing pipeline, wg-blimp.

Jake D. Lehle, John R. McCarrey

Biology Methods & Protocols, 2023-06-27, 1



Similar Datasets

S-EPMC7195798 | biostudies-literature
S-EPMC11576351 | biostudies-literature
S-EPMC4769831 | biostudies-literature
S-EPMC5883884 | biostudies-literature
S-EPMC8425420 | biostudies-literature
S-EPMC7359584 | biostudies-literature
S-EPMC4234473 | biostudies-literature
S-EPMC9014879 | biostudies-literature
S-EPMC4488126 | biostudies-literature
S-EPMC6061805 | biostudies-literature