
Dataset Information


PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets.


ABSTRACT: Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/.

SUBMITTER: Hong C 

PROVIDER: S-EPMC4429651 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets.

Hong, Changjin; Manimaran, Solaiappan; Johnson, William Evan

Cancer Informatics, 2014-01-01, Suppl 1



Similar Datasets

| S-EPMC5788068 | biostudies-literature
| S-EPMC2532726 | biostudies-literature
| S-EPMC3458526 | biostudies-other
| S-EPMC7144081 | biostudies-literature
| S-EPMC4708105 | biostudies-literature
| S-EPMC7356586 | biostudies-literature
| S-EPMC3558281 | biostudies-literature
| S-EPMC5017645 | biostudies-literature
| S-EPMC4666380 | biostudies-literature
| S-EPMC6440105 | biostudies-other