Measuring the level of relatedness between NGS datasets
ABSTRACT: Sequencing technologies are providing increasingly detailed insight into genetic makeup and are paving their way into molecular diagnostics. The field will benefit from rigorous and bias-free measures for the quality of sequence data and for the proper representation of the complexity of the original samples. While current methodologies rely on the availability of a well-characterized reference genome, we propose kMer profiling for alignment-free assessment of the quality, comparability, and complexity of sequencing datasets. We show that kMer detects technical artefacts such as high duplication rates, library chimaeras, and differences in library preparation protocols in whole-genome, whole-exome, and RNA sequencing data. Additionally, it successfully captures the complexity and diversity of microbiomes. Thus, kMer allows for a robust evaluation of the quality and complexity of sequencing data without relying on any prior information and opens the way to more reliable biological reasoning.
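As a rough illustration of what alignment-free k-mer profiling involves, the sketch below counts overlapping k-mers in two sets of reads and compares the resulting profiles. It is not the kMer tool's implementation: the function names `kmer_profile` and `profile_distance` are hypothetical, and the Euclidean distance on relative frequencies is an assumed stand-in, since the abstract does not state which distance measure kMer uses.

```python
from collections import Counter
from math import sqrt


def kmer_profile(reads, k):
    """Count every overlapping k-mer across all reads.

    k-mers containing anything other than A, C, G, or T (for example N)
    are skipped.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def profile_distance(a, b):
    """Euclidean distance between two profiles.

    Each profile is first scaled to relative frequencies so that datasets
    of different sequencing depth remain comparable. This choice of
    distance is an assumption made for illustration only.
    """
    total_a = sum(a.values())
    total_b = sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both profiles must contain at least one k-mer")
    kmers = set(a) | set(b)
    # Counter returns 0 for k-mers absent from a profile.
    return sqrt(sum((a[m] / total_a - b[m] / total_b) ** 2 for m in kmers))


if __name__ == "__main__":
    sample_1 = kmer_profile(["ACGTACGT", "ACGTTTGA"], k=3)
    sample_2 = kmer_profile(["ACGTACGA", "TTGACCGT"], k=3)
    print(profile_distance(sample_1, sample_2))
```

No reference genome is needed at any point: the comparison uses only the reads themselves, which is the property the abstract emphasizes.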
PROVIDER: EGAS00001000600 | EGA
REPOSITORIES: EGA