The impact of PCR duplication on RNAseq data generated using NovaSeq 6000, NovaSeq X, AVITI and G4 sequencers
ABSTRACT: Gene expression profiles generated with RNA sequencing can be biased by the amount of input RNA and by the methods used for cDNA library generation. Polymerase chain reaction (PCR) amplification can generate a high proportion of PCR duplicates and introduce bias in transcript counts. In this study, we investigate the impact of input amount and PCR cycle number on the PCR duplication rate and on RNA-seq data quality. We used a range of input amounts (1 ng to 1,000 ng) and assessed the PCR duplication rate using unique molecular identifiers (UMIs). For broader applicability, we sequenced the libraries on four different short-read sequencing platforms: Illumina NovaSeq 6000, Illumina NovaSeq X, Element Biosciences AVITI, and Singular Genomics G4. We highlight the limitations of using input amounts below 125 ng and the advantages of using UMIs for deduplication. We contrast the data obtained from the different sequencers and discuss the benefits and drawbacks of using Illumina library conversion kits.
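As an illustration of how a UMI-based duplication rate can be estimated, the following is a minimal sketch and not the pipeline used in the study. It assumes each aligned read has already been reduced to a tuple of (chromosome, start position, strand, UMI), that UMIs are compared by exact match with no sequencing-error correction, and that the function names and toy reads are hypothetical. Reads sharing both mapping coordinates and UMI are treated as PCR duplicates; ignoring the UMI shows how coordinate-only deduplication can overestimate the rate.

```python
def umi_duplication_rate(reads):
    """Fraction of reads that are PCR duplicates when the UMI is considered.

    reads: iterable of (chrom, pos, strand, umi) tuples (assumed format).
    Reads with identical coordinates AND identical UMI count as one molecule.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    unique_molecules = len(set(reads))
    return 1.0 - unique_molecules / len(reads)


def coordinate_duplication_rate(reads):
    """Fraction of reads flagged as duplicates when the UMI is ignored.

    Reads with identical coordinates count as one molecule, so independent
    molecules that happen to share a start site are also counted as duplicates.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    unique_positions = len({(chrom, pos, strand) for chrom, pos, strand, _ in reads})
    return 1.0 - unique_positions / len(reads)


# Toy input (hypothetical): three reads at one site, two of them sharing a UMI.
reads = [
    ("chr1", 100, "+", "AAC"),
    ("chr1", 100, "+", "AAC"),  # same coordinates and UMI: PCR duplicate
    ("chr1", 100, "+", "GTT"),  # same coordinates, different UMI: distinct molecule
    ("chr2", 50, "-", "AAC"),
]

umi_rate = umi_duplication_rate(reads)
coord_rate = coordinate_duplication_rate(reads)
print(f"UMI-aware duplication rate: {umi_rate:.2f}")
print(f"Coordinate-only duplication rate: {coord_rate:.2f}")
```

In this toy input the UMI-aware rate is lower than the coordinate-only rate because the read with UMI GTT is kept as a distinct molecule. Production tools typically also merge UMIs that differ by a sequencing error, which this sketch does not attempt.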
ORGANISM(S): Homo sapiens
PROVIDER: GSE261432 | GEO | 2024/03/13
REPOSITORIES: GEO