Impact of sequencing depth and read length on single cell RNA sequencing data of T cells.
ABSTRACT: Single cell RNA sequencing (scRNA-seq) offers great potential for measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq has allowed the characterisation of transcript sequence diversity of functionally relevant T cell subsets, and the identification of the full-length T cell receptor (TCRαβ), which defines the specificity for cognate antigens. Several factors, e.g. RNA library capture, cell quality, and sequencing output, affect the quality of scRNA-seq data. We studied the effects of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 single cells from 8 publicly available scRNA-seq datasets, and simulation-based analyses. With short read lengths (<50 bp), gene expression profiles showed an increased number of unique genes identified, but also higher technical variability than profiles from longer reads. Successful TCRαβ reconstruction was achieved for 6 datasets (81%-100%) with at least 0.25 million paired-end (PE) reads of length >50 bp, while it failed for datasets with <30 bp reads. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.
SUBMITTER: Rizzetto S
PROVIDER: S-EPMC5630586 | biostudies-literature | 2017 Oct
REPOSITORIES: biostudies-literature