Project description: Streptococcus pneumoniae (pneumococcus) is a major human respiratory pathogen and the leading cause of bacterial pneumonia worldwide. Small regulatory RNAs (sRNAs), which often act by post-transcriptionally regulating gene expression, have been shown to be crucial for the virulence of S. pneumoniae and other bacterial pathogens. Over 170 putative sRNAs have been identified in S. pneumoniae strain TIGR4 (serotype 4) through transcriptomic studies, and a subset of these sRNAs has been further implicated in regulating pneumococcal pathogenesis. However, there was little overlap in the sRNAs identified among these studies, which indicated that the approaches used for sRNA identification were not sufficiently sensitive and robust and that there were likely many more undiscovered sRNAs encoded in the S. pneumoniae genome. Here, we sought to comprehensively identify sRNAs in Avery's virulent S. pneumoniae strain D39 using two independent RNA-seq-based approaches. We developed an unbiased method for identifying novel sRNAs from bacterial RNA-seq data and further tested the specificity of our analysis program in identifying sRNAs encoded by both strains D39 and TIGR4. Interestingly, the genes for 15% of the putative sRNAs identified in strain TIGR4, including ones previously implicated in virulence, were not present in the strain D39 genome, suggesting that differences in sRNA repertoires between these two serotypes may contribute to their strain-specific virulence properties. Finally, this study identified 67 new sRNA candidates in strain D39, 28 of which have been further validated, raising the total number of sRNAs identified in strain D39 to 112.