Dataset Information

Inferring special MIRU-VNTR patterns from genomic data

ABSTRACT: Inferring special MIRU-VNTR patterns from genomic data

PROVIDER: PRJEB77424 | ENA

REPOSITORIES: ENA


Similar Datasets

Project description: The transition from MIRU-VNTR-based epidemiology studies in tuberculosis (TB) to genomic epidemiology has transformed how we track transmission. However, short-read sequencing is poor at analyzing repetitive regions such as the MIRU-VNTR loci. This causes a gap between the new genomic data and the large amount of information stored in historical databases. Long-read sequencing could bridge this knowledge gap by allowing analysis of repetitive regions. However, the feasibility of extracting MIRU-VNTRs from long reads and linking them to historical data has not been evaluated. In our study, an in silico arm, consisting of inference of MIRU patterns from long-read sequences (using the MIRUReader program), was compared with an experimental arm, involving standard amplification and fragment sizing. We analyzed overall performance on 39 isolates from South Africa and confirmed reproducibility in a sample enriched with 62 clustered cases from Spain. Finally, we ran 25 consecutive incident cases, demonstrating the feasibility of correctly assigning new clustered/orphan cases by linking data inferred from genomic analysis to MIRU-VNTR databases. Of the 3,024 loci analyzed, only 11 discrepancies (0.36%) were found between the two arms: three attributed to experimental error and eight to misassigned alleles from long-read sequencing. A second round of analysis of these discrepancies resulted in agreement between the experimental and in silico arms in all but one locus. Adjusting the MIRUReader program code allowed us to flag potential in silico misassignments due to suboptimal coverage or unfixed double alleles. Our study indicates that long-read sequencing could help address potential chronological and geographical gaps arising from the transition from molecular to genomic epidemiology of tuberculosis.

Importance: The transition from molecular epidemiology in tuberculosis (TB), based on the analysis of repetitive regions (VNTR-based genotyping), to genomic epidemiology has transformed the precision with which we track transmission. However, short-read sequencing, the most common method for performing genomic analysis, is poor at analyzing repetitive regions. This means that we face a gap between the new genomic data and the large amount of information stored in historical databases, which is also an obstacle to cross-national surveillance involving settings where only molecular data are available. Long-read sequencing could help bridge this knowledge gap by allowing analysis of repetitive regions. Our study demonstrates that MIRU-VNTR patterns can be successfully inferred from long-read sequences, allowing the correct assignment of new cases as clustered/orphan by linking new data extracted from genomic analysis to historical MIRU-VNTR databases. Our data may provide a starting point for bridging the knowledge gap between the molecular and genomic eras in tuberculosis epidemiology.
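The comparison described above (locus-by-locus agreement between the in silico and experimental arms, then assignment of a new case as clustered or orphan against a historical database) can be illustrated with a minimal Python sketch. This is not the study's code and does not call MIRUReader; the function names, the dictionary layout, and the toy repeat counts are all hypothetical. The sketch also assumes that a case counts as clustered only when its pattern matches a database entry at every locus, which the description does not state. The figure of 24 loci per isolate is inferred from the reported totals (39 + 62 + 25 = 126 isolates, and 126 × 24 = 3,024 loci), not quoted from the source.

```python
from typing import Dict, List, Tuple

# A MIRU-VNTR pattern is modelled here as a list of repeat counts, one per locus.
Pattern = List[int]


def count_discrepancies(
    in_silico: Dict[str, Pattern], experimental: Dict[str, Pattern]
) -> Tuple[int, List[Tuple[str, int]]]:
    """Compare two arms locus by locus.

    Returns the number of loci compared and a list of
    (isolate, locus_index) pairs where the two arms disagree.
    """
    total = 0
    mismatches: List[Tuple[str, int]] = []
    for isolate, observed in experimental.items():
        inferred = in_silico[isolate]
        if len(inferred) != len(observed):
            raise ValueError(f"locus count differs for {isolate}")
        for locus, (a, b) in enumerate(zip(inferred, observed)):
            total += 1
            if a != b:
                mismatches.append((isolate, locus))
    return total, mismatches


def discrepancy_rate(n_discrepant: int, n_loci: int) -> float:
    """Percentage of loci on which the two arms disagree."""
    return 100.0 * n_discrepant / n_loci


def assign_case(pattern: Pattern, database: Dict[str, Pattern]) -> Tuple[str, List[str]]:
    """Label a new case 'clustered' if its pattern is identical to at least
    one database entry (assumed rule: exact match at every locus),
    otherwise 'orphan'."""
    matches = sorted(name for name, known in database.items() if known == pattern)
    return ("clustered", matches) if matches else ("orphan", [])


if __name__ == "__main__":
    # Toy data with 4 loci per isolate (hypothetical values).
    experimental = {"iso1": [2, 3, 4, 2], "iso2": [2, 3, 5, 1]}
    in_silico = {"iso1": [2, 3, 4, 2], "iso2": [2, 3, 4, 1]}
    total, mismatches = count_discrepancies(in_silico, experimental)
    print(total, mismatches)

    # Arithmetic reported in the description: 11 discrepancies in 3,024 loci.
    print(round(discrepancy_rate(11, 3024), 2))

    print(assign_case([2, 3, 4, 2], experimental))
```

Exact matching keeps the sketch simple; a surveillance setting that tolerates single-locus variants would need a distance threshold instead.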
