Project description:The typical mitochondrial (mt) genomes of bilateral animals consist of 37 genes on a single circular chromosome. The mt genomes of the human body louse, Pediculus humanus, and the human head louse, Pediculus capitis, however, are extensively fragmented and contain 20 minichromosomes, with one to three genes on each minichromosome. Heteroplasmy, i.e. nucleotide polymorphism in the mt genome within individuals, has been shown to be significantly higher in the mt cox1 gene of human lice than in humans and other animals that have typical mt genomes. To understand whether the extent of heteroplasmy in human lice is associated with mt genome fragmentation, we sequenced the entire coding regions of all of the mt minichromosomes of six human body lice and six human head lice from Ethiopia, China and France with an Illumina HiSeq platform. For comparison, we also sequenced the entire coding regions of the mt genomes of seven species of ticks, which have the typical mitochondrial genome organization of bilateral animals. We found that the level of heteroplasmy varies significantly both among the human lice and among the ticks. The human lice from Ethiopia have a significantly higher level of heteroplasmy than those from China and France (Pt<0.05). The tick Amblyomma cajennense has a significantly higher level of heteroplasmy than the other ticks (Pt<0.05). Our results indicate that the level of heteroplasmy can vary substantially within a species and among closely related species, and does not appear to be determined by a single factor such as genome fragmentation.
Project description:Heteroplasmy, the existence of multiple mtDNA types within an individual, has been previously detected by using mostly indirect methods and focusing largely on just the hypervariable segments of the control region. Next-generation sequencing technologies should enable studies of heteroplasmy across the entire mtDNA genome at much higher resolution, because many independent reads are generated for each position. However, the higher error rate associated with these technologies must be taken into consideration to avoid false detection of heteroplasmy. We used simulations and phiX174 sequence data to design criteria for accurate detection of heteroplasmy with the Illumina Genome Analyzer platform, and we used artificial mixtures and replicate data to test and refine the criteria. We then applied these criteria to mtDNA sequence reads for 131 individuals from five Eurasian populations that had been generated via a parallel tagged approach. We identified 37 heteroplasmies at 10% frequency or higher at 34 sites in 32 individuals. The mutational spectrum does not differ between heteroplasmic mutations and polymorphisms in the same individuals, but the relative mutation rate at heteroplasmic mutations is significantly higher than that estimated for all mutable sites in the human mtDNA genome. Moreover, there is also a significant excess of nonsynonymous mutations observed among heteroplasmies, compared to polymorphism data from the same individuals. Both mutation-drift and negative selection influence the fate of heteroplasmies to determine the polymorphism spectrum in humans. With appropriate criteria for avoiding false positives due to sequencing errors, next-generation technologies can provide novel insights into genome-wide aspects of mtDNA heteroplasmy.
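The abstract above reports heteroplasmies at a minor allele frequency of 10% or higher. The following is a minimal sketch of that kind of per-site frequency filter, not the authors' published criteria: the function names, the per-base count dictionary, and the `min_depth` value of 100 reads are assumptions made for illustration, and the study's additional safeguards against sequencing error are not reproduced here.

```python
from typing import Dict, Optional, Tuple


def minor_allele_frequency(counts: Dict[str, int]) -> Tuple[Optional[str], float]:
    """Return the second most common base at a site and its share of all reads."""
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return None, 0.0
    # Rank bases by read count, most common first.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    base, n = ranked[1]
    return base, n / total


def is_heteroplasmic(counts: Dict[str, int],
                     min_maf: float = 0.10,   # 10% threshold, as in the abstract
                     min_depth: int = 100) -> bool:  # assumed depth cutoff
    """Flag a site whose minor allele reaches min_maf at sufficient depth."""
    if sum(counts.values()) < min_depth:
        return False
    _, maf = minor_allele_frequency(counts)
    return maf >= min_maf


if __name__ == "__main__":
    site = {"A": 850, "G": 150, "C": 0, "T": 0}
    print(minor_allele_frequency(site), is_heteroplasmic(site))
```

A depth cutoff is included because a frequency estimated from few reads is unreliable; the appropriate value depends on the platform's error rate and should be calibrated, for example on control data as the abstract describes.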
Project description:We describe methods for rapid sequencing of the entire human mitochondrial genome (mtgenome), which involve long-range PCR for specific amplification of the mtgenome, pyrosequencing, quantitative mapping of sequence reads to identify sequence variants and heteroplasmy, as well as de novo sequence assembly. These methods have been used to study 40 publicly available HapMap samples of European (CEU) and African (YRI) ancestry to demonstrate a sequencing error rate of <5.63×10^-4, nucleotide diversity of 1.6×10^-3 for CEU and 3.7×10^-3 for YRI, patterns of sequence variation consistent with earlier studies, but a higher rate of heteroplasmy varying between 10% and 50%. These results demonstrate that next-generation sequencing technologies allow interrogation of the mitochondrial genome in greater depth than previously possible, which may be of value in biology and medicine.
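The nucleotide diversity figures quoted above are, in the standard definition, the average number of pairwise differences per site among a set of aligned sequences. A minimal sketch of that calculation follows; the function name and the toy sequences are illustrative assumptions, and the sketch ignores gaps and ambiguous bases, which a real analysis would need to handle.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Average pairwise differences per site for equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diff = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diff / (len(pairs) * length)


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "ACGT"]  # toy alignment, not real data
    print(nucleotide_diversity(toy))
```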
Project description:Mitochondrial heteroplasmy, which fundamentally means intracellular heterogeneity of mitochondrial DNA (mtDNA), has usually been measured in a group of cells, without regard to intercellular heterogeneity. Conventional methods for measuring mitochondrial heteroplasmy cannot discriminate between an intercellularly homogeneous population, composed of cells with similar intracellular mtDNA heterogeneity, and an intercellularly heterogeneous population, composed of cells with different proportions of mutated mtDNA. In this study, a high-throughput method to determine mitochondrial heteroplasmy in a single cell was developed by using droplet digital PCR with TaqMan polymerase. This technique revealed three different cell populations in cultured fibroblasts derived from patients with mitochondrial disease carrying a mutation in the mtDNA: cells homoplasmic for mutated mtDNA, cells homoplasmic for healthy mtDNA, and cells containing a mixture of mutated and healthy mtDNA. The presence of intercellular heterogeneity, even in uniformly cultured fibroblasts, suggests that heterogeneity should also exist among different kinds of cells. The diagnosis of intercellular heterogeneity with respect to mitochondrial heteroplasmy by this methodology could provide novel insight into developing a treatment strategy for mitochondrial diseases.
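The three cell populations described above can be expressed as a simple per-cell classification. The sketch below assumes that each cell yields a count of mutant and healthy mtDNA copies; the function name, the labels, and the `tol` parameter (a tolerance for assay noise, set to zero by default) are hypothetical and are not taken from the study.

```python
def classify_cell(mutant_copies: int, healthy_copies: int, tol: float = 0.0) -> str:
    """Assign a cell to one of three populations from its mtDNA copy counts."""
    total = mutant_copies + healthy_copies
    if total == 0:
        return "no_signal"
    mutant_fraction = mutant_copies / total
    if mutant_fraction <= tol:
        return "homoplasmic_healthy"
    if mutant_fraction >= 1.0 - tol:
        return "homoplasmic_mutant"
    return "heteroplasmic"


if __name__ == "__main__":
    # Toy counts for three cells, not measured data.
    for counts in [(0, 120), (95, 0), (40, 60)]:
        print(counts, classify_cell(*counts))
```

A bulk measurement would report only the pooled mutant fraction across these cells, which is why it cannot separate a mixture of homoplasmic cells from a population of uniformly heteroplasmic ones.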
Project description:Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.
Project description:BACKGROUND Although mutations and dysfunction of mitochondrial DNA (mtDNA) are related to a variety of diseases, few studies have focused on the relationship between mtDNA and coronary artery disease (CAD), especially the relationship between rare variants and CAD. MATERIAL AND METHODS Two-stage high-throughput sequencing was performed to detect mtDNA variants or heteroplasmy and their relationship with CAD phenotypes. In the discovery stage, mtDNA was analyzed by high-throughput sequencing of long-range PCR products generated from the peripheral blood of 85 CAD patients and 80 demographically matched controls. In the validation stage, high-throughput sequencing of mtDNA target regions captured by GenCap Kit was performed on 100 CAD samples and 100 controls. Finally, tRNA fine mapping was performed between our study and the reported Chinese CAD study. RESULTS Among the tRNA genes, we confirmed a highly conserved rare variant, A5592G, previously reported in the Chinese CAD study, and found 2 novel rare mutations that reached Bonferroni-corrected significance in the combined analysis (P=7.39×10^-4 for T5628C in tRNAAla and P=1.01×10^-5 for T681C in 12S rRNA). Both of them were predicted to be pathological, with T5628C disrupting an extremely conserved base pair in the AC stem of tRNAAla. Furthermore, we confirmed the previously controversial finding that the number of non-synonymous heteroplasmic sites per sample was significantly higher in CAD patients. CONCLUSIONS Our study confirmed the contribution of rare variants to CAD and showed that CAD patients had more non-synonymous heteroplasmic mutations, which may be helpful in identifying the genetic and molecular basis of CAD.
Project description:We show that existing RNA-seq, DNase-seq, and ChIP-seq data exhibit overdispersed per-base read count distributions that are not matched to existing computational method assumptions. To compensate for this overdispersion we introduce a nonparametric and universal method for processing per-base sequencing read count data called FIXSEQ. We demonstrate that FIXSEQ substantially improves the performance of existing RNA-seq, DNase-seq, and ChIP-seq analysis tools when compared with existing alternatives.
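Overdispersion in read counts is commonly checked against the Poisson model, for which the variance equals the mean. The sketch below computes the variance-to-mean ratio (index of dispersion) for a list of per-base counts; values well above 1 indicate overdispersion. It is only a diagnostic written for illustration with toy counts and is not the FIXSEQ method, whose details the abstract does not give.

```python
from statistics import mean, variance
from typing import Sequence


def dispersion_index(counts: Sequence[int]) -> float:
    """Sample variance divided by mean; approximately 1 for Poisson counts."""
    if len(counts) < 2:
        raise ValueError("need at least two positions")
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    return variance(counts) / m


if __name__ == "__main__":
    # Toy per-base counts: one position carries a pile-up of reads.
    print(dispersion_index([0, 0, 0, 0, 20]))
```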
Project description:Motivation: The increasing availability of mitochondria-targeted and off-target sequencing data in whole-exome and whole-genome sequencing studies (WXS and WGS) has raised the demand for effective pipelines to accurately measure heteroplasmy and to easily recognize the most functionally important mitochondrial variants among a huge number of candidates. To this purpose, we developed MToolBox, a highly automated pipeline to reconstruct and analyze human mitochondrial DNA from high-throughput sequencing data. Results: MToolBox implements an effective computational strategy for mitochondrial genome assembly and haplogroup assignment, also including a prioritization analysis of detected variants. MToolBox provides a Variant Call Format file featuring, for the first time, allele-specific heteroplasmy and annotation files with prioritized variants. MToolBox was tested on simulated samples and applied to 1000 Genomes WXS datasets. Availability and implementation: The MToolBox package is available at https://sourceforge.net/projects/mtoolbox/.
Project description:Background: Mitochondrial dysfunction is linked to numerous pathological states, in particular those related to metabolism, brain health and ageing. Nuclear-encoded gene polymorphisms implicated in mitochondrial functions can be analyzed in the context of classical genome-wide association studies. By contrast, mitochondrial DNA (mtDNA) variants are more challenging to identify and analyze for several reasons. First, contrary to the diploid nuclear genome, each cell carries several hundred copies of the circular mitochondrial genome. Mutations can therefore be present in only a subset of the mtDNA molecules, resulting in a heterogeneous pool of mtDNA, a situation referred to as heteroplasmy. Consequently, detection and quantification of variants requires extremely accurate tools, especially when this proportion is small. Additionally, the mitochondrial genome has pseudogenized into numerous copies within the nuclear genome over the course of evolution. These nuclear pseudogenes, named NUMTs, must be distinguished from genuine mtDNA sequences and excluded from the analysis. Results: Here we describe a novel method, named MitoRS, in which the entire mitochondrial genome is amplified in a single reaction using rolling circle amplification. This approach is easier to set up and of higher throughput than classical PCR amplification. Sequencing libraries are generated at high throughput using a tagmentation-based method. Fine-tuned parameters are finally applied in the analysis to allow detection of variants even at low heteroplasmy frequencies. The method was thoroughly benchmarked in a set of experiments designed to demonstrate its robustness, accuracy and sensitivity. The MitoRS method requires 5 ng total DNA as starting material. More than 96 samples can be processed in less than a day of laboratory work and sequenced in a single lane of an Illumina HiSeq flow cell. The lower limit for accurate quantification of single nucleotide variants has been measured at 1% frequency. Conclusions: The MitoRS method enables the robust, accurate, and sensitive analysis of a large number of samples. Because it is cost effective and simple to set up, we anticipate this method will promote the analysis of mtDNA variants in large cohorts, and may help assess the impact of mtDNA heteroplasmy on metabolic health, brain function, cancer progression, or ageing.
Project description:Sea turtle populations around the world face rapid decline due to the effect of anthropogenic and environmental factors. Among the affected populations are those of hawksbill turtles (Eretmochelys imbricata) and loggerhead turtles (Caretta caretta), which is why a greater effort is currently being made in their monitoring and tracing. The intragenic degree of heteroplasmic mutations, commonly associated with diseases of variable symptoms, has not been analyzed in these species. In this study, heteroplasmy in the complete mitogenome (mtDNA) of three loggerhead turtles and one hawksbill turtle was identified from data obtained by RNAseq. Individuals Cc3, Ei1, Cc1 and Cc2 presented 0.3, 1.7, 1.8 and 7.1% of heteroplasmic mutations in all their mtDNA, respectively. The protein-coding genes that presented the highest percentage of heteroplasmy were ND4 and ND5 in individual Cc2 with 16 and 38.6%, respectively. Of the tRNA genes, only tRNATyr was heteroplasmic in the four individuals with 5.63% (Cc1), 25.35% (Ei1 and Cc2) and 49.3% (Cc3). In this study, we identified the critical sites of heteroplasmy in each individual and the genetic variability of their mitogenomes. The data obtained represents the baseline for future projects that evaluate the population status of these species.