Project description:For high-throughput sequencing and quantification of immunoglobulin repertoires, most methodologies utilise RNA. However, output varies enormously between recombined genes due to different promoter strengths and differential activation of lymphocyte subsets, precluding quantitation of recombinants on a per cell basis. To date, DNA-based approaches have used V gene primer cocktails, with substantial inherent biases. Here we describe VDJ-seq, which accurately quantitates immunoglobulin diversity at the DNA level in an unbiased manner. This is accomplished with a single primer extension step using biotinylated J gene primers. By addition of unique molecular identifiers (UMI) before primer extension, we reliably remove duplicate sequences and correct for sequencing and PCR errors. Furthermore, VDJ-seq captures productive and non-productive VDJ and DJ recombination events on a per cell basis. Library preparation takes 3 days, with 2 days of sequencing, and 1 day of data processing and analysis.
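As an illustration of the UMI step described above, the following is a minimal sketch of collapsing reads that share a UMI into one consensus sequence per molecule. It is not the VDJ-seq pipeline itself: the function name `collapse_by_umi`, the toy reads, and the per-position majority vote are assumptions made for this example, and real data would also need alignment and UMI error handling.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Group (umi, sequence) pairs by UMI and return one consensus per UMI.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # Keep only reads of the most common length so positions line up
        # (a simplification; real reads would be aligned first).
        common_len = Counter(len(s) for s in seqs).most_common(1)[0][0]
        same_len = [s for s in seqs if len(s) == common_len]
        # Majority vote at each position removes isolated PCR/sequencing errors.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*same_len)
        )
    return consensus


# Toy input: three reads from one molecule (one with an error), one from another.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGAAC"),  # differs from its siblings at the fourth base
    ("CCTG", "TTGACA"),
]
molecules = collapse_by_umi(reads)
print(len(molecules), "unique molecules")
print(molecules)
```

Counting consensus sequences rather than raw reads is what makes the quantification independent of amplification depth: each UMI contributes once, however many duplicates it produced.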
Project description:Mycobacterium tuberculosis infects two billion people across the globe, and results in 8-9 million new tuberculosis (TB) cases and 1-1.5 million deaths each year. Most patients have no known genetic basis that predisposes them to disease. Here, we investigate the complex genetic basis of pulmonary TB by modelling human genetic diversity with the Diversity Outbred mouse population. When infected with M. tuberculosis, one-third of these mice develop early onset, rapidly progressive, necrotizing granulomas and succumb within 60 days. The remaining mice develop non-necrotizing granulomas and survive longer than 60 days. Genetic mapping using immune and inflammatory mediators, together with clinical, microbiological, and granuloma correlates of disease, identified five new loci on mouse chromosomes 1, 2, 4, and 16 and three known loci on chromosomes 3 and 17. Further, multiple positively correlated traits shared loci on chromosomes 1, 16, and 17 and had similar patterns of allele effects, suggesting these loci contain critical genetic regulators of inflammatory responses to M. tuberculosis. To narrow the list of candidate genes, we used a machine learning strategy that integrated gene expression signatures from lungs of M. tuberculosis-infected Diversity Outbred mice with gene interaction networks to generate scores representing functional relationships. The scores were used to rank candidates for each mapped trait, resulting in 11 candidate genes: Ncf2, Fam20b, S100a8, S100a9, Itgb5, Fstl1, Zbtb20, Ddr1, Ier3, Vegfa, and Zfp318. Although all candidates have roles in infection, inflammation, cell migration, extracellular matrix remodeling, or intracellular signaling, and all contain single nucleotide polymorphisms (SNPs), SNPs in only four genes (S100a8, Itgb5, Fstl1, Zfp318) are predicted to have deleterious effects on protein function. 
We performed methodological and candidate validations to (i) assess the biological relevance of predicted allele effects by showing that Diversity Outbred mice carrying PWK/PhJ alleles at the H-2 locus within the chromosome 17 QTL have shorter survival; (ii) confirm the accuracy of predicted allele effects by quantifying S100A8 protein in inbred founder strains; and (iii) test a candidate gene by infecting C57BL/6 mice deficient for the S100a8 gene. Overall, this body of work demonstrates that systems genetics using Diversity Outbred mice can identify new (and known) QTLs and functionally relevant gene candidates that may be major regulators of the complex host-pathogen interactions contributing to granuloma necrosis and acute inflammation in pulmonary TB.
Project description:Isolates of the gastric pathogen Helicobacter pylori harvested from different individuals are highly polymorphic. Strain variation also has been observed within a single host. To more fully ascertain the extent of H. pylori genetic diversity within the ecological niche of its natural host, we harvested additional isolates of the sequenced H. pylori strain J99 from its human source patient after a 6-year interval. Randomly amplified polymorphic DNA PCR and DNA sequencing of four unlinked loci indicated that these isolates were closely related to the original strain. In contrast, microarray analysis revealed differences in genetic content among all of the isolates that were not detected by randomly amplified polymorphic DNA PCR or sequence analysis. Several ORFs from loci scattered throughout the chromosome in the archival strain did not hybridize with DNA from the recent strains, including multiple ORFs within the J99 plasticity zone. In addition, DNA from the recent isolates hybridized with probes for ORFs specific for the other fully sequenced H. pylori strain 26695, including a putative traG homolog. Among the additional J99 isolates, patterns of genetic diversity differed both from each other and from the original prototype isolate. These results indicate that within an apparently homogeneous population, as determined by macroscale comparison and nucleotide sequence analysis, remarkable genetic differences exist among single-colony isolates of H. pylori. Direct evidence that H. pylori has the capacity to lose and possibly acquire exogenous DNA is consistent with a model of continuous microevolution within its cognate host. Set of arrays organized by shared biological context, such as organism, tumor types, processes, etc. Keywords: Logical Set
Project description:Immunoglobulin G (IgG) proteins are known for the huge diversity of the variable domains of their heavy and light chains, which protects each individual against foreign antigens. IgG also harbors specific polymorphisms concentrated in the CH2 and CH3-CHS constant regions located on the Fc fragment of the heavy chains. This individual variation relies on only a few amino acids, some of which can make accurate sequence determination a challenge for mass spectrometry-based techniques. The purpose of the study was to validate proteomic results at the molecular level by sequencing the encoding DNA fragments. It was performed using ten individual samples (DNA and sera) selected on the basis of their Gm (gamma marker) allotype polymorphism in order to cover the main immunoglobulin heavy gamma (IGHG) gene diversity. Gm allotypes, reflecting part of this diversity, were determined by a serological method. The IGH locus comprises four functional IGHG genes totaling 34 alleles and encoding the four IgG subclasses. The genomic study focused on the nucleotide polymorphism of the CH2 and CH3-CHS exons and of the intron. Despite strong sequence identity, four pairs of gene-specific amplification primers could be designed. Additional primers were identified to perform the subsequent sequencing. The nucleotide sequences obtained were first assigned to a specific IGHG gene, and IGHG alleles were then deduced by reading the nucleotide sequences with an in-house decision tree. IGHG amino acid (AA) alleles were determined by mass spectrometry. The alleles identified by proteomics and those deduced from genomics were identical in 95% of cases. These results validate the proteomic approach, which could be used for diagnostic purposes, namely for differential mother-and-child IGHG detection when congenital infection is suspected.
Project description:Activation-induced cytidine deaminase (AID) is required for both somatic hypermutation (SHM) and class-switch recombination (CSR) in activated B cells. AID is also known to target non-immunoglobulin genes and introduce mutations or chromosomal translocations, eventually causing tumors. To identify as-yet-unknown AID targets, we screened early AID-induced DNA breaks using two independent genome-wide approaches. Along with known AID targets, this screen identified a set of novel genes (SNHG3, MALAT1, BCL7A, and CUX1), and we confirmed that these new loci accumulated mutations at levels as high as those at the Ig locus after AID activation. Moreover, these genes share three important characteristics with the immunoglobulin gene: translocations in tumors, repetitive sequences, and epigenetic modification of chromatin by H3K4 trimethylation in the vicinity of cleavage sites.
Project description:Marine algae convert a substantial fraction of the carbon dioxide they fix into various polysaccharides. Bacteria specialized in the remineralization of these polysaccharides often feature genomic clusters, termed polysaccharide utilization loci (PULs). Such PULs are prevalent in, but not limited to, marine Flavobacteriia. Since knowledge of extant PUL diversity is sparse, we sequenced the genomes of 53 North Sea Flavobacteriia. We obtained 400 PULs, suggesting usage of a large array of polysaccharides, including laminarin, α- and β-mannans, fucose-, xylose-, galactose-, rhamnose- and arabinose-containing substrates, pectins, and chitins. Many of the PULs were novel, some indicating substrates that have rarely been described in marine environments. PUL repertoires of isolates often differed significantly within genera, corroborating ecological niche-associated glycan partitioning. Polysaccharide uptake in Flavobacteriia is mediated by SusCD. Respective protein trees revealed clustering according to polysaccharide specificities. Analysis of SusCD expression in multiyear phytoplankton bloom-associated metaproteomes indicated changes in microbial utilization of glucan, β-mannan and sulfated xylan, suggesting that distinct substrates are temporarily abundant.
Project description:Copy number variants (CNVs) affect both disease and normal phenotypic variation but those lying within heavily duplicated, highly identical sequence have been difficult to assay. By analyzing short-read mapping depth for 159 human genomes, we demonstrate accurate estimation of absolute copy number for duplications as small as 1.9 kbp, ranging from 0-48 copies. We identified 4.1 million ‘singly unique nucleotide’ (SUN) positions informative in distinguishing specific copies, and use them to genotype the copy and content of specific paralogs within highly duplicated gene families. These data identify human-specific expansions in genes associated with brain development, reveal extensive population genetic diversity, and detect signatures consistent with gene conversion in the human species. Our approach makes ~1000 genes accessible to genetic studies of disease association. This dataset complements the results from short read sequencing by performing validation on five individuals. We analyzed the 17q21.1 locus in 5 HapMap individuals by array CGH on a custom Agilent 4-plex 310k array performing 1 experiment for each sample. This array was targeted at high density (1 probe/105bp) to 7 genomic loci, including 17q21. The reference individual used was NA19240.
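The read-depth idea in the description above can be sketched as a simple scaling: a region's mean mapping depth is divided by the depth per copy observed in control regions of known copy number. This is a minimal sketch, not the authors' method; the function name `estimate_copy_number`, the depth values, and the assumption that control regions are diploid are all hypothetical, and corrections a real analysis would need (for example GC bias) are omitted.

```python
from statistics import median


def estimate_copy_number(region_depth, control_depths, control_copies=2):
    """Scale a region's mean read depth by the per-copy depth of control regions.

    Hypothetical helper: control regions are assumed to have a known,
    fixed copy number (diploid by default).
    """
    depth_per_copy = median(control_depths) / control_copies
    return region_depth / depth_per_copy


# Made-up mean depths for control windows assumed to be diploid.
control = [29.5, 30.0, 30.5, 31.0, 30.0]

# Made-up regions: a homozygous deletion, a normal region, an expanded paralog.
estimates = {
    name: round(estimate_copy_number(depth, control))
    for name, depth in [("deleted", 0.0), ("diploid", 30.0), ("expanded", 180.0)]
}
print(estimates)
```

Using the median of the control windows rather than the mean keeps a few aberrant windows from shifting the per-copy depth.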
Project description:Programmed genetic rearrangements in lymphocytes require transcription at antigen receptor genes to promote accessibility for initiating double-strand break (DSB) formation critical for DNA recombination and repair. Here we show that activated B cells deficient in the PTIP component of the MLL3 (mixed-lineage leukemia 3)/MLL4 complex display impaired histone methylation (H3K4me3) and transcription initiation of downstream switch regions at the immunoglobulin heavy-chain (Igh) locus, leading to defective immunoglobulin class-switching. We also show that PTIP accumulation at DSBs contributes to class-switch recombination (CSR) and genome stability independently of Igh switch transcription. These results demonstrate that PTIP promotes specific chromatin changes that control the accessibility of the Igh locus to CSR, and suggest a non-redundant role for the MLL3/MLL4 complex in altering antibody effector function. Genome-wide analysis of histone modifications, PTIP, and Pol II in PTIP-WT and PTIP-KO mouse activated B cells.