Project description: A comparative genomic approach was used to identify large sequence polymorphisms among Mycobacterium avium isolates obtained from a variety of host species. DNA microarrays were used as a platform for comparing mycobacterial field isolates with the sequenced bovine isolate Mycobacterium avium subsp. paratuberculosis (Map) K10. ORFs were classified as present or divergent based on the relative fluorescence intensities of the experimental samples compared to Map K10 DNA. Map isolates cultured from cattle, bison, sheep, goat, avian, and human sources were hybridized to the Map microarray. Three large deletions were observed in the genomes of four Map isolates obtained from sheep, and four clusters of ORFs homologous to sequences in the Mycobacterium avium subsp. avium (Maa) 104 genome were identified as being present in these isolates. One of these clusters encodes glycopeptidolipid biosynthesis enzymes. One of the Map sheep isolates had a genome profile similar to a group of Mycobacterium avium subsp. silvaticum (Mas) isolates, which included four independent laboratory stocks of the organism traditionally identified as Maa strain 18. Genome diversity in Map appears to be mostly restricted to large sequence polymorphisms that are often associated with mobile genetic elements. Keywords: Comparative genomic hybridization
Project description: Related surrogate species are often used to study the molecular basis of a pathogen's pathogenicity, on the basis of a shared set of biological features generally attributable to a shared core genome consisting of orthologous genes. An important and understudied aspect, however, is the extent to which regulatory features affecting the expression of such shared genes are present in both species. Here we report on an analysis of whole transcriptome maps for an important member of the TB complex, Mycobacterium bovis, and a closely related model organism for studying mycobacterial pathogenicity, Mycobacterium marinum.
Project description: Detection of species-specific proteotypic peptides is essential for accurate and easy characterization of infectious non-tuberculous mycobacteria such as Mycobacterium avium subsp. paratuberculosis, Mycobacterium marinum and Mycobacterium vaccae. Therefore, we reanalyzed the publicly available M. avium subsp. paratuberculosis, M. marinum and M. vaccae proteomic datasets PXD027444, PXD003766 and PASS00954 by proteome database search followed by spectral library generation. The raw DDA data were searched against their respective reference proteome databases using Proteome Discoverer and FragPipe. The resulting peptide-spectrum matches were converted into a spectral library using BiblioSpec.
Project description: We employed a proteogenomics workflow to identify microproteins encoded by small open reading frames (ORFs) in the genome of Mycobacterium smegmatis strain mc²155.
Project description: The domestic goat, Capra hircus (2n=60), is one of the most important domestic livestock species in the world. Here we report its high-quality reference genome, generated by combining Illumina short-read sequencing with a new automated, high-throughput whole genome mapping system based on optical mapping technology, which was used to generate extremely long super-scaffolds. The N50 sizes of contigs, scaffolds, and super-scaffolds for the sequence assembly reported herein are 18.7 kb, 3.06 Mb, and 18.2 Mb, respectively. Almost 95% of the super-scaffolds are anchored on chromosomes based on conserved syntenic information with cattle. The assembly is strongly supported by the RH map of goat chromosome 1. We annotated 22,175 protein-coding genes, most of which are recovered by RNA-seq data from ten tissues. Rapidly evolving genes and gene families are enriched in metabolism and immune systems, consistent with the fact that the goat is one of the most adaptable and geographically widespread livestock species. Comparative transcriptomic analysis of the primary and secondary follicles of a cashmere goat revealed 51 genes that were significantly differentially expressed between the two types of hair follicles. This study not only provides a high-quality reference genome for an important livestock species, but also shows that the new automated optical mapping technology can be used in de novo assembly of large genomes. Corresponding whole genome sequencing is available in NCBI BioProject PRJNA158393. We sequenced a 3-year-old female Yunnan black goat and constructed a reference sequence for this breed. To improve the quality of gene models, RNA samples of ten tissues (Bladder, Brain, Heart, Kidney, Liver, Lung, Lymph, Muscle, Ovary, Spleen) were extracted from the same goat that was sequenced.
To investigate the genetic basis underlying the development of cashmere fibers using the goat reference genome assembly and annotated genes, we extracted RNA samples of primary and secondary hair follicles from three Inner Mongolia cashmere goats and conducted transcriptome sequencing and differential gene expression (DGE) analysis. This submission represents the RNA-Seq component of the study.
Project description: Related surrogate species are often used to study the molecular basis of a pathogen's pathogenicity, on the basis of a shared set of biological features generally attributable to a shared core genome consisting of orthologous genes. An important and understudied aspect, however, is the extent to which regulatory features affecting the expression of such shared genes are present in both species. Here we report on an analysis of whole transcriptome maps for an important member of the TB complex, Mycobacterium bovis, and a closely related model organism for studying mycobacterial pathogenicity, Mycobacterium marinum. Predict transcription start site.
Project description: A comparative genomic approach was used to identify large sequence polymorphisms among Mycobacterium avium isolates obtained from a variety of host species. DNA microarrays were used as a platform for comparing mycobacterial field isolates with the sequenced bovine isolate Mycobacterium avium subsp. paratuberculosis (Map) K10. ORFs were classified as present or divergent based on the relative fluorescence intensities of the experimental samples compared to Map K10 DNA. Map isolates cultured from cattle, bison, sheep, goat, avian, and human sources were hybridized to the Map microarray. Three large deletions were observed in the genomes of four Map isolates obtained from sheep, and four clusters of ORFs homologous to sequences in the Mycobacterium avium subsp. avium (Maa) 104 genome were identified as being present in these isolates. One of these clusters encodes glycopeptidolipid biosynthesis enzymes. One of the Map sheep isolates had a genome profile similar to a group of Mycobacterium avium subsp. silvaticum (Mas) isolates, which included four independent laboratory stocks of the organism traditionally identified as Maa strain 18. Genome diversity in Map appears to be mostly restricted to large sequence polymorphisms that are often associated with mobile genetic elements. Keywords: Comparative genomic hybridization. Each isolate was competitively hybridized against Map K10 with a minimum of 2 dye-flip hybridizations per isolate.
Project description: The domestic goat, Capra hircus (2n=60), is one of the most important domestic livestock species in the world. Here we report its high-quality reference genome, generated by combining Illumina short-read sequencing with a new automated, high-throughput whole genome mapping system based on optical mapping technology, which was used to generate extremely long super-scaffolds. The N50 sizes of contigs, scaffolds, and super-scaffolds for the sequence assembly reported herein are 18.7 kb, 3.06 Mb, and 18.2 Mb, respectively. Almost 95% of the super-scaffolds are anchored on chromosomes based on conserved syntenic information with cattle. The assembly is strongly supported by the RH map of goat chromosome 1. We annotated 22,175 protein-coding genes, most of which are recovered by RNA-seq data from ten tissues. Rapidly evolving genes and gene families are enriched in metabolism and immune systems, consistent with the fact that the goat is one of the most adaptable and geographically widespread livestock species. Comparative transcriptomic analysis of the primary and secondary follicles of a cashmere goat revealed 51 genes that were significantly differentially expressed between the two types of hair follicles. This study not only provides a high-quality reference genome for an important livestock species, but also shows that the new automated optical mapping technology can be used in de novo assembly of large genomes. We sequenced a 3-year-old female Yunnan black goat and constructed a reference sequence for this breed. To improve the quality of gene models, RNA samples of ten tissues (Bladder, Brain, Heart, Kidney, Liver, Lung, Lymph, Muscle, Ovary, Spleen) were extracted from the same goat that was sequenced.
To investigate the genetic basis underlying the development of cashmere fibers using the goat reference genome assembly and annotated genes, we extracted RNA samples of primary and secondary hair follicles from three Inner Mongolia cashmere goats and conducted transcriptome sequencing and differentially expressed gene (DEG) analysis. Corresponding whole genome sequencing is available in NCBI BioProject PRJNA158393.
Project description: Biochemical evidence is vital for accurate genome annotation. The integration of experimental data collected at the proteome level using high-resolution mass spectrometry allows for improvements in genome annotation by providing evidence for novel gene models, while validating or modifying others. Here we report the results of a proteogenomic analysis of a reference strain of Mycobacterium smegmatis (mc²155), a fast-growing model organism for the pathogenic Mycobacterium tuberculosis, the causative agent of tuberculosis. By integrating high-throughput LC/MS/MS proteomic data with genomic six-frame translation and ab initio gene prediction databases, a total of 2887 ORFs were identified, including 2810 ORFs annotated to a Reference protein, and 63 ORFs not previously annotated to a Reference protein. Further, the translational start site (TSS) was validated for 558 Reference proteome gene models, while upstream translational evidence was identified for 81. In addition, N-terminus-derived peptide identifications allowed for downstream TSS modification of a further 24 gene models. We validated the existence of 6 previously described interrupted coding sequences at the peptide level, and provide evidence for 4 novel frameshift positions. Analysis of peptide posterior error probability (PEP) scores indicates that the novel peptide identifications are of high confidence and that the genome of M. smegmatis is not yet fully annotated.