Project description:Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from both the paired normal genome and public SNP databases. However, the current build of dbSNP, the most comprehensive public SNP database, inadequately represents several non-European Caucasian populations, posing a limitation in cancer genomic analyses of data from these populations. We present the Tata Memorial Centre-SNP DataBase (TMC-SNPdb) as the first open source, flexible, upgradable, and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR), representing 114 309 unique germline variants generated from whole exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool that can be executed from the command line or through an easy-to-use graphical user interface, with the ability to deplete additional Indian population specific SNPs over and above dbSNP and 1000 Genomes databases. Using an institutionally generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42, 33 and 28% of false positive somatic events post dbSNP depletion in Indian origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain at the following Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html.
Project description:Cancer is predominantly a somatic disease. A mutant allele found in a cancer cell genome is considered somatic when it is absent from both the paired normal genome and dbSNP, the most comprehensive public SNP database. However, dbSNP inadequately represents several non-Caucasian populations, including those from the Indian subcontinent, posing a limitation in cancer genomic analyses of data from these populations. We present TMC-SNPdb as the first open source, freely accessible (through ANNOVAR), flexible and upgradable SNP database built from whole exome data of 62 normal samples derived from cancer patients of Indian origin, representing 114,309 unique germline variants. TMC-SNPdb is presented with a companion subtraction tool that can be executed from the command line or through an easy-to-use graphical user interface (GUI), with the ability to deplete additional Indian population specific SNPs over and above what is possible with dbSNP and 1000 Genomes databases. Using an institutionally generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb reduced false positive somatic events by 42%, 33% and 28% post dbSNP depletion in Indian origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of TMC-SNPdb in several Mendelian germline diseases.
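The subtraction logic described above (a tumor variant is kept as a somatic candidate only when it is absent from the paired normal and from every germline SNP database) can be illustrated with a minimal Python sketch. This is not the TMC-SNPdb companion tool or its interface: the `Variant` type, the `subtract_germline` function, and the example coordinates are all hypothetical and exist only to show the set-difference idea.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, 1-based position, ref, alt."""
    chrom: str
    pos: int
    ref: str
    alt: str


def subtract_germline(
    tumor: Iterable[Variant],
    paired_normal: Iterable[Variant],
    *snp_databases: Iterable[Variant],
) -> List[Variant]:
    """Return tumor variants absent from the paired normal and all SNP databases.

    Illustrative only: real pipelines also normalise alleles, handle
    multi-allelic sites, and apply quality filters before subtraction.
    """
    known_germline = set(paired_normal)
    for database in snp_databases:
        known_germline.update(database)
    # Preserve the input order of tumor calls.
    return [v for v in tumor if v not in known_germline]


# Made-up example data (coordinates are not real calls).
tumor_calls = [
    Variant("chr1", 1000, "A", "G"),   # also in the paired normal
    Variant("chr2", 2000, "C", "T"),   # listed in the general SNP database
    Variant("chr3", 3000, "G", "A"),   # listed only in the population database
    Variant("chr4", 4000, "T", "C"),   # absent everywhere: somatic candidate
]
normal_calls = [Variant("chr1", 1000, "A", "G")]
general_snp_db = [Variant("chr2", 2000, "C", "T")]
population_snp_db = [Variant("chr3", 3000, "G", "A")]

after_general = subtract_germline(tumor_calls, normal_calls, general_snp_db)
after_population = subtract_germline(
    tumor_calls, normal_calls, general_snp_db, population_snp_db
)
```

In this toy data the population-specific database removes one further candidate that the general database leaves in place, which is the kind of additional depletion the abstract reports.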
Project description:The study involves whole exome sequencing of 20 primary tumors obtained from lung squamous carcinoma patients of Indian origin. With this, we aim to describe the mutational profile of this specific subset of lung cancer patients. This knowledge will further allow us to gain an insight into potentially actionable genomic alterations prevalent in Indian lung squamous carcinoma.
Project description:Azospirillum brasilense is used worldwide as a plant growth-promoting inoculant for agricultural crops. To understand how the genomes of Indian strains of A. brasilense compare with their South American counterparts, we determined the whole-genome sequences of four strains of A. brasilense isolated from the rhizosphere of grasses from India.
Project description:Breast cancer (BC) has emerged as the most common malignancy among females. The genomic profile of BC is diverse and complex owing to heterogeneity among geographically distinct ethnic groups. The primary objective of this study was to carry out a comprehensive mutational analysis of Indian BC cases by performing whole exome sequencing. The cohort included patients with a median age of 48 years. TTN, TP53, MUC16, SYNE1, and OBSCN were the most frequently altered genes in our cohort. The PIK3CA and KLC3 genes are driver genes implicated in various cellular functions and cargo transportation through microtubules, respectively. Except for CCDC168 and PIK3CA, several gene pairs showed significant co-occurrence. Irrespective of hormonal receptor status, RTK/RAS was among the frequently altered signaling pathways. Further analysis of the mutational signatures revealed that SBS13, SBS6, and SBS29 were the main signatures observed in our cohort. This study supplements the discovery of diagnostic biomarkers and provides new therapeutic options for the improved management of BC.
Project description:Feed regimens have a pivotal role in modulating the transcriptional programs that, in turn, have an impact on many biological processes, including metabolism, health and development. A green feed diet in ruminants exerts a beneficial effect on rumen metabolism and enhances the content of health-promoting biomolecules in the milk. However, a comprehensive analysis focused on identifying the genes, and therefore the biological processes, modulated by the green feed diet in buffalo rumen has not been reported so far. To highlight the impact of the green feed diet on ruminal transcriptomic profiles, we performed RNA-sequencing in buffaloes fed a total mixed ration (TMR) with the inclusion of 30% ryegrass green feed (treated group) in comparison with buffaloes fed a dry TMR diet (control group).
Project description:Leber congenital amaurosis (LCA) is a severe autosomal recessive retinal degenerative disease. The current study describes exome sequencing results for two unrelated Indian LCA patients carrying novel nonsense p.(Glu636*) and frameshift p.(Pro2281Leufs*63) mutations in the ALMS1 gene. Although ALMS1 gene mutations are associated with Alstrom syndrome (AS), the current patients did not exhibit typical syndromic features of AS. These data suggest that ALMS1 should be included in the candidate gene panel for LCA to improve diagnostic efficiency.
Project description:Copy number variants are duplications and deletions of the genome that play an important role in phenotypic changes and human disease. Many software applications have been developed to detect copy number variants using either whole-genome sequencing or whole-exome sequencing data. However, there is poor agreement in the results from these applications. Simulated datasets containing copy number variants allow comprehensive comparisons of the operating characteristics of existing and novel copy number variant detection methods. Several software applications have been developed to simulate copy number variants and other structural variants in whole-genome sequencing data. However, none of the applications reliably simulate copy number variants in whole-exome sequencing data. We have developed and tested Simulator of Exome Copy Number Variants (SECNVs), a fast, robust and customizable software application for simulating copy number variants and whole-exome sequences from a reference genome. SECNVs is easy to install, implements a wide range of commands to customize simulations, can output multiple samples at once, and incorporates a pipeline to output rearranged genomes, short reads and BAM files in a single command. Variants generated by SECNVs are detected with high sensitivity and precision by tools commonly used to detect copy number variants. SECNVs is publicly available at https://github.com/YJulyXing/SECNVs.