Dataset Information

Characterizing genomic variants and mutations in SARS-CoV-2 proteins from Indian isolates.

ABSTRACT: SARS-CoV-2 is mutating and creating divergent variants by altering the composition of its essential constituent proteins. Pharmacologically, it is crucial to understand the diverse mechanisms of mutation for the design of stable vaccines or anti-viral drugs. Our current study concentrates on all the constituent proteins of 469 SARS-CoV-2 genome samples derived from Indian patients; however, the study may easily be extended to samples from across the globe. We perform clustering analysis to identify unique variants in each of the SARS-CoV-2 proteins. A total of 536 mutated positions within the coding regions of SARS-CoV-2 proteins are detected among the identified variants from Indian isolates. We quantify mutations by focusing on the unique variants of each SARS-CoV-2 protein. We report the average number of mutations per variant, the percentage of mutated positions, synonymous and non-synonymous mutations, mutations occurring at each of the three codon positions, and so on. Our study reveals the six most susceptible proteins: ORF1ab, Spike (S), Nucleocapsid (N), ORF3a, ORF7a, and ORF8. Several non-synonymous substitutions are observed to be unique to different SARS-CoV-2 proteins. A total of 57 possibly deleterious amino acid substitutions are predicted, which may affect protein function. Several mutations show a large decrease in protein stability and are observed in putative functional domains of the proteins that might have some role in disease pathogenesis. We observe a considerable number of physicochemical property changes accompanying these deleterious substitutions.
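
The distinction the abstract draws between synonymous and non-synonymous mutations, and between the three codon positions, can be illustrated with a minimal sketch. It is not the study's pipeline: it assumes two in-frame coding sequences of equal length that differ only by substitutions (no indels, no ambiguous bases) and uses the standard genetic code. The function name `classify_substitutions` and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(ref_cds, var_cds):
    """Compare two aligned coding sequences codon by codon.

    Assumption (not from the source): both sequences are in frame,
    have equal length, and contain only A, C, G, T.
    Returns one record per substituted nucleotide.
    """
    if len(ref_cds) != len(var_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be equal-length and a multiple of 3")

    records = []
    for start in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[start:start + 3].upper()
        var_codon = var_cds[start:start + 3].upper()
        if ref_codon == var_codon:
            continue
        ref_aa = CODON_TABLE[ref_codon]
        var_aa = CODON_TABLE[var_codon]
        # The label is assigned per codon: if several bases of one codon
        # change, every record for that codon shares the same label.
        kind = "synonymous" if ref_aa == var_aa else "non-synonymous"
        for offset in range(3):
            if ref_codon[offset] != var_codon[offset]:
                records.append({
                    "nt_position": start + offset + 1,   # 1-based in the CDS
                    "codon_position": offset + 1,        # 1, 2, or 3
                    "aa_position": start // 3 + 1,       # 1-based residue
                    "ref_aa": ref_aa,
                    "var_aa": var_aa,
                    "kind": kind,
                })
    return records


# Toy example: ATG GAT TTT versus ATG GGT TTC.
mutations = classify_substitutions("ATGGATTTT", "ATGGGTTTC")
for m in mutations:
    print(m)
```

In the toy input, the second codon changes from GAT (Asp) to GGT (Gly) at codon position 2, a non-synonymous change, while the third codon changes from TTT to TTC at codon position 3 and still encodes Phe.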

SUBMITTER: Das JK

PROVIDER: S-EPMC7893251 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

The molecular assessment of SARS-CoV-2 Nucleocapsid Phosphoprotein variants among Indian isolates.

Project description:Coronavirus disease 2019 (COVID-19) has rapidly become a major threat to humans due to its high infection rate and the deaths it has caused worldwide. This disease is caused by an RNA virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). This class of viruses has a higher mutation rate than DNA viruses, which enables them to adapt and to evade the host immune system. Here, we compared the first known Nucleocapsid Phosphoprotein (N protein) sequence of SARS-CoV-2 from China with sequences from Indian COVID-19 patients to understand whether the virus is mutating as it spreads to new locations. Our data revealed twenty mutations among Indian isolates. Of these, mutations at six positions led to changes in the secondary structure of the N protein. Further, we show that these mutations primarily destabilise the protein structure. The candidate mutations identified in this study may help to speed up the understanding of variations occurring in SARS-CoV-2.

| S-EPMC7848562 | biostudies-literature

Identification of genotypic variants and its proteomic mutations of Brazilian SARS-CoV-2 isolates.

Project description:The second wave of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is rapidly spreading over the world. The mechanisms behind escape from current antivirals are still unclear due to the continuous occurrence of SARS-CoV-2 genetic variants. Brazil is the world's second-most COVID-19 affected country. In the present study, we identified the genomic and proteomic variants of Brazilian SARS-CoV-2 isolates. We identified 16 different genotypic variants among the 27 isolates. The genotypes of three isolates, Bra/1236/2021 (G15), Bra/MASP2C844R2/2020 (G11), and Bra/RJ-DCVN5/2020 (G9), have unique mutations in NSP4 (S184N), 2'O-Mutase (R216N), membrane protein (A2V) and Envelope protein (V5A). A mutation in the RdRp of SARS-CoV-2, specifically the change of Pro to Leu at position 323, resulted in the stabilization of the structure in BRA/CD1739-P4/2020. NSP4 and NSP5 protein mutants are more virulent in genotypes 15 and 16. A fast protein folding rate changes the structural stability and leads to escape from current antivirals. Thus, our findings can help researchers to develop more potent antivirals based on the new mutants of Brazilian isolates.

| S-EPMC8563081 | biostudies-literature

Characterizing SARS-CoV-2 mutations in the United States.

Project description:The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been mutating since it was first sequenced in early January 2020. The genetic variants have developed into a few distinct clusters with different properties. Since the United States (US) has the highest number of virus-infected patients globally, it is essential to understand the US SARS-CoV-2. Using genotyping, sequence alignment, time evolution, k-means clustering, protein-folding stability, algebraic topology, and network theory, we reveal that the US SARS-CoV-2 has four substrains and that the five top US SARS-CoV-2 mutations were first detected in China (2 cases), Singapore (2 cases), and the United Kingdom (1 case). The next three top US SARS-CoV-2 mutations were first detected in the US. These eight top mutations belong to two disconnected groups. The first group, consisting of five concurrent mutations, is prevailing, while the other group, with three concurrent mutations, is gradually fading out. We identify that one of the top mutations, 27964C>T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we further uncover that three of the four US SARS-CoV-2 substrains have become more infectious. Our study calls for effective viral control and containment strategies in the US.

| S-EPMC7430589 | biostudies-literature

Genomic characterization of SARS-CoV-2 isolates from patients in Turkey reveals the presence of novel mutations in spike and nsp12 proteins.

Project description:Novel mutations have been emerging in the genome of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); consequently, the evolution of more virulent and treatment-resistant strains has the potential to increase transmissibility and mortality rates. The characterization of full-length SARS-CoV-2 genomes is critical for understanding the origin and transmission pathways of the virus, as well as for identifying mutations that affect its transmissibility and pathogenicity. We present an analysis of the mutation pattern and clade distribution of full-length SARS-CoV-2 genome sequences obtained from specimens tested at Gazi University Medical Virology Laboratory. Viral RNA was extracted from nasopharyngeal specimens. Next-generation sequencing libraries were prepared and sequenced on the Illumina iSeq 100 platform. Raw sequencing data were processed to obtain full-length genome sequences, and variant calling was performed to analyze amino acid changes. Clade distribution was determined to understand the phylogenetic background in relation to global data. A total of 293 distinct mutations were identified, of which 152 were missense, 124 synonymous, 12 noncoding, and 5 deletions. The most frequent mutations were P323L (nsp12), D614G (ORF2/S), and 2421C>T (5'-untranslated region), found simultaneously in all sequences. Novel mutations were found in nsp12 (V111A, H133R, Y453C, M626K) and ORF2/S (R995G, V1068L). Nine different Pangolin lineages were detected. The most frequently assigned lineage was B.1.1 (17 sequences), followed by B.1 (7 sequences) and B.1.1.36 (3 sequences). Sequence information is essential for revealing genomic diversity. Mutations might have significant functional implications, and analysis of these mutations provides valuable information for therapeutic and vaccine development studies. Our findings point to the introduction of the virus into Turkey through various sources and the subsequent spread of several key variants.

| S-EPMC8426744 | biostudies-literature

Outbreak.info genomic reports: scalable and dynamic surveillance of SARS-CoV-2 variants and mutations.

Project description:The emergence of SARS-CoV-2 variants of concern has prompted the need for near real-time genomic surveillance to inform public health interventions. In response to this need, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of PANGO lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials, and the general public. We describe the interpretable and opinionated visualizations in the variant and location focussed reports available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data, and the server infrastructure that enables widespread data dissemination via a high performance API that can be accessed using an R package. We present a case study that illustrates how outbreak.info can be used for genomic surveillance and as a hypothesis generation tool to understand the ongoing pandemic at varying geographic and temporal scales. With an emphasis on scalability, interactivity, interpretability, and reusability, outbreak.info provides a template to enable genomic surveillance at a global and localized scale.

| S-EPMC9258294 | biostudies-literature

Molecular docking studies of Indian variants of pathophysiological proteins of SARS-CoV-2 with selected drug candidates.

Project description:The SARS-CoV-2 pandemic has recently brought the entire world to a standstill. The number of cases in the world, especially in India, has been increasing exponentially. The need of the hour is to assimilate as much data as possible to fast-track the pipeline for bringing in new therapeutic tools against this fatal virus. In this brief communication, we aim to shed light on the variants of the proteins heavily involved in the pathophysiology of COVID-19, namely Spike protein, ACE2, GRP78, TMPRSS2 and NSP12. We also present molecular docking studies of these proteins with specific drugs that are currently associated with them. In our brief study, we come across a few key findings. First, the combinations of spike protein and ACE2 variants show an overall 25% unfavourable ΔΔG on binding. Second, NSP12 is the most mutation-prone of all the NSPs of the SARS-CoV-2 genome, and its most common mutations are P323L and A97V. Third, we discovered variants found in the Indian subpopulation that have greater binding with the currently investigated drugs.

| S-EPMC8435403 | biostudies-literature

Genomic Characterization of SARS-CoV-2 Variants from Clinical Isolates during the COVID-19 Epidemic in Mauritania.

Project description:The rapid genetic evolution of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) during the coronavirus disease 2019 (COVID-19) pandemic has greatly challenged public health authorities worldwide, including in Mauritania. Despite the presence of the virus in Mauritania, only one study described its genomic variation during the course of the epidemic. The purpose of the present study was to document the genomic pattern of SARS-CoV-2 variants from clinical isolates during the COVID-19 outbreak in Mauritania, from September to November 2021. The whole genomes from 54 SARS-CoV-2 strains detected in nasopharyngeal swabs with a cycle threshold value ≤ 30 were successfully sequenced using next-generation sequencing (NGS) and the Illumina protocol. The mean genome coverage (±standard deviation) was 96.8% (±3.7). The most commonly identified clade was 21J (57.4%), followed by 21D (16.7%), 20A (11.1%), and 20B (9.2%). At the level of lineages, the majority of the samples were Delta variants with the sub-lineage AY.34 (or B.1.617.2.34). Among the 54 SARS-CoV-2 isolates that were successfully sequenced, 33 (61.1%) came from vaccinated individuals, and 21 (38.9%) were from unvaccinated individuals. Several SARS-CoV-2 variants were present in Mauritania between September and November 2021. As Mauritania, like many West African countries, is resource-limited regarding viral genome sequencing facilities, establishment of mutualized sub-regional sequencing platforms will be necessary to ensure continuous monitoring of mutations in viral genomes and track potential reduction in COVID-19 vaccine efficacy, increased transmissibility, and disease severity.

| S-EPMC10970642 | biostudies-literature

Geographic and Genomic Distribution of SARS-CoV-2 Mutations.

Project description:The novel respiratory disease COVID-19 has reached the status of a worldwide pandemic, and large efforts are currently being undertaken to molecularly characterize the virus causing it, SARS-CoV-2. The genomic variability of SARS-CoV-2 specimens scattered across the globe may underlie geographically specific etiological effects. In the present study, we gather the 48,635 SARS-CoV-2 complete genomes currently available thanks to the collection endeavor of the GISAID consortium and thousands of contributing laboratories. We analyzed and annotated all SARS-CoV-2 mutations compared with the reference Wuhan genome NC_045512.2, observing an average of 7.23 mutations per sample. Our analysis shows the prevalence of single nucleotide transitions as the major mutational type across the world. There exist at least three clades characterized by geographic and genomic specificity. In particular, clade G, prevalent in Europe, carries a D614G mutation in the Spike protein, which is responsible for the initial interaction of the virus with the host human cell. Our analysis may facilitate custom-designed antiviral strategies based on the molecular specificities of SARS-CoV-2 in different patients and geographical locations.
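
The two summary statistics mentioned in this description, the average number of mutations per sample and the share of transitions among single-nucleotide substitutions, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the samples are assumed to be already aligned to the reference and of identical length, indels are ignored, and positions with ambiguous bases are skipped. The function names and the toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref_base, alt_base):
    """Label a single-nucleotide substitution.

    A transition swaps purine for purine (A<->G) or pyrimidine for
    pyrimidine (C<->T); any other A/C/G/T change is a transversion.
    """
    ref_base, alt_base = ref_base.upper(), alt_base.upper()
    if ref_base == alt_base:
        raise ValueError("bases are identical; no substitution")
    same_class = (
        (ref_base in PURINES and alt_base in PURINES)
        or (ref_base in PYRIMIDINES and alt_base in PYRIMIDINES)
    )
    return "transition" if same_class else "transversion"


def summarize_substitutions(reference, samples):
    """Return (mean substitutions per sample, counts by type).

    Assumption: every sample is pre-aligned to `reference` and has the
    same length. Positions where either base is not A, C, G, or T
    (for example N or a gap) are skipped.
    """
    counts = {"transition": 0, "transversion": 0}
    reference = reference.upper()
    for sample in samples:
        if len(sample) != len(reference):
            raise ValueError("sample and reference lengths differ")
        for ref_base, alt_base in zip(reference, sample.upper()):
            if ref_base == alt_base:
                continue
            if ref_base not in "ACGT" or alt_base not in "ACGT":
                continue
            counts[substitution_type(ref_base, alt_base)] += 1
    total = counts["transition"] + counts["transversion"]
    mean_per_sample = total / len(samples) if samples else 0.0
    return mean_per_sample, counts


# Toy data: three short "genomes" compared with a toy reference.
toy_reference = "ACGTACGT"
toy_samples = ["ACGTACGT", "GCGTACGT", "ACGTACTA"]
mean_count, type_counts = summarize_substitutions(toy_reference, toy_samples)
print(mean_count, type_counts)
```

For real genomes, the comparison would normally start from a multiple sequence alignment or a variant-call file against NC_045512.2 rather than from raw equal-length strings.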

| S-EPMC7387429 | biostudies-literature

Genomic Tracking of SARS-CoV-2 Variants in Myanmar.

Project description:In December 2019, the COVID-19 disease started in Wuhan, China. The WHO declared a pandemic on 12 March 2020, and the disease started in Myanmar on 23 March 2020. In December 2020, different variants spread worldwide, threatening global health. To counter those threats, Myanmar started the COVID-19 variant surveillance program in late 2020. Whole genome sequencing was done six times between January 2021 and March 2022. Among the samples, 83 with a PCR threshold cycle of less than 25 were chosen. We then used MiSeq FGx for sequencing and the Illumina DRAGEN COVIDSeq pipeline, command line interface, GISAID, and MEGA version 7 for data analysis. In January 2021, no variant was detected. The second run, during the rise of cases in June 2021, showed Alpha, Delta, and Kappa variants. The third and fourth runs, in August and December, showed only the Delta variant. Omicron and Delta variants were detected during the fifth run in January 2022. The sixth run in March 2022 showed only Omicron BA.2. Amino acid mutations in the receptor-binding domain of the Spike glycoprotein appeared from the second run onward, coinciding with high transmission, recurrence, and vaccine escape. We also found mutations at the primer targets used in current RT-PCR platforms, but there was no mutation at the existing antiviral drug targets. The occurrence of multiple variants and mutations calls for vigilance at ports of entry and preparedness for effective control measures. Genomic surveillance with the observation of evolutionary data is required to predict imminent threats of the current disease and to diagnose emerging infectious diseases.

| S-EPMC9862072 | biostudies-literature

Identification of twenty-five mutations in surface glycoprotein (Spike) of SARS-CoV-2 among Indian isolates and their impact on protein dynamics.

Project description:SARS-CoV-2, the causative agent of the COVID-19 pandemic, is an RNA virus with an inherently high mutation rate. Due to these mutations, the virus evolves at a rapid pace, which helps it to survive better inside the host. One of the hotspots of pharmacological intervention is to inhibit binding of the virus to host cells, which is mediated by the Spike glycoprotein of SARS-CoV-2 and the ACE2 receptors present on human cells. This study was conducted with the aim of identifying and characterising the mutation(s) present in the Spike glycoprotein of SARS-CoV-2. To this end, an in silico methodology was used, and the mutations in the Spike glycoprotein were identified by comparing the Spike glycoprotein of the first reported sequence, from the Wuhan wet seafood market virus, with the available sequences of SARS-CoV-2 from Indian isolates. Our analysis revealed the presence of twenty-five mutations in the Spike glycoprotein among Indian SARS-CoV-2 isolates. These mutations are spread all over the protein and can be clustered into at least four distinct positions. Further, mutations at eleven positions exhibited alterations in the secondary structure of the polypeptide chain. We also investigated the influence of these mutations on overall protein dynamics and have shown that they affect the dynamic stability of the Spike glycoprotein.

| S-EPMC7521409 | biostudies-literature
