Dataset Information

Characterizing SARS-CoV-2 mutations in the United States.

ABSTRACT: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been mutating since it was first sequenced in early January 2020. The genetic variants have developed into a few distinct clusters with different properties. Since the United States (US) has the highest number of virally infected patients globally, it is essential to understand the US SARS-CoV-2. Using genotyping, sequence alignment, time evolution, k-means clustering, protein-folding stability, algebraic topology, and network theory, we reveal that the US SARS-CoV-2 has four substrains and that the five top US SARS-CoV-2 mutations were first detected in China (2 cases), Singapore (2 cases), and the United Kingdom (1 case). The next three top US SARS-CoV-2 mutations were first detected in the US. These eight top mutations belong to two disconnected groups. The first group, consisting of five concurrent mutations, is prevailing, while the other group, with three concurrent mutations, is gradually fading out. We identify that one of the top mutations, 27964C>T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we further uncover that three of the four US SARS-CoV-2 substrains have become more infectious. Our study calls for effective viral control and containment strategies in the US.

SUBMITTER: Wang R

PROVIDER: S-EPMC7430589 | biostudies-literature | 2020 Aug

REPOSITORIES: biostudies-literature

Publications

Characterizing SARS-CoV-2 mutations in the United States.

Wang R, Chen J, Gao K, Hozumi Y, Yin C, Wei GW

Research Square, 2020 Aug 11

PMID: 32818213

Similar Datasets

SARS-CoV-2 Genomes From Oklahoma, United States.
| S-EPMC7886813 | biostudies-literature

SARS-CoV-2 Delta-Omicron Recombinant Viruses, United States.
| S-EPMC9239886 | biostudies-literature

Analysis of SARS-CoV-2 mutations in the United States suggests presence of four substrains and novel variants.
| S-EPMC7884689 | biostudies-literature

Estimating unobserved SARS-CoV-2 infections in the United States.
| S-EPMC7486725 | biostudies-literature

Characterizing genomic variants and mutations in SARS-CoV-2 proteins from Indian isolates.
| S-EPMC7893251 | biostudies-literature

Mapping a Pandemic: SARS-CoV-2 Seropositivity in the United States.
| S-EPMC7852277 | biostudies-literature

Substantial underestimation of SARS-CoV-2 infection in the United States.
| S-EPMC7481226 | biostudies-literature

Household transmission of SARS-CoV-2 Alpha variant - United States, 2021.
| S-EPMC9047162 | biostudies-literature

Tracking SARS-CoV-2 Spike Protein Mutations in the United States (2020/01 - 2021/03) Using a Statistical Learning Strategy.
| S-EPMC8219100 | biostudies-literature

Tracking SARS-CoV-2 Spike Protein Mutations in the United States (January 2020-March 2021) Using a Statistical Learning Strategy.
| S-EPMC8777887 | biostudies-literature