Dataset Information

Characterizing SARS-CoV-2 mutations in the United States.


ABSTRACT: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been mutating since it was first sequenced in early January 2020. Its genetic variants have developed into a few distinct clusters with different properties. Since the United States (US) has the highest number of infected patients globally, it is essential to understand the SARS-CoV-2 strains circulating in the US. Using genotyping, sequence alignment, time evolution, k-means clustering, protein-folding stability, algebraic topology, and network theory, we reveal that the US SARS-CoV-2 has four substrains and that the five top US SARS-CoV-2 mutations were first detected in China (2 cases), Singapore (2 cases), and the United Kingdom (1 case). The next three top US SARS-CoV-2 mutations were first detected in the US. These eight top mutations belong to two disconnected groups. The first group, consisting of five concurrent mutations, is prevailing, while the other group, with three concurrent mutations, is gradually fading out. We identify that one of the top mutations, 27964C>T-(S24L) on ORF8, has an unusually strong gender dependence. Based on the analysis of all mutations on the spike protein, we further uncover that three of the four US SARS-CoV-2 substrains have become more infectious. Our study calls for effective viral control and containment strategies in the US.

SUBMITTER: Wang R 

PROVIDER: S-EPMC7430589 | biostudies-literature | 2020 Aug

REPOSITORIES: biostudies-literature

Publications

Characterizing SARS-CoV-2 mutations in the United States.

Rui Wang, Jiahui Chen, Kaifu Gao, Yuta Hozumi, Changchuan Yin, Guo-Wei Wei

Research Square, 2020 Aug 11


Similar Datasets

S-EPMC7886813 | biostudies-literature
S-EPMC9239886 | biostudies-literature
S-EPMC8406864 | biostudies-literature
S-EPMC7486725 | biostudies-literature
S-EPMC7884689 | biostudies-literature
S-EPMC7481226 | biostudies-literature
S-EPMC7893251 | biostudies-literature
S-EPMC7872376 | biostudies-literature
S-EPMC8662038 | biostudies-literature
S-EPMC8313480 | biostudies-literature