ABSTRACT:
SUBMITTER: Hahn G
PROVIDER: S-EPMC9761049 | biostudies-literature | 2022 Dec
REPOSITORIES: biostudies-literature
Georg Hahn, Sanghun Lee, Dmitry Prokopenko, Jonathan Abraham, Tanya Novak, Julian Hecker, Michael Cho, Surender Khurana, Lindsey R. Baden, Adrienne G. Randolph, Scott T. Weiss, Christoph Lange
BMC Bioinformatics, 2022-12-19, 1
As of June 2022, the GISAID database contains more than 11 million SARS-CoV-2 genomes, including several thousand nucleotide sequences for the most common variants such as delta or omicron. These SARS-CoV-2 strains have been collected from patients around the world since the beginning of the pandemic. We start by assessing the similarity of all pairs of nucleotide sequences using the Jaccard index and principal component analysis. As shown previously in the literature, an unsupervised cluster an ...
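The pairwise-similarity step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that each genome has already been encoded as a 0/1 vector marking the positions where it differs from a reference sequence; the record above does not state the encoding, and the toy matrix and the function names `jaccard_matrix` and `principal_components` are illustrative only, not the authors' implementation.

```python
import numpy as np

def jaccard_matrix(X):
    """Pairwise Jaccard similarity between the rows of a 0/1 matrix X."""
    X = np.asarray(X, dtype=np.int64)
    inter = X @ X.T                              # |A ∩ B| for every pair of rows
    sizes = X.sum(axis=1)                        # |A| for every row
    union = sizes[:, None] + sizes[None, :] - inter
    # Convention chosen here (an assumption): two empty sets have similarity 1.
    return np.where(union == 0, 1.0, inter / np.maximum(union, 1))

def principal_components(S, k=2):
    """Scores on the first k principal components of the similarity matrix S."""
    Sc = S - S.mean(axis=0)                      # center each column
    U, s, _ = np.linalg.svd(Sc, full_matrices=False)
    return U[:, :k] * s[:k]

# Toy data (hypothetical): 4 genomes, 6 positions, 1 = differs from the reference.
X = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
]
S = jaccard_matrix(X)
pcs = principal_components(S, k=2)
```

In this toy input, genomes 0 and 1 share most of their differing positions, as do genomes 2 and 3, so the first principal component places the two pairs on opposite sides of zero. For millions of genomes a dense similarity matrix would not fit in memory, so a real analysis would need a sparse or subsampled variant of this idea.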