Dataset Information

Nonunique UPGMA clusterings of microsatellite markers.


ABSTRACT: Agglomerative hierarchical clustering has become a common tool for the analysis and visualization of data, thus being present in a large amount of scientific research and pervading all areas of bioinformatics and computational biology. In this work, we focus on a critical problem, the nonuniqueness of the clustering when there are tied distances, for which several solutions exist but are not implemented in most hierarchical clustering packages. We analyze the magnitude of this problem in one particular setting: the clustering of microsatellite markers using the Unweighted Pair-Group Method with Arithmetic Mean (UPGMA). To do so, we have calculated the fraction of publications in the Scopus database in which more than one hierarchical clustering is possible, showing that about 46% of the articles are affected. Additionally, to show the problem from a practical point of view, we selected two opposite examples of articles that have multiple solutions: one with two possible dendrograms, and the other with more than 2.5 million different possible hierarchical clusterings.
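The nonuniqueness described in the abstract can be illustrated with a small brute-force sketch. It is not the authors' method or any package's API: the function names `avg_dist` and `upgma_all` and the toy distance matrices are hypothetical, and the approach (branching on every tied minimum distance) is exponential, so it only suits tiny inputs. It assumes a symmetric distance matrix given as a list of lists and treats distances within `tol` of the minimum as tied.

```python
from itertools import combinations


def avg_dist(dist, a, b):
    """UPGMA distance: mean of the original distances between members of a and b."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def upgma_all(dist, tol=1e-9):
    """Return every distinct dendrogram reachable by breaking ties differently.

    Each dendrogram is identified by the frozenset of clusters it forms.
    """
    results = set()

    def recurse(clusters, merges):
        if len(clusters) == 1:
            results.add(frozenset(merges))
            return
        pairs = list(combinations(clusters, 2))
        dists = [avg_dist(dist, a, b) for a, b in pairs]
        dmin = min(dists)
        # Branch on every pair whose distance ties with the minimum.
        for (a, b), d in zip(pairs, dists):
            if d - dmin <= tol:
                rest = [c for c in clusters if c not in (a, b)]
                recurse(rest + [a | b], merges + [a | b])

    recurse([frozenset([i]) for i in range(len(dist))], [])
    return results


# Tie between pairs (0,1) and (1,2): two different dendrograms.
tied = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
print(len(upgma_all(tied)))  # 2

# Tie between disjoint pairs (0,1) and (2,3): merge order differs, tree does not.
harmless = [[0, 1, 5, 5], [1, 0, 5, 5], [5, 5, 0, 1], [5, 5, 1, 0]]
print(len(upgma_all(harmless)))  # 1
```

The second matrix shows that a tie does not always produce more than one clustering; only ties that lead to different sets of clusters do.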

SUBMITTER: Segura-Alabart N 

PROVIDER: S-EPMC9487649 | biostudies-literature | 2022 Sep

REPOSITORIES: biostudies-literature

Publications

Nonunique UPGMA clusterings of microsatellite markers.

Segura-Alabart N, Serratosa F, Gómez S, Fernández A

Briefings in Bioinformatics, 2022 Sep 1, issue 5


Similar Datasets

| PRJEB78443 | ENA
| PRJEB11435 | ENA
| S-EPMC8489535 | biostudies-literature
| S-EPMC3781276 | biostudies-literature
| S-EPMC5546169 | biostudies-literature
| S-EPMC5114825 | biostudies-literature
| S-EPMC4356322 | biostudies-literature
| S-EPMC4103145 | biostudies-literature
| S-EPMC4103467 | biostudies-literature
| S-EPMC4105364 | biostudies-literature