Dataset Information

Robust, scalable, and informative clustering for diverse biological networks.


ABSTRACT: Clustering molecular data into informative groups is a primary step in extracting robust conclusions from big data. However, due to foundational issues in how they are defined and detected, such clusters are not always reliable, leading to unstable conclusions. We compare popular clustering algorithms across thousands of synthetic and real biological datasets, including a new consensus clustering algorithm, SpeakEasy2: Champagne. These tests identify trends in performance, show that no single method is universally optimal, and allow us to examine factors behind variation in performance. Multiple metrics indicate SpeakEasy2 generally provides robust, scalable, and informative clusters for a range of applications.

SUBMITTER: Gaiteri C 

PROVIDER: S-EPMC10571258 | biostudies-literature | 2023 Oct

REPOSITORIES: biostudies-literature

Publications

Robust, scalable, and informative clustering for diverse biological networks.

Gaiteri C, Connell DR, Sultan FA, Iatrou A, Ng B, Szymanski BK, Zhang A, Tasaki S

Genome Biology, 2023-10-12, issue 1


Similar Datasets

| S-EPMC10622433 | biostudies-literature
| S-EPMC3125314 | biostudies-literature
| S-EPMC3311098 | biostudies-literature
| S-EPMC11635021 | biostudies-literature
| S-EPMC7450094 | biostudies-literature
| S-EPMC10769222 | biostudies-literature
| S-EPMC11406744 | biostudies-literature
| S-EPMC2853685 | biostudies-literature
| S-EPMC1414113 | biostudies-literature
| S-EPMC9699063 | biostudies-literature