Dataset Information

Nearest Consensus Clustering Classification to Identify Subclasses and Predict Disease.


ABSTRACT: Disease subtyping, which helps to develop personalized treatments, remains a challenge in data analysis because there are many different ways to group patients based on their data. Identifying subclasses of disease, however, helps to develop models that are more specific to individuals, which should improve both prediction and understanding of the underlying characteristics of the disease in question. This paper proposes a new algorithm that integrates consensus clustering methods with classification in order to overcome issues with sample bias. The new algorithm combines K-means with consensus clustering in order to build cohort-specific decision trees that improve classification and aid understanding of the underlying differences between the discovered groups. The methods are tested on a freely available real-world breast cancer dataset and on data from a London hospital on systemic sclerosis, a rare and potentially fatal condition. Results show that "nearest consensus clustering classification" significantly improves accuracy and prediction compared with similar competing methods.
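
The abstract names the ingredients (K-means, consensus clustering, cohort-specific decision trees, and assignment to the nearest cohort) but not the details, so the following is only a minimal sketch of how such a pipeline could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the 0.5 co-clustering threshold, the use of connected components to form consensus groups, the depth-1 tree (a decision stump) standing in for a full decision tree, and the toy data.

```python
import random
from collections import Counter

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(X, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = rng.sample(X, k)
    labels = [0] * len(X)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(x, centers[c])) for x in X]
        for c in range(k):
            members = [x for x, l in zip(X, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean(members)
    return labels

def consensus_groups(X, k, runs=20, threshold=0.5, seed=0):
    """Assumed consensus rule: link samples that co-cluster in more than
    `threshold` of the k-means runs, then take connected components."""
    rng = random.Random(seed)
    n = len(X)
    together = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans(X, k, rng)
        for i in range(n):
            for j in range(n):
                together[i][j] += labels[i] == labels[j]
    group, g = [-1] * n, 0
    for start in range(n):
        if group[start] != -1:
            continue
        group[start], stack = g, [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if group[j] == -1 and together[i][j] / runs > threshold:
                    group[j] = g
                    stack.append(j)
        g += 1
    return group

def fit_stump(X, y):
    """Depth-1 decision tree: the (feature, threshold) split with fewest training errors."""
    majority = Counter(y).most_common(1)[0][0]
    best = (len(y) + 1, None)
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            left = [lab for x, lab in zip(X, y) if x[f] <= t]
            right = [lab for x, lab in zip(X, y) if x[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if errors < best[0]:
                best = (errors, (f, t, l_lab, r_lab))
    # fall back to the majority label when no split is possible
    return best[1] or (0, float("inf"), majority, majority)

def fit(X, y, k=2):
    """One (centroid, stump) pair per consensus group."""
    groups = consensus_groups(X, k)
    model = []
    for g in sorted(set(groups)):
        Xg = [x for x, gi in zip(X, groups) if gi == g]
        yg = [lab for lab, gi in zip(y, groups) if gi == g]
        model.append((mean(Xg), fit_stump(Xg, yg)))
    return model

def predict(model, x):
    """Route x to the nearest group centroid, then apply that group's stump."""
    _, (f, t, l_lab, r_lab) = min(model, key=lambda m: dist2(x, m[0]))
    return l_lab if x[f] <= t else r_lab

# Toy data (invented): two cohorts whose label depends on feature 0 in opposite directions.
X = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0],
     [100.0, 0.0], [101.0, 0.0], [102.0, 0.0], [103.0, 0.0]]
y = [0, 0, 1, 1, 1, 1, 0, 0]
model = fit(X, y, k=2)
print([predict(model, x) for x in X])
```

The toy data is built so that a single threshold on feature 0 cannot separate the labels across both cohorts, while one threshold per cohort can. That is the motivation for cohort-specific classifiers described in the abstract.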

SUBMITTER: Alyousef AA 

PROVIDER: S-EPMC6245235 | biostudies-literature | 2018

REPOSITORIES: biostudies-literature

Publications

Nearest Consensus Clustering Classification to Identify Subclasses and Predict Disease.

Awad A. Alyousef, Svetlana Nihtyanova, Chris Denton, Pietro Bosoni, Riccardo Bellazzi, Allan Tucker

Journal of Healthcare Informatics Research, 2018-07-30, 4


Similar Datasets

| S-EPMC8259015 | biostudies-literature
| S-EPMC3789539 | biostudies-other
| S-EPMC7940639 | biostudies-literature
| S-EPMC3703613 | biostudies-literature
| S-EPMC8195153 | biostudies-literature
| S-EPMC7005057 | biostudies-literature
2017-02-16 | GSE79102 | GEO
| S-EPMC7924696 | biostudies-literature
| S-EPMC3313482 | biostudies-literature
| S-EPMC8709870 | biostudies-literature