Dataset Information

Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival.


ABSTRACT: High-throughput biological data, whether generated by sequencing, transcriptional microarrays, proteomics, or other means, continues to require analytic methods that address its high-dimensional aspects. Because the computational part of data analysis ultimately identifies shape characteristics in the organization of data sets, the mathematics of shape recognition in high dimensions continues to be a crucial part of data analysis. This article introduces a method that extracts information from high-throughput microarray data and, by using topology, provides greater depth of information than current analytic techniques. The method, termed Progression Analysis of Disease (PAD), first identifies robust aspects of cluster analysis, then goes deeper to find a multitude of biologically meaningful shape characteristics in these data. Additionally, because PAD incorporates a visualization tool, it provides a simple picture or graph that can be used to further explore these data. Although PAD can be applied to a wide range of high-throughput data types, it is used here as an example to analyze breast cancer transcriptional data. This analysis identified a unique subgroup of Estrogen Receptor-positive (ER(+)) breast cancers that express high levels of c-MYB and low levels of innate inflammatory genes. These patients exhibit 100% survival and no metastasis. No supervised step beyond the distinction between tumor and healthy patients was used to identify this subtype. The group has a clear and distinct, statistically significant molecular signature; it highlights coherent biology but is invisible to cluster methods and does not fit into the accepted classification of Luminal A/B and Normal-like subtypes of ER(+) breast cancers. We denote the group as c-MYB(+) breast cancer.

SUBMITTER: Nicolau M 

PROVIDER: S-EPMC3084136 | biostudies-literature | 2011 Apr

REPOSITORIES: biostudies-literature

Publications

Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival.

Nicolau M, Levine AJ, Carlsson G

Proceedings of the National Academy of Sciences of the United States of America, 2011 Apr 11, issue 17


Similar Datasets

| S-EPMC5189935 | biostudies-literature
| S-EPMC6832691 | biostudies-literature
| S-EPMC7353062 | biostudies-literature
| S-EPMC3414841 | biostudies-literature
| S-EPMC5630389 | biostudies-other
| S-EPMC5458195 | biostudies-literature
| S-EPMC11289481 | biostudies-literature
| S-EPMC6193541 | biostudies-literature
| S-EPMC4873607 | biostudies-literature
| S-EPMC8956741 | biostudies-literature