Dataset Information

Bayesian clustering and feature selection for cancer tissue samples.


ABSTRACT:

Background

The versatility of DNA copy number amplifications for profiling and categorizing tissue samples has been widely acknowledged in the biomedical literature. For instance, this type of measurement makes it possible to explore sets of cancerous tissues and identify novel subtypes. Statistical approaches previously used for such analyses include traditional algorithmic techniques for clustering and dimension reduction, such as independent and principal component analysis and hierarchical clustering, as well as model-based clustering using maximum likelihood estimation for latent class models.

Results

While purely algorithmic methods are usually easily applicable, their suboptimal performance and limitations in making formal inference have been thoroughly discussed in the statistical literature. Here we introduce a Bayesian model-based approach to simultaneous identification of underlying tissue groups and the informative amplifications. The model-based approach provides the possibility of using formal inference to determine the number of groups from the data, in contrast to the ad hoc methods often exploited for similar purposes. The model also automatically recognizes the chromosomal areas that are relevant for the clustering.
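The idea of scoring groupings and informative amplifications jointly can be illustrated with a minimal sketch. It is not the BASTA implementation: it assumes binary amplification data (1 = amplified), a Beta(0.5, 0.5) prior on each amplification frequency, and uniform priors over partitions and feature indicators. The function names and the toy data are hypothetical. Each chromosomal band is scored twice, once with a single frequency shared by all samples and once with a separate frequency per group, and the band counts as informative only when the group-specific model has the higher marginal likelihood.

```python
from math import lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_ml_binary(ones, total, a=0.5, b=0.5):
    """Log marginal likelihood of a 0/1 sequence under a Beta(a, b) prior.

    The prior hyperparameters are an assumption made for this sketch.
    """
    return log_beta(a + ones, b + total - ones) - log_beta(a, b)


def score_partition(data, labels):
    """Return (log score, informative feature indices) for one partition.

    data:   list of equal-length 0/1 rows (samples x chromosomal bands)
    labels: one group label per sample
    """
    groups = sorted(set(labels))
    total_score = 0.0
    informative = []
    for j in range(len(data[0])):
        column = [row[j] for row in data]
        # Uninformative band: one shared frequency for all samples.
        shared = log_ml_binary(sum(column), len(column))
        # Informative band: a separate frequency in each group.
        split = 0.0
        for g in groups:
            values = [x for x, lab in zip(column, labels) if lab == g]
            split += log_ml_binary(sum(values), len(values))
        if split > shared:
            informative.append(j)
        total_score += max(split, shared)
    return total_score, informative


# Hypothetical toy data: bands 0 and 1 separate the first three samples
# from the last three; band 2 is never amplified; band 3 is noise.
data = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
score_one, informative_one = score_partition(data, [0, 0, 0, 0, 0, 0])
score_two, informative_two = score_partition(data, [0, 0, 0, 1, 1, 1])
print(score_one, informative_one)
print(score_two, informative_two)
```

Comparing the scores of candidate partitions with different numbers of groups is what lets the number of groups be chosen from the data. A practical method would also need a search strategy over partitions, which this sketch leaves out.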

Conclusion

Validation analyses of simulated data and of a large database of DNA copy number amplifications in human neoplasms illustrate the potential of our approach. Our software implementation, BASTA, for Bayesian statistical tissue profiling is freely available for academic purposes at http://web.abo.fi/fak/mnf/mate/jc/software/basta.html.

SUBMITTER: Marttinen P 

PROVIDER: S-EPMC2679022 | biostudies-literature | 2009 Mar

REPOSITORIES: biostudies-literature

Publications

Bayesian clustering and feature selection for cancer tissue samples.

Pekka Marttinen, Samuel Myllykangas, Jukka Corander

BMC Bioinformatics, 2009-03-18



Similar Datasets

| S-EPMC5181536 | biostudies-literature
| S-EPMC8644062 | biostudies-literature
| S-EPMC8216648 | biostudies-literature
| S-EPMC7297975 | biostudies-literature
| S-EPMC6412528 | biostudies-literature
| S-EPMC3789539 | biostudies-other
| S-EPMC2585795 | biostudies-literature