
Dataset Information


Distance-based classifiers as potential diagnostic and prediction tools for human diseases.


ABSTRACT: Gene expression biomarkers are typically discovered in the course of high-throughput experiments, for example, RNA-seq or microarray profiling. Analytic pipelines that extract so-called signatures suffer from the "curse of dimensionality": the number of genes expressed exceeds the number of patients who can be enrolled in the study and used to train the discriminator algorithm. Hence, problems with the reproducibility of gene signatures are more common than not; when the algorithm is executed using a different training set, the resulting diagnostic signature may turn out to be completely different. In this paper we propose a novel alternative approach that takes into account the quantifiable expression levels of all genes assayed. In our analysis, the cumulative gene expression pattern of an individual patient is represented as a point in the multidimensional space formed by all gene expression profiles assayed in a given system, where the clusters of "normal samples" and "affected samples" are defined. The degree of separation of a given sample from the space occupied by "normal samples" reflects the drift of the sample away from homeostasis in the course of development of the pathophysiological processes that underlie the disease. The outlined approach was validated using the publicly available glioma dataset deposited in Rembrandt and associated with survival data. Additionally, the applicability of the distance analysis to the classification of non-malignant samples was tested using psoriatic lesions and non-lesional matched controls as a model.
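
The idea of treating each patient as a point in expression space and measuring its separation from the "normal" cluster can be illustrated with a minimal sketch. The abstract does not state which distance metric or cluster representation the authors use, so everything here is an assumption made for illustration: Euclidean distance, clusters summarized by their centroids, and hypothetical helper names (`centroid`, `distance_from_normal`, `classify`) with toy three-gene data.

```python
import math
from typing import List, Sequence

Profile = Sequence[float]  # expression levels of all assayed genes for one sample


def centroid(samples: Sequence[Profile]) -> List[float]:
    """Mean expression level of each gene across a group of samples."""
    n = len(samples)
    return [sum(gene_values) / n for gene_values in zip(*samples)]


def euclidean(a: Profile, b: Profile) -> float:
    """Euclidean distance between two expression profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_from_normal(sample: Profile, normal: Sequence[Profile]) -> float:
    """Separation of a sample from the centroid of the normal cluster."""
    return euclidean(sample, centroid(normal))


def classify(sample: Profile,
             normal: Sequence[Profile],
             affected: Sequence[Profile]) -> str:
    """Assign the sample to whichever cluster centroid is nearer."""
    d_normal = euclidean(sample, centroid(normal))
    d_affected = euclidean(sample, centroid(affected))
    return "normal" if d_normal <= d_affected else "affected"


# Toy data: three genes per sample (illustrative values, not from the paper).
normal = [[1.0, 2.0, 3.0], [1.2, 1.8, 3.1], [0.8, 2.2, 2.9]]
affected = [[4.0, 0.5, 6.0], [4.2, 0.7, 5.8]]
new_sample = [3.9, 0.6, 5.9]

label = classify(new_sample, normal, affected)
drift = distance_from_normal(new_sample, normal)
print(label, round(drift, 2))
```

Because every assayed gene contributes to the distance, no gene subset has to be selected during training, which is the property the abstract contrasts with signature-based pipelines. A real analysis would likely need normalization of expression values before distances are compared.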

SUBMITTER: Veytsman B 

PROVIDER: S-EPMC4303935 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

Distance-based classifiers as potential diagnostic and prediction tools for human diseases.

Boris Veytsman, Lei Wang, Tiange Cui, Sergey Bruskin, Ancha Baranova

BMC Genomics, 2014-12-19



Similar Datasets

2015-12-03 | GSE72526 | GEO
| S-EPMC4672770 | biostudies-literature
2015-07-27 | PXD001740 | Pride
| S-EPMC2253293 | biostudies-literature
| S-EPMC6851505 | biostudies-literature
| S-EPMC7038525 | biostudies-literature
| S-EPMC5915128 | biostudies-literature
| S-EPMC5207395 | biostudies-literature
| S-EPMC6221336 | biostudies-literature
| S-EPMC7245692 | biostudies-literature