
Dataset Information


Statistical classification of multivariate flow cytometry data analyzed by manual gating: stem, progenitor, and epithelial marker expression in nonsmall cell lung cancer and normal lung.


ABSTRACT: The use of supervised classification to extract markers from primary flow cytometry data is an emerging field that has made significant progress, spurred by the growing complexity of multidimensional flow cytometry. Whether the markers are extracted without supervision or by conventional gate and region methods, the number of candidate variables identified is typically larger than the number of specimens (p > n), and many variables are highly intercorrelated. Comparing groups or treatments to determine which markers are significant is therefore challenging. Here, we used a data set in which 86 variables were created by conventional manual analysis of individual listmode data files, and compared five multivariate classification methods for discerning subtle differences between the stem/progenitor content of 35 nonsmall cell lung cancer and adjacent normal lung specimens. The methods compared are elastic-net, lasso, random forest, diagonal linear discriminant analysis, and best single variable (best-1). We describe a broadly applicable methodology consisting of: 1) variable transformation and standardization; 2) visualization and assessment of correlation between variables; 3) selection of significant variables and modeling; and 4) characterization of the quality and stability of the model. The analysis yielded both validating results (tumors are aneuploid and have higher light scatter properties than normal lung) and leads that require follow-up: Cytokeratin+ CD133+ progenitors are present in normal lung but reduced in lung cancer; diploid (or pseudo-diploid) CD117+CD44+ cells are more prevalent in tumor. We anticipate that the methods described here will be broadly applicable to a variety of multidimensional cytometry problems.
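As a rough illustration of steps 1 and 2 of the methodology above (transformation and standardization, then assessment of correlation between variables), the following is a minimal Python sketch. It is not the authors' code: the variable names, the toy values, the log offset, and the 0.9 correlation threshold are all assumptions made for the example, and the abstract does not state which transformation was actually used.

```python
import math


def log_transform(values, offset=1.0):
    """Log-transform non-negative measurements.

    The offset is an assumption made here to avoid log(0).
    """
    return [math.log(v + offset) for v in values]


def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(corr, names, threshold=0.9):
    """List variable pairs whose absolute correlation meets the threshold."""
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(corr[(a, b)]) >= threshold
    ]


# Hypothetical toy data: six specimens, three made-up variables.
raw = {
    "var_a": [0.5, 1.2, 3.4, 0.8, 2.1, 5.0],
    "var_b": [1.0, 2.5, 6.9, 1.7, 4.0, 9.8],
    "var_c": [4.0, 0.3, 1.1, 2.2, 0.9, 3.5],
}

# Step 1: transform, then standardize each variable.
z = {name: standardize(log_transform(vals)) for name, vals in raw.items()}

# Step 2: pairwise correlation matrix, stored as a dict keyed by name pairs.
names = sorted(z)
corr = {(a, b): pearson(z[a], z[b]) for a in names for b in names}

# Flag strongly intercorrelated pairs for closer inspection.
flagged = correlated_pairs(corr, names)
```

With 86 variables and 35 specimens, a table of flagged pairs like this would only be a first look at intercorrelation; steps 3 and 4 (variable selection, modeling, and stability assessment) would need a statistical library and are not sketched here.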

SUBMITTER: Normolle DP 

PROVIDER: S-EPMC4149906 | biostudies-literature | 2013 Jan

REPOSITORIES: biostudies-literature


Publications

Statistical classification of multivariate flow cytometry data analyzed by manual gating: stem, progenitor, and epithelial marker expression in nonsmall cell lung cancer and normal lung.

Normolle DP, Donnenberg VS, Donnenberg AD

Cytometry. Part A: the journal of the International Society for Analytical Cytology, 2012-12-13, issue 1



Similar Datasets

S-EPMC2234502 | biostudies-literature
S-EPMC4325545 | biostudies-literature
S-EPMC2585156 | biostudies-literature
S-EPMC2377041 | biostudies-other
S-EPMC2910822 | biostudies-literature
S-EPMC4162487 | biostudies-literature
S-EPMC7503296 | biostudies-literature
S-EPMC5860171 | biostudies-literature
S-EPMC9487349 | biostudies-literature
S-EPMC5949280 | biostudies-literature