
Dataset Information


Multivariate curve resolution of time course microarray data.


ABSTRACT:

BACKGROUND: Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biological interpretation. Moreover, implicit assumptions about the measurement errors often limit the application of these methods to log-transformed data, destroying linear structure in the untransformed expression data.

RESULTS: In this work, a method for the linear decomposition of gene expression data by multivariate curve resolution (MCR) is introduced. The MCR method is based on an alternating least-squares (ALS) algorithm implemented with a weighted least squares approach. The new method, MCR-WALS, extracts a small number of basis functions from untransformed microarray data using only non-negativity constraints. Measurement error information can be incorporated into the modeling process and missing data can be imputed. The utility of the method is demonstrated through its application to yeast cell cycle data.

CONCLUSION: Profiles extracted by MCR-WALS exhibit a strong correlation with cell cycle-associated genes, but also suggest new insights into the regulation of those genes. The unique features of the MCR-WALS algorithm are its freedom from assumptions about the underlying linear model other than the non-negativity of gene expression, its ability to analyze non-log-transformed data, and its use of measurement error information to obtain a weighted model and accommodate missing measurements.
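To make the idea in the abstract concrete, the following is a minimal sketch of a weighted, non-negative alternating least-squares factorization, D ≈ C S, written in Python with NumPy and SciPy. It is not the authors' MCR-WALS implementation, whose details are not given in this record. It assumes element-wise weights equal to 1/sigma, with uncorrelated measurement errors, and treats missing values as zero-weight entries. The function name `mcr_wals_sketch`, its parameters, and the synthetic data are all hypothetical and serve only as illustration.

```python
import numpy as np
from scipy.optimize import nnls


def mcr_wals_sketch(D, sigma, n_components, n_iter=50, seed=0):
    """Weighted, non-negative alternating least squares: D ~ C @ S.

    D     : (genes x time points) untransformed expression; NaN = missing.
    sigma : matching array of measurement standard deviations (assumed known).
    Returns C (genes x k), S (k x time points), and the weighted loss
    recorded after each iteration.
    """
    D = np.asarray(D, dtype=float)
    missing = np.isnan(D)
    # Weight = 1/sigma; missing entries get zero weight so they are ignored.
    W = np.where(missing, 0.0, 1.0 / np.asarray(sigma, dtype=float))
    D0 = np.where(missing, 0.0, D)

    rng = np.random.default_rng(seed)
    n_genes, n_times = D0.shape
    C = np.zeros((n_genes, n_components))
    S = rng.random((n_components, n_times))  # random non-negative start
    history = []

    for _ in range(n_iter):
        # With S fixed, solve a weighted NNLS problem for each gene (row of C).
        for i in range(n_genes):
            w = W[i]
            C[i] = nnls(S.T * w[:, None], D0[i] * w)[0]
        # With C fixed, solve a weighted NNLS problem for each time point.
        for j in range(n_times):
            w = W[:, j]
            S[:, j] = nnls(C * w[:, None], D0[:, j] * w)[0]
        history.append(float(np.sum((W * (D0 - C @ S)) ** 2)))

    return C, S, history


# Small synthetic demo (made-up data, not the yeast cell cycle set).
rng = np.random.default_rng(1)
C_true = rng.random((30, 2))
S_true = rng.random((2, 12))
D = C_true @ S_true
D[3, 4] = np.nan                       # pretend one measurement is missing
sigma = np.full(D.shape, 0.05)         # assumed constant measurement error
C, S, history = mcr_wals_sketch(D, sigma, n_components=2)
D_imputed = np.where(np.isnan(D), C @ S, D)  # fill the gap from the model
```

Each inner solve is an exact non-negative least-squares fit for one row of C or one column of S, so the weighted residual should not increase from one iteration to the next. As with other MCR factorizations, the recovered profiles are defined only up to scaling and ordering of the components.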

SUBMITTER: Wentzell PD 

PROVIDER: S-EPMC1539028 | biostudies-literature | 2006

REPOSITORIES: biostudies-literature


Publications

Multivariate curve resolution of time course microarray data.

Peter D. Wentzell, Tobias K. Karakach, Sushmita Roy, M. Juanita Martinez, Christopher P. Allen, Margaret Werner-Washburne

BMC Bioinformatics, 2006-07-13



Similar Datasets

S-EPMC2697656 | biostudies-literature
S-EPMC6470876 | biostudies-literature
S-EPMC1920252 | biostudies-other
S-EPMC6176743 | biostudies-literature
S-EPMC7997113 | biostudies-literature
S-EPMC2682797 | biostudies-literature
S-EPMC6042830 | biostudies-literature
S-EPMC7735651 | biostudies-literature
S-EPMC7589786 | biostudies-literature
S-EPMC1201697 | biostudies-literature