Dataset Information

Regularized estimation of large-scale gene association networks using graphical Gaussian models.

ABSTRACT:

Background

Graphical Gaussian models are popular tools for the estimation of (undirected) gene association networks from microarray data. A key issue when the number of variables greatly exceeds the number of samples is the estimation of the matrix of partial correlations. Since the (Moore-Penrose) inverse of the sample covariance matrix leads to poor estimates in this scenario, standard methods are inappropriate and adequate regularization techniques are needed. Popular approaches include biased estimates of the covariance matrix and high-dimensional regression schemes, such as the Lasso and Partial Least Squares.
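The problem described above can be illustrated with a minimal Python sketch. It assumes the standard relation between a precision matrix Omega and partial correlations, rho_ij = -omega_ij / sqrt(omega_ii * omega_jj), and uses a simple shrinkage of the sample covariance toward its diagonal as one example of a biased covariance estimate. The function names and the fixed shrinkage intensity `lam=0.2` are hypothetical choices for illustration, not the estimators evaluated in the article.

```python
import numpy as np

def partial_correlations(precision):
    """Convert a precision (inverse covariance) matrix to partial correlations."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def shrinkage_covariance(X, lam=0.2):
    """Shrink the sample covariance toward its diagonal.

    lam is a hypothetical fixed shrinkage intensity in [0, 1]; it is an
    illustrative choice, not a data-driven estimate.
    """
    S = np.cov(X, rowvar=False)
    target = np.diag(np.diag(S))
    return (1.0 - lam) * S + lam * target

rng = np.random.default_rng(0)
n, p = 20, 50                      # fewer samples than variables
X = rng.standard_normal((n, p))

# The sample covariance has rank below p, so it cannot be inverted directly.
S = np.cov(X, rowvar=False)
rank_S = np.linalg.matrix_rank(S)

# The shrunken estimate is positive definite and can be inverted.
S_shrunk = shrinkage_covariance(X, lam=0.2)
pcor = partial_correlations(np.linalg.inv(S_shrunk))
```

With `n < p` the sample covariance is rank deficient, whereas the shrunken matrix is invertible because a positive diagonal is added to a positive semi-definite matrix.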

Results

In this article, we investigate a general framework for combining regularized regression methods with the estimation of graphical Gaussian models. This framework includes various existing methods as well as two new approaches based on ridge regression and the adaptive Lasso, respectively. These methods are extensively compared both qualitatively and quantitatively within a simulation study and through an application to six diverse real data sets. In addition, all proposed algorithms are implemented in the R package "parcor", available from the R repository CRAN.
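A regression-based estimate of partial correlations might be sketched as follows. Each variable is regressed on all others with ridge regression, and the two coefficients linking a pair (i, j) are combined as sign(b_ij) * sqrt(b_ij * b_ji) when their signs agree, and set to zero otherwise. This is a Python illustration under stated assumptions, not the "parcor" implementation: the function names are hypothetical, and the penalty `lam` is fixed here rather than tuned from the data.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Ridge regression coefficients for centered, standardized inputs."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def regression_partial_correlations(X, lam=1.0):
    """Estimate partial correlations from p node-wise ridge regressions.

    lam is a hypothetical fixed penalty; in practice it would be
    selected from the data, for example by cross-validation.
    """
    n, p = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # B[i, j] holds the coefficient of variable j when predicting variable i.
    B = np.zeros((p, p))
    for i in range(p):
        others = np.delete(np.arange(p), i)
        B[i, others] = ridge_coefficients(Z[:, others], Z[:, i], lam)
    # Combine the two coefficients of each pair; pairs with
    # disagreeing signs are set to zero.
    prod = B * B.T
    pcor = np.sign(B) * np.sqrt(np.where(prod > 0, prod, 0.0))
    np.fill_diagonal(pcor, 1.0)
    return pcor

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 40))  # fewer samples than variables
pcor_ridge = regression_partial_correlations(X, lam=1.0)
```

With more samples than variables and `lam=0.0`, this sketch reduces to ordinary least squares and reproduces the partial correlations obtained from the inverse sample covariance matrix, which gives a simple sanity check.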

Conclusion

In our simulation studies, the investigated non-sparse regression methods, i.e. Ridge Regression and Partial Least Squares, exhibit rather conservative behavior when combined with (local) false discovery rate multiple testing in order to decide whether or not an edge is present in the network. For networks with higher densities, the difference in performance of the methods decreases. For sparse networks, we confirm the Lasso's well-known tendency towards selecting too many edges, whereas the two-stage adaptive Lasso is an interesting alternative that provides sparser solutions. In our simulations, both sparse and non-sparse methods are able to reconstruct networks with cluster structures. On six real data sets, the results also differ clearly between the non-sparse methods and the sparse methods, for which specifying the regularization parameter automatically performs model selection. In five out of six data sets, Partial Least Squares selects very dense networks. Furthermore, for data that violate the assumption of uncorrelated observations (due to replications), the Lasso and the adaptive Lasso yield very complex structures, indicating that they might not be suited under these conditions. The shrinkage approach is more stable than the regression-based approaches when using subsampling.

SUBMITTER: Kramer N

PROVIDER: S-EPMC2808166 | biostudies-literature | 2009 Nov

REPOSITORIES: biostudies-literature

Publications

Regularized estimation of large-scale gene association networks using graphical Gaussian models.

Krämer N, Schäfer J, Boulesteix AL

BMC Bioinformatics, 2009 Nov 24

PMID: 19930695

Similar Datasets

The cluster graphical lasso for improved estimation of Gaussian graphical models. | S-EPMC4307846 | biostudies-literature

Transfer Learning in Large-scale Gaussian Graphical Models with False Discovery Rate Control. | S-EPMC10746133 | biostudies-literature

Fast Bayesian inference in large Gaussian graphical models. | S-EPMC6916355 | biostudies-literature

On joint estimation of Gaussian graphical models for spatial and temporal data. | S-EPMC5515703 | biostudies-literature

Joint Estimation of Multiple Dependent Gaussian Graphical Models with Applications to Mouse Genomics. | S-EPMC5640885 | biostudies-literature

Assisted estimation of gene expression graphical models. | S-EPMC8137544 | biostudies-literature

Gene regulation network inference with joint sparse Gaussian graphical models. | S-EPMC4743539 | biostudies-literature

A graphical model approach for inferring large-scale networks integrating gene expression and genetic polymorphism. | S-EPMC2694152 | biostudies-literature

GeneNetTools: tests for Gaussian graphical models with shrinkage. | S-EPMC9665865 | biostudies-literature

High-Dimensional Gaussian Graphical Regression Models with Covariates. | S-EPMC10746132 | biostudies-literature