Dataset Information

Fast and accurate gene regulatory network inference by normalized least squares regression.


ABSTRACT:

Motivation

Inferring an accurate gene regulatory network (GRN) has long been a key goal in systems biology. Doing so requires balancing the number of true-positive interactions recovered against the number of false-positive interactions introduced. The inference method must also handle the large size of modern experimental data, so it needs to be both fast and accurate. The Least Squares Cut-Off (LSCO) method can fulfill both criteria; however, because it is based on least squares, it is vulnerable to the known issue of amplifying extreme values, whether small or large. In GRN inference this manifests as genes that are erroneously hyper-connected to a large fraction of all genes because of extremely low fold-change values.
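The trade-off between true and false positives can be made concrete with a small sketch. It assumes an inferred network is a set of weighted links that is thresholded at a cut-off and then compared with a known reference network. The function name `confusion_at_cutoff`, the gene labels and all weights are hypothetical and are not taken from the paper or from GeneSPIDER.

```python
def confusion_at_cutoff(weights, truth, cutoff):
    """Count true positives, false positives and false negatives.

    weights: dict mapping (regulator, target) to an inferred link weight
    truth:   set of (regulator, target) links in the reference network
    cutoff:  links with |weight| >= cutoff are kept as predictions
    """
    predicted = {edge for edge, w in weights.items() if abs(w) >= cutoff}
    tp = len(predicted & truth)   # predicted links that are in the reference
    fp = len(predicted - truth)   # predicted links that are not
    fn = len(truth - predicted)   # reference links that were missed
    return tp, fp, fn


# Hypothetical inferred weights and reference network (toy data).
toy_weights = {
    ("G1", "G2"): 0.9,
    ("G2", "G3"): -0.6,
    ("G1", "G3"): 0.4,
    ("G3", "G1"): 0.2,
}
toy_truth = {("G1", "G2"), ("G2", "G3"), ("G3", "G1")}

for cutoff in (0.5, 0.3, 0.1):
    tp, fp, fn = confusion_at_cutoff(toy_weights, toy_truth, cutoff)
    print(f"cutoff={cutoff}: TP={tp} FP={fp} FN={fn}")
```

In this toy case, lowering the cut-off recovers the last true link only after a false link has already been admitted, which is the balance the abstract describes.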

Results

We developed a GRN inference method called Least Squares Cut-Off with Normalization (LSCON) that tackles this problem. LSCON extends the LSCO algorithm by regularization to avoid hyper-connected genes and thereby reduce false positives. The regularization used is based on normalization, which removes effects of extreme values on the fit. We benchmarked LSCON and compared it to Genie3, LASSO, LSCO and Ridge regression, in terms of accuracy, speed and tendency to predict hyper-connected genes. The results show that LSCON achieves better or equal accuracy compared to LASSO, the best existing method, especially for data with extreme values. Thanks to the speed of least squares regression, LSCON does this an order of magnitude faster than LASSO.
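The following minimal sketch illustrates why a gene with extremely small fold changes can appear hyper-connected after a least squares fit, and how a normalization step can counter it. It rests on several assumptions that this record does not state: a steady-state linear model A Y = -P (Y: genes by experiments fold changes, P: perturbation design), a square full-rank Y so that the least squares solution reduces to A = -P Y^{-1}, and a column scaling by each gene's mean absolute fold change. That scaling is an illustrative choice and may differ from the normalization actually used in LSCON. All function names and data are hypothetical, and plain Python is used only to keep the example dependency-free.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lsco_weights(Y, P):
    """Assumed model A Y = -P; for square full-rank Y this gives A = -P Y^{-1}."""
    neg_P = [[-v for v in row] for row in P]
    return matmul(neg_P, invert(Y))


def normalize_columns(A, Y):
    """Illustrative normalization (an assumption, not the published rule):
    scale column j of A by the mean absolute fold change of gene j."""
    scale = [sum(abs(v) for v in row) / len(row) for row in Y]
    return [[a * s for a, s in zip(row, scale)] for row in A]


def out_degree(A, cutoff):
    """Number of targets per regulator (column) with |weight| >= cutoff.
    Self-links on the diagonal are counted as well."""
    return [sum(1 for row in A if abs(row[j]) >= cutoff) for j in range(len(A[0]))]


# Toy data: gene 2 (last row) has fold changes about 100 times smaller.
Y = [[1.0, 0.5, 0.5],
     [0.5, 1.0, 0.5],
     [0.005, 0.005, 0.01]]
P = [[-1, 0, 0],
     [0, -1, 0],
     [0, 0, -1]]  # one hypothetical knockdown per gene

A_lsco = lsco_weights(Y, P)
A_norm = normalize_columns(A_lsco, Y)
print("out-degree, least squares only:", out_degree(A_lsco, 0.75))
print("out-degree, after normalization:", out_degree(A_norm, 0.75))
```

In this toy setting, the small fold changes of the third gene inflate its whole column in the least squares estimate, so it passes the cut-off against every gene. After the column scaling, only the self-links remain above the same cut-off.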

Availability and implementation

Data: https://bitbucket.org/sonnhammergrni/lscon; Code: https://bitbucket.org/sonnhammergrni/genespider.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Hillerton T 

PROVIDER: S-EPMC9004640 | biostudies-literature | 2022 Apr

REPOSITORIES: biostudies-literature

Publications

Fast and accurate gene regulatory network inference by normalized least squares regression.

Hillerton T, Seçilmiş D, Nelander S, Sonnhammer ELL

Bioinformatics (Oxford, England), 2022 Apr 1, issue 8


Similar Datasets

S-EPMC9710113 | biostudies-literature
S-EPMC5140053 | biostudies-literature
S-EPMC8545292 | biostudies-literature
S-EPMC9113237 | biostudies-literature
S-EPMC10082608 | biostudies-literature
S-EPMC7652823 | biostudies-literature
S-EPMC9344847 | biostudies-literature
S-EPMC9529923 | biostudies-literature
S-EPMC9340570 | biostudies-literature
S-EPMC6044105 | biostudies-literature