Dataset Information

A non-linear data mining parameter selection algorithm for continuous variables.

ABSTRACT: In this article, we propose a new data mining algorithm that both captures non-linearity in the data and finds the best subset model. To produce an enhanced subset of the original variables, a preferred selection method should add a supplementary level of regression analysis that captures complex relationships in the data through mathematical transformation of the predictors and exploration of synergistic effects of combined variables. The method presented here has the potential to produce an optimal subset of variables, making the overall process of model selection more efficient. The algorithm introduces interpretable parameters by transforming the original inputs while also providing a faithful fit to the data. The core objective of this paper is to introduce a new estimation technique for the classical least squares regression framework. This automatic variable transformation and model selection method could offer an optimal and stable model that minimizes the mean squared error and variability, combining all-possible-subsets selection with the inclusion of variable transformations and interactions. Moreover, the method controls multicollinearity, leading to an optimal set of explanatory variables.
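The general idea described in the abstract (transform the predictors, add interaction terms, then search subsets under a least squares fit) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the choice of transformations (square and log), the BIC-style score, the pairwise correlation cutoff used as a crude multicollinearity guard, and all function and parameter names (`expand_features`, `best_subset`, `max_terms`, `max_abs_corr`) are assumptions made purely for illustration.

```python
import itertools
import numpy as np


def expand_features(X, names):
    """Build candidate columns: raw inputs, two simple transforms, pairwise products.

    The transform set here is an illustrative assumption, not the paper's.
    """
    cols, labels = [], []
    for j, name in enumerate(names):
        x = X[:, j]
        cols += [x, x ** 2, np.log1p(np.abs(x))]
        labels += [name, f"{name}^2", f"log1p|{name}|"]
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        cols.append(X[:, i] * X[:, j])
        labels.append(f"{names[i]}*{names[j]}")
    return np.column_stack(cols), labels


def ols_mse(Z, y):
    """Fit ordinary least squares with an intercept; return in-sample MSE."""
    A = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid / len(y))


def best_subset(X, y, names, max_terms=2, max_abs_corr=0.95):
    """Exhaustively search subsets of the expanded terms.

    Subsets containing a pair of terms with |correlation| above
    max_abs_corr are skipped (assumed stand-in for multicollinearity
    control). Subsets are ranked by a BIC-style score (assumed criterion).
    """
    Z, labels = expand_features(X, names)
    corr = np.abs(np.corrcoef(Z, rowvar=False))
    n = len(y)
    best_score, best_terms = np.inf, ()
    for k in range(1, max_terms + 1):
        for subset in itertools.combinations(range(Z.shape[1]), k):
            pairs = itertools.combinations(subset, 2)
            if any(corr[a, b] > max_abs_corr for a, b in pairs):
                continue
            mse = ols_mse(Z[:, list(subset)], y)
            score = n * np.log(mse) + (k + 1) * np.log(n)
            if score < best_score:
                best_score, best_terms = score, subset
    return [labels[i] for i in best_terms]


# Synthetic data with a squared term and an interaction (made up for the demo).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.0 + 2.0 * X[:, 0] ** 2 + 3.0 * X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=200)
selected = best_subset(X, y, ["x0", "x1", "x2"], max_terms=2)
print(selected)
```

On this synthetic example the search should recover the squared term and the interaction that generated the response. Exhaustive search grows combinatorially with the number of candidate terms, so the sketch caps the subset size; a real implementation would need a more careful search strategy and selection criterion.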

SUBMITTER: Tavallali P

PROVIDER: S-EPMC5683644 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

Publications

A non-linear data mining parameter selection algorithm for continuous variables.

Peyman Tavallali, Marianne Razavi, Sean Brady

PLoS One, 2017-11-13, issue 11

PMID: 29131829

Similar Datasets

Non-linear Parameter Estimates from Non-stationary MEG Data. | S-EPMC4993126 | biostudies-literature

Characterizing non-linear dependencies among pairs of clinical variables and imaging data. | S-EPMC3561932 | biostudies-literature

BOSO: A novel feature selection algorithm for linear regression with high-dimensional data. | S-EPMC9187084 | biostudies-literature

Amazon Employees Resources Access Data Extraction via Clonal Selection Algorithm and Logic Mining Approach. | S-EPMC7517133 | biostudies-literature

Decoding continuous variables from event-related potential (ERP) data with linear support vector regression using the Decision Decoding Toolbox (DDTBOX). | S-EPMC9669708 | biostudies-literature

Spatial inequalities and non-linear association of continuous variables with mortality risk of liver transplantation in Iran: a retrospective cohort study. | S-EPMC10764747 | biostudies-literature

Scrutinizing XAI using linear ground-truth data with suppressor variables. | S-EPMC9123083 | biostudies-literature

A parameter-independent algorithm of finding maximum clique with Seidel continuous-time quantum walks. | S-EPMC10850753 | biostudies-literature

Multivariate meta-analysis for non-linear and other multi-parameter associations. | S-EPMC3546395 | biostudies-literature

Data on optimization of the non-linear Muskingum flood routing in Kardeh River using Goa algorithm. | S-EPMC7083777 | biostudies-literature