Dataset Information

Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data.


ABSTRACT: The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures associated with machine learning models that efficiently predict a specified outcome. To this end, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML.
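To make the idea of selecting a biomarker combination on an unbalanced dataset more concrete, here is a minimal Python sketch. It is not BioDiscML's implementation and does not use its interface. It assumes a generic greedy forward selection, a nearest-centroid classifier, leave-one-out evaluation, and balanced accuracy as the score. All function names and the toy data are hypothetical and chosen only for illustration.

```python
from statistics import mean

def nearest_centroid_predict(train_X, train_y, x, features):
    """Predict the class whose centroid, over the chosen features, is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        dist = sum((x[f] - mean(r[f] for r in rows)) ** 2 for f in features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_balanced_accuracy(X, y, features):
    """Leave-one-out balanced accuracy: the mean of the per-class recalls."""
    hits = {label: [] for label in set(y)}
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        pred = nearest_centroid_predict(train_X, train_y, X[i], features)
        hits[y[i]].append(1 if pred == y[i] else 0)
    return mean(sum(h) / len(h) for h in hits.values())

def forward_select(X, y, max_features):
    """Greedily add the feature that most improves the score; stop when none does."""
    selected, best_score = [], 0.0
    candidates = list(range(len(X[0])))
    while candidates and len(selected) < max_features:
        # Ties are broken in favour of the lowest feature index.
        score, neg_f = max(
            (loo_balanced_accuracy(X, y, selected + [f]), -f) for f in candidates
        )
        if score <= best_score:
            break
        selected.append(-neg_f)
        candidates.remove(-neg_f)
        best_score = score
    return selected, best_score

# Hypothetical toy data: 6 controls and 2 cases, 3 candidate biomarkers per sample.
X = [
    [5.0, 1.0, 0.3], [3.0, 1.2, 0.9], [4.0, 0.9, 0.1],
    [6.0, 1.1, 0.5], [2.0, 1.0, 0.7], [5.5, 0.8, 0.2],
    [4.5, 3.0, 0.4], [3.5, 3.2, 0.6],
]
y = ["control"] * 6 + ["case"] * 2

selected, score = forward_select(X, y, max_features=2)
print(selected, score)
```

Balanced accuracy is used here because plain accuracy can look high on an unbalanced dataset even when the minority class is never predicted. In the toy data only the feature at index 1 separates the two classes, so the search stops after selecting it.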

SUBMITTER: Leclercq M 

PROVIDER: S-EPMC6532608 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data.

Mickael Leclercq, Benjamin Vittrant, Marie Laure Martin-Magniette, Marie Pier Scott Boyer, Olivier Perin, Alain Bergeron, Yves Fradet, Arnaud Droit

Frontiers in Genetics, 2019-05-16


Similar Datasets

| S-EPMC5738110 | biostudies-literature
| S-EPMC4937038 | biostudies-literature
| S-EPMC8165452 | biostudies-literature
| S-EPMC10119907 | biostudies-literature
| S-EPMC8386074 | biostudies-literature
| S-EPMC9533501 | biostudies-literature
| S-EPMC6751684 | biostudies-literature
| S-EPMC6874333 | biostudies-literature
| S-EPMC8631639 | biostudies-literature
| S-EPMC10605029 | biostudies-literature