
Dataset Information


Ranking, selecting, and prioritising genes with desirability functions.


ABSTRACT: In functional genomics experiments, researchers often select genes to follow up or validate from a long list of differentially expressed genes. Typically, sharp thresholds are used to bin genes into groups such as significant/non-significant or fold change above/below a cut-off value, and ad hoc criteria are also used, such as favouring well-known genes. Binning, however, is inefficient and does not take the uncertainty of the measurements into account. Furthermore, p-values, fold-changes, and other outcomes are treated as equally important, and relevant genes may be overlooked with such an approach. Desirability functions are proposed as a way to integrate multiple selection criteria for ranking, selecting, and prioritising genes. These functions map any variable to a continuous 0-1 scale, where one is maximally desirable and zero is unacceptable. Multiple selection criteria are then combined to provide an overall desirability that is used to rank genes. In addition to p-values and fold-changes, further experimental results and information contained in databases can be easily included as criteria. The approach is demonstrated with a breast cancer microarray data set. The functions and an example data set can be found in the desiR package on CRAN (https://cran.r-project.org/web/packages/desiR/) and the development version is available on GitHub (https://github.com/stanlazic/desiR).
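
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the desiR implementation: the piecewise-linear shape, the weighted geometric mean used to combine criteria, the function names (`d_high`, `d_low`, `overall`), the cut-off values, and the three example genes are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def d_high(x, cut1, cut2, des_min=0.0, des_max=1.0):
    """Hypothetical 'high is good' desirability.

    Returns des_min at or below cut1, des_max at or above cut2,
    and rises linearly in between (a soft threshold instead of a bin).
    """
    if x <= cut1:
        return des_min
    if x >= cut2:
        return des_max
    return des_min + (des_max - des_min) * (x - cut1) / (cut2 - cut1)


def d_low(x, cut1, cut2, des_min=0.0, des_max=1.0):
    """Hypothetical 'low is good' desirability (mirror image of d_high)."""
    return d_high(-x, -cut2, -cut1, des_min, des_max)


def overall(desirabilities, weights=None):
    """Combine criteria with a weighted geometric mean (an assumed rule).

    Any criterion scored 0 (unacceptable) forces the overall score to 0.
    """
    if weights is None:
        weights = [1.0] * len(desirabilities)
    if any(d == 0 for d in desirabilities):
        return 0.0
    total = sum(weights)
    log_sum = sum(w * math.log(d) for d, w in zip(desirabilities, weights))
    return math.exp(log_sum / total)


# Made-up example data: gene -> (p-value, absolute log2 fold change).
genes = {
    "geneA": (0.001, 2.0),
    "geneB": (0.04, 1.0),
    "geneC": (0.20, 3.0),
}

scores = {}
for name, (p_value, abs_lfc) in genes.items():
    d_p = d_low(p_value, cut1=0.01, cut2=0.10)   # small p-values are desirable
    d_fc = d_high(abs_lfc, cut1=0.5, cut2=1.5)   # large effects are desirable
    scores[name] = overall([d_p, d_fc])

# Rank genes from most to least desirable.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this sketch, a gene with a large fold change but a poor p-value (`geneC`) is ranked last because one unacceptable criterion zeroes the overall score, while a gene just inside a hard cut-off would receive partial rather than full credit. Raising `des_min` above zero would soften that behaviour.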

SUBMITTER: Lazic SE 

PROVIDER: S-EPMC4671156 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature


Publications

Ranking, selecting, and prioritising genes with desirability functions.

Lazic, Stanley E

PeerJ, 2015-11-26



Similar Datasets

| S-EPMC3439680 | biostudies-literature
| S-EPMC9677339 | biostudies-literature
| S-EPMC6129274 | biostudies-literature
| S-EPMC1184045 | biostudies-literature
| S-EPMC5534459 | biostudies-other
| S-EPMC507884 | biostudies-other
| S-EPMC193620 | biostudies-literature
| S-EPMC7840943 | biostudies-literature
| S-EPMC10169394 | biostudies-literature
| S-EPMC9125302 | biostudies-literature