Dataset Information

Gene-set distance analysis (GSDA): a powerful tool for gene-set association analysis.


ABSTRACT:

Background

Identifying sets of related genes (gene sets) that are empirically associated with a treatment or phenotype often yields valuable biological insights. Several methods effectively identify gene sets in which individual genes have simple monotonic relationships with categorical, quantitative, or censored event-time variables. Some distance-based methods, such as distance correlation, may detect complex non-monotonic associations of a gene set with a quantitative variable that elude other methods. However, distance correlation has yet to be generalized to associate gene sets with categorical and censored event-time endpoints. There is also a need to determine which genes empirically drive the significance of a gene set's association with an endpoint.
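To illustrate why a distance-based statistic can detect an association that a monotonic measure misses, the following minimal sketch computes the standard sample distance correlation (double-centered pairwise distance matrices) and compares it with Pearson correlation on a toy U-shaped relationship. The function names and the toy data are assumptions made for illustration; this is not the GSDA implementation.

```python
import math

def _centered_distances(rows):
    """Double-centered Euclidean distance matrix for a list of sample vectors."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]   # equals the column means (d is symmetric)
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x_rows, y_rows):
    """Sample distance correlation between two sets of per-sample vectors."""
    a = _centered_distances(x_rows)
    b = _centered_distances(y_rows)
    n = len(a)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n))
    dvar_x = sum(v * v for r in a for v in r)
    dvar_y = sum(v * v for r in b for v in r)
    denom = math.sqrt(dvar_x * dvar_y)   # the 1/n^2 factors cancel in the ratio
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0

def pearson(x, y):
    """Ordinary Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data (hypothetical): a symmetric, non-monotonic relationship y = x^2.
x = [(i - 20) / 20 for i in range(41)]
y = [v * v for v in x]

r_pearson = pearson(x, y)
r_dcor = distance_correlation([[v] for v in x], [[v] for v in y])
r_self = distance_correlation([[v] for v in x], [[v] for v in x])
print(f"Pearson: {r_pearson:.3f}  distance correlation: {r_dcor:.3f}")
```

Because each sample is passed as a vector, the same function accepts a whole gene set (one coordinate per gene) on the expression side without modification.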

Results

We develop gene-set distance analysis (GSDA) by generalizing distance correlation to evaluate the association of a gene set with categorical and censored event-time variables. We also develop a backward elimination procedure to identify a subset of genes that empirically drive significant associations. In simulation studies, GSDA more effectively identified complex non-monotonic gene-set associations than did six other published methods. In the analysis of a pediatric acute myeloid leukemia (AML) data set, GSDA was the only method to discover that event-free survival (EFS) was associated with the 56-gene AML pathway gene set, narrow that result down to 5 genes, and confirm the association of those 5 genes with EFS in a separate validation cohort. These results indicate that GSDA effectively identifies and characterizes complex non-monotonic gene-set associations that are missed by other methods.
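The abstract does not spell out the backward elimination procedure, so the following is only a hypothetical sketch of the general idea: start from the full gene set, repeatedly drop the gene whose removal gives the highest distance correlation with the endpoint, and stop once every removal would lower the statistic. The stopping rule, the function names, and the toy data are all assumptions; the published procedure may differ (for example, by using p-values rather than the raw statistic).

```python
import math
import random

def _centered(rows):
    """Double-centered Euclidean distance matrix of per-sample vectors."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    means = [sum(r) / n for r in d]
    grand = sum(means) / n
    return [[d[i][j] - means[i] - means[j] + grand for j in range(n)]
            for i in range(n)]

def dcor(x_rows, y_rows):
    """Sample distance correlation between two sets of per-sample vectors."""
    a, b = _centered(x_rows), _centered(y_rows)
    n = len(a)
    cov = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n))
    var_a = sum(v * v for r in a for v in r)
    var_b = sum(v * v for r in b for v in r)
    denom = math.sqrt(var_a * var_b)
    return math.sqrt(max(cov, 0.0) / denom) if denom > 0 else 0.0

def backward_eliminate(expr, endpoint):
    """Greedy backward elimination (assumed rule, not the published one).

    expr maps gene name -> list of expression values, one per sample.
    Returns the retained genes and the statistic after each accepted step.
    """
    n = len(endpoint)
    y_rows = [[v] for v in endpoint]

    def score(genes):
        return dcor([[expr[g][i] for g in genes] for i in range(n)], y_rows)

    kept = list(expr)
    history = [score(kept)]
    while len(kept) > 1:
        # Evaluate every single-gene removal and pick the best one.
        trial, drop = max((score([g for g in kept if g != c]), c) for c in kept)
        if trial < history[-1]:
            break  # every removal weakens the association; stop here
        kept.remove(drop)
        history.append(trial)
    return kept, history

# Toy data (hypothetical): one gene with a U-shaped effect plus pure noise genes.
rng = random.Random(0)
n_samples = 40
driver = [(i - 20) / 20 for i in range(n_samples)]
expr = {"driver": driver}
for k in range(3):
    expr[f"noise{k + 1}"] = [rng.uniform(-1, 1) for _ in range(n_samples)]
endpoint = [v * v for v in driver]

kept, history = backward_eliminate(expr, endpoint)
print("retained genes:", kept)
print("statistic per step:", [round(s, 3) for s in history])
```

By construction the statistic never decreases across accepted steps, so the retained subset is at least as strongly associated with the endpoint as the full set under this criterion.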

Conclusion

GSDA is a powerful and flexible method for detecting gene-set associations with categorical, quantitative, or censored event-time variables, especially complex non-monotonic associations. GSDA is available as an R package at https://CRAN.R-project.org/package=GSDA.

SUBMITTER: Cao X 

PROVIDER: S-EPMC8059024 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4382905 | biostudies-literature
| S-EPMC4411044 | biostudies-literature
| S-EPMC3032061 | biostudies-literature
| S-EPMC6436759 | biostudies-literature
| S-EPMC3605602 | biostudies-literature
| S-EPMC5570149 | biostudies-literature
| S-EPMC5669628 | biostudies-literature
| S-EPMC2852214 | biostudies-literature
| S-EPMC3345567 | biostudies-literature
| S-EPMC7542881 | biostudies-literature