
Dataset Information


Knowledge-constrained K-medoids Clustering of Regulatory Rare Alleles for Burden Tests.


ABSTRACT: Rarely occurring genetic variants are hypothesized to influence human diseases, but statistically associating these rare variants to disease is challenging due to a lack of statistical power in most feasibly sized datasets. Several statistical tests have been developed to either collapse multiple rare variants from a genomic region into a single variable (presence/absence) or to tally the number of rare alleles within a region, relating the burden of rare alleles to disease risk. Both these approaches, however, rely on user-specification of a genomic region to generate these collapsed or burden variables, usually an entire gene. Recent studies indicate that most risk variants for common diseases are found within regulatory regions, not genes. To capture the effect of rare alleles within non-genic regulatory regions for burden tests, we contrast a simple sliding window approach with a knowledge-guided k-medoids clustering method to group rare variants into statistically powerful, biologically meaningful windows. We apply these methods to detect genomic regions that alter expression of nearby genes.
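The collapsing, burden, and windowing ideas in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy positions and genotypes, the function names (`sliding_windows`, `k_medoids_1d`, `collapse`, `burden`), the window width and step, and the deterministic medoid initialization. The clustering shown is plain, unconstrained k-medoids on base-pair positions; the knowledge-guided constraint is not described in this record, so it is not implemented.

```python
from typing import Dict, List

# Hypothetical toy data: variant positions (bp) and, per sample,
# the number of rare alleles carried at each variant (0, 1, or 2).
positions: List[int] = [100, 150, 180, 900, 950, 2000]
genotypes: Dict[str, List[int]] = {
    "sample_a": [0, 1, 0, 0, 0, 0],
    "sample_b": [1, 1, 0, 0, 2, 0],
    "sample_c": [0, 0, 0, 0, 0, 0],
}


def sliding_windows(positions: List[int], width: int, step: int) -> List[List[int]]:
    """Group variant indices into non-empty windows [start, start + width)."""
    windows = []
    start, last = min(positions), max(positions)
    while start <= last:
        idx = [i for i, p in enumerate(positions) if start <= p < start + width]
        if idx:
            windows.append(idx)
        start += step
    return windows


def k_medoids_1d(positions: List[int], k: int, max_iter: int = 100) -> List[List[int]]:
    """Plain k-medoids on positions (alternating assign/update steps).

    Assumption: no biological knowledge constraint is applied here.
    """
    n = len(positions)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of variants")
    order = sorted(range(n), key=lambda i: positions[i])
    # Deterministic initialization: evenly spaced picks along the sorted variants.
    medoids = [order[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    clusters: List[List[int]] = []
    for _ in range(max_iter):
        # Assignment step: each variant joins its nearest medoid.
        clusters = [[] for _ in range(k)]
        for i in order:
            nearest = min(range(k), key=lambda j: abs(positions[i] - positions[medoids[j]]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = [
            min(c, key=lambda m: sum(abs(positions[m] - positions[q]) for q in c))
            if c else medoids[j]
            for j, c in enumerate(clusters)
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return [c for c in clusters if c]


def collapse(counts: List[int], idx: List[int]) -> int:
    """Collapsed variable: 1 if any rare allele is present in the group, else 0."""
    return int(any(counts[i] > 0 for i in idx))


def burden(counts: List[int], idx: List[int]) -> int:
    """Burden variable: total number of rare alleles in the group."""
    return sum(counts[i] for i in idx)


windows = sliding_windows(positions, width=500, step=500)
clusters = k_medoids_1d(positions, k=3)
for name, counts in genotypes.items():
    print(name, [collapse(counts, w) for w in windows], [burden(counts, c) for c in clusters])
```

On this toy input the fixed windows and the position-based clusters happen to produce the same three groups; on real data they would generally differ, which is the contrast the abstract draws.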

SUBMITTER: Sivley RM 

PROVIDER: S-EPMC4274942 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature


Publications

Knowledge-constrained K-medoids Clustering of Regulatory Rare Alleles for Burden Tests.

Sivley RM, Fish AE, Bush WS

Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference), 2013-01-01



Similar Datasets

| S-EPMC7460393 | biostudies-literature
| S-EPMC10746319 | biostudies-literature
| S-EPMC9949046 | biostudies-literature
| S-EPMC9455583 | biostudies-literature
| S-EPMC9294413 | biostudies-literature
| S-EPMC3562702 | biostudies-literature
| S-EPMC3048103 | biostudies-literature
| S-EPMC6984355 | biostudies-literature
| S-EPMC6568016 | biostudies-literature
| S-EPMC9364503 | biostudies-literature