Dataset Information

Classification of rheumatoid arthritis status with candidate gene and genome-wide single-nucleotide polymorphisms using random forests.


ABSTRACT: Using the North American Rheumatoid Arthritis Consortium (NARAC) candidate gene and genome-wide single-nucleotide polymorphism (SNP) data sets, we applied regression methods and tree-based random forests to identify genetic associations with rheumatoid arthritis (RA) and to predict RA disease status. Several genes were consistently identified as weakly associated with RA, without a significant interaction or combinatorial effect with other candidate genes. Using random forests, the tested candidate gene SNPs were not sufficient to distinguish RA patients from normal subjects with high accuracy. However, using the top 500 SNPs, ranked by importance score, from the genome-wide linkage panel of 5742 SNPs, we were able to classify RA patients and normal subjects with a sensitivity of approximately 90% and a specificity of approximately 80%, as confirmed by five-fold cross-validation. In a complete training-testing framework, replication of genetic predictors was less satisfactory; thus, further evaluation of existing methodology and development of new methods are warranted.
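The sensitivity and specificity figures quoted in the abstract come from five-fold cross-validation. The following is a minimal sketch of how such figures can be computed, not the authors' implementation. All function names (`k_fold_indices`, `sensitivity_specificity`, `cross_validated_metrics`) and the `fit`/`predict` callables are hypothetical, and labels are assumed to be coded 1 for RA cases and 0 for normal subjects.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); assumes 1 = RA case, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def cross_validated_metrics(X, y, fit, predict, k=5, seed=0):
    """Pool held-out predictions over k folds, then score them once.

    `fit(X_train, y_train)` returns a model and `predict(model, x)` returns
    0 or 1; both are hypothetical placeholders for any classifier.
    """
    y_pred = [None] * len(y)
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            y_pred[i] = predict(model, X[i])
    return sensitivity_specificity(y, y_pred)
```

In a sketch like this, any SNP ranking or selection step would belong inside `fit`, so that held-out samples do not influence which predictors are chosen. The abstract does not state how the original analysis ordered these steps.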

SUBMITTER: Sun YV 

PROVIDER: S-EPMC2367463 | biostudies-literature | 2007

REPOSITORIES: biostudies-literature

Publications

Classification of rheumatoid arthritis status with candidate gene and genome-wide single-nucleotide polymorphisms using random forests.

Yan V. Sun, Zhaohui Cai, Kaushal Desai, Rachael Lawrance, Richard Leff, Ansar Jawaid, Sharon L. R. Kardia, Huiying Yang

BMC Proceedings, 2007-12-18



Similar Datasets

| S-EPMC2795970 | biostudies-literature
| S-EPMC2367457 | biostudies-literature
| S-EPMC1526613 | biostudies-literature
| S-EPMC2795969 | biostudies-literature
| S-EPMC2443997 | biostudies-other
| S-EPMC6626904 | biostudies-literature
| S-EPMC2795881 | biostudies-literature
| S-EPMC2276142 | biostudies-literature
| S-EPMC6986563 | biostudies-literature
| S-EPMC4331719 | biostudies-literature