
Dataset Information


Empirical prediction of genomic susceptibilities for multiple cancer classes.


ABSTRACT: An empirical approach is presented for predicting the genomic susceptibility of an individual to the most likely one among nine traits, consisting of eight major cancer classes plus a healthy trait. We use four prediction methods, obtained by applying two supervised learning algorithms to two different descriptors of common genomic variations (the profiles of genotypes of SNPs and SNP syntaxes with low P values or low frequencies) of each individual genome from normal cells. All four methods made correct predictions substantially better than random predictions for most cancer classes, but not for some others. Combining the four results using Bayesian inference gave better overall predictions than any individual method. The multiclass accuracy of the combined prediction ranges from 33% to 56%, depending on the cancer classes in the testing sets, compared with 11% for a random prediction among nine traits. Despite the limited SNP data currently available and the absence of rare SNPs in public databases, the results suggest that the framework of this approach or its improvement can predict cancer susceptibility with probability estimates useful for making health decisions for individuals or for a population.
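
The abstract does not say how the four results were combined beyond "Bayesian inference". As a minimal sketch only, and not the authors' procedure, the following assumes each method outputs a probability for each of the nine traits and that the methods are conditionally independent given the trait (a naive Bayes style combination). The function names, the uniform prior, and the example probabilities are all hypothetical. The sketch also shows where the 11% random baseline comes from (one chance in nine).

```python
N_TRAITS = 9  # eight cancer classes plus a healthy trait (from the abstract)


def random_baseline(n_traits: int = N_TRAITS) -> float:
    """Expected accuracy of a uniform random guess among n_traits."""
    return 1.0 / n_traits


def combine_posteriors(method_probs, prior=None):
    """Combine per-method trait probabilities into one distribution.

    Assumption (not from the source): methods are conditionally
    independent given the trait, so
        P(t | all methods) is proportional to prior(t) * prod_m P_m(t) / prior(t).
    """
    n = len(method_probs[0])
    if prior is None:
        prior = [1.0 / n] * n  # hypothetical uniform prior over traits
    scores = []
    for t in range(n):
        score = prior[t]
        for probs in method_probs:
            score *= probs[t] / prior[t]
        scores.append(score)
    total = sum(scores)
    return [s / total for s in scores]


def most_likely_trait(method_probs, prior=None) -> int:
    """Index of the trait with the highest combined probability."""
    combined = combine_posteriors(method_probs, prior)
    return max(range(len(combined)), key=combined.__getitem__)


if __name__ == "__main__":
    # Made-up outputs of four methods over nine traits, for illustration only.
    uniform = [1.0 / N_TRAITS] * N_TRAITS
    leaning = [0.2] + [0.1] * 8  # mildly favours trait 0
    methods = [leaning, leaning, uniform, leaning]
    print("random baseline:", random_baseline())
    print("predicted trait index:", most_likely_trait(methods))
```

Under this assumption, several methods that each weakly favour the same trait reinforce one another, which is one way a combined prediction could outperform any single method.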

SUBMITTER: Kim M 

PROVIDER: S-EPMC3918817 | biostudies-other | 2014 Feb

REPOSITORIES: biostudies-other


Publications

Empirical prediction of genomic susceptibilities for multiple cancer classes.

Kim Minseung, Kim Sung-Hou

Proceedings of the National Academy of Sciences of the United States of America, 2014-01-21, issue 5



Similar Datasets

S-EPMC6694479 | biostudies-literature
S-EPMC5558709 | biostudies-other
S-EPMC3512156 | biostudies-literature
S-EPMC7862747 | biostudies-literature
S-EPMC2778678 | biostudies-other
S-EPMC8601116 | biostudies-literature
S-EPMC5565406 | biostudies-literature
S-EPMC5937171 | biostudies-literature
S-EPMC6737184 | biostudies-literature
S-EPMC8343134 | biostudies-literature