Dataset Information

Machine learning for effectively avoiding overfitting is a crucial strategy for the genetic prediction of polygenic psychiatric phenotypes.


ABSTRACT: The accuracy of previous genetic studies in predicting polygenic psychiatric phenotypes has been limited mainly due to the limited power in distinguishing truly susceptible variants from null variants and the resulting overfitting. A novel prediction algorithm, Smooth-Threshold Multivariate Genetic Prediction (STMGP), was applied to improve the genome-based prediction of psychiatric phenotypes by decreasing overfitting through selecting variants and building a penalized regression model. Prediction models were trained using a cohort of 3685 subjects in Miyagi prefecture and validated with an independently recruited cohort of 3048 subjects in Iwate prefecture in Japan. Genotyping was performed using HumanOmniExpressExome BeadChip Arrays. We used the target phenotype of depressive symptoms and simulated phenotypes with varying complexity and various effect-size distributions of risk alleles. The prediction accuracy and the degree of overfitting of STMGP were compared with those of state-of-the-art models (polygenic risk scores, genomic best linear-unbiased prediction, summary-data-based best linear-unbiased prediction, BayesR, and ridge regression). In the prediction of depressive symptoms, compared with the other models, STMGP showed the highest prediction accuracy with the lowest degree of overfitting, although there was no significant difference in prediction accuracy. Simulation studies suggested that STMGP has a better prediction accuracy for moderately polygenic phenotypes. Our investigations suggest the potential usefulness of STMGP for predicting polygenic psychiatric conditions while avoiding overfitting.
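The abstract compares models by prediction accuracy and by degree of overfitting, with training and validation done in separate cohorts. The sketch below illustrates one plausible way to quantify that gap, under stated assumptions: it uses plain ridge regression (one of the comparison models named above, not STMGP itself), fully simulated genotypes and phenotypes, and a gap defined as training R-squared minus validation R-squared. The function names, cohort sizes, allele frequency, and effect sizes are all hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np


def simulate_cohort(rng, n_subjects, effects, noise_sd=1.0):
    """Simulate additive genotypes (0/1/2) and a quantitative phenotype."""
    n_variants = effects.shape[0]
    # Assumed minor-allele frequency of 0.3 for every variant (illustrative only).
    genotypes = rng.binomial(2, 0.3, size=(n_subjects, n_variants)).astype(float)
    phenotype = genotypes @ effects + rng.normal(0.0, noise_sd, size=n_subjects)
    return genotypes, phenotype


def fit_ridge(X, y, penalty):
    """Closed-form ridge regression on centered data; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + penalty * np.eye(X.shape[1])
    weights = np.linalg.solve(gram, Xc.T @ yc)
    return weights, y_mean - x_mean @ weights


def r_squared(X, y, weights, intercept):
    """Coefficient of determination of the fitted model on (X, y)."""
    residual = y - (X @ weights + intercept)
    total = y - y.mean()
    return 1.0 - (residual @ residual) / (total @ total)


def overfitting_gap(penalty, seed=0):
    """Return (train R^2, validation R^2, gap) for one simulated experiment."""
    rng = np.random.default_rng(seed)
    effects = np.zeros(200)
    effects[:10] = rng.normal(0.0, 0.3, size=10)  # 10 causal variants, 190 null
    X_train, y_train = simulate_cohort(rng, 300, effects)
    X_valid, y_valid = simulate_cohort(rng, 300, effects)  # independent cohort
    weights, intercept = fit_ridge(X_train, y_train, penalty)
    r2_train = r_squared(X_train, y_train, weights, intercept)
    r2_valid = r_squared(X_valid, y_valid, weights, intercept)
    return r2_train, r2_valid, r2_train - r2_valid
```

Calling overfitting_gap with a very small penalty and again with a large one shows how shrinkage trades training fit against the training-validation gap in this toy setting. Real genome-wide data would involve far more variants than subjects, linkage disequilibrium, and covariates, none of which are modeled here.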

SUBMITTER: Takahashi Y 

PROVIDER: S-EPMC7442807 | biostudies-literature | 2020 Aug

REPOSITORIES: biostudies-literature

Publications

Machine learning for effectively avoiding overfitting is a crucial strategy for the genetic prediction of polygenic psychiatric phenotypes.

Takahashi Yuta, Ueki Masao, Tamiya Gen, Ogishima Soichi, Kinoshita Kengo, Hozawa Atsushi, Minegishi Naoko, Nagami Fuji, Fukumoto Kentaro, Otsuka Kotaro, Tanno Kozo, Sakata Kiyomi, Shimizu Atsushi, Sasaki Makoto, Sobue Kenji, Kure Shigeo, Yamamoto Masayuki, Tomita Hiroaki

Translational Psychiatry, 2020-08-17, issue 1


Similar Datasets

| S-EPMC5627249 | biostudies-literature
| S-EPMC7610853 | biostudies-literature
| S-EPMC7351018 | biostudies-literature
| S-EPMC6599546 | biostudies-literature
| 2406067 | ecrin-mdr-crc
| S-EPMC4357859 | biostudies-other
| S-EPMC6168558 | biostudies-literature
| S-EPMC3138621 | biostudies-literature
2013-01-01 | E-GEOD-29210 | biostudies-arrayexpress