Dataset Information

Phenotype Prediction and Genome-Wide Association Study Using Deep Convolutional Neural Network of Soybean.


ABSTRACT: Genomic selection uses single-nucleotide polymorphisms (SNPs) to predict quantitative phenotypes for enhancing traits in breeding populations and has been widely used to increase breeding efficiency in plants and animals. Existing statistical methods rely on a prior distribution assumption for imputed genotype effects, which may not fit experimental datasets. Deep learning could serve as a powerful tool to predict quantitative phenotypes without imputation and to discover potentially associated genotype markers efficiently. We propose a deep-learning framework that uses convolutional neural networks (CNNs) to predict quantitative traits from SNPs and to investigate genotype contributions to the trait using saliency maps. Missing SNP values are treated as a new genotype in the input to the deep learning model. We tested our framework on both simulated data and experimental soybean datasets. The results show that the deep learning model can bypass the imputation of missing values and predict quantitative phenotypes more accurately than other well-known statistical methods currently available. It can also effectively and efficiently identify significant SNP markers and SNP combinations associated with the trait in a genome-wide association study.
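The idea of treating a missing SNP call as its own genotype can be illustrated with a one-hot encoding that reserves an extra class for missing values. The following is a minimal sketch, not the authors' implementation: the integer genotype codes (0, 1, 2, with -1 for missing), the function name `one_hot_genotypes`, and the four-class layout are all assumptions made for illustration.

```python
import numpy as np

# Assumed genotype codes (hypothetical, not taken from the paper):
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate,
# -1 = missing call.
MISSING = -1
N_CLASSES = 4  # three observed genotypes plus one "missing" class


def one_hot_genotypes(genotypes: np.ndarray) -> np.ndarray:
    """Encode an (n_samples, n_snps) integer matrix as (n_samples, n_snps, 4).

    Missing calls are mapped to the last class instead of being imputed.
    """
    g = np.asarray(genotypes)
    # Send the missing code to the last class index; keep other codes as-is.
    idx = np.where(g == MISSING, N_CLASSES - 1, g)
    if idx.min() < 0 or idx.max() >= N_CLASSES:
        raise ValueError("unexpected genotype code")
    # Indexing the identity matrix by class index yields one-hot rows.
    return np.eye(N_CLASSES, dtype=np.float32)[idx]


example = np.array([[0, 1, -1],
                    [2, -1, 0]])
encoded = one_hot_genotypes(example)
```

A tensor of this shape could then be passed to a 1-D convolutional model along the SNP axis, so the network sees "missing" as one more category rather than a value filled in beforehand.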

SUBMITTER: Liu Y 

PROVIDER: S-EPMC6883005 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature

Publications

Phenotype Prediction and Genome-Wide Association Study Using Deep Convolutional Neural Network of Soybean.

Liu Yang, Wang Duolin, He Fei, Wang Juexin, Joshi Trupti, Xu Dong

Frontiers in Genetics, 2019-11-22

Similar Datasets

| S-EPMC6192215 | biostudies-literature
2021-01-11 | GSE147113 | GEO
| S-EPMC8866467 | biostudies-literature
| S-EPMC9040020 | biostudies-literature
| S-EPMC6110828 | biostudies-other
| S-EPMC7838556 | biostudies-literature
| S-EPMC4707437 | biostudies-literature
| S-EPMC10333175 | biostudies-literature
| S-EPMC7674929 | biostudies-literature
| S-EPMC7643315 | biostudies-literature