Research on the Mechanism of Soybean Resistance to Phytophthora Infection Using Machine Learning Methods.
ABSTRACT: Since the emergence of Phytophthora sojae infection, annual economic losses of 10-20 billion U.S. dollars have been reported. Studies have revealed that P. sojae releases effectors such as small RNAs while infecting soybeans, but the plant-pathogen interaction mechanism at the small RNA level remains unclear. For this reason, studying the host's resistance mechanism after P. sojae invades soybean has critical theoretical and practical significance for increasing soybean yield. The present article is based on high-throughput data published by the National Center for Biotechnology Information (NCBI). Through big data analysis, we selected 732 sRNA sequences whose expression level increased sharply after soybean was infected by P. sojae, and 36 highly expressed sRNA sequences that were newly generated after infection. This article analyzes the resistance mechanism of soybean to P. sojae from two aspects: the plant's own passive stress response and its active resistance. Target prediction of these 768 sRNA sequences against soybean mRNA and P. sojae mRNA yielded 2,979 and 1,683 targets, respectively. The PageRank algorithm was used to screen the core functional clusters, yielding 50 core nodes among the soybean targets; functional enrichment analysis of these nodes identified 12 KEGG pathways and 18 GO biological process (BP) terms. Functional enrichment analysis of the nodes targeting P. sojae identified 11 KEGG pathways. The results show that multiple GO (BP) terms and KEGG pathways are related to soybean growth and defense and to the reverse resistance of P. sojae.
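As a rough illustration of how PageRank can be used to pick out core nodes in an sRNA-target network, the following minimal sketch runs a plain power iteration on a tiny directed graph. The node names, the edge list, the damping factor of 0.85, and the helper names `pagerank` and `top_nodes` are all assumptions made for this example; the abstract does not describe how the authors built their network or which implementation and parameters they used.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed edge list (illustrative sketch)."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


def top_nodes(rank, k):
    """Return the k highest-ranked node names (ties broken by name)."""
    return sorted(rank, key=lambda v: (-rank[v], v))[:k]


# Hypothetical sRNA -> target edges; not data from the study.
edges = [
    ("sRNA_1", "gene_A"),
    ("sRNA_2", "gene_A"),
    ("sRNA_3", "gene_A"),
    ("sRNA_3", "gene_B"),
]
rank = pagerank(edges)
print(top_nodes(rank, 2))
```

In this toy graph the target hit by the most sRNAs receives the highest score, which is the intuition behind selecting a fixed number of top-ranked nodes as a core set.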
In addition, small RNA prediction models for soybean resistance and Phytophthora pathogenicity were constructed with three machine learning methods (random forest, support vector machine, and XGBoost) and compared on accuracy, precision, recall, and F-measure. The results show that all three models achieve satisfactory classification performance. Among the three models, XGBoost had an accuracy of 86.98% on the validation set.
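For readers unfamiliar with the four evaluation measures named here, the sketch below computes them from binary labels and predictions. The label vectors are invented for illustration and are not results from the study, and the F-measure is assumed to be the balanced F1 score, since the abstract does not state which weighting was used.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classifier (sketch)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up validation labels and predictions (assumption, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Computing the same dictionary for each candidate model on a shared validation set gives a like-for-like comparison of the kind the abstract describes.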
SUBMITTER: Chi J
PROVIDER: S-EPMC7928311 | biostudies-literature | 2021
REPOSITORIES: biostudies-literature