Dataset Information

Feature Selection Stability and Accuracy of Prediction Models for Genomic Prediction of Residual Feed Intake in Pigs Using Machine Learning.

ABSTRACT: Feature selection (FS, i.e., selection of a subset of predictor variables) is essential in high-dimensional datasets to prevent overfitting of prediction/classification models and reduce computation time and resources. In genomics, FS allows identifying relevant markers and designing low-density SNP chips to evaluate selection candidates. In this research, several univariate and multivariate FS algorithms combined with various parametric and non-parametric learners were applied to the prediction of feed efficiency in growing pigs from high-dimensional genomic data. The objective was to find the best combination of feature selector, SNP subset size, and learner leading to accurate and stable (i.e., less sensitive to changes in the training data) prediction models. Genomic best linear unbiased prediction (GBLUP) without SNP pre-selection was the benchmark. Three types of FS methods were implemented: (i) filter methods: univariate (univ.dtree, spearcor) or multivariate (cforest, mrmr), with random selection as benchmark; (ii) embedded methods: elastic net and least absolute shrinkage and selection operator (LASSO) regression; (iii) combination of filter and embedded methods. Ridge regression, support vector machine (SVM), and gradient boosting (GB) were applied after pre-selection performed with the filter methods. Data represented 5,708 individual records of residual feed intake to be predicted from the animal's own genotype. Accuracy (stability of results) was measured as the median (interquartile range) of the Spearman correlation between observed and predicted data in a 10-fold cross-validation. The best prediction in terms of accuracy and stability was obtained with SVM and GB using 500 or more SNPs [0.28 (0.02) and 0.27 (0.04) for SVM and GB with 1,000 SNPs, respectively]. With larger subset sizes (1,000-1,500 SNPs), the filter method had no influence on prediction quality, which was similar to that attained with a random selection. 
With 50-250 SNPs, the FS method had a huge impact on prediction quality: it was very poor for tree-based methods combined with any learner, but good and similar to what was obtained with larger SNP subsets when spearcor or mrmr were implemented with or without embedded methods. Those filters also led to very stable results, suggesting their potential use for designing low-density SNP chips for genome-based evaluation of feed efficiency.
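
The evaluation protocol described in the abstract (a Spearman-correlation filter, a fixed SNP subset size, and accuracy and stability reported as the median and interquartile range of the per-fold Spearman correlation) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the genotypes are synthetic, the function names are hypothetical, the learner is a simple signed-sum marker score standing in for ridge regression, SVM, or GB, and performing the selection inside each training fold is an assumption the abstract does not state.

```python
import random
import statistics


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks (0.0 if constant)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def select_snps(X, y, rows, n_snps):
    """Filter step: keep the n_snps columns with the largest |Spearman| on the given rows."""
    y_tr = [y[r] for r in rows]
    corr = [spearman([X[r][j] for r in rows], y_tr) for j in range(len(X[0]))]
    top = sorted(range(len(corr)), key=lambda j: -abs(corr[j]))[:n_snps]
    return [(j, corr[j]) for j in top]


def cross_validate(X, y, n_snps, k=10, seed=0):
    """Return (median, IQR) of the per-fold Spearman correlation."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    scores = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Assumption: SNPs are selected on the training rows only.
        chosen = select_snps(X, y, train, n_snps)
        # Stand-in learner (assumption): signed sum of the selected genotypes.
        pred = [sum((1 if c >= 0 else -1) * X[r][j] for j, c in chosen) for r in test]
        scores.append(spearman(pred, [y[r] for r in test]))
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return statistics.median(scores), q3 - q1


# Synthetic example: 300 animals, 40 SNPs coded 0/1/2, 5 of them causal.
rng = random.Random(1)
X = [[rng.choice((0, 1, 2)) for _ in range(40)] for _ in range(300)]
y = [sum(row[:5]) + rng.gauss(0, 1) for row in X]
median_acc, iqr_acc = cross_validate(X, y, n_snps=5)
print(f"accuracy (median) = {median_acc:.2f}, stability (IQR) = {iqr_acc:.2f}")
```

A smaller interquartile range means the fold-to-fold results vary less, which is the sense in which the abstract calls a model stable. Swapping `select_snps` for a random draw of columns gives the random-selection benchmark mentioned above.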

SUBMITTER: Piles M

PROVIDER: S-EPMC7938892 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

Feature Selection Stability and Accuracy of Prediction Models for Genomic Prediction of Residual Feed Intake in Pigs Using Machine Learning.

Miriam Piles, Rob Bergsma, Daniel Gianola, Hélène Gilbert, Llibertat Tusell

Frontiers in Genetics, 2021-02-22

PMID: 33692825

Similar Datasets

Project description: To identify a proper strategy for future feed-efficient pig farming, it is required to evaluate the ongoing selection scenarios. Tools are lacking for the evaluation of pig selection scenarios in terms of environmental impacts to provide selection guidelines for a more sustainable pig production. Selection on residual feed intake (RFI) has been proposed to improve feed efficiency and potentially reduce the associated environmental impacts. The aim of this study was thus to develop a model to account for individual animal performance in life cycle assessment (LCA) methods to quantify the responses to selection. Experimental data were collected from the fifth generation of pig lines divergently selected for RFI (low line, more efficient pigs, LRFI; high line, less efficient pigs, HRFI). The average feed conversion ratio (FCR) and daily feed intake of LRFI pigs were 7% lower than the average of HRFI pigs (P < 0.0001). A parametric model was developed for LCA based on the dietary net energy fluxes in a pig system. A nutritional pig growth tool, InraPorc®, was included as a module in the model to embed flexibility for changes in feed composition, animal performance traits and housing conditions and to simulate individual pig performance. The comparative individual-based LCA showed that LRFI had an average of 7% lower environmental impacts per kilogram live pig at farm gate compared to HRFI (P < 0.0001) on climate change, acidification potential, freshwater eutrophication potential, land occupation and water depletion. High correlations between FCR and all environmental impact categories (>0.95) confirmed the importance of improvement in feed efficiency to reduce environmental impacts. Significant line differences in all impact categories and moderate correlations with impacts (>0.51) revealed that RFI is an effective measure to select for improved environmental impacts, despite lower correlations compared to FCR.
Altogether more optimal criteria for efficient environment-friendly selection can then be expected through restructuring the selection indexes from an environmental point of view.

Project description: This review summarizes the results from the INRA (Institut National de la Recherche Agronomique) divergent selection experiment on residual feed intake (RFI) in growing Large White pigs during nine generations of selection. It discusses the remaining challenges and perspectives for the improvement of feed efficiency in growing pigs. The impacts on growing pigs raised under standard conditions and in alternative situations such as heat stress, inflammatory challenges or lactation have been studied. After nine generations of selection, the divergent selection for RFI led to highly significant (P<0.001) line differences for RFI (-165 g/day in the low RFI (LRFI) line compared with high RFI line) and daily feed intake (-270 g/day). Low responses were observed on growth rate (-12.8 g/day, P<0.05) and body composition (+0.9 mm backfat thickness, P=0.57; -2.64% lean meat content, P<0.001) with a marked response on feed conversion ratio (-0.32 kg feed/kg gain, P<0.001). Reduced ultimate pH and increased lightness of the meat (P<0.001) were observed in LRFI pigs with minor impact on the sensory quality of the meat. These changes in meat quality were associated with changes of the muscular energy metabolism. Reduced maintenance energy requirements (-10% after five generations of selection) and activity (-21% of time standing after six generations of selection) of LRFI pigs greatly contributed to the gain in energy efficiency. However, the impact of selection for RFI on the protein metabolism of the pig remains unclear. Digestibility of energy and nutrients was not affected by selection, neither for pigs fed conventional diets nor for pigs fed high-fibre diets. A significant improvement of digestive efficiency could likely be achieved by selecting pigs on fibre diets. No convincing genetic or blood biomarker has been identified for explaining the differences in RFI, suggesting that pigs have various ways to achieve an efficient use of feed.
No deleterious impact of the selection on the sow reproduction performance was observed. The resource allocation theory states that low RFI may reduce the ability to cope with stressors, via the reduction of a buffer compartment dedicated to responses to stress. None of the experiments focussed on the response of pigs to stress or challenges could confirm this theory. Understanding the relationships between RFI and responses to stress and energy demanding processes, such as immunity and lactation, remains a major challenge for a better understanding of the underlying biological mechanisms of the trait and to reconcile the experimental results with the resource allocation theory.

Project description: Microbes and microbial components potentially impact the performance of pigs through immune stimulation and altered metabolism. These immune modulating factors can include endotoxin from gram negative bacterial outer membrane component, commonly referred to as lipopolysaccharide (LPS). In this study, our objective was to examine the relationship between intestinal barrier integrity, endotoxin and inflammation with feed efficiency (FE), using pig lines divergently selected for residual feed intake (RFI) as a model. Twelve gilts (62 ± 3 kg BW) from the low RFI (LRFI, more efficient) and 12 from the high RFI (HRFI, less efficient) were used. Individual performance data was recorded for 5 wk. At the end of the experimental period, ADFI of LRFI pigs was less (P < 0.001), ADG not different between the 2 lines (P = 0.72) but the G:F of LRFI pigs was greater than for HRFI pigs (P = 0.019). Serum endotoxin concentration (P < 0.01) and the acute phase protein haptoglobin (P < 0.05) were greater in HRFI pigs. Transepithelial resistance of the ileum, transport of fluorescein isothiocyanate labeled-Dextran and -LPS in ileum and colon, as well as tight junction protein mRNA expression in ileum, did not differ between the lines, indicating the 2 lines did not differ in transport characteristics at the intestinal level. Ileum inflammatory markers, myeloperoxidase (P < 0.05) and IL-8 (P < 0.10), were found to be greater in HRFI pigs. Alkaline phosphatase (ALP) activity was significantly increased in the LRFI pigs in ileum and liver tissues and negatively correlated with blood endotoxin (P < 0.05). Lysozyme activity in the liver was not different between the lines; however, the LRFI pigs had a twofold greater lysozyme activity in ileum (P < 0.05). Despite the difference in their activity, ALP or lysozyme mRNA expression was not different between the lines in either tissue.
Decreased endotoxin and inflammatory markers and the enhanced activities of antimicrobial enzymes in the LRFI line may not fully explain the difference in the FE between the lines, but they have the potential to prevent the growth potential in HRFI pigs. Further studies are needed to identify the other mechanisms that may contribute to the greater endotoxin and acute phase proteins in the HRFI pigs and the greater FE in the LRFI pigs.

Project description: BACKGROUND: Improving feed efficiency (FE) of pigs by genetic selection is of economic and environmental significance. An increasingly accepted measure of feed efficiency is residual feed intake (RFI). Currently, the molecular mechanisms underlying RFI are largely unknown. Additionally, to incorporate RFI into animal breeding programs, feed intake must be recorded on individual pigs, which is costly and time-consuming. Thus, convenient and predictive biomarkers for RFI that can be measured at an early age are greatly desired. In this study, we aimed to explore whether differences exist in the global gene expression profiles of peripheral blood of 35 to 42 day-old pigs with extremely low (more efficient) and high RFI (less efficient) values from two lines that were divergently selected for RFI during the grow-finish phase, to use such information to explore the potential molecular basis of RFI differences, and to initiate development of predictive biomarkers for RFI. RESULTS: We identified 1972 differentially expressed genes (DEGs) (q ≤ 0.15) between the low (n = 15) and high (n = 16) RFI groups of animals by using RNA sequencing technology. We validated 24 of 37 selected DEGs by reverse transcription-quantitative PCR (RT-qPCR) in a joint analysis of 24 (12 per line) of the 31 samples already used for RNA-seq plus 24 (12 per line) novel samples from the same contemporary group of pigs. Using an analysis of the 24 novel samples alone, only nine of the 37 selected DEGs were validated. Genes involved in small molecule biosynthetic process, antigen processing and presentation of peptide antigen via major histocompatibility complex (MHC) class I, and steroid biosynthetic process were overrepresented among DEGs that had higher expression in the low versus high RFI animals. Genes known to function in the proteasome complex or mitochondrion were also significantly enriched among genes with higher expression in the low versus high RFI animals.
Alternatively, genes involved in signal transduction, bone mineralization and regulation of phosphorylation were overrepresented among DEGs with lower expression in the low versus high RFI animals. The DEGs significantly overlapped with genes associated with disease, including hyperphagia, eating disorders and mitochondrial diseases (q < 1E-05). A weighted gene co-expression network analysis (WGCNA) identified four co-expression modules that were differentially expressed between the low and high RFI groups. Genes involved in lipid metabolism, regulation of bone mineralization, cellular immunity and response to stimulus were overrepresented within the two modules that were most significantly differentially expressed between the low and high RFI groups. We also found five of the DEGs and one of the co-expression modules were significantly associated with the RFI phenotype of individual animals (q < 0.05). CONCLUSIONS: The post-weaning blood transcriptome was clearly different between the low and high RFI groups. The identified DEGs suggested potential differences in mitochondrial and proteasomal activities, small molecule biosynthetic process, and signal transduction between the two RFI groups and provided potential new insights into the molecular basis of RFI in pigs, although the observed relationship between the post-weaning blood gene expression and RFI phenotype measured during the grow-finish phase was not strong. DEGs and representative genes in co-expression modules that were associated with RFI phenotype provide a preliminary list for developing predictive biomarkers for RFI in pigs.

Project description: BACKGROUND: It is unclear whether improving feed efficiency by selection for low residual feed intake (RFI) compromises pigs' immunocompetence. Here, we aimed at investigating whether pig lines divergently selected for RFI had different inflammatory responses to lipopolysaccharide (LPS) exposure, with regard to clinical presentations and transcriptomic changes in peripheral blood cells. RESULTS: LPS injection induced acute systemic inflammation in both the low-RFI and high-RFI line (n = 8 per line). At 4 h post injection (hpi), the low-RFI line had a significantly lower (p = 0.0075) mean rectal temperature compared to the high-RFI line. However, no significant differences in complete blood count or levels of several plasma cytokines were detected between the two lines. Profiling blood transcriptomes at 0, 2, 6, and 24 hpi by RNA-sequencing revealed that LPS induced dramatic transcriptional changes, with 6296 genes differentially expressed at at least one time point post injection relative to baseline in at least one line (n = 4 per line) (|log2(fold change)| ≥ log2(1.2); q < 0.05). Furthermore, applying the same cutoffs, we detected 334 genes differentially expressed between the two lines at at least one time point, including 33 genes differentially expressed between the two lines at baseline. But no significant line-by-time interaction effects were detected. Genes involved in protein translation, defense response, immune response, and signaling were enriched in different co-expression clusters of genes responsive to LPS stimulation. The two lines were largely similar in their peripheral blood transcriptomic responses to LPS stimulation at the pathway level, although the low-RFI line had a slightly lower level of inflammatory response than the high-RFI line from 2 to 6 hpi and a slightly higher level of inflammatory response than the high-RFI line at 24 hpi.
CONCLUSIONS: The pig lines divergently selected for RFI had a largely similar response to LPS stimulation. However, the low-RFI line had a relatively lower-level, but longer-lasting, inflammatory response compared to the high-RFI line. Our results suggest selection for feed efficient pigs does not significantly compromise a pig's acute systemic inflammatory response to LPS, although slight differences in intensity and duration may occur.
