Dataset Information

Identification of Predictor Genes for Feed Efficiency in Beef Cattle by Applying Machine Learning Methods to Multi-Tissue Transcriptome Data.

ABSTRACT: Machine learning (ML) methods have shown promising results in identifying genes when applied to large transcriptome datasets. However, no attempt has been made to compare the performance of combining different ML methods together in the prediction of high feed efficiency (HFE) and low feed efficiency (LFE) animals. In this study, using RNA sequencing data of five tissues (adrenal gland, hypothalamus, liver, skeletal muscle, and pituitary) from nine HFE and nine LFE Nellore bulls, we evaluated the prediction accuracies of five analytical methods in classifying FE animals. These included two conventional methods for differential gene expression (DGE) analysis (t-test and edgeR) as benchmarks, and three ML methods: Random Forests (RFs), Extreme Gradient Boosting (XGBoost), and combination of both RF and XGBoost (RX). Utility of a subset of candidate genes selected from each method for classification of FE animals was assessed by support vector machine (SVM). Among all methods, the smallest subsets of genes (117) identified by RX outperformed those chosen by t-test, edgeR, RF, or XGBoost in classification accuracy of animals. Gene co-expression network analysis confirmed the interactivity existing among these genes and their relevance within the network related to their prediction ranking based on ML. The results demonstrate a great potential for applying a combination of ML methods to large transcriptome datasets to identify biologically important genes for accurately classifying FE animals.
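
As a rough illustration of the t-test benchmark mentioned in the abstract, the sketch below ranks genes by the absolute value of Welch's t-statistic between HFE and LFE animals and keeps the top few. It is not the authors' pipeline: the function names (`welch_t`, `rank_genes`), the toy expression values, and the choice of ranking by |t| rather than by p-value are all assumptions made for the example. Only the group sizes (nine HFE, nine LFE) come from the abstract.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr, labels, top_k):
    """Return the top_k gene names ordered by |t| between HFE and LFE animals.

    expr   -- dict mapping gene name -> list of expression values, one per animal
    labels -- list of "HFE"/"LFE" labels in the same animal order
    """
    scores = {}
    for gene, values in expr.items():
        hfe = [v for v, lab in zip(values, labels) if lab == "HFE"]
        lfe = [v for v, lab in zip(values, labels) if lab == "LFE"]
        scores[gene] = abs(welch_t(hfe, lfe))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: nine HFE and nine LFE animals, three made-up genes.
labels = ["HFE"] * 9 + ["LFE"] * 9
expr = {
    # large difference between groups
    "gene_a": [8, 9, 10, 9, 8, 9, 10, 9, 8] + [2, 3, 4, 3, 2, 3, 4, 3, 2],
    # identical groups, so the t-statistic is zero
    "gene_b": [5, 6, 5, 6, 5, 6, 5, 6, 5] + [5, 6, 5, 6, 5, 6, 5, 6, 5],
    # small difference between groups
    "gene_c": [5, 6, 7, 5, 6, 7, 5, 6, 7] + [6, 7, 8, 6, 7, 8, 6, 7, 8],
}

top_genes = rank_genes(expr, labels, top_k=2)
print(top_genes)
```

In a workflow like the one the abstract describes, a subset selected this way would then be passed to a classifier such as an SVM to assess how well it separates HFE from LFE animals; that step is omitted here.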

SUBMITTER: Chen W

PROVIDER: S-EPMC7921797 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

Identification of Predictor Genes for Feed Efficiency in Beef Cattle by Applying Machine Learning Methods to Multi-Tissue Transcriptome Data.

Weihao Chen, Pâmela A. Alexandre, Gabriela Ribeiro, Heidge Fukumasu, Wei Sun, Antonio Reverter, Yutao Li

Frontiers in Genetics, 2021-02-16

PMID: 33664767

Similar Datasets

Project description:

Background: General, breed-dependent, and diet-dependent associations between feed efficiency in beef cattle and single nucleotide polymorphisms (SNPs) or haplotypes were identified in a population of 1321 steers using a 50K SNP panel. Genomic associations with traditional two-step indicators of feed efficiency - residual feed intake (RFI), residual average daily gain (RADG), and residual intake gain (RIG) - were compared to associations with two complementary one-step indicators of feed efficiency: efficiency of intake (EI) and efficiency of gain (EG). Associations uncovered in a training data set were evaluated on an independent validation data set. A multi-SNP model was developed to predict feed efficiency. Functional analysis of genes harboring SNPs significantly associated with feed efficiency, together with network visualization, aided in the interpretation of the results.

Results: For the five feed efficiency indicators, the numbers of general, breed-dependent, and diet-dependent associations with SNPs (P-value < 0.0001) were 31, 40, and 25, and with haplotypes were six, ten, and nine, respectively. Of these, 20 SNP and six haplotype associations overlapped between RFI and EI, and five SNP and one haplotype associations overlapped between RADG and EG. This result confirms the complementary value of the one-step and two-step indicators. The multi-SNP models included 89 SNPs and offered a precise prediction of the five feed efficiency indicators. The associations of 17 SNPs and 7 haplotypes with feed efficiency were confirmed on the validation data set. Nine clusters of Gene Ontology and KEGG pathway categories (mean P-value < 0.001), including nucleotide binding, ion transport, phosphorous metabolic process, and the MAPK signaling pathway, were overrepresented among the genes harboring the SNPs associated with feed efficiency.

Conclusions: The general SNP associations suggest that a single panel of genomic variants can be used regardless of breed and diet. The breed- and diet-dependent associations between SNPs and feed efficiency suggest that further refinement of variant panels requires consideration of breed and management practices. The unique genomic variants associated with the one- and two-step indicators suggest that both types of indicators offer complementary descriptions of feed efficiency that can be exploited for genome-enabled selection purposes.

Project description: The objective of this study was to develop and validate a customized, cost-effective single nucleotide polymorphism (SNP) panel for genetic improvement of feed efficiency in beef cattle. The SNPs identified in previous association studies and through extensive analysis of candidate genomic regions and genes were screened for their functional impact and allele frequency in Angus and Hereford breeds used as validation candidates for the panel. Association analyses were performed on genotypes of 159 SNPs from new samples of Angus (n = 160), Hereford (n = 329), and Angus-Hereford crossbred (n = 382) cattle using allele substitution and genotypic models in ASReml. Genomic heritabilities were estimated for feed efficiency traits using the full set of SNPs, SNPs associated with at least one of the traits (at P ≤ 0.05 and P < 0.10), as well as the Illumina bovine 50K panel, representing a widely used commercial genotyping panel. A total of 63 SNPs within 43 genes showed association (P ≤ 0.05) with at least one trait. The minor alleles of SNPs located in the GHR and CAST genes were associated with decreasing effects on residual feed intake (RFI) and/or RFI adjusted for backfat (RFIf), whereas minor alleles of SNPs within the MKI67 gene were associated with increasing effects on RFI and RFIf. Additionally, the minor allele of the rs137400016 SNP within CNTFR was associated with increasing average daily gain (ADG). The SNP genotypes within the UMPS, SMARCAL, CCSER1, and LMCD1 genes showed significant over-dominance effects, whereas other SNPs located in the SMARCAL1, ANXA2, CACNA1G, and PHYHIPL genes showed additive effects on RFI and RFIf. Gene enrichment analysis indicated that gland development, as well as ion and cation transport, are important physiological mechanisms contributing to variation in feed efficiency traits. The study revealed the effect of the Jak-STAT signaling pathway on feed efficiency through the CNTFR, OSMR, and GHR genes. Genomic heritability using the 63 significant (P ≤ 0.05) SNPs was 0.09, 0.09, 0.13, 0.05, 0.05, and 0.07 for ADG, dry matter intake, midpoint metabolic weight, RFI, RFIf, and backfat, respectively. These SNPs contributed to genetic variation in the studied traits and thus can potentially be used or tested to generate cost-effective molecular breeding values for feed efficiency in beef cattle.
