Dataset Information

Explainability in transformer models for functional genomics.

ABSTRACT: The effectiveness of deep learning methods can be largely attributed to the automated extraction of relevant features from raw data. In the field of functional genomics, this generally concerns the automatic selection of relevant nucleotide motifs from DNA sequences. To benefit from automated learning methods, new strategies are required that unveil the decision-making process of trained models. In this paper, we present a new approach that has been successful in gathering insights on the transcription process in Escherichia coli. This work builds upon a transformer-based neural network framework designed for prokaryotic genome annotation purposes. We find that the majority of subunits (attention heads) of the model are specialized towards identifying transcription factors and are able to successfully characterize both their binding sites and consensus sequences, uncovering both well-known and potentially novel elements involved in the initiation of the transcription process. With the specialization of the attention heads occurring automatically, we believe transformer models to be of high interest towards the creation of explainable neural networks in this field.

SUBMITTER: Clauwaert J

PROVIDER: S-EPMC8425421 | biostudies-literature | 2021 Sep

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

Explainability in transformer models for functional genomics.

Clauwaert Jim J Menschaert Gerben G Waegeman Willem W

Briefings in bioinformatics 20210901 5

The effectiveness of deep learning methods can be largely attributed to the automated extraction of relevant features from raw data. In the field of functional genomics, this generally concerns the automatic selection of relevant nucleotide motifs from DNA sequences. To benefit from automated learning methods, new strategies are required that unveil the decision-making process of trained models. In this paper, we present a new approach that has been successful in gathering insights on the transc ...[more]

PMID: 33834200

Similar Datasets

Project description:Purpose: Explainability and fairness are two key factors for the effective and ethical clinical implementation of deep learning-based machine learning models in healthcare settings. However, there has been limited work on investigating how unfair performance manifests in explainable artificial intelligence (XAI) methods, and how XAI can be used to investigate potential reasons for unfairness. Thus, the aim of this work was to analyze the effects of previously established sociodemographic-related confounders on classifier performance and explainability methods. Approach: A convolutional neural network (CNN) was trained to predict biological sex from T1-weighted brain MRI datasets of 4547 9- to 10-year-old adolescents from the Adolescent Brain Cognitive Development study. Performance disparities of the trained CNN between White and Black subjects were analyzed and saliency maps were generated for each subgroup at the intersection of sex and race. Results: The classification model demonstrated a significant difference in the percentage of correctly classified White male ( 90.3%±1.7% ) and Black male ( 81.1%±4.5% ) children. Conversely, slightly higher performance was found for Black female ( 89.3%±4.8% ) compared with White female ( 86.5%±2.0% ) children. Saliency maps showed subgroup-specific differences, corresponding to brain regions previously associated with pubertal development. In line with this finding, average pubertal development scores of subjects used in this study were significantly different between Black and White females ( p<0.001 ) and males ( p<0.001 ). Conclusions: We demonstrate that a CNN with significantly different sex classification performance between Black and White adolescents can identify different important brain regions when comparing subgroup saliency maps. Importance scores vary substantially between subgroups within brain structures associated with pubertal development, a race-associated confounder for predicting sex. We illustrate that unfair models can produce different XAI results between subgroups and that these results may explain potential reasons for biased performance.

Dataset Information

Explainability in transformer models for functional genomics.

Publications

Explainability in transformer models for functional genomics.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets