Dataset Information

The importance of definitions in the study of polyQ regions: A tale of thresholds, impurities and sequence context.

ABSTRACT: Polyglutamine (polyQ) regions are among the most prevalent homorepeats in eukaryotes. It is, however, difficult to evaluate their prevalence because different studies report different results. The reason is the lack of a consensus definition of what a polyQ region is. We have tackled this issue by studying how the use of different thresholds (i.e., the minimum number of glutamines required in a protein region of a given size) to detect polyQ regions in the human proteome influences not only their prevalence but also their general features and sequence context. The threshold definition shapes the length distribution of the polyQ dataset and changes the observed number and position of impurities (amino acids other than glutamine) within polyQ regions. Irrespective of the chosen threshold, leucine and proline residues are enriched both within and around polyQ regions. While leucine is enriched at the N-terminus of polyQ regions, especially at position -1 (the amino acid preceding the polyQ), proline is prevalent at the C-terminus (positions +1 to +5, that is, the first five amino acids after the polyQ). We also checked the suitability of these thresholds for other species and compared their polyQ features with those found in humans. As the sequence context and features of polyQ regions are threshold-dependent, we propose a method to quickly scan the polyQ landscape of a proteome. We complement our results with a summary of the biases to be expected per threshold when studying polyQ regions.
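
To make the threshold idea in the abstract concrete, the following is a minimal sketch of a window-based polyQ detector: a region is reported wherever a window of a given size contains at least a minimum number of glutamines. It is not the authors' implementation. The function name find_polyq_regions, the merging of overlapping windows, and the trimming of non-glutamine residues from the flanks are all assumptions made for illustration, as are the example sequences.

```python
def find_polyq_regions(seq, min_q, window):
    """Return half-open (start, end) spans of candidate polyQ regions.

    Hypothetical helper: a window of length `window` qualifies when it
    holds at least `min_q` glutamines (Q). Overlapping or adjacent
    qualifying windows are merged, and flanking non-Q residues are
    trimmed. Both choices are assumptions, not the published method.
    """
    if min_q < 1 or window < min_q:
        raise ValueError("need 1 <= min_q <= window")

    merged = []
    for i in range(len(seq) - window + 1):
        if seq[i:i + window].count("Q") >= min_q:
            if merged and i <= merged[-1][1]:
                # Extend the previous region with this window.
                merged[-1] = (merged[-1][0], i + window)
            else:
                merged.append((i, i + window))

    regions = []
    for start, end in merged:
        # Each merged span contains at least one Q, so both loops stop.
        while seq[start] != "Q":
            start += 1
        while seq[end - 1] != "Q":
            end -= 1
        regions.append((start, end))
    return regions


# Made-up sequence with one impurity (H) inside the glutamine stretch.
example = "AQQQHQQQA"
lenient = find_polyq_regions(example, min_q=4, window=6)
strict = find_polyq_regions(example, min_q=6, window=6)
```

With the lenient threshold (4 glutamines in a window of 6) the example yields one region that includes the histidine impurity; with the strict threshold (6 in 6) it yields none, which illustrates how the chosen threshold changes both the number of regions found and the impurities observed within them.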

SUBMITTER: Mier P

PROVIDER: S-EPMC7016039 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature

Publications

The importance of definitions in the study of polyQ regions: A tale of thresholds, impurities and sequence context.

Pablo Mier, Carlos Elena-Real, Annika Urbanek, Pau Bernadó, Miguel A. Andrade-Navarro

Computational and Structural Biotechnology Journal, 2020-02-04

PMID: 32071707

Similar Datasets

The Protein Structure Context of PolyQ Regions.
S-EPMC5268486 | biostudies-literature

Sequence Context Influences the Structure and Aggregation Behavior of a PolyQ Tract.
S-EPMC4900447 | biostudies-literature

The sequence context in poly-alanine regions: structure, function and conservation.
S-EPMC9620824 | biostudies-literature

Deep learning model of somatic hypermutation reveals importance of sequence context beyond hotspot targeting.
S-EPMC8749460 | biostudies-literature

SEQATOMS: a web tool for identifying missing regions in PDB in sequence context.
S-EPMC2447787 | biostudies-literature

The importance of making testable predictions: A cautionary tale.
S-EPMC7723288 | biostudies-literature

annotatr: genomic regions in context.
S-EPMC5860117 | biostudies-other

Context Influences on TALE-DNA Binding Revealed by Quantitative Profiling
2015-06-11 | GSE56978 | GEO

Context influences on TALE-DNA binding revealed by quantitative profiling.
S-EPMC4467457 | biostudies-literature

Prioritizing non-coding regions based on human genomic constraint and sequence context with deep learning.
S-EPMC7940646 | biostudies-literature