Dataset Information

The flax genome reveals orbitide diversity.


ABSTRACT:

Background

Ribosomally synthesized cyclic peptides are widely found in plants and exhibit bioactivities useful to humans. The growing number of sequenced genomes facilitates the identification of cyclic peptide sequences and their precursor proteins. Previous research has largely focused on the chemical diversity of these peptides across various species, while little attention has been paid to the broader range of potential peptides that have not been chemically identified.

Results

A pioneering study was initiated to explore the genetic diversity of linusorbs, a group of cyclic peptides uniquely occurring in cultivated flax (Linum usitatissimum). Phylogenetic analysis clustered the 5 known linusorb precursor proteins into two clades and one singleton. A preliminary tBLASTn search of the published flax genome, using a whole precursor protein sequence as the query, retrieved only homologues within the same clade as the query. This limitation was overcome using a profile-based mining strategy. After genome reannotation, a hidden Markov model (HMM)-based approach identified 58 repeats homologous to the linusorb-embedded repeats in 8 novel proteins, implying that they share common ancestry with the linusorb-embedded repeats. Subsequently, we developed a customized profile composed of a random linusorb-like domain (LLD) flanked by 5 conserved sites and used it for a string search of the proteome, which extracted 281 LLD-containing repeats (LLDRs) in 25 proteins. Comparative analysis of the different repeat categories suggested that the 5 conserved flanking sites shared by the non-homologous repeats arose through convergent evolution driven by functional selection.
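The string-search step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the conserved residues or the length of the linusorb-like domain, so the flanks (`GD`, `FGK`), the core length range, the toy sequences, and the function name `find_repeats` are all hypothetical placeholders chosen only to show how a variable core flanked by fixed sites might be matched across a proteome.

```python
import re
from collections import defaultdict

# Hypothetical profile: 2 conserved residues, a variable core of 8-10
# residues, then 3 conserved residues (5 fixed sites in total).
# These values are placeholders, NOT the flanking sites from the study.
LEFT_FLANK = "GD"
RIGHT_FLANK = "FGK"
CORE_MIN, CORE_MAX = 8, 10

# The lookahead lets the scan report every start position, so closely
# spaced or overlapping repeats are not skipped.
PROFILE = re.compile(
    rf"(?=({LEFT_FLANK}([A-Z]{{{CORE_MIN},{CORE_MAX}}}?){RIGHT_FLANK}))"
)


def find_repeats(proteome):
    """Return {protein_id: [(start, core), ...]} for every profile hit."""
    hits = defaultdict(list)
    for protein_id, sequence in proteome.items():
        for match in PROFILE.finditer(sequence):
            # group(1) is the whole repeat, group(2) the variable core.
            hits[protein_id].append((match.start(1), match.group(2)))
    return dict(hits)


# Toy proteome, invented for illustration only.
toy_proteome = {
    "protA": "MKGDMLIPPFFVIFGKSSGDILVPPFFLIFGKTT",
    "protB": "MSTNPLLKRAAQW",
}

hits = find_repeats(toy_proteome)
for protein_id, repeats in hits.items():
    print(protein_id, len(repeats), repeats)
```

Counting the hits per protein gives the kind of repeat tally reported in the abstract (repeats per protein, proteins with at least one repeat). A real analysis would read the proteome from a FASTA file and use the conserved sites derived from the known precursor proteins.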

Conclusions

The profile-based mining approach is suitable for analyzing repetitive sequences. The 25 LLDR proteins identified herein represent the potential diversity of cyclic peptides within the flax genome and lay a foundation for further studies on the functions and evolution of these protein tandem repeats.

SUBMITTER: Song Z 

PROVIDER: S-EPMC9308333 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| PRJNA478805 | ENA
| PRJEB20299 | ENA
| S-EPMC7430851 | biostudies-literature
| S-EPMC4544635 | biostudies-literature
2022-07-13 | GSE207874 | GEO
| S-EPMC8473814 | biostudies-literature
| S-EPMC6598216 | biostudies-literature
| S-EPMC18976 | biostudies-literature
| S-EPMC7567804 | biostudies-literature
| S-EPMC3585000 | biostudies-literature