Dataset Information

Quick and clean: Cracking sentences encoded in E. coli by LC-MS/MS, de novo sequencing, and dictionary search.

ABSTRACT: In this study, we faced the challenge of deciphering a protein that has been designed and expressed by E. coli in such a way that the amino acid sequence encodes two concatenated English sentences. The letters 'O' and 'U' in the sentence are both replaced by 'K' in the protein. The sequence cannot be found online and carried to-be-discovered modifications. With limited information in hand, to solve the challenge, we developed a workflow consisting of bottom-up proteomics, de novo sequencing and a bioinformatics pipeline for data processing and searching for frequently appearing words. We assembled a complete first question: "Have you ever wondered what the most fundamental limitations in life are?" and validated the result by sequence database search against a customized FASTA file. We also searched the spectra against an E. coli proteome database and found close to 600 endogenous, co-purified E. coli proteins and contaminants introduced during sample handling, which made the inference of the sentence very challenging. We conclude that E. coli can express English sentences, and that de novo sequencing combined with clever sequence database search strategies is a promising tool for the identification of uncharacterized proteins.
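The letter substitution and dictionary search described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `encode` and `find_words` and the small word list are hypothetical, and the sketch assumes that de novo peptides are plain one-letter strings and that spaces and punctuation are dropped from the sentence. It also ignores the fact that leucine and isoleucine have identical masses and so cannot be told apart by de novo sequencing.

```python
import re


def encode(text: str) -> str:
    """Map English text to the protein alphabet described in the abstract.

    Assumption: non-letters are dropped; 'O' and 'U' both become 'K'.
    """
    letters = re.sub(r"[^A-Za-z]", "", text).upper()
    return letters.replace("O", "K").replace("U", "K")


def find_words(peptide: str, dictionary: list[str]) -> list[str]:
    """Return the dictionary words whose encoded form occurs in the peptide."""
    return [word for word in dictionary if encode(word) in peptide]


# Hypothetical de novo peptide covering the start of the first sentence.
peptide = encode("Have you ever wondered")
print(peptide)  # HAVEYKKEVERWKNDERED
print(find_words(peptide, ["have", "you", "ever", "wonder", "life"]))
# ['have', 'you', 'ever', 'wonder']
```

Because 'O' and 'U' both collapse to 'K', encoding the dictionary words rather than decoding the peptide avoids having to enumerate every possible reading of each 'K'.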

SUBMITTER: Niu L

PROVIDER: S-EPMC6924291 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature


Publications

Quick and clean: Cracking sentences encoded in E. coli by LC-MS/MS, de novo sequencing, and dictionary search.

Niu Lili, Mann Matthias

EuPA Open Proteomics, 2019 Mar 1


PMID: 31890553

Similar Datasets

Project description:

Background: A safe, effective, and reversible nonhormonal male contraceptive drug is greatly needed for male contraception as well as for circumventing the side effects of female hormonal contraceptives. Phosducin-like 2 (PDCL2) is a testis-specific phosphoprotein in mice and humans. We recently found that male PDCL2 knockout mice are sterile due to globozoospermia caused by impaired sperm head formation, indicating that PDCL2 is a potential target for male contraception. Herein, our study for the first time developed a biophysical assay for PDCL2 allowing us to screen a series of small molecules, to study structure-activity relationships, and to discover two PDCL2 binders with novel chemical structure.

Objective: To identify a PDCL2 ligand for therapeutic male contraception, we performed DNA-encoded chemical library (DECL) screening and off-DNA hit validation using a unique affinity selection mass spectrometry (ASMS) biophysical profiling strategy.

Materials and methods: We employed the screening process of DECL, which contains billions of chemically unique DNA-barcoded compounds generated through individual sequences of reactions and different combinations of functionalized building blocks. The structures of the PDCL2 binders are proposed based on the sequencing analysis of the DNA barcode attached to each individual DECL compound. The proposed structure is synthesized through multistep reactions. To confirm and determine binding affinity between the DECL identified molecules and PDCL2, we developed an ASMS assay that incorporates liquid chromatography with tandem mass spectrometry (LC-MS/MS).

Results: After a screening process of PDCL2 with DECLs containing >440 billion compounds, we identified a series of hits. The selected compounds were synthesized as off-DNA small molecules, characterized by spectroscopy data, and subjected to our ASMS/LC-MS/MS binding assay. By this assay, we discovered two novel compounds, which showed good binding affinity for PDCL2 in comparison to other molecules generated in our laboratory and which were further confirmed by a thermal shift assay.

Discussion and conclusion and relevance: With the ASMS/LC-MS/MS assay developed in this paper, we successfully discovered a PDCL2 ligand that warrants further development as a male contraceptive.
