Dataset Information

findMySequence: a neural-network-based approach for identification of unknown proteins in X-ray crystallography and cryo-EM.


ABSTRACT: Although experimental protein-structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well characterized protein or appear as a contaminant. Regardless of the source of the problem, the unknown protein always requires characterization. Here, an automated pipeline is presented for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. The method's application to characterize the crystal structure of an unknown protein purified from a snake venom is presented. It is also shown that the approach can be successfully applied to the identification of protein sequences and validation of sequence assignments in cryo-EM protein structures.

SUBMITTER: Chojnowski G 

PROVIDER: S-EPMC8733886 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC8589417 | biostudies-literature
| S-EPMC5138254 | biostudies-literature
| S-EPMC8733892 | biostudies-literature
| S-EPMC2695944 | biostudies-literature
| S-EPMC6468305 | biostudies-literature
| S-EPMC4125826 | biostudies-literature
| S-EPMC6096492 | biostudies-literature
| S-EPMC8595878 | biostudies-literature
| S-EPMC7900225 | biostudies-literature
| S-EPMC5192981 | biostudies-literature