Dataset Information

Genome-based peptide fingerprint scanning.

ABSTRACT: We have implemented a method that identifies the genomic origins of sample proteins by scanning their peptide-mass fingerprint against the theoretical translation and proteolytic digest of an entire genome. Unlike previously reported techniques, this method requires no predefined ORF or protein annotations. Fixed-size windows along the genome sequence are scored by an equation accounting for the number of matching peptides, the number of missed enzymatic cleavages in each peptide, the number of in-frame stop codons within a window, the adjacency between peptides, and duplicate peptide matches. Statistical significance of matching regions is assessed by comparing their scores to scores from windows matching randomly generated mass data. Tests with samples from Saccharomyces cerevisiae mitochondria and Escherichia coli have demonstrated the ability to produce statistically significant identifications, agreeing with two commonly used programs, PeptIdent and Mascot, in 86% of samples analyzed. This genome fingerprint scanning method has the potential to aid in genome annotation, identify proteins for which annotation is incorrect or missing, and handle cases where sequencing errors have caused framing mistakes in the databases. It might also aid in the identification of proteins in which recoding events such as frameshifting or stop-codon read-through have occurred, elucidating alternative translation mechanisms. The prototype is implemented as a client-server pair, allowing the distribution, among a set of cluster nodes, of a single or multiple genomes for concurrent analysis.
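
The window-scanning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The scoring rule here (one point per matched mass, scaled down by missed cleavages, minus a penalty per in-frame stop codon) is a hypothetical stand-in for the published equation, and it ignores peptide adjacency. All function names, the 0.5 Da tolerance, the uniform random-mass model, and the use of trypsin with the standard genetic code on the forward strand only are assumptions made for illustration.

```python
import random

# Standard genetic code in TCAG order (assumption: yeast mitochondria
# actually use a variant code, which this sketch does not model).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {a + b + c: AMINO[16 * i + 4 * j + k]
         for i, a in enumerate(BASES)
         for j, b in enumerate(BASES)
         for k, c in enumerate(BASES)}

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def translate(dna):
    """Translate frame 0 of an A/C/G/T string; '*' marks a stop codon."""
    return "".join(CODON[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def digest(segment, max_missed=1):
    """Yield (peptide, missed_cleavages); cut after K or R, not before P."""
    cuts = [0] + [i + 1 for i in range(len(segment) - 1)
                  if segment[i] in "KR" and segment[i + 1] != "P"]
    cuts.append(len(segment))
    for i in range(len(cuts) - 1):
        for missed in range(max_missed + 1):
            if i + missed + 1 < len(cuts):
                yield segment[cuts[i]:cuts[i + missed + 1]], missed


def score_window(dna, masses, tol=0.5, stop_penalty=1.0):
    """Hypothetical score: matched masses, discounted by missed cleavages,
    minus a penalty per in-frame stop. Each observed mass counts once."""
    protein = translate(dna)
    score = -stop_penalty * protein.count("*")
    used = set()
    for segment in filter(None, protein.split("*")):
        for peptide, missed in digest(segment):
            mass = peptide_mass(peptide)
            for idx, observed in enumerate(masses):
                if idx not in used and abs(observed - mass) <= tol:
                    used.add(idx)
                    score += 1.0 / (1 + missed)
                    break
    return score


def scan(genome, masses, window, step=3):
    """Score fixed-size windows; step=1 would cover all forward frames."""
    return [(start, score_window(genome[start:start + window], masses))
            for start in range(0, len(genome) - window + 1, step)]


def empirical_p(genome, masses, window, step=3, trials=200,
                lo=500.0, hi=3000.0, seed=0):
    """Fraction of random mass lists whose best window scores at least
    as high as the best window for the observed masses."""
    rng = random.Random(seed)
    observed = max(score for _, score in scan(genome, masses, window, step))
    hits = 0
    for _ in range(trials):
        fake = [rng.uniform(lo, hi) for _ in masses]
        if max(s for _, s in scan(genome, fake, window, step)) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)
```

In this sketch, calling scan on a genome string with a list of measured peptide masses returns a (start, score) pair per window, and empirical_p estimates how often random mass lists reach the same best score. A fuller version would also scan the reverse strand and add the adjacency term the abstract mentions.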

SUBMITTER: Giddings MC

PROVIDER: S-EPMC140871 | biostudies-literature | 2003 Jan

REPOSITORIES: biostudies-literature

Publications

Genome-based peptide fingerprint scanning.

Michael C. Giddings, Atul A. Shah, Ray Gesteland, Barry Moore

Proceedings of the National Academy of Sciences of the United States of America, 2002-12-23, issue 1

PMID: 12518051

Similar Datasets

Project description: Background: No attention has been paid on comparing a set of genome sequences crossing genetic components and biological categories with far divergence over large size range. We define it as the systematic comparative genomics and aim to develop the methodology. Results: First, we create a method, GenomeFingerprinter, to unambiguously produce a set of three-dimensional coordinates from a sequence, followed by one three-dimensional plot and six two-dimensional trajectory projections, to illustrate the genome fingerprint of a given genome sequence. Second, we develop a set of concepts and tools, and thereby establish a method called the universal genome fingerprint analysis (UGFA). Particularly, we define the total genetic component configuration (TGCC) (including chromosome, plasmid, and phage) for describing a strain as a systematic unit, the universal genome fingerprint map (UGFM) of TGCC for differentiating strains as a universal system, and the systematic comparative genomics (SCG) for comparing a set of genomes crossing genetic components and biological categories. Third, we construct a method of quantitative analysis to compare two genomes by using the outcome dataset of genome fingerprint analysis. Specifically, we define the geometric center and its geometric mean for a given genome fingerprint map, followed by the Euclidean distance, the differentiate rate, and the weighted differentiate rate to quantitatively describe the difference between two genomes of comparison. Moreover, we demonstrate the applications through case studies on various genome sequences, giving tremendous insights into the critical issues in microbial genomics and taxonomy. Conclusions: We have created a method, GenomeFingerprinter, for rapidly computing, geometrically visualizing, intuitively comparing a set of genomes at genome fingerprint level, and hence established a method called the universal genome fingerprint analysis, as well as developed a method of quantitative analysis of the outcome dataset. These have set up the methodology of systematic comparative genomics based on the genome fingerprint analysis.

Project description: Background information: Prostate cancer (PCa) is a common disease but only a small subset of patients are at risk of developing metastasis and lethal disease, and identifying which patients will progress is challenging because of the heterogeneity underlying tumour progression. Understanding this heterogeneity at the molecular level and the resulting clinical impact is a critical step necessary for risk stratification. Defining genomic fingerprint elucidates molecular variation and may improve PCa risk stratification, providing more accurate prognostic information of tumour aggressiveness (or lethality) for prognostic biomarker development. Therefore, we explored transcriptomic differences between patients with indolent disease outcome and patients who developed metastasis post-radical prostatectomy using genome-wide expression data in the post radical prostatectomy clinical space before metastatic spread. Results: Based on differential expression analysis, patients with adverse pathological findings who are at higher risk of developing metastasis have a distinct transcriptomic fingerprint that can be detected on surgically removed prostate specimens several years before metastasis detection. Nearly half of the transcriptomic fingerprint features were non-coding RNA highlighting their pivotal role in PCa progression. Protein-coding RNA features in the fingerprint are involved in multiple pathways including cell cycle, chromosome structure maintenance and cytoskeleton organisation. The metastatic transcriptomic fingerprint was determined in independent cohorts verifying the association between the fingerprint and metastatic patients. Further, the fingerprint was confirmed in metastasis lesions demonstrating that the fingerprint represents early metastatic transcriptomic changes, suggesting its utility as a prognostic tool to predict metastasis and provide clinical value in the early radical prostatectomy setting. Conclusions: Here, we show that transcriptomic patterns of metastatic PCa exist that can be detected early after radical prostatectomy. This metastatic fingerprint has potential prognostic ability that can impact PCa treatment management potentially circumventing the requirements for unnecessary therapies.

Project description: A substantial proportion of protein interactions relies on small domains binding to short peptides in the partner proteins. Many of these interactions are relatively low affinity and transient, and they impact on signal transduction. However, neither the number of potential interactions mediated by each domain nor the degree of promiscuity at a whole proteome level has been investigated. We have used a combination of phage display and SPOT synthesis to discover all the peptides in the yeast proteome that have the potential to bind to eight SH3 domains. We first identified the peptides that match a relaxed consensus, as deduced from peptides selected by phage display experiments. Next, we synthesized all the matching peptides at high density on a cellulose membrane, and we probed them directly with the SH3 domains. The domains that we have studied were grouped by this approach into five classes with partially overlapping specificity. Within the classes, however, the domains display a high promiscuity and bind to a large number of common targets with comparable affinity. We estimate that the yeast proteome contains as few as six peptides that bind to the Abp1 SH3 domain with a dissociation constant lower than 100 microM, while it contains as many as 50-80 peptides with corresponding affinity for the SH3 domain of Yfr024c. All the targets of the Abp1 SH3 domain, identified by this approach, bind to the native protein in vivo, as shown by coimmunoprecipitation experiments. Finally, we demonstrate that this strategy can be extended to the analysis of the entire human proteome. We have developed an approach, named WISE (whole interactome scanning experiment), that permits rapid and reliable identification of the partners of any peptide recognition module by peptide scanning of a proteome. Since the SPOT synthesis approach is semiquantitative and provides an approximation of the dissociation constants of the several thousands of interactions that are simultaneously analyzed in an array format, the likelihood of each interaction occurring in any given physiological settings can be evaluated. WISE can be easily extended to a variety of protein interaction domains, including those binding to modified peptides, thereby offering a powerful proteomic tool to help completing a full description of the cell interactome.
