Dataset Information

In search for more accurate alignments in the twilight zone.

ABSTRACT: A major bottleneck in comparative modeling is the alignment quality; this is especially true for proteins whose distant relationships could be reliably recognized only by recent advances in fold recognition. The best algorithms excel in recognizing distant homologs but often produce incorrect alignments for over 50% of protein pairs in large fold-prediction benchmarks. The alignments obtained by sequence-sequence or sequence-structure matching algorithms differ significantly from the structural alignments. To study this problem, we developed a simplified method to explicitly enumerate all possible alignments for a pair of proteins. This allowed us to estimate the number of significantly different alignments for a given scoring method that score better than the structural alignment. Using several examples of distantly related proteins, we show that for standard sequence-sequence alignment methods, the number of significantly different alignments is usually large, often about 10^10 alternatives. This number decreases when the alignment method is improved, but it is still too large for the brute force enumeration approach. More effective strategies were needed, so we evaluated and compared two well-known approaches for searching the space of suboptimal alignments. We combined their best features and produced a hybrid method, which yielded alignments that surpassed the original alignments for about 50% of protein pairs with minimal computational effort.
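
To give a feel for why brute force enumeration breaks down, the sketch below counts every gapped global alignment of two sequences of lengths m and n. It is not the authors' simplified enumeration method, and it does not reproduce their count of significantly different alignments that outscore the structural alignment. It only assumes the textbook model in which each alignment column is a residue pair, a gap in the first sequence, or a gap in the second. The function name count_global_alignments is hypothetical.

```python
def count_global_alignments(m: int, n: int) -> int:
    """Count all gapped global alignments of sequences of lengths m and n.

    Assumption: each column is either a residue pair, a gap in the first
    sequence, or a gap in the second, which gives the recurrence
    D(i, j) = D(i-1, j) + D(i, j-1) + D(i-1, j-1), with D(i, 0) = D(0, j) = 1.
    """
    # Row for i = 0: aligning an empty prefix against j residues has one way.
    prev = [1] * (n + 1)
    for _ in range(m):
        curr = [1]  # D(i, 0) = 1: only gaps in the second sequence
        for j in range(1, n + 1):
            curr.append(prev[j] + curr[j - 1] + prev[j - 1])
        prev = curr
    return prev[n]


if __name__ == "__main__":
    # Print the size of the raw alignment space for a few sequence lengths.
    for length in (5, 10, 20, 50):
        print(length, count_global_alignments(length, length))
```

The total grows exponentially with sequence length, so even after restricting attention to significantly different alignments, an exhaustive search is only practical for short or heavily constrained cases.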

SUBMITTER: Jaroszewski L

PROVIDER: S-EPMC2373660 | biostudies-literature | 2002 Jul

REPOSITORIES: biostudies-literature

Publications

In search for more accurate alignments in the twilight zone.

Jaroszewski Lukasz, Li Weizhong, Godzik Adam

Protein Science: a publication of the Protein Society, 2002 Jul 1, issue 7

PMID: 12070323

Similar Datasets

Project description: Background: Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results: SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion: The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that the SCPRED's predictions can be successfully used as a post-processing filter to improve performance of modern fold classification methods.

Project description: What happens in the brain when conscious awareness of the surrounding world fades? We manipulated consciousness in two experiments in a group of healthy males and measured brain activity with positron emission tomography. Measurements were made during wakefulness, escalating and constant levels of two anesthetic agents (experiment 1, n = 39), and during sleep-deprived wakefulness and non-rapid eye movement sleep (experiment 2, n = 37). In experiment 1, the subjects were randomized to receive either propofol or dexmedetomidine until unresponsiveness. In both experiments, forced awakenings were applied to achieve rapid recovery from an unresponsive to a responsive state, followed by immediate and detailed interviews of subjective experiences during the preceding unresponsive condition. Unresponsiveness rarely denoted unconsciousness, as the majority of the subjects had internally generated experiences. Unresponsive anesthetic states and verified sleep stages, where a subsequent report of mental content included no signs of awareness of the surrounding world, indicated a disconnected state. Functional brain imaging comparing responsive and connected versus unresponsive and disconnected states of consciousness during constant anesthetic exposure revealed that activity of the thalamus, cingulate cortices, and angular gyri are fundamental for human consciousness. These brain structures were affected independent from the pharmacologic agent, drug concentration, and direction of change in the state of consciousness. Analogous findings were obtained when consciousness was regulated by physiological sleep. State-specific findings were distinct and separable from the overall effects of the interventions, which included widespread depression of brain activity across cortical areas. These findings identify a central core brain network critical for human consciousness.
SIGNIFICANCE STATEMENT: Trying to understand the biological basis of human consciousness is currently one of the greatest challenges of neuroscience. While the loss and return of consciousness regulated by anesthetic drugs and physiological sleep are used as model systems in experimental studies on consciousness, previous research results have been confounded by drug effects, by confusing behavioral "unresponsiveness" and internally generated consciousness, and by comparing brain activity levels across states that differ in several other respects than only consciousness. Here, we present carefully designed studies that overcome many previous confounders and for the first time reveal the neural mechanisms underlying human consciousness and its disconnection from behavioral responsiveness, both during anesthesia and during normal sleep, and in the same study subjects.
