
Dataset Information


Identifying foldable regions in protein sequence from the hydrophobic signal.


ABSTRACT: Structural genomics initiatives aim to elucidate representative 3D structures for the majority of protein families over the next decade, but many obstacles must be overcome. The correct design of constructs is extremely important since many proteins will be too large or contain unstructured regions and will not be amenable to crystallization. It is therefore essential to identify regions in protein sequences that are likely to be suitable for structural study. Scooby-Domain is a fast and simple method to identify globular domains in protein sequences. Domains are compact units of protein structure and their correct delineation will aid structural elucidation through a divide-and-conquer approach. Scooby-Domain predictions are based on the observed lengths and hydrophobicities of domains from proteins with known tertiary structure. The prediction method employs an A*-search to identify sequence regions that form a globular structure and those that are unstructured. On a test set of 173 proteins with consensus CATH and SCOP domain definitions, Scooby-Domain has a sensitivity of 50% and an accuracy of 29%, which is better than current state-of-the-art methods. The method does not rely on homology searches and, therefore, can identify previously unknown domains.
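The abstract states that predictions are based on the observed lengths and hydrophobicities of domains. As a rough illustration of the hydrophobic signal only, the sketch below averages hydropathy over sliding windows of a sequence. It is not the Scooby-Domain implementation: the Kyte-Doolittle scale, the function name `window_hydrophobicity`, and the example sequence are assumptions made for illustration, and the A*-search step is not shown.

```python
# Kyte-Doolittle hydropathy values per residue.
# Assumption: the abstract does not say which hydrophobicity scale is used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydrophobicity(sequence, window):
    """Return the mean hydropathy of every window of length `window`.

    Hypothetical helper: element i of the result covers residues
    i .. i + window - 1. Non-standard residue codes raise KeyError.
    """
    if window < 1 or window > len(sequence):
        return []
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Made-up sequence, used only to show the call.
    example = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    for start, mean in enumerate(window_hydrophobicity(example, 9)):
        print(start, round(mean, 2))
```

Repeating the calculation for a range of window lengths gives one average per candidate segment, which is the kind of length-plus-hydrophobicity pair the abstract describes comparing against domains of known structure.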

SUBMITTER: Pang CN 

PROVIDER: S-EPMC2241846 | biostudies-literature | 2008 Feb

REPOSITORIES: biostudies-literature


Publications

Identifying foldable regions in protein sequence from the hydrophobic signal.

Pang CN, Lin K, Wouters MA, Heringa J, George RA

Nucleic Acids Research, 2007-12-01, issue 2



Similar Datasets

| S-EPMC8018078 | biostudies-literature
| S-EPMC3812050 | biostudies-literature
| S-EPMC5278394 | biostudies-literature
| S-EPMC2912341 | biostudies-literature
| S-EPMC2788375 | biostudies-literature
| S-EPMC2919199 | biostudies-literature
2006-03-21 | GSE3810 | GEO
| S-EPMC2447787 | biostudies-literature
| S-EPMC8265191 | biostudies-literature
| S-EPMC9272798 | biostudies-literature