Dataset Information

A comparison of computational methods for identifying virulence factors.

ABSTRACT: Bacterial pathogens continue to threaten public health worldwide. Identifying bacterial virulence factors can help to find novel drug/vaccine targets against pathogenicity. It can also help to reveal the mechanisms of the related diseases at the molecular level. With the explosive growth in protein sequences generated in the postgenomic age, it is highly desirable to develop computational methods for rapidly and effectively identifying virulence factors from their sequence information alone. In this study, based on the protein-protein interaction networks from the STRING database, a novel network-based method was proposed for identifying the virulence factors in the proteomes of UPEC 536, UPEC CFT073, P. aeruginosa PAO1, L. pneumophila Philadelphia 1, C. jejuni NCTC 11168 and M. tuberculosis H37Rv. Evaluated on the same benchmark datasets derived from these species, the network-based method achieved identification accuracies of around 0.9, significantly higher than those achieved by sequence-based methods such as BLAST, feature selection and VirulentPred. Further analysis showed that functional associations such as gene neighborhood and co-occurrence were the primary associations between these virulence factors in the STRING database. The high success rates indicate that the network-based method is quite promising. The approach also holds high potential for other organisms, because it can easily be extended to identify virulence factors in many other bacterial species as long as the relevant statistically significant data are available for them.
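The abstract does not spell out the scoring rule behind the network-based method, so the following is only a minimal sketch of one common guilt-by-association scheme. It assumes that a query protein inherits the label of its highest-scoring labeled interaction partner. The protein names, scores, labels, and the functions `build_neighbors` and `predict` are all hypothetical and are not taken from the paper or from the STRING API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy network: (protein_a, protein_b) -> confidence score.
# The 0-1000 scale mimics STRING-style combined scores; values are made up.
EDGES: Dict[Tuple[str, str], int] = {
    ("p1", "p2"): 900,
    ("p1", "p3"): 400,
    ("p2", "p4"): 700,
    ("p3", "p4"): 850,
}

# Hypothetical training labels: True = virulence factor, False = not.
LABELS: Dict[str, bool] = {"p2": True, "p3": False}


def build_neighbors(edges: Dict[Tuple[str, str], int]) -> Dict[str, Dict[str, int]]:
    """Turn an undirected edge list into a neighbor -> score lookup."""
    neighbors: Dict[str, Dict[str, int]] = {}
    for (a, b), score in edges.items():
        neighbors.setdefault(a, {})[b] = score
        neighbors.setdefault(b, {})[a] = score
    return neighbors


def predict(
    query: str,
    neighbors: Dict[str, Dict[str, int]],
    labels: Dict[str, bool],
) -> Optional[bool]:
    """Return the label of the highest-scoring labeled partner of `query`.

    Returns None when the query has no labeled partner in the network,
    which is the case a sequence-based fallback would have to handle.
    """
    best: Optional[Tuple[int, bool]] = None
    for partner, score in neighbors.get(query, {}).items():
        if partner == query or partner not in labels:
            continue
        if best is None or score > best[0]:
            best = (score, labels[partner])
    return None if best is None else best[1]


if __name__ == "__main__":
    nbrs = build_neighbors(EDGES)
    for protein in ("p1", "p4", "p5"):
        print(protein, predict(protein, nbrs, LABELS))
```

In this toy network, `p1` is linked most strongly to the labeled virulence factor `p2`, so it is predicted as a virulence factor, while `p5` has no interactions and gets no prediction. A real evaluation would load scores from a STRING download and use a benchmark split such as the one described in the abstract.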

SUBMITTER: Zheng LL

PROVIDER: S-EPMC3411817 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

A comparison of computational methods for identifying virulence factors.

Zheng Lu-Lu, Li Yi-Xue, Ding Juan, Guo Xiao-Kui, Feng Kai-Yan, Wang Ya-Jun, Hu Le-Le, Cai Yu-Dong, Hao Pei, Chou Kuo-Chen

PLoS ONE, 2012-08-03, issue 8

PMID: 22880014

Similar Datasets

Project description:

Background: Preventive interventions with post-exposure prophylaxis (PEP) are needed in leprosy high-endemic areas to interrupt the transmission of Mycobacterium leprae. Program managers intend to use Geographic Information Systems (GIS) to target preventive interventions so that public health resources are used efficiently. Statistical GIS analyses are commonly used to identify clusters of disease without accounting for the local context. Therefore, we propose a contextualized spatial approach that includes expert consultation to identify clusters and compare it with a standard statistical approach.

Methodology/principal findings: We included all leprosy patients registered from 2014 to 2020 at the Health Centers in Fatehpur and Chandauli districts, Uttar Pradesh State, India (n = 3,855). Our contextualized spatial approach included expert consultation to determine the criteria and definition for identifying clusters using the Density-Based Spatial Clustering Algorithm with Noise, followed by the creation of cluster maps that consider natural boundaries and the local context. We compared this approach with the commonly used Anselin Local Moran's I statistic to identify high-risk villages. In the contextualized approach, 374 clusters were identified in Chandauli and 512 in Fatehpur. In total, 75% and 57% of all cases were captured by the identified clusters in Chandauli and Fatehpur, respectively. If 100 individuals per case were targeted for PEP, 33% and 11% of the total cluster population would receive PEP, respectively. The statistical approach identified more clusters in Chandauli (508) and fewer in Fatehpur (193), captured lower proportions of cases in clusters (66% and 43%), and yielded lower proportions of the population targeted for PEP (11% and 11%) compared with the contextualized approach.

Conclusion: A contextualized spatial approach could identify clusters in high-endemic districts more precisely than a standard statistical approach. Therefore, it can be a useful alternative for detecting preventive intervention targets in high-endemic areas.
