Dataset Information

PanDelos: a dictionary-based method for pan-genome content discovery.

ABSTRACT:

Background

Pan-genome approaches afford the discovery of homology relations in a set of genomes by determining how gene families are distributed among them. Retrieving the complete gene distribution among a class of genomes is an NP-hard problem, and computational costs increase with the number of analyzed genomes because all-against-all gene comparisons are required to solve the problem completely. In the presence of phylogenetically distant genomes, the variability introduced by gene duplication and transmission makes the task of recognizing homologous genes even more difficult. A challenge in this field is designing fast and adaptive similarity measures in order to find a suitable pan-genome structure of homology relations.

Results

We present PanDelos, a stand-alone tool for the discovery of pan-genome contents among phylogenetically distant genomes. The methodology is based on information theory and network analysis. It is parameter-free because thresholds are automatically deduced from the context. PanDelos avoids sequence alignment by introducing a measure based on k-mer multiplicity. The k-mer length is defined according to general arguments rather than empirical considerations. Candidate homology relations are integrated into a global network, and groups of homologous genes are extracted by applying a community detection algorithm.
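To illustrate what an alignment-free measure based on k-mer multiplicity can look like, here is a minimal sketch using a generalized Jaccard similarity over k-mer counts. The abstract does not state the exact formula, so this is an assumption for illustration and not necessarily the measure PanDelos implements. The function names `kmer_counts` and `generalized_jaccard` are hypothetical, and `k` is passed by hand here, whereas the abstract says the k-mer length is derived from general arguments.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq, keeping multiplicities."""
    # range() is empty when the sequence is shorter than k.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def generalized_jaccard(a: str, b: str, k: int) -> float:
    """Illustrative multiplicity-aware similarity (an assumption, not
    necessarily the PanDelos measure): sum of per-k-mer minimum counts
    divided by sum of per-k-mer maximum counts."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    kmers = ca.keys() | cb.keys()
    shared = sum(min(ca[x], cb[x]) for x in kmers)  # missing k-mers count as 0
    total = sum(max(ca[x], cb[x]) for x in kmers)
    return shared / total if total else 0.0


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    print(generalized_jaccard("MKVLAAGIVK", "MKVLAAGLVK", 3))
```

No alignment is computed: two sequences score high when they share many k-mers with similar multiplicities, and pairs scoring above a threshold would become the candidate edges of the homology network.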

Conclusions

PanDelos outperforms existing approaches, Roary and EDGAR, in terms of running time and quality of content discovery. Tests were run on collections of real genomes previously used in analogous studies and on synthetic benchmarks that provide a fully trusted ground truth. The software is available at https://github.com/GiugnoLab/PanDelos.

SUBMITTER: Bonnici V

PROVIDER: S-EPMC6266927 | biostudies-literature |

REPOSITORIES: biostudies-literature


Similar Datasets

Project description:

Background

There are two US Food and Drug Administration (FDA)-approved drugs, pirfenidone and nintedanib, for treatment of patients with idiopathic pulmonary fibrosis (IPF). However, neither of these drugs provides a cure. In addition, both are associated with several drug-related adverse events. Hence, the search for newer IPF therapeutics continues. Recent studies show that joint analysis of systems-biology-level information with drug-disease connectivity is effective in the discovery of biologically relevant candidate therapeutics.

Methods

Publicly available gene expression signatures from patients with IPF were used to query a large-scale perturbagen signature library to discover compounds that can potentially reverse dysregulated gene expression in IPF. Two methods were used to calculate IPF-compound connectivity: gene expression-based connectivity and feature-based connectivity. Identified compounds were further prioritized if their shared mechanism(s) of action were IPF-related.

Results

We found 77 compounds as potential candidate therapeutics for IPF. Of these, 39 compounds are either FDA-approved for other diseases or currently in phase II/III clinical trials, suggesting their repurposing potential for IPF. Among these compounds are multiple receptor kinase inhibitors (e.g. nintedanib, currently approved for IPF, and sunitinib), an aurora kinase inhibitor (barasertib), epidermal growth factor receptor inhibitors (erlotinib, gefitinib), a calcium channel blocker (verapamil), phosphodiesterase inhibitors (roflumilast, sildenafil), PPAR agonists (pioglitazone), histone deacetylase inhibitors (entinostat), and opioid receptor antagonists (nalbuphine). As a proof of concept, we performed in vitro validations with verapamil using IPF lung fibroblasts and show its potential benefits in pulmonary fibrosis.

Conclusions

As about half of the candidates discovered in this study are either FDA-approved or currently in clinical trials for other diseases, rapid translation of these compounds as potential IPF therapeutics is possible. Further, the integrative connectivity analysis framework in this study can be adapted in early phase drug discovery for other common and rare diseases with transcriptomic profiles.

The reviews of this paper are available via the supplemental material section.

Project description: Chagas disease is caused by Trypanosoma cruzi infection and remains a relevant cause of chronic heart failure in Latin America. The pharmacological arsenal for Chagas disease is limited, and the available anti-T. cruzi drugs are not effective when administered during the chronic phase. Cardiomyocytes derived from human-induced pluripotent stem cells (hiPSC-CMs) have the potential to accelerate the process of drug discovery for Chagas disease, through predictive preclinical assays in target human cells. Here, we aimed to establish a novel high-content screening- (HCS-) based method using hiPSC-CMs to simultaneously evaluate anti-T. cruzi activity and cardiotoxicity of chemical compounds. To provide proof-of-concept data, the reference drug benznidazole and three compounds with known anti-T. cruzi activity (a betulinic acid derivative named BA5 and two thiazolidinone compounds named GT5A and GT5B) were evaluated in the assay. hiPSC-CMs were infected with T. cruzi and incubated for 48 h with serial dilutions of the compounds for determination of EC50 and CC50 values. Automated multiparametric analyses were performed using an automated high-content imaging system. Sublethal toxicity measurements were evaluated through morphological measurements related to the integrity of the cytoskeleton by phalloidin staining, nuclear score by Hoechst 33342 staining, mitochondria score following MitoTracker staining, and quantification of NT-pro-BNP, a peptide released upon mechanical myocardial stress. The compounds showed EC50 values for anti-T. cruzi activity similar to those previously described for other cell types, and GT5B showed a pronounced trypanocidal activity in hiPSC-CMs. Sublethal changes in cytoskeletal and nucleus scores correlated with NT-pro-BNP levels in the culture supernatant. Mitochondrial score changes were associated with increased cytotoxicity. The assay was feasible and allowed rapid assessment of anti-T. cruzi action of the compounds, in addition to cardiotoxicity parameters. The utilization of hiPSC-CMs in the drug development workflow for Chagas disease may help in the identification of novel compounds.