Project description:We constructed a lung cancer-specific database housing expression data and clinical data from over 6700 patients in 56 studies. Expression data from 23 genome-wide platforms were carefully processed and quality controlled, whereas clinical data were standardized and rigorously curated. Empowered by this lung cancer database, we created an open access web resource-the Lung Cancer Explorer (LCE), which enables researchers and clinicians to explore these data and perform analyses. Users can perform meta-analyses on LCE to gain a quick overview of the results on tumor vs non-malignant tissue (normal) differential gene expression and expression-survival association. Individual dataset-based survival analysis, comparative analysis, and correlation analysis are also provided with flexible options to allow for customized analyses from the user.
Project description:Recent advances in next-generation sequencing technologies and genome assembly algorithms have enabled the accumulation of a huge volume of genome sequences from various species. This has provided new opportunities for large-scale comparative genomics studies. Identifying and utilizing synteny blocks, which are genomic regions conserved among multiple species, is key to understanding genomic architecture and the evolutionary history of genomes. However, the construction and visualization of such synteny blocks from multiple species are very challenging, especially for biologists with a lack of computational skills. Here, we present Synteny Portal, a versatile web-based application portal for constructing, visualizing and browsing synteny blocks. With Synteny Portal, users can easily (i) construct synteny blocks among multiple species by using prebuilt alignments in the UCSC genome browser database, (ii) visualize and download syntenic relationships as high-quality images, (iii) browse synteny blocks with genetic information and (iv) download the details of synteny blocks to be used as input for downstream synteny-based analyses, all in an intuitive and easy-to-use web-based interface. We believe that Synteny Portal will serve as a highly valuable tool that will enable biologists to easily perform comparative genomics studies by compensating limitations of existing tools. Synteny Portal is freely available at http://bioinfo.konkuk.ac.kr/synteny_portal.
Project description:BackgroundMicroarray techniques are one of the main methods used to investigate thousands of gene expression profiles for enlightening complex biological processes responsible for serious diseases, with a great scientific impact and a wide application area. Several standalone applications had been developed in order to analyze microarray data. Two of the most known free analysis software packages are the R-based Bioconductor and dChip. The part of dChip software concerning the calculation and the analysis of gene expression has been modified to permit its execution on both cluster environments (supercomputers) and Grid infrastructures (distributed computing).This work is not aimed at replacing existing tools, but it provides researchers with a method to analyze large datasets without any hardware or software constraints.ResultsAn application able to perform the computation and the analysis of gene expression on large datasets has been developed using algorithms provided by dChip. Different tests have been carried out in order to validate the results and to compare the performances obtained on different infrastructures. Validation tests have been performed using a small dataset related to the comparison of HUVEC (Human Umbilical Vein Endothelial Cells) and Fibroblasts, derived from same donors, treated with IFN-alpha.Moreover performance tests have been executed just to compare performances on different environments using a large dataset including about 1000 samples related to Breast Cancer patients.ConclusionA Grid-enabled software application for the analysis of large Microarray datasets has been proposed. DChip software has been ported on Linux platform and modified, using appropriate parallelization strategies, to permit its execution on both cluster environments and Grid infrastructures. The added value provided by the use of Grid technologies is the possibility to exploit both computational and data Grid infrastructures to analyze large datasets of distributed data. The software has been validated and performances on cluster and Grid environments have been compared obtaining quite good scalability results.
Project description:With currently more than 126 000 publicly available structures and an increasing growth rate, the Protein Data Bank constitutes a rich data source for structure-driven research in fields like drug discovery, crop science and biotechnology in general. Typical workflows in these areas involve manifold computational tools for the analysis and prediction of molecular functions. Here, we present the ProteinsPlus web server that offers a unified easy-to-use interface to a broad range of tools for the early phase of structure-based molecular modeling. This includes solutions for commonly required pre-processing tasks like structure quality assessment (EDIA), hydrogen placement (Protoss) and the search for alternative conformations (SIENA). Beyond that, it also addresses frequent problems as the generation of 2D-interaction diagrams (PoseView), protein-protein interface classification (HyPPI) as well as automatic pocket detection and druggablity assessment (DoGSiteScorer). The unified ProteinsPlus interface covering all featured approaches provides various facilities for intuitive input and result visualization, case-specific parameterization and download options for further processing. Moreover, its generalized workflow allows the user a quick familiarization with the different tools. ProteinsPlus also stores the calculated results temporarily for future request and thus facilitates convenient result communication and re-access. The server is freely available at http://proteins.plus.
Project description:Circadian rhythms are a foundational aspect of biology. These rhythms are found at the molecular level in every cell of every living organism and they play a fundamental role in homeostasis and a variety of physiological processes. As a result, biomedical research of circadian rhythms continues to expand at a rapid pace. To support this research, CircadiOmics (http://circadiomics.igb.uci.edu/) is the largest annotated repository and analytic web server for high-throughput omic (e.g. transcriptomic, metabolomic, proteomic) circadian time series experimental data. CircadiOmics contains over 290 experiments and over 100 million individual measurements, across >20 unique tissues/organs, and 11 different species. Users are able to visualize and mine these datasets by deriving and comparing periodicity statistics for oscillating molecular species including: period, amplitude, phase, P-value and q-value. These statistics are obtained from BIO_CYCLE and JTK_CYCLE and are intuitively aggregated and displayed for comparison. CircadiOmics is the most up-to-date and cutting-edge web portal for searching and analyzing circadian omic data and is used by researchers around the world.
Project description:Circadian rhythms play a fundamental role at all levels of biological organization. Understanding the mechanisms and implications of circadian oscillations continues to be the focus of intense research. However, there has been no comprehensive and integrated way for accessing and mining all circadian omic datasets. The latest release of CircadiOmics (http://circadiomics.ics.uci.edu) fills this gap for providing the most comprehensive web server for studying circadian data. The newly updated version contains high-throughput 227 omic datasets corresponding to over 74 million measurements sampled over 24 h cycles. Users can visualize and compare oscillatory trajectories across species, tissues and conditions. Periodicity statistics (e.g. period, amplitude, phase, P-value, q-value etc.) obtained from BIO_CYCLE and other methods are provided for all samples in the repository and can easily be downloaded in the form of publication-ready figures and tables. New features and substantial improvements in performance and data volume make CircadiOmics a powerful web portal for integrated analysis of circadian omic data.
Project description:We introduce a web portal that employs network theory for the analysis of trajectories from molecular dynamics simulations. Users can create protein energy networks following methodology previously introduced by our group, and can identify residues that are important for signal propagation, as well as measure the efficiency of signal propagation by calculating the network coupling. This tool, called MDN, was used to characterize signal propagation in Escherichia coli heat-shock protein 70-kDa. Two variants of this protein experimentally shown to be allosterically active exhibit higher network coupling relative to that of two inactive variants. In addition, calculations of partial coupling suggest that this quantity could be used as part of the criteria to determine pocket druggability in drug discovery studies.
Project description:We describe here the structure and organization of TnCentral (https://tncentral.proteininformationresource.org/ [or the mirror link at https://tncentral.ncc.unesp.br/]), a web resource for prokaryotic transposable elements (TE). TnCentral currently contains ∼400 carefully annotated TE, including transposons from the Tn3, Tn7, Tn402, and Tn554 families; compound transposons; integrons; and associated insertion sequences (IS). These TE carry passenger genes, including genes conferring resistance to over 25 classes of antibiotics and nine types of heavy metal, as well as genes responsible for pathogenesis in plants, toxin/antitoxin gene pairs, transcription factors, and genes involved in metabolism. Each TE has its own entry page, providing details about its transposition genes, passenger genes, and other sequence features required for transposition, as well as a graphical map of all features. TnCentral content can be browsed and queried through text- and sequence-based searches with a graphic output. We describe three use cases, which illustrate how the search interface, results tables, and entry pages can be used to explore and compare TE. TnCentral also includes downloadable software to facilitate user-driven identification, with manual annotation, of certain types of TE in genomic sequences. Through the TnCentral homepage, users can also access TnPedia, which provides comprehensive reviews of the major TE families, including an extensive general section and specialized sections with descriptions of insertion sequence and transposon families. TnCentral and TnPedia are intuitive resources that can be used by clinicians and scientists to assess TE diversity in clinical, veterinary, and environmental samples. IMPORTANCE The ability of bacteria to undergo rapid evolution and adapt to changing environmental circumstances drives the public health crisis of multiple antibiotic resistance, as well as outbreaks of disease in economically important agricultural crops and animal husbandry. Prokaryotic transposable elements (TE) play a critical role in this. Many carry "passenger genes" (not required for the transposition process) conferring resistance to antibiotics or heavy metals or causing disease in plants and animals. Passenger genes are spread by normal TE transposition activities and by insertion into plasmids, which then spread via conjugation within and across bacterial populations. Thus, an understanding of TE composition and transposition mechanisms is key to developing strategies to combat bacterial pathogenesis. Toward this end, we have developed TnCentral, a bioinformatics resource dedicated to describing and exploring the structural and functional features of prokaryotic TE whose use is intuitive and accessible to users with or without bioinformatics expertise.
Project description:Serial transcriptomics experiments investigate the dynamics of gene expression changes associated with a quantitative variable such as time or dosage. The statistical analysis of these data implies the study of global and gene-specific expression trends, the identification of significant serial changes, the comparison of expression profiles and the assessment of transcriptional changes in terms of cellular processes. We have created the SEA (Serial Expression Analysis) suite to provide a complete web-based resource for the analysis of serial transcriptomics data. SEA offers five different algorithms based on univariate, multivariate and functional profiling strategies framed within a user-friendly interface and a project-oriented architecture to facilitate the analysis of serial gene expression data sets from different perspectives. SEA is available at sea.bioinfo.cipf.es.
Project description:Background & objectiveGenome-wide profiles of tumors obtained using functional genomics platforms are being deposited to the public repositories at an astronomical scale, as a result of focused efforts by individual laboratories and large projects such as the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium. Consequently, there is an urgent need for reliable tools that integrate and interpret these data in light of current knowledge and disseminate results to biomedical researchers in a user-friendly manner. We have built the canEvolve web portal to meet this need.ResultscanEvolve query functionalities are designed to fulfill most frequent analysis needs of cancer researchers with a view to generate novel hypotheses. canEvolve stores gene, microRNA (miRNA) and protein expression profiles, copy number alterations for multiple cancer types, and protein-protein interaction information. canEvolve allows querying of results of primary analysis, integrative analysis and network analysis of oncogenomics data. The querying for primary analysis includes differential gene and miRNA expression as well as changes in gene copy number measured with SNP microarrays. canEvolve provides results of integrative analysis of gene expression profiles with copy number alterations and with miRNA profiles as well as generalized integrative analysis using gene set enrichment analysis. The network analysis capability includes storage and visualization of gene co-expression, inferred gene regulatory networks and protein-protein interaction information. Finally, canEvolve provides correlations between gene expression and clinical outcomes in terms of univariate survival analysis.ConclusionAt present canEvolve provides different types of information extracted from 90 cancer genomics studies comprising of more than 10,000 patients. The presence of multiple data types, novel integrative analysis for identifying regulators of oncogenesis, network analysis and ability to query gene lists/pathways are distinctive features of canEvolve. canEvolve will facilitate integrative and meta-analysis of oncogenomics datasets.AvailabilityThe canEvolve web portal is available at http://www.canevolve.org/.