Project description:sangeranalyseR is a feature-rich, free, and open-source R package for processing Sanger sequencing data. It allows users to go from loading reads to saving aligned contigs in a few lines of R code by using sensible defaults for most actions. It also provides complete flexibility for determining how individual reads and contigs are processed, both at the command line in R and via interactive Shiny applications. sangeranalyseR provides a wide range of options for all steps in Sanger processing pipelines, including trimming reads, detecting secondary peaks, viewing chromatograms, detecting indels and stop codons, aligning contigs, estimating phylogenetic trees, and more. Input data can be in either ABIF or FASTA format. sangeranalyseR comes with extensive online documentation and outputs aligned and unaligned reads and contigs in FASTA format, along with detailed interactive HTML reports. sangeranalyseR supports the use of colorblind-friendly palettes for viewing alignments and chromatograms. It is released under an MIT licence and available for all platforms on Bioconductor (https://bioconductor.org/packages/sangeranalyseR, last accessed February 22, 2021) and on GitHub (https://github.com/roblanf/sangeranalyseR, last accessed February 22, 2021).
Project description:In biological membranes, many factors such as the cytoskeleton, lipid composition, crowding, and molecular interactions cause lateral diffusion to deviate from the expected random walk. These factors have different effects on diffusion but act simultaneously, so the observed diffusion is a complex mixture of diffusive behaviors (directed, Brownian, anomalous, or confined). Therefore, commonly used approaches that quantify diffusion by averaging displacements, such as the mean square displacement, are not suited to the analysis of this heterogeneity. We introduce a parameter, the packing coefficient Pc, which gives an estimate of the degree of free movement that a molecule displays in a period of time independently of its global diffusivity. Applying this approach to two different situations (diffusion of a lipid probe and trapping of receptors at synapses), we show that Pc detected and localized temporary changes of diffusive behavior both in time and in space. More importantly, it allowed the detection of periods with very high confinement as well as their frequency and duration, and thus it can be used to calculate the effective kon and koff of scaffolding interactions such as those that immobilize receptors at synapses.
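The abstract above does not state how Pc is computed. As a rough sketch only, one sliding-window formulation is assumed here: for each window of `window` steps, the sum of squared step lengths is divided by the squared area of the convex hull of the positions visited. The function names, the window parameter, and the formula itself are illustrative assumptions, not the authors' published definition.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def _cross(o: Point, a: Point, b: Point) -> float:
    # z-component of the cross product (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points: List[Point]) -> List[Point]:
    # Andrew's monotone chain; returns hull vertices in counter-clockwise order.
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    upper: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices: List[Point]) -> float:
    # Shoelace formula; degenerate polygons (fewer than 3 vertices) give 0.
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def packing_coefficient(track: List[Point], window: int) -> List[float]:
    # ASSUMED definition: sum of squared steps / (convex hull area)^2 per window.
    values = []
    for i in range(len(track) - window):
        segment = track[i : i + window + 1]
        squared_steps = sum(
            (x1 - x0) ** 2 + (y1 - y0) ** 2
            for (x0, y0), (x1, y1) in zip(segment, segment[1:])
        )
        area = polygon_area(convex_hull(segment))
        # A zero-area window (e.g. collinear points) is reported as infinite packing.
        values.append(squared_steps / area ** 2 if area > 0 else float("inf"))
    return values
```

Under this assumed definition, a trajectory that covers a larger area with proportionally larger steps yields a lower value, so high values flag confined periods regardless of how fast the molecule moves overall.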
Project description:Functional enrichment analysis has played a key role in the biological interpretation of high-throughput omics data. As a long-standing and widely used web application for functional enrichment analysis, WebGestalt has been constantly updated to satisfy the needs of biologists from different research areas. WebGestalt 2017 supports 12 organisms, 324 gene identifiers from various databases and technology platforms, and 150 937 functional categories from public databases and computational analyses. Omics data with gene identifiers not supported by WebGestalt and functional categories not included in the WebGestalt database can also be uploaded for enrichment analysis. In addition to the Over-Representation Analysis in the previous versions, Gene Set Enrichment Analysis and Network Topology-based Analysis have been added to WebGestalt 2017, providing complementary approaches to the interpretation of high-throughput omics data. The new user-friendly output interface and the GOView tool allow interactive and efficient exploration and comparison of enrichment results. Thus, WebGestalt 2017 enables more comprehensive, powerful, flexible and interactive functional enrichment analysis. It is freely available at http://www.webgestalt.org.
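Over-Representation Analysis is commonly formulated as an upper-tail hypergeometric test. The sketch below illustrates that general idea; the abstract does not say which test WebGestalt applies, and the function and parameter names are hypothetical.

```python
from math import comb

def ora_p_value(population: int, category: int, selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the reference set
    category:   number of reference genes annotated to the functional category
    selected:   number of genes in the user's gene list
    overlap:    number of selected genes that fall in the category
    """
    max_overlap = min(category, selected)
    total = comb(population, selected)
    tail = sum(
        comb(category, k) * comb(population - category, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

For example, `ora_p_value(10, 5, 5, 5)` is the probability that all five selected genes land in a five-gene category drawn from ten genes. In practice such p-values are usually corrected for multiple testing across categories.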
Project description:Cell type assignment is a major challenge for all types of high throughput single cell data. In many cases such assignment requires the repeated manual use of external and complementary data sources. To improve the ability to uniformly assign cell types across large consortia, platforms and modalities, we developed Cellar, a software tool that provides interactive support to all the different steps involved in the assignment and dataset comparison process. We discuss the different methods implemented by Cellar, how these can be used with different data types, how to combine complementary data types and how to analyze and visualize spatial data. We demonstrate the advantages of Cellar by using it to annotate several HuBMAP datasets from multi-omics single-cell sequencing and spatial proteomics studies. Cellar is open-source and includes several annotated HuBMAP datasets.
Project description:Large studies profiling microbial communities and their association with healthy or disease phenotypes are now commonplace. Processed data from many of these studies are publicly available, but significant effort is required for users to effectively organize, explore and integrate them, limiting the utility of these rich data resources. Effective integrative and interactive visual and statistical tools to analyze many metagenomic samples can greatly increase the value of these data for researchers. We present Metaviz, a tool for interactive exploratory data analysis of annotated microbiome taxonomic community profiles derived from marker gene or whole metagenome shotgun sequencing. Metaviz is uniquely designed to address the challenge of browsing the hierarchical structure of metagenomic data features while rendering visualizations of data values that are dynamically updated in response to user navigation. We use Metaviz to provide the UMD Metagenome Browser web service, allowing users to browse and explore data for more than 7000 microbiomes from published studies. Users can also deploy Metaviz as a web service, or use it to analyze data through the metavizr package to interoperate with state-of-the-art analysis tools available through Bioconductor. Metaviz is free and open source with the code, documentation and tutorials publicly accessible.
Project description:Better tools are needed to enable researchers to quickly identify and explore effective and interpretable feature-based explanations for discriminating multi-class genomic datasets, e.g., healthy versus diseased samples. We develop an interactive exploration tool, GENVISAGE, which rapidly discovers the most discriminative feature pairs that separate two classes of genomic objects and then displays the corresponding visualizations. Since quickly finding top feature pairs is computationally challenging, especially for large numbers of objects and features, we propose a suite of optimizations to make GENVISAGE responsive at scale and demonstrate that our optimizations lead to a 400× speedup over competitive baselines for multiple biological datasets. We apply our rapid and interpretable tool to identify literature-supported pairs of genes whose transcriptomic responses significantly discriminate several chemotherapy drug treatments. With its generalizable optimizations and framework, GENVISAGE opens up real-time feature-based explanation generation to data from massive sequencing efforts, as well as many other scientific domains.
Project description:Background. Most genetic disorders are caused by single nucleotide variations (SNVs) or small insertions/deletions (indels). High-throughput sequencing has broadened the catalogue of human variation, including common polymorphisms, rare variations and disease-causing mutations. However, identifying one variation among hundreds or thousands of others is still a complex task for biologists, geneticists and clinicians. Results. We have developed VaRank, a command-line tool for the ranking of genetic variants detected by high-throughput sequencing. VaRank scores and prioritizes variants annotated either by Alamut Batch or SnpEff. A barcode allows users to quickly view the presence/absence of variants (with homozygote/heterozygote status) in analyzed samples. VaRank supports the commonly used VCF input format for variant analysis, allowing it to be easily integrated into NGS bioinformatics analysis pipelines. VaRank has been successfully applied to disease-gene identification as well as to molecular diagnostics setup for several hundred patients. Conclusions. VaRank is implemented in Tcl/Tk, a scripting language which is platform-independent, but it has been tested only in Unix environments. The source code is available under the GNU GPL, and together with sample data and detailed documentation can be downloaded from http://www.lbgi.fr/VaRank/.
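The barcode idea can be illustrated with a small sketch that turns the VCF genotype (GT) strings of one variant into a compact per-sample string. The 0/1/2 coding used here (absent, heterozygous, homozygous) and the function names are assumptions made for illustration; the abstract does not specify VaRank's actual encoding.

```python
from typing import Iterable

def genotype_code(gt: str) -> str:
    # Split a VCF GT value such as "0/1" or "1|1" into its alleles.
    alleles = gt.replace("|", "/").split("/")
    # Keep only non-reference, non-missing alleles.
    alt = [a for a in alleles if a not in ("0", ".")]
    if not alt:
        return "0"  # variant absent (reference or missing call)
    if len(alt) == len(alleles) and len(set(alt)) == 1:
        return "2"  # every allele is the same alternate allele: homozygous
    return "1"      # mixed alleles: heterozygous

def barcode(genotypes: Iterable[str]) -> str:
    # One character per sample, in the order the samples are given.
    return "".join(genotype_code(gt) for gt in genotypes)
```

With this hypothetical coding, four samples with genotypes `0/0`, `0/1`, `1/1` and `./.` produce the string `0120`, which can be scanned at a glance across many variants.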
Project description:Motivation. Single-cell proteomics technologies, such as mass cytometry, have enabled characterization of cell-to-cell variation and cell populations at a single-cell resolution. These large amounts of data require dedicated, interactive tools for translating the data into knowledge. Results. We present a comprehensive, interactive method called Cyto to streamline analysis of large-scale cytometry data. Cyto is a workflow-based open-source solution that automates the use of state-of-the-art single-cell analysis methods with interactive visualization. We show the utility of Cyto by applying it to mass cytometry data from peripheral blood and high-grade serous ovarian cancer (HGSOC) samples. Our results show that Cyto is able to reliably capture the immune cell subpopulations from peripheral blood and the cellular compositions of unique immune and cancer cell subpopulations in HGSOC tumor and ascites samples. Availability and implementation. The method is available as a Docker container at https://hub.docker.com/r/anduril/cyto and the user guide and source code are available at https://bitbucket.org/anduril-dev/cyto. Supplementary information. Supplementary data are available at Bioinformatics online.
Project description:Omics data are broadly used to get a snapshot of the molecular status of cells. In particular, changes in omics data can be used to estimate the activity of pathways, transcription factors and kinases based on known regulated targets, which we call footprints. The molecular paths driving these activities can then be estimated using causal reasoning on large signalling networks. We have developed FUNKI, a FUNctional toolKIt for footprint analysis. It provides a user-friendly interface for easy and fast analysis of transcriptomics, phosphoproteomics and metabolomics data, either from bulk or single-cell experiments. FUNKI also features different options to visualise the results and run post-analyses, and is mirrored as a scripted version in R. FUNKI is a free and open-source application built on R and Shiny, available at https://github.com/saezlab/ShinyFUNKI and https://saezlab.shinyapps.io/funki/. We provide data examples within the app, and extensive information about the different variables to select, the results, and the different plots in the help page. Users can also find the tutorial and more information at https://saezlab.github.io/ShinyFUNKI/.
Project description:Analyzing mass spectrometry-based metabolomics data presents a major challenge to metabolism researchers, as it requires downloading and processing large data volumes through complex "pipelines", even in cases where only a single metabolite or peak is of interest. This presents a significant hurdle for data sharing, reanalysis, or meta-analysis of existing data sets, whether locally stored or available from public repositories. Here we introduce mzAccess, a software system that provides interactive, online access to primary mass spectrometry data in real-time via a Web service protocol, circumventing the need for bulk data processing. mzAccess allows querying instrument data for spectra, chromatograms, or two-dimensional MZ-RT areas in either profile or centroid modes through a simple, uniform interface that is independent of vendor or instrument type. Using a cache mechanism, mzAccess achieves response times in the millisecond range for typical liquid chromatography-mass spectrometry (LC-MS) peaks, enabling real-time browsing of large data sets with hundreds or even thousands of samples. By simplifying access to metabolite data, we hope that this system will help enable data sharing and reanalysis in the metabolomics field.