Project description:Sarcomas are a heterogeneous group of rare malignancies, with more than 50 recognized subtypes. Advances in next-generation sequencing technology have led to the discovery of genetic events in these mesenchymal tumors, which, in addition to enhancing understanding of their biology, have opened up avenues for molecularly targeted therapy and immunotherapy. This review focuses on how the incorporation of next-generation sequencing has affected drug development in sarcomas and on strategies for optimizing precision oncology for these rare cancers. Specific driver molecular abnormalities have been identified in a significant percentage of soft tissue sarcomas, which represent up to 40% of all sarcomas. Evaluating these mutations across rare cancer subtypes requires careful characterization of the genetic alterations to further define compelling drivers with therapeutic implications. Novel models of clinical trial design are also needed. This shift would entail sustained efforts by the sarcoma community to move from one-size-fits-all trials, in which all sarcomas are treated similarly, to divide-and-conquer subtype-specific strategies.
Project description:Background: In environmental sequencing projects, a mix of DNA from a whole microbial community is fragmented and sequenced, with one of the possible goals being to reconstruct partial or complete genomes of members of the community. In communities with high diversity of species, a significant proportion of the sequences do not overlap any other fragment in the sample. This problem will arise not only in situations with a relatively even distribution of many species, but also when the community in a particular environment is routinely dominated by the same few species. In the former case, no genomes may be assembled at all, while in the latter case a few dominant species in an environment will always be sequenced at high coverage to the detriment of coverage of the greater number of sparse species. Methods and results: Here we show that, with the same global sequencing effort, separating the species into two or more sub-communities prior to sequencing can yield a much higher proportion of sequences that can be assembled. We first use the Lander-Waterman model to show that, if the expected percentage of singleton sequences is higher than 25%, then, under the uniform distribution hypothesis, splitting the community is always a wise choice. We then construct simulated microbial communities to show that the results hold for highly non-uniform distributions. We also show that, for the distributions considered in the experiments, it is possible to estimate quite accurately the relative diversity of the two sub-communities. Conclusion: Given the fact that several methods exist to split microbial communities based on physical properties such as size, density, surface biochemistry, or optical properties, we strongly suggest that groups involved in environmental sequencing, and expecting high diversity, consider splitting their communities in order to maximize the information content of their sequencing effort.
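As a rough numerical illustration of the singleton argument, the sketch below evaluates a Lander-Waterman-style expected singleton fraction for a hypothetical skewed community sequenced either as a single pool or as two physically separated pools with the same total number of reads. The read length, overlap parameter, species counts, genome sizes, and abundances are all invented for illustration, and the exact singleton formula (and hence the 25% threshold) depends on modelling choices not reproduced here.

```python
import math

READ_LEN = 400   # bp, illustrative
SIGMA = 0.8      # detectable-overlap fraction (1 - minimum overlap), assumed

def singleton_fraction(coverage):
    """Lander-Waterman-style estimate: probability that a read overlaps no other
    read, with read start positions modelled as a Poisson process."""
    return math.exp(-2.0 * coverage * SIGMA)

def expected_singletons(species, n_reads):
    """species: list of (genome_size_bp, relative_abundance) summing to 1."""
    frac = 0.0
    for genome, abundance in species:
        coverage = n_reads * abundance * READ_LEN / genome
        frac += abundance * singleton_fraction(coverage)
    return frac

# Hypothetical skewed community: 4 dominant species hold 90% of the DNA,
# 40 sparse species share the remaining 10% (all genomes 4 Mb).
dominant = [(4_000_000, 0.90 / 4)] * 4
sparse = [(4_000_000, 0.10 / 40)] * 40
N = 2_000_000    # total reads: the fixed global sequencing effort

pooled = expected_singletons(dominant + sparse, N)

# Split sequencing: separate the dominant from the sparse species first,
# then spend half of the same effort on each pool (abundances renormalised).
def renormalise(pool, total):
    return [(g, a / total) for g, a in pool]

split = 0.5 * expected_singletons(renormalise(dominant, 0.90), N // 2) \
      + 0.5 * expected_singletons(renormalise(sparse, 0.10), N // 2)

print(f"expected singleton reads, pooled community: {pooled:.1%}")
print(f"expected singleton reads, split into pools: {split:.1%}")
```

With these made-up numbers the dominant species remain deeply covered in both designs, while the sparse pool gains the coverage it needs, which is the effect described above.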
Project description:We propose a computationally and statistically efficient divide-and-conquer (DAC) algorithm to fit sparse Cox regression to massive datasets where the sample size $n_0$ is exceedingly large and the covariate dimension $p$ is not small but $n_0\gg p$. The proposed algorithm achieves computational efficiency through a one-step linear approximation followed by a least-squares approximation to the partial likelihood (PL). This sequence of linearizations enables us to maximize the PL with only a small subset of the data and to perform penalized estimation via a fast approximation to the PL. The algorithm is applicable to the analysis of both time-independent and time-dependent survival data. Simulations suggest that the proposed DAC algorithm substantially outperforms the full sample-based estimators and the existing DAC algorithm with respect to computational speed, while achieving statistical efficiency similar to that of the full sample-based estimators. The proposed algorithm was applied to extraordinarily large survival datasets for the prediction of heart failure-specific readmission within 30 days among Medicare heart failure patients.
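The sketch below illustrates the general shape of such a one-step divide-and-conquer estimator in plain numpy: an initial estimate is obtained by maximizing the partial likelihood on a small random subset, and a single Newton-type update then aggregates block-wise scores and information matrices over the full data. It is not the authors' algorithm: the penalized least-squares-approximation step, time-dependent covariates, and tie handling are omitted, and all function names, rates, and the synthetic data are illustrative.

```python
import numpy as np

def cox_score_info(X, time, event, beta):
    """Score vector and information matrix of the Cox partial likelihood
    (no special handling of tied event times)."""
    order = np.argsort(-time)              # decreasing time: risk sets accumulate
    X, event = X[order], event[order]
    w = np.exp(X @ beta)
    S0 = np.cumsum(w)                      # running sums over the risk set
    S1 = np.cumsum(w[:, None] * X, axis=0)
    p = X.shape[1]
    score, info, S2 = np.zeros(p), np.zeros((p, p)), np.zeros((p, p))
    for i in range(X.shape[0]):
        S2 += w[i] * np.outer(X[i], X[i])
        if event[i]:
            xbar = S1[i] / S0[i]
            score += X[i] - xbar
            info += S2 / S0[i] - np.outer(xbar, xbar)
    return score, info

def dac_cox(X, time, event, n_blocks=10, subset_frac=0.01, newton_iter=20):
    """Sketch of a one-step divide-and-conquer Cox estimator:
    (1) Newton-Raphson on a small random subset gives an initial estimate;
    (2) one update aggregates scores/information computed block by block,
        each block being a random subsample with its own risk sets."""
    n, p = X.shape
    rng = np.random.default_rng(0)
    ridge = 1e-8 * np.eye(p)               # tiny ridge for numerical stability
    idx = rng.choice(n, size=min(n, max(int(n * subset_frac), 10 * p)), replace=False)
    beta = np.zeros(p)
    for _ in range(newton_iter):
        u, i_mat = cox_score_info(X[idx], time[idx], event[idx], beta)
        beta = beta + np.linalg.solve(i_mat + ridge, u)
    u_tot, i_tot = np.zeros(p), np.zeros((p, p))
    for block in np.array_split(rng.permutation(n), n_blocks):
        u_b, i_b = cox_score_info(X[block], time[block], event[block], beta)
        u_tot += u_b
        i_tot += i_b
    return beta + np.linalg.solve(i_tot + ridge, u_tot)

# Tiny synthetic check (not the Medicare data): exponential survival, random censoring.
rng = np.random.default_rng(1)
n, p = 20_000, 5
X = rng.standard_normal((n, p))
beta_true = np.array([0.5, -0.5, 0.0, 0.25, 0.0])
t = rng.exponential(1.0 / np.exp(X @ beta_true))
c = rng.exponential(2.0, size=n)
print(dac_cox(X, np.minimum(t, c), (t <= c).astype(int)))
```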
Project description:The ability to perform ab initio electronic structure calculations that scale linearly with system size is one of the central aims in theoretical chemistry. In this study, the implementation of the divide-and-conquer (DC) algorithm, an algorithm with the potential to enable true linear scaling within Hartree-Fock (HF) theory, is revisited. Standard HF calculations solve the Roothaan-Hall equations for the whole system; in the DC-HF approach, the diagonalization of the Fock matrix is carried out on smaller subsystems. In this work, the DC algorithm for HF calculations was validated on polyglycines, polyalanines, and eleven real three-dimensional proteins of up to 608 atoms. We also found that a fragment-based initial guess using the molecular fractionation with conjugated caps (MFCC) method significantly reduces the number of SCF cycles and is even capable of achieving convergence for some globular proteins where the simple superposition of atomic densities (SAD) initial guess fails.
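As a toy illustration of the central DC idea (diagonalizing subsystem blocks rather than the whole Fock matrix), the numpy sketch below builds a density matrix from a model "Fock" matrix in an orthonormal basis: each core-plus-buffer block is diagonalized separately, the local densities are combined with simple partition weights, and a common Fermi level fixes the electron count. This is not a DC-HF implementation, which must handle the AO overlap matrix, SCF iterations, and chemically chosen buffer regions; every matrix, partition, and parameter here is invented.

```python
import numpy as np

def dc_density_matrix(F, cores, buffers, n_elec, kT=1e-3):
    """Toy divide-and-conquer density matrix for a symmetric Fock-like matrix F
    in an orthonormal basis: diagonalize each core+buffer block, weight the
    local densities, and fix a common Fermi level by bisection."""
    def density(mu):
        D = np.zeros_like(F)
        for core, buf in zip(cores, buffers):
            idx = np.array(sorted(set(core) | set(buf)))
            eps, C = np.linalg.eigh(F[np.ix_(idx, idx)])      # local diagonalization
            x = np.clip((eps - mu) / kT, -60, 60)
            occ = 2.0 / (1.0 + np.exp(x))                     # Fermi occupations
            D_loc = (C * occ) @ C.T
            in_core = np.isin(idx, core).astype(float)
            # Partition weights: 1 core-core, 1/2 core-buffer, 0 buffer-buffer
            D[np.ix_(idx, idx)] += 0.5 * (in_core[:, None] + in_core[None, :]) * D_loc
        return D
    lo, hi = F.min() - 10.0, F.max() + 10.0                   # bracket the Fermi level
    for _ in range(80):                                       # bisection on Tr(D) = n_elec
        mu = 0.5 * (lo + hi)
        lo, hi = (mu, hi) if np.trace(density(mu)) < n_elec else (lo, mu)
    return density(0.5 * (lo + hi))

# Model "Fock" matrix: a 12-site chain, split into three 4-site cores with
# neighbouring sites as buffers (all values are illustrative).
n = 12
F = -np.eye(n) - 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
cores = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
buffers = [[4, 5], [2, 3, 8, 9], [6, 7]]
D = dc_density_matrix(F, cores, buffers, n_elec=12)

# Reference: exact density matrix from one full diagonalization (6 doubly occupied levels)
eps, C = np.linalg.eigh(F)
occ = np.zeros(n); occ[:6] = 2.0
D_exact = (C * occ) @ C.T
print("max |D_DC - D_exact| =", np.abs(D - D_exact).max())
```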
Project description:The aggressive peripheral T-cell lymphomas (PTCLs) are a heterogeneous group of uncommon lymphomas of mature T lymphocytes dominated by 3 subtypes: systemic anaplastic large-cell lymphoma, both anaplastic lymphoma kinase positive and negative; nodal PTCL with T-follicular helper phenotype; and PTCL, not otherwise specified. Although the accurate diagnosis of T-cell lymphoma and the subtyping of these lymphomas may be challenging, there is growing evidence that knowledge of the disease subtype can aid in prognostication and in the selection of optimal treatments, in both the front-line and the relapsed or refractory settings. This report focuses on the 3 most common subtypes of aggressive PTCL, examining how current knowledge may dictate choices of therapy and consultative referrals and inform rational targets and correlative studies in the development of future clinical trials. Finally, I note that clinical-pathologic correlation, especially in cases of T-cell lymphomas that may present with an extranodal component, is essential to the accurate diagnosis and subsequent treatment of our patients.
Project description:Many trait measurements are size-dependent. Although such traits are often divided by size before model fitting to control for the effect of size, this approach does not account for allometry or the intermediate outcome problem. We describe these problems and outline potential solutions.
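A small simulated example of the allometry point is sketched below: when a trait scales with size with an exponent other than 1, dividing by size leaves a residual size effect, whereas a log-log regression that models size explicitly does not. The exponent, noise level, and sample size are invented; the sketch does not address the intermediate outcome problem, which is a separate issue discussed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
size = rng.lognormal(mean=1.0, sigma=0.4, size=n)
# Hypothetical trait scaling allometrically with size (exponent 0.7, not 1)
trait = np.exp(0.7 * np.log(size) + rng.normal(0.0, 0.1, n))

# Ratio "correction": trait / size still depends on size when the allometric
# exponent differs from 1, so size effects leak into downstream models.
ratio = trait / size
print("corr(size, trait/size)      =", round(np.corrcoef(size, ratio)[0, 1], 2))

# Allometric alternative: regress log(trait) on log(size) and work with the
# residuals (or keep log(size) as a covariate in the model of interest).
slope, intercept = np.polyfit(np.log(size), np.log(trait), 1)
resid = np.log(trait) - (intercept + slope * np.log(size))
print("estimated allometric slope  =", round(slope, 2))
print("corr(size, residual trait)  =", round(np.corrcoef(size, resid)[0, 1], 2))
```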
Project description:Motivation: Next-generation sequencing (NGS) provides a great opportunity to investigate genome-wide variation at nucleotide resolution. Due to the huge amount of data, NGS applications require very fast and accurate alignment algorithms. Most existing algorithms for read mapping adopt a seed-and-extend strategy, which is sequential in nature and takes much longer on longer reads. Results: We develop a divide-and-conquer algorithm, called Kart, which can process long reads as fast as short reads by dividing a read into small fragments that can be aligned independently. Our experimental results indicate that the average size of fragments requiring the more time-consuming gapped alignment is around 20 bp regardless of the original read length. Furthermore, the algorithm can tolerate much higher error rates. The experiments show that Kart spends much less time on longer reads than other aligners and still produces reliable alignments even when the error rate is as high as 15%. Availability and implementation: Kart is available at https://github.com/hsinnan75/Kart/. Contact: hsu@iis.sinica.edu.tw. Supplementary information: Supplementary data are available at Bioinformatics online.
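To make the divide idea concrete, the sketch below anchors exact k-mer matches of a read on a reference, treats the anchored stretches as already aligned, and leaves only the short unanchored fragments in between for gapped alignment. This is not Kart's index or alignment code: the plain k-mer hash, single-strand and single-locus assumptions, toy sequences, and scoring parameters are all simplifications for illustration.

```python
from collections import defaultdict

def kmer_index(ref, k=12):
    """Plain k-mer hash of the reference (a stand-in for a real index)."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index

def split_read(read, ref_index, k=12):
    """Anchor exact k-mer matches on the best-supported reference diagonal and
    return (diagonal, fragments), where fragments are the short unanchored
    read intervals that still need gapped alignment."""
    anchors = [(i, j) for i in range(len(read) - k + 1)
               for j in ref_index.get(read[i:i + k], [])]
    if not anchors:
        return None, [(0, len(read))]           # no seed: align the whole read
    diag = max({j - i for i, j in anchors},
               key=lambda d: sum(1 for i, j in anchors if j - i == d))
    seeds = sorted(i for i, j in anchors if j - i == diag)
    fragments, prev_end = [], 0
    for s in seeds:
        if s > prev_end:
            fragments.append((prev_end, s))     # gap between consecutive exact seeds
        prev_end = max(prev_end, s + k)
    if prev_end < len(read):
        fragments.append((prev_end, len(read)))
    return diag, fragments

def nw_score(a, b, match=2, mismatch=-2, gap=-3):
    """Tiny Needleman-Wunsch score, used only on the short leftover fragments."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [i * gap]
        for j, cb in enumerate(b, 1):
            cur.append(max(prev[j] + gap, cur[-1] + gap,
                           prev[j - 1] + (match if ca == cb else mismatch)))
        prev = cur
    return prev[-1]

# Toy data: a read copied from the reference with one substitution introduced.
ref = ("ACGTACGGATCCAGGCTTAAACGTTGCAGGTACCGGTTAACGGATCGATCCGGAATTCCGG"
       "TTGACCATGCAAGTCCGTATGGCTAACGTTCAGGATCCATTGCAAGGCTTACGATTGCCAT")
read = ref[30:110]
pos = 25
read = read[:pos] + ("A" if read[pos] != "A" else "C") + read[pos + 1:]

diag, frags = split_read(read, kmer_index(ref))
print("fragment sizes needing gapped alignment:", [e - s for s, e in frags])
for s, e in frags:
    print(f"  read[{s}:{e}] vs ref[{s + diag}:{e + diag}]: NW score {nw_score(read[s:e], ref[s + diag:e + diag])}")
```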
Project description:Chronic heart failure is a worldwide cause of mortality and morbidity and is the final outcome of a number of different etiologies. This reflects both the complexity of the disease and our incomplete understanding of its underlying molecular mechanisms. One experimental approach to address this complexity is to study subcellular organelles and how their functions are activated and synchronized under physiological and pathological conditions. In this review, we discuss the application of proteomic technologies to organelles and how this has deepened our perception of the cellular proteome and its alterations with heart failure. The use of proteomics to monitor protein quantity and posttranslational modifications has revealed a highly intricate and sophisticated level of protein regulation. Posttranslational modifications have the potential to regulate organelle function and interplay, most likely by targeting both structural and signaling proteins throughout the cell, ultimately coordinating their responses. The potentials and limitations of existing proteomic technologies are also discussed, emphasizing that the development of novel methods will enhance our ability to further investigate organelles and decode intracellular communication.
Project description:Identifying the determinants of cumulative cultural evolution is a key issue in the interdisciplinary field of cultural evolution. A widely held view is that large and well-connected social networks facilitate cumulative cultural evolution because they promote the spread of useful cultural traits and prevent the loss of cultural knowledge through factors such as drift. This view stems from models that focus on the transmission of cultural information, without considering how new cultural traits actually arise. In this paper, we review literature from various fields suggesting that, under some circumstances, increased connectedness can decrease cultural diversity and reduce innovation rates. Incorporating this idea into an agent-based model, we explore the effect of population fragmentation on cumulative culture and show that, for a given population size, there exists an intermediate level of population fragmentation that maximizes the rate of cumulative cultural evolution. This result is explained by the fact that fully connected, non-fragmented populations are able to maintain complex cultural traits but produce insufficient variation and so lack the cultural diversity required to produce highly complex cultural traits. Conversely, highly fragmented populations produce a variety of cultural traits but cannot maintain complex ones. In populations with intermediate levels of fragmentation, cultural loss and cultural diversity are balanced in a way that maximizes cultural complexity. Our results suggest that population structure needs to be taken into account when investigating the relationship between demography and cumulative culture. This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'.
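The sketch below is a minimal agent-based caricature of the ingredients described here: agents learn traits only from members of their own sub-population, occasionally invent simple traits or combine two known traits into a more complex one (which is where cultural diversity matters), and occasionally forget traits. The rules, rates, complexity measure, and population sizes are invented and are not the model analysed in the paper; whether an intermediate level of fragmentation comes out best depends on these parameter choices.

```python
import random

def simulate(n_agents=120, n_groups=4, steps=5000,
             p_new=0.005, p_combine=0.05, p_forget=0.01, seed=0):
    """Return the highest trait complexity present at the end of a run."""
    rng = random.Random(seed)
    group_of = [i % n_groups for i in range(n_agents)]            # sub-population labels
    members = [[i for i in range(n_agents) if group_of[i] == g] for g in range(n_groups)]
    complexity = {}                                               # trait id -> complexity
    repertoire = [set() for _ in range(n_agents)]

    def new_trait(level):
        complexity[len(complexity)] = level
        return len(complexity) - 1

    for _ in range(steps):
        a = rng.randrange(n_agents)
        # Social learning: copy one trait from a random member of the same group.
        teacher = rng.choice(members[group_of[a]])
        if repertoire[teacher]:
            repertoire[a].add(rng.choice(tuple(repertoire[teacher])))
        # Innovation: invent a simple trait, or combine two distinct known traits
        # into a more complex one (diversity in the repertoire is required here).
        if rng.random() < p_new:
            repertoire[a].add(new_trait(1))
        if rng.random() < p_combine and len(repertoire[a]) >= 2:
            t1, t2 = rng.sample(tuple(repertoire[a]), 2)
            repertoire[a].add(new_trait(max(complexity[t1], complexity[t2]) + 1))
        # Cultural loss: traits are sometimes forgotten (drift).
        if rng.random() < p_forget and repertoire[a]:
            repertoire[a].discard(rng.choice(tuple(repertoire[a])))

    known = set().union(*repertoire)
    return max((complexity[t] for t in known), default=0)

for g in (1, 2, 4, 8, 24):
    print(f"{g:>2} sub-populations -> max complexity {simulate(n_groups=g)}")
```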
Project description:Background: Divide-and-conquer methods, which divide the species set into overlapping subsets, construct a tree on each subset, and then combine the subset trees using a supertree method, provide a key algorithmic framework for boosting the scalability of phylogeny estimation methods to large datasets. Yet the use of supertree methods, which typically attempt to solve NP-hard optimization problems, limits the scalability of such approaches. Results: In this paper, we introduce a divide-and-conquer approach that does not require supertree estimation: we divide the species set into pairwise disjoint subsets, construct a tree on each subset using a base method, and then combine the subset trees using a distance matrix. For this merger step, we present a new method, called NJMerge, which is a polynomial-time extension of Neighbor Joining (NJ); thus, NJMerge can be viewed either as a method for improving traditional NJ or as a method for scaling the base method to larger datasets. We prove that NJMerge can be used to create divide-and-conquer pipelines that are statistically consistent under some models of evolution. We also report the results of an extensive simulation study evaluating NJMerge on multi-locus datasets with up to 1000 species. We found that NJMerge sometimes improved the accuracy of traditional NJ and substantially reduced the running time of three popular species tree methods (ASTRAL-III, SVDquartets, and "concatenation" using RAxML) without sacrificing accuracy. Finally, although NJMerge can fail to return a tree, in our experiments, NJMerge failed on only 11 out of 2560 test cases. Conclusions: Theoretical and empirical results suggest that NJMerge is a valuable technique for large-scale phylogeny estimation, especially when computational resources are limited. NJMerge is freely available on Github (http://github.com/ekmolloy/njmerge).
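The sketch below lays out the shape of such a pipeline in plain Python: the species set is divided into disjoint subsets, a tree is built on each subset (here with classical Neighbor Joining standing in for the base method), and the subsets are then combined from the distance matrix. The merge shown is plain NJ on the full matrix; NJMerge itself additionally constrains each join to remain compatible with the subset trees, which this sketch does not attempt. Taxon names and distances are invented, and branch lengths are omitted.

```python
import itertools

def neighbor_joining(labels, matrix):
    """Classical NJ on a symmetric distance matrix; returns a Newick topology
    (branch lengths omitted for brevity)."""
    nodes = list(labels)
    D = {(a, b): matrix[labels.index(a)][labels.index(b)]
         for a, b in itertools.permutations(labels, 2)}
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(D[a, b] for b in nodes if b != a) for a in nodes}
        # Join the pair minimizing the Q criterion
        f, g = min(itertools.combinations(nodes, 2),
                   key=lambda p: (n - 2) * D[p] - r[p[0]] - r[p[1]])
        new = f"({f},{g})"
        for a in nodes:
            if a not in (f, g):
                D[new, a] = D[a, new] = 0.5 * (D[f, a] + D[g, a] - D[f, g])
        nodes = [a for a in nodes if a not in (f, g)] + [new]
    return f"({nodes[0]},{nodes[1]});"

taxa = ["t1", "t2", "t3", "t4", "t5", "t6"]
# Hypothetical pairwise distances (for illustration only)
D = [[0, 2, 5, 6, 7, 7],
     [2, 0, 5, 6, 7, 7],
     [5, 5, 0, 3, 6, 6],
     [6, 6, 3, 0, 7, 7],
     [7, 7, 6, 6, 0, 2],
     [7, 7, 6, 6, 2, 0]]

# Divide: pairwise disjoint subsets; a base method (e.g. a maximum likelihood
# tree estimator) would be run on each subset -- NJ is reused here as a stand-in.
subsets = [["t1", "t2", "t3"], ["t4", "t5", "t6"]]
sub = lambda s: [[D[taxa.index(a)][taxa.index(b)] for b in s] for a in s]
subset_trees = [neighbor_joining(s, sub(s)) for s in subsets]
print("subset trees:", subset_trees)

# Merge: NJMerge would combine the subsets under the constraint that the subset
# trees are respected; plain NJ on the full matrix shows the distance-driven part.
print("combined tree:", neighbor_joining(taxa, D))
```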