Dataset Information

Improving contig binning of metagenomic data using [Formula: see text] oligonucleotide frequency dissimilarity.

ABSTRACT: Metagenomics sequencing provides deep insights into microbial communities. To investigate their taxonomic structure, binning assembled contigs into discrete clusters is critical. Many binning algorithms have been developed, but their performance is not always satisfactory, especially for complex microbial communities, calling for further development.

According to previous studies, relative sequence compositions are similar across different regions of the same genome, but they differ between distinct genomes. Generally, current tools have used the normalized frequency of k-tuples directly, but this represents an absolute, not relative, sequence composition. Therefore, we attempted to model contigs using relative k-tuple composition, followed by measuring dissimilarity between contigs using [Formula: see text]. The [Formula: see text] was designed to measure the dissimilarity between two long sequences or Next-Generation Sequencing data with the Markov models of the background genomes. This method was effective in revealing group and gradient relationships between genomes, metagenomes and metatranscriptomes. With many binning tools available, we do not try to bin contigs from scratch. Instead, we developed [Formula: see text] to adjust contigs among bins based on the output of existing binning tools for a single metagenomic sample. The tool is taxonomy-free and depends only on k-tuples. To evaluate the performance of [Formula: see text], five widely used binning tools with different strategies of sequence composition or the hybrid of sequence composition and abundance were selected to bin six synthetic and real datasets, after which [Formula: see text] was applied to adjust the binning results. Our experiments showed that [Formula: see text] consistently achieves the best performance with tuple length k = 6 under the independent identically distributed (i.i.d.) background model. Using the metrics of recall, precision and ARI (Adjusted Rand Index), [Formula: see text] improves the binning performance in 28 out of 30 testing experiments (6 datasets with 5 binning tools). The [Formula: see text] is available at https://github.com/kunWangkun/d2SBin.

Experiments showed that [Formula: see text] accurately measures the dissimilarity between contigs of metagenomic reads and that relative sequence composition is more reasonable to bin the contigs. The [Formula: see text] can be applied to any existing contig-binning tools for single metagenomic samples to obtain better binning results.
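
The contrast between absolute and relative k-tuple composition can be illustrated with a minimal Python sketch. It assumes that the dissimilarity hidden behind the formula placeholders is the d2S statistic as it is commonly defined in the alignment-free literature (the name is inferred from the repository URL, which is an assumption). The function names are hypothetical, only one strand is counted, and nothing here is taken from the d2SBin source code. Each k-tuple count is centered by its expectation under an i.i.d. background model estimated from the sequence itself, and the centered counts are then compared.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-tuples, skipping any that contain a non-ACGT symbol."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(c in ALPHABET for c in word):
            counts[word] += 1
    return counts


def base_freqs(seq):
    """Estimate single-base probabilities for the i.i.d. background model."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in ALPHABET)
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: seq.count(b) / total for b in ALPHABET}


def centered_counts(seq, k):
    """Relative composition: observed count minus expected count per k-tuple."""
    counts = kmer_counts(seq, k)
    n = sum(counts.values())
    freqs = base_freqs(seq)
    centered = {}
    for letters in product(ALPHABET, repeat=k):
        word = "".join(letters)
        expected = float(n)
        for b in word:
            expected *= freqs[b]  # i.i.d. model: product of base probabilities
        centered[word] = counts[word] - expected  # Counter gives 0 if unseen
    return centered


def d2s_dissimilarity(seq_x, seq_y, k=6):
    """d2S-style dissimilarity in [0, 1] (assumed textbook definition)."""
    x = centered_counts(seq_x, k)
    y = centered_counts(seq_y, k)
    num = sx = sy = 0.0
    for w in x:
        norm = sqrt(x[w] ** 2 + y[w] ** 2)
        if norm == 0:
            continue  # this k-tuple carries no signal in either sequence
        num += x[w] * y[w] / norm
        sx += x[w] ** 2 / norm
        sy += y[w] ** 2 / norm
    if sx == 0 or sy == 0:
        return 0.5  # undefined case; treated here as "uncorrelated"
    return 0.5 * (1 - num / (sqrt(sx) * sqrt(sy)))
```

A call such as `d2s_dissimilarity(contig_a, contig_b, k=6)` returns a value near 0 for contigs with similar relative composition and near 1 for strongly dissimilar ones. A bin-adjustment step could then move a contig to the bin whose members it is closest to, although how d2SBin actually performs that reassignment is not described in this record.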

SUBMITTER: Wang Y

PROVIDER: S-EPMC5607646 | biostudies-literature | 2017 Sep

REPOSITORIES: biostudies-literature

Publications

Improving contig binning of metagenomic data using [Formula: see text] oligonucleotide frequency dissimilarity.

Ying Wang, Kun Wang, Yang Young Lu, Fengzhu Sun

BMC Bioinformatics, 2017-09-20, issue 1

PMID: 28931373

Similar Datasets

Project description:The experimental value for the isospin amplitude [Formula: see text] in [Formula: see text] decays has been successfully explained within the standard model (SM), both within the large [Formula: see text] approach to QCD and by QCD lattice calculations. On the other hand within the large [Formula: see text] approach the value of [Formula: see text] is by at least [Formula: see text] below the data. While this deficit could be the result of theoretical uncertainties in this approach and could be removed by future precise QCD lattice calculations, it cannot be excluded that the missing piece in [Formula: see text] comes from new physics (NP). We demonstrate that this deficit can be significantly softened by tree-level FCNC transitions mediated by a heavy colourless [Formula: see text] gauge boson with a flavour-violating left-handed coupling [Formula: see text] and an approximately universal flavour diagonal right-handed coupling [Formula: see text] to the quarks. The approximate flavour universality of the latter coupling assures negligible NP contributions to [Formula: see text]. This property, together with the breakdown of the GIM mechanisms at tree level, allows one to enhance significantly the contribution of the leading QCD-penguin operator [Formula: see text] to [Formula: see text]. A large fraction of the missing piece in the [Formula: see text] rule can be explained in this manner for [Formula: see text] in the reach of the LHC, while satisfying the constraints from [Formula: see text], [Formula: see text], [Formula: see text], LEP-II and the LHC. The presence of a small right-handed flavour-violating coupling [Formula: see text] and of enhanced matrix elements of [Formula: see text] left-right operators allows one to satisfy simultaneously the constraints from [Formula: see text] and [Formula: see text], although this requires some fine-tuning. 
We identify the quartic correlation between [Formula: see text] contributions to [Formula: see text], [Formula: see text], [Formula: see text] and [Formula: see text]. The tests of this proposal will require much improved evaluations of [Formula: see text] and [Formula: see text] within the SM, of [Formula: see text] as well as precise tree-level determinations of [Formula: see text] and [Formula: see text]. We present correlations between [Formula: see text], [Formula: see text] and [Formula: see text] with and without the [Formula: see text] rule constraint and generalise the whole analysis to [Formula: see text] with colour ([Formula: see text]) and [Formula: see text] with FCNC couplings. In the latter case no improvement on [Formula: see text] can be achieved without destroying the agreement of the SM with the data on [Formula: see text]. Moreover, this scenario is very tightly constrained by [Formula: see text]. On the other hand, in the context of the [Formula: see text] rule [Formula: see text] is even more effective than [Formula: see text]: it provides the missing piece in [Formula: see text] for [Formula: see text]-[Formula: see text].

Project description:The Josephson effect in point contacts between an "ordinary" superconductor [Formula: see text]In[Formula: see text] ([Formula: see text]) and single crystals of the Fe-based superconductor Ba[Formula: see text]K[Formula: see text](FeAs)[Formula: see text] ([Formula: see text]), was investigated. In order to shed light on the order parameter symmetry of Ba[Formula: see text]K[Formula: see text](FeAs)[Formula: see text], the dependence of the Josephson supercurrent [Formula: see text] on the temperature and on [Formula: see text] with [Formula: see text] was studied. The dependencies of the critical current on temperature [Formula: see text] and of the amplitudes of the first current steps of the current-voltage characteristic [Formula: see text] [Formula: see text] on the power of microwave radiation with frequency [Formula: see text] were measured. It is shown that the dependencies [Formula: see text] are close to the well-known Ambegaokar-Baratoff (AB) dependence for tunnel contacts between "ordinary" superconductors and to the dependence calculated by Burmistrova et al. (Phys Rev B 91, 214501 (2015)) for microshorts between an "ordinary" superconductor and a two-band superconductor with [Formula: see text] order parameter symmetry at certain values of the transparency of boundaries and thickness of the transition layer. It is found that the dependencies [Formula: see text] cannot be approximated within the resistively shunted model using the normalized microwave frequencies [Formula: see text] with characteristic voltages [Formula: see text], [Formula: see text]-normal resistance of the contact) found from the low-voltage parts of the current-voltage characteristics. The reasons for this failure are discussed and a method is proposed for accurately determining the value of [Formula: see text], which takes into account all the features of the point contact affecting the period of the dependence [Formula: see text]. 
An analysis of the [Formula: see text] and [Formula: see text] dependencies shows that the superconducting current of the Josephson contacts under investigation is proportional to the [Formula: see text] of the phase difference [Formula: see text], [Formula: see text]. The implications of these results on the symmetry of the order parameter are also discussed.
