Dataset Information

CAFE: aCcelerated Alignment-FrEe sequence analysis.


ABSTRACT: Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE.
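As a rough illustration of what a measure "based solely on the k-mer frequencies" involves, the sketch below counts k-mers, compares sequences by the Euclidean distance between their k-mer frequency vectors, and prints the result as a PHYLIP-style distance matrix. This is not CAFE's code: Euclidean distance is used here only as a simple stand-in, and all function names (`kmer_frequencies`, `dissimilarity_matrix`, `to_phylip`) and the toy sequences are hypothetical. Measures such as $d_2^*$ and $d_2^S$ additionally require background sequence models and are not shown.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k DNA k-mers in seq (fixed ACGT order)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing non-ACGT characters are ignored in this sketch
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def euclidean(f, g):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def dissimilarity_matrix(seqs, k):
    """Pairwise dissimilarities for a dict mapping name -> sequence."""
    names = list(seqs)
    freqs = {n: kmer_frequencies(seqs[n], k) for n in names}
    matrix = [[euclidean(freqs[a], freqs[b]) for b in names] for a in names]
    return names, matrix


def to_phylip(names, matrix):
    """Format a square distance matrix in PHYLIP style (10-character names)."""
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, matrix):
        lines.append(f"{name[:10]:<10}" + " ".join(f"{d:.6f}" for d in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration
    toy = {"seqA": "ACGTACGTACGT", "seqB": "ACGTTGCAACGT", "seqC": "AAAAAAAAAAAA"}
    names, matrix = dissimilarity_matrix(toy, k=2)
    print(to_phylip(names, matrix))
```

The resulting text can be passed to tools that read PHYLIP distance matrices, for example to build a dendrogram.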

SUBMITTER: Lu YY 

PROVIDER: S-EPMC5793812 | biostudies-other | 2017 Jul

REPOSITORIES: biostudies-other

Publications

CAFE: aCcelerated Alignment-FrEe sequence analysis.

Yang Young Lu, Kujin Tang, Jie Ren, Jed A. Fuhrman, Michael S. Waterman, Fengzhu Sun

Nucleic Acids Research, 2017 Jul 1, issue W1


Similar Datasets

| S-EPMC3799466 | biostudies-literature
| S-EPMC4410667 | biostudies-literature
| S-EPMC6659240 | biostudies-literature
| S-EPMC5627421 | biostudies-literature
| S-EPMC2818754 | biostudies-literature
| S-EPMC3704055 | biostudies-literature
| S-EPMC4080745 | biostudies-literature
| S-EPMC3429886 | biostudies-literature
| S-EPMC3581251 | biostudies-literature
| S-EPMC6937637 | biostudies-literature