Dataset Information

PASTA for proteins.


ABSTRACT:

Summary

PASTA is a multiple sequence alignment method that uses divide-and-conquer plus iteration to enable base alignment methods to scale with high accuracy to large sequence datasets. By default, PASTA included MAFFT L-INS-i; our new extension of PASTA enables the use of MAFFT G-INS-i, MAFFT Homologs, CONTRAlign and ProbCons. We analyzed the performance of each base method, and of PASTA using each of these base methods, on 224 datasets from BAliBASE 4 with at least 50 sequences. We show that PASTA enables the most accurate base methods to scale to larger datasets at reduced computational effort, and generally improves alignment and tree accuracy on the largest BAliBASE datasets.
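The abstract names divide-and-conquer but does not describe how the sequences are divided. As an illustration only, the sketch below assumes the division is driven by a guide tree that is split repeatedly at its most balanced edge until every subset is small enough to hand to a base aligner. The nested-tuple tree encoding and the function names (`leaves`, `subtrees`, `prune`, `decompose`) are hypothetical and are not taken from the PASTA code base; alignment, merging and iteration are left out.

```python
def leaves(tree):
    """Return the leaf labels of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def subtrees(tree, path=()):
    """Yield (path, subtree) for every proper subtree, i.e. one per edge."""
    if isinstance(tree, str):
        return
    for i, child in enumerate(tree):
        yield path + (i,), child
        yield from subtrees(child, path + (i,))


def prune(tree, path):
    """Remove the subtree at `path`, collapsing any node left with one child."""
    i, rest = path[0], path[1:]
    if rest:
        kept = [prune(c, rest) if j == i else c for j, c in enumerate(tree)]
    else:
        kept = [c for j, c in enumerate(tree) if j != i]
    return kept[0] if len(kept) == 1 else tuple(kept)


def decompose(tree, max_size):
    """Split the leaf set into subsets of at most `max_size` sequences.

    Assumption: the tree is cut at the edge that divides its leaves most
    evenly, and each half is then decomposed recursively.
    """
    n = len(leaves(tree))
    if n <= max_size:
        return [leaves(tree)]
    path, sub = min(subtrees(tree),
                    key=lambda ps: abs(2 * len(leaves(ps[1])) - n))
    return decompose(sub, max_size) + decompose(prune(tree, path), max_size)


# Toy guide tree over eight sequence names.
guide = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
print(decompose(guide, 3))
# prints [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
```

Each returned subset would be aligned by the chosen base method before the subset alignments are merged. In this sketch, closely related sequences stay together because subsets are carved out of the tree rather than chosen at random.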

Availability and implementation

PASTA is available at https://github.com/kodicollins/pasta and has also been integrated into the original PASTA repository at https://github.com/smirarab/pasta.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Collins K 

PROVIDER: S-EPMC6223367 | biostudies-literature | 2018 Nov

REPOSITORIES: biostudies-literature

Publications

PASTA for proteins.

Collins K, Warnow T

Bioinformatics (Oxford, England), 2018 Nov 1, issue 22



Similar Datasets

| S-EPMC5223256 | biostudies-literature
| S-EPMC5795192 | biostudies-literature
| S-EPMC7465979 | biostudies-literature
| S-EPMC6823870 | biostudies-literature
| S-EPMC5709772 | biostudies-literature
| S-EPMC3623791 | biostudies-literature
| S-EPMC7905464 | biostudies-literature
| S-EPMC8467960 | biostudies-literature
| S-EPMC4086119 | biostudies-literature
| S-EPMC6232839 | biostudies-literature