ABSTRACT:
SUBMITTER: Sikic K
PROVIDER: S-EPMC3055704 | biostudies-literature | 2010 Nov
REPOSITORIES: biostudies-literature
Sikic K, Carugo O
Bioinformation, 2010 Nov 27, 6
Non-redundant protein datasets are of utmost importance in bioinformatics. Constructing such a dataset means removing protein sequences whose similarity to another sequence in the set exceeds a chosen threshold. Several programs are available for this, such as 'Decrease redundancy', 'cd-hit', 'Pisces', 'BlastClust' and 'SkipRedundant'. The issue we focus on here is the extent to which the non-redundant datasets produced by different programs are similar to each other. A systematic comparison of the features and of the outputs of these ...
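To make the idea of threshold-based redundancy removal concrete, here is a minimal Python sketch. It is not the algorithm of any of the programs named above: the greedy longest-first filter, the use of `difflib.SequenceMatcher` as a stand-in for alignment-based sequence identity, the Jaccard overlap used to compare two outputs, and the toy sequences are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] (assumption: real tools use alignments)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_nonredundant(seqs: dict[str, str], threshold: float) -> list[str]:
    """Keep a sequence only if it is not too similar to any sequence already kept.

    Sequences are visited longest first (ties broken by name), so the result
    depends on visiting order as well as on the threshold.
    """
    kept: list[str] = []
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        if all(similarity(seqs[name], seqs[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


def jaccard(a, b) -> float:
    """Overlap between two selections: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


# Hypothetical toy input: p1 and p2 differ only in their last residue.
seqs = {
    "p1": "MKTAYIAKQRQISFVKSHFSRQ",
    "p2": "MKTAYIAKQRQISFVKSHFSRA",
    "p3": "GSHMLEDPVDAFKQ",
}

strict = greedy_nonredundant(seqs, threshold=0.90)
lenient = greedy_nonredundant(seqs, threshold=0.99)
print(strict, lenient, jaccard(strict, lenient))
```

Even in this toy setting, changing only the threshold changes which sequences survive, which suggests why outputs of different programs, with different similarity measures and selection orders, need not coincide.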