ABSTRACT:
SUBMITTER: De Pierri CR
PROVIDER: S-EPMC6952362 | biostudies-literature | 2020 Jan
REPOSITORIES: biostudies-literature
Scientific Reports, 2020-01-09, 1
Vectorial and alignment-free approaches to biological sequence representation have been explored in bioinformatics to efficiently handle big data. Even so, most current methods involve sequence comparisons via alignment-based heuristics and fail when applied to the analysis of large data sets. Here, we present "Spaced Words Projection (SWeeP)", a method for representing biological sequences using relatively small vectors while preserving intersequence comparability. SWeeP uses spaced-words by sca ...
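The general idea of a spaced-word vector followed by a projection to fewer dimensions can be illustrated with a minimal sketch. This is not the SWeeP implementation: the DNA alphabet, the mask "1101", the presence/absence encoding, the Gaussian random projection, and all function names below are assumptions chosen for illustration, since the truncated abstract does not specify these details.

```python
import itertools
import math
import random

ALPHABET = "ACGT"  # assumption: a DNA alphabet, chosen only to keep the example small


def spaced_words(seq, mask):
    """Yield the spaced words of seq for a binary mask such as '1101'.

    Positions marked '1' are kept, positions marked '0' are skipped.
    """
    keep = [i for i, bit in enumerate(mask) if bit == "1"]
    for start in range(len(seq) - len(mask) + 1):
        window = seq[start:start + len(mask)]
        yield "".join(window[i] for i in keep)


def occurrence_vector(seq, mask):
    """Return a presence/absence vector over all possible spaced words (assumed encoding)."""
    weight = mask.count("1")
    words = itertools.product(ALPHABET, repeat=weight)
    index = {"".join(w): i for i, w in enumerate(words)}
    vec = [0.0] * len(index)
    for word in spaced_words(seq, mask):
        if word in index:  # skip words containing symbols outside ALPHABET
            vec[index[word]] = 1.0
    return vec


def random_projection(dim_in, dim_out, seed=0):
    """Build a seeded Gaussian random matrix (a stand-in, not an orthonormal basis)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(vec, matrix):
    """Multiply the projection matrix by the high-dimensional vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


mask = "1101"                                   # hypothetical mask of weight 3
high = occurrence_vector("ACGTACGTTGCA", mask)  # 4**3 = 64 dimensions
matrix = random_projection(len(high), 8)        # reduce to 8 dimensions
low = project(high, matrix)
print(len(high), len(low))
```

Because the projection matrix is built from a fixed seed, every sequence is mapped with the same matrix, so the resulting short vectors can be compared with each other directly, for example by Euclidean or cosine distance.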