ABSTRACT:
SUBMITTER: Solan Z
PROVIDER: S-EPMC1187953 | biostudies-literature | 2005 Aug
REPOSITORIES: biostudies-literature
Zach Solan, David Horn, Eytan Ruppin, Shimon Edelman
Proceedings of the National Academy of Sciences of the United States of America, 2005-08-08, issue 33
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The ADIOS (automatic distillation of structure) algorithm relies on a statistical meth ...