Dataset Information

Extraction of time-related expressions using text mining with application to Hebrew.


ABSTRACT: In this research, we extract time-related expressions from rabbinic text in a semi-automatic manner. These expressions usually appear next to rabbinic references (name, nickname, acronym, or book name). The first step is to find all expressions that occur near references in the corpus. Not all of these phrases are time-related, so they are initially treated as potential time-related expressions. To extract the genuine time-related expressions, we formulate two new statistical functions and apply grammatical screening and heuristic methods. We tested the statistical functions, grammatical screenings, and heuristic methods on a corpus of responsa documents in which many rabbinic citations are known and marked. Together, the statistical functions and screening methods reduced the set of potential time-related expressions by 99.88% (from 484,681 to 575).
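The first step described in the abstract, collecting phrases adjacent to marked references, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_len` window size, and the English placeholder tokens are assumptions made for illustration, and the two statistical functions from the paper are not reproduced. The last helper only re-checks the reported reduction figure.

```python
from collections import Counter


def candidate_phrases(tokens, ref_positions, max_len=3):
    """Count n-grams (1..max_len tokens) directly before or after each reference.

    tokens        -- list of word tokens for one document
    ref_positions -- indices of tokens already marked as rabbinic references
    max_len       -- hypothetical window size; the source does not state one
    """
    refs = set(ref_positions)
    counts = Counter()
    for pos in ref_positions:
        for n in range(1, max_len + 1):
            # n-gram that ends immediately before the reference
            start = pos - n
            if start >= 0 and not refs.intersection(range(start, pos)):
                counts[" ".join(tokens[start:pos])] += 1
            # n-gram that starts immediately after the reference
            end = pos + 1 + n
            if end <= len(tokens) and not refs.intersection(range(pos + 1, end)):
                counts[" ".join(tokens[pos + 1:end])] += 1
    return counts


def reduction_percent(initial, remaining):
    """Percentage of candidates removed by filtering."""
    return 100.0 * (initial - remaining) / initial


# Placeholder English tokens; the actual corpus is Hebrew responsa text.
tokens = "as wrote the late REF of blessed memory".split()
counts = candidate_phrases(tokens, ref_positions=[4], max_len=2)
print(sorted(counts))

# Figures taken from the abstract: 484,681 candidates reduced to 575.
print(round(reduction_percent(484681, 575), 2))
```

Every phrase collected this way is only a potential time-related expression; the filtering stages described in the abstract would then operate on these counts.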

SUBMITTER: Mughaz D 

PROVIDER: S-EPMC10889890 | biostudies-literature | 2024

REPOSITORIES: biostudies-literature

Publications

Extraction of time-related expressions using text mining with application to Hebrew.

Mughaz Dror, HaCohen-Kerner Yaakov, Gabbay Dov

PLoS One, 2024-02-23, issue 2


Similar Datasets

| S-EPMC4830514 | biostudies-literature
| S-EPMC4583433 | biostudies-literature
| S-EPMC6829803 | biostudies-literature
2020-04-30 | GSE142100 | GEO
| S-EPMC9710573 | biostudies-literature
| S-EPMC2901371 | biostudies-literature
| S-EPMC5760847 | biostudies-literature
| S-EPMC4926749 | biostudies-literature
| S-EPMC8696973 | biostudies-literature
| S-EPMC4457984 | biostudies-literature