Unknown

Dataset Information

0

Evaluating prose style transfer with the Bible.


ABSTRACT: In the prose style transfer task a system, provided with text input and a target prose style, produces output which preserves the meaning of the input text but alters the style. These systems require parallel data for evaluation of results and usually make use of parallel data for training. Currently, there are few publicly available corpora for this task. In this work, we identify a high-quality source of aligned, stylistically distinct text in different versions of the Bible. We provide a standardized split, into training, development and testing data, of the public domain versions in our corpus. This corpus is highly parallel since many Bible versions are included. Sentences are aligned due to the presence of chapter and verse numbers within all versions of the text. In addition to the corpus, we present the results, as measured by the BLEU and PINC metrics, of several models trained on our data which can serve as baselines for future research. While we present these data as a style transfer corpus, we believe that it is of unmatched quality and may be useful for other natural language tasks as well.

SUBMITTER: Carlson K 

PROVIDER: S-EPMC6227951 | biostudies-other | 2018 Oct

REPOSITORIES: biostudies-other

altmetric image

Publications

Evaluating prose style transfer with the Bible.

Carlson Keith K   Riddell Allen A   Rockmore Daniel D  

Royal Society open science 20181024 10


In the prose style transfer task a system, provided with text input and a target prose style, produces output which preserves the meaning of the input text but alters the style. These systems require parallel data for evaluation of results and usually make use of parallel data for training. Currently, there are few publicly available corpora for this task. In this work, we identify a high-quality source of aligned, stylistically distinct text in different versions of the Bible. We provide a stan  ...[more]

Similar Datasets

| S-EPMC6603166 | biostudies-literature
| S-EPMC8400862 | biostudies-literature
| S-EPMC10291316 | biostudies-literature
| S-EPMC7774645 | biostudies-literature
| S-EPMC9606769 | biostudies-literature
| S-EPMC8247631 | biostudies-literature
| S-EPMC7755413 | biostudies-literature
| S-EPMC6088372 | biostudies-literature
| S-EPMC8452401 | biostudies-literature
2019-09-25 | PXD014415 | Pride