
Dataset Information


Sequence tagging for biomedical extractive question answering.


ABSTRACT:

Motivation

Current studies in extractive question answering (EQA) have modeled the single-span extraction setting, in which the label to predict for a given question-passage pair is a single answer span. This setting is natural for general domain EQA, as the majority of general domain questions can be answered with a single span. Following general domain EQA models, current biomedical EQA (BioEQA) models use the single-span extraction setting with post-processing steps.

Results

In this article, we investigate the question distribution across the general and biomedical domains and find that biomedical questions are more likely to require list-type answers (multiple answers) than factoid-type answers (single answer). This calls for models capable of producing multiple answers for a question. Based on this preliminary study, we propose a sequence tagging approach for BioEQA, which is a multi-span extraction setting. Our approach directly tackles questions with a variable number of phrases as their answer and can learn to decide the number of answers for a question from training data. In our experiments on the BioASQ 7b and 8b list-type questions, our approach outperformed the best-performing existing models without requiring post-processing steps.
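To illustrate how a tagging formulation can yield a variable number of answers, the following is a minimal sketch of decoding per-token tags into answer spans. It assumes a B/I/O tagging scheme and whitespace-joined tokens; the abstract does not state which scheme the authors use, and the function name `decode_bio_spans` is hypothetical rather than taken from the SeqTagQA repository.

```python
from typing import List


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect every tagged span from a passage (assumed B/I/O scheme).

    "B" opens a new answer span, "I" extends the open span, and "O"
    closes it. The number of returned answers is whatever the tags imply,
    so no fixed answer count or threshold is needed at decoding time.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; flush any span that was still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span: close the open span.
            if current:
                spans.append(" ".join(current))
                current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Hypothetical passage tokens and predicted tags, for illustration only.
tokens = ["tumor", "necrosis", "factor", "and", "interleukin", "6"]
tags = ["B", "I", "I", "O", "B", "I"]
answers = decode_bio_spans(tokens, tags)
```

Here `answers` holds two spans, "tumor necrosis factor" and "interleukin 6". A single-span extractor would return only one of them unless extra post-processing were applied.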

Availability and implementation

Source code and resources are freely available for download at https://github.com/dmis-lab/SeqTagQA.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Yoon W 

PROVIDER: S-EPMC9344839 | biostudies-literature | 2022 Aug

REPOSITORIES: biostudies-literature


Publications

Sequence tagging for biomedical extractive question answering.

Wonjin Yoon, Richard Jackson, Aron Lagerberg, Jaewoo Kang

Bioinformatics (Oxford, England), 2022-08-01, issue 15



Similar Datasets

| S-EPMC4307891 | biostudies-literature
| S-EPMC10468517 | biostudies-literature
| S-EPMC10042099 | biostudies-literature
| S-EPMC10603356 | biostudies-literature
| S-EPMC7148067 | biostudies-literature
| S-EPMC11491595 | biostudies-literature
| S-EPMC7148018 | biostudies-literature
| S-EPMC4572360 | biostudies-literature
| S-EPMC5857288 | biostudies-literature
| S-EPMC10306234 | biostudies-literature