Text-Image-Video Summary Generation Using Joint Integer Linear Programming
ABSTRACT: Automatically generating summaries of asynchronous data can help users keep up with the rapid growth of multi-modal information on the Internet. However, current multi-modal systems usually generate summaries composed only of text and images. In this paper, we propose a novel research problem, text-image-video summary generation (TIVS). We first develop a multi-modal dataset containing text documents, images, and videos. We then propose a novel joint integer linear programming multi-modal summarization (JILP-MMS) framework and report the performance of our model on the developed dataset.
SUBMITTER: Jose J
PROVIDER: S-EPMC7148046 | biostudies-literature | 2020 Mar
REPOSITORIES: biostudies-literature