Dataset Information

Coffee With a Hint of Data: Towards Using Data-Driven Approaches in Personalised Long-Term Interactions.

ABSTRACT: While earlier research in human-robot interaction predominantly uses rule-based architectures for natural language interaction, these approaches are not flexible enough for long-term interactions in the real world due to the large variation in user utterances. In contrast, data-driven approaches map the user input to the agent output directly and hence handle these variations more flexibly without requiring any set of rules. However, data-driven approaches are generally applied to single dialogue exchanges with a user and do not build up a memory over long-term conversations with different users, whereas long-term interactions require remembering users and their preferences incrementally and continuously, and recalling previous interactions with users to adapt and personalise the interactions, known as the lifelong learning problem. In addition, it is desirable to learn user preferences from a few samples of interactions (i.e., few-shot learning). These are known to be challenging problems in machine learning, while they are trivial for rule-based approaches, creating a trade-off between flexibility and robustness. Correspondingly, in this work, we present the text-based Barista Datasets generated to evaluate the potential of data-driven approaches in generic and personalised long-term human-robot interactions with simulated real-world problems, such as recognition errors, incorrect recalls and changes to the user preferences. Based on these datasets, we explore the performance and the underlying inaccuracies of the state-of-the-art data-driven dialogue models that are strong baselines in other domains of personalisation in single interactions, namely Supervised Embeddings, Sequence-to-Sequence, End-to-End Memory Network, Key-Value Memory Network, and Generative Profile Memory Network. The experiments show that while data-driven approaches are suitable for generic task-oriented dialogue and real-time interactions, no model performs sufficiently well to be deployed in personalised long-term interactions in the real world, because of their inability to learn and use new identities, and their poor performance in recalling user-related data.

SUBMITTER: Irfan B

PROVIDER: S-EPMC8505524 | biostudies-literature

REPOSITORIES: biostudies-literature


Similar Datasets

Project description: Personalized medicine (PM) operates with biological data to optimize therapy or prevention and to achieve cost reduction. Associated data may consist of a wide variety of informational subtypes, e.g. genetic characteristics and their epigenetic modifications, biomarkers or even individual lifestyle factors. Present innovations in the field of information technology have already enabled the processing of increasingly large amounts of such data ('volume') from various sources ('variety') and of varying quality in terms of data accuracy ('veracity') to facilitate the generation and analysis of messy data sets within a short and highly efficient time period ('velocity') to provide insights into previously unknown connections and correlations between different items ('value'). As such developments are characteristics of Big Data approaches, Big Data itself has become an important catchphrase that is closely linked to the emerging foundations and approaches of PM. However, as ethical concerns have been pointed out by experts in the debate already, moral concerns by stakeholders such as patient organizations (POs) need to be reflected in this context as well. We used an empirical-ethical approach including a website analysis and 27 telephone interviews for gaining in-depth insight into German POs' perspectives on PM and Big Data. Our results show that not all POs are stakeholders in the same way. Comparing the perspectives and political engagement of the minority of POs that is currently actively involved in research around PM and Big Data-driven research led to four stakeholder sub-classifications: 'mediators' support research projects through facilitating researchers' access to the patient community while simultaneously selecting projects they preferably support, while 'cooperators' tend to contribute more directly to research projects by providing and implementing patient perspectives. 'Financers' provide financial resources. 'Independents' keep control over their collected samples and associated patient-related information with a strong interest in making autonomous decisions about its scientific use. A more detailed terminology for the involvement of POs as stakeholders facilitates the addressing of their aims and goals. Based on our results, the 'independents' subgroup is a promising candidate for future collaborations in scientific research. Additionally, we identified gaps in POs' knowledge about PM and Big Data. Based on these findings, approaches can be developed to increase data and statistical literacy. This way, the full potential of stakeholder involvement of POs can be made accessible in discourses around PM and Big Data.

OmicsDI is part of the ELIXIR infrastructure
