A Multimodal Dialogue System for Conversational Image Editing
Workshop Paper, NeurIPS '18 2nd Workshop on Conversational AI, November 2018
Abstract
In this paper, we present a multimodal dialogue system for Conversational Image Editing. We formulate the system as a Partially Observable Markov Decision Process (POMDP) and train it with a Deep Q-Network (DQN) and a user simulator. Our evaluation shows that the DQN policy outperforms a rule-based baseline policy, achieving a 90% success rate even under high error rates. We also conduct a real-user study and analyze user behavior.
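The training setup described above pairs a Q-learning agent with a user simulator. The sketch below is a minimal, hypothetical illustration of that loop, not the paper's implementation: the dialogue states, actions, rewards, and simulator dynamics are all assumptions, and a linear Q-function stands in for the deep network.

```python
import numpy as np

# Hypothetical toy setup standing in for the image-editing dialogue:
# a few dialogue stages, and two actions {0: ask_clarify, 1: execute_edit}.
N_STATES, N_ACTIONS = 4, 2

def simulated_user(state, action):
    """Toy user simulator (assumed dynamics): the edit succeeds only once
    the dialogue has reached the final stage; clarifying questions advance
    the dialogue at a small cost."""
    if state == N_STATES - 1 and action == 1:
        return state, 1.0, True                      # success, episode ends
    if action == 0:
        return min(state + 1, N_STATES - 1), -0.05, False
    return state, -0.1, False                        # premature edit penalty

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(N_ACTIONS, N_STATES))  # linear Q-function

def q_values(state):
    x = np.zeros(N_STATES)
    x[state] = 1.0                                   # one-hot state encoding
    return W @ x

gamma, lr, eps = 0.95, 0.1, 0.2                      # assumed hyperparameters
for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        a = int(rng.integers(N_ACTIONS)) if rng.random() < eps \
            else int(np.argmax(q_values(s)))
        s2, r, done = simulated_user(s, a)
        # Bellman target and gradient step on the squared TD error
        target = r if done else r + gamma * np.max(q_values(s2))
        td_error = target - q_values(s)[a]
        x = np.zeros(N_STATES)
        x[s] = 1.0
        W[a] += lr * td_error * x
        s = s2

# The learned greedy policy: clarify until the final stage, then execute.
policy = [int(np.argmax(q_values(s))) for s in range(N_STATES)]
```

A full DQN adds a replay buffer, a target network, and a neural Q-function, but the interaction pattern with the user simulator is the same as in this sketch.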
BibTeX
@inproceedings{Lin-2018-113122,
author = {T.-H. Lin and T. Bui and D. S. Kim and J. Oh},
title = {A Multimodal Dialogue System for Conversational Image Editing},
booktitle = {Proceedings of NeurIPS '18 2nd Workshop on Conversational AI},
year = {2018},
month = {November},
}
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.