Which system differences matter?: using ℓ1/ℓ2 regularization to compare dialogue systems

José P. González-Brenes and Jack Mostow

Conference Paper, Proceedings of 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL '11), pp. 8 - 17, June, 2011

Abstract

We investigate how to jointly explain the performance and behavioral differences of two spoken dialogue systems. Joint Evaluation and Differences Identification (JEDI) finds the differences between systems that are relevant to performance by formulating the problem as a multi-task feature selection question. JEDI provides evidence on the usefulness of a recent method, ℓ1/ℓp-regularized regression (Obozinski et al., 2007). We evaluate against manually annotated success criteria from real users interacting with five different spoken user interfaces that give bus schedule information.

BibTeX

@conference{Gonzalez-Brenes-2011-122075,
author = {José P. González-Brenes and Jack Mostow},
title = {Which system differences matter?: using $\ell_1/\ell_2$ regularization to compare dialogue systems},
booktitle = {Proceedings of 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL '11)},
year = {2011},
month = {June},
pages = {8--17},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.