Error-Responsive Feedback Mechanisms for Speech Recognizers
Abstract
This thesis is about modeling, analyzing, and predicting errorful behavior in large vocabulary continuous speech recognition systems. Because today's state-of-the-art recognizers are not designed to be situated naturally in an error feedback loop, they are ill-positioned for inclusion in multi-modal interfaces, multi-media databases, and other interesting applications. I make improvements to the current approach to predicting and analyzing error behaviors, which is based only on the measurement of word error rate. The speech recognizer's functionality is extended to include confidence annotations, which are "meta-level" markings that indicate how certain the recognizer is that it has decoded its input correctly. This is accomplished by feeding externally defined error conditions back to the recognizer. Error feedback enables the construction of statistical models that map measurements of the recognizer's internal states and behaviors to externally defined error conditions. The measuring and modeling techniques used for confidence annotation are extended to create a blame assignment system for utterances whose actual transcripts are known. Errors are classified into a set of categories, some of which are directly useful in automatic adaptation schemes while others are more suited for human interpretation. This classification approach is enhanced when used in conjunction with a visual error analysis tool that was developed during the thesis project.
BibTeX
@phdthesis{Chase-1997-14366,
author = {Lin Chase},
title = {Error-Responsive Feedback Mechanisms for Speech Recognizers},
year = {1997},
month = {April},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-97-18},
}