Language and Pronunciation Modeling in the CMU 1996 Hub 4 Evaluation
Workshop Paper, DARPA Spoken Language Systems Technology Workshop (SLSTW '97), pp. 141-146, February 1997
We describe several language and pronunciation modeling techniques that were applied to the 1996 Hub 4 Broadcast News transcription task. These include topic adaptation, the use of remote corpora, vocabulary size optimization, n-gram cutoff optimization, modeling of spontaneous speech, handling of unknown linguistic boundaries, higher order n-grams, weight optimization in rescoring, and lexical modeling of phrases and acronyms.
@workshop{Seymore-1997-16467,
author = {Kristie Seymore and Stanley Chen and Maxine Eskenazi and Ronald Rosenfeld},
title = {Language and Pronunciation Modeling in the CMU 1996 Hub 4 Evaluation},
booktitle = {Proceedings of DARPA Spoken Language Systems Technology Workshop (SLSTW '97)},
year = {1997},
month = {February},
pages = {141 - 146},
}