Adaptation of Pronunciation Dictionaries for Recognition of Unseen Languages

Tanja Schultz and Alex Waibel
Workshop Paper, SPECOM '98 Workshop on Speech and Communication, pp. 207 - 210, October, 1998

Abstract

This paper studies the relative effectiveness of different methods for multilingual model combination and dictionary mapping for recognizing a new, unseen target language when training data are limited. We examine cross-language transfer from monolingual and multilingual models to German and Russian for large-vocabulary speech recognition, using a dictation database collected under the GlobalPhone project. This project at the University of Karlsruhe investigates LVCSR systems in 15 languages of the world, namely Arabic, Chinese, Croatian, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Swedish, Tamil, and Turkish. Based on a global phoneme set, we create recognizers that combine up to eight languages and report recognition results in language-independent and language-adaptive setups. We found that multilingual context-dependent models outperform monolingual models for the purpose of cross-language transfer. Two dictionary mapping approaches are compared. Results show that the IPA-based mapping produces better results than a data-driven procedure.

BibTeX

@workshop{Schultz-1998-14780,
author = {Tanja Schultz and Alex Waibel},
title = {Adaptation of Pronunciation Dictionaries for Recognition of Unseen Languages},
booktitle = {Proceedings of SPECOM '98 Workshop on Speech and Communication},
year = {1998},
month = {October},
pages = {207--210},
}