Serbo-Croatian LVCSR on the Dictation and Broadcast News Domain

Peter Scheytt, Petra Geutner, and Alex Waibel

Conference Paper, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98), Vol. 2, pp. 897-900, May 1998

Abstract

This paper describes the development of a Serbo-Croatian dictation and broadcast news speech recognizer. The goal is to generate an automatic text transcription of a news show, which will be submitted to a multilingual Informedia database. We outline the complete system development process using the JanusRTk, from data collection through design and training of the parameters to tuning and evaluation. We report on general recognition techniques such as segmentation, adaptation, and language model interpolation, as well as language-specific problems, e.g. a high OOV rate due to inflected word forms. We show that even a small amount of acoustic training data, combined with Web-based interpolated language models, is sufficient to build a fairly reliable automatic news transcription system, which achieves a word error (WE) of 36.0%.
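
The two figures of merit named in the abstract, word error and OOV rate, can be illustrated with a minimal sketch. This is not the paper's scoring tool: it assumes the usual definition of word error as word-level edit distance divided by reference length, and the function names and sample words are hypothetical, chosen only for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev holds the previous row of the edit-distance table.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def oov_rate(tokens, vocabulary):
    """Fraction of running words that are not in the recognizer vocabulary."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    missing = sum(1 for t in tokens if t not in vocabulary)
    return missing / len(tokens)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
    # Inflected forms of one lemma count as separate vocabulary entries
    # (illustrative words, not taken from the paper's data).
    print(oov_rate(["kuća", "kuće", "kući"], {"kuća"}))
```

The second call hints at why a highly inflected language drives the OOV rate up: each inflected form that is missing from the vocabulary counts as an unknown word, even when its base form is covered.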

BibTeX

@conference{Scheytt-1998-16605,
author = {Peter Scheytt and Petra Geutner and Alex Waibel},
title = {Serbo-Croatian LVCSR on the Dictation and Broadcast News Domain},
booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98)},
year = {1998},
month = {May},
volume = {2},
pages = {897--900},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.