An investigation into subspace rapid speaker adaptation for verification

S. Lucey and T. Chen

Conference Paper, Proceedings of International Conference on Multimedia and Expo (ICME '03), pp. 69-72, July 2003

Abstract

Rapid speaker adaptation is becoming more important in emerging applications where storage, computation and training utterances are at a premium (e.g. PDAs, cell phones). Effective adaptation can be achieved for the task of speaker verification, based on a maximum a posteriori (MAP) learning framework, by restricting the client's parametric model to be a linear combination of parameters estimated from training observations and a speaker independent "world" model (i.e. relevance adaptation (RA)). Subspace adaptation (SA) attempts to restrict a client's parametric representation to a pre-defined subspace during estimation. In this paper we elucidate where subspace adaptation outperforms world adaptation, demonstrate where and why subspace adaptation is sometimes not as effective and give insights into what cost criteria should be used to construct the adaptation parametric subspace. Results are presented on the acoustic portion of the XM2VTS database for the task of Gaussian mixture model (GMM) based text-independent speaker verification.
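
The relevance adaptation idea described in the abstract, a client model formed as a linear combination of data-derived parameters and world-model parameters, can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so this assumes the commonly used mean-only MAP form, in which each mixture mean is interpolated with weight alpha_k = n_k / (n_k + r). The function names, the diagonal-covariance restriction, and the relevance factor r = 16 are illustrative assumptions, not details taken from the paper.

```python
import math


def gaussian_diag(x, mean, var):
    """Density of a diagonal-covariance Gaussian evaluated at x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def relevance_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Adapt world-model GMM means toward a client's frames (means only).

    Assumed update (not confirmed by the abstract):
        adapted_k = alpha_k * data_mean_k + (1 - alpha_k) * world_mean_k
        alpha_k   = n_k / (n_k + relevance)
    where n_k is the soft count of frames assigned to mixture k.
    """
    n_mix, dim = len(means), len(means[0])
    counts = [0.0] * n_mix
    sums = [[0.0] * dim for _ in range(n_mix)]

    # Accumulate soft counts and first-order statistics under the world model.
    for x in frames:
        likes = [w * gaussian_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # frame is numerically unexplained by every mixture
        for k in range(n_mix):
            post = likes[k] / total
            counts[k] += post
            for d in range(dim):
                sums[k][d] += post * x[d]

    # Interpolate each data-derived mean with the corresponding world mean.
    adapted = []
    for k in range(n_mix):
        alpha = counts[k] / (counts[k] + relevance)
        if counts[k] > 0.0:
            data_mean = [s / counts[k] for s in sums[k]]
        else:
            data_mean = list(means[k])  # no evidence: keep the world mean
        adapted.append([alpha * dm + (1.0 - alpha) * wm
                        for dm, wm in zip(data_mean, means[k])])
    return adapted


# Toy usage: a two-mixture, one-dimensional world model and 16 client frames.
world_weights = [0.5, 0.5]
world_means = [[-5.0], [5.0]]
world_vars = [[1.0], [1.0]]
client_frames = [[7.0]] * 16
client_means = relevance_adapt_means(
    client_frames, world_weights, world_means, world_vars)
```

In this toy case nearly all frames are assigned to the second mixture, so its mean moves about halfway from the world value toward the data, while the first mixture, which receives almost no evidence, stays at its world value. With few training utterances the soft counts are small and the client model stays close to the world model, which is the behaviour that makes this style of adaptation attractive when data is scarce.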

BibTeX

@conference{Lucey-2003-121086,
author = {S. Lucey and T. Chen},
title = {An investigation into subspace rapid speaker adaptation for verification},
booktitle = {Proceedings of International Conference on Multimedia and Expo (ICME '03)},
year = {2003},
month = {July},
pages = {69--72},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.