MIT CogNet, The Brain Sciences Connection. From the MIT Press.

 

Speech Modelling Using Subspace and EM Techniques

 Gavin Smith, Nando de Freitas, Tony Robinson and Mahesan Niranjan
  
 

Abstract:
This paper concerns modelling with a piecewise-stationary, discrete-time, linear stochastic state-space model, with applications to speech modelling. Its purpose is to compare two algorithms for model parameter estimation: subspace state-space system identification (4SID) and expectation-maximisation (EM). Both algorithms estimate the state sequence and the parameters jointly. EM is related to Kalman smoothing, maximises the likelihood, is iterative and requires parameter initialisation. 4SID, by contrast, is related to Kalman filtering, minimises a least-squares criterion involving both short- and long-term prediction errors, is closed-form and requires no parameter initialisation. 4SID therefore avoids the problems associated with iterative algorithms and requires less a priori knowledge, because no initial parameter values are needed. The two methods are compared through experiments on real speech data. EM is sensitive to initialisation, so different initialisation methods are discussed and compared.
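
To make the model class concrete, the following is a minimal Python sketch of a scalar linear stochastic state-space model, x[t+1] = a*x[t] + w[t], y[t] = c*x[t] + v[t], together with a Kalman filter that accumulates the log-likelihood, which is the quantity EM maximises. It is an illustration under stated assumptions, not the authors' implementation: the paper's model is piecewise stationary and in general multivariate, whereas this sketch covers a single stationary scalar segment, and the names simulate, kalman_loglik, a, c, q and r are hypothetical.

```python
import math
import random


def simulate(a, c, q, r, n, seed=0):
    """Draw n observations from x[t+1] = a*x[t] + w[t], y[t] = c*x[t] + v[t].

    w[t] and v[t] are zero-mean Gaussian noises with variances q and r.
    The state starts at zero (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    x = 0.0
    ys = []
    for _ in range(n):
        ys.append(c * x + rng.gauss(0.0, math.sqrt(r)))
        x = a * x + rng.gauss(0.0, math.sqrt(q))
    return ys


def kalman_loglik(ys, a, c, q, r, x0=0.0, p0=1.0):
    """Run a scalar Kalman filter; return (filtered state means, log-likelihood)."""
    x_pred, p_pred = x0, p0
    means = []
    loglik = 0.0
    for y in ys:
        # Innovation (one-step prediction error) and its variance.
        e = y - c * x_pred
        s = c * p_pred * c + r
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + e * e / s)
        # Measurement update: correct the prediction with the new observation.
        k = p_pred * c / s
        x_filt = x_pred + k * e
        p_filt = (1.0 - k * c) * p_pred
        means.append(x_filt)
        # Time update: propagate the estimate one step ahead.
        x_pred = a * x_filt
        p_pred = a * p_filt * a + q
    return means, loglik


if __name__ == "__main__":
    ys = simulate(a=0.9, c=1.0, q=1.0, r=0.1, n=500)
    for a_guess in (0.9, 0.0, -0.9):
        _, ll = kalman_loglik(ys, a_guess, 1.0, 1.0, 0.1)
        print(f"a = {a_guess:+.1f}  log-likelihood = {ll:.1f}")
```

Evaluating the log-likelihood at several parameter guesses, as in the usage lines at the bottom, shows why an iterative likelihood-based method needs a starting point: each evaluation presupposes a full set of parameter values, which is the initialisation that the abstract says 4SID does not require.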

 
 


© 2010 The MIT Press