Abstract:
Hidden Markov models (HMMs) for automatic speech recognition
rely on high dimensional feature vectors to summarize the
short-time properties of speech. Correlations between features can
arise when the speech signal is non-stationary or corrupted by
noise. We investigate how to model these correlations using factor
analysis, a statistical method for dimensionality reduction. Factor
analysis uses a small number of parameters to model the covariance
structure of high dimensional data. These parameters are estimated
by an Expectation-Maximization (EM) algorithm that can be embedded
in the training procedures for HMMs. We evaluate the combined use
of mixture densities and factor analysis in HMMs that recognize
alphanumeric strings. Holding the total number of parameters fixed,
we find that these methods, properly combined, yield better models
than either method on its own.
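
For concreteness, the covariance modeling referred to above can be sketched with the standard factor analysis model; the notation below (loading matrix $\Lambda$, diagonal noise covariance $\Psi$, latent dimension $k$) is generic and not necessarily the paper's own:
\[
x = \Lambda z + \epsilon, \qquad z \sim \mathcal{N}(0, I_k), \qquad \epsilon \sim \mathcal{N}(0, \Psi),
\]
\[
\mathrm{Cov}(x) = \Lambda \Lambda^{\top} + \Psi ,
\]
so a $d$-dimensional feature vector is modeled with $dk + d$ covariance parameters rather than the $d(d+1)/2$ of a full covariance matrix, which is the sense in which factor analysis captures feature correlations with a small number of parameters.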