Abstract:
Face motion during speech is a direct consequence of
vocal-tract motion, which also shapes the acoustics of speech.
This fact suggests that speech acoustics can be used to
estimate face motion, and vice versa. Another acoustic-motion
relation in speech production holds between head motion and
fundamental frequency (F0). This work develops a system that
takes speech acoustics as input and outputs coefficients
controlling a parametric face model with natural head motion.
The results are based on simultaneous measurements of facial
deformation, head motion, and speech acoustics collected from
two subjects producing naturalistic sentences and spontaneous
speech. The procedure for estimating face motion from speech
acoustics first trains nonlinear estimators whose inputs are
line spectral pair (LSP) coefficients and whose outputs are
face marker positions.
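The abstract does not name the nonlinear estimator family, so the
sketch below is only one plausible illustration: a small multilayer
perceptron mapping LSP feature vectors to flattened 3-D marker
coordinates. The LSP order, marker count, array shapes, and random
stand-in data are assumptions, not values from the paper.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

# Stand-in data: one row per analysis frame (assumed sizes).
# X holds LSP coefficients; y holds flattened marker x/y/z positions.
rng = np.random.default_rng(0)
n_frames, lsp_order, n_markers = 5000, 16, 18
X = rng.standard_normal((n_frames, lsp_order))       # LSP features
y = rng.standard_normal((n_frames, n_markers * 3))   # marker coordinates

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False)              # preserve temporal order

# One possible nonlinear estimator: an MLP regressor with two hidden layers.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
net.fit(X_train, y_train)
y_hat = net.predict(X_test)                          # estimated trajectories
```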
These estimators are then applied to test data, and the
estimated marker trajectories are compared objectively with
their measured counterparts, yielding correlation coefficients
between 0.8 and 0.9.
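The exact evaluation protocol is not given in the abstract; assuming
a standard per-channel Pearson correlation between measured and
estimated trajectories, the comparison could be computed as follows.

```python
import numpy as np

def channel_correlations(measured, estimated):
    """Pearson correlation per channel between measured and estimated
    marker trajectories, both arrays of shape (frames, channels)."""
    m = measured - measured.mean(axis=0)
    e = estimated - estimated.mean(axis=0)
    num = (m * e).sum(axis=0)
    den = np.sqrt((m ** 2).sum(axis=0) * (e ** 2).sum(axis=0))
    return num / den

# r = channel_correlations(y_test, y_hat)  # the paper reports 0.8-0.9
```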
Linear estimators are then used to relate F0 and head motion.
Because the mapping from F0 to head motion is one-to-many,
constraints must be added to estimate head motion; this is
done by computing the co-dependence among the head-motion
components. Finally, measured and estimated face and head
motion data are used to animate a parametric talking face.
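The abstract describes the constraint only as "co-dependence among
the head-motion components," so the following is a speculative
reading: estimate one head component linearly from F0, then recover
the remaining components through a linear coupling map learned from
training data. All component counts and data here are hypothetical.

```python
import numpy as np

# Hypothetical training data: f0 per frame, and six head-motion
# components (three rotations, three translations) per frame.
rng = np.random.default_rng(1)
f0 = rng.standard_normal(4000)
head = rng.standard_normal((4000, 6))

# Step 1: linear least-squares estimator from F0 (plus bias)
# to a single head component (here, component 0).
A = np.column_stack([f0, np.ones_like(f0)])
w, *_ = np.linalg.lstsq(A, head[:, 0], rcond=None)

# Step 2: co-dependence constraint as a linear map predicting the
# remaining components from the F0-driven one, also least squares.
B = np.column_stack([head[:, 0], np.ones(len(head))])
C, *_ = np.linalg.lstsq(B, head[:, 1:], rcond=None)

# At test time: F0 -> one component -> remaining components.
h0 = A @ w
rest = np.column_stack([h0, np.ones_like(h0)]) @ C
head_hat = np.column_stack([h0, rest])   # full estimated head motion
```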