Abstract:
This paper presents a parametric method to compute the
acoustic characteristics of 3D vocal tract models, in order to
reduce computational time, and to explore the vocal tract
acoustic characteristics that cannot be represented by the
traditional 1D model. A cascaded structure of acoustic tubes,
connected asymmetrically with respect to their axes, is
introduced as an approximation of the vocal tract geometry. Each
tube is assumed to have a rectangular cross-sectional shape whose
geometry (size and axis position) can be determined from MRI
data. The 3D acoustic field in each tube is represented in terms
of higher-order modes. A mode-matching technique is used to
establish mode coupling at the junctions between tubes. In the
proposed method, both propagative and evanescent higher-order
modes are considered in each tube, since each section is often
not long enough for the evanescent modes to decay away.
However, including several evanescent higher-order modes can
cause computational instabilities related to numerical
precision. In the proposed method, the number of higher-order
modes can be selected independently in each tube. In particular,
only plane waves may be considered for narrow tubes and several
higher-order modes should be taken into account for wider tubes.
This flexibility in selecting the number of higher-order modes
in each tube significantly improves computational stability,
while also reducing computational time.
Calculation results for two configurations are discussed: (1) the
sound-pressure distributions for a 5-section configuration, which
approximates an occlusion at the teeth, clearly show the curved
path of wave propagation inside the occlusion area even at low
frequencies; (2) a realistic three-dimensional vocal tract
configuration based on MRI data is also evaluated. The results
show that, at low frequencies, formant frequencies are lowered by
the presence of evanescent higher-order modes.
Moreover, a zero can appear in the transfer functions. These
results are consistent with those obtained by FEM or TLM
simulations. In summary, the proposed method has the following
advantages: (1) the acoustic characteristics obtained are more
accurate than those from one-dimensional modeling; (2) the
computational time is much shorter than that of FEM or TLM.
These features are useful for the improvement of
speech synthesis systems based on vocal tract models.
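
As an illustration of why evanescent modes cannot be neglected in short sections, the following minimal sketch computes the cut-off frequencies of the higher-order modes of a rectangular tube and the amplitude an evanescent mode retains after one section length. It is not the authors' implementation: it assumes rigid walls and lossless propagation, the speed of sound (350 m/s) and the tube dimensions are hypothetical values chosen for the example, and the function names are illustrative.

```python
import math

SPEED_OF_SOUND = 350.0  # m/s; assumed value, not taken from the paper


def cutoff_frequency(m, n, width, height, c=SPEED_OF_SOUND):
    """Cut-off frequency (Hz) of mode (m, n) in a rigid-walled rectangular tube."""
    return 0.5 * c * math.hypot(m / width, n / height)


def amplitude_after(length, freq, m, n, width, height, c=SPEED_OF_SOUND):
    """Fraction of the amplitude magnitude of mode (m, n) left after `length` metres.

    Propagative modes (freq at or above cut-off) keep their magnitude in this
    lossless sketch; evanescent modes decay as exp(-alpha * length).
    """
    k = 2.0 * math.pi * freq / c                       # free-field wavenumber
    kt = math.pi * math.hypot(m / width, n / height)   # transverse wavenumber
    if k >= kt:
        return 1.0
    alpha = math.sqrt(kt * kt - k * k)                 # axial decay rate (1/m)
    return math.exp(-alpha * length)


if __name__ == "__main__":
    # Hypothetical section: 4 cm wide, 2 cm high, 1 cm long, driven at 1 kHz.
    width, height, length, freq = 0.04, 0.02, 0.01, 1000.0
    for m, n in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        fc = cutoff_frequency(m, n, width, height)
        kind = "propagative" if freq >= fc else "evanescent"
        left = amplitude_after(length, freq, m, n, width, height)
        print(f"mode ({m},{n}): cut-off {fc:7.1f} Hz, {kind}, "
              f"amplitude left after section: {left:.3f}")
```

Under these assumed dimensions, the (1,0) mode has a cut-off of 4375 Hz, so it is evanescent at 1 kHz, yet it still retains roughly 47% of its amplitude at the far end of the 1 cm section. This is the situation in which coupling between neighbouring junctions through evanescent modes remains significant.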