The MIT Encyclopedia of Communication Disorders
Voice Production: Physics and Physiology

When the vocal folds are near each other, a sufficient transglottal pressure will set them into oscillation. This oscillation produces cycles of airflow that create the acoustic signal known as phonation, the voicing sound source, the voice signal, or more generally, voice. This article discusses some of the mechanistic aspects of phonation.

The most general expression of forces in the larynx dealing with motion of the vocal folds during phonation is

$$F(x) = m x'' + b x' + k x, \qquad (1)$$

where F(x) represents the air pressure forces acting on the vocal fold tissues, m is the mass of the tissue in motion, b is a viscous coefficient, k is a spring constant, x is the displacement of the tissue from rest, x′ is the velocity of the tissue, and x″ is the acceleration of the tissue. In multimass models of phonation (e.g., Ishizaka and Flanagan, 1972), this equation is applied to each mass in the model. Each term on the right-hand side characterizes forces in the tissue, and the left-hand side represents the external forces. The equation emphasizes that mass, viscosity, and stiffness each play a role in the motion (normal or abnormal) of the vocal folds, that they are associated with the acceleration, velocity, and displacement of the tissue, respectively, and that they are balanced by the external air pressure forces acting on the vocal folds.
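As a rough illustration of Eq. (1), the Python sketch below treats a single mass (not the multimass models cited above) and integrates its motion under an external force. The function names, the integration scheme (semi-implicit Euler), the constant driving force, and every numerical value are assumptions chosen only for illustration; they are not measured tissue properties, and real intraglottal pressure forces vary within each cycle.

```python
def external_force(m, b, k, x, v, a):
    """Evaluate the right-hand side of Eq. (1): m*x'' + b*x' + k*x."""
    return m * a + b * v + k * x


def simulate(m, b, k, force, dt=1e-5, steps=5000, x0=0.0, v0=0.0):
    """Integrate m*x'' + b*x' + k*x = force(t) with semi-implicit Euler.

    Returns the list of displacements, one per time step.
    """
    x, v = x0, v0
    xs = []
    for i in range(steps):
        a = (force(i * dt) - b * v - k * x) / m  # solve Eq. (1) for x''
        v += a * dt                               # update velocity first
        x += v * dt                               # then position
        xs.append(x)
    return xs


# Hypothetical values chosen only for illustration (SI units).
M, B, K = 1e-4, 0.05, 80.0   # kg, N*s/m, N/m
F_CONST = 0.008              # N, a constant external force (assumption)

xs = simulate(M, B, K, lambda t: F_CONST)
print(f"final displacement: {xs[-1]:.6e} m (F/k = {F_CONST / K:.6e} m)")
```

With a constant force the mass overshoots and then settles where the spring term alone balances the force (x = F/k), which shows how the three tissue terms share the load differently during motion and at rest.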

Glottal adduction has three parts. (1) How close the vocal processes are to each other determines the posterior prephonatory closeness of the membranous vocal folds. (2) The space created by the intercartilaginous glottis determines the “constant” opening there through which some or all of the dc (baseline) air will flow. (3) The closeness of the membranous vocal folds partly determines whether vocal fold oscillation can take place. To permit oscillation, the vocal folds must be within the phonatory adductory range (neither too far apart nor overly compressed; Scherer, 1995), and the transglottal pressure must be at or above the phonatory threshold pressure (Titze, 1992) for the prevailing conditions of the vocal fold tissues and adduction.

The fundamental frequency F0, related to the pitch of the voice, will tend to rise if the tension of the tissue in motion increases, and will tend to fall if the length, mass, or density increases. The most general expression to date for pitch control has been offered by Titze (1994), viz.,

$$F_0 = \frac{0.5}{L}\left[\frac{s_p}{\rho}\left(1 + \frac{d_a}{d}\,\frac{s_{am}}{s_p}\,a_{TA}\right)\right]^{0.5}, \qquad (2)$$

where L is the vibrating length of the vocal folds, sp is the passive tension of the tissue in motion, da/d is the ratio of the depth of the thyroarytenoid (TA) muscle in vibration to the total depth in vibration (the other tissue in motion is the more medial mucosal tissue), sam is the maximum active stress that the TA muscle can produce, aTA is the activity level of the TA muscle, and ρ is the density of the tissue in motion. When the vocal folds are lengthened by rotation of the thyroid and cricoid cartilages through the contraction of the cricothyroid (CT) muscles, the passive stretching of the vocal folds increases their passive tension, and thus L and sp tend to counter each other, with sp being more dominant (F0 generally rises with vocal fold elongation). Increasing subglottal pressure increases the lateral amplitude of motion of the vocal folds, thus increasing sp (via greater passive stretch; Titze, 1994) and da, thereby increasing F0. Increasing the contraction of the TA muscle (aTA) would tend to stiffen the muscle and shorten the vocal fold length (L), both of which would raise F0, but at the same time it would decrease the passive tension (sp) and the depth of vibration (da), which would lower F0. Typically, large changes in F0 are associated with increased contraction of both the CT and TA muscles (Hirano, Ohala, and Vennard, 1969; Titze, 1994). Thus, the primary control for F0 is through the coordinated contraction of the TA and CT muscles, and subglottal pressure. Several aspects of F0 control, including the differentiated contraction of the complex TA muscle, anterior pull by the hyoid bone (Honda, 1983), cricoid tilt via tracheal pull (Sundberg, Leanderson, and von Euler, 1989), and the associations with adduction and vocal quality, all need much study.
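To make the dependencies in Eq. (2) concrete, the following Python sketch evaluates it directly. It assumes the exponent 0.5 applies to the whole product of the tension-to-density ratio and the bracketed active-stress term. The function name, argument names, and all numerical values are hypothetical illustrations, not data from the cited sources. The sketch also varies one parameter at a time, whereas the text above stresses that in a real larynx L, sp, and da change together.

```python
import math


def fundamental_frequency(length, s_p, rho, depth_ratio, s_am, a_ta):
    """F0 from Eq. (2), with the exponent 0.5 applied to the whole product.

    length      -- vibrating length L
    s_p         -- passive tension (stress) of the tissue in motion
    rho         -- tissue density
    depth_ratio -- d_a / d, TA depth in vibration over total depth
    s_am        -- maximum active stress of the TA muscle
    a_ta        -- TA activity level (0 = inactive, 1 = maximal)
    """
    active_term = 1.0 + depth_ratio * (s_am / s_p) * a_ta
    return (0.5 / length) * math.sqrt((s_p / rho) * active_term)


# Hypothetical illustration values (SI units), not measurements.
base = dict(length=0.016, s_p=20e3, rho=1040.0, depth_ratio=0.5, s_am=100e3)

for a_ta in (0.0, 0.5, 1.0):
    f0 = fundamental_frequency(a_ta=a_ta, **base)
    print(f"a_TA = {a_ta:.1f} -> F0 = {f0:.1f} Hz")
```

With aTA set to zero the expression reduces to (0.5/L)(sp/ρ)^0.5, so in this simplified form doubling L alone halves F0, while raising sp or aTA alone raises it.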

The intensity of voiced sounds, related to the loudness of the voice, results from the combination and coordination of respiratory, laryngeal, and vocal tract factors. Intensity increases with an increase in subglottal pressure, which itself depends on both lung volume reduction (an increase in air pressure in the lungs) and adduction of the vocal folds (which offers resistance to the flow of air from the lungs). An increase in the subglottal pressure during phonation can affect the cyclic glottal flow waveform (Fig. 1) by increasing its flow peak, increasing the maximum flow declination rate (MFDR, the maximum rate at which the flow shuts off as the glottis is closing), and increasing the sharpness of the baseline corner when the flow is near zero (or near its minimum value in the cycle). Greater peak flow, MFDR, and corner sharpness respectively increase the intensity of F0, the intensity of the first formant region (at least), and the intensity of the higher partials (Fant, Liljencrants, and Lin, 1985; Gauffin and Sundberg, 1989). Glottal adduction level greatly affects the source spectrum or quality of the voice, increasing the negative slope of the spectrum as one changes voice production from highly compressed voice (a relatively flat spectrum) to normal adduction to highly breathy voice (a relatively steep spectrum) (Scherer, 1995). The vocal tract filter function will augment the spectral intensity values of the glottal flow source in the region of the formants (resonances), and will decrease their intensity values in the valleys of the resonant structure (Titze, 1994). The radiation away from the lips will increase the spectrum slope (by about 6 dB per octave).
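The effect of lip radiation on spectral slope can be sketched numerically. The Python example below assumes an idealized tilt of exactly 6 dB per octave measured from a reference frequency, and a hypothetical source spectrum that falls 12 dB per octave; the function name, the reference frequency, and the source slope are illustrative assumptions, and the vocal tract filter is left out entirely.

```python
import math


def apply_radiation_tilt(freqs_hz, levels_db, ref_hz, tilt_db_per_octave=6.0):
    """Add an idealized radiation tilt to each spectral level (in dB)."""
    return [
        level + tilt_db_per_octave * math.log2(f / ref_hz)
        for f, level in zip(freqs_hz, levels_db)
    ]


# Hypothetical source spectrum: components falling 12 dB per octave from 100 Hz.
freqs = [100.0, 200.0, 400.0, 800.0]
source = [-12.0 * math.log2(f / 100.0) for f in freqs]
radiated = apply_radiation_tilt(freqs, source, ref_hz=100.0)

for f, s, r in zip(freqs, source, radiated):
    print(f"{f:6.0f} Hz  source {s:6.1f} dB  radiated {r:6.1f} dB")
```

Under these assumptions the radiated spectrum falls 6 dB per octave instead of 12, so the higher partials are weakened less than in the source alone.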

Figure 1. One cycle of glottal airflow. Uac is the varying portion of the waveform, and Udc is the offset or bias flow. The flow peak is the maximum flow in the cycle, MFDR is the maximum flow declination rate (derivative of the flow), typically located on the right-hand side of the flow pulse, and the corner curvature at the end of the flow pulse describes how sharp the corner “shut-off” is. The flow peak, MFDR, and corner sharpness are all important for the spectral aspects of the flow pulse (see text).
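The waveform measures named in the Figure 1 caption can be estimated from a sampled flow cycle along the following lines. This is a minimal sketch under stated assumptions: Udc is taken as the minimum flow in the cycle, Uac as the peak-to-offset amplitude of the varying portion, and MFDR as the steepest negative first difference, reported as a positive number. The function name and the sample values are hypothetical, and corner sharpness is not estimated here.

```python
def flow_measures(flow, dt):
    """Return peak flow, Udc, Uac, and MFDR for one sampled flow cycle."""
    if len(flow) < 2:
        raise ValueError("need at least two samples")
    peak = max(flow)
    u_dc = min(flow)      # assumed: offset = minimum flow in the cycle
    u_ac = peak - u_dc    # assumed: amplitude of the varying portion
    # First differences approximate the time derivative of the flow.
    slopes = [(flow[i + 1] - flow[i]) / dt for i in range(len(flow) - 1)]
    mfdr = -min(slopes)   # steepest decline, reported as a positive rate
    return {"peak": peak, "u_dc": u_dc, "u_ac": u_ac, "mfdr": mfdr}


# Hypothetical samples of one cycle (L/s), taken 1 ms apart.
cycle = [0.1, 0.3, 0.5, 0.2, 0.1]
measures = flow_measures(cycle, dt=0.001)
print(measures)
```

In this toy cycle the steepest decline occurs just after the peak, matching the caption's remark that MFDR typically lies on the right-hand side of the flow pulse.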


Maintenance of vocal fold oscillation during phonation depends on the tissue characteristics mentioned above, as well as the changing shape of the glottis and the changing intraglottal air pressures during each cycle. During glottal opening, the shape of the glottis corresponding to the vibrating vocal folds is convergent (wider in the lower glottis, narrower in the upper glottis), and the pressures on the walls of the glottis are positive due to this shape and to the (always) positive subglottal pressure (for normal egressive phonation) (Fig. 2). This positive pressure separates the folds during glottal opening. During glottal closing, the shape of the glottis is divergent (narrower in the lower glottis, wider in the upper glottis), and the pressures on the walls of the lower glottis are negative because of this shape (Fig. 2), and negative throughout the glottis when there also is rarefaction (negative pressure) of the supraglottal region. This alternation in glottal shape and intraglottal pressures, along with the alternation of the internal forces of the vocal folds, maintains the oscillation of the vocal folds. The exact glottal shape and intraglottal pressure changes, however, need to be established in the human larynx for the wide range of possible phonatory and vocal tract acoustic conditions.

Figure 2. Pressure profiles within the glottis. The upper trace corresponds to the data for a glottis with a 10° convergence and the lower trace to data for a glottis with a 10° divergence, both having a minimal glottal diameter of 0.04 cm (using a Plexiglas model of the larynx; Scherer and Shinwari, 2000). The transglottal pressure was 10 cm H2O in this illustration. Glottal entrance is at the minimum diameter position for the divergent glottis. The length of the glottal duct was 0.3 cm. Supraglottal pressure was taken to be atmospheric (zero). The convergent glottis shows positive pressures and the divergent glottis shows negative pressures throughout most of the glottis. The curvature at the glottal exit of the convergent glottis prevents the pressures from being positive throughout (Scherer, DeWitt, and Kucinschi, 2001).


When the two medial vocal fold surfaces are not mirror images of each other across the midline, the geometric asymmetry creates different pressures on the two sides (i.e., pressure asymmetries; Scherer et al., 2001, 2002) and therefore different driving forces on the two sides. Also, if there is tissue asymmetry, that is, if the two vocal folds themselves do not have equal values of tension (stiffness) and mass, one vocal fold may not vibrate like the other one, creating roughness, subharmonics, and cyclic groupings (Isshiki and Ishizaka, 1976; Gerratt et al., 1988; Wong et al., 1991; Titze, Baken, and Herzel, 1993; Steinecke and Herzel, 1995).

Figure 3 summarizes some basic aspects of phonation. The upper left suggests muscle contraction effects of vocal fold length (via CT and TA action), adduction (via TA, lateral cricoarytenoid, posterior cricoarytenoid, and interarytenoid muscle contraction), tension (via CT, TA, and adduction), and glottal shape (via vocal fold length, adduction, and TA rounding effect). When lung volume reduction is then employed, glottal airflow and subglottal pressure are created, resulting in motion of the vocal folds (if the adduction and pressure are sufficient), glottal flow resistance (transglottal pressure divided by the airflow), and the fundamental frequency (and pitch) of the voice. With the vocal tract included, the glottal flow is affected by the resonances of the vocal tract (pressures acting at the glottis level) and the inertance of the air of the vocal tract (to skew the glottal flow waveform to the right; Rothenberg, 1983), and the output spectra (quality) and intensity (loudness) result from the combination of the glottal flow, resonance, and radiation from the lips.
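The definition of glottal flow resistance given above (transglottal pressure divided by the airflow) amounts to a one-line calculation. The sketch below assumes pressure in cm H2O and flow in L/s; the function name and the example values are hypothetical.

```python
def glottal_flow_resistance(transglottal_pressure, airflow):
    """Flow resistance = transglottal pressure / airflow.

    Units here are assumed to be cm H2O and L/s, giving cm H2O/(L/s).
    """
    if airflow <= 0:
        raise ValueError("airflow must be positive")
    return transglottal_pressure / airflow


# Hypothetical example: 10 cm H2O across the glottis with 0.2 L/s of flow.
resistance = glottal_flow_resistance(10.0, 0.2)
print(f"flow resistance: {resistance:.1f} cm H2O/(L/s)")
```

For a fixed transglottal pressure, a lower airflow gives a higher computed resistance.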

Figure 3. Factors leading to pitch, quality, and loudness production. See text.


Many basic issues of glottal aerodynamics, aeroacoustics, and modeling remain unclear for both normal and abnormal phonation. The glottal flow (the volume velocity flow) is considered a primary sound source, and the presence of the false vocal folds may interfere with the glottal jet and create a secondary sound source (Zhang et al., 2001). The turbulence and vorticities of the glottal flow may also contribute sound sources (Zhang et al., 2001). The false vocal folds themselves may contribute significant control of the flow resistance through the larynx, from more resistance (decreasing the flow if the false folds are quite close) to less resistance (increasing the glottal flow when the false folds are in an intermediate position) (Agarwal and Scherer, in press). Computer modeling needs to be practical, as in two-mass modeling (Ishizaka and Flanagan, 1972), but also closer to physiological reality, as in finite element modeling (Alipour, Berry, and Titze, 2000; Alipour and Scherer, 2000). The most complete approach so far is to combine finite element modeling of the tissue with computational fluid dynamics of the flow (to solve the Navier-Stokes equations; Alipour and Titze, 1996). However, we still need models of phonation that are helpful in describing and predicting subtle aspects of laryngeal function necessary for differentiating vocal pathologies, phonation styles and types, and approaches for phonosurgery, as well as for providing rehabilitation and training feedback for clients.

See also voice acoustics.

 


© 2010 The MIT Press