Among the many published approaches for the treatment of aphasia, melodic intonation therapy (MIT) is one of the few techniques whose clinical effectiveness has been established by peer review (American Academy of Neurology, 1994). The effectiveness of the program is based on the specific guidelines for patient candidacy, its formalized protocol, and a variety of reports testifying to improved communication competence following MIT. After evaluating the available evidence, the American Academy of Neurology considers the program to be promising when administered by a qualified speech-language pathologist.
The guiding principles and procedures associated with MIT were set forth in the early works of Albert, Sparks, and Helm (1973), Sparks, Helm, and Albert (1974), and Sparks and Holland (1976). More recent descriptions of the program can be found in Helm-Estabrooks and Albert (1991) and Sparks (2001). Generally, three principles form the conceptual foundation for MIT. First, in most of the population, the right cerebral hemisphere mediates music and speech prosody. Second, the right hemisphere is preserved in most individuals with aphasia, and as a result, singing abilities are generally preserved even in the most severe cases of aphasia. Third, the preserved musical and prosodic capabilities of the right hemisphere can be exploited to rehabilitate language production in patients with aphasia.
The goals of MIT are to facilitate some recovery of language production in severely nonfluent speakers with poorly articulated or severely restricted verbal output. Good candidates have poor repetition but at least moderately preserved to essentially normal language comprehension. Attempts at self-correction are evident. They are emotionally stable, if sometimes depressed, and highly motivated to improve their speech. A coexisting buccofacial apraxia is usually observed, as well as right hemiplegia that is greater in the arm than leg. The program therefore seems to be particularly suited for patients with Broca's or mixed nonfluent aphasia with accompanying apraxia of speech (Tonkovich and Peach, 1989; Square, Martin, and Bose, 2001). These characteristics also generally exclude patients with Wernicke's, transcortical, or global aphasia.
The initial computed tomographic profile for good candidates included a large lesion in Broca's area extending superiorly to the left premotor and sensorimotor cortex for the face and deep to the periventricular white matter, putamen, and internal capsule. The lesion also typically spared Wernicke's area and the temporal isthmus. No lesions of the right hemisphere were detected; this evidence was used to support the preservation of melodic functions in these patients (Naeser and Helm-Estabrooks, 1985). Naeser (1994) subsequently identified two areas in the subcortical white matter that appeared to play an important role in the recovery of spontaneous speech. Lesions of good responders involved no more than half of the total area, including the medial subcallosal fasciculus and the middle one-third of the periventricular white matter. The extent of lesion in cortical language areas, including Broca's area, could not be used to discriminate between individuals who responded well and those who responded poorly to MIT. Lesions may have involved Wernicke's area or the subcortical temporal isthmus, but when they did, they involved less than half of those areas.
During the beginning stages of an MIT program, emphasis is placed on the production of syntactically and phonologically simplified phrases and sentences that gradually increase in complexity throughout the course of the program. Ideally, language materials are thematically related and relevant to the patient's daily needs and background. A large corpus of materials is recommended to vary the stimuli from session to session and to decrease practice effects. It is debatable whether the use of supplementary pictures or written sentences is appropriate (Helm-Estabrooks and Albert, 1991; Sparks, 2001). Frequent treatment, perhaps twice daily, is essential; when this frequency is unattainable, family members might be enlisted to assist with the program (Sparks, 2001).
MIT focuses on three elements represented in the spoken prosody of verbal utterances: the melodic line or variation in pitch in the spoken phrase or sentence, the tempo and rhythm of the utterance, and the points of stress for emphasis. The intoned pattern has a range of only three or four whole notes that is selected from several reasonable speech prosody patterns for the target sentence. Tempo is slowed by syllable lengthening; phrase accuracy appears to be best when syllable durations approximate 2.0 s per syllable. The effects of this tempo are most pronounced when patients are required to intone utterances independent of the clinician (Laughlin, Naeser, and Gordon, 1979). Rhythm and stress are exaggerated by elevating intoned notes and increasing loudness. Clinicians tap out and further reinforce the rhythm and stress of the utterances using the patient's hand. The emphasis on slow tempo, precise rhythm, and distinct stress appears to facilitate the processing of the structure and the articulation of the intoned utterances.
The MIT program consists of four levels. In level I, the clinician hums a melody pattern within the three- to four-note range and aids the patient in tapping the rhythm and stress of the stimulus melody to establish the process of intoning melody patterns with hand tapping. Level II requires the patient to tap and repeat the clinician's production of the intoned utterance and to respond to a probe question eliciting an intoned repetition of the intoned utterance. Hand tapping is not used in response to probe questions. The clinician provides assistance by intoning the utterance in unison with the patient and then fading his participation so that the patient subsequently intones the utterance on his own. In level III, unison intoning of the utterance is followed by immediate fading of the clinician's participation. The patient then produces the target utterance following an enforced delay after the clinician presents it. Finally, the patient gives an appropriate intoned response to an intoned probe question from the clinician. A backup procedure is introduced at this level to provide the patient an opportunity to correct errors. The backups consist of repeating the previous step and attempting the failed step again, and as such constitute an “indirect” approach to correcting errors. The goal of level IV is normal speech prosody. Latencies for delayed repetition are increased, and the training sentences become more complex. A technique called Sprechgesang (speech-song) is used in the transition to speech prosody. In this technique, the constant pitch of the intoned words is replaced by the variable pitch of speech while retaining the tempo, rhythm, and stress of the intoned sentence. Unison production of the target sentence in Sprechgesang is followed by fading, delayed spoken repetition using normal speech prosody, and production using normal prosody in response to a probe question with normal speech prosody.
MIT responses are scored 2, 1, or 0. A successful response earns the full score for its step (1 for steps with no backup, 2 for steps with a backup), a response that succeeds only after the backup is used earns a partial score of 1, and a response that remains unsuccessful after multiple attempts earns 0. The average score for three sessions must be higher than the average score of the three previous sessions for the participant to remain in the program. An overall score of 90% or better for five consecutive sessions is required to advance from one level of MIT to the next.
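The scoring and progression rules just described can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the published protocol. The function names are hypothetical, and three points are assumptions the text does not settle: the "overall score" is taken to be points earned divided by points possible within a session, the two three-session windows are taken to be adjacent and non-overlapping, and a patient with fewer than six recorded sessions is allowed to continue.

```python
from statistics import mean


def score_item(has_backup: bool, first_try_ok: bool, backup_ok: bool = False) -> int:
    """Score one step: full credit on first success, 1 after a backup, else 0."""
    if first_try_ok:
        return 2 if has_backup else 1  # full score depends on whether a backup exists
    if has_backup and backup_ok:
        return 1  # partial score: success only after the backup
    return 0  # unsuccessful after all attempts


def session_percentage(items) -> float:
    """Percentage of possible points earned in a session (assumed definition).

    `items` is a list of (has_backup, first_try_ok, backup_ok) tuples.
    """
    earned = sum(score_item(*item) for item in items)
    possible = sum(2 if item[0] else 1 for item in items)
    return 100.0 * earned / possible


def may_advance(percentages, threshold: float = 90.0, run: int = 5) -> bool:
    """True if the last `run` sessions all reached the threshold."""
    recent = percentages[-run:]
    return len(recent) == run and all(p >= threshold for p in recent)


def should_continue(percentages, window: int = 3) -> bool:
    """True if the latest window's mean exceeds the previous window's mean.

    Assumption: with fewer than two full windows, the patient continues.
    """
    if len(percentages) < 2 * window:
        return True
    latest = mean(percentages[-window:])
    previous = mean(percentages[-2 * window:-window])
    return latest > previous


# Example: one step without a backup passed, one backup step passed outright,
# one passed only after the backup, and one failed.
session = [(False, True, False), (True, True, False),
           (True, False, True), (True, False, False)]
print(round(session_percentage(session), 1))
```

In this reading a clinician would record one percentage per session, call `should_continue` after every third session, and call `may_advance` once five sessions at the current level are available.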
The neurophysiological model offered by the developers of MIT to account for its effectiveness has been controversial since it was first proposed. Berlin (1976) stated that the evidence linking the right hemisphere to the interpretation of nonverbal acoustic processes like music is insufficient to conclude that MIT activates the right hemisphere in some way to control motor speech gestures. Instead, he suggested that good candidates for MIT might have an intact left primary motor area that is deprived of input from the damaged left Broca's area. Improved speech production might then result from transcallosal input to left hemisphere speech motor centers arising from the MIT-activated right hemisphere homologue of Broca's area. An alternative explanation involved input from a disconnected intact left Broca's area to an intact left primary motor area via a transcallosal pathway involving the right hemisphere homologues to these areas.
Belin et al. (1996) used positron emission tomography to investigate recovery from nonfluent aphasia following treatment with MIT. Changes in cerebral blood flow were measured while the participant listened to and repeated simple words, and during repetition of intoned words. Abnormal activation of right hemisphere structures homotopic to those normally activated in the intact left hemisphere was observed during the simple word tasks performed without intoning, while word repetition with intoning reactivated essential motor language zones, including Broca's area and the adjacent left prefrontal cortex. Belin et al. concluded that MIT is more strongly associated with exaggerated speech prosody than with singing and therefore recruits language-related brain areas of the left hemisphere rather than right hemisphere areas.
Boucher et al. (2001) investigated whether the processing of melodic contours in music applies similarly to the processing of speech prosody. According to these authors, melody is associated with musical tone and rhythm. Tonal elements include pitch, timbre, and chord and correspond to intonation in speech. Musical rhythm refers to the timing distribution of tonal elements and is comparable to the stress points of speech. Although there is support for right hemisphere processing of intonation, Boucher et al. (2001) provide evidence that the left hemisphere is involved in the processing of rhythm, and consequently question whether melody-based interventions such as MIT facilitate speech production because of right hemisphere contributions. Following interventions in two speakers with nonfluent aphasia using stimuli emphasizing tone or rhythm in varying conditions, equal or greater success in responding was found for conditions emphasizing rhythm than for conditions emphasizing melodic intoning. Boucher et al. concluded that the right hemisphere explanation for the facilitating effects of MIT could not be supported strongly.