Music Training Facilitates Lexical Stress Processing


RÉGINE KOLINSKY, HÉLÈNE CUVELIER, AND VINCENT GOETRY
Université Libre de Bruxelles (ULB), Brussels, Belgium

ISABELLE PERETZ
Université de Montréal, Montréal, Canada

JOSÉ MORAIS
Université Libre de Bruxelles (ULB), Brussels, Belgium

We investigated whether music training facilitates the processing of lexical stress in natives of a language that does not use lexical stress contrasts. Musically trained (musicians) or untrained (nonmusicians) French natives were presented with two tasks: speeded classification, which required them to focus on a segmental contrast and ignore irrelevant stress variations, and sequence repetition involving either segmental or stress contrasts. In the latter situation, French natives are usually “deaf” to lexical stress, but this was less the case for musicians, demonstrating that music expertise enhances sensitivity to stress contrasts. This increased sensitivity does not seem, however, to unavoidably bias musicians’ attention to stress contrasts: in segmental-based speeded classification, musicians were not more affected than nonmusicians by irrelevant stress variations when overall performance was controlled for. Implications regarding both the notion of modularity of processing and the advantage that musicianship may afford for second language learning are discussed.

Received May 22, 2008, accepted October 25, 2008. Key words: lexical stress, stress deafness, musicianship, domain-transfer effects, modularity

Increasing evidence points to a link between music lessons and prosodic skills in speech. Speech prosody refers to the “musical aspects” of speech, including its “melody” (commonly called intonation) and its rhythm (stress and timing). In music, pitch and temporal relations define musical tunes. In speech, duration, pitch height, intensity, and spectral variations

Music Perception, Volume 26, Issue 3, pp. 235–246, ISSN 0730-7829.

are to different degrees responsible for prosody, which contributes to speech communication through accents, pauses, and intonation that convey important linguistic and emotional information.

Music training has been shown to enhance the ability to decode emotions conveyed by speech prosody (e.g., Nilsonne & Sundberg, 1985; Thompson, Schellenberg, & Husain, 2004). Indeed, emotions are also at the core of music. However, this effect may be mediated by emotional intelligence rather than by music training per se (Trimmer & Cuddy, 2008).

Prosody also provides indexical or extralinguistic information, such as the talker’s voice, gender, or dialect, as well as important linguistic information, for example allowing one to differentiate questions from statements. More generally, linguistic prosody helps listeners to determine boundaries between words and phrases. Descending pitch contours and syllables of longer duration typically mark the ends of words or phrases in speech (e.g., Beckman & Pierrehumbert, 1986; Klatt, 1976), just as pitch drops and durational lengthening mark the ends of musical phrases (e.g., Narmour, 1990; Todd, 1985). Variations in pitch, duration, and intensity provide cues for differentiating among different performances of the same music (Palmer, 1997; Palmer, Jungers, & Jusczyk, 2001), as well as among structural ambiguities (such as phrasal and metrical boundaries) that arise in music (Palmer, 1996; Sloboda, 1985), just as prosodic features of speech can clarify the intended meaning of a syntactically ambiguous sentence (Lehiste, Olive, & Streeter, 1976; Price, Ostendorf, Shattuck-Hufnagel, & Fong, 1991; Scott, 1982). Moreover, both music and speech present accents whose function is to capture the listener’s attention during the unfolding of the verbal message or of a musical piece (e.g., Jones & Boltz, 1989).
This common function is well illustrated by the fact that harmonic accent allows faster processing of target phonemes in phoneme monitoring (Bigand, Tillmann, Poulin, D’Adamo, & Madurell, 2001). Given these similarities, one may expect the processing of subtle, and hence difficult to detect, prosodic variations to benefit from music training. As a matter of fact, musically trained adults and children (musicians) were found to outperform untrained participants (nonmusicians) in

Electronic ISSN 1533-8312. © 2009 by The Regents of the University of California. All rights reserved. Please direct all requests for permission to photocopy or reproduce article content through the University of California Press’s Rights and Permissions website, http://www.ucpressjournals.com/reprintinfo.asp. DOI: 10.1525/MP.2009.26.3.235


detecting weak pitch violations in both melodies and intonation contours of spoken sentences (Magne, Schön, & Besson, 2006; Schön, Magne, & Besson, 2004). Phonetic experts also benefit from music training in difficult prosodic analyses, such as identifying local variations in nuclear linguistic tones (Dankovičová, House, Crooks, & Jones, 2007).

Languages can also be characterized by prosodic cues associated with lexical stress or tones. In tone languages, some words are distinguished solely by variations in pitch contour. In comparison to nonmusicians, musicians have been shown to display more robust brainstem encoding of Mandarin Chinese lexical tones (Wong, Skoe, Russo, Dees, & Kraus, 2007). Thus, via the corticofugal pathway, music training can enhance auditory functions subserved by the rostral brainstem that are relevant to speech.

Surprisingly, until now relatively little attention has been paid to the relations between music training and the processing of word stress patterns. Only some commonalities between music structure and linguistic stress patterns have been reported. The prosody of a composer’s native language influences the structure of his or her instrumental music (e.g., Abraham, 1974; Wenk, 1987). In English, which presents a contrastive opposition between stressed and unstressed syllables, stress patterns tend to coincide with the metrical structure in music (Palmer & Kelly, 1992; see also Lerdahl, 2001). And, analyzing the instrumental music of 16 classic British and French composers (e.g., Delius vs. Saint-Saëns), Patel and Daniele (2003a, 2003b; see also Huron & Ollen, 2003) observed that the structure of their musical themes parallels that of their mother tongue, mainly in the greater variability of the temporal patterning of vowels in (British) English than in French (Grabe & Low, 2002; Low, Grabe, & Nolan, 2000; Ramus, Nespor, & Mehler, 1999).
In the present study, we explored the relations between music and linguistic stress patterns in a different way: we investigated whether music training might facilitate the processing of lexical stress patterns in native speakers of French, a fixed stress language. This language does not use lexical stress contrasts: it displays fixed word-final stress (e.g., Garde, 1968). French differs from varying stress languages such as English, Dutch, and Spanish, which use stress to distinguish lexical items, allowing, for example, Spanish speakers to contrast between BEbe1 (“drink”) and beBE (“baby”).

1 In all the examples, uppercase denotes stressed syllables, lowercase unstressed ones.

Lexical stress is instantiated by a variety of prosodic cues, namely pitch, energy, and duration, the importance of which varies across languages (e.g., Garde, 1968; Lehiste, 1970). In English, stressed syllables are higher-pitched, louder, and somewhat longer than unstressed ones. In fact, English has two kinds of unstressed vowels: reduced and full. Indeed, in this language (but not in others such as Spanish), the variations of pitch, loudness, and duration may be accompanied by a modification of the quality of the unstressed vowel, which is then reduced to a central “schwa” (/ə/). For example, the quality of the unstressed first vowel of the verb obJECT, /əbˈdʒɛkt/, is quite different from that of the stressed first vowel of the noun OBject, /ˈɒbdʒɛkt/. In other minimal pairs, such as TRUSty vs. trusTEE, vocalic reduction is not as strong. Within such pairs, stressed and unstressed syllables differ in vowel length, intensity, and pitch, but far less in vocalic quality.

To what extent are natives of languages such as French “deaf” to such lexical stress differences? Since lexical stress contrasts have substantial acoustic correlates in terms of duration, fundamental frequency (F0), and energy, French natives are not expected to have problems with the perception of stress. In fact, Dupoux, Pallier, Sebastián, and Mehler (1997) reported that in a same/different task, French participants had no difficulty discriminating nonwords that differ only in the position of stress, such as BOpelo–boPElo. However, French listeners were found to be “deaf” to stress contrasts at a more abstract processing level. Dupoux et al. (1997) used a speeded ABX task in which participants heard three successive items (VAsuma–vaSUma–VAsuma) and judged whether the third item was identical to the first or the second one.
This task requires a short-term memory buffer, because the decision has to be delayed until the final stimulus is heard, and the phonetic variability introduced by having different talkers pronounce the three stimuli made an acoustically based response strategy more difficult to use. Under these conditions, French natives made many more errors and were far slower than Spanish natives. In addition, when they did not know which dimension they had to pay attention to, they performed worse on stress (FIdape–fiDApe) than on segmental (fidape–lidape) contrasts, and, contrary to Spanish natives, did not benefit from redundant stress information when segmental information alone was sufficient to perform the task (in FIdape–liDApe compared to fidape–lidape). Thus, only natives of a stress language represent and jointly use both stress and segmental information.


Peperkamp, Dupoux, and Sebastián-Gallés (1999) and Dupoux, Peperkamp, and Sebastián-Gallés (2001) observed even more robust stress deafness in French natives by using a sequence repetition task in which performance on a stress contrast was compared with that on a control segmental contrast across different levels of memory load. The participants were first required to associate two nonwords of a minimal pair differing only in one phonological dimension, either stress location (PIki–piKI) or segmental contrast (TUku–TUpu), with two different keyboard keys. Then, they listened to longer and longer random sequences of the two items, which they were required to recall and transcribe as sequences of key presses. Analysis of the difference scores (performance on stress contrast minus performance on segmental contrast) revealed not only a French–Spanish difference but also nonoverlapping distributions of individual results between groups. In addition, the advantage of segmental over stress contrast in French natives was found to be almost independent of sequence length. Interestingly, French natives who are late learners of Spanish show the same limitation in short-term memory encoding of stress contrasts that was found in French monolinguals (Dupoux, Sebastián-Gallés, Navarrete, & Peperkamp, 2008). Other studies showed that only speakers of languages in which stress is contrastive are unable to ignore stress information, even in tasks in which this filtering capacity would be beneficial. Dupoux et al. (1997) found that Spanish natives could hardly perform a segmental-based ABX task when stress variations were put in conflict with the expected response (e.g., answering “A” to the ABX triple BOpelo–soPElo–boPElo), whereas French natives had no difficulty ignoring the irrelevant stress variations.
Using an adaptation of Garner’s (1974) speeded classification task, Pallier, Cutler, and Sebastián-Gallés (1997) showed that natives of both Spanish and Dutch (two varying stress languages) had difficulties focusing on the segments by filtering out irrelevant stress variations in a segmental-based classification task that required them to press one of two buttons corresponding respectively to deki and nusa, irrespective of stress position. Indeed, both groups presented an interference effect, namely longer reaction times (RTs) when the stress of the nonwords varied from trial to trial (DEki and deKI associated with one response button vs. NUsa and nuSA with the other) than when it was fixed (DEki vs. NUsa). The effect was larger for Spanish than for Dutch natives, suggesting that the degree of interference from stress variations may be mitigated by the predictability of stress, which is greater in Dutch than in Spanish.2


Here, we focused on the facilitation possibly induced by music training on the processing of lexical stress patterns in native speakers of a language that does not use lexical stress contrasts. French natives, either musicians or nonmusicians, were examined. Both groups were presented with two tasks: an adaptation of the sequence repetition task designed by Dupoux et al. (2001), in which they had to process either segmental or stress contrasts, and an adaptation of the segmental-based speeded classification task used by Pallier et al. (1997), in which they were required to ignore irrelevant stress variations. Concerning sequence repetition, we predicted that, as is typical for French natives, musicians and nonmusicians would perform similarly on the segmental contrast and better on the segmental than on the stress contrasts. More interestingly, if musicians were better at processing the acoustic correlates of stress, their segmental-over-stress advantage should be reduced in comparison to nonmusicians, especially for a weak stress contrast that does not involve strong vocalic reduction. In the speeded classification task, we predicted that the interference entailed by the irrelevant and unpredictable stress variations on the segmental-based response would be stronger for musicians than for nonmusicians.

Method

Participants

French native speakers without hearing problems participated in the experiment in exchange for course credits (24) or payment (37). Data for six participants were discarded: two because they were bilinguals, two nonmusicians because they did not reach the success criterion in the sequence repetition task (see Procedure), and two because their music experience exceeded our selection criteria for nonmusicians, which were: (i) no solfeggio lessons; (ii) never having learned to read and write music, and being unable to do so; (iii) no more than three years of music practice, stopped for at least six years. The 32 remaining nonmusicians (9 males and 23 females, aged 18 to 27 yrs, average: 20 yrs) were tested at the Université Libre de Bruxelles. The 25 musicians (12 males and 13 females, aged 21 to 30 yrs, average: 25 yrs) were selected at the Conservatoire Royal de Bruxelles. They had begun music lessons between 4 and 12 yrs of age (on average, at 7 yrs) and had between 12 and 22 yrs of music training (16 yrs, on average). Most were educated in classical music, the majority (16) playing violin, the others playing the piano or other instruments. Most (16) were absolute pitch (AP) possessors. Since most French speakers in Belgium are exposed to a stressed language (Dutch), proficiency in Dutch was assessed in all participants. According to their naming scores on 48 standardized drawings (Snodgrass & Vanderwart, 1980), both groups displayed poor knowledge of Dutch, with 17 and 20% correct for nonmusicians and musicians, respectively, F < 1, as well as poor knowledge of English, with 28 and 33% correct, F(1, 55) = 2.56, p > .10.

2 In Dutch, stress contrasts are usually accompanied by contrasts in syllable weight, whereby stress is assigned to the heavier syllable (e.g., Van der Hulst, 1984).

Materials

SEQUENCE REPETITION TASK

Three minimal pairs of nonwords were constructed: one involving a segmental contrast (IMbect–OMbect, /ˈimbɛkt/–/ˈɒmbɛkt/), another a strong stress contrast, namely a contrast with strong vocalic reduction (Odrept–oDREPT, /ˈɒdrɛpt/–/ɒˈdrɛpt/), and the third a weak stress contrast, namely a contrast with weak vocalic reduction (DRUSnee–drusNEE, /ˈdrʌsni/–/drʌˈsniː/). The nonwords were derived from English words: the imbect pair was derived from invent, and the stress contrast pairs from OBject–obJECT and TRUSty–trusTEE. A female native speaker of Canadian English recorded each nonword several times in a sound-treated booth. Ten tokens of each nonword were used so as to promote acoustic and phonetic variability. As illustrated in Figure 1, in the stress contrast pairs, stressed vowels were longer (F(1, 36) = 725.45, p < .0001) and louder (F(1, 36) = 231.69, p < .0001) than unstressed ones, and had higher pitch, F(1, 36) = 382.42, p < .0001. Stressed and unstressed vowels differed more in the strong than in the weak stress contrast pairs in terms of intensity (F(1, 36) = 4.12, p < .05) and length (F(1, 36) = 16.25, p < .0005), as well as in their vocalic quality (Figure 1d).

FIGURE 1. Average pitch (1a), intensity (1b), syllable length (1c), and formant values (1d) of each syllable of each nonword used in sequence repetition. In Figures 1a, 1b, and 1c, black triangles = syllable 1; transparent squares = syllable 2. In Figure 1d, F1 and F2 values of the stimuli are represented in hertz only for syllable 1, with the initial-stress version of each nonword displayed on the left. Plain lines: “odrept” stimuli; dotted lines: “drusnee” stimuli.


For each minimal pair, five experimental blocks were constructed. Each block was made up of 12 trials, except the first block, which presented only eight trials. In this first block, each trial, namely each sequence that had to be reproduced, contained two nonwords. In the other blocks, trials consisted of sequences of increasing length, from 3-nonword sequences in block 2 up to 6-nonword sequences in block 5. As in Dupoux et al. (2001), we used a short silence interval (30 ms) between successive nonwords to discourage participants from using recoding strategies, e.g., mentally translating the nonwords into their associated key values while listening to the sequence.

SEGMENTAL-BASED SPEEDED CLASSIFICATION TASK

The nonwords mispardof and misparkof were chosen, given that both would make phonologically acceptable items in English and would thus be easily pronounceable by the native speaker of Canadian English (the same as for the previous task) who recorded the material. This speaker recorded several tokens of these nonwords, half with stress on the first syllable (MISpardof, MISparkof; /ˈmispardof/, /ˈmisparkof/), and half with stress on the second syllable (misPARdof, misPARkof; /misˈpaːrdof/, /misˈpaːrkof/). Ten tokens of each stimulus were chosen so as to maximize differences in duration (F(1, 36) = 542.49, p < .0001), intensity (F(1, 36) = 76.28, p < .0001), and F0 (F(1, 36) = 837.88, p < .0001) between the stressed and unstressed syllables. Both materials were digitized at 16 kHz and 16 bits, edited, and stored on a computer.

Procedure

Each participant was tested individually in a quiet room, facing a Macintosh computer with an AZERTY keyboard. Presentation of the stimuli and response recording were controlled using PsyScope (Cohen, MacWhinney, Flatt, & Provost, 1993). Stimuli were delivered through two loudspeakers located in front of the participant, one on the left, the other on the right.

In the sequence repetition task, all participants were first presented with the segmental contrast, which was used to establish baseline performance. Presentation order of the strong and weak stress contrasts was counterbalanced between participants. Participants could listen to the various tokens of these items by pressing the corresponding keyboard keys as many times as they wanted. IMbect was associated with the key “q” and OMbect with the key “ù”; the nonwords with stress on the first syllable, Odrept and DRUSnee, were associated with “q,” and those with stress on the second syllable with “ù.” In the training phase, participants heard a token of one of

239

the nonwords and had to press the associated key. An auditory beep informed them when they made an error. The success criterion was eight correct responses in a row for one particular nonword. Those who succeeded then moved on to the experimental phase, in which their task was to reproduce each sequence by typing the associated keys in the correct order. The order of the nonwords within sequences excluded long repetitive runs (e.g., qqq, ùùùù). The order of sequences within each block was randomized for each participant. To prevent participants from relying on echoic memory, on each trial they had to wait 30 ms after the last nonword for a beep and the word “OK” to appear on the computer screen before beginning to type their response. During this phase, no feedback was provided. A 2-s pause separated the end of the reproduction of a sequence from the next trial.

In the speeded classification task, each trial started with the auditory presentation of a nonword that participants had to identify. They indicated their response by pressing one of two keys corresponding respectively to mispardof (q) and misparkof (ù), irrespective of stress position. The initial consonant of the third syllable was thus the only relevant difference. Belated responses (occurring more than 1500 ms after stimulus onset) were signaled by auditory feedback. A 200-ms pause separated each response from the next trial. There were two varied blocks and two fixed blocks of 64 trials each. In the varied blocks, four items (MISpardof and misPARdof vs. MISparkof and misPARkof) were used and presented in a random sequence. In the fixed blocks, only two items were used, keeping the stress pattern constant: one fixed block contained nonwords with initial stress (MISpardof vs. MISparkof), the other nonwords with final stress (misPARdof vs. misPARkof). Block order was counterbalanced across participants; item order within a block was randomized for each participant.
This experimental phase was preceded by a learning phase for acquisition of the association of each nonword with a particular key, followed by 16 practice trials.

Participants were tested in a single session, which lasted about one hour. First, they completed a questionnaire assessing their music training and expertise. Then, they performed the speeded classification task, after which their knowledge of Dutch was assessed with a naming task to verify that they were indeed French monolinguals. Finally, they performed the sequence repetition task, after which their knowledge of English was assessed in the same way as for Dutch. The sequence repetition task was run after the segmental-based classification task in order to reduce the possibility that having to process a stress contrast first would make it more difficult to ignore this speech dimension.
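As a concrete illustration of the training criterion described above (eight correct responses in a row for one particular nonword), the check can be sketched as follows. This is an illustrative sketch, not the original PsyScope implementation; the function name is ours.

```python
# Illustrative sketch (not the original PsyScope code) of the training-phase
# success criterion: eight correct key presses in a row for one nonword.
def reached_criterion(responses, run_length=8):
    """responses: booleans, True = correct key press on that training trial.
    Returns the 1-based trial index on which the criterion run is completed,
    or None if the criterion is never reached."""
    run = 0
    for trial, correct in enumerate(responses, start=1):
        run = run + 1 if correct else 0  # an error resets the run
        if run >= run_length:
            return trial
    return None
```

Because an error resets the run, a participant who errs late in training still needs eight further consecutive correct responses, which is why slow learners could accumulate 20 or 30 training trials.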


FIGURE 2. Mean percentages of correct responses in sequence repetition, separately for each length, contrast, and group.

Results

Sequence Repetition

Mean percentages of correct responses (fully correct transcriptions of the input sequences) are presented in Figure 2, separately for each length, contrast, and group. One-tailed t-tests showed that performance was always above chance level,3 all ps < .0001. The ANOVA revealed better performance for shorter than for longer sequences, F(4, 220) = 205.37, p < .0001, and better performance for musicians than for nonmusicians, F(1, 55) = 25.88, p < .0001. It also showed an effect of contrast, F(2, 110) = 74.03, p < .0001, with better performance for the segmental (F(1, 110) = 83.82, p < .0001) and strong stress (F(1, 110) = 99.9, p < .0001) contrasts (which did not differ from each other, F < 1) than for the weak stress contrast. There were also significant interactions between contrast and sequence length, F(8, 440) = 3.54, p < .0005, between group and contrast, F(2, 110) = 7.34, p = .001, and between group, contrast, and sequence length, F(8, 440) = 2.19, p < .05. Decomposition of the latter interaction (using a Bonferroni-corrected alpha rate of .017) showed that whereas for the segmental and strong stress contrasts the groups tended to differ from each other only on the longest sequences (interactions with sequence length: F(4, 220) = 3.82, p = .005, and F(4, 220) = 3.01, p = .019), they differed at all sequence lengths for the weak stress contrast, F(1, 55) = 19.16, p < .0001 (interaction with length: F < 1), with musicians always significantly outperforming nonmusicians (all ps ≤ .005).4

3 According to the binomial law, chance level is equal to 1/2^n, with n being the sequence length. The chance level is thus 1/4 = 25% at length 2, 1/8 = 12.5% at length 3, 1/16 = 6.25% at length 4, 1/32 = 3.12% at length 5, and 1/64 = 1.56% at length 6.

4 A separate ANOVA revealed that there was no difference between AP possessors and other musicians regarding stress deafness. AP possessors did not outperform other musicians overall, F < 1. Although they tended to perform better than the others on the longest sequences (on average, 74.7% vs. 62.4% for other musicians), the group by sequence length interaction only tended towards significance, F(4, 92) = 2.13, p = .08. More crucially, contrast did not interact with group, and the interaction between group, length, and contrast was not significant, F < 1 in both cases.

The fact that the groups differed on all long sequences, even for the segmental contrast (from sequence length 4 on, all ps ≤ .05), introduces a potential confound. In order to obtain more convincing evidence on the specific effect that music training might have on the processing of stress contrasts, further analyses were performed on a difference score similar to the one used by Dupoux et al. (2001). This was calculated as performance on the segmental contrast minus performance on the stress contrast. Hence, any remaining length effect observed on the difference score may be attributed to specific difficulties for stress at longer sequence lengths. Since the weak stress contrast was more difficult than the strong stress contrast, two separate difference scores were calculated: the segmental advantage was calculated


FIGURE 3. Difference scores in sequence repetition for the segmental vs. strong stress contrast and for the segmental vs. weak stress contrast comparisons, calculated either (a) overall (error bars represent the standard deviation) or (b) separately for each sequence length.

in comparison to either the strong or the weak stress contrast. Whereas both groups obtained similar performance for the segmental and strong stress contrasts, and hence showed no segmental advantage in this case (Figure 3a), the performance advantage of the segmental over the weak stress contrast, reflecting stress deafness for the latter, was much less pronounced in musicians. A Mann-Whitney U test on the difference scores (cf. Peperkamp et al., 1999) showed no overlap between the distributions

of the segmental advantage displayed by musicians and nonmusicians with the weak stress contrast, U = 186.0, p = .0006, this not being the case with the strong stress contrast, U = 330.5, p = .263. An ANOVA performed on the difference scores, calculated separately for each sequence length, showed significant effects of group, F(1, 55) = 11.71, p < .005, and length, F(4, 220) = 3.78, p < .01, as well as a significant difference between the two difference scores,


FIGURE 4. RTs to correct responses in speeded classification, separately for fixed and varied blocks in each group (error bars represent the standard deviation).

F(1, 55) = 67.53, p < .0001. There were significant interactions between length and type of difference score, F(4, 220) = 4.28, p < .005, as well as between group and type of difference score, F(1, 55) = 5.80, p < .025, but neither the interaction between group and sequence length, F(4, 220) = 1.91, p > .10, nor the three-way interaction, F < 1, reached significance. Decomposition of the group by type of difference score interaction (using a Bonferroni-corrected alpha rate of .025) showed that the groups did not differ for the comparison between the segmental and strong stress contrasts, F(1, 55) = 2.54, p > .10, whereas a highly significant group difference was observed for the comparison between the segmental and weak stress contrasts, F(1, 55) = 9.67, p < .005, without interaction with length, F < 1. Indeed, when considering the segmental and weak stress contrasts, stress deafness was weaker in musicians at all sequence lengths (Figure 3b).5

5 In agreement with the previous analysis, no difference was found between AP possessors and other musicians, be it in the distribution of the overall segmental advantage (both Mann-Whitney U tests were nonsignificant, p > .10) or in the ANOVA considering sequence length as an additional variable (F < 1 for the group effect as well as for the interactions involving group).
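For concreteness, the chance levels given in footnote 3 and the difference score used in these analyses can be reproduced in a few lines. This is a sketch under our own naming, not the authors' analysis code.

```python
# Chance level for a fully correct transcription of a sequence of n
# two-alternative items (footnote 3): (1/2) ** n, here expressed in percent.
chance_pct = {n: 100 * 0.5 ** n for n in range(2, 7)}
# chance_pct[2] is 25.0 and chance_pct[6] is 1.5625, matching footnote 3.

# Difference score as in Dupoux et al. (2001): performance on the segmental
# contrast minus performance on the stress contrast, in percentage points.
# Positive values index stress "deafness".
def segmental_advantage(pct_segmental, pct_stress):
    return pct_segmental - pct_stress
```

On this score, a group whose stress-contrast performance approaches its segmental-contrast performance, as the musicians' did for the weak stress contrast, shows a segmental advantage near zero.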

Speeded Classification

The ANOVA on RTs to correct responses showed significant effects of group, F(1, 55) = 11.49, p < .005, and block type,6 F(1, 55) = 14.57, p < .005, as well as an interaction between these factors, F(1, 55) = 4.99, p < .05. As illustrated in Figure 4, only musicians suffered from interference, F(1, 24) = 33.46, p < .0001 (nonmusicians: F = 1). However, interpretation of this effect is made difficult by the fact that musicians were faster than nonmusicians overall, on both varied (F(1, 55) = 5.54, p < .025) and fixed (F(1, 55) = 15.59, p < .0005) blocks. On accuracy, musicians also outperformed nonmusicians, with 98.9 vs. 97.5% correct, respectively, F(1, 55) = 5.82, p < .025. Performance was better overall on fixed than on varied blocks, with 98.5 vs. 97.8% correct, respectively, F(1, 55) = 12.34, p < .001, but this effect did not depend on group, F < 1.7

6 A preliminary analysis showed no difference between the two kinds of fixed blocks (stressed on the first vs. second syllable: 1033 vs. 1035 ms on average, respectively).

7 Again, no difference was found between AP possessors and other musicians. AP possessors did not outperform and were not faster than other musicians, and group did not interact with block type in the analysis of either RTs or correct scores, all ps > .10.
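The Garner interference measure analyzed here amounts to a difference of mean correct RTs between varied and fixed blocks. A minimal sketch, with hypothetical RT lists (not the study's data):

```python
from statistics import mean

def interference_ms(varied_rts_ms, fixed_rts_ms):
    """Garner interference: mean correct RT in the varied blocks minus mean
    correct RT in the fixed blocks. Positive values mean that irrelevant
    stress variation slowed segmental-based classification."""
    return mean(varied_rts_ms) - mean(fixed_rts_ms)
```

Computed per participant, these values are what the block type effect and the group-by-block-type interaction reported above are testing.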


Since musicians outperformed nonmusicians overall in terms of both response speed and accuracy, interference may have been attenuated in the latter group because they were responding more slowly and less accurately even in the fixed blocks. The amount of interference was indeed related to overall response speed: the correlation between the size of the interference on RTs and the RTs observed in the fixed blocks was significant, r(55) = −.46, p < .0005, at least in nonmusicians, r(30) = −.46, p < .01 (in musicians: r(23) = −.14, p > .10). We therefore tried to match the two groups on their fixed-block scores.8 In doing this, 20 musicians and 18 nonmusicians were found to obtain similar average RTs in the fixed blocks (1018 and 1013 ms, respectively). The ANOVA on these scores showed only an effect of condition, with shorter RTs in the fixed than in the varied blocks (1016 vs. 1042 ms, respectively), F(1, 36) = 18.71, p < .0001, without any interaction with group, F < 1. Interference was about the same in musicians (28 ms) and nonmusicians (25 ms).

Discussion

The results of the sequence repetition task show that, in natives of a nonstress language, music expertise enhances sensitivity to stress contrasts that lack major segmental correlates. Indeed, musicians exhibited reduced stress deafness for the weak stress contrast, whereas no group difference was observed for the strong stress contrast, in which vocalic reduction provided major segmental correlates of stress and which presented stronger syllable length and intensity differences as a function of stress. This is a new finding supporting the idea that processing of difficult linguistic prosodic variations benefits from music training.

Reduced stress deafness in musicians was observed at all sequence lengths, suggesting that nonmusicians' difficulties with stress arise during the encoding of these contrasts rather than from memory. Consistently, nonmusicians' difficulties with stress were already apparent while learning the association of the isolated nonwords with response keys: they were slower to reach the criterion of eight correct responses in a row (nine participants needed more than 20 training trials, among whom seven needed more than 30 trials, which was never the case for musicians, who always needed fewer than 20 trials to distinguish the stressed words), and their performance was poorer: 91% correct, versus 98% for musicians.

8 We discarded the nonmusicians with fixed-block RTs higher than the maximum displayed by musicians, as well as the musicians with fixed-block RTs lower than the minimum observed in nonmusicians.

Dupoux et al.


(2008) also reported an effect at training: their French participants showed an effect of contrast, due to more errors for stress than for the segment. The late learners of Spanish showed a smaller but significant effect, and the Spanish natives showed no effect at all. Thus, in both studies the group differences do not depend on memory load and are quite strong.

Still, the results of the speeded classification task show that the musicians' enhanced sensitivity to stress contrasts does not make them poorer than nonmusicians when stress variations must be ignored. In fact, only musicians showed interference from irrelevant stress variations in this task, but this could be due to their overall higher response speed. Indeed, when the groups were matched on their fixed-block scores, and hence on overall performance level, no difference in the amount of interference was observed anymore.

Taken together, the results of the two experiments suggest that music training entails a benefit-without-cost pattern in the processing of prosodic features like lexical stress. Although far more sensitive than nonmusicians to lexical stress variations, as shown by their reduced stress deafness, musicians are able to discard such variations when they are irrelevant to the task, as was the case in speeded classification.

Positive associations between music training and speech processing are sometimes referred to as domain-transfer effects that contradict a modular view of speech and music processing. We believe, however, that the debate regarding the origin of such effects is still open. First, when comparing musicians and nonmusicians, observed associations could be either genetic or the consequence of music lessons. As a matter of fact, musicianship is likely to result from music aptitude and training combined with other factors.
Second, many data indicate that music and language have distinct cortical representations and that either domain can be selectively impaired (see the review in Peretz, 2006). Even the ability to process prosodic information such as linguistic intonation seems at least partly independent from music processing. Indeed, patients who are impaired in music processing as a consequence of a brain lesion, as well as individuals who suffer from lifelong difficulties with music, do not necessarily present severe problems in processing linguistic intonation (e.g., Ayotte, Peretz, & Hyde, 2002; Patel, Wong, Foxton, Lochy, & Peretz, 2008; Peretz et al., 1994). While it is possible that only some aspects of language and music processing are modular (e.g., Peretz & Coltheart, 2003), and that prosody does not belong to such modular subsystems, additional evidence is needed on this point. Third, as argued by Schellenberg and


Peretz (2008), observed associations between music and language could provide evidence against the modular position, but they could also be the product of domain-general attentional or corticofugal influences. For example, taking music lessons could be one learning experience that improves executive function and, consequently, test-taking abilities in a variety of cognitive domains.

Some results of the present study support this view. Indeed, in the sequence repetition task musicians were better than nonmusicians on all long sequences, even for the segmental contrast. This converges with former evidence of a positive association between music training and memory performance (e.g., Chan, Ho, & Cheung, 1998; Ho, Cheung, & Chan, 2003; Kilgour, Jakobson, & Cuddy, 2000). In addition, since musicians were far more sensitive than nonmusicians to lexical stress variations, as shown by their reduced stress deafness, one would have expected them to be less able to discard such variations when they are irrelevant to the task. This is not what we observed in speeded classification. That task actually required selective attention to the target dimension, here a segmental one. Thus, the fact that musicians were as able as nonmusicians to discard irrelevant stress variations (at least when they performed at the same overall level) suggests that musicianship not only tunes the encoding of some linguistically relevant parameters, but also develops, in parallel, the capacity to attend selectively to auditory dimensions, whatever the domain. Analytic skills developed through music training could be at the core of this effect.

Whether general cognitive skills can also explain the increased sensitivity to lexical stress contrasts exhibited by musicians in the sequence repetition task should be further investigated. As already noted, this effect does not seem to depend on improved memory capacity.
In fact, both observation of the participants' behavior during the sequence repetition task and debriefing data suggest that musicians used musical strategies to perform the task. Indeed, when first presented with the stress contrasts, several musicians spontaneously referred to intensity and/or duration differences between the stressed and unstressed syllables. Although some nonmusicians also noticed such differences, only musicians frequently referred to the "rhythm" of the sequences, and during the task some of them showed a rhythmic swaying or nodding behavior never adopted by nonmusicians. However, such musical strategies may depend on a general improvement of executive function, allowing musicians, for example, to better focus their attention on duration or intensity variations between the stressed and unstressed

syllables of the weak stress contrast. In other words, although in the present study the stress deafness effect displayed by musicians was not significantly correlated with the amount of interference they presented in speeded classification (on RTs: r = −.21; on correct scores: r = −.11, p > .10 in both cases), both effects may in principle depend on a general improvement of executive function induced by musicianship.

In any case, the fact that musicianship affords an advantage for lexical stress processing may have important educational consequences. For example, we know that learning a new language, especially in the initial phase in which one needs to segment new words, may largely benefit from the motivational and structuring properties of music in songs (Schön et al., 2008). Future work should examine whether the effect observed here aids the acquisition of a foreign language. It has been reported that music ability facilitates the acquisition of a second language's sound structure by improving receptive and productive phonological abilities (Slevc & Miyake, 2006). Japanese adults who had arrived late in the United States and who were good at analyzing, discriminating, and remembering music stimuli were also better than their less musically talented peers at accurately perceiving and producing sounds of their second language (English). Yet, Slevc and Miyake only looked at the ability to discriminate segmental contrasts, by asking participants to decide which member of a minimal pair like "playing/praying" was presented in contextually neutral sentences. What the present results suggest is that music training may favor second language learning by improving the ability not only to process non-native segmental contrasts (as shown by Slevc & Miyake, 2006) but also to process (or to pay attention to) non-native prosodic contrasts.
Our finding that musicians show increased sensitivity to stress contrasts in a foreign language (the English-like nonwords they were presented with) is consistent with the fact that musicians are more sensitive than nonmusicians to pitch variations in the intonation contours of a foreign language that they do not understand (Marques, Moreno, Castro, & Besson, 2007). One way to assess whether such effects of music training afford a substantial benefit to second language acquisition may be to compare French musicians and nonmusicians in a word-learning paradigm, to study how easily they can acquire new lexical items that differ only in their stress patterns, or only in their pitch patterns. Indeed, presenting native English speakers with English pseudosyllables superimposed with three non-native suprasegmental contrasts (pitch patterns), Wong and Perrachione (2007) observed large individual differences


in learning success that were associated with the learners' ability to perceive pitch patterns in a nonlexical context and with their previous music experience. We would predict a similarly strong association for both lexical stress and pitch patterns. In addition, the positive effect of music training could generalize to reading abilities, since stress processing abilities influence reading development in a second, stress-based language (Goetry, Wade-Woolley, Kolinsky, & Mousty, 2006).


Author Note

Preparation of this article was supported by a grant from the Human Frontier Science Program (RGP 53/2002, "An interdisciplinary approach to the problem of language and music specificity") and by an FRFC grant (2.4633.66, "Mental representations of music and language in singing and nature of their interactions"). The first author is Senior Research Associate of the Fonds de la Recherche Scientifique-FNRS. Many thanks to Leslie Wade-Woolley for having produced the material, to Pascale Lidji and two anonymous reviewers for their helpful comments on a previous version of this paper, and to Emmanuel Bigand for his attentive and friendly organization.

Correspondence concerning this article should be addressed to Régine Kolinsky, UNESCOG, Université Libre de Bruxelles, CP 191, Av. Franklin Roosevelt 50, B-1050 Brussels, Belgium. E-mail: [email protected]

References

Abraham, G. (1974). The tradition of Western music (pp. 62-83). Berkeley, CA: University of California Press.
Ayotte, J., Peretz, I., & Hyde, K. (2002). Congenital amusia: A group study of adults afflicted with a music-specific disorder. Brain, 125, 238-251.
Beckman, M. E., & Pierrehumbert, J. (1986). Intonational structure in Japanese and English. Phonology Yearbook, 3, 255-309.
Bigand, E., Tillmann, B., Poulin, B., D'Adamo, D. A., & Madurell, F. (2001). The effect of harmonic context on phoneme monitoring in vocal music. Cognition, 81, B11-B20.
Chan, A. S., Ho, Y.-C., & Cheung, M.-C. (1998). Music training improves verbal memory. Nature, 396, 128.
Cohen, J. D., MacWhinney, B., Flatt, M., & Provost, J. (1993). PsyScope: An interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods, Instruments, and Computers, 25, 257-271.
Dankovičová, J., House, J., Crooks, A., & Jones, K. (2007). The relationship between musical skills, music training, and intonation analysis skills. Language and Speech, 50, 177-225.
Dupoux, E., Pallier, C., Sebastián, N., & Mehler, J. (1997). A destressing 'deafness' in French? Journal of Memory and Language, 36, 406-421.
Dupoux, E., Peperkamp, S., & Sebastián-Gallés, N. (2001). A robust method to study stress "deafness." Journal of the Acoustical Society of America, 110, 1606-1618.
Dupoux, E., Sebastián-Gallés, N., Navarrete, E., & Peperkamp, S. (2008). Persistent stress 'deafness': The case of French learners of Spanish. Cognition, 106, 682-706.
Garde, P. (1968). L'accent [The accent]. Paris: Presses Universitaires de France.

Garner, W. R. (1974). The processing of information and structure. Potomac, MD: Erlbaum.
Goetry, V., Wade-Woolley, L., Kolinsky, R., & Mousty, P. (2006). The role of stress processing abilities in the development of bilingual reading. Journal of Research in Reading, 29, 349-362.
Grabe, E., & Low, E. L. (2002). Durational variability in speech and the rhythm class hypothesis. In C. Gussenhoven & N. Warner (Eds.), Laboratory phonology 7 (pp. 515-546). Berlin: Mouton de Gruyter.
Ho, Y.-C., Cheung, M.-C., & Chan, A. S. (2003). Music training improves verbal but not visual memory: Cross-sectional and longitudinal explorations in children. Neuropsychology, 17, 439-450.
Huron, D., & Ollen, J. (2003). Agogic contrast in French and English themes: Further support for Patel and Daniele (2003). Music Perception, 21, 267-271.
Jones, M. R., & Boltz, M. (1989). Dynamic attending and responses to time. Psychological Review, 96, 459-491.
Kilgour, A. R., Jakobson, L. S., & Cuddy, L. L. (2000). Music training and rate of presentation as mediators of text and song recall. Memory and Cognition, 28, 700-710.
Klatt, D. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America, 59, 1208-1221.
Lehiste, I. (1970). Suprasegmentals. Cambridge, MA: MIT Press.
Lehiste, I., Olive, J. P., & Streeter, L. (1976). The role of duration in disambiguating syntactically ambiguous sentences. Journal of the Acoustical Society of America, 60, 1199-1202.
Lerdahl, F. (2001). The sounds of poetry viewed as music. Annals of the New York Academy of Sciences, 930, 337-354.


Low, E. L., Grabe, E., & Nolan, F. (2000). Quantitative characterizations of speech rhythm: Syllable-timing in Singapore English. Language and Speech, 43, 377-401.
Magne, C., Schön, D., & Besson, M. (2006). Musician children detect pitch violations in both music and language better than nonmusician children: Behavioral and electrophysiological approaches. Journal of Cognitive Neuroscience, 18, 199-211.
Marques, C., Moreno, S., Castro, S.-L., & Besson, M. (2007). Musicians detect pitch violation in a foreign language better than nonmusicians: Behavioral and electrophysiological evidence. Journal of Cognitive Neuroscience, 19, 1453-1463.
Narmour, E. (1990). The analysis and cognition of basic melodic structures. Chicago, IL: University of Chicago Press.
Nilsonne, A., & Sundberg, J. (1985). Differences in ability of musicians and nonmusicians to judge emotional state from the fundamental frequency of voice samples. Music Perception, 2, 507-516.
Pallier, C., Cutler, A., & Sebastián-Gallés, N. (1997). Prosodic structure and phonetic processing: A cross-linguistic study. Proceedings of Eurospeech '97, 5th European Conference on Speech Communication and Technology, 4, 2131-2134.
Palmer, C. (1996). On the assignment of structure in music performance. Music Perception, 14, 23-56.
Palmer, C. (1997). Music performance. Annual Review of Psychology, 48, 115-138.
Palmer, C., Jungers, M. K., & Jusczyk, P. W. (2001). Episodic memory for musical prosody. Journal of Memory and Language, 45, 526-545.
Palmer, C., & Kelly, M. H. (1992). Linguistic prosody and musical meter in song. Journal of Memory and Language, 31, 525-542.
Patel, A. D., & Daniele, J. R. (2003a). An empirical comparison of rhythm in language and music. Cognition, 87, B35-B45.
Patel, A. D., & Daniele, J. R. (2003b). Stress-timed vs. syllable-timed music? A comment on Huron and Ollen (2003). Music Perception, 21, 273-276.
Patel, A., Wong, M., Foxton, J., Lochy, A., & Peretz, I. (2008). Speech intonation perception deficits in musical tone deafness (congenital amusia). Music Perception, 25, 357-368.
Peperkamp, S., Dupoux, E., & Sebastián-Gallés, N. (1999). Perception of stress by French, Spanish, and bilingual subjects. Proceedings of Eurospeech '99, 6, 2683-2686.
Peretz, I. (2006). The nature of music from a biological perspective. Cognition, 100, 1-32.
Peretz, I., & Coltheart, M. (2003). Modularity of music processing. Nature Neuroscience, 6, 688-691.

Peretz, I., Kolinsky, R., Labrecque, R., Tramo, M., Hublet, C., Demeurisse, G., & Belleville, S. (1994). Functional dissociations following bilateral lesions of auditory cortex. Brain, 117, 1283-1301.
Price, P. J., Ostendorf, M., Shattuck-Hufnagel, S., & Fong, C. (1991). The use of prosody in syntactic disambiguation. Journal of the Acoustical Society of America, 90, 2956-2970.
Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73, 265-292.
Schellenberg, E. G., & Peretz, I. (2008). Music, language and cognition: Unresolved issues. Trends in Cognitive Sciences, 12, 45-46.
Schön, D., Boyer, M., Moreno, S., Besson, M., Peretz, I., & Kolinsky, R. (2008). Songs as an aid for language acquisition. Cognition, 106, 975-983.
Schön, D., Magne, C., & Besson, M. (2004). The music of speech: Music training facilitates pitch processing in both music and language. Psychophysiology, 41, 341-349.
Scott, D. R. (1982). Duration as a cue to the perception of a phrase boundary. Journal of the Acoustical Society of America, 71, 996-1007.
Slevc, L. R., & Miyake, A. (2006). Individual differences in second language proficiency: Does musical ability matter? Psychological Science, 17, 675-681.
Sloboda, J. A. (1985). Expressive skill in two pianists: Metrical communication in real and simulated performances. Canadian Journal of Psychology, 39, 273-293.
Snodgrass, J. G., & Vanderwart, M. A. (1980). A standardized set of 260 pictures: Norms for name agreement, familiarity and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6, 174-215.
Thompson, W. F., Schellenberg, E. G., & Husain, G. (2004). Decoding speech prosody: Do music lessons help? Emotion, 4, 46-64.
Todd, N. (1985). A model of expressive timing in tonal music. Music Perception, 3, 33-58.
Trimmer, C. G., & Cuddy, L. L. (2008). Emotional intelligence, not music training, predicts recognition of emotional speech prosody. Emotion, 8, 838-849.
Van der Hulst, H. (1984). Syllable structure and stress in Dutch. Dordrecht: Foris.
Wenk, B. J. (1987). Just in time: On speech rhythms in music. Linguistics, 25, 969-981.
Wong, P. C. M., & Perrachione, T. K. (2007). Learning pitch patterns in lexical identification by native English-speaking adults. Applied Psycholinguistics, 28, 565-585.
Wong, P., Skoe, E., Russo, N., Dees, T., & Kraus, N. (2007). Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nature Neuroscience, 10, 420-422.
