Tuesday, December 1, 2009

Rhythm: The Essence of Music

Thaut, Michael. “The Structure of Rhythm.” In Rhythm, Music, and the Brain: Scientific Foundations and Clinical Applications, 1-17. New York: Routledge, 2005.

Summary of the Chapter:
In this chapter, Thaut defines music and its elements, especially the rhythmic ones.
Thaut begins by describing music as a language-like form of communication, similar to speech. Both have pitch, timbre, accents, duration, intensity, and inflection; both have syntactical structure (i.e. grammar and sentence structure vs. musical form and structure); and both are affected by, and derive meaning from, a cultural context. However, there are some key differences: music lacks semantic or referential meaning. In music, the sounds are abstract, and do not denote or refer to specific objects, events, or concepts, as they do in speech/language. Music is also processed differently in the brain.

Music communicates meaning in three different ways:
- Indexically (where music is associated with extramusical material)
- Iconically (where music has a likeness to extramusical events/experiences)
- Symbolically (where musical events communicate roles and values)

The character of music is temporal; it is an art form that exists in time. Music can be perceived as both sequential (for example, a moving melody line) and simultaneous (for example, a chord). Rhythm is used to create formal meaning in music, in that it binds music’s simultaneity and sequentiality in an organizational form.

Defining and describing rhythm further, there are two core aspects of temporal organization:
- Periodicity (events grouped into successive sequences of equal time/space content)
- Subdivision (similarity of internal structures among the grouped events)

Rhythm allows the perception of larger units and events in music, while imposing order. It also helps with cognition, by creating anticipation and predictability. These, in turn, create expectations and predictions that, when met or not met, give rise to emotional meaning in music.
The perception and formation of rhythm in music is based on the entrainment of oscillatory circuits in the brain. Different oscillatory circuits respond individually to different periodicities in the spectrum of complex rhythmic patterns. This set-up gives the brain more flexibility in perceiving, processing, forming, and modulating rhythms than would be possible from “stopwatch-like” mechanisms in the brain. What this means is that the human mind can accommodate time adaptations such as rubato and ritardando, without distorting the perception of an overall coherent temporal structure.
Thaut then turns to specific elements of time and rhythm in music, such as “rhythmic pulses.” Rhythmic pulses act as “event markers” in the periodicity of music. Pulses divide the flow of time into regular reference points, and create a framework for the synchronization of many musical elements. Pulses establish anticipation and predictability and act as constraints that do not change throughout a sequence.
“Beats,” on the other hand, are time points of temporal positions rather than durations. They can be simultaneous with the rhythmic pulse, or differ from it. “Beat events” can contribute to the idea of slowing or rushing in contrast to the underlying, steady, and unmoving pulse. When beats come slightly after the pre-established pulse rate, for example, they can create the sense of a slowing tempo in music.
Tempo is a large-scale organizational form in music, within which rhythmic pattern events (such as beats and pulses) occur. It is interesting to note that a fluctuating tempo does not necessarily undermine the feeling of a pulse.
In summary, tempo is how fast or slow a piece of music is, on a large scale. The pulse acts like metronome ticks, keeping time. Beats are events layered over the pulse, acting as points of temporal activity in the music.
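The pulse/beat distinction above can be made concrete with a toy numerical sketch (my own illustration, not from Thaut): the pulse is an equally spaced grid of time points, while beat events may fall slightly off that grid, and a growing lag behind the pulse is what we hear as “slowing.”

```python
# Toy sketch (mine, not from Thaut): a steady pulse as an equally spaced
# time grid, with "beat events" that drift progressively behind it.
tempo_bpm = 60                      # one pulse per second
pulse_period = 60.0 / tempo_bpm     # seconds between pulses

# Pulse grid: fixed reference points at 0, 1, 2, ... seconds.
pulses = [i * pulse_period for i in range(8)]

# Beats that lag a little more behind each successive pulse -- heard as a
# slowing tempo, even though the underlying pulse grid never changes.
beats = [p + 0.02 * i for i, p in enumerate(pulses)]

offsets = [round(b - p, 2) for b, p in zip(beats, pulses)]
print(offsets)  # growing lag: [0.0, 0.02, 0.04, 0.06, ...]
```

Because the offsets grow steadily while the grid stays fixed, the sketch mirrors Thaut’s point that beat events can suggest rushing or slowing against an unmoving pulse.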

A few other key rhythmic elements in music:
- The accent (an expressive device created by changes in loudness, duration, timbre, and pitch contour). Regular groupings of accents give measures, and repeated accents give a sense of coherence to music. Accents also define musical meter.
- Meter. Meter occurs with an underlying pulse; meter actually organizes the pulse. Beat events come within the meter and pulse framework. The metric pulse acts as a framework, or reference.

Thaut then brings up the question of whether humans have a biological mechanism for categorical tempo perception that also produces our sense of proportional time and tempo-keeping. Rhythm is an element that has appeared in almost all musical cultures, albeit at different structural levels. This could suggest biological factors as underlying mechanisms for rhythmic formation and understanding.
Rhythmic and time elements in music can be used to represent spatial images, extensions and distances, and multi-dimensional forms. Rhythm affects the PERCEPTION OF TIME. Rhythm could potentially enhance brain operations through providing structure and anticipation in time, and could help learning and perception.

My Response:
I believe that sometimes music can mimic speech in sound and function. For instance, in certain African cultures there are talking drums, which use rhythms and pitches to convey a message. Music mimicking speech, or language, even occurs in the western tradition. An interesting, if perhaps unrelated, example can be found in Richard Strauss’s Don Quixote tone poem, in which brass chords and noises literally sound like the bleating of sheep; music is being used to mimic the language of animals. And music is used to express words and language especially in song. Music heightens the meaning of words by adding imagery, and extra sonic and emotional quality, to the words. I believe that this is a language function.
On another note, I would like to talk a bit about my own difficulties with rhythm. Often, I will record myself so that I can critique my playing, and listen to it with a critical ear. What I have consistently found is that my rhythm, or pulse, is not steady. This occurs even when I play with the aim of having a steady beat and tempo. I believe there is a distinct discrepancy between our perception of the rhythm of the music we play, and its actual rhythm and steadiness. Perhaps this comes from having so much to think about when playing music. There is intonation, muscle coordination, musicality ... the list goes on. But if our mind works in those “oscillatory circuits” that Thaut described, why is it so difficult to entrain part of our brains to play a steady rhythm, while other parts of the brain cover the other aspects of music? Or is this a personal difficulty of mine that others do not share? And if so, how do I overcome it?
I absolutely believe that any element of music can be related back to rhythm. Time and music are so intrinsically linked! Take pitch, for example. Even pitch has a rhythmic element. It is measured in Hertz, which amount to cycles per second. This means that pitch, the physical sound of music, is a series of pulses (or beat events) in a periodic pattern. Sound exists in time, pitch exists in time, timbre exists in time, and music exists in time.
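The arithmetic behind this point is simple (a back-of-envelope sketch of my own, not from the chapter): a frequency in Hertz is cycles per second, so every pitch implies a periodic pulse whose period is 1/f seconds.

```python
# Back-of-envelope sketch (mine, not from the chapter): a pitch in Hertz
# is cycles per second, so every pitch is itself a periodic "pulse."
def period_seconds(freq_hz: float) -> float:
    """Duration of one cycle of a tone at freq_hz, in seconds."""
    return 1.0 / freq_hz

a4 = 440.0                      # concert A: 440 cycles per second
print(period_seconds(a4))       # roughly 0.00227 s per cycle
print(a4 * 2)                   # doubling the frequency gives the octave: 880.0 Hz
```

At 440 Hz the “beat events” simply arrive far too quickly to be heard individually; they fuse into what we perceive as a single pitch.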
Lastly, I fully agree that rhythm in music could be linked to biological and evolutionary processes in humans. I have a good friend who is a wonderful musician. He has his Master of Music degree in clarinet performance, and he is musical in every way but one: his rhythm. He can’t play a steady beat, no matter how hard he tries. The process of counting seems intrinsically hard for him, in a way that is much more extreme than my own case of discrepancy between what I thought I played, and how it actually sounded. This leads me to believe that there is something biologically different about my friend, and this makes the steady pulse aspect of music very difficult for him. For my friend, his own perception of a steady pulse is very different from that of most other musicians.
Time in music, or even time perception in music, is a wonderful tool for expression. When “beat events” do not match up with the underlying pulse, a listener gets the impression that time is speeding up or slowing down. In a largo movement, such as in a Mahler symphony, time can seem to be suspended; nothing moves. In a vivace movement, such as in the overture to Mozart’s Marriage of Figaro, time can seem like it’s flying by at an unreal pace. Music is, in essence, a means of manipulating the perception of time.

A Brief Summary of an Experiment Dealing with Autism, Emotion, and SCRs.

Khalfa, Stephanie, and Isabelle Peretz. “Atypical Emotional Judgments and Skin Conductance Responses to Music and Language in Autism.” In Autism Research Advances, edited by L.B. Zhao, 101-119. New York: Nova Science Publishers, 2007.

General Summary:
This was a really interesting experiment, designed to further study and characterize emotional processing in autism by monitoring responses to music samples (small clips) and verbal samples (short and recorded spoken sentences). The process was divided into two different and somewhat separate experiments. In each one, emotion recognition and feeling (the actual, automatic and physiological responses generated by each participant) were tested and recorded. It is certainly worth noting that to test the feeling aspect, skin conductance responses (SCRs) were recorded; this involved measuring the rapid fluctuations in electrodermal activity of the sweat glands. During some types of emotional responses, acetylcholine is released by the sympathetic nervous system, causing these electrical fluctuations in the skin, and they can be easily detected. In this way, a person’s arousal level (stimulated vs. relaxed) and valence perception (pleasant vs. unpleasant) can be physically shown and recorded. [Note: The researchers made specific mention of the brain’s amygdala and its role in SCRs. Amygdala activity results in a positive SCR.] The emotion recognition was measured by having each participant label samples as “scary,” “happy,” “sad,” or “peaceful,” and then rating each judgment on a 10-point scale.
Experiment 1. SCRs and Emotional Evaluation of Music, Summary:
For this study, 9 participants classed as “high-functioning autistic,” with IQs greater than 89, and who were older than 16, made up the “clinical” group. The “control” group was made up of 13 individuals who were not autistic, and had similar IQ averages and ages.
The musical clips, or stimuli, were drawn from 7-second clips of western classical music. There were 5 separate samples from the Baroque, Classical, Romantic, and early 20th-century periods. Each sample in its original form was labelled “consonant,” and each was altered by semi-tones in key pitches to create the “dissonant” version. Then each consonant and dissonant clip was sped up and slowed down, and half were played in minor and half were played in major mode, creating happy consonant (original form, fast, major), happy dissonant (dissonant form, fast, major), sad consonant (original form, slow, minor) and sad dissonant (dissonant form, slow, minor) categories.
Each participant evaluated arousal (stimulation) and valence (pleasantness) on a 10-point scale (1 = calm or unpleasant; 10 = stimulating or pleasant), where arousal was expected to track the sad/happy aspect of each clip, and valence the dissonant/consonant aspect. While listening to the samples, each participant had their skin conductance recorded.
The Results. The control group judged consonant excerpts as more pleasant than the clinical group did, but both groups judged the dissonant excerpts as equally unpleasant. Both groups judged happy (fast) excerpts as more stimulating than sad ones, and dissonant excerpts as more stimulating than consonant ones. Autistic individuals exhibited larger SCR amplitudes on average, especially for dissonant and “sad” music.
ONLY the control group exhibited variations in SCR in relation to the music: happy/consonant caused larger SCRs; happy dissonant and sad consonant caused smaller SCRs. In the clinical group, all four “emotion” categories elicited similar SCR levels. Within the autistic group, there were shown to be “high-responders,” and “normal responders.” “Normal Responders” exhibited similar SCR responses as the control group.

My Response, Experiment 1:
I would really like to have had a listing of exactly what musical samples the experimenters used. The “consonant” ones could very well have included some chords that could be perceived as dissonant, thus muddying the conscious recognition aspect of the valence measurements. This would have been especially true for some of the highly chromatic pieces of the Romantic era, and many 20th-century pieces.
This study doesn’t mention the cultural background of each participant. The “musical categories” (happy, sad, consonant, dissonant, etc.) were all assumed to be understood by the participants; i.e. each was assumed to be sensitive only to a western musical interpretation. If participants were more familiar with eastern musical styles, their perception of the samples could have differed from what was expected, skewing the results.
If autistic participants had musical training, or attended many concerts (especially classical), they may have been able to recognize the intended feeling of each sample, without actually feeling or understanding the emotions of the music.
Would SCR recording pick up “sweat signals” from participants who felt uneasy, or nervous, in the setting and situation? I know that when I am nervous, I sometimes get sweaty palms. Would something like this have skewed results? And could the recorded emotion have come from other sources rather than the music, for example, nerves? With these variables in mind, how accurate was the SCR recording?
The control and clinical groups were very small. I would be more convinced of the data comparisons if this had been a more extensive study, involving more people.
In the judging of “happy” vs. “sad” as “stimulating” vs. “unstimulating”: I think that perhaps the stimulation factor could have come from the speed of the musical clip, and not necessarily its “happy” or “sad” emotional content; the labelling of the musical clips as stimulating or not may have had nothing to do with the emotional content/quality of the music.
7-second clips are not really a good example of music that could be used to elicit emotion, in my opinion; the clips used were only sound samples. I believe that complex emotions are elicited more from actual musical pieces, as a whole, where the music is developed and changed over time, in form and harmony especially.
I think the small number in the clinical group would not necessarily represent a whole population of autistic people. As a result, dividing them into “normal and high responders” may have been erroneous. Perhaps autistic individuals exhibit a range of autonomic responses, just like the general population, and thus should not be labelled as 2 separate and distinct groups.

Experiment 2: Emotional Judgments for Music and Language, Summary.
In this experiment, there were 18 individuals in the clinical (autistic) group and 19 individuals in the control group. 28 short (7-second) musical excerpts, written expressly for this experiment, were played; each was meant to represent sadness, happiness, fear, or peacefulness. There were 24 verbal stimuli, as well. These involved short, spoken, and recorded sentences designed to express one of the 4 emotions mentioned. For each sample (musical and verbal) the participants were asked to choose from the 4 emotions, and to assess arousal and valence on the 10-point scale.
The results from this experiment showed some interesting things. Participants judged fear and sadness as unpleasant in the verbal stimuli, but not in the musical stimuli. Also, participants judged “musical sadness” as more relaxing than “verbal sadness.” “Happiness” in music was shown to be more stimulating than “fear,” but this difference was not seen in the “happy” vs. “fearful” language samples. Overall, the results failed to show a difference between how autistic and “normal” participants made emotional judgments, in music or language.
In terms of emotional recognition, the researchers noted that “normal” participants could better recognize verbal sadness. Both groups had difficulty recognizing fear, sadness, and peacefulness in music, but could recognize happiness. There was no significant difference between how the groups identified musical emotions.
Emotions in verbal stimuli (language) were better recognized than emotions in music, in the control group. Some emotions in the verbal stimuli were deemed unpleasant, while no emotions conveyed by music were deemed as such. There was no evidence for a lack of emotion recognition in individuals with high-functioning autism, in terms of aural musical and verbal samples. However, autistic individuals gave weaker valence ratings to the musical samples (i.e. labelled them less pleasant or unpleasant, on average) than the control group did. This may indicate that autistic individuals lack the ability to recognize subtle emotional aspects.

My Response, Experiment 2:
The musical excerpts that were used in this experiment were, essentially, “canned” music. They only involved the piano timbre and sound, and were very short. How much emotion could possibly be conveyed in 7 seconds? How would the results have differed if the samples were more extensive, and drawn from the symphonic repertoire? Would orchestration have caused a difference in the perception of emotion? Would familiarity or unfamiliarity with the music cause any differences in the perception of its underlying emotion? Also, would it have been useful to use some musical samples that involved words, to combine the musical and verbal stimuli? What would this have shown?
What would have been shown, especially with the musical samples, if participants had not been given the 4 emotions to choose from, but had to pick their own label? How would the autistic individuals have compared to the control group? Would this have gotten rid of any pre-conceived cultural notions, such as equating fast, major-mode music with “happiness”?
How would cultural pre-conceptions have affected the emotion recognition in autism? Are we taught to hear music a certain way? Can emotion recognition in music be learned in the same way that an autistic individual learns what a smile means?

Monday, November 30, 2009

How what we see affects what we hear: Visual perception of music performance

Reference: Thompson, W. F., P. Graham, and F. A. Russo. “Seeing Music Performance: Visual Influences on Perception and Experience.” Semiotica 156 (2005): 177-201.


This long and detailed article addresses a very important and interesting aspect of music performance: its visual dimensions. It discusses the extent to which these aspects contribute to the communication between performer and listener. The authors maintain that there are several levels at which facial expressions and gestures might affect our perception of music. First, there is a basic level at which listeners’ attention is directed to critical acoustic information at a given moment in time. The authors argue that such cues can increase musical intelligibility, drawing an analogy with the face of a speaker, which, when seen, may increase the intelligibility of speech in a noisy environment. Then there is a perceptual level, at which visual cues are given to the listener to indicate important melodic, harmonic, and rhythmic events. The authors suggest that at this level, facial expressions and gestures are intentionally introduced by performers as a way of sharing with listeners their understanding of the significance of such events, as well as emphasizing the musical activity as a shared experience between performer and listener. Finally, the authors observe that visual information is “highly effective at conveying persona and attitude” (p. 204), arguing that facial expressions and other bodily gestures help the performer personalize the music, creating the feeling of reciprocal human interaction during the performance.

Next, the authors discuss a variety of media technologies through which music performance can be experienced, and argue that each technological medium provides the listener with a different experience. They put forward a concept of music as “multimedially performed and multimodally experienced” (p. 205), and present empirical findings that appear to support this model. Two case studies are presented in order to discuss the relationship between visual and aural aspects of performance.
The authors thoroughly examine the use of facial expression, body movement, and gesture in two filmed performances, one by B. B. King (playing Blues Boys Tune) and another by Judy Garland (singing Just in Time). Interestingly, the analysis of these samples is based on structured interviews, conducted by a trained musicologist, with two other musicologists. The authors observe that the wide range of potential effects that visual information can have on our perception of music suggests the necessity of further and more systematic research in this area. Accordingly, they designed five experiments to assess visual influences on two types of musical judgments: in three experiments they investigated visual influences on structural interpretations of music, and in two experiments, visual influences on emotional interpretations. The experiments confirmed that visual characteristics of performance influence musical experience at both a perceptual and an emotional level. In conclusion, the authors observe that, during performance, listeners incorporate visual and aural aspects to form an integrated audio-visual mental representation of the music.


In my opinion, the visual dimension is one of the most interesting and significant characteristics of music performance. In this regard, the article is definitely thought-provoking. The authors suggest that these aspects are greatly overlooked by most psychological research, which considers them inessential to the music. This may be true of systematic scientific investigation; however, it is extremely rare that the visual aspects of a performance are disregarded by an ordinary listener (especially one with limited musical training). People are extremely attentive to these details; they watch musicians closely, and, at times, it appears that these aspects are of even greater importance to them than the music itself. When Evgeny Kissin toured in Israel, my mother-in-law (a professional pianist, performer, and teacher with a great deal of experience) attended one of his performances with a friend of hers (who has no formal training in music). After the concert, my mother-in-law, agitated and extremely excited, conducted a thorough analysis of the performance of this, perhaps one of the greatest pianists of all time, focusing in particular on his interpretation, phrasing, dynamics, and other specific details. The friend listened attentively and then expressed her curiosity about my mother-in-law’s decision to buy more expensive tickets in the second row. My mother-in-law replied that she wanted to see the pianist’s hands, after which the friend thoughtfully observed, “But what ugly faces and dreadful grimaces he makes all the time!” I think this observation is not uncommon; moreover, it is very characteristic. The authors of the article argue that listeners form an integrated audio-visual image of music. If this is correct, what kind of mental representation of the music would the aforementioned friend have?
Will the music she just listened to always be associated, for her, with the “ugly” and contorted countenance of the performer?
This could perhaps be further subdivided into several related questions. Do the twisted features or body movements of a performer bear any interpretational connotations? Do these movements constitute an intentional message from performer to listener? Do they convey or elucidate musical meaning? Or are they just involuntary muscle movements, correlated with the level of difficulty of a certain passage, or with the intensity of the performer’s emotional involvement? And another question: how are these aspects perceived by the listener? Do listeners indeed combine the visual and audio aspects of a performance to create an integrated image of the music, or do the visual characteristics simply distract them from it? It appears that the latter could be true.
Obviously, there is a great disparity between a professional musician’s approach to the listening process and that of a person with no formal music training. While musicians typically focus on the specific musical qualities of a performance, disregarding other attributes, non-musicians are often amused and distracted by other characteristics, including its visual aspects. This is especially the case with teenagers and young children. When I attempt to discuss with them the music performances they attend, it appears that the visual aspects make a more profound and enduring impression on them than any specific musical characteristics.
There is also another issue which I think is worth mentioning in this context. It seems that the visual aspects of performance, including facial expressions, gestures, and body movements, are culturally determined. For instance, I know that in Russia such movements were not encouraged. It was thought that a musical message should be transmitted using specifically musical means of expression, such as dynamics, phrasing, tempo, and articulation, rather than by means of excessive and disproportionate visual indications.
In short, this is an interesting subject to discuss, and I think it deserves to be further investigated.
Here are a few links to video recordings of music performances. It is amazing how differently these artists express themselves musically and visually. Is it culture? Differences in professional training? Fashion?

David Oistrakh plays Tchaikovsky Violin Concerto
Jascha Heifetz plays the same Concerto
Itzhak Perlman plays Tchaikovsky Violin Concerto
Evgeny Kissin plays Tchaikovsky Piano Concerto
Van Cliburn plays the same Concerto