Kahneman’s Thinking, Fast and Slow: Lessons for Composers

I am a composer and a teacher. Like most composers, I want to communicate strongly with my listeners. And like most composition teachers, I want to give my students the tools to do the same. Throughout my career, I have found that these tools do not have to come from the discipline of music itself; insights from other fields can also enlighten musical craft. When Leonard B. Meyer published his groundbreaking Emotion and Meaning in Music, he introduced the Western musical world to psychologically based arguments about how musical relationships elicit emotional responses [1]. In the decades since, music theorists and cognitive scientists have tested and expanded upon Meyer’s claims. David Huron, for instance, refined Meyer’s ideas by proposing a number of mechanisms of expectation, along with the possible evolutionary origins of each [2]. Huron’s work, however, does not explore the implications of these ideas for musical composition. To my knowledge, no one to date has published a practical account of how cognitive psychology can inform composition teaching [3].


Theorist and composer Fred Lerdahl came closest when he debunked the myth that a listener’s ability to grasp structure in serial music depends on how much that listener has been exposed to it [4]. In Lerdahl’s view, we must distinguish between “compositional grammar” and “listening grammar.” Whereas a compositional grammar entails the rules (e.g., algorithms, tone-row constraints) that a composer employs consciously when writing a piece, a “listening grammar” produces a mental representation of the raw acoustic signal based on learned stylistic norms (if applicable) and cognitive constraints. In general, the idiosyncratic compositional grammar is inaccessible to the listener (it may in fact be quite inaudible). Since the composer shares cognitive constraints with his listeners, however, he can shape his composition intuitively in ways that will “speak to” his audience. Lerdahl’s larger point is that no matter the style in which a composer writes, she can always draw on her intuitive understanding of human cognition to communicate effectively [5]. And I would add that the more realistic the composer’s (conscious or intuited) expectations about the listener’s cognitive processes, the more likely she will be able to communicate significantly.


In a similar spirit, this article will adapt some recent work in psychology by Nobel laureate Daniel Kahneman to musical ends [6]. More specifically, I ask how Kahneman’s research can help us to compose—and ultimately teach composition—more effectively. My focus on Kahneman stems from the fact that his ideas offer clear answers to some common problems in composition—answers that I have tested through my own composition and teaching. Of course, anecdotes do not constitute proof, but many things discussed here are susceptible to controlled experimentation. And though I do not undertake such experimentation here, I hope that this article inspires others to test my ideas more rigorously.


Kahneman’s research, which focuses on fundamental aspects of human thinking, suggests ways to interpret musical experience in terms of evolutionary psychology. Although there is no consensus on the precise evolutionary origins of music [7], it has several times been proposed that music works on us through the normal capacities of the evolved human brain [8]. Our pattern-seeking mind, always trying to predict what is ahead, is at the core of how music affects us. The details of which patterns we notice, and how they affect us, are varied and complex. But without at least some easily recognized patterns and expectations, music is no more than transient noise [9]. In this article, I focus on several principles Kahneman sets forth about how the mind perceives and interprets these patterns, and discuss their relevance to music.




Kahneman posits two kinds of mental processes, which he calls Systems 1 and 2. System 1 is very fast, automatic, and always active [10]. It supplies virtually instantaneous, but not very rigorous heuristics in real time, which enable us to live without painstakingly considering every detail and every alternative possibility in every situation. A fight-or-flight reaction, for instance, is not under conscious control, and it is certainly not a carefully weighted statistical assessment of probabilities. This kind of quick reaction is more immediately useful for survival than a slower, more reflective, alternative: when in mortal danger, speed matters!


It is worth noting here that System 1 includes not only instinctive and reflexive reactions; it also includes much learned behavior. Learning to play a musical instrument, for example, takes a lot of conscious effort at first, but with practice, it becomes automatic. The critical point for us is that System 1 is not only the locus for reflexive (and some learned) behaviors, but it also does a lot of extremely fast real-time “thinking”, using its own distinctive, primitive logic.


System 2, by contrast, is slow, conscious, and requires effort. It is deliberate and carefully reasoned. It is also, as Kahneman points out, lazy [11]. It takes a definite effort of will to use it, and it is never as fast as System 1. System 2 is usually too slow to work in real time; however, it is much more thorough and careful than System 1. In fight-or-flight situations, as described above, System 1 engages automatically. But if the danger is not immediate, it is possible to stop and think, i.e., to deliberately engage System 2.


I propose here that many, perhaps most, of our strongest responses to music come from System 1. Music evolves rapidly in real time, and our first reactions to it are immediate, powerful, and not very carefully thought out. This has all the hallmarks of System 1 perception and behavior. Obviously, one can also study music carefully and analytically (System 2), but this is very difficult in real time, and especially at first hearing [12]. This is of course not to say that music is of no interest to the analytical System 2. Nor do I deny that our responses to music can change and deepen with multiple hearings [13]. I am also not denying the existence of innate and/or learned differences between various listeners. My point is simply that many of our primary responses to music are better understood in terms of System 1—which all normal humans share—and that without these powerful, primary responses, the listener will not easily engage with the work.


Let us look more closely at System 1. As mentioned above, one of the main tasks of the human mind is to seek and interpret patterns in the environment. This tendency is so strong that we sometimes see patterns where there are none (e.g., the man in the moon, constellations). Once we notice a pattern, System 1 quickly draws various conclusions. Kahneman details various heuristics behind these conclusions. Sometimes they involve logical shortcuts and errors, but they usually help us. These shortcuts generally consist of simple rules, which make sense in evolutionary, survival terms: they are right much more often than not.


For example, imagine seeing someone slam a door: it makes a loud noise, which we naturally associate with the door closing. There are other logical possibilities to explain the noise, but they are much less likely than the normal relationship between a door slam and a sudden noise. It is as though we are interpreting simultaneity as causation, or at least as evidence of only one intentional event, which explains both the visual and the aural stimuli [14]. Intentionality is a very important issue in our lives, and art that aims to communicate significantly needs first to convince us that it is intentional, and non-random.


This “intentionality” heuristic is only one of several. Let’s look in more detail at some of Kahneman’s heuristics that apply to music.




Kahneman’s first principle is that we automatically seek associations to current ideas. As he writes, “[System 1] offers a tacit interpretation of what happens to you and around you. It contains the model of the world that instantly evaluates events as normal or surprising. It is the source of your rapid and often precise intuitive judgments. And it does most of this without your conscious awareness of its activities.” [15] All sensory data—words, odors, things we see—leads to a fast mental search for links to current or remembered events, and then to a conclusion about familiarity or strangeness. These associations and conclusions are not rigorous; they betray a kind of primitive thinking, which is based on impressions of similarity and/or contiguity, or perceived causality.


An essential point for music is that the associative machine only represents activated ideas [16]; information that is not retrieved from memory might as well not exist [17]. As such, the “frame” created by a musical form—the feeling that, within a short time (assuming the composer is doing his job well, and the listener is paying attention), we are having a coherent and concentrated experience, with a provocative beginning, an intriguing development, and a convincing ending—makes this “activation” much easier. This is why we will only discuss here associations within the work being heard. These perceived associations often engender expectations about subsequent events, in effect setting us up to look for similar links elsewhere [18].


The common use of musical motives provides a clear example of the association principle in music. A motive is a short, well-defined pattern that attracts our attention. Once we focus on a motive, we start listening for it and noticing it as the piece unfolds.


A more elaborate example is the familiar musical “question and answer” structure: the period. The second phrase of a period starts similarly to the first one and is heard as a “reply,” which either conforms to or deviates from the latter.




According to Kahneman, “anything that makes it easier for the associative machine to run smoothly will also bias beliefs…Familiarity is not easily distinguished from truth.” [19] As this quotation suggests, we are much more sympathetic to things that are easy for System 1 to understand. Various experiments, cited by Kahneman, show that humans prefer familiarity to strangeness: familiarity translates into cognitive ease. In other words, familiarity makes us comfortable because it implies predictability—security. Our tendency to privilege familiar associations, via the filter of cognitive ease, is unconscious and automatic. Surely the enormous amount of repetition in most music is, at least in part, motivated by this natural preference: within any given work, after the opening, the amount of totally new material is usually rather limited. By repeating a given motive, phrase, or section, a composer effectively teaches it to us; we can then enjoy the subsequent, positive, feeling of recognition and security that such repetition provides [20]. This also suggests why music without significant repetition is often hard to grasp. This kind of difficulty, which Kahneman calls “cognitive strain,” happens when System 1 is overwhelmed. As Kahneman repeatedly reminds us, System 2 is lazy. In the case of a temporal art like music, which occurs in real time, it is impossible to slow down the piece, analyze what we have just heard, and arrive at a carefully reasoned conclusion. As a result, the listener often perceives music without repetition as difficult, or incomprehensible.


Given that System 1 takes the path of least resistance (cognitive ease), we need a clear idea about what aspects of music are most easily heard. I call this a theory of salience. In psychological terms, this amounts to understanding perceptual priorities [21]. Of course a trained musician will notice things that a layman will not, but this does not change the fact that certain kinds of perceptions and associations are much easier to grasp than others, especially at first hearing. Again, these are the ones that get processed quicker and more reliably. The point here is not that there is some rigid, fixed order of salience, but that there are different degrees of salience, which translate directly into degrees of cognitive ease (or strain). A composer needs to understand how our strong preference for cognitive ease will affect the listening process [22].


A short example from species counterpoint will clarify the question of salience/cognitive ease. Species counterpoint is generally used as an introduction to many idioms of tonal writing. Here it will be useful because it is commonly taught, and its rules are familiar and fairly standardized. In species work, as in most classical counterpoint, parallel fifths are prohibited, since they create a momentary emptiness in the texture. Example 1 shows a clear-cut case of such fifths.



No teacher of species counterpoint would allow these fifths (here marked “x”): they involve chord tones on strong beats, and the return to E at the end of m. 1 creates consecutive parallels with the first beat of m. 2. Everything conspires to make these parallels very salient.


Example 2 is more subtle. Here the parallels (again, marked “x”) are between the last beat of m. 1 and the second beat of m. 2. Hearing these fifths requires effort: they remain audible, but are much less prominent, especially since the differing melodic patterns between the two bars no longer invite us to associate the fifths more than any other interval. The F in m. 2 does not trigger a change in melodic direction, but rather is part of a larger movement down to C in m. 3. The thirds—much richer—on the first beat of these measures are now much more salient.



In Example 3 we see the same parallels as in Example 2, but with an added third part. This added part attenuates their effect even more, since the listener’s attention is distracted by the attack of the low G in m. 1, the suspension it forms into m. 2, and the rich parallel sixths that follow. The added part thus dramatically changes what is salient: here we easily notice the many rich intervals, and no longer those (now) well hidden fifths.



Many counterpoint teachers would ban all of these parallels. But does this make sense? Surely it is more useful to help the student to notice and to understand distinctions of salience; after all, examples like the last one abound in the repertoire. I am reminded here of Brahms’s well-known catalogue of parallel octaves and fifths, assembled precisely so he could see under what circumstances the rules of thumb are no longer adequate guides [23]. The point here is that real musical situations bring many simultaneous elements into play, and the effects on the auditory result are often not as simple as they might seem. It is telling that basic music textbooks almost always emphasize the primacy of pitch, even though pitch is not necessarily the first element to be parsed. In fact, the perception of pitch relationships, as we have seen above, can be drastically affected by rhythm, texture, etc. [24]. Often, before having enough information to grasp the details of pitch relationships, we notice rhythm, dynamics, register, timbre, and tempo. And not surprisingly, these elements contribute significantly to the perception of musical character [25].


An example: if we imagine the opening of Beethoven’s Fifth Symphony played quietly, at a much slower tempo, and one octave higher, by one single flute, the musical character is completely different. I have actually tried the experiment of playing this opening, loud and vigorously, on the piano for non-professional musicians, but substituting a minor third (G G G E) for Beethoven’s major third (G G G Eb). Even though they easily recognize the piece, nobody notices the detail, since the character is virtually identical.


If we are genuinely interested in what listeners perceive immediately and easily (i.e., cognitive ease), we cannot ignore these other musical dimensions. The composer who pays insufficient attention to these issues cannot communicate with any force [26].


Just as there are degrees of salience, there are degrees of familiarity, ranging from immediately recognizable (reassuring), through increasing degrees of novelty, to strangeness (demanding immediate attention). These levels of familiarity/strangeness also correspond to degrees of cognitive strain. Novelty helps keep our interest alive, but only to a point; when it becomes overwhelming, it can make us uneasy [27]. Take, for example, some of the standard motivic transformations that every student learns (e.g., retrograde, diminution/augmentation). These are often much easier to see than they are to hear. Salient visual associations are not necessarily the same as salient auditory associations. Recognizing a motive in retrograde, for instance, is much easier for the eye than the ear. The reason is obvious: vision is not constrained by real time. We can easily look back and forth to investigate a connection. For the sense of hearing, on the other hand, memory is easily distracted by intervening events, and unless the composer deliberately puts the connection in the spotlight, the first idea is easily forgotten—drowned out, as it were, by subsequent demands on our attention.


When composers really do want distant connections to be noticed, they must present them in an especially striking way. Think of the dramatic fermata at the end of this passage from the opening of Beethoven’s Fifth Symphony (Ex. 4).



The high G, sustained in the violins, is impossible to ignore; it demands our attention: it almost sounds like a mistake. When this passage returns (Ex. 5), the fermata now leads to an oboe solo, which creates a wonderful emotional richness, as it “explains” why the previous anomaly so attracted our attention. The spotlight created by the fermata focuses our attention on a special moment, so that when it takes a different turn the second time round, we make the connection easily. It is clear that Beethoven wants us to notice this connection, since he has made both instances so prominent.



The distance between keys during a modulation is another example of the various degrees of cognitive ease. In tonal music there is a well-established scale of distance between keys: the more accidentals in the key signature change, the further the tonal distance. Different modulations introduce different amounts of new information within a given timeframe. All other things being equal, modulation to a closely related key is less demanding (threatening) than moving to a distant key. Remote modulations are normally more “arousing” than close ones, since the keys involved have fewer notes in common. The difference lies in the amount of mental “work” required from the listener (i.e., the degree of cognitive ease/strain).
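For readers who like to see such a scale made explicit, the rule of thumb about key signatures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table and the function name `key_distance` are my own, only major keys are covered, and enharmonically equivalent keys (F# and Gb major, for instance) are treated as identical, so that the distance wraps around the circle of fifths.

```python
# Signed key-signature counts for major keys: sharps positive, flats negative.
# The table and the function name are illustrative assumptions.
SIGNATURE = {
    "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5, "F#": 6,
    "F": -1, "Bb": -2, "Eb": -3, "Ab": -4, "Db": -5, "Gb": -6,
}


def key_distance(key_a: str, key_b: str) -> int:
    """Approximate tonal distance as the change in key signature.

    Assumption: enharmonic keys count as the same key, so the distance
    wraps around the circle of fifths and never exceeds 6.
    """
    raw = abs(SIGNATURE[key_a] - SIGNATURE[key_b])
    return min(raw, 12 - raw)


if __name__ == "__main__":
    # Compare a few modulations away from C major.
    for target in ("G", "F", "A", "Eb", "F#"):
        print(f"C major -> {target} major: {key_distance('C', target)}")
```

On this crude measure, a move from C major to G major changes one accidental, while a move to F# major changes six. The number says nothing about how the modulation is prepared or how quickly it happens, both of which also affect the listener's sense of ease or strain.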


Again, the point here is the relationship between salience and cognitive ease. Various degrees of salience are useful to composers in different situations. Sometimes the composer will want to make the listener anxious to hear more; at other times, (relative) ease can create a feeling of resolution. Often composers use gradations of cognitive ease to allow the music to breathe, playing with tension and resolution. Note, however, that this kind of tension stays within fairly narrow limits: listening to a piece of music is not normally dangerous, so any unease produced is not very threatening.




As Kahneman frequently points out, we have a very strong desire to see reality in a coherent way: we want events to be connected—to make sense—so that we can feel reasonably secure in our environment. Explaining our environment is so important to us that we automatically look for such coherence, even when none is intended. Consider the following two sentences:


“John was very tired.”
“Mary was not home.”


Although these sentences were not intended to form parts of a coherent story, we immediately, and without special effort, look for a connection between them to explain what is going on. In the following version, the connection between these two statements is much clearer:


“John was very tired. Mary was not home. His disappointment only deepened.”


The extra phrase links the previous two clearly into an explicit narrative structure. There is something very satisfying and reassuring when we feel we have made sense of the world around us; we look for such “stories” all the time [28]. Note once again that this search for narrative coherence is not a conscious decision; it happens automatically, and very quickly.


Narrative coherence depends on three important types of related observations, which System 1 is always seeking: norms, surprises, and causes. These are all parts of a convincing narrative.




“The main function of System 1 is to maintain and update a model of your personal world, which represents what is normal in it.” [29]


One of System 1’s main jobs is to assess normality. Consistent, predictable patterns easily become norms. A situation with no perceived patterns requires constant alertness: risks can arise unexpectedly at any time [30]. Our search for patterns is not a carefully weighed comparison in all details, but rather a quick judgment by System 1, drawn from easy associations, and often based only on superficial similarity and/or contiguity in time. These observations quickly lead us to conclusions about when a given situation is “normal”, and we find that reassuring. Establishing what is normal also allows us to direct our attention elsewhere.


Kahneman terms some situational norms “passive expectations,” e.g., the fact that in a concert, when the piece ends, the performers will stand up and leave the stage. You do not know exactly when it will happen, but it is a perfectly normal outcome in this situation. The same is true of many other normal events in our lives, where the detailed timing of these events is not very important. But we are interested here in “active expectations,” specifically created within a particular piece [31]. Specific aspects of a given work create norms, and thus lead us to expect a particular event (say, a continuation of a given pattern) at a specific time. These expectations can then be completely or partially confirmed, or contradicted entirely. And therein lies a good deal of music’s effect.


Various norms are established quickly at the start of a piece. Examples include the instrument(s) or voice(s) for which the piece is written, the tempo, familiar harmonic idioms (or lack thereof), and so on. These observations quickly establish the world within which the rest of the piece occurs, and we expect them to be respected. In music, as elsewhere, our minds do not like incoherence. Some norms are stylistic—culturally learned. A cultured Western listener has some learned expectations that predate the individual work, e.g., the word “symphony” causes us to expect a different kind of music from, say, the word “toccata.” And Mozart will not lead you in the same musical direction as Shostakovich. But above and beyond these large, common categories, each piece soon creates its own world, with its own specific patterns, enabling the composer to play with expectations based on these “normal” patterns.


Repetition helps us to notice such patterns, and to develop specific expectations about them: it leads to cognitive ease. For example, in a short prelude, there is usually only one motive/character, since short works require less variety. Once we have been exposed to that first motive, it generally becomes the norm for that piece, with which subsequent events are compared. In other words, beginnings turn easily into norms. Many preludes by Bach, Chopin, and others serve as examples [32]. Once we have norms, and thus expectations, surprises become possible.




“A capacity for surprise is an essential aspect of our mental life, and surprise itself is the most sensitive indication of how we understand our world and what we expect from it.” [33]


Expectations, and the possibility of violating them with surprises, are fundamental to most music. To see these processes in action, let us examine the beginning of Mozart’s Jupiter symphony (Ex. 6).



Mozart proposes a forceful motive, repeated twice, establishing a strong tonic-dominant axis. In mm. 3–4, a soft appoggiatura motive—a surprise, for the moment—is then repeated three times in sequence, bringing in all the other notes of the scale and clarifying the tonality. These two motives are then transposed and repeated. There follows a third motive (m. 9), a sort of fanfare, again somewhat surprising. The contrasts between these three motives, engendering two moderate surprises in a row, lead us to expect a rather substantial piece, not just because it would be unlikely to assemble an orchestra for a piece which lasts only a minute or two, but also because such a group of well defined motives, juxtaposed dramatically and without much transition, seems to pose a question: What is the connection between these characters? What are they doing here, confronting each other in such a short period of time? By quickly putting into question the “norm” set up by the first idea, Mozart has created suspense and curiosity about where the music is headed: there is a sense of needing more time for exploration. And indeed, this kind of narrative questioning creates a challenge to the listener, and it is a wonderful way to begin a piece of music [34].


Another common example of surprise involves contrast of dynamics [35]. Simple dynamic progressions—crescendos and diminuendos—establish clear, if temporary, norms. A steady crescendo in the orchestra suggests coordinated intensification. A decrescendo suggests a loss of energy. Once we add the element of surprise, there are more sophisticated possibilities. A crescendo leading to a subito piano is felt as an interruption, as is its opposite (i.e., a decrescendo leading to a sudden forte). In terms of musical form, the latter two patterns behave a lot like deceptive cadences: they surprise the listener, and thus create suspense. Such interruptions—a kind of surprise—are very commonly used in musical form: they leave something incomplete, inviting the listener to stay attentive.


Dynamic patterns may confirm pitch relationships (e.g., the harmony becoming more and more dissonant during a crescendo) or contradict them (e.g., becoming more and more consonant during a crescendo). Such complex patterns in more than one dimension of the music can provide many useful levels of punctuation, rather like the various forms of open cadence in the classical repertoire. They enlarge the composer’s expressive resources enormously. Surprise and predictable continuity depend on each other. Extreme surprises (e.g., a chimpanzee coming out and pouring yellow paint into the piano during a concert) break up the musical frame completely. Apart from very rare comic effects, this kind of exaggerated surprise has no musical interest. Expectations created within the piece itself, however, which in turn make surprises possible, are almost always intentional. For example, imagine a rising sequence: once the pattern is installed, you are not surprised if it continues, nor are you surprised by at least some degree of change, since a lengthy linear progression quickly becomes boring. It might be objected that surprise cannot, by definition, be foreseen. But music, being a temporal art, often suggests that some specific event will arrive—we just do not know when: the surprise in such cases is in the timing. Or conversely, we may notice a pattern to the timing, but not know exactly what will happen.


Anomalies (i.e., surprises in the musical patterning) form an important category of events within the overall trajectory of a piece. An anomaly attracts attention to itself, and we automatically search for an explanation. In the vast majority of cases, the anomaly will somehow be “explained” by succeeding events, as in the example from Beethoven’s Fifth Symphony discussed earlier. Such anomalies are an excellent way to generate interest: again, they challenge the listener.




“Finding such causal connections is part of understanding a story and is an automatic operation of System 1.” [36]


Challenges to the listener attract attention and prompt a search for causes: why is this happening at this time? The search for causes is a significant mental function. From an evolutionary point of view, we need to know quickly whether certain combinations of events imply danger. We look for evidence of intention and/or of causality in order to distinguish random, accidental noises from patterned sound, and also to distinguish safe patterns from dangerous ones. We feel good when we find plausible causes. It is far better to err on the side of seeing danger where there is none. This is probably why our search for patterns is so fast, so automatic, and outside of conscious control, at least at first.


Although music is not normally dangerous to our survival, the heuristics Kahneman describes in this chapter still apply to musical experience. When the brain detects two or three aspects of music that coincide, there seems to be a strong presumption of intention, and sometimes even of causality.


For example, when a percussive event of some sort coincides with the start of something else (e.g., a melody, a rhythm) it can feel like the percussive event actually “causes” the other. Here is a typical situation: if, in a gentle phrase for oboe, the first note is accompanied by a chord on the harp, one has the sense that the oboe phrase is “triggered” by the harp, even though harps do not by any means automatically trigger oboes [37]. This leads us to what I call the “principle of coordination.” A good way for the composer to make music sound non-random and intentional is simply to coordinate two or more events, preferably from the same perceived source. In much classical music, this is almost trivially obvious. For example, when a chord is played by the whole orchestra simultaneously, it is almost impossible not to deduce a common intention. For the primitive logic of System 1, simultaneity seems to equal intention [38]. Of course, uncoordinated events can also create interest, but at a certain point, the absence of coordination will cause the listener to conclude that the stimuli are random.


Some recent composers intentionally avoid simultaneity and coordination. Perhaps the lack of such coordination explains why their music has not achieved popular acceptance. Kahneman points out that cognitive ease feels good and that cognitive strain is unpleasant, especially when it is prolonged. When we are distracted by trying to make sense of whatever is causing the cognitive strain, we tend to need System 2. As we have seen, however, System 2 is not very useful in real time. At the very least, our quick deductions about intentionality/causality serve to focus our attention, since they strongly suggest that a pattern we have noticed is significant. In this sense, searching for causes is a fundamental (and involuntary) process, which can be exploited by composers to make their music sound inevitable and more forceful.




“It is the consistency of the information that matters for a good story, not its completeness.” [39]


“A story is about significant events and memorable moments, not about time passing. Duration neglect is normal in a story and the ending often defines its character. The same core features appear in the rules of narrative and in the memories of colonoscopies, vacations, and films.” [40]


As we have seen, System 1 searches for easy associations, norms, and causes for our experiences. Without any effort, we integrate these observations into a narrative, since it is very reassuring to interpret events in a coherent way. A narrative is also easier to remember than random, isolated events. Kahneman emphasizes that memory does not store events as a literal moment-to-moment transcription, but rather stores the highlights of the experience. Memory is stored as a convincing story, rather than a simple chronological list of details. The most important aspects of what is stored follow what Kahneman calls the “peak-end” principle: in memory there is much more importance accorded to climactic (peak) and ending moments than to the actual, real-time duration of the experience [41].


How does this relate to music? As Kahneman points out above, the ways we interpret and remember events as varied as colonoscopies, vacations, and films all follow the same general principles. I propose here that they also apply to music. All of these experiences take place in a clearly circumscribed period of time. The overall experience is not random: it has a clearly defined beginning, a coherent evolution that maintains our interest through various events, and a conclusive ending. Events are thus connected to one another as a narrative. The time frame of the work separates the experience from the ebb and flow of our everyday lives, and focuses our attention on salient relationships within the period in question. A complete experience of this sort is felt as a narrative whole. Of course, the notion of a story in music, apart from the obvious case of program music, needs to be formulated in musical terms. I am not saying that all music is programmatic, and certainly not in the sense of describing a specific sequence of events, identical for everybody. All I propose here is that an extremely common kind of coherence in music acts, overall, as a kind of narrative framework. A beginning engages our interest, and coherent ideas develop in comprehensible ways, moving through various highlights and contrasts while maintaining at least some suspense until the end, where the conclusion proposes a resolution. It is as though the music takes us on an emotional voyage, where the details of what we imagine are up to us. The fact that music, like a real story, allows for so many levels and degrees of punctuation and suspense makes possible extremely rich structures.


It is possible to combine any degree of contrast in one aspect of the music with any degree of familiarity, or unfamiliarity, in another. The resulting narrative structure can be rich and refined, rather like a story with various subplots and interrelated themes. And indeed much music in the Western canon exhibits just such combinations of familiarity and novelty, continually creating expectations and then playing with them. This is exactly how a good story works. Something interesting happens, the consequences are explored over the course of various “adventures,” and then there is a final resolution. The Sarabande from Bach’s French Suite in B minor (Ex. 7) demonstrates how such narratives work.



Let us examine the details of this musical “story.” The beginning establishes pulsation, tonality, register, and timbre (harpsichord) as norms. The opening motive, with its suspension on the third beat, creates a mild surprise and a sense of suspense. In the second bar the same motive is inverted and varied through the addition of sixteenth notes, increasing momentum. The third bar adds more sixteenth-note syncopations, as the bass moves down stepwise to the fourth measure. Here the rhythm slows down back to eighth notes, over a dominant chord, but the sustained seventh in the top part helps to maintain suspense. This first cadence is thus clearly not final. Starting in m. 5, the note values return to (now continuous) sixteenths, and the harmonic rhythm is somewhat faster overall until the climax in m. 7, which in turn contains the fastest harmonic rhythm so far. The right-hand line rises gradually through mm. 5–7. There are also more dissonances attacked together in these bars than in mm. 1–4. Bar 7 has the most jagged line, with several large leaps, covering almost two octaves: these are the quickest and most dramatic register changes so far. All of these things add up to a rather potent climax, as not very much is held back, at least in the context of this piece [42]. The climax in m. 7 is followed by a cadence in the mediant: the contour falls markedly, the low register appears for the first time, and almost all non-harmonic tones are eliminated. The harmony lasts the entire bar. This cadence is thus more conclusive than the one in m. 4. That said, it is not in the home key, and thus remains somewhat open-ended.


To summarize: an intriguing pattern is presented, it is developed and intensified through various ups and downs, to arrive eventually at a moment of (relative) stability. This kind of quasi-narrative construction appears in an overwhelming number of pieces in the standard musical repertoire.




We have seen here how some general notions about how the evolved brain does its job lead to stimulating ideas about how a composer can best communicate with his listeners. Only by respecting our normal mental limits and capacities can a composer hope to communicate significantly and effectively. Kahneman’s model of two systems, System 1 and System 2, proves useful here. Given the constraints on System 2, especially in real time, I propose paying more attention to the operations of System 1 in trying to understand our most basic, primary musical experiences. The associative machine, cognitive ease, norms, surprises and causes, and finally our ingrained preference for understanding reality as a narrative, all have direct implications for composers. Since the heuristics Kahneman describes seem to be innate, and not changeable by an act of will, knowing the facts about them can be useful to theorists as well. His insights about how the mind works shed valuable light on how music works its magic on us.




Matthew Lane and Mitch Burke both made many extensive and constructive suggestions, which I much appreciate. Andrew Schartmann very kindly did a lot of the scholarly research lying behind the citations from the literature; his detailed comments were extremely useful during an extensive revision of this article.


Sylvain Caron’s encouraging remarks were much appreciated. As usual, my friend Charles Lafleur had many penetrating comments and suggestions, and this article would have been much less than it is without our daily wide-ranging discussions.


My thanks to all of you. Of course, whatever faults remain are mine alone.




Aldwell, Edward and Carl Schachter. Harmony and Voice Leading. 4th ed. New York: Schirmer, 2010.

Ball, Philip. The Music Instinct. London: Random House, 2010.

Boyd, Brian. On the Origin of Stories. Cambridge, Massachusetts: The Belknap Press of Harvard University Press, 2009.

Bregman, Albert. Auditory Scene Analysis. Cambridge, Massachusetts: The MIT Press, 1999.

Huron, David. Sweet Anticipation. Cambridge, Massachusetts: The MIT Press, 2007.

Huron, David. “Tone and Voice,” Music Perception 19, no. 1 (2001): 1–64.

Huron, David. “Crescendo/Diminuendo Asymmetries in Beethoven’s Piano Sonatas,” Music Perception 7, no. 4 (1990): 395–402.

Juslin, Patrik N. and John A. Sloboda. Music and Emotion. Oxford: Oxford University Press, 2001.

Kahneman, Daniel. Thinking, Fast and Slow. Canada: Random House of Canada, 2011.

Lerdahl, Fred and Ray Jackendoff. A Generative Theory of Tonal Music. Cambridge, Massachusetts: The MIT Press, 1983.

Margulis, Elizabeth Hellmuth. On Repeat: How Music Plays the Mind. New York: Oxford University Press, 2013.

Mast, Paul. “Brahms’s Study, Octaven u. Quinten u. A.,” Music Forum 5 (1980): 1–141.

Meyer, Leonard B. Emotion and Meaning in Music. Chicago: University of Chicago Press, 1956.

Snyder, Bob. Music and Memory. Cambridge, Massachusetts: The MIT Press, 2000.

Zbikowski, Lawrence. Conceptualizing Music: Cognitive Structure, Theory, and Analysis. Toronto: Oxford University Press, 2002.


[1] Meyer argues that emotional content in music arises primarily through the composer’s manipulation of a listener’s expectations. Leonard B. Meyer, Emotion and Meaning in Music (Chicago: University of Chicago Press, 1956).

[2] David Huron, Sweet Anticipation: Music and the Psychology of Expectation (Cambridge, MA: MIT Press, 2006).

[3] A number of authors have incorporated cognitive psychology into the teaching of music theory, which has implications for composition, but no one has focused exclusively on composition from this perspective. For example, Paula Telesco’s (2013) advice to heed the limits of working memory when teaching aural skills has compositional applications, many of which are explored here. See also Elizabeth West Marvin, “Research on Tonal Perception and Memory: What Implication for Music Theory and Pedagogy?” Journal of Music Theory Pedagogy 9 (1995), 31–70.

[4] Fred Lerdahl, “Cognitive Constraints on Compositional Systems,” Contemporary Music Review 6, no. 2 (1992), 97–121.

[5] Although this article puts cognitive constraints in the foreground, it does not seek to downplay the importance of experience and acculturation in composition and listening. As Lerdahl and Jackendoff (1983, 3) write, “A listener without sufficient exposure to an idiom will not be able to organize in any rich way the sounds he perceives. However, once he becomes familiar with the idiom, the kind of organization that he attributes to a given piece will not be arbitrary but will be highly constrained in specific ways.”

[6] The ideas on which this article draws are outlined in Daniel Kahneman, Thinking, Fast and Slow (New York: Farrar, Straus and Giroux, 2011).

[7] A number of theories are put forth in The Origins of Music, eds. Nils L. Wallin, Björn Merker, and Steven Brown (Cambridge, MA: MIT Press, 2000), 269–480.

[8] For example, Albert Bregman, Auditory Scene Analysis (Cambridge, MA: The MIT Press, 1999).

[9] This raises the question of musical universals—an idea that has been criticized for promoting biological determinism and downplaying the importance of acculturation. My aim here is to suggest how research in human perception might inform a composer’s musical choices, not to advocate for (or deny) such universals. For more on the tentative nature of musical universals, see Bruno Nettl, “An Ethnomusicologist Contemplates Universals in Musical Sound and Musical Culture,” in The Origins of Music, eds. Nils L. Wallin, Björn Merker, and Steven Brown (Cambridge, MA: The MIT Press, 2000), 463–72.

[10] David Huron (2007, 13) also mentions the existence of fast and slow systems. His “fast” system (analogous to Kahneman’s System 1) works very quickly and draws rather primitive conclusions, which are not rigorously logical. These responses may eventually be changed or enhanced by more deliberate, conscious thought, but only after the fact.

[11] Kahneman, 31.

[12] And the powerful System 1 response will take place in any case.

[13] Lawrence Zbikowski (2002) explores how the meaning a given piece has for us can change over time. As an example, he discusses Proust’s first and later hearings of the Vinteuil sonata.

[14] Bregman (1999) discusses the various cues we use to make sense of an auditory “scene.” For a summary of his findings, see p. 641ff.

[15] Kahneman, 58.

[16] Ibid., ch. 7.

[17] Ibid., 85.

[18] There is the special case of music that includes direct quotations from other works or from nature, but this in itself seems unlikely to make for a satisfying musical experience, if the elements already mentioned (beginning, development, ending) are not present.

[19] Kahneman, 62.

[20] See Huron, 133, on the effects of repetition—the so-called “exposure effect.”

[21] Robert B. Snyder (2000, 222) notes that most listeners agree about what is most salient, which suggests that it is probably a function of the way our brain works.

[22] Again, this is not to deny that perception can change as a result of multiple hearings: obviously, as the listener becomes familiar with the work, his perspective will change. But the first hearing is crucial: if the listener is not sufficiently engaged at first hearing, he is not likely to listen a second time.

[23] Paul B. Mast, “Brahms’s Study, Octaven u. Quinten u. A., with Schenker’s Commentary Translated,” Music Forum 5 (1980), 1–141.

[24] Edward Aldwell and Carl Schachter (2003, 76–77) provide a fairly nuanced discussion of hidden fifths, cautioning students to be attentive to the effects of context on salience. Their treatment of parallels, however, does not mention several key musical parameters.

[25] See P.N. Juslin and J. Sloboda, Music and Emotion (New York: Oxford University Press, 2001), 235ff.

[26] Some of this may seem obvious, until one has taught a few composition students. Given that, as mentioned above, so much musical education virtually ignores aspects of the music other than pitch (and sometimes rhythm, but mainly as it applies to motives), this is not surprising.

[27] Huron (2006, 22) points out that when the “danger” resolves very fast, the feeling is distinctly more pleasurable, due to the contrast between the momentary doubt and the satisfaction of resolution.

[28] A fascinating discussion of this phenomenon in literature can be found in Brian Boyd, On the Origin of Stories (Cambridge, MA: Harvard University Press, 2009).

[29] Kahneman, 71.

[30] Of course, new dangers can arrive even when we have already noticed patterns, but the point here is our increased comfort when we feel we have understood the situation.

[31] For a detailed discussion of active expectations, see Kahneman, 72, and Huron, 2007, passim.

[32] Chopin’s Prelude in C-sharp minor might seem an exception, but though it starts with a pair of contrasting motives, these are repeated several times in alternation, with mild harmonic variety, rather than leading into new terrain. Even when there seem to be a few more ideas, as in Prokofiev’s Visions Fugitives, no. 3, the degree of novelty and contrast remains relatively low: the ideas are carefully prepared through common elements, smooth voice leading, etc., and the composer generally returns before long to the previous idea, in a very salient way.

[33] Kahneman, 71–72. For more on surprises in music, see Huron, 19–40, 269ff.

[34] Some listeners might not be surprised by these three motives, but if instead we try just repeating the first motive several times in sequence, it is evident that Mozart has given us more than the minimum amount of novelty required to maintain interest. The striking contrasts in Mozart’s version capture our attention.

[35] In “Crescendo/Diminuendo Asymmetries in Beethoven’s Piano Sonatas,” Music Perception 7, no. 4 (1990), 395–402, David Huron finds that, overall, crescendos are more frequent and longer than diminuendos. This ties into our discussion: diminuendos give a sense of energy loss, which can make it sound as though a piece is coming to an end. Having more crescendos than diminuendos sounds right intuitively, especially when we think of the energetic quality of so much of Beethoven’s music.

[36] Kahneman, 75.

[37] As Kahneman (ibid., 77) notes, “The prominence of causal intuitions is a recurrent theme in this book, because people are prone to apply causal thinking inappropriately.”

[38] Bregman (1999) discusses in depth the way our minds separate auditory stimuli into various planes of perceived sound.

[39] Kahneman, 87.

[40] Ibid., 387.

[41] Ibid., 386ff.

[42] Something extremely unusual, like adding trombones at m. 7, would make it still more striking, but in the context of a work for solo harpsichord, the result would just sound like a chance coincidence or a musical joke.