ISSN 2158-5296
cycle, pulse, projection, grouping, tempo, timing
This essay develops and applies a theoretical language for describing a special family of musical cycles that feature five elements at very slow tempos, with the aim of appreciating the listening experiences they afford. Referring to experimentally determined constraints on the perception and memory of duration and content, it shows that slow 5-cycling occupies liminal cognitive territory in which sequential expectations begin to dominate over timing expectations. To explore the nature and variety of effects that are possible in this process, it presents transcriptions and analysis of five examples of world music of diverse genres and cultures (work song, ritual, and entertainment) and in a range of textures including solo, choral, call-and-response, and polyphony. Through analysis of grouping, pulse salience, and durational projection (that is, of the features that inform listeners’ sequential and timing expectations), it shows how slow 5-cycles afford some temporal experiences that differ from those in the more common fast 5-cycles, and that they can differ from each other substantially.
John Roeder is an emeritus professor of music theory at the University of British Columbia (Canada).
[1] Cyclical music, which features persistent immediate repetition, serves many purposes in cultures across the world, accompanying dance, ritual, repetitive work, and entertainment. It has also attracted the attention of music theorists who are interested in understanding how it engages basic musical cognitive processes. Such inquiries may benefit from considering a wide variety of examples such as are offered by early recordings of disparate and unconnected cultures. This essay describes and analyzes five diverse examples of a special family of musical cycles, with the aim of appreciating the listening experiences they afford by assessing how they engage with some basic aspects of rhythm perception.
[2] Recent psychological approaches to framing Western music theory suggest conceiving of cycling in terms of the expectations that it raises (Meyer 1989, Huron 2006, Margulis 2013). The expectations are of two different types: what will next happen, and when something will next happen. The first type, sequential expectation, engages our ability to remember and recognize the repetition of a series of event qualia, or properties, such as pitch or timbre, even if the rhythm of that series varies. It explains how we recognize rhythmically varying spoken phrases or pitch motives and anticipate their continuation. The second type, the expectation of when something will happen, involves different perceptual mechanisms of timing: the assessment of duration, by comparing it to an intuited temporal metric. Theorists have identified two of those mechanisms as the entrainment of a pulse (a regularly repeating duration; London 2012) and durational projection, or the anticipation that a just-completed duration will be immediately reproduced (Hasty 1997). Sequential and timing expectations are so cognitively fundamental and are defined so independently of particular musical styles that it seems reasonable to assume they apply to the listening and making of music generally, not only in Western cultures.
[3] In turn, they inform two complementary ways of conceiving of the elements of a cycle. In one conception, the cycle consists of a series of qualia, such as pitch or timbre, each associated with an event. The series may have a specific rhythm, but the cycle is characterized essentially by the repetition of the same qualia in the same order, even if the rhythm varies. In the other conception, when a pulse is clear and countable, we may experience the cycle as a duration spanned by a certain number of beats that may or may not be marked by events. According to the latter conception, the vamp rhythms of the original Mission: Impossible theme and the Brubeck quartet’s “Take Five” are both 5-beat cycles, even though one has four events and the other six. The event qualia that define a sequential cycle need not begin on beats, and the beats by which we characterize a timing cycle need not coincide with event onsets.
[4] Both types of expectation are subject to certain general cognitive constraints summarized in Huron (2006), London (2012), and Margulis (2013). It is difficult to remember and recognize the repetition of series with a large cardinality (number of elements, be they qualia or beats). It is difficult to recognize the immediate reproduction of undivided durations longer than two seconds. And it is difficult to entrain (develop regular peaks of attention corresponding to) pulsation faster than 250 BPM (beats per minute) or slower than 30 BPM; pulsation is most salient between 80 and 100 BPM, corresponding to pulse durations of 600–750 milliseconds.
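The correspondence between tempo and pulse duration quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and treating the cited limits as hard cutoffs is a simplifying assumption, since perceptual salience presumably falls off gradually rather than at sharp boundaries.

```python
def bpm_to_ms(bpm: float) -> float:
    """Duration of one beat, in milliseconds, at the given tempo."""
    return 60_000.0 / bpm


# Limits quoted in the paragraph above, treated here as hard cutoffs
# (an assumption made for illustration only).
ENTRAIN_MIN_BPM = 30.0
ENTRAIN_MAX_BPM = 250.0
SALIENT_MIN_BPM = 80.0
SALIENT_MAX_BPM = 100.0


def describe_pulse(bpm: float) -> str:
    """Rough label for how readily a pulse at this tempo might be felt."""
    if bpm < ENTRAIN_MIN_BPM or bpm > ENTRAIN_MAX_BPM:
        return "hard to entrain"
    if SALIENT_MIN_BPM <= bpm <= SALIENT_MAX_BPM:
        return "most salient"
    return "entrainable, less salient"


for bpm in (250, 100, 80, 30):
    print(bpm, "BPM ->", round(bpm_to_ms(bpm)), "ms,", describe_pulse(bpm))
```

The loop confirms that 100 and 80 BPM correspond to beats of 600 and 750 milliseconds, and that the slow entrainment limit of 30 BPM corresponds to a beat of two seconds, the same figure cited for the reproduction of undivided durations.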
[5] Accordingly, it would seem that short cycles with a clear regular pulse and few elements would be best tuned to maintain attention by raising and satisfying expectations. But there is a catch: if a cycle repeats exactly, with no change in any aspect, listeners tend to habituate to it quickly. They may feel rewarded for being able to predict what happens next, and even take pleasure in moving cyclically to the music. But their other and richer sorts of responses involving expectation, such as imagination, tension, reaction, and appraisal (Huron 2006, 16), may be thwarted. So, composers of cycles face what we might call the strict cycle paradox: to minimize habituation, exactly repeating cycles need to be continually interesting.
[6] The theoretical and analytical literature discusses many ways that interest can be composed into cycles. Rhythms can be chosen to afford hearing different conflicting pulses (Locke 2010). Pitch sequences can be designed to segment into concurrent auditory streams (Bregman 1994, Tenzer 2022). The qualia patterns can offer the listener a flux of hierarchical depth (Yust 2018): instead of every beat or event being equally important to the identity of the cycle, series of pitches can be organized into shorter memorable groups that establish and develop motives or other processes, or phenomenal accents can be timed to group the tactus into binary or ternary meter.
[7] Another strategy to attract continuous attention involves arranging tempo, rhythms and grouping structure to avoid implying a steady pulse in a tempo within the most easily entrainable range of 80–100 BPM. Example 1 shows such a passage and explains its interest with reference to Hasty’s theory of durational projection. It offers two possible transcriptions of a passage from “The Grinding Stone,” a game song recorded in 1936 by an Estonian women’s family group.
[8] Theoretically, from the singers’ steady attacks one could infer a beat at a tempo of 195 BPM, represented as eighth notes in the example. But experientially that tempo is more rapid than we would prefer as a tactus, and some beats are not marked by events. We might gravitate toward another pulse by tracking how the same series of durations and pitches repeats every 1.5 seconds. That period is slower than we would prefer as a tactus, but it is well within the range in which durational repetition can be felt. The perception of the repetition of this cycle duration is symbolized on the example by the series of arcs between the two staves: when we hear a 1.5-second duration between corresponding events in consecutive repetitions (symbolized by a solid arrow), we project that it will repeat immediately, and it does, as symbolized by the following dashed arc. These regularities (both the fastest beat and the periodicity of the repeating pitch-and-duration series) are perceptible, but neither is as salient as it would be if its tempo were between 80 and 100 BPM.

Example 1. Two transcriptions of a passage from “The Grinding Stone”
[9] The most salient durations of this passage are the interonset intervals that span two or three of the fastest beats. Instances of them are indicated by solid arrows at the top and the bottom of the example. If either one repeated persistently and immediately, we could easily entrain to a pulse of about 98 or 65 BPM, respectively (195 BPM divided by two or by three). But neither one does. Instead, our projections of immediate repetition, symbolized by the dashed arcs, are constantly denied, as the rhythms prompt us to continually reset our shortest-term expectations. To be sure, we may readily perceive and move to the cyclical alternation of short (2) and long (3) durations. But to the extent that we keep hearing the thwarting of projections, the cycling seems much more dynamic than it would in, say, a straight 4/4 meter in which quarters, halves, and wholes constantly repeat. A similar dynamism can manifest even in pure duple meters through the use of syncopated rhythms such as tresillo, clave, and other rhythms that have been called “3-generated” (Cohn 2016), “diatonic” (Rahn 1996), or “Euclidean” (Toussaint 2013), and in those contexts it is often associated with the sense of “groove” (Danielsen 2006). But it necessarily manifests in the cycling of rhythms that span a prime number of beats greater than three.
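The contrast between denied local projections and a realized cycle-level projection can be sketched computationally. The following Python fragment is a deliberately crude model, not Hasty's theory itself: it assumes that a projection counts as "realized" only when a duration exactly equals the one immediately before it, and the function name is hypothetical.

```python
def projection_outcomes(durations):
    """For each duration after the first, report whether it reproduces
    the immediately preceding duration ("realized") or not ("denied")."""
    return ["realized" if cur == prev else "denied"
            for prev, cur in zip(durations, durations[1:])]


# Interonset intervals measured in fastest beats: short (2) and long (3)
# alternate, as in the repeating pattern discussed above.
surface = [2, 3] * 4

# Durations between corresponding events in consecutive cycles:
# each short-long pair sums to one 5-beat cycle.
cycle_level = [sum(surface[i:i + 2]) for i in range(0, len(surface), 2)]

# A straight duple pattern, for comparison.
duple = [2, 2, 2, 2]

print(projection_outcomes(surface))      # every local projection is denied
print(projection_outcomes(cycle_level))  # every cycle-length projection is realized
print(projection_outcomes(duple))        # every projection is realized
```

Under these assumptions the alternation of 2 and 3 never reproduces the preceding duration, while the 5-beat cycle always does, which is one way to picture the tension between short-term and cycle-level expectations.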
[10] This particular pattern of pitches and durations in “The Grinding Stone” is especially engaging because it is multivalent: it affords alternating our attention between two mutually exclusive ways of grouping the events, as expressed by the differences in beaming between the two transcriptions in Example 1. If we focus on the recurrence of the melodic succession <F♯,E♯>, then the events seem to group as a long-short rhythm, as represented on the top staff. If we instead conceive of the pattern as alternating between a held F♯ and an E♯ prolonged by a neighbor tone, then it seems to group as short-long, as represented on the bottom staff. We can sense the difference in grouping whether or not we associate the event onsets with the fast beats. In the first way of hearing, the next iteration of the pattern starts sooner than we have projected locally; in the second, it comes later than we have projected. Concurrently, though, we can track the duration of the whole cycle well enough to feel that the beginning occurs when we expect, in tension with these more short-term senses of too soon and too late.
[11] All the busy interplay of temporal sensations described above results from the interaction between three distinctive features perceptible in the persistently repeated pattern. Firstly, its qualia series, because of the prime number of beats that it spans, resists grouping into equal durations, and yet contains few enough events to remember and follow while still affording some variety and contour. Secondly, the event series is brief enough that its duration can readily be projected and heard to be reproduced. This accounts for the characteristically cyclical sensation of turnaround: the growing expectation towards the end of each statement that the pattern will repeat again. Thirdly, the pulse underlying the pattern is too fast to use to comfortably measure longer durations, and instead inclines us to experience them as undivided. I will call such repeating patterns “fast 5-cycles.” Although perhaps not as widespread as syncopated rhythms in duple meters, they are still broadly employed, most familiarly in Balkan dance music, jazz, and popular music.
[12] Not all 5-cycles are so fast, however. Example 2 gives a framework to understand some basic differences between 5-cycles at different tempi, through simple diagrams that set aside considerations of grouping and restrict consideration to isochronous series of five events of distinguishable qualia, numbered 1 through 5. The top third of the example refers to fast-tempo cycles like Example 1, for which our expectations of repetition focus on the sensations of the repeated cycle period. In the middle of Example 2, we imagine moderating the tempo until the cycle’s total duration becomes too long to easily grasp and compare. Because of the cardinality of the cycle, this happens just as the pace of the events moves into the region of highest tempo salience, which also facilitates hearing and comparing their qualia. The total duration of the cycle may be grasped indirectly in terms of the entrainable beats (as in the “long-form meters” discussed in Clayton 2019). But the temporal experience is different from that of the fast 5, focusing on the durations from beat to beat as well as the qualia of each event in the sequence.

Example 2. Contrasting sensations of 5-cycles at different tempi
[13] Continuing to stretch the tempo, we arrive at a situation, schematized on the bottom third of Example 2, in which the events come too slowly to track as an isochronous pulse and the total duration of the cycle is far longer than we can project and hear realized. Our cyclical expectations focus on the sequence of event qualia, not their exact timing, that is, on what comes next, not exactly when. The slower the pace, the more important the qualia of the events are to the sensation of repetition. Instead of regular attacks, a cycle becomes a series of specific kinds of events, and the timing can vary somewhat in successive iterations without much disrupting the cyclicity. The sense of turnaround that is so crucial to cyclical experience arises in such slow cycles from our sequential expectation that the last event of each cycle will be followed by the first event of the next iteration, rather than from the projection that the cycle’s duration is about to be completed. And it is even possible for extra events to appear that are hierarchically subsidiary, in the sense that their presence or absence does not affect our sequential expectations: during the time from one element of the cycle to the next, we can accept these extra events, especially if they are not accented, and still recognize when the event we expect appears.
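The three situations schematized in Example 2 can be approximated numerically. The sketch below assumes an isochronous cycle of five events and borrows the two-second projection limit and the 80–100 BPM salience range cited earlier as hard thresholds, which is a simplification. The function name is hypothetical, and the three tempos are illustrative choices (195 BPM echoes Example 1; the other two are assumed values for the moderate and slow cases).

```python
def five_cycle_profile(event_bpm: float, n_events: int = 5) -> dict:
    """Summarize an isochronous cycle of n_events at the given event tempo."""
    event_s = 60.0 / event_bpm       # time from one event to the next
    cycle_s = n_events * event_s     # total duration of one cycle
    return {
        "event_s": round(event_s, 2),
        "cycle_s": round(cycle_s, 2),
        # Can the whole cycle duration be projected and heard reproduced?
        "cycle_projectable": cycle_s <= 2.0,
        # Does the event pace fall in the range of highest pulse salience?
        "event_pulse_most_salient": 80.0 <= event_bpm <= 100.0,
    }


for bpm in (195, 90, 40):
    print(bpm, "BPM:", five_cycle_profile(bpm))
```

With these values, only the fastest cycle is short enough for its total duration to be projected, only the moderate one has events in the most salient tempo range, and the slowest has neither property, leaving the sequence of qualia as the main carrier of cyclicity.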
[14] Thus, slow 5-cycles constitute perceptual boundary cases. Their pace precludes the kind and depth of entrainment afforded by repetitions at faster tempos. Their odd prime number of elements precludes the consistent duple or triple beat- and event-groupings common to isochronous metrical music. And the timing of their events is less salient than the events’ sequence.
[15] Although one might therefore imagine slow 5-cycles to be rare, they actually may be heard in many different kinds of music. The remainder of this essay analyzes and compares examples from early- to mid-20th-century recordings of music from societies then relatively unaffected by each other and by the assimilation of Western metrical practices. As in Example 1, the music is represented by transcriptions that combine Western pitch and durational notation with timing data determined with the help of audio software. Most of them include just the essential aspects of multiple iterations of a cycle, rather than the details of a particular instance. Although they are therefore neither precise nor emic, they suffice to support the kind of observations about temporality with which I am concerned, which are not affected by slight differences of pitch and timing. They are ordered from the simplest to the most complex interactions of durational and sequential expectations that they afford. Through them, the essay develops and applies a theoretical language to describe the variety of special temporal experiences that are afforded by the constant reiteration of an asymmetric series of event qualia.
[16] For a preliminary but direct encounter with some of those experiences and to further refine our conceptions, let us consider an actual example of a slow 5-cycle. It is a recording of Zinzir, a member of the Gizra people in Papua New Guinea, playing a bamboo flute. The cycle length, over 14 seconds, seems extreme, but it is not unique; one can encounter even slower cycling in the examples of Mongolian long song and Noh theater in the Voix du monde recording collection.

Example 3. Very slow sequential 5-cycling in “Flute”
[17] The transcription in Example 3 represents its first four cycles (Roeder 2019), positioning the events time-proportionally, that is, horizontally in correspondence with their onset times (determined from the waveform display in Audacity). Each cycle consists of many variably timed events, but the succession of pitches repeats almost exactly each time. I characterize the cardinality of the cycle as five because, although there are many pitch-events, the event succession segments hierarchically into five distinct groups, each including a single most accented event. The hierarchy is rhythmic, not metric: each group begins with a flurry of brief pitches that lead to a pitch event marked by its long duration relative to them.
[18] In faster cycling like that of Example 1, a sense of drive from cycle to cycle and event to event arises from timing expectations of pulse and durational projection. But this cycle lasts too long and the timing of the accented events (between 1.8 and 2.4 seconds) is too slow and inconsistent for that. One might sense that some durations are immediately reproduced, but that does not occur regularly, so sensations of cycling depend mainly on recognizing the repeating series of accented pitches <E5, E5, C5, E5, A4>. (See Roeder 2023 for another example from a nearby culture where this is also the case.) The cycling is of a sequence of qualia, not of isochronous beats.
[19] Despite the lack of pulsation, some pitch processes afford sensations of continuity and turnaround. Three pitch gestures recur across the groups during each cycle, indicated by differently colored lines on the first system: a rise from A4 to E5, an alternation of E5 and G5, and a descent from G5 to A4. These are arranged along with the pattern of long notes to provide a sense of progression from group to group and to give a sense of a departure from, then return to, the A4.
[20] The player’s breathing, indicated by rests on the transcription, suggests grouping the accented events as 2+1+2: the opening two associated with the A4-to-E5 gestures; a contrasting middle group focused on the elsewhere deemphasized C5; then the closing two concerned again with E5 and A4. Five events suffice to make this minimal ternary design clear; any more (especially if all are similarly ornamented) might be harder to remember.
[21] This brief analysis shows, then, how both cyclicity and interest can be created even when the pace is too slow to raise or satisfy timing expectations. It suggests that it would be interesting to examine other (possibly less extreme) examples of slow 5-cycles. By comparing them, one may start to gain a sense of the varieties of experiences that can be afforded under those constraints. The analyses will focus especially on identifying the five accented events, how they group, how other events group with them, processes of continuity and form, turnaround, and any timing expectations they raise and satisfy or deny.

Example 4. 5-cycling in “The Grinding Stone”
[22] The fast 5-cycle discussed in connection with Example 1 is actually part of what I hear as a slow 5-cycle that is transcribed in Example 4. It begins with a call from a soloist whom the other singers then join in a response. One way to characterize it is as composed, like the flute piece, of five timespans, each initiated by a marked event. These five events are indicated by numbers over the score: the onset of the soloist’s first highest pitch and then the four durationally accented events in the response. As a series, a group, the five are distinctive and memorable. They also come regularly and often enough to be sensed as a slow beat, although its 40 BPM tempo puts it well out of the range of a comfortable tactus. Moreover, the total duration of the repeating group, about 7.5 seconds, is too long to measure. At this scale, the turnaround anticipation of the beginning of each cycle is due more to the repetition of the qualia sequence than to the anticipation that such a long duration will be exactly reproduced. Specifically, I learn to expect that after every four consecutive choral accents of the {D4, F♯4} dyad, the soloist will reenter with an anacrustic climb to her F♯4s.
[23] Durational projection still plays a role in my experience, however, provoked by the highly asymmetric form of the cycle. Although the soloist initiates each round, the most striking beginning is the first attack of the choral response. Not only is it texturally, dynamically, and durationally accented, it also initiates a span of clearly realized projections of the 5-eighths durations, as indicated by the solid and dashed arrows immediately above the transcription. Indeed, the response is just long enough to hear a 10-eighths duration from event 2 to event 4, immediately reproduced from event 4 to the next event 1—the soloist’s high note.
[24] In contrast, measurement during the solo is less certain. It does not participate in another reproduction of a 10-eighths duration. Also, as the arrows at the beginning of the transcription show, the projection of a quarter note suggested by the soloist’s anacrusis fails to be clearly reproduced. One might hear the 5-eighths of the repeated F♯s as another reproduction of the chorus’s prominent duration, but the weakness of the solo line compared to the subsequent stronger chorus attack gives that span the feeling of a continuation rather than of a dominant beginning. So, instead of understanding the five accented events as grouped into 2+3 or 3+2 beginning with the soloist, I hear a 4+1 grouping that begins with the chorus entrance. And since the choral responses suggest a fast 5-cycle themselves, as detailed in Example 1, I start to experience all the fast-5 effects I described earlier, but then must abandon them at the next interrupting call.
[25] The cycle is thus too slow and lopsided to manifest the typical fast-5 strategy of grouping maximally evenly into 2+3 or 3+2. Rather, its longer span and content afford a richer interplay of expectations. The cycling is further enlivened by substitutions the chorus makes in the first two groups of their response, shown by the starred ossia below the score. These variants appear unpredictably, sometimes in the second event group, sometimes in the third, and sometimes in neither. If we try to hear consistent beginnings and continuations based on when these substitutions occur, we will be disoriented every time.
[26] The next example, from Bulgaria, is a Slavic women’s choral social song like “The Grinding Stone,” but it is organized to create different temporal experiences with its distinctive slow 5-cycle. As shown by the transcriptions in Example 5, “Todorka Platno Tāčeše” repeats pairs of phrases, each pair sung alternately by two small groups on the same text. A steady 78 BPM tactus, notated here as quarter notes, is evident throughout each group’s turn. The total duration of the cycle likewise is over 7.5 seconds. Again, that is too far past the entrainment limit for a listener to develop much anticipation of the duration itself repeating. Indeed, as one group finishes, they sustain the last note past the entrance of the next group; occasionally that entrance continues the previous beat but often does not.
[27] Without that continuing pulse, it is difficult to feel the rhythmic cycle leading into its own persistent repetition. To borrow Victor Zuckerkandl’s (1956) metaphor, I do not experience its meter as if I were regularly bobbing down and up a series of regularly cresting waves. Its ending lacks an extended anacrusis, the upwelling of the coming wave before it peaks. It’s more like bodysurfing, in which the waves come more irregularly, each unleashing a rush of energy as it breaks that dissipates as I ride it onto the beach. Drawing on my sequential memory, I expect that the next wave will come, but my previous ride does not help me anticipate exactly when. The cycling feels imposed by an outside force rather than flowing naturally from its own metrical design, perhaps a reflection of the “ritual” genre to which the recording’s liner notes ascribe it.

Example 5. Two hearings of duration projection in “Todorka Platno Tāčeše”
[28] I do experience the characteristically cyclical sensation of turnaround, or anticipation of the next cycle-beginning, but it does not involve the metrical sense that an exact time span is about to be completed then reproduced. Rather it arises for me from a process of varying projective strength and toggling between two pitch poles as beat markers. The melody seems organized cleverly to promote these sensations. It also remains interesting as it repeats by offering the possibility of different interpretations of durational expectation, which are shown at the top and bottom of Example 5.
[29] Interpretation 1 focuses on regular accents of duration, leap and contour that afford grouping the tactus into five slow beats about 1.5 seconds apart, as in “The Grinding Stone.” The phrase begins on D4 with an anapestic setting of the word “todorka.” The strong accents of leap and duration set the onset of G4 as a dominant beginning of a half-note projection that is continued anacrustically by the beginning of a repetition of the anapest. When this projection is realized at the durationally accented onset of E, I strongly project a third beat a half-note later.
[30] However, the timings of the subsequent anapest and accents detract from that projection. A third statement of the anapest begins, but because the first short sound is a leap-accented G4, which was previously heard on the first slow beat, it sounds less anacrustic. At the moment (marked by an X under the E4) when I expect the onset of the long duration of the anapest articulating the third slow beat, I hear only a short duration continuing the stepwise descent. This truncates the expected anapest, and also turns out to begin a new anapest that places its durational accent later than the expected beat 3. But the surrounding environment supports neither the G nor yet the long D as the beginnings of realized half-note projections. The lack of accent when we expect beat 3 makes it hard to hear the cycling as durational.
[31] Compounding the uncertainty, both G and the anapests now disappear. The melody peaks instead at an embellished F♯. By taking that moment to mark slow beat 4, I am rewarded when an anapest returns a half-note later to affirm beat 5 with its long note. But G no longer participates in the projective action; D has taken over G’s role as dominant beginning, a switch that was hinted in mid-melody. By holding it so long, the singers seem to emphasize its new status.
[32] All this action happens even the first time through, before cycling becomes evident. But when the other section of the choir immediately sings the same phrase, the D dramatically reverts to its first function as an anacrusis to G in the opening anapest. This establishes a sense of polarity, imbuing D with an expectation that G will follow. Each cycle then plays out the shift of accentual role from one pole to the other, and this predictable oscillation of quale compensates for the ephemeral cycling of beats.
[33] I can hear the same shift when I opt to hear a different cyclical process represented by Interpretation 2 at the bottom of Example 5. This reading develops from attending not to accent or pulse but to a sequence of specific grouping quale, the beginning of the recurring anapest motive, and the projections it affords. The onsets of the first and second anapests initiate a half-note projection that is realized by the beginning of the third anapest, indicated by the numbers and arcs on the analysis, and that gives the G4 the projective role of dominant beginning. As in Interpretation 1, there follows projective disruption, as the early entrance of a fourth anapest disrupts the projection from 2 to 3. By hearing its first pitch E4 as a dominant beginning, I can hear a half-note projection starting on the F♯4 that is realized by the attack of the final held D4. But that projection is disrupted by the onset of the fifth statement of the anapest, on the E4, and metrical hiatus ensues after the last D4. As in Interpretation 1, this event takes on projective significance, a counter-pole to the G4s that will reappear as the cycle restarts.
[34] In both interpretations, the slow tempo and the quintuple organization contribute to this process. The deliberate pace makes more salient the durational accents that play a role in destabilizing the beat, and makes it easier to grasp the anapest as a motivic unit and recognize its truncation. With the total length of the cycle too long to attend to, I focus on the more salient half- and quarter-note durations that are involved in the projective activity. The 5-half-note cycle length is the minimum needed to establish the durational projection (and the identity of the motive) from 1 to 2, vary it around beat 3, and then restore it at beats 4 and 5.
[35] The quintuple cardinality and unmeasured pause between cycles in “Todorka” also feature in the next example, but its projection and grouping structure afford some temporal experiences that differ from those of the songs just studied. It is a “Boat Song” performed in 1970 by a group from the Bavuma people who reside on an island in Lake Victoria. It cycles a series of three brief call-and-response pairs, indicated by the top slurs on Example 6. In each of these three groups, the caller sings a distinctive type of text, with the first having non-semantic vocables, the second variable words, and the third a consistent refrain repeating the word “mwango.”
[36] Through repeatedly realized projections, a slow pulse can also be heard to emerge and fade briefly during each cycle, created by the regular marking of five events about 1.3 seconds apart. The first projection is set up by the duration from the onset of the long high G in the first call to the caller’s next attack, which begins the second group. Its projection is realized within the second group at the durationally accented terminal note of the call. The projection of the duration from the caller’s incipit to terminus within the second group is realized at the contour peak of the next group, when the caller emphasizes “mwango.” Then the projection of that duration is realized when the caller returns to and durationally emphasizes “mwango.” Thus, the projections are realized alternately within and across groups.

Example 6. Varying salience of duration in the “Boat Song” sequential cycle
[37] Along with the tenuous slow pulse, there may be heard an increase in the duration and cardinality of the groups and in the number of accents they span. The first group begins with an anacrusis and contains only one accented event. The second group begins with an accented event and ends after the next accented event. The third group does the same as the second, but also includes an anacrusis as did the first group. This sort of growth process was not evident in the grouping of the Slavic choruses.
[38] Although the cycles consistently last about 6.6 seconds, this periodicity is too slow to be entrained, especially considering that the five accented events do not evenly subdivide that timespan. After the last group, the caller takes a short hiatus: sometimes he begins the anacrustic E♭4 after the chorus’s F3, and sometimes slightly before, but never on the chorus’s B♭4. So, I must hear the previous 1.3 second projection realized late or not at all, and the accented moment at the beginning of the next cycle does not maintain the 46.5 BPM beat. After each such cycle-ending break, the beat needs to be re-established with three regular attacks. But with only five such accented attacks before the hiatus, the beat is clearly entrainable only during the caller’s refrain-like third group (and it is emphasized by the only repeated word, “mwango”). At the same moment, but not at other times during the cycle, I am given the opportunity to perceive a longer duration (2.7 seconds) projected and realized before the caller’s hiatus compels reorientation to a new beginning.
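The mismatch between the slow beat and the cycle period can be made concrete with rough arithmetic. The sketch below uses only the approximate figures reported above (a 46.5 BPM beat, five accented events, a cycle of about 6.6 seconds) and assumes, for illustration, that the five accents are evenly spaced; the variable names are hypothetical and actual timings vary from cycle to cycle.

```python
BEAT_BPM = 46.5   # slow beat reported for the accented events
CYCLE_S = 6.6     # approximate cycle duration reported in the text
N_ACCENTS = 5     # accented events per cycle

# Assumption: the five accents are evenly spaced at the reported tempo.
beat_s = 60.0 / BEAT_BPM               # time between consecutive accents
span_s = (N_ACCENTS - 1) * beat_s      # from the first accent to the fifth
turnaround_s = CYCLE_S - span_s        # from the fifth accent to the next cycle's first

print("beat:", round(beat_s, 2), "s")
print("first to fifth accent:", round(span_s, 2), "s")
print("fifth accent to next cycle:", round(turnaround_s, 2), "s")
print("excess over one beat:", round(turnaround_s - beat_s, 2), "s")
```

On these assumptions the gap across the turnaround is somewhat longer than one beat, so a listener projecting the 1.3-second duration would find it realized late, which agrees with the hearing described above.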
[39] Compensating for the variable presence of a pulse, grouping structure affords a strongly cyclical experience. The small number of accented events and the growth and variegation they manifest make it easy to remember the series and hear its immediate repetition. This creates a lively tension: even though the cyclical timing expectations are regularly thwarted, the sequential expectations are strongly satisfied.
[40] The last slow 5-cycle I will discuss is especially interesting for the way that it plays with durational expectation and gradually reveals quintuple organizations. It is the 1968 recording of a performance by Constance Magogo, a Zulu princess recognized as the greatest living authority on her people’s music (Rycroft 1975). She sings while accompanying herself on a musical bow called an ugubhu, which resembles a capoeira berimbau in structure and sound.
[41] In the introduction, played on the ugubhu solo, distinctive features of timing and content make it difficult to discern sustained pulse, grouping, durational reproduction, and cyclicity. The introduction alternates two kinds of very low events that differ only slightly in pitch and timbre (pitch change is achieved by pinching the open string). As the spectrogram and time-proportional transcription in Example 7 illustrate, these events are not sustained. Rather, each one consists of a percussive bow strike on the bowstring followed by an unpredictably variable number of bow ricochets in an unpredictable rhythm. Judging from recordings of other songs by the same expert performer, this is not the only way to play the ugubhu, so it seems appropriate to consider how the technique interacts with other specific features of the song.

Example 7. Pulse fluctuation in the introduction to a Zulu song
[42] Sometimes the main strikes of the bow appear regularly at a tempo of about 84 BPM, as is indicated by brackets below the transcription. That pulse should be easy to entrain. Militating against this, however, are a lack of pitch change on the beat as well as the jittery rhythms of the ricochets, some of which are as salient as the main strikes and do not subdivide the pulse equally. When there are five regular main strikes in succession, the pulse gains more salience before the disruption than when there are only three in succession. This distinction affords hearing a two-phase cycle of fluctuation from stronger to weaker manifestations of the pulse. I need not count fives to hear this, yet a quintuple organization does become somewhat apparent if I attend to longer durations rather than to the ricochets. As shown at the bottom of the example, every 1.4 seconds there is a main bow strike, usually at a (weak) change of pitch. So the repeating two-phase process involving the fluctuating 84 BPM pulse lasts five of these slow beats. I experience the five as the cardinality of a series of events that I count, not as a duration.
[43] Similarly vague in the introduction is the grouping structure. The main bow strikes may be conceived as a repeating 8-event series: a long A, then four short G♯s, then a long A, then two short G♯s. Conceived this way as a fixed object, it seems to manifest a characteristically African “imparity” pattern (Arom 1984, 57). Yet it is not easy to grasp as a whole, not only because of its overly long duration but also because it articulates grouping structure almost equivocally. Hypothetically, one might conceive of the beginning of the cycle as marked by a change to a long A that interrupts a faster series of G♯s. But there are two such changes in each cycle, differing only in the number of preceding G♯s; moreover, the pitch differences are very small, and the varying ricochet rhythms work against associating the As that one might suppose to occupy parallel positions in successive iterations of the cycle.
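The relation between this 8-event series and the runs of five and three regular strikes noted earlier can be made explicit with a small Python sketch. It is an idealization rather than a transcription: it assumes that each long A lasts two pulses of the 84 BPM stream and each short G♯ one pulse, it ignores the ricochets entirely, and the function names are hypothetical.

```python
from itertools import groupby

# Main bow strikes of one cycle as (pitch, duration in pulses).
# ASSUMPTION: a "long" A lasts two pulses and a "short" G-sharp one pulse;
# the ricochets that follow each strike are ignored.
BOW_CYCLE = [("A", 2), ("G#", 1), ("G#", 1), ("G#", 1), ("G#", 1),
             ("A", 2), ("G#", 1), ("G#", 1)]


def strike_onsets(events, repetitions=2):
    """Return onset times (in pulses) of all strikes over several cycles."""
    times, t = [], 0
    for _ in range(repetitions):
        for _pitch, duration in events:
            times.append(t)
            t += duration
    times.append(t)  # the strike that begins the following cycle
    return times


def regular_runs(times, pulse=1):
    """Return sizes of maximal runs of strikes spaced exactly one pulse apart."""
    intervals = [b - a for a, b in zip(times, times[1:])]
    runs = []
    for on_pulse, group in groupby(intervals, key=lambda ioi: ioi == pulse):
        if on_pulse:
            runs.append(len(list(group)) + 1)  # n intervals join n + 1 strikes
    return runs


cycle_pulses = sum(duration for _pitch, duration in BOW_CYCLE)
runs = regular_runs(strike_onsets(BOW_CYCLE))

print("pulses per cycle:", cycle_pulses)
print("runs of regular strikes:", runs)
```

On these assumptions each cycle spans ten pulses, that is, five slow beats of two pulses each, and the regular strikes fall into alternating runs of five and three, each run ending on a long A.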
[44] The introduction repeats this slow bow cycle three times, barely sufficient for a listener to register these effects, before the princess starts to sing. She ushers in several changes that affect how one senses the cyclicity of timing and qualia sequence. Example 8 transcribes a representative pair of cycles, orienting to the 84 BPM pulse stream, which it represents with quarter notes. The bow rhythm becomes much more regular than it was in the introduction, providing a main strike every quarter note and controlling the ricochets to subdivide each pulse equally. With the repetition of the already established rhythm of slower pitch changes, three clear levels of beats emerge to form a steady meter. The vocal melody contributes regular attacks on the tactus and the slow beat.
[45] More than just reinforcing entrainment, however, the melody creates a definite sensation of cycling by affording expectations of qualia succession. Changes of pitch mark the five slow beats, creating a repeating 5-pitch series <E4, D♯4, B♯3, A3, B♯3>. The change from D♯4 to B♯3 makes the third beat more distinctive than it was in the bow pattern alone, where there was no change of pitch.

Example 8. Emergence of clearer cycling later in the Zulu song
Audio Example 7.
[46] The singing also makes more definite the two phases of grouping structure that could vaguely be sensed in the introduction. In the first phase, the performer hums the phoneme “m” five times—thus manifesting five as the cardinality of a group of events as well as of slow beats. The last of the five hums is marked by the change of harmonic interval between vocal and bow pitches from a fifth to a third. In the second phase, she sings other phonemes—often words that vary from cycle to cycle—set as pitch leaps that stress the slow beats by a repeated anacrustic rhythm. These two vocal groups thus respectively include 3 and 2 slow beats, consistent with the grouping of the bow pitch changes.
[47] These changes make it possible to hear durations longer than 1.4 seconds as projected and realized. Yet, as in the fast-5 Example 1, accent and grouping offer two plausible ways to do so. In the hearing represented by projective symbols above the score, the first and last vocal hums function as dominant beginnings, and so the first whole-note span (Q) of the passage sets the measure for the following music. Its projection (Q’) could be heard to be realized by taking the second vocal “-ne” as the next dominant beginning. However, in order to perceive the beginning of the next cycle as the next dominant beginning, the listener must judge that the whole-note projection Q-Q’ was in fact not realized, but that a new dotted-whole projection P arises. But to hear the next cycle as having the same meter as the previous one, the realization P’ of this dotted-whole projection must be heard to be cut off by a dominant beginning on the fifth hum. The other hearing, represented below the score, also takes the first whole note (R) as projective (R’), but, cued by the stronger durational accent and bow-pitch change on the first vocal “ne,” revises the initial projection to be a dotted-whole (S). As in the first hearing, though, the beginning of the next cycle denies realization S’ of this projection, and affords hearing the realization of the whole-note projection T-T’. Both hearings involve the constant revision of durational expectations, as determined by the particular pitches and rhythms of the cycle.
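The durations at stake in these two hearings can be related to the cycle with simple arithmetic. The Python sketch below is a deliberate simplification: it reduces each hearing to the two spans that partition one cycle (Q then P above the score, S then T below it), omits the unrealized projections, and assumes that the 1.4-second slow beat corresponds to a half note at the 84 BPM quarter-note pulse. The variable names are hypothetical.

```python
from fractions import Fraction

TEMPO_BPM = 84                         # quarter-note pulse, from the text
QUARTER_SEC = Fraction(60, TEMPO_BPM)  # duration of one quarter note

WHOLE = 4         # whole note, in quarters
DOTTED_WHOLE = 6  # dotted whole note, in quarters
SLOW_BEAT = 2     # ASSUMPTION: the 1.4-second slow beat is a half note

# Each hearing reduced to the two spans that partition one cycle.
hearings = {
    "above the score (Q, then P)": [WHOLE, DOTTED_WHOLE],
    "below the score (S, then T)": [DOTTED_WHOLE, WHOLE],
}

cycle_quarters = 5 * SLOW_BEAT  # five slow beats per cycle

for label, spans in hearings.items():
    seconds = [round(float(span * QUARTER_SEC), 2) for span in spans]
    fills_cycle = sum(spans) == cycle_quarters
    print(f"{label}: {spans} quarters, {seconds} s, fills cycle: {fills_cycle}")

cycle_seconds = float(cycle_quarters * QUARTER_SEC)
print(f"cycle: {cycle_quarters} quarters, about {cycle_seconds:.2f} s")
```

On this reading the two hearings divide the same ten-quarter cycle in opposite orders, as a whole note followed by a dotted whole note or the reverse, and the whole cycle lasts roughly seven seconds.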
[48] The song and revised ugubhu rhythm thus reveal two quintuple groups: the five steady hums in the first sung segment, and the series of five events that are both beats to count and specific members of a predictable pitch sequence. Both of them afford my perception of the alternation and overlap of whole-note and dotted-whole-note projections, which gesture at a 5-half-note duration for the cycle that is beyond my ability to judge durational reproduction.
[49] I began this essay with a theoretical argument that slow 5-cycling occupies liminal cognitive territory in which sequential expectations begin to dominate over timing expectations. Consideration of five examples has shown how this abstract distinction may be heard to play out in various ways in music of diverse genres and cultures (work song, ritual, and entertainment) and in a range of textures including solo, choral, call-and-response, and polyphony. Through analysis of grouping, pulse salience, and durational projection, that is, of the features that inform listeners’ sequential and timing expectations, I have demonstrated how slow 5-cycles afford some temporal experiences that differ from those in the more common fast 5-cycles, and that they can differ from each other substantially. As the word “hearing” in this paper’s title and my persistent use of the first person indicate, I have framed my discussion as a reflection upon my own position as a Western-enculturated listener, not as an assertion about the nature of the music I discuss or about the conceptions and intentions of its creators. Nevertheless, I hope that the grounding of my discussion in basic perceptual mechanisms will make it possible to build bridges to indigenous conceptions, and to recognize a type of cycling that merits special consideration in our contemplation of repetitive processes in music across the world.
Arom, Simha. 1984. “The Constituting Features of Central African Rhythmic Systems: A Tentative Typology.” The World of Music 26 (1): 51–64.
Bregman, Albert S. 1994. Auditory Scene Analysis: The Perceptual Organization of Sound. MIT Press.
Clayton, Martin. 2020. “Theory and Practice of Long-form Non-isochronous Meters: The Case of the North Indian rūpak tāl.” Music Theory Online 26 (1).
Cohn, Richard. 2016. “A Platonic Model of Funky Rhythms.” Music Theory Online 22 (2).
Danielsen, Anne. 2006. Presence and Pleasure: The Funk Grooves of James Brown and Parliament. Wesleyan University Press.
Hasty, Christopher F. 1997. Meter as Rhythm. Oxford University Press.
Huron, David. 2006. Sweet Anticipation: Music and the Psychology of Expectation. MIT Press.
Locke, David. 2010. “Yewevu in the Metric Matrix.” Music Theory Online 16 (4).
London, Justin. 2012. Hearing in Time: Psychological Aspects of Musical Meter. 2nd ed. Oxford University Press.
Margulis, Elizabeth Hellmuth. 2013. On Repeat: How Music Plays the Mind. Oxford University Press.
Meyer, Leonard B. 1989. Style and Music: Theory, History, and Ideology. University of Chicago Press.
Rahn, Jay. 1996. “Turning the Analysis Around: Africa-derived Rhythms and Europe-derived Music Theory.” Black Music Research Journal 16 (1): 71–89.
Roeder, John. 2019. “Formative Processes of Durational Projection in ‘Free Rhythm’ World Music.” In Thought and Play in Musical Rhythm: Asian, African, and Euro-American Perspectives, ed. Richard Wolf, Stephen Blum, and Christopher Hasty, 55–74. Oxford University Press.
Roeder, John. 2023. “Hearing Durational Process in a Huli Song.” Music Analysis 42 (3): 262–268.
Rycroft, David K. 1975. “The Zulu Bow Songs of Princess Magogo.” African Music 5 (4): 41–97.
Tenzer, Michael. 2022. “Narrow Dimension: A Balinese Gamelan Angklung Cycle.” Presented at the joint conference of the Society for Ethnomusicology, the American Musicological Society, and the Society for Music Theory, New Orleans, LA, November 11, 2022.
Toussaint, Godfried T. 2013. The Geometry of Musical Rhythm: What Makes a “Good” Rhythm Good? CRC Press.
Yust, Jason. 2018. Organized Time: Rhythm, Tonality, and Form. Oxford University Press.
Zuckerkandl, Victor. 1956. Sound and Symbol: Music and the External World. Trans. Willard R. Trask. Pantheon Books.
Zinzir. 1979. “Flute.” From Australia: Songs of the Aborigines and Music of Papua, New Guinea. Lyrichord LYRCD 7331 compact disc.
Varbana, Anne with Matryona Suuvere, Natalia Varbana and Varvara Varbana. 2003. “Käsikivi [The Grinding Stone].” CD 3, track 2 of the Anthology of Estonian Traditional Music. Tartu, Estonia: Eesti Kirjandusmuuseum compact disc set EKMCD 005.
[Women of Ihtiman, Bulgaria]. 1990. “Todorka Platno Tāčeše.” Track 25 of “Two Girls Started to Sing…”: Bulgarian Village Singing. Cambridge, MA: Rounder Records CD 1055.
[Singers from the Bavuma group]. 2000. “Boat Song.” Disc 2, track 17 of Music! The Berlin Phonogramm Archiv Vol. 2 – Tape Recordings Mono, 1951-1974. Berlin: Wergo CD SM 1701 2.
KaDinuzulu, Constance Magogo. 2000. “Zulu Songs Accompanied by The Musical Bow ‘Ugubhu,’ Sung and Played by Princess KaDinuzulu. Village of Mahlabatini, South Africa, 1968.” Disc 2, track 15 of Music! The Berlin Phonogramm Archiv Vol. 2 – Tape Recordings Mono, 1951-1974. Berlin: Wergo CD SM 1701 2.