Performance of Enjambments

Reuven Tsur

The Performance of Enjambments, Perceived Effects,
and Experimental Manipulations

This paper offers further empirical evidence in favour of my conception of poetic rhythm and performance as presented in my book Poetic Rhythm: Structure and Performance -- An Empirical Study in Cognitive Poetics.[1] It claims that in an enjambment the performer may convey both the verse line boundary and the run-on sentence as perceptual units, however strained, by having recourse to conflicting phonetic cues: cues of continuity and discontinuity simultaneously. In my book I provided some empirical evidence for this assumption. Among other instances, I considered two readings of the same enjambment, by two leading British actors. I collected responses to these two performances from a wide range of listeners, who were unanimous in their judgment that in one of the two readings, but not in the other one, both the verse line and the run-on sentence are perceived as perceptual units. I have shown that in the former reading, but not in the latter, the performer had recourse to conflicting cues, of continuity and discontinuity, simultaneously. In this paper I am going one step further: I will experiment with performances of enjambments that have been judged to suppress the perceptual boundary of the first line. I will attempt to add conflicting cues by electronic manipulations and observe the outcome.

In my book I stated my position with reference to two issues in a recent "state-of-the-art" summary of performance, the "Performance" entry of The New Princeton Encyclopedia of Poetry and Poetics (1993). The first issue concerns delivery style: "C. S. Lewis once identified two types of performers of metrical verse: 'Minstrels' (who recite in a wooden singsong voice, letting scansion override verse) and 'Actors' (who give a flamboyantly expressive recitation, ignoring meter altogether)" (893). I argued that in between these two delivery styles there is a third one, which I call "rhythmical performance", and that this "type" is at the very core of poetic rhythm. The second issue concerns ambiguity. "Chatman isolates a central difference between the reading and scansion of poems on the one hand and their performance on the other: in the former two activities, ambiguities of interpretation can be preserved and do not have to be settled one way or the other ('disambiguated'). But in performance, all ambiguities have to be resolved before or during delivery. Since the nature of performance is linear and temporal, sentences can only be read aloud once and must be given a specific intonational pattern. Hence in performance, the performer is forced to choose between alternative intonational patterns and their associated meanings" (ibid.; cf. e.g. Chatman, 1965, 1966). I argued that this is not so. I also argued that the two issues are intimately related. In Wellek and Warren's terms, the Minstrel subdues prose rhythm, and foregrounds the metric pattern; the Actor subdues the metric pattern in favour of the prose rhythm. For Chatman this may be a slight exaggeration, but in principle this is how things are and should be: when prose rhythm and metre conflict, "the performer is forced to choose between alternative intonational patterns". 
My position is that there is a third, "rhythmical performance", in which both metric pattern and linguistic stress pattern can be accommodated, such that both are established in the listener's perception. The same holds true for the conflicting intonation patterns articulating the linguistic unit (the phrase or sentence), and the metric unit (the line). This is precisely what the perceived rhythm of poetry is about, and by no means a side issue.

In my 1977 book, A Perception-Oriented Theory of Metre, I suggested that when the endings of the syntactic unit and the metric unit do not coincide (that is, when syntax is run-on from one line to the other), the reciter may indicate continuity and discontinuity at one and the same time by having recourse to conflicting cues. I came to this conclusion in a speculative manner. Twenty years later, in his master's thesis, an empirical study of enjambment, Tom Barney (1990) found ample empirical support for this assumption. This he did without having heard of my work before. I too started at this point. But there is a difference of emphasis between Barney's and my own methodology. Barney investigated poems by Philip Larkin and John Betjeman read by the authors, submitting them to an instrumental analysis. He assumed that the authors knew what they were doing: he tacitly credited them with the ability to perform a piece of poetry in a way that suggests a line boundary and a run-on sentence at the same time. He didn't check alternative possibilities. I am usually not so generous, neither with poets, nor with professional actors, nor with colleagues from the academy. I prefer to collect judgments from students, colleagues or my research associates as to whether the performer was successful in conveying, e.g., the conflicting aspects of an enjambment. And if possible, I try to compare alternative possibilities. Then I look for cues in the phonetic structures of the recordings, trying to find support for the intuitive judgments. One way to compare alternatives is to compare different solutions to the same problem in different performances of one piece of poetry. In this paper I will attempt a more rigorous method: I will discuss performances that have been judged deficient in a certain rhythmic respect, will manipulate them electronically, and then see whether the manipulation improved the reading in the direction predicted by the cognitive framework.

Barney relied in his research on a paper by Gerry Knowles (1991), in which he investigated the nature of tone-groups. Knowles distinguished internally defined prosodic patterns and external discontinuities at the tone-group boundaries. The former consist in some consistent F0 pattern used in ordinary speech; the latter are temporal discontinuation (pause), pitch discontinuation (a sudden change in F0) and segmental discontinuation (that is, in normal speech the articulation of adjacent words is overlapping; when there is no overlap, it may count as discontinuity, even if there is no pause). Glottal stops in words beginning with a vowel, or word-final stop releases too may indicate segmental discontinuation. This would be the most evasive type of discontinuity. "The important distinction that seems to be emerging is between boundaries with or without pauses". In what follows, I shall explore how these correlates of tone-group boundaries can be exploited as conflicting cues for the perceptual accommodation of the conflicting patterns of speech and versification.

One of the most conspicuous kinds of segmental discontinuity is the prolongation of a phoneme or of a syllable at the end of an utterance, announcing (very much like fermata in music) that the preceding unit has come to an end. Prolongation is, in fact, a double-edged phenomenon, that is, in different contexts it has different, sometimes even opposite, effects. From a perceptual point of view, prolongation indicates lack of forward movement. Therefore, when we have reason to suppose that it occurs at the end of some perceptual unit, it will be perceived as reinforcing the sense of rest; when it occurs in the middle of some forward movement, it is perceived as an arrest, arousing strong desire for change. While this is most useful in the kind of research I am engaged in, there is a big problem with this notion. There is no standard by which we can determine whether a phoneme or sequence of phonemes is longer or shorter than it ought to be. Consequently, one must rely in this respect on one's intuitive judgment, or some roundabout reasoning about measurements and comparisons. In this paper a more "objective" criterion of prolongation will be added: the duration of speech sounds in actual performances will be artificially extended.

In another paper, Knowles (1992) explores the alignment of the F0 contour with vowels and consonants. "Although the effect of a tone might be to highlight a whole word or phrase, its focus is on a single syllable. Within the syllable it focuses on the vowel, and if the vowel is a diphthong, on one of the elements of the diphthong. Ultimately within the relevant vowel there is a single point which appears to be the focus of accentuation" (Knowles, 1992: 294). Knowles calls this point the accent point. Such points may be located in various places in the vowel. Accordingly, he speaks of early-peaking and late-peaking, as the case may be. "Peak position would seem to be a continuous variable" (ibid). For our present interest it is important that peak position may affect the grouping of syllables. The results of his research are completely independent of the needs of the present inquiry. Knowles suggests the possibility that behind the phonological contrast of tone there is a functional contrast between an "initial" marker ("late peaking") and a "final" marker ("early peaking").
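The notion of peak position as a continuous variable can be illustrated with a minimal sketch. It assumes, for illustration only, that the accent point may be approximated by the F0 maximum within the vowel, and that a relative position above 0.5 counts as "late"; neither the function names nor the threshold nor the F0 values come from Knowles, and all are hypothetical.

```python
def peak_position(f0):
    """Relative position of the F0 maximum within a vowel.

    0.0 means the peak falls at the vowel onset, 1.0 at the vowel offset.
    Using the F0 maximum as the accent point is an assumption of this sketch.
    """
    if len(f0) < 2:
        raise ValueError("need at least two F0 samples")
    peak_index = max(range(len(f0)), key=lambda i: f0[i])
    return peak_index / (len(f0) - 1)


def peak_label(position, midpoint=0.5):
    """Coarse two-way label; the 0.5 midpoint is an illustrative assumption."""
    return "late peaking" if position > midpoint else "early peaking"


# Hypothetical F0 samples (Hz) taken at equal intervals across two vowels.
early_vowel = [180, 220, 210, 190, 170]
late_vowel = [170, 175, 185, 200, 230, 215]

early_pos = peak_position(early_vowel)
late_pos = peak_position(late_vowel)
print(early_pos, peak_label(early_pos))
print(late_pos, peak_label(late_pos))
```

The continuous value is the more faithful representation of what the quotation describes; the two-way label is only a convenience for comparing readings.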

In my explorations I discovered a phenomenon that is fully consistent with what has been said so far: late peaking is perceived as "pressing forward". In places where continuity and discontinuity are required simultaneously (as in enjambment), sometimes some mysterious forward drive is perceived. At the beginning I thought that this had to do with a great leap of pitch. But when I had access to instrumental investigation, it became conspicuous that in many instances the leap of pitch could not account for that sense of drive. Knowles's paper made it unambiguously clear that the explanation has to do with late peaking; and Knowles too suggests that late peaking need not entail an extreme pitch change. Such a conception is compatible with the findings of gestalt psychologists concerning "perceptual forces", the existence of which has been demonstrated with reference to visual perception by Arnheim (1957: 1-8), and with reference to speech perception by Fodor and his colleagues (Garrett et al., 1966). A perceptual unit tends "to preserve its integrity by resisting interruptions". A backward or forward drive toward the boundary is generated when there is some intrusion away from the middle of a perceptual unit, which is the case in early and late peaking, respectively.

Enjambment is an obvious instance in which linguistic units and versification conflict. The sentence is run on from one verse line to the other. Consequently, the line ending requires the reader to stop, the run-on sentence requires him not to stop. To have his cake and eat it. This issue is exemplified by a verse instance from Keats's "Ode on a Grecian Urn" in which the versification unit (the verse line) conflicts with the syntactic unit (the clause), that is, when the phrase or clause runs on from one line to the next one. I am going to compare two readings of the same verse instance by leading British actors, available on commercial records. We have listened to them and got the impression that one reading does solve this problem, the other one does not. I am going to examine the phonetic contrasts between the two readings which may account for this perceptual difference.

1. Sylvan historian, who canst thus express
     A flowery tale more sweetly than our rhyme...

Listen to two actors' readings of quote 1.

[Audio: Hodge] [Audio: Sheen]

It is assumed here that a rhythmical performance of these two lines will suggest continuity and discontinuity at one and the same time; a non-rhythmical performance will suppress either the continuous or the discontinuous aspect of this structure. How can we know whether a delivery instance displays at this point continuity or discontinuity or both? Paraphrasing Sibley (1962) we might say that we know that a delivery instance is continuous or discontinuous or both by listening, just as we see that the book is red by looking, or as we tell that the tea is sweet by tasting it. By listening to two delivery instances of this verse instance, we may prefer one to the other according to whether it does or does not suggest continuity and discontinuity at the same time. We can also establish the phonetic correlates that make these suggestions. The present approach assumes that continuity and discontinuity can be suggested at one and the same time by using conflicting phonetic cues, thus committing "organized violence" against speech processing. This cannot be done by merely looking at the graphic output of the computer, only by listening to the sound output. Then one may determine from the graphic output which features of the speech signal "typically count toward" continuity, and which against it, to use Sibley's terms again. But only by listening can we tell what the perceived quality of the whole is, continuous or discontinuous or both.

An MA seminar group, my PhD students, my research assistant, several colleagues and myself have listened to commercially available recordings of Keats's "Ode" by two leading British actors, Douglas Hodge and Michael Sheen. We all made the judgment that Hodge offers an admirably rhythmical solution of the problem, by suggesting continuation and discontinuation at one and the same time at the end of the word "express", whereas in Sheen's reading "A" at the beginning of the next line is irritatingly continuous with "express". Then we looked for features that typically count toward or against discontinuity. Unfortunately, owing to the recording quality, the machine produced no pitch contour for "A" in either reading; so we were unable to use pitch movement as an indicator of continuity or discontinuity. At any rate, the overall pitch contour of the relevant segments seems to suggest continuation in both readings. There is no measurable pause in either of the readings between the two words; and this takes care of syntactic continuity. At the same time, there are two rather significant differences between the two readings that may account for the perceived difference between them. First, in Sheen's reading the /s/ of "express" is inseparably run into "A", whereas in Hodge's reading we may discern a glottal stop that perceptually separates the two words, indicated by a minute "lump" in the wave plot. (Glottal stop is the speech sound we insert before "aim" when we say: "I said 'an aim', not 'a name'".) Second, a glance at Figures 1 and 2 may indicate that the syllable "press" in general, and the closing /s/ in particular, are considerably longer in Hodge's reading than in Sheen's. There is no way to know whether a given phoneme in a stretch of speech is longer or shorter than it ought to be; one may only make meticulous comparisons of relative duration. In Hodge's reading the final /s/ is 150 msec long, in Sheen's 105 msec long; that is, it is over 1.42 times longer in Hodge's reading.
In Hodge's reading "pre-" is 1.15 times longer than in Sheen's (183 : 159 msec). In "ex-", by contrast, the /s/ is 111 msec long in Hodge's reading and 112 msec long in Sheen's. A difference of one msec is insignificant; but here this minute difference in Sheen's favour should be evaluated against the substantial, 1.42 times greater length of the final /s/ in the same word in Hodge's reading. The whole phrase "who canst thus express" is 1.450 sec long in Hodge's, 1.334 sec long in Sheen's reading, that is, only 1.08 times longer. That is, "-press" is longer in Hodge's than in Sheen's reading, relative to their respective contexts. One has a strong intuition that in Hodge's reading the line-ending is clearly articulated in spite of the run-on syntax, whereas in Sheen's reading it is not; and that this difference has to do with the relative duration of the /s/ (and of "press") in the two readings, and with the presence or absence of the articulatory gesture called glottal stop. The graphic output of the computer fully supports these intuitions.
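The comparisons of relative duration above can be retraced with a short calculation. The durations are those reported in the text; the variable names, and the final step of dividing the lengthening of the final /s/ by the lengthening of the whole phrase, are illustrative additions rather than part of the original analysis.

```python
# Durations in milliseconds as reported in the text: (Hodge, Sheen).
durations_ms = {
    "final /s/ of express": (150, 105),
    "pre-": (183, 159),
    "/s/ of ex-": (111, 112),
    "who canst thus express": (1450, 1334),
}

# Hodge-to-Sheen ratio for each measured stretch of speech.
ratios = {
    label: hodge / sheen for label, (hodge, sheen) in durations_ms.items()
}

for label, value in ratios.items():
    print(f"{label}: Hodge/Sheen = {value:.3f}")

# How much longer Hodge's final /s/ is once the overall difference in
# phrase duration between the two readings is factored out.
relative_final_s = ratios["final /s/ of express"] / ratios["who canst thus express"]
print(f"final /s/ relative to phrase: {relative_final_s:.3f}")
```

A value above 1 in the last line means that the final /s/ is lengthened in Hodge's reading beyond what the slightly slower delivery of the whole phrase would account for.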

The arguments of cognitive poetics proceed in three stages. In the first two stages the critic assigns a structural description to the text and collects "perceived effects" in controlled experimental situations or in some less formal way. In the third stage he applies some theory derived from some other discipline of cognitive science as a "plausible hypothesis" to relate the perceived effects to the structure of the text in a principled manner. Usually this is the best one can do, and our foregoing handling of the two readings of Keats's enjambment appears to be one of the more rigorous instances of this methodology. A sceptic, however, can always cast doubts on such a methodology (even though quite frequently "plausible hypotheses" are the best one can offer in the "harder" sciences as well). The rhythmical performance of poetry is a very complex process, and the response to it is even more complex. Complexer and complexer, as Alice would say. There may always be variables of which we are not aware, and which may be the thing that determines our response. When a reader or listener reports his intuitions, you may never know what he is responding to. And in such instances you can't contrive a control condition in which "everything is really equal". While in chemistry, for instance, you may have exact control over how much you add of what, you can't gain such control over the subtleties of e.g. the performer's use of his voice. In what follows I am reporting an attempt to do just that.

The ensuing exercise was prompted by a discussion in a graduate seminar on Cognitive Poetics at the department of Hebrew Literature, Tel Aviv University. The original experiment reported here was, therefore, conducted with a Hebrew poem, and then replicated with Sheen's performance of quote 1 (figure 2) above. I will present first the Hebrew example, and then the manipulations carried out on Sheen's reading.

The Hebrew example concerned an enjambment from a poem by the great Hebrew poet, Nathan Alterman, as read by the Israeli actor Yossi Banay, on a commercially available CD (HL6020). In this poem the dead husband is speaking to his youthful wife.

2. And nothing remained, except
     My dust that pursues your shoes.

This is an extremely strained enjambment. Syntactically, "bilti", the Hebrew for "except", is a preposition that must be emphatically grouped with the ensuing noun at the beginning of the next line. Prosodically, however, it completes the verse line, and is part of a virtuoso formulaic rhyme pattern. This requires an emphatic break after it. Listening to Banay's reading arouses an uneasy feeling that far from trying to strike a balance between continuity and discontinuity, he rather speeds up the transition across the line boundary more than across any other word boundary in the two lines. On closer inspection we find that he achieves this by four phonetic cues. First, there is no pause between the two lines. Second, unlike in English, in Hebrew the glottal stop is a phoneme (that is, a speech sound indicating a difference in meaning, marked by a separate letter, aleph or ayin), though it is frequently omitted in Israelese. According to Knowles, the insertion of a glottal stop may generate "segmental discontinuity". The second line of quote 2 begins with a word whose first phoneme is a glottal stop, "afari"; but Banay omits it, increasing the sense of continuity. Third, Banay coarticulates the words "bilti/afari" (across the line boundary!), by inserting the glide [j]. Fourth, there is an exceptionally late peak on the last syllable of the line, exerting a forward-driving "perceptual force" (see figure 4).

Listen to Yossi Banay's readings of quote 2.

[Audio: published version] [Audio: first alternative version] [Audio: second alternative version]

I decided to experiment with this enjambment, by electronically manipulating the transition. I copied on the computer a small section of the [i] and pasted it several times into the last [i] of "bilti" until its duration was more than doubled, generating "segmental discontinuity" (figure 6). Then, I presented the two versions to the group of graduate students as genuine versions of Yossi Banay's reading. They were asked whether either of the two versions conveyed simultaneously both the verse line and the run-on sentence as conflicting perceptual units. There was consensus that one of them did, and that it was the "alternative version". The "second alternative version" was generated from the first alternative version by performing two additional electronic manipulations. First, a glottal stop was copied from another verse line and pasted before "afari", enhancing the sense of discontinuity. Second, to render the manipulated lines more natural, a short section of the [i] of "afari" was excised, at the point where its pitch was the highest.
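The copy-and-paste prolongation described above can be sketched on a toy signal. This is a minimal illustration, not the procedure actually used: the function name, the sample indices and the use of a plain list of integers in place of audio samples are all assumptions. With real audio one would presumably cut at whole pitch periods or at zero crossings to avoid audible clicks, a detail the text does not describe.

```python
def prolong(samples, start, end, copies):
    """Return a new signal in which samples[start:end] is pasted in
    `copies` additional times directly after its original occurrence."""
    if not (0 <= start < end <= len(samples)):
        raise ValueError("section must lie inside the signal")
    if copies < 0:
        raise ValueError("copies must be non-negative")
    section = samples[start:end]
    return samples[:end] + section * copies + samples[end:]


# Toy signal: integers stand in for audio samples. Suppose (hypothetically)
# that the vowel occupies indices 4..7 and that a two-sample section from
# its middle (indices 5..6) is copied.
signal = list(range(10))
vowel_length = 4
longer = prolong(signal, start=5, end=7, copies=3)

# Six samples were added to a four-sample vowel, so its duration is
# more than doubled, as in the manipulation described in the text.
new_vowel_length = vowel_length + (len(longer) - len(signal))
print(longer)
print(new_vowel_length / vowel_length)
```

Returning a new list rather than modifying the input keeps the "published version" intact, so that both versions remain available for comparison.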

In harmony with my foregoing experiments, I claim that the performer may convey here too both the verse line and the run-on sentence as perceptual units by having recourse to conflicting cues: cues of continuity and discontinuity simultaneously. In his "published version" of these two lines, Yossi Banay had recourse to an aggregation of cues for continuity, suppressing the verse line boundary. In the two alternative versions these two lines were electronically manipulated, inserting cues of discontinuity. In the "first alternative version" the duration of the second [i] of "bilti" was more than doubled. Then, the two lines were presented to a group of graduate students as genuine versions of Yossi Banay's reading. They were asked whether any one of the two versions did convey both the verse line and the run-on sentence as conflicting perceptual units. There was consensus that the "first alternative version" did. The "second alternative version" results from adding two more electronic manipulations to the first alternative version: a glottal stop was copied from another verse line and pasted before "afari"; and, to render the manipulated lines more natural, a short section of the [i] of "afari" was excised, at the point where its pitch was the highest. The glottal stop inserted between the two lines indicates discontinuity; and this was enhanced by the fact that it also disrupted the co-articulation across the line boundary.

Encouraged by these results with the Hebrew text, I decided to try my luck with an English text. I have discussed above two readings of quote 1, by Douglas Hodge and Michael Sheen. MA and PhD students along with colleagues from Israel and from abroad listened to them. Some of the students couldn't distinguish between the ways the two readings handled the enjambment. But those who could, all found that Hodge's reading did indicate continuity and discontinuity at the same time; Sheen's didn't. A close scrutiny of the computer's visual output revealed that the two readings differed in two features predicted by the present theoretical framework: in Hodge's reading the second syllable of "express" was considerably longer than in Sheen's; and in Hodge's reading there was a glottal stop at the beginning of the second line, whereas in Sheen's there was not. We have hypothesised that these two structural differences were responsible for the perceived difference between the two readings. It remained to assess what happens when you add to Sheen's reading those two features. Here, perhaps the dice had been loaded, because we could start with a consensus concerning the two genuine readings, both regarding the perceived qualities and the structural differences. I copied short sections from the last two speech sounds of "express" ("-ess") and pasted them several times into the same speech sounds; and I copied the glottal stop from Hodge's reading and pasted it at the same place in Sheen's. The results leave no room for doubt: by the addition of these two features the same perceived quality of continuity and discontinuity was generated in Sheen's as in Hodge's reading; perhaps even more convincingly.
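Pasting a segment taken from one recording into another can be sketched in the same toy fashion. The function name, the symbolic "samples" and the index positions below are hypothetical; the sketch also assumes that both recordings share one sampling rate, which the text does not state.

```python
def splice_in(target, donor, donor_start, donor_end, at):
    """Return a copy of `target` with donor[donor_start:donor_end]
    inserted before position `at`. Neither input is modified."""
    if not (0 <= donor_start < donor_end <= len(donor)):
        raise ValueError("donor section must lie inside the donor signal")
    if not (0 <= at <= len(target)):
        raise ValueError("insertion point must lie inside the target signal")
    return target[:at] + donor[donor_start:donor_end] + target[at:]


# Symbolic stand-ins for audio samples: "s" for the final fricative of
# "express", "?" for the glottal stop, "a" for the vowel of "A".
donor_reading = ["s", "s", "?", "?", "a", "a"]   # glottal stop present
target_reading = ["s", "s", "a", "a"]            # fricative runs into vowel

# Copy the glottal stop (donor indices 2..3) and paste it at the word
# boundary of the target reading (before index 2).
manipulated = splice_in(target_reading, donor_reading, 2, 4, at=2)
print(manipulated)
```

As with the prolongation, the original reading is left untouched, so the published and the manipulated versions can be presented side by side to listeners.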

Listen to Sheen's readings of quote 1.
[Audio: published version] [Audio: manipulated version]

I wish to make a few more comments on this comparison. I feel rather uneasy with the emotional quality of "Sylvan historian" indicated in Hodge's reading; and Sheen's emphatic stress on "our" (in a weak position) in "our rhyme" greatly damages the rhythm of the line. But all this has nothing to do with the subtleties we have been scrutinising. I am referring only to the enjambment, not to the other parts of quote 1. According to my introspection, supported by some theoretical considerations and precedents in genuine readings, Sheen could perform the emphatic stress and still preserve the rhythm of the verse line by assigning a rising and deeply-falling intonation contour to "rhyme". Unfortunately, however, I cannot perform such a manipulation with the electronic means at my disposal.

To conclude. Contrary to the received view, enjambments can be performed in such a way that both the line boundary and the run-on sentence are indicated. This can be done by having recourse to conflicting cues. Tsur (1977; 1998) and Knowles (1991; 1992) provide theoretical frameworks that, in conjunction, predict the vocal devices by which this can be done. Barney (1990) and Tsur (1998) provide ample empirical evidence that flesh-and-blood performers do, indeed, attempt precisely these solutions. Tsur also provides some empirical evidence that those vocal devices do have, indeed, the predicted effect (though this requires further, and more rigorous, experimentation). This article has added some further empirical evidence, namely, that the relevant cues could be inserted, partly at least, by electronic manipulations and achieve the predicted effect.


1. Consequently, the theoretical part of this paper as well as the comparison between Sheen's and Hodge's readings of Keats have been reproduced here from chapter 3 of that book.


Arnheim, Rudolf (1957) Art and Visual Perception. London: Faber & Faber.

Barney, Tom (1990) "The Forms of Enjambment". University of Lancaster unpublished MA dissertation.

Chatman, Seymour (1965) A Theory of Meter. The Hague: Mouton.

Chatman, Seymour (1966) "On the 'Intonational Fallacy'", QJS 52: 283-286.

Garrett, M., T. Bever and J. A. Fodor (1966) "The Active Use of Grammar in Speech Perception". Perception and Psychophysics 1: 30-32.

Knowles, Gerry (1991) "Prosodic Labelling: The Problem of Tone Group Boundaries", in Stig Johansson and Anna-Brita Stenström (eds.), English Computer Corpora: Selected Papers and Research Guide. (Topics in English Linguistics 3) Berlin: Mouton de Gruyter. 149-163.

Knowles, Gerry (1992) "Pitch Contours and Tones in the Lancaster/IBM Spoken English Corpus", in Gerhard Leitner (ed.), New Directions in English Language Corpora: Methodology, Results, Software Developments. Berlin: Mouton de Gruyter. 289-299.

Sibley, Frank (1962) "Aesthetic Qualities", in Joseph Margolis (ed.), Philosophy Looks at the Arts: Contemporary Readings in Aesthetics. New York: Scribner. 63-88.

Tsur, Reuven (1977) A Perception-Oriented Theory of Metre. Tel Aviv: The Porter Israeli Institute for Poetics and Semiotics.

Tsur, Reuven (1998) Poetic Rhythm: Structure and Performance -- An Empirical Study in Cognitive Poetics. Bern: Peter Lang.

Wellek, René & Austin Warren (1956) Theory of Literature. New York: Harcourt, Brace & Co.

Recorded Readings

Banay, Yossi reading Nathan Alterman. Helicon HL6020 (1999).

Hodge, Douglas reading John Keats. Hodder Headline AudioBooks HH 186. (1995).

Sheen, Michael reading Great Poets of the Romantic Age. Naxos AudioBooks NA 20 2112. (1994).

