The Performance of Enjambments, Perceived Effects,
and Experimental Manipulations
This paper offers further empirical evidence in favour
of my conception of poetic rhythm and performance as
presented in my book Poetic Rhythm: Structure and Performance -- An
Empirical Study in Cognitive Poetics.1
It claims that in an enjambment the performer may convey
both the verse line boundary and the run-on sentence
as perceptual units, however strained, by having recourse
to conflicting phonetic cues: cues of continuity and
discontinuity simultaneously. In my book I provided
some empirical evidence for this assumption. Among
other instances, I considered two readings of the same
enjambment, by two leading British actors. I collected
responses to these two performances from a wide range
of listeners, who were unanimous in their judgment
that in one of the two readings, but not in the other
one, both the verse line and the run-on sentence are
perceived as perceptual units. I have shown that in
the former, but not in the latter reading the performer
had recourse to conflicting cues, of continuity and
discontinuity, simultaneously. In this paper I am going
one step further: I will experiment with performances
of enjambments that have been judged to suppress the
perceptual boundary of the first line. I will attempt
to add conflicting cues by electronic manipulations
and observe the outcome.
In my book I stated my position with reference to two
issues in a recent "state-of-the-art" summary
of performance, the "Performance" entry of
The New Princeton Encyclopedia of Poetry and Poetics
(1993). The first issue concerns delivery style: "C.
S. Lewis once identified two types of performers of
metrical verse: 'Minstrels' (who recite in a wooden
singsong voice, letting scansion override verse) and
'Actors' (who give a flamboyantly expressive recitation,
ignoring meter altogether)" (893). I argued that
in between these two delivery styles there is a third
one, which I call "rhythmical performance",
and that this "type" is at the very core
of poetic rhythm. The second issue concerns ambiguity.
"Chatman isolates a central difference between
the reading and scansion of poems on the one hand and
their performance on the other: in the former two activities,
ambiguities of interpretation can be preserved and
do not have to be settled one way or the other ('disambiguated').
But in performance, all ambiguities have to be resolved
before or during delivery. Since the nature of performance
is linear and temporal, sentences can only be read
aloud once and must be given a specific intonational
pattern. Hence in performance, the performer is forced
to choose between alternative intonational patterns
and their associated meanings" (ibid.; cf. e.g.
Chatman, 1965, 1966). I argued that this is not so.
I also argued that the two issues are intimately related.
In Wellek and Warren's terms, the Minstrel subdues
prose rhythm, and foregrounds the metric pattern; the
Actor subdues the metric pattern in favour of the prose
rhythm. For Chatman this may be a slight exaggeration,
but in principle this is how things are and should
be: when prose rhythm and metre conflict, "the
performer is forced to choose between alternative intonational
patterns". My position is that there is a third,
"rhythmical performance", in which both metric
pattern and linguistic stress pattern can be accommodated,
such that both are established in the listener's perception.
The same holds true for the conflicting intonation
patterns articulating the linguistic unit (the phrase
or sentence), and the metric unit (the line). This
is precisely what the perceived rhythm of poetry is
about, and by no means a side issue.
In my 1977 book, A Perception-Oriented Theory of Metre,
I suggested that when the endings of the syntactic
unit and the metric unit do not coincide (that is,
when syntax is run-on from one line to the other),
the reciter may indicate continuity and discontinuity
at one and the same time by having recourse to conflicting
cues. I came to this conclusion in a speculative manner.
Twenty years later, in his master's thesis, an empirical
study of enjambment, Tom Barney (1990) found ample
empirical support for this assumption. This he did
without having heard of my work before. I too started
at this point. But there is a difference of emphasis
between Barney's and my own methodology. Barney investigated
poems by Philip Larkin and John Betjeman read by the
authors, submitting them to an instrumental analysis.
He assumed that the authors knew what they were doing:
he tacitly credited them with the ability to perform
a piece of poetry in a way that suggests a line boundary
and a run-on sentence at the same time. He didn't check
alternative possibilities. I am usually not so generous,
neither with poets, nor with professional actors, nor
with colleagues from the academy. I prefer to collect
judgments from students, colleagues or my research
associates as to whether the performer was successful in
conveying, e.g., the conflicting aspects of an enjambment.
And if possible, I try to compare alternative possibilities.
Then I look for cues in the phonetic structures
of the recordings, trying to find support for the intuitive
judgments. One way to compare alternatives is to compare
different solutions to the same problem in different
performances of one piece of poetry. In this paper
I will attempt a more rigorous method: I will discuss
performances that have been judged deficient in a certain
rhythmic respect, will manipulate them electronically,
and then see whether the manipulation improved the
reading in the direction predicted by the cognitive
Barney relied in his research on a paper by Gerry Knowles (1991), in which he investigated the nature of tone-groups. Knowles distinguished internally defined prosodic patterns and external discontinuities at the tone-group boundaries. The former consist in some consistent F0 pattern used in ordinary speech; the latter are temporal discontinuation (pause), pitch discontinuation (a sudden change in F0) and segmental discontinuation (that is, in normal speech the articulation of adjacent words is overlapping; when there is no overlap, it may count as discontinuity, even if there is no pause). Glottal stops in words beginning with a vowel, or word-final stop releases too may indicate segmental discontinuation. This would be the most evasive type of discontinuity. "The important distinction that seems to be emerging is between boundaries with or without pauses". In what follows, I shall explore how these correlates of tone-group boundaries can be exploited as conflicting cues for the perceptual accommodation of the conflicting patterns of speech and versification.
One of the most conspicuous kinds of segmental discontinuity is the prolongation of a phoneme or of a syllable at the end of an utterance, announcing (very much like a fermata in music) that the preceding unit has come to an end. Prolongation is, in fact, a double-edged phenomenon, that is, in different contexts it has different, sometimes even opposite, effects. From a perceptual point of view, prolongation indicates lack of forward movement. Therefore, when we have reason to suppose that it occurs at the end of some perceptual unit, it will be perceived as reinforcing the sense of rest; when it occurs in the middle of some forward movement, it is perceived as an arrest, arousing a strong desire for change. While this is most useful in the kind of research I am engaged in, there is a big problem with this notion. There is no standard by which we can determine whether a phoneme or sequence of phonemes is longer or shorter than it ought to be. Consequently, one must rely in this respect on one's intuitive judgment, or on some roundabout reasoning about measurements and comparisons. In this paper a more "objective" criterion of prolongation will be added: the duration of speech sounds in actual performances will be artificially extended.
In another paper, Knowles (1992) explores the alignment of the F0 contour with vowels and consonants. "Although the effect of a tone might be to highlight a whole word or phrase, its focus is on a single syllable. Within the syllable it focuses on the vowel, and if the vowel is a diphthong, on one of the elements of the diphthong. Ultimately within the relevant vowel there is a single point which appears to be the focus of accentuation" (Knowles, 1992: 294). Knowles calls this point the accent point. Such points may be located in various places in the vowel. Accordingly, he speaks of early-peaking and late-peaking, as the case may be. "Peak position would seem to be a continuous variable" (ibid). For our present interest it is important that peak position may affect the grouping of syllables. The results of his research are completely independent of the needs of the present inquiry. Knowles suggests the possibility that behind the phonological contrast of tone there is a functional contrast between an "initial" marker ("late peaking") and a "final" marker ("early peaking").
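The idea of peak position as a continuous variable can be illustrated with a short sketch. It is not Knowles's own procedure. It merely assumes that an F0 track for a single vowel is available as a list of (time, F0) pairs, and it expresses the location of the F0 maximum as a fraction of the vowel's duration (0 = vowel onset, 1 = vowel offset). The function name and the two sample tracks are hypothetical and serve illustration only.

```python
def peak_position(f0_track):
    """Relative position of the F0 maximum within one vowel.

    f0_track: list of (time_sec, f0_hz) pairs in ascending time order,
    covering a single vowel (an assumed input format).
    Returns a value between 0.0 (vowel onset) and 1.0 (vowel offset).
    """
    if len(f0_track) < 2:
        raise ValueError("need at least two F0 samples")
    onset = f0_track[0][0]
    offset = f0_track[-1][0]
    if offset <= onset:
        raise ValueError("times must be ascending")
    # max() returns the first sample with the highest F0 value.
    peak_time, _ = max(f0_track, key=lambda sample: sample[1])
    return (peak_time - onset) / (offset - onset)


# Invented F0 tracks (seconds, Hz), not measurements from any recording.
early_track = [(0.00, 180), (0.02, 210), (0.04, 200),
               (0.06, 185), (0.08, 170), (0.10, 160)]
late_track = [(0.00, 160), (0.02, 165), (0.04, 175),
              (0.06, 190), (0.08, 210), (0.10, 200)]

print(f"early-peaking example: {peak_position(early_track):.2f}")
print(f"late-peaking example:  {peak_position(late_track):.2f}")
```

The sketch deliberately returns a number rather than a label: since peak position is said to be a continuous variable, no threshold between "early" and "late" is assumed here.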
In my explorations I discovered a phenomenon that is fully consistent with what has been said so far: late peaking is perceived as "pressing forward". In places where continuity and discontinuity are required simultaneously (as in enjambment), sometimes some mysterious forward drive is perceived. At the beginning I thought that this had to do with a great leap of pitch. But when I had access to instrumental investigation, it became conspicuous that in many instances the leap of pitch could not account for that sense of drive. Knowles's paper made it unambiguously clear that the explanation has to do with late peaking; and Knowles too suggests that late peaking need not entail an extreme pitch change. Such a conception is compatible with the findings of gestalt psychologists concerning "perceptual forces", the existence of which has been demonstrated with reference to visual perception by Arnheim (1957: 1-8), and with reference to speech perception by Fodor and his colleagues (Garrett et al., 1966). A perceptual unit tends "to preserve its integrity by resisting interruptions". A backward or forward drive toward the boundary is generated when there is some intrusion away from the middle of a perceptual unit, which is the case in early and late peaking, respectively.
Enjambment is an obvious instance in which linguistic units and versification conflict. The sentence is run on from one verse line to the other. Consequently, the line ending requires the reader to stop, the run-on sentence requires him not to stop. To have his cake and eat it. This issue is exemplified by a verse instance from Keats's "Ode on a Grecian Urn" in which the versification unit (the verse line) conflicts with the syntactic unit (the clause), that is, when the phrase or clause runs on from one line to the next one. I am going to compare two readings of the same verse instance by leading British actors, available on commercial records. We have listened to them and got the impression that one reading does solve this problem, the other one does not. I am going to examine the phonetic contrasts between the two readings which may account for this perceptual difference.
1. Sylvan historian, who canst thus express
A flowery tale more sweetly than our rhyme...
Listen to two actors' readings of quote 1.
It is assumed here that a rhythmical performance of
these two lines will suggest continuity and discontinuity
at one and the same time; a non-rhythmical performance
will suppress either the continuous or the discontinuous
aspect of this structure. How can we know whether a
delivery instance displays at this point continuity
or discontinuity or both? Paraphrasing Sibley (1962)
we might say that we know that a delivery instance
is continuous or discontinuous or both by listening,
just as we see that the book is red by looking, or
as we tell that the tea is sweet by tasting it. By
listening to two delivery instances of this verse instance,
we may prefer one to the other according to whether
it does or does not suggest continuity and discontinuity
at the same time. We can also establish the phonetic
correlates that make these suggestions. The present
approach assumes that continuity and discontinuity
can be suggested at one and the same time by using
conflicting phonetic cues, thus committing "organized
violence" against speech processing. This cannot
be done by merely looking at the graphic output of
the computer, only by listening to the sound output.
Then one may determine from the graphic output which
features of the speech signal "typically count
toward" continuity, and which against it, to use
Sibley's terms again. But only by listening can we
tell what the perceived quality of the whole is, continuous
or discontinuous or both.
An MA seminar group, my PhD students, my research assistant, several colleagues and myself have listened to commercially available recordings of Keats's "Ode" by two leading British actors, Douglas Hodge and Michael Sheen. We all made the judgment that Hodge offers an admirably rhythmical solution of the problem, by suggesting continuation and discontinuation at one and the same time at the end of the word "express", whereas in Sheen's reading "A" at the beginning of the next line is irritatingly continuous with "express". Then we looked for features that typically count toward or against discontinuity. Unfortunately, owing to the recording quality, the machine produced no pitch contour for "A" in either reading; so we were unable to use pitch movement as an indicator of continuity or discontinuity. At any rate, the overall pitch contour of the relevant segments seems to suggest continuation in both readings. There is no measurable pause in either of the readings between the two words; and this takes care of syntactic continuity. At the same time, there are two rather significant differences between the two readings that may account for the perceived difference between them. First, in Sheen's reading the /s/ of "express" is inseparably run into "A", whereas in Hodge's reading we may discern a glottal stop that perceptually separates the two words, indicated by a minute "lump" in the wave plot. (The glottal stop is the speech sound we insert before "aim" when we say: "I said 'an aim', not 'a name'".) Second, a glance at Figures 1 and 2 may indicate that the syllable "press" in general, and the closing /s/ in particular, are considerably longer in Hodge's reading than in Sheen's. There is no way to know whether a given phoneme in a stretch of speech is longer or shorter than it ought to be; one may only make meticulous comparisons of relative duration. In Hodge's reading the final /s/ is 150 msec long, in Sheen's 105 msec long; that is, it is over 1.42 times longer in Hodge's reading.
In Hodge's reading "pre-" is 1.15 times longer than in Sheen's (183: 159). In "ex-", by contrast, the /s/ is 111-msec-long in Hodge's reading, in Sheen's 112-msec-long. One msec difference is insignificant; but here this minute difference in Sheen's favour should be evaluated against the substantial, 1.42 times greater length of the final /s/ in the same word in Hodge's reading. The whole phrase "who canst thus express" is 1.450 sec long in Hodge's, 1.334 sec long in Sheen's reading, that is, only 1.08 times longer. That is, "-press" is longer in Hodge's than in Sheen's reading, relative to their respective contexts. One has a strong intuition that in Hodge's reading the line-ending is clearly articulated in spite of the run-on syntax, whereas in Sheen's reading it is not; and that this difference has to do with the relative duration of the /s/ (and of "press") in the two readings, and with the presence or absence of the articulatory gesture called glottal stop. The graphic output of the computer fully supports these intuitions.
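The relative durations discussed above can be recomputed in a few lines. The sketch below uses only the millisecond values quoted in the text; the variable and function names are mine, and the "share of the phrase" figure is an additional, assumed way of expressing the claim that "-press" is longer in Hodge's reading relative to its context.

```python
# Durations quoted in the text, in milliseconds: (Hodge, Sheen).
durations_ms = {
    "final /s/ of 'express'": (150, 105),
    "'pre-'": (183, 159),
    "/s/ of 'ex-'": (111, 112),
    "'who canst thus express'": (1450, 1334),
}


def hodge_to_sheen(pair):
    """Hodge's duration expressed as a multiple of Sheen's."""
    hodge, sheen = pair
    return hodge / sheen


ratios = {label: hodge_to_sheen(pair) for label, pair in durations_ms.items()}
for label, value in ratios.items():
    print(f"{label}: Hodge/Sheen = {value:.2f}")

# Share of the whole phrase taken up by the final /s/ in each reading
# (an assumed way of normalising for overall tempo).
phrase = durations_ms["'who canst thus express'"]
final_s = durations_ms["final /s/ of 'express'"]
share_hodge = final_s[0] / phrase[0]
share_sheen = final_s[1] / phrase[1]
print(f"final /s/ as share of phrase: Hodge {share_hodge:.1%}, Sheen {share_sheen:.1%}")
```

Rounded to two decimals, the ratios for the final /s/ and for the whole phrase come out at 1.43 and 1.09; the figures 1.42 and 1.08 in the text are the same ratios truncated rather than rounded.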
The arguments of cognitive poetics proceed in three stages. In the first two stages the critic assigns a structural description to the text and collects "perceived effects" in controlled experimental situations or in some less formal way. In the third stage he applies some theory derived from some other discipline of cognitive science as a "plausible hypothesis" to relate the perceived effects to the structure of the text in a principled manner. Usually this is the best one can do, and our foregoing handling of the two readings of Keats's enjambment appears to be one of the more rigorous instances of this methodology. A sceptic, however, can always cast doubts on such a methodology (even though quite frequently "plausible hypotheses" are the best one can offer in the "harder" sciences as well). The rhythmical performance of poetry is a very complex process, and the response to it is even more complex. Complexer and complexer, as Alice would say. There may always be variables of which we are not aware, and which may be the thing that determines our response. When a reader or listener reports his intuitions, you may never know what he is responding to. And in such instances you can't contrive a control condition in which "everything is really equal". While in chemistry, for instance, you may have exact control over how much you add of what, you can't gain such control over the subtleties of, e.g., the performer's use of his voice. In what follows I am reporting an attempt to do just that.
The ensuing exercise was prompted by a discussion in a graduate seminar on Cognitive Poetics at the department of Hebrew Literature, Tel Aviv University. The original experiment reported here was, therefore, conducted with a Hebrew poem, and then replicated with Sheen's performance of quote 1 (figure 2) above. I will present first the Hebrew example, and then the manipulations carried out on Sheen's reading.
The Hebrew example concerned an enjambment from a poem by the great Hebrew poet, Nathan Alterman, as read by the Israeli actor Yossi Banay, on a commercially available CD (HL6020). In this poem the dead husband is speaking to his youthful wife.
2. And nothing remained, except
My dust that pursues your shoes.
This is an extremely strained enjambment. Syntactically, "bilti", the Hebrew for "except", is a preposition that must be emphatically grouped with the ensuing noun at the beginning of the next line. Prosodically, however, it completes the verse line, and is part of a virtuoso formulaic rhyme pattern. This requires an emphatic break after it. Listening to Banay's reading arouses an uneasy feeling that, far from trying to strike a balance between continuity and discontinuity, he rather speeds up the transition across the line boundary more than across any other word boundary in the two lines. On closer inspection we find that he achieves this by four phonetic cues. First, there is no pause between the two lines. Second, unlike in English, in Hebrew the glottal stop is a phoneme (that is, a speech sound indicating a difference in meaning, marked by a separate letter, aleph or ayin), though it is frequently omitted in Israelese. According to Knowles, the insertion of a glottal stop may generate "segmental discontinuity". The second line of quote 2 begins with a word whose first phoneme is a glottal stop, "afari"; but Banay omits it, increasing the sense of continuity. Third, Banay coarticulates the words "bilti/afari" (across the line boundary!), by inserting the glide [j]. Fourth, there is an exceptionally late peak on the last syllable of the line, exerting a forward-driving "perceptual force" (see figure 4).
Listen to Yossi Banay's readings of quote 2.
published version; first alternative version; second alternative version
I decided to experiment with this enjambment, by electronically manipulating the transition. On the computer, I copied a small section of the [i] and pasted it several times into the last [i] of "bilti" until its duration was more than doubled, generating "segmental discontinuity" (figure 6). Then, I presented the two versions to the group of graduate students as genuine versions of Yossi Banay's reading. They were asked whether either of the two versions conveyed simultaneously both the verse line and the run-on sentence as conflicting perceptual units. There was consensus that one of them did, and that it was the "alternative version". The "second alternative version" was generated from the first alternative version by performing two additional electronic manipulations. First, a glottal stop was copied from another verse line and pasted before "afari", enhancing the sense of discontinuity. Second, to render the manipulated lines more natural, a short section of the [i] of "afari" was excised, at the point where its pitch was the highest.
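The copy-and-paste prolongation described above can be sketched as follows. This is not the software actually used in the experiment, which the text does not name; it is a minimal illustration that assumes the recording is held as a plain list of samples and that the indices of the vowel and of the copied section are already known. All names and index values are hypothetical.

```python
def prolong(samples, start, end, repeats):
    """Lengthen a sound by pasting a copied section back in.

    samples: list of audio samples (assumed representation).
    start, end: indices of the section to copy (end exclusive).
    repeats: how many extra copies to paste right after the section.
    Returns a new list; the input is left untouched.
    """
    if not 0 <= start < end <= len(samples):
        raise ValueError("section lies outside the recording")
    if repeats < 0:
        raise ValueError("repeats must not be negative")
    section = samples[start:end]
    return samples[:end] + section * repeats + samples[end:]


# Toy recording: integers stand in for amplitudes so samples can be traced.
recording = list(range(300))
vowel_start, vowel_end = 100, 200      # hypothetical extent of the vowel
copy_start, copy_end = 140, 160        # hypothetical section inside the vowel

longer = prolong(recording, copy_start, copy_end, repeats=6)

original_length = vowel_end - vowel_start
new_length = original_length + 6 * (copy_end - copy_start)
print(f"vowel length: {original_length} -> {new_length} samples")
```

With these toy values the vowel grows from 100 to 220 samples, that is, its duration is more than doubled. In a real recording the copied section would have to be cut with care, for instance at zero crossings and spanning whole pitch periods, otherwise the joins may be audible as clicks.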
In harmony with my foregoing experiments, I claim that the performer may convey here too both the verse line and the run-on sentence as perceptual units by having recourse to conflicting cues: cues of continuity and discontinuity simultaneously. In his "published version" of these two lines, Yossi Banay had recourse to an aggregation of cues for continuity, suppressing the verse line boundary. In the two alternative versions these two lines were electronically manipulated, inserting cues of discontinuity. In the "first alternative version" the duration of the second [i] of "bilti" was more than doubled. Then, the two lines were presented to a group of graduate students as genuine versions of Yossi Banay's reading. They were asked whether either of the two versions conveyed both the verse line and the run-on sentence as conflicting perceptual units. There was consensus that the "first alternative version" did. The "second alternative version" results from adding two more electronic manipulations to the first alternative version: a glottal stop was copied from another verse line and pasted before "afari"; and, to render the manipulated lines more natural, a short section of the [i] of "afari" was excised, at the point where its pitch was the highest. The glottal stop inserted between the two lines indicates discontinuity; and this was enhanced by the fact that it also disrupted the co-articulation across the line boundary.
Encouraged by these results with the Hebrew text, I
decided to try my luck with an English text. I have
discussed above two readings of quote 1, by Douglas
Hodge and Michael Sheen. MA and PhD students along
with colleagues from Israel and from abroad listened
to them. Some of the students couldn't distinguish
between the ways the two readings handled the enjambment.
But those who could, all found that Hodge's reading
did indicate continuity and discontinuity at the same
time; Sheen's didn't. A close scrutiny of the computer's
visual output revealed that the two readings differed
in two features predicted by the present theoretical
framework: in Hodge's reading the second syllable of
"express" was considerably longer than in
Sheen's; and in Hodge's reading there was a glottal
stop at the beginning of the second line, whereas in
Sheen's there was not. We have hypothesised that these
two structural differences were responsible for the
perceived difference between the two readings. It remained
to assess what happens when you add to Sheen's reading
those two features. Here, perhaps, the dice had been
loaded, because we could start with a consensus concerning
the two genuine readings, both regarding the perceived
qualities and the structural differences. I copied
short sections from the last two speech sounds of "express"
("-ess") and pasted them several times into the
same speech sounds; and I copied the glottal stop from
Hodge's reading and pasted it at the same place in
Sheen's. The results leave no room for doubt: by the
addition of these two features the same perceived quality
of continuity and discontinuity was generated in Sheen's
as in Hodge's reading; perhaps even more convincingly.
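The two manipulations applied to Sheen's reading (pasting in a segment taken from another recording, and repeating short sections of the final sounds) can be combined as in the sketch below. It is an illustration under stated assumptions, not the procedure actually used: the recordings are represented as plain lists of samples, and every index is hypothetical. The one design point worth noting is the order of operations: the edit at the later position is made first, so that the indices of the earlier section remain valid.

```python
def insert_segment(samples, position, segment):
    """Return a copy of `samples` with `segment` pasted in at `position`."""
    if not 0 <= position <= len(samples):
        raise ValueError("position lies outside the recording")
    return samples[:position] + list(segment) + samples[position:]


def repeat_section(samples, start, end, repeats):
    """Return a copy in which samples[start:end] is followed by extra copies."""
    if not 0 <= start < end <= len(samples):
        raise ValueError("section lies outside the recording")
    if repeats < 0:
        raise ValueError("repeats must not be negative")
    return samples[:end] + samples[start:end] * repeats + samples[end:]


# Toy stand-ins for the two recordings: integers instead of real amplitudes,
# so that every sample can be traced after editing.
target = list(range(1000))           # the reading that lacks the cues
donor = list(range(5000, 5300))      # the reading that supplies the glottal stop

glottal_stop = donor[100:130]        # hypothetical location of the glottal stop
line_boundary = 800                  # hypothetical index where the next line begins
final_sound = (700, 720)             # hypothetical section within the final sound

# Later position first, so the indices of the earlier section stay valid.
edited = insert_segment(target, line_boundary, glottal_stop)
edited = repeat_section(edited, final_sound[0], final_sound[1], repeats=3)

print(f"length: {len(target)} -> {len(edited)} samples")
```

Both functions return new lists, so the published version stays available unchanged for comparison with the manipulated one.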
Listen to Sheen's readings of quote 1.
published version; manipulated version
the theoretical part of this paper as well as the comparison
between Sheen's and Hodge's readings of Keats have
been reproduced here from chapter 3 of that book.
Arnheim, Rudolf (1957) Art and Visual Perception. London: Faber & Faber.
Barney, Tom (1990) "The Forms of Enjambment". University of Lancaster unpublished MA dissertation.
Chatman, Seymour (1965) A Theory of Meter. The Hague: Mouton.
Chatman, Seymour (1966) "On the 'Intonational Fallacy'", QJS 52: 283-286.
Garrett, M., T. Bever and J. A. Fodor (1966) "The Active Use of Grammar in Speech Perception". Perception and Psychophysics 1: 30-32.
Knowles, Gerry (1991) "Prosodic Labelling: The Problem of Tone Group Boundaries", in Stig Johansson and Anna-Brita Stenström (eds.), English Computer Corpora: Selected Papers and Research Guide. (Topics in English Linguistics 3) Berlin: Mouton de Gruyter. 149-163.
Knowles, Gerry (1992) "Pitch Contours and Tones in the Lancaster/IBM Spoken English Corpus", in Gerhard Leitner (ed.), New Directions in English Language Corpora: Methodology, Results, Software Developments. Berlin: Mouton de Gruyter. 289-299.
Sibley, Frank (1962) "Aesthetic Qualities", in Joseph Margolis (ed.), Philosophy Looks at the Arts: Contemporary Readings in Aesthetics. New York: Scribner. 63-88.
Tsur, Reuven (1977) A Perception-Oriented Theory of Metre. Tel Aviv: The Porter Israeli Institute for Poetics and Semiotics.
Tsur, Reuven (1998) Poetic Rhythm: Structure and Performance -- An Empirical Study in Cognitive Poetics. Bern: Peter Lang.
Wellek, René & Austin Warren (1956) Theory of Literature. New York: Harcourt, Brace & Co.
Banay, Yossi reading Nathan Alterman. Helicon HL6020
Hodge, Douglas reading John Keats. Hodder Headline AudioBooks HH 186. (1995).
Sheen, Michael reading Great Poets of the Romantic Age. Naxos AudioBooks NA 20 2112. (1994).