Server © IRCAM - CENTRE POMPIDOU 1996-2005.
All rights reserved for all countries.
Contemporary Music Review, 1989
Copyright © CMR 1989
Keywords: auditory grouping, event structure, form-bearing dimension, knowledge structure, mental representation, musical form, perceptual invariance, psychological constraints.
One approach to these questions would be to investigate the form-bearing capacity of the perceptual dimensions that are used in music. A dimension can bear form if configurations of values along it can be encoded, organized, recognized and compared with other such configurations. We can arrange a sequence of pitches, for example, that is easily recognized when it is heard again. We are also quite adept at noticing variations on such a sequence - the appreciation of melodic variation requires this psychological ability.
The utility of a dimension as a form-bearer, however, depends on some additional factors. A dimension that affords a greater number of perceivable configurations is more valuable to a composer than a dimension along which only a small number are possible. This restriction may be due, for example, to limits in the discrimination among the values available. Many pitches are easily perceived and discriminated, as are a vast quantity of their combinations. It is unlikely, though, that a large number of separate vibrato rates would be easily discriminated and, as such, only a very small number of configurations would be possible.
Limits on the encoding of complex patterns along a dimension would in turn limit the potential for transforming and developing configurations along it. Our encoding of relative durations is fairly acute and highly structured, which allows for rich variation and elaboration of rhythmic patterns. By contrast, the encoding of spatial location in audition is relatively poor, and one would imagine that the development of sequences of spatial positions of notes would not be easily apprehended by a listener.
Another factor is the capacity to encode patterns along a given dimension in the presence of changes along other dimensions. Duration may well be the strongest dimension from this perspective, since many composers repeat rhythmic motifs across rather large variations in pitch and timbre patterns, and listeners have no trouble recognizing the similarity. According to the above-mentioned conditions, we might predict that pitch and duration would be strong dimensions, that several of the timbre dimensions would be of medium power, but that vibrato rate and spatial location, for example, would be very weak.
We cannot, therefore, arbitrarily structure an available physical dimension such as spatial location and still expect it to be comprehended. The limits for a given individual may be tied to internalizations of relations in the physical world that have proven useful to the human species through its evolution (Shepard, 1984). They might also be related to the extent to which the various dimensions are used frequently in the music of a given culture. It may be that form-bearing capacities represent biological and psychological constraints on structure processing. We need to study the way in which such structures can be apprehended by a listener.
A form extended in time directly poses the problem of how to do research on the mental representation of its temporal structure. The experience of form is highly dependent on several cognitive processes involved in the mental representation of musical constituents and musical knowledge, and in the organization and comprehension of musical structures. These processes include perceptual grouping, abstract knowledge structures and event structure processing.
Auditory grouping processes serve to organize the acoustic surface into musical events (simultaneous grouping), to connect events into musical streams (sequential grouping), and to "chunk" event streams into musical units (segmentational grouping). These three basic grouping processes are important precursors to the organization of musical form in that they pre-organize the continuous "acoustic surface" into discrete entities and groups of entities. This organization is, in effect, the forming of mental descriptions of what is happening in the world, a mapping of acoustic sources and events into auditory descriptions that can be used in calculating expectations and in developing comprehension, in effect a kind of "auditory scene analysis" (Bregman, 1977, 1981, in press). To some extent these grouping processes precede the extraction (or "computation") of the perceptual qualities of events and the relations among these event qualities (pitch, timbre, density, rhythm, interval, consonance, etc.; cf. McAdams, 1987; Wright & Bregman, 1987). Musical form is built upon relations among perceptual qualities. To the extent that grouping processes affect the emergent qualities, they can affect the perception of form. Some perceptual dimensions are strongly correlated with the sensory dimensions along which grouping decisions are made, such as pitch and timbre being strongly correlated with the spectral changes that affect sequential and segmentational grouping (Bregman, 1978; McAdams & Bregman, 1979; McAdams, 1984; I. Deliège, 1987). Such dimensions will be important contributors to musical form. Thus, an important criterion for a form-bearing dimension is that changes along it should be able to induce distinctive transitions or contrasts at the musical surface.
The perceived qualities of musical events are anchored to a learned system of relations (scale, meter, harmonic field, etc.) that is more or less strongly evoked by the relations among events in the musical context. This system of relations may be considered as abstract knowledge about the structure of the music of a given culture that one has acquired through extensive experience. This knowledge is abstract in the sense that it embodies relations that apply across a large repertoire of pieces and serves to establish the relative stability or salience relations among the values along a given dimension. This domain is perhaps the most important for the consideration of form-bearing capacity because it is clear that if a system of habitual relations among values along a dimension cannot be learned, the power of that dimension as a structuring force would be severely compromised.
The incoming acoustic information is parsed and interpreted according to acquired musical knowledge structures which affect the subsequent encoding and organizing of the musical material in an accumulated event structure upon which the experience of form is based. This event structure is specific to a given listening to a piece of music. This area involves many psychological processes to which a dimension must be susceptible if it is to contribute to musical form, such as the encoding of values and relations on musical dimensions, the perception of the similarity, invariance, and difference of musical material occurring at different times, the combination of hierarchical and associative structures and the appreciation of the trajectory of development, and the generation of expectations based on structural implications.
The remainder of this article will examine more closely the aspects of abstract knowledge structures and event structure processing that are important for evaluating the contribution of various perceptual dimensions to musical form.
One of the aspects of this categorization is the discreteness of many of the musical dimensions. It is crucial that categories be easily discriminated from one another. If a listener cannot tell them apart, then the use of one category rather than its neighbor cannot create a perceptible difference in structure. The semitone in the Western 12-tone pitch system, for example, is at least six times larger than the smallest interval most listeners are able to discriminate (Green, 1976, chap. 10). Even quarter-tones are still a reasonable interval size with respect to discriminability, at least in the middle and upper registers. Jordan (1987) has shown that listeners discriminate tonal function at the quarter-tone level but not at the eighth-tone level. A microtonal pitch system such as Partch's (1974) 43-tone-per-octave Chromelodeon approaches the limits of discriminability very closely, which could present difficulties in perceiving the different pitch structures produced in the system.
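To make these magnitudes concrete, the following minimal Python sketch expresses each step size in cents (1200 cents per octave). It assumes an equal division of the octave for every system, which is only an approximation for Partch's scale, whose 43 steps are derived from unequal just-intonation ratios. The helper name `step_in_cents` is hypothetical, and the threshold is simply the value implied by the "six times" figure above, not a measured one.

```python
def step_in_cents(divisions_per_octave):
    """Size of one step in cents when the octave is split into equal parts."""
    return 1200.0 / divisions_per_octave

# Upper bound on the discrimination threshold implied by the text:
# the semitone is "at least six times larger" than that threshold.
implied_threshold = step_in_cents(12) / 6

# Equal division is an assumption made for illustration only;
# the 43-tone figure is therefore an average step size.
systems = [
    ("semitone, 12 per octave", 12),
    ("quarter-tone, 24 per octave", 24),
    ("eighth-tone, 48 per octave", 48),
    ("43 steps per octave (average)", 43),
]

for name, divisions in systems:
    cents = step_in_cents(divisions)
    ratio = cents / implied_threshold
    print(f"{name}: {cents:.1f} cents, {ratio:.1f} x implied threshold")
```

Under these assumptions the average 43-tone step (about 28 cents) falls between the quarter-tone and the eighth-tone, that is, between the level at which Jordan's listeners discriminated tonal function and the level at which they did not.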
The importance of discretization is that we remember discrete entities more easily than continuous or unclearly demarcated ones, at least for the memory of structures. This does not mean that continuous variation is not important in the appreciation of musical form. It is certainly vital for expressive variation of musical gesture. I am inclined, though, to remain close to Clarke's (1987) distinction between structure and expression, where discrete elements carry structure and continuous variation carries expression. If a piece of music were to glide continuously through its several dimensions, I fear that a listener would not acquire much of a sense of form. Expressive, interpretive gesture is certainly continuous in nature, but it most often moves around a stable point of reference or between a couple of points of reference, which play the role of category prototypes (Rosch, 1978). A specific category is still often evoked in such cases.
It is also important that there be a relatively small number of categories. How small this number is depends primarily on short-term memory limitations (our ability to encode and compare values over a relatively short time period). There is certainly an important interaction between these short-term processing limitations and the storage of an abstract system of relations among values. It is easier to encode values that can be anchored to an over-learned system of relations than those that are completely new for a listener, as would often be the case in one's initial contacts with new scale or rhythmic systems, or with new sound categories such as are found in electroacoustic music. Most pitch systems of the world are limited to 5 to 12 tones per octave, a limit far exceeded by Partch's Chromelodeon. For musical qualities like timbre that have several perceptual "dimensions" such as brightness, tone color, roughness, attack character, etc., it would probably be necessary, in using them to build musical form, to have only a few values along each dimension. These values should be widely separated so that the distinctive relations among them could be easily perceived and learned. Such is the case with distinctive features in speech sounds, where only two or three values are employed along any given speech dimension (Fant, 1973). This allows them to be meaningfully contrasted with one another.
It may be desirable, also for reasons related to memory limits, to have well-defined, perhaps fixed, categories or relations ("intervals") among categories. This fixity would certainly enhance the contribution of long-term memory to the encoding and organization of incoming information, primarily since the categories and relations could be generalized across a large number of musical situations. Dimensions like pitch and musical instrument timbre are, in many cases, more or less fixed in the fabrication of the instruments or are defined by cultural convention (though, again, there are important exceptions throughout the world). Duration and dynamics tend, however, to be much more flexibly chosen and thus greater perceptual importance is attached to maintaining and contrasting relations among values in performance (Gabrielsson, 1979).
Relations are thus quite important components of musical material. Much psychological research on music recognition memory has shown the importance of relations as building blocks of patterns. These relations would include pitch intervals, timbre vectors and duration proportions. Our ability to recognize transposed or accelerated musical patterns testifies to the psychological reality of relational encoding. But in some cases it appears that these relations become strongly fixed in long-term memory. Shepard & Jordan (1984) presented listeners with a stretched C major scale such that the octave ended on C#' instead of C'. They then presented probe tones belonging to C major, C# major or the stretched scale and asked listeners to judge how well they fit with the previously heard scale. Listeners tended to judge tones from the C# major scale as better fitting, indicating that as the tones of the stretched scale deviated from the learned pattern, this pattern was "shifted" to accommodate the deviations. At the end it was thus "positioned" on C# and influenced the subsequent judgments accordingly.
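The geometry of the stretched scale can be illustrated with a short Python sketch. It assumes that the stretching is uniform on a logarithmic (semitone) scale, so that every interval is enlarged by the factor 13/12 and the scale ends a semitone above the octave. This uniformity and the names used are assumptions made for illustration, not details taken from Shepard & Jordan's procedure.

```python
# Major scale degrees in semitones above the starting C (including the octave).
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11, 12]

def stretched_scale(final_semitones=13.0):
    """Stretch every degree uniformly so the 'octave' spans final_semitones."""
    factor = final_semitones / 12.0
    return [degree * factor for degree in MAJOR_SCALE]

stretched = stretched_scale()

for degree, value in zip(MAJOR_SCALE, stretched):
    deviation_from_c = value - degree               # distance from the C major tone
    deviation_from_c_sharp = value - (degree + 1)   # distance from the C# major tone
    print(f"degree {degree:2d}: stretched {value:6.3f}  "
          f"vs C major {deviation_from_c:+.3f}  "
          f"vs C# major {deviation_from_c_sharp:+.3f}")
```

In this sketch the first tone coincides with C major and the last with C# major, while the tones in between drift steadily from one to the other, which is consistent with the idea that the learned pattern is gradually "shifted" as the scale unfolds.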
The origins of the discretization of dimensions into categories are varied, depending on the dimension in question, and may be considered as either natural or artificial (cf. Rosch, 1973, 1978, for visual and semantic categories). Instrumental timbres would appear to be separated on the basis of their mode of production (attack quality and spectral evolution are related strongly to instrument family - such as brass, bowed string, single or double reed, and so on), as well as on the basis of certain abstracted qualities such as brightness (Grey, 1977). This also appears to be the case with speech phonemes and implies the existence of possibly innate perceptual mechanisms that are tuned to the physical behavior of sound-producing objects (Neisser, 1976, chap. 8). The categorization of other dimensions (such as pitch, duration, dynamics and many synthesized timbres) results from a learned, artificial division of psychophysical continua of perceptual qualities (Dowling & Harwood, 1986, chap. 4). It is important to note that in the case of natural categories, it is difficult to achieve a continuous perception of some dimensions. The tendency to categorization and identification of the source is very strong. In the case of artificial categories, the continuity of the dimension is already there and a clear, reproducible system of discretization is necessary. The strong timbral identities of instrumental sound sources may to some extent be overcome by compositional artifice in orchestration where their blending into composite qualities gives one more flexibility in continuous variation between "composed" timbres (cf. Boulez, 1987).
To begin this investigation, let us consider several properties that appear to contribute to the organization of pitch systems (e.g., scale structures - Cross, Howell & West, 1985; and tonal hierarchies - Krumhansl, 1983) that we may postulate to be more or less easily processed by a listener. A recognition of these properties in a sequence of musical events helps to evoke and establish the system framework in the listener's mind.
Some researchers feel that the category values and intervals should be selected with respect to sensory considerations, such as sensory consonance (Helmholtz, 1877/1885; Krumhansl, 1987; Lerdahl, 1988; though this is contested by Brown, 1988). The organization of relations according to such sensory properties provides a solid psychoacoustic foundation for their function in musical patterns. Mathews, Pierce & Roberts (1987) propose this kind of constraint as a crucial consideration for the development of new musical dimensions. They call it the acoustic nucleus hypothesis: "With new materials, it is necessary to have an acoustic nucleus on which to grow powerful musical connotations via long-term learning. The acoustic nucleus consists of sound qualities that are perceivable at a low peripheral level, such as . . . relative dissonance . . . " (p. 83). However, we should realize that this constraint has been loosened somewhat in the establishment of the equal-tempered pitch system in order to gain other organizational possibilities such as being able to transpose a pitch pattern to any other pitch without seriously distorting the interval relations. Lerdahl (1987) has made some preliminary attempts at applying a generalized notion of sensory dissonance to the development of a system of timbral relations.
A property that seems to be special to pitch is the existence of a strong perceptual equivalence at the octave which allows the dimension to be organized cyclically with a given pattern of values being repeated regularly in each octave. This property is found in most of the musical scales of the world, though there are some notable exceptions (cf. discussion in Burns & Ward, 1982). As such one might hypothesize, as do Burns & Ward, that octave generalization in pitch is a learned concept which has its roots in sensory consonance, but the innate vs. acquired nature of octave equivalence is far from resolved in the debate between universalists and cultural relativists.
The psychological value of a cyclic dimension is that it allows a certain economy of mental representation and learning of the stimulus dimension: a large number and range of values can be used without overloading memory since a scale pattern with a small number of elements is repeated regularly. Once well-learned, these patterns tend to become very strong as components of structural interpretation, that is, as components of mental schemata that are used to organize and understand the incoming musical structure (Krumhansl, 1983; Shepard & Jordan, 1984).
One wonders whether the mere existence of a cyclic organization is in itself sufficient to enrich the dimension's form-bearing capacities independently of the actual repetition interval used. The very special status of the octave is due to its high degree of consonance. Aside from experiments by Mathews & Pierce (1980), little experimental evidence exists for scale systems organized on intervals other than the octave. They developed a "stretched" diatonic scale which had an "octave" ratio of 2.4:1 instead of 2:1. The tones that were played also had stretched frequency ratios, making them inharmonic. Subjects were asked to judge whether a short harmonic progression was in the same key as a longer passage. Both ended in a stretched equivalent of a cadence. Subjects were also asked to judge the finality of the cadence compared with unstretched tones and scales. The results suggest that subjects can match the "keys" of chord sequences in the stretched system, though the cadences lack a sense of finality.
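A minimal Python sketch of such a stretched system follows. It assumes that the 2.4:1 pseudo-octave is divided into twelve equal steps and that the partials of each tone are stretched by the same law, so that partial n lies at f0 * 2.4^(log2 n). Both assumptions, and the function names, are illustrative and are not confirmed by the description above.

```python
import math

PSEUDO_OCTAVE = 2.4  # frequency ratio replacing the usual 2:1 octave

def stretched_step(f0, steps, pseudo_octave=PSEUDO_OCTAVE):
    """Frequency 'steps' twelfths of a pseudo-octave above f0 (assumed equal steps)."""
    return f0 * pseudo_octave ** (steps / 12)

def stretched_partial(f0, n, pseudo_octave=PSEUDO_OCTAVE):
    """Frequency of the n-th partial when the harmonic series is stretched likewise."""
    return f0 * pseudo_octave ** math.log2(n)

f0 = 100.0  # arbitrary reference frequency in Hz
print("pseudo-octave above f0:", round(stretched_step(f0, 12), 3))
for n in range(1, 6):
    harmonic = n * f0
    print(f"partial {n}: harmonic {harmonic:.1f} Hz, "
          f"stretched {stretched_partial(f0, n):.1f} Hz")
```

With this construction the second partial of a tone coincides with the fundamental of the tone one pseudo-octave higher, just as the second harmonic coincides with the octave in the unstretched case, while the partials are no longer integer multiples of the fundamental.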
In a great deal of contemporary Western music since the 1950s, there has been a kind of obsessive avoidance of pitch pattern repetition in other octaves. Part of the reason for this was to avoid invoking the schemata of classical tonal music. Another was based on an aesthetic principle of continual renewal of material with as little repetition as possible (cf. Schoenberg, 1941/1975, 1948/1975). The resulting irregularities (also often applied to rhythm) force listeners to adopt completely different modes of listening and remembering.
There is no evidence of true circularity in form-bearing dimensions other than pitch. One attempt at imposing circularity on timbre has been proposed by Slawson (1985). He takes a bounded two-dimensional representation of vowel-space (the dimensions corresponding to the center frequencies of two formants or filters) as a starting point for a theory of "sound color". From this he tries to develop a series of rules of organization of the space and of operations on the elements in the space based on serial procedures. He suggests that if an operation, such as transposition, forces one to leave the bounded space to the right, one should treat the right hand border as coextensive with the left hand border and simply wrap the pattern around. This has the effect of completely changing the interval and contour relations of the pattern. I would claim that this is the perceptual equivalent of using a two and a half octave instrument (C2 to F4) and treating the C2 as equivalent to F4. Transposing the pattern F3-D4-Bb3-A3-C#4 up a fourth and wrapping the notes above F4 around to C2 would give Bb3-D2-Eb4-D4-C#2 which is clearly different from the original in both interval pattern and contour. The author recognizes that this is an unfounded premise that violates what one perceives. He proceeds nonetheless to base a large portion of his theory of sound color and many of his compositional efforts on this falsely imposed property of the space. It becomes clear through this intellectual exercise that one cannot "invent" a perceptual equivalence that has no psychoacoustic foundation. It also becomes clear that the lack of this property places some rather severe constraints on the possible range of operations along a dimension.
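The pitch analogy above can be checked with a short Python sketch. It assumes MIDI-style note numbering (C4 = 60) and wraps any transposed note that exceeds F4 back down by the span of the instrument, so that F4 is treated as equivalent to C2. The helper names are hypothetical.

```python
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def to_midi(name):
    """Convert a name such as 'Bb3' or 'C#4' to a MIDI-style number (C4 = 60)."""
    letter, rest = name[0], name[1:]
    accidental = 0
    while rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    return 12 * (int(rest) + 1) + NOTE_OFFSETS[letter] + accidental

LOW, HIGH = to_midi("C2"), to_midi("F4")
SPAN = HIGH - LOW  # the top border is treated as coextensive with the bottom one

def transpose_with_wrap(notes, semitones):
    """Transpose and wrap notes that leave the range at the top back to the bottom."""
    return [n + semitones - SPAN if n + semitones > HIGH else n + semitones
            for n in notes]

def intervals(notes):
    return [b - a for a, b in zip(notes, notes[1:])]

def contour(notes):
    """Direction of each successive interval: +1 up, -1 down, 0 repeated."""
    return [(i > 0) - (i < 0) for i in intervals(notes)]

original = [to_midi(n) for n in ["F3", "D4", "Bb3", "A3", "C#4"]]
wrapped = transpose_with_wrap(original, 5)  # up a perfect fourth

print("original intervals:", intervals(original), "contour:", contour(original))
print("wrapped intervals: ", intervals(wrapped), "contour:", contour(wrapped))
```

The wrapped result corresponds to Bb3-D2-Eb4-D4-C#2, and both its interval sequence and its contour differ from those of the original pattern, whereas an ordinary transposition would leave both unchanged.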
Balzano (1980), Dowling & Harwood (1986), and Krumhansl (1987) have delineated a number of other properties of the pitch dimension that are helpful in establishing a tonal pitch hierarchy. These criteria are examined in detail with respect to several existing and proposed pitch scale systems in Krumhansl (1987). Focal values are those that occur frequently, that have long durations, and that tend to occupy strong positions in musical phrases. Frequency of occurrence and duration may, in particular, help to establish the system framework when a listener is faced with an unfamiliar musical style. Western listeners appear able to do this with Indian rags though certain subtleties of the Indian system escape them (Castellano, Bharucha & Krumhansl, 1984; see also Kessler, Hansen & Shepard, 1984, for Western and Balinese listeners). The effect of phrase position would depend a great deal on the listener already having acquired some understanding of phrase structure, perhaps from cues such as slowing down and pauses at the end of phrases (cf. Carlson, Friberg, Frydén, Granström & Sundberg, this volume).
It is desirable for listeners to be able to rapidly discern their "position" within the system of pitch relations (Browne, 1981). This position finding may be due to both the asymmetric structure of intervals among the categories in a scale (e.g. in the major diatonic scale there are series of two and three major seconds separated by minor seconds), and to the existence of rare or distinctive intervals in the set (e.g. in the major diatonic scale there exists only one augmented fourth, two minor seconds, and a greater number of other intervals; cf. Butler & Brown, 1984). This property also maximizes the variety of interval sizes in a given scale. This property would distinguish the Western diatonic scales and many Indian thats from equal-tempered pentatonic scales found on Indonesian gamelan and Ugandan harp or from certain equal-tempered heptatonic scales found on Ugandan and Thai xylophones (cf. discussion in Burns & Ward, 1982, pp. 257-258).
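The interval counts cited for the major diatonic scale can be reproduced with a minimal Python sketch. It counts interval classes (an interval and its inversion are pooled, so the augmented fourth is counted as a tritone) over all pairs of scale members. The equal-tempered pentatonic is modeled here, as an assumption, by five equal divisions of the octave; the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations

def interval_class_counts(pitch_classes, modulus=12):
    """Count how often each interval class occurs between pairs of scale members."""
    counts = Counter()
    for a, b in combinations(sorted(pitch_classes), 2):
        distance = (b - a) % modulus
        counts[min(distance, modulus - distance)] += 1
    return dict(counts)

# Major diatonic scale in 12 equal steps per octave.
major = [0, 2, 4, 5, 7, 9, 11]
# Idealized equal-tempered pentatonic: every step of a 5-fold octave division.
equal_pentatonic = [0, 1, 2, 3, 4]

print("major diatonic:  ", interval_class_counts(major))
print("equal pentatonic:", interval_class_counts(equal_pentatonic, modulus=5))
```

In the diatonic case one interval class occurs once and another twice, which provides the rare intervals that could support position finding; in the equal pentatonic model every interval class occurs equally often, so no interval is distinctive.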
The last criterion of ordering is the predisposition to certain sequential relations among dimension values (Butler & Brown, 1984; Brown, 1988). This criterion is enhanced by the existence of distinctive intervals. It tries to capture aspects of functional relations that are not merely related to the frequency of occurrence of individual values, but to the frequency of occurrence of pairs or sets of values in a given sequential order, that is, to statistical sequential asymmetries found in a body of music. For example, within the Western tonal hierarchy, the unstable leading-tone tends to resolve to a succeeding tonic. The statistical occurrence of this ordering of the pitches is much greater than that of the reverse. The learning of these tendencies through extended exposure to a given style of music is partially responsible for the sense of directed motion: a given value implies by anticipation its succession by another, giving rise to the patterns of tension and release, or implication and realization (cf. Narmour, this volume). This functional, sequential aspect of pitch relations has been stressed by Butler & Brown (1984) as a crucial cue in evoking a tonal center and a sense of key. What I would like to point out here is that many of these frequently occurring relations of stability and instability can also become part of the abstract knowledge of the structure of a musical dimension such as pitch. This is the problem of the interaction between abstract knowledge structures and real-time event structure processing that is discussed by Deutsch (1984) and Bharucha (1984b) in terms of tonal and event hierarchies.
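A sequential asymmetry of this kind can be measured by counting ordered pairs of successive values. The Python sketch below does so for a short melody written in scale degrees (1 = tonic, 7 = leading-tone). The melody is invented purely for illustration and is not drawn from any corpus or from the studies cited.

```python
from collections import Counter

def ordered_pair_counts(sequence):
    """Count each ordered pair of immediately successive values."""
    return Counter(zip(sequence, sequence[1:]))

# Invented toy melody in scale degrees; not real data.
toy_melody = [1, 2, 3, 2, 1, 7, 1, 5, 6, 7, 1, 3, 2, 7, 1]

counts = ordered_pair_counts(toy_melody)
leading_tone_to_tonic = counts[(7, 1)]
tonic_to_leading_tone = counts[(1, 7)]

print("7 -> 1 occurs", leading_tone_to_tonic, "times")
print("1 -> 7 occurs", tonic_to_leading_tone, "times")
```

Applied to a large body of music in a given style, the same count would estimate how strongly one ordering of a pair dominates its reverse, which is the kind of regularity a listener could in principle internalize through exposure.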
Much psychological research has demonstrated that information is more easily encoded, organized, perceived and remembered when it is hierarchically ordered (Restle, 1970; Deutsch & Feroe, 1981). One might postulate that this is, then, a desirable property of a stimulus system. There are two types of hierarchy that are often referred to here. One is a hierarchy of dominance or stability relations (what Simon, 1962/1982, refers to as a "formal hierarchy") in which some elements dominate other elements and are thus given greater structural prominence. The other type is one where combinations of elements at lower levels give rise to emergent properties at higher levels that are not easily derived from the properties of the individual constituents. Pitch relations in a scale are primarily of the first type, while the relations between pitch, chord, and key are of the second type. There is a third type, often used in connection with event hierarchies (see next section), which involves a notion of parts within parts. This last type is the main one referred to in Lerdahl & Jackendoff (1983) in their well-formedness rules and structural trees. Hierarchization depends to a large extent on some of the criteria listed previously, such as the existence of reference points or focal values.
The derivation of harmonic function from the tonal organization of pitch is an example of the richness of hierarchization possible in a musical dimension (cf. Krumhansl, 1987; Lerdahl, 1988), though one wonders to what extent this richness might be limited to pitch and duration. The derivation of harmony from scale structure illustrates the fact that certain properties emerge at certain levels in a hierarchical system. A number of criteria for harmonic relations that mirror to some extent those of pitch relations listed above have been proposed by Krumhansl (1987). I won't examine these here, but will summarize her reflections by saying that the relations among the "emergent values" (chord types) at this higher level of the hierarchy are organized in strikingly similar ways to the simpler values (pitches) at the lower level. This coherence between levels of the hierarchy distinguishes the Western tonal pitch system and gives it such a high structuring potential.
The abandonment or weakening by many contemporary composers of the tonal pitch hierarchy, with its incumbent structural economy, and the resistance of some to using any other kind of hierarchy to replace it, might place a greater cognitive burden on the listener. To date little experimentation or psychologically oriented theory has been directed at trying to understand what listeners actually hear and understand in this kind of music, or to understand the extent to which one can learn to process these new combinatorial structures (though see Lerdahl, this volume). The work that has been done suggests that most listeners are more sensitive to contour than to precise interval structure in atonal and serial music (Francès, 1972/1988, chaps. 3, 4; Dowling & Harwood, 1986, chap. 5). Krumhansl, Sandell & Sergeant (1987) produced evidence that both inexperienced listeners and those highly trained in contemporary musical idioms, when confronted with fragments of 12-tone serial music (Schoenberg's Wind Quintet, op. 26, 1924, and String Quartet, no. 4, op. 37, 1936), tend to interpret the pitches according to tonal implications in the fragments. The judgments of trained listeners tended to be negatively correlated with these implications. The authors interpreted these results as indicating that trained musicians hear the tonal implication and then give low ratings to tones that fit with it since such relations are not supposed to be present in this music: a kind of post hoc decision rather than an immediate perceptual experience. At any rate, it should be emphasized that relatively little work has been done on music outside the Western tonal idiom. Nor do we yet have a large population of people who have been as exposed to these new musical organizations as they are to tonal/metric music.
Some of the strong pronouncements about the ecological invalidity of contemporary musical idioms are certainly premature, though perhaps composers should also take a stronger interest in the speculations of music psychologists (McAdams, 1988).
I have confined myself primarily to pitch in this section, but it is worthwhile briefly considering other dimensions. Another crucial form-bearing dimension is duration, upon which very elaborate metric and rhythmic systems have been developed. A relatively small number of well-defined relative durations are used. A system of strong and weak beats is often organized hierarchically (and represented mentally as abstract knowledge) to which duration patterns are anchored (cf. Gabrielsson, 1979; Lerdahl & Jackendoff, 1983, chap. 4; Longuet-Higgins & Lee, 1984; Povel & Essens, 1985; Dowling & Harwood, 1986, chap. 7). Some of the reflections on the relations between tonal and event hierarchies suggest that the temporal dimension is crucial for the establishment of relations that are encoded as abstract knowledge on other dimensions.
To my knowledge no systematic experimental or musical research has yet been done on the possibilities of "scale" systems of timbre, or how these might interact with pitch and duration systems. It has been demonstrated that musical instrument timbres can be easily discriminated. Timbre relations have been shown to have similar mental representations across several listeners and can be more or less predicted on the basis of acoustic properties (Grey, 1977; Risset & Wessel, 1982). Listeners can also make consistent judgments of analogous timbral vectors (intervals through more than one dimension, possessing both distance and direction components). This demonstrates that the notion of vector is relevant and that transposed vectors are perceived as being equivalent when distance and direction relations are held constant between the two timbres (Wessel, 1979). The existence of the vector is already an important step toward developing patterns and scales in timbre space. Some preliminary attempts at making hierarchically organized sequences of timbres are encouraging (Lerdahl, 1987). What is not yet known is the extent to which variation along the dimensions of timbre can maintain perceptual invariance in the face of changes along the pitch and duration dimensions, or the extent to which listeners can acquire stable abstract representations of an ordered system of timbre relations.
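The notion of a transposed timbre vector can be sketched in a few lines of Python. The sketch treats timbres as points in a two-dimensional space and completes the analogy A is to B as C is to D by adding the vector from A to B to the point C. The coordinates, the axis interpretation and the function names are all hypothetical; they are not taken from Grey's or Wessel's data.

```python
def timbre_vector(a, b):
    """Vector (distance and direction) leading from timbre a to timbre b."""
    return tuple(y - x for x, y in zip(a, b))

def transpose(point, vector):
    """Apply the same vector starting from another timbre."""
    return tuple(p + v for p, v in zip(point, vector))

# Hypothetical coordinates, e.g. (brightness, attack quality) in arbitrary units.
timbre_a = (1, 4)
timbre_b = (3, 5)
timbre_c = (6, 1)

vector_ab = timbre_vector(timbre_a, timbre_b)
timbre_d = transpose(timbre_c, vector_ab)

print("vector A -> B:", vector_ab)
print("predicted D:  ", timbre_d)
```

By construction the vector from C to D equals the vector from A to B, which is the condition under which the text reports that listeners judge the two timbral intervals as equivalent.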
It will also be necessary to try to generalize the use of emergent properties of a hierarchical system (such as the relation between pitch and harmony) to the timbre dimension. A combination of values must possess an emergent property that derives from the group configuration rather than being a new value along the dimension. In the case of pitch, a chord can have the quality of being major or minor, for example, and one can still hear out the individual pitches. The pitches do not fuse into a new pitch which replaces them (in spite of the fact that we are often limited by masking processes in hearing out inner voices). It is frequently the case, as a great deal of 20th-century music testifies, that multiple timbres fuse into a composite timbre, the individual identities being replaced by the newly emerged one. The non-fusion criterion may, in many cases, limit the possibilities of a superordinate system of timbral combinations. But then it may be pushing the rational urge too far to expect all form-bearing dimensions to behave in the same way or to have similar structural properties at all levels of combination.
Given that these patterns are sequential objects, it seems that the notion of "event schema" as the representation of a category may be appropriate. These abstract event schemata of stereotypic patterns and forms would correspond to some extent to the notion of "scripts" as developed by Schank & Abelson (1977). These researchers propose that we have abstract scripts for various kinds of macro-events, such as going to a concert or taking a bath. The script involves the main kinds of actions that are necessary, such as leaving home, going to the concert hall, buying the ticket, sitting down, listening attentively, and going home. This is evidently very abstract and allows for all kinds of variation in its real-life manifestation. The same would hold for the notion of a melodic process. A "changing-note process", for example, starts at an important pitch (such as a tonic), descends a step below this pitch on the scale, skips up to the pitch above and then comes down to the main pitch again. In its specific manifestation, each of these elements may be elaborated into several bars of music. What Meyer's theory proposes, however, is that we recognize these processes as basic categories of melodic structure, of which there are a very small number. This idea has been given some credence in analyses and experimental studies by Rosner & Meyer (1982, 1986).
This has several implications for form-bearing dimensions. Such patterns, if used extensively in the music of a culture, must be abstracted and generalized through experience by listeners. This means that for a lexicon of stereotypic event schemata to be established in long-term memory, the process of encoding and abstraction must be possible for that dimension. This is an area that certainly deserves more serious consideration and experimentation on several musical dimensions.
An important property of the activation of a structural schema is the "assignment" of relative stability, dominance, or salience relations among the perceived events. The fixing of events within an interpretive framework gives them musical significance and to some extent forms the dynamic musical flow by the anchoring or assimilation of less stable or salient elements to more stable and salient elements. But the temporal order of tones can also strongly influence which ones are interpreted as more stable (and, by implication, what the activated framework is). For example, in the tonally ambiguous sequence B3-C4-D#4-E4-F#4-G4, Bharucha (1984a) found that listeners preferred C major as an accompanying chord over B major, though the pitches of both chords are present in the sequence. Conversely, when the sequence was played in reverse, listeners preferred B major. Bharucha concludes that two constraints are in operation in the anchoring of an unstable tone to a more stable one: the stable tone normally follows the unstable one, and the tones must be either diatonic or chromatic neighbors. Thus, the skips between C-D# and E-F# cause the sequence to be interpreted as three sequential pairs and the stability relations to be established within each pair. This is a point where grouping processes and knowledge structures strongly interact (Deutsch, 1978; Krumhansl, 1979; Bharucha, 1984a,b). One wonders what responses listeners would give if the range of possible accompaniments presented to them were greater, or if the listeners had had extensive experience listening to Arabic music, in which such a scale pattern is quite common.
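The two anchoring constraints attributed to Bharucha above can be sketched in a few lines. This is an illustrative simplification, not a model from the cited work: it treats only a semitone step as a neighbor relation (an assumption that reproduces the three-pair grouping described in the text), and the function names are hypothetical.

```python
# Pitch-class offsets in semitones above C.
NOTE_TO_SEMITONE = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}

def to_midi(name):
    """Convert a note name such as 'D#4' to a MIDI note number (C4 = 60)."""
    pitch_class, octave = name[:-1], int(name[-1])
    return 12 * (octave + 1) + NOTE_TO_SEMITONE[pitch_class]

def anchored_tones(sequence, max_step=1):
    """Return tones that directly follow a neighbor within max_step semitones.

    Assumption: the second tone of each such pair is taken as the stable
    (anchoring) tone, following the order constraint described in the text.
    """
    anchors = []
    for first, second in zip(sequence, sequence[1:]):
        if abs(to_midi(second) - to_midi(first)) <= max_step:
            anchors.append(second)
    return anchors

forward = ["B3", "C4", "D#4", "E4", "F#4", "G4"]
print(anchored_tones(forward))        # ['C4', 'E4', 'G4']
print(anchored_tones(forward[::-1]))  # ['F#4', 'D#4', 'B3']
```

Under these assumptions the forward sequence yields the tones of a C major triad and the reversed sequence those of a B major triad, which matches the direction of the preferences reported above.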
Additional constraints on sequence order have been reported by Butler & Brown (1984), who found that the sequential position of the tritone in a melodic pattern was important in the degree of activation of a tonal center. This indicates that distinctive intervals in a scale structure are not sufficient in and of themselves to activate the representation of a whole system of relations. The import of the process of anchoring or assimilation is that the possibility of establishing such sequential tendencies is a strong factor contributing to the form-bearing capacity of a dimension.
Memory limits also need to be considered in the encoding of musical material. In order to be easily encoded and then to contribute to the perception of connectedness between groups of musical events separated in time, a musical pattern must satisfy a number of constraints. It should be small enough to fit within the perceptual present (about 2-5 seconds; Fraisse, 1978). It should be unified enough to be grouped into a chunk in short-term memory. (The limit on short-term storage is classically set at about 5-9 chunks, following Miller, 1956, though this depends on the nature of the pattern.) Patterns that are organized according to easily discernible rules of construction, such as hierarchical patterns, are generally remembered more easily and can have more elements in short-term memory than patterns organized in other ways (Restle, 1970; Deutsch & Feroe, 1981; Deutsch, 1982). Memorability may also be greater for patterns that have some kind of inherent stability or well-formedness, as well as for patterns that are more easily assimilated to an existing knowledge structure. The latter point is suggested by work on recognition memory of tonal and atonal sequences (see Francès, 1972/1988; Dowling & Harwood, 1986). The question of "good formation" has not been considered much of late. What "well-formedness" means may well be quite different for different dimensions, and it is not clear to what extent it would be independent of cultural convention, and thus of acquired knowledge structures.
Perceptual invariance means that certain relations between categories along a stimulus dimension must remain constant after transformation. A perfect fifth (or a whole melody) maintains a relatively constant quality regardless of the register in which it is played (within the range of musical pitch). This means that transposition is an operation that is easily afforded by the pitch dimension. If patterns composed along a dimension are not predisposed to being varied and still being perceived as similar, then the dimension cannot make a strong contribution to musical form (White, 1960). In cases of inversion and expansive transformations, we would intuit that the degree of similarity would be less than in the case of transposition where both contour and interval content were maintained. A simple retrograde transformation, on the other hand, reverses absolute pitch sequence, interval sequence and contour; though the pitch set remains constant, its order is completely changed and people tend not to perceive it as being very similar to the original sequence (Francès, 1972/1988, Exp. 6). From these cases, we might hypothesize that limits on the perceived relatedness of a pattern to its transformed version indicate the limits of viable musical transformations. Krumhansl, Sandell & Sergeant (1987) have shown that listeners are capable of classifying mirror forms related to distinctive original pitch patterns when there are only two sets of them. However, no estimate was made of how similar these forms were perceived as being with respect to their originals.
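The structural differences among the transformations discussed above can be made concrete with a small sketch. The melody, given as MIDI note numbers, is a hypothetical example, and the functions describe only what each transformation preserves on paper; they make no claim about how similar listeners actually judge the results to be.

```python
def transpose(pitches, interval):
    """Shift every pitch by the same number of semitones."""
    return [p + interval for p in pitches]

def invert(pitches):
    """Mirror the pattern around its first pitch."""
    axis = pitches[0]
    return [2 * axis - p for p in pitches]

def retrograde(pitches):
    """Reverse the order of the pitches."""
    return pitches[::-1]

def intervals(pitches):
    """Successive intervals in semitones."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def contour(pitches):
    """Direction of each step: +1 up, -1 down, 0 repeated pitch."""
    return [(d > 0) - (d < 0) for d in intervals(pitches)]

melody = [60, 62, 64, 60, 67]  # hypothetical pattern, C4 = 60

# Transposition keeps both interval content and contour.
print(intervals(transpose(melody, 5)) == intervals(melody))  # True
# Inversion keeps interval sizes but flips every direction.
print(contour(invert(melody)))  # [-1, -1, 1, -1]
# Retrograde keeps the pitch set but changes order, intervals and contour.
print(sorted(retrograde(melody)) == sorted(melody))  # True
print(contour(retrograde(melody)))  # [-1, 1, -1, -1]
```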
There would appear to be two basic classes of transformation: linear operations along a given dimension or set of dimensions (such as translation, expansion, rotation, etc.), and structural modifications of the pattern (such as changing a single element, splitting an internal beat in two, hierarchically elaborating a melodic process by developing musical figures around its main notes, or adapting a pattern to a different meter). In the case of linear dimensional operations, the comparison between original and transformed versions would remain at the same structural level. For operations like hierarchical elaboration, the similarity judgment would necessarily be made at appropriate levels of the patterns depending on the degree of elaboration (Deutsch & Feroe, 1981; Rosner & Meyer, 1986).
It is desirable not only that listeners be able to recognize transformed material as being similar or related, but also that they appreciate the nature of the transformation. This can contribute to a sense of direction in the musical development. The recognition and comparison of a more or less similar pattern after some kind of transformation at a later point in a piece implies the existence of a mental representation of the original pattern that maintains certain structural properties during transformation. These properties may be perceptible as such: a contour, for example. The existence of such representations suggests limits on the structuring of transformed musical materials. The majority of research in this area has been done in experiments on recognition memory for pitch patterns. Not much work has been done on other stimulus dimensions or on the contribution of perceived similarity to associative structuring. A more extended exploration may eventually open a rich domain of functional replacements for the classical variation process and propose new kinds of musical development (see, for example, Reynolds, 1987). It may well turn out, however, that certain kinds of transformation are limited to specific dimensions.
A potential form-bearing dimension should be closely correlated with the sensory dimensions that effect perceptual grouping, whether it be of a simultaneous, sequential or segmentational nature. Current research indicates that the dimensions of timbral brightness, pitch, duration, dynamics and spatial location have this capacity. Future work should examine interactions and competitions among the dimensions with respect to perceptual grouping potential. Another important avenue of research would be the interaction between knowledge structures and grouping processes in order to understand the extent to which these latter, primarily bottom-up processes can be affected by previous knowledge and ongoing expectations.
A form-bearing dimension should be susceptible to being organized into perceptual categories and relations among these categories should be easily encoded. A system of salience or stability relations should be learnable through mere exposure and should affect the perception of patterns along this dimension. Certain recurrent sequential patterns of values should be easily learned as a kind of lexicon of forms. In other words, relations along the dimension must be susceptible to being acquired as abstract knowledge. Experimental research has focussed almost exclusively on pitch in this area though duration is beginning to receive more attention. In pitch, the vast majority of work is confined to Western tonal music, with some notable side trips to India and Indonesia. Future work should look at existing pitch systems of other non-Western cultures and at some of the new approaches to the compositional organization of pitch found in the work of our living composer colleagues. The same could be said of rhythm and meter. To my mind the next most important candidate for exploration and experimentation is timbre. Along which of its dimensions can we perceive, organize and remember musical relations? To what extent can they compete with structures of pitch and duration? It may be that the different dimensions have different general characteristics and that their relative contributions to grouping processes and knowledge structures will be varied.
Finally, there is much work of both theoretical and experimental natures to be done on the contribution of form-bearing dimensions to the building of hierarchical and associative event structures. A serious effort is needed to clarify theoretically the notion of associative structure and to develop experimental methodologies to verify its psychological reality. Other problems would include some of the following. To what extent can different form-bearing dimensions contribute to associative and hierarchical structures? What is the relative contribution of the different dimensions in these structures to the direction of attentional processes and to the development of expectation? How do the different dimensions interact in the accumulation of a musical form when the implications of their individual structures converge on structural coherence or diverge toward structural ambiguity? Perhaps with some clearer ideas of the mental representation and processing of musical structure and their result in an experience of musical form, we can approach some of the more fundamental questions of individual musical experience that have to this point eluded experimental and theoretical efforts.