Server © IRCAM - CENTRE POMPIDOU 1996-2005.
All rights reserved for all countries.

Psychological constraints on form-bearing dimensions in music

Stephen McAdams

Contemporary Music Review, 1989
Copyright © CMR 1989

Abstract

In raising the question of form-bearing dimensions in music, we are trying to understand the possibilities and limits of the apprehension of musical form in terms of the psychological mechanisms that operate on a received acoustic structure. To approach this understanding theoretically and experimentally, we need to define the notion of form-bearing dimension and to develop some ideas on the interactions that take place between perceptual processes and memory structures as form is accumulated in the mind of a listener. Three areas of psychological concern are discussed: perceptual grouping processes, abstract musical knowledge structures and event structure processing. For each area, the constraints on different musical dimensions such as pitch, duration, dynamics and timbre are examined in light of their potential to carry musical form.

Keywords: auditory grouping, event structure, form-bearing dimension, knowledge structure, mental representation, musical form, perceptual invariance, psychological constraints.

Introduction

From the perceiver's perspective, the existence of artistic form is often intangible, fleeting, fugitive - evolving with new perceptions, new understandings. Form as experienced depends partly on the mind of the listener and partly on the structure presented to that mind. How, then, do we proceed to investigate the psychological reality of a musical form? What of a musical structure is experienced as musical form?

One approach to these questions would be to investigate the form-bearing capacity of the perceptual dimensions that are used in music. A dimension can bear form if configurations of values along it can be encoded, organized, recognized and compared with other such configurations. We can arrange a sequence of pitches, for example, that is easily recognized when it is heard again. We are also quite adept at noticing variations on such a sequence - the appreciation of melodic variation requires this psychological ability.

The utility of a dimension as a form-bearer, however, depends on some additional factors. A dimension that affords a greater number of perceivable configurations is more valuable to a composer than a dimension along which only a small number are possible. This restriction may be due, for example, to limits in the discrimination among the values available. Many pitches are easily perceived and discriminated, as is a vast number of their combinations. It is unlikely, though, that a large number of separate vibrato rates would be easily discriminated and, as such, only a very small number of configurations would be possible.

Limits on the encoding of complex patterns along the dimensions would in turn limit their potential for transformation and development of configurations. Our encoding of relative durations is fairly acute and highly structured, which allows for a rich variation and elaboration of rhythmic patterns. By contrast, encoding of spatial location in audition is relatively poor, and one would imagine that the development of sequences of spatial positions of notes would not be easily apprehended by a listener.

Another factor is the capacity to encode patterns along a given dimension in the presence of changes along other dimensions. Duration may well be the strongest dimension from this perspective since many composers repeat rhythmic motifs across rather large variations in pitch and timbre pattern, and listeners have no trouble in recognizing the similarity. According to the above-mentioned conditions, we might predict that pitch and duration would be strong dimensions, that several of the timbre dimensions would be of medium power, but that vibrato rate and spatial location, for example, would be very weak.

We cannot, therefore, arbitrarily structure an available physical dimension such as spatial location and still expect it to be comprehended. The limits for a given individual may be tied to internalizations of relations in the physical world that have proven useful to the human species through its evolution (Shepard, 1984). They might also be related to the extent to which the various dimensions are used frequently in the music of a given culture. It may be that form-bearing capacities represent biological and psychological constraints on structure processing. We need to study the way in which such structures can be apprehended by a listener.

A form extended in time directly poses the problem of how to do research on the mental representation of its temporal structure. The experience of form is highly dependent on several cognitive processes involved in the mental representation of musical constituents and musical knowledge, and in the organization and comprehension of musical structures. These processes include perceptual grouping, abstract knowledge structures and event structure processing.

Auditory grouping processes serve to organize the acoustic surface into musical events (simultaneous grouping), to connect events into musical streams (sequential grouping), and to "chunk" event streams into musical units (segmentational grouping). These three basic grouping processes are important precursors to the organization of musical form in that they pre-organize the continuous "acoustic surface" into discrete entities and groups of entities. This organization is, in effect, the forming of mental descriptions of what is happening in the world, a mapping of acoustic sources and events into auditory descriptions that can be used in calculating expectations and in developing comprehension, in effect a kind of "auditory scene analysis" (Bregman, 1977, 1981, in press). To some extent these grouping processes precede the extraction (or "computation") of the perceptual qualities of events and the relations among these event qualities (pitch, timbre, density, rhythm, interval, consonance, etc.; cf. McAdams, 1987; Wright & Bregman, 1987). Musical form is built upon relations among perceptual qualities. To the extent that grouping processes affect the emergent qualities, they can affect the perception of form. Some perceptual dimensions are strongly correlated with the sensory dimensions along which grouping decisions are made, such as pitch and timbre being strongly correlated with the spectral changes that affect sequential and segmentational grouping (Bregman, 1978; McAdams & Bregman, 1979; McAdams, 1984; I. Deliège, 1987). Such dimensions will be important contributors to musical form. Thus, an important criterion for a form-bearing dimension is that changes along it should be able to induce distinctive transitions or contrasts at the musical surface.

The perceived qualities of musical events are anchored to a learned system of relations (scale, meter, harmonic field, etc.) that is more or less strongly evoked by the relations among events in the musical context. This system of relations may be considered as abstract knowledge about the structure of the music of a given culture that one has acquired through extensive experience. This knowledge is abstract in the sense that it embodies relations that apply across a large repertoire of pieces and serves to establish the relative stability or salience relations among the values along a given dimension. This domain is perhaps the most important for the consideration of form-bearing capacity because it is clear that if a system of habitual relations among values along a dimension cannot be learned, the power of that dimension as a structuring force would be severely compromised.

The incoming acoustic information is parsed and interpreted according to acquired musical knowledge structures which affect the subsequent encoding and organizing of the musical material in an accumulated event structure upon which the experience of form is based. This event structure is specific to a given listening to a piece of music. This area involves many psychological processes to which a dimension must be susceptible if it is to contribute to musical form, such as the encoding of values and relations on musical dimensions, the perception of the similarity, invariance, and difference of musical material occurring at different times, the combination of hierarchical and associative structures and the appreciation of the trajectory of development, and the generation of expectations based on structural implications.

The remainder of this article will examine more closely the aspects of abstract knowledge structures and event structure processing that are important for evaluating the contribution of various perceptual dimensions to musical form.

Abstract Knowledge Structures

Many of our perceptions, decisions, and understandings are based on generalizations that we have learned from specific experience. In music cognition these abstractions seem to be of two types that are general to at least a whole repertoire of music or to a musical culture (though some may also apply across cultures): systems of relations among musical categories (such as pitch categories, scale structure, and tonal and metric hierarchies), and a lexicon of abstract patterns that are frequently encountered (such as gallop rhythm, gap-fill melody, sonata form or rag form). The former tend to be knowledge structures that are atemporal while the latter have a sequential aspect. Below I will discuss the role of categories, ordered relations among categories, and a lexicon of patterns in the cognition of musical form. The discussion will focus primarily on pitch with brief mention of limits and possibilities for other dimensions.

Ordered relations

It is necessary that the classification of stimulus categories and the ordering of their relations reflect psychological possibilities, in other words that there be a strong degree of correspondence between the stimulus structure and its mental representation. Otherwise the resultant structures will not be decodable by the listener, and they won't be able to contribute to the appreciation of musical form. The way of ordering the perceptual categories places more or less rigorous constraints on the apprehension of musical forms.

To begin this investigation, let us consider several properties that appear to contribute to the organization of pitch systems (e.g., scale structures - Cross, Howell & West, 1985; and tonal hierarchies - Krumhansl, 1983) that we may postulate to be more or less easily processed by a listener. A recognition of these properties in a sequence of musical events helps to evoke and establish the system framework in the listener's mind.

Some researchers feel that the category values and intervals should be selected with respect to sensory considerations, such as sensory consonance (Helmholtz, 1877/1885; Krumhansl, 1987; Lerdahl, 1988; though this is contested by Brown, 1988). The organization of relations according to such sensory properties provides a solid psychoacoustic foundation for their function in musical patterns. Mathews, Pierce & Roberts (1987) propose this kind of constraint as a crucial consideration for the development of new musical dimensions. They call it the acoustic nucleus hypothesis: "With new materials, it is necessary to have an acoustic nucleus on which to grow powerful musical connotations via long-term learning. The acoustic nucleus consists of sound qualities that are perceivable at a low peripheral level, such as . . . relative dissonance . . . " (p. 83). However, we should realize that this constraint has been loosened somewhat in the establishment of the equal-tempered pitch system in order to gain other organizational possibilities such as being able to transpose a pitch pattern to any other pitch without seriously distorting the interval relations. Lerdahl (1987) has made some preliminary attempts at applying a generalized notion of sensory dissonance to the development of a system of timbral relations.

A property that seems to be special to pitch is the existence of a strong perceptual equivalence at the octave which allows the dimension to be organized cyclically with a given pattern of values being repeated regularly in each octave. This property is found in most of the musical scales of the world, though there are some notable exceptions (cf. discussion in Burns & Ward, 1982). As such one might hypothesize, as do Burns & Ward, that octave generalization in pitch is a learned concept which has its roots in sensory consonance, but the innate vs. acquired nature of octave equivalence is far from resolved in the debate between universalists and cultural relativists.

The psychological value of a cyclic dimension is that it allows a certain economy of mental representation and learning of the stimulus dimension: a large number and range of values can be used without overloading memory since a scale pattern with a small number of elements is repeated regularly. Once well-learned, these patterns tend to become very strong as components of structural interpretation, that is, as components of mental schemata that are used to organize and understand the incoming musical structure (Krumhansl, 1983; Shepard & Jordan, 1984).

One wonders whether the mere existence of a cyclic organization is in itself sufficient to enrich the dimension's form-bearing capacities independently of the actual repetition interval used. The very special status of the octave is due to its high degree of consonance. Aside from experiments by Mathews & Pierce (1980), little experimental evidence exists for scale systems organized on intervals other than the octave. They developed a "stretched" diatonic scale which had an "octave" ratio of 2.4:1 instead of 2:1. The tones that were played also had stretched frequency ratios, making them inharmonic. Subjects were asked to judge whether a short harmonic progression was in the same key as a longer passage. Both ended in a stretched equivalent of a cadence. Subjects were also asked to judge the finality of the cadence compared with unstretched tones and scales. The results suggest that subjects can match the "keys" of chord sequences in the stretched system, though the cadences lack a sense of finality.
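As a rough numerical illustration of the stretched system described above, the following sketch computes scale-step and partial frequencies when the 2:1 octave is replaced by a 2.4:1 pseudo-octave. Only the 2.4:1 ratio comes from the text. The division of the pseudo-octave into 12 equal steps, the partial-stretching rule (partial k placed at f1 · S^(log2 k)), the 220 Hz base frequency, and all function names are assumptions made for illustration.

```python
import math

PSEUDO_OCTAVE = 2.4  # "octave" ratio reported in the text (instead of 2.0)

def step_frequency(base_hz, steps, ratio=PSEUDO_OCTAVE, divisions=12):
    """Frequency `steps` scale steps above base_hz.

    Assumption: the pseudo-octave is divided into 12 equal steps.
    """
    return base_hz * ratio ** (steps / divisions)

def partial_frequency(fundamental_hz, k, ratio=PSEUDO_OCTAVE):
    """Frequency of the k-th partial of a stretched tone.

    Assumption: partial k sits at f1 * ratio**log2(k), so partial 2
    coincides with the pseudo-octave; ratio=2.0 gives harmonic partials.
    """
    return fundamental_hz * ratio ** math.log2(k)

# Major-scale step pattern (in twelfths of the pseudo-octave).
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11, 12]

scale = [step_frequency(220.0, s) for s in MAJOR_STEPS]
partials = [partial_frequency(220.0, k) for k in range(1, 5)]

for step, hz in zip(MAJOR_STEPS, scale):
    print(f"step {step:2d}: {hz:8.2f} Hz")
```

Under these assumptions the second partial of a tone falls exactly on the pseudo-octave above it, which is the sense in which the stretched tones and the stretched scale are matched to one another.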

In a great deal of contemporary Western music since the 1950s, there has been a kind of obsessive avoidance of pitch pattern repetition in other octaves. Part of the reason for this was to avoid invoking the schemata of classical tonal music. Another was based on an aesthetic principle of continual renewal of material with as little repetition as possible (cf. Schoenberg, 1941/1975, 1948/1975). The resulting irregularities (also often applied to rhythm) force listeners to adopt completely different modes of listening and remembering.

There is no evidence of true circularity in form-bearing dimensions other than pitch. One attempt at imposing circularity on timbre has been proposed by Slawson (1985). He takes a bounded two-dimensional representation of vowel-space (the dimensions corresponding to the center frequencies of two formants or filters) as a starting point for a theory of "sound color". From this he tries to develop a series of rules of organization of the space and of operations on the elements in the space based on serial procedures. He suggests that if an operation, such as transposition, forces one to leave the bounded space to the right, one should treat the right hand border as coextensive with the left hand border and simply wrap the pattern around. This has the effect of completely changing the interval and contour relations of the pattern. I would claim that this is the perceptual equivalent of using a two and a half octave instrument (C2 to F4) and treating the C2 as equivalent to F4. Transposing the pattern F3-D4-Bb3-A3-C#4 up a fourth and wrapping the notes above F4 around to C2 would give Bb3-D2-Eb4-D4-C#2 which is clearly different from the original in both interval pattern and contour. The author recognizes that this is an unfounded premise that violates what one perceives. He proceeds nonetheless to base a large portion of his theory of sound color and many of his compositional efforts on this falsely imposed property of the space. It becomes clear through this intellectual exercise that one cannot "invent" a perceptual equivalence that has no psychoacoustic foundation. It also becomes clear that the lack of this property places some rather severe constraints on the possible range of operations along a dimension.
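The pitch analogy in the preceding paragraph can be checked mechanically. The sketch below transposes the pattern up a fourth on a hypothetical instrument spanning C2 to F4, wrapping anything above F4 back around so that F4 coincides with C2, and then compares interval and contour patterns. The MIDI numbering (C4 = 60), the note spellings, and the function names are conventions assumed for the illustration.

```python
NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
# Spellings chosen to match those used in the text.
PC_TO_NAME = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

def to_midi(name):
    """Convert a name such as 'Bb3' or 'C#4' to a MIDI number (C4 = 60)."""
    letter, rest = name[0], name[1:]
    accidental = 0
    while rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    return 12 * (int(rest) + 1) + NOTE_TO_PC[letter] + accidental

def to_name(midi):
    """Convert a MIDI number back to a note name."""
    return f"{PC_TO_NAME[midi % 12]}{midi // 12 - 1}"

LOW, HIGH = to_midi("C2"), to_midi("F4")
SPAN = HIGH - LOW  # 29 semitones: F4 is treated as equivalent to C2

def transpose_with_wrap(notes, semitones):
    """Transpose upward; notes pushed above HIGH re-enter from LOW."""
    result = []
    for note in notes:
        midi = to_midi(note) + semitones
        if midi > HIGH:
            midi -= SPAN
        result.append(to_name(midi))
    return result

def intervals(notes):
    """Signed semitone intervals between successive notes."""
    numbers = [to_midi(n) for n in notes]
    return [b - a for a, b in zip(numbers, numbers[1:])]

def contour(notes):
    """Direction of each step: +1 up, -1 down, 0 repeated."""
    return [(i > 0) - (i < 0) for i in intervals(notes)]

original = ["F3", "D4", "Bb3", "A3", "C#4"]
wrapped = transpose_with_wrap(original, 5)  # up a perfect fourth

print(wrapped)                                # ['Bb3', 'D2', 'Eb4', 'D4', 'C#2']
print(intervals(original), contour(original)) # [9, -4, -1, 4] [1, -1, -1, 1]
print(intervals(wrapped), contour(wrapped))   # [-20, 25, -1, -25] [-1, 1, -1, -1]
```

Only one of the four intervals survives the wrap, and three of the four contour directions are reversed, whereas an ordinary transposition would leave both lists unchanged.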

Balzano (1980), Dowling & Harwood (1986), and Krumhansl (1987) have delineated a number of other properties of the pitch dimension that are helpful in establishing a tonal pitch hierarchy. These criteria are examined in detail with respect to several existing and proposed pitch scale systems in Krumhansl (1987). Focal values are those that occur frequently, that have long durations, and that tend to occupy strong positions in musical phrases. Frequency of occurrence and duration may, in particular, help to establish the system framework when a listener is faced with an unfamiliar musical style. Western listeners appear able to do this with Indian rags though certain subtleties of the Indian system escape them (Castellano, Bharucha & Krumhansl, 1984; see also Kessler, Hansen & Shepard, 1984, for Western and Balinese listeners). The effect of phrase position would depend a great deal on the listener already having acquired some understanding of phrase structure, perhaps from cues such as slowing down and pauses at the end of phrases (cf. Carlson, Friberg, Frydén, Granström & Sundberg, this volume).

It is desirable for listeners to be able to rapidly discern their "position" within the system of pitch relations (Browne, 1981). This position finding may be due to both the asymmetric structure of intervals among the categories in a scale (e.g. in the major diatonic scale there are series of two and three major seconds separated by minor seconds), and to the existence of rare or distinctive intervals in the set (e.g. in the major diatonic scale there exists only one augmented fourth, two minor seconds, and a greater number of other intervals; cf. Butler & Brown, 1984). This property also maximizes the variety of interval sizes in a given scale. This property would distinguish the Western diatonic scales and many Indian thats from equal-tempered pentatonic scales found on Indonesian gamelan and Ugandan harp or from certain equal-tempered heptatonic scales found on Ugandan and Thai xylophones (cf. discussion in Burns & Ward, 1982, pp. 257-258).
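The claim about rare intervals can be made concrete by counting how often each interval class occurs between pairs of scale tones. The sketch below does this for the major diatonic scale in 12-tone equal temperament and for an idealized equal-tempered pentatonic scale (five equal steps per octave). The representation of scales as step indices and the function name are assumptions of the illustration.

```python
from collections import Counter
from itertools import combinations

def interval_class_counts(pitch_classes, modulus=12):
    """Count unordered pairs of scale tones by interval class.

    Index 0 of the result is the smallest interval class (one step of
    the underlying division), the last index is the largest.
    """
    counts = Counter()
    for a, b in combinations(sorted(set(pitch_classes)), 2):
        distance = (b - a) % modulus
        counts[min(distance, modulus - distance)] += 1
    return [counts[ic] for ic in range(1, modulus // 2 + 1)]

MAJOR = [0, 2, 4, 5, 7, 9, 11]          # semitone positions in 12-TET
EQUAL_PENTATONIC = [0, 1, 2, 3, 4]      # all steps of a 5-step octave

major_counts = interval_class_counts(MAJOR)
pentatonic_counts = interval_class_counts(EQUAL_PENTATONIC, modulus=5)

print(major_counts)       # [2, 5, 4, 3, 6, 1]
print(pentatonic_counts)  # [5, 5]
```

In the diatonic case every interval class occurs a different number of times, with the tritone occurring exactly once and the minor second twice. In the equal pentatonic case every interval class is equally common, so no single interval can signal where one is in the scale.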

The last criterion of ordering is the predisposition to certain sequential relations among dimension values (Butler & Brown, 1984; Brown, 1988). This criterion is enhanced by the existence of distinctive intervals. It tries to capture aspects of functional relations that are not merely related to the frequency of occurrence of individual values, but to the frequency of occurrence of pairs or sets of values in a given sequential order, that is, to statistical sequential asymmetries found in a body of music. For example, within the Western tonal hierarchy, the unstable leading-tone tends to resolve to a succeeding tonic. The statistical occurrence of this ordering of the pitches is much greater than the reverse. The learning of these tendencies through extended exposure to a given style of music is partially responsible for the sense of directed motion: a given value implies by anticipation its succession by another, giving rise to the patterns of tension and release, or implication and realization (cf. Narmour, this volume). This functional, sequential aspect of pitch relations has been stressed by Butler & Brown (1984) as a crucial cue in evoking a tonal center and a sense of key. What I would like to point out here is that many of these relations of stability and instability which occur frequently, can also become part of the abstract knowledge of the structure of a musical dimension such as pitch. This is the problem of the interaction between abstract knowledge structures and real-time event structure processing that is discussed by Deutsch (1984) and Bharucha (1984b) in terms of tonal and event hierarchies.
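The notion of a statistical sequential asymmetry can be sketched as a count of ordered pairs of successive values. The melodies below are invented toy data given as scale degrees (1 = tonic, 7 = leading tone), not drawn from any corpus, and the function names are hypothetical.

```python
from collections import Counter

def transition_counts(melodies):
    """Count ordered pairs of successive values across all melodies."""
    counts = Counter()
    for melody in melodies:
        counts.update(zip(melody, melody[1:]))
    return counts

def asymmetry(counts, a, b):
    """Share of a->b among all adjacent a/b pairs in either order.

    0.5 means no preferred order; values near 1.0 mean a usually
    precedes b.
    """
    forward, backward = counts[(a, b)], counts[(b, a)]
    total = forward + backward
    return forward / total if total else 0.5

# Invented toy melodies, for illustration only.
toy_corpus = [
    [1, 2, 3, 2, 7, 1],
    [5, 4, 3, 2, 7, 1],
    [1, 7, 1, 2, 7, 1],
]

counts = transition_counts(toy_corpus)
print(counts[(7, 1)], counts[(1, 7)])  # 4 1
print(asymmetry(counts, 7, 1))         # 0.8
```

A measure of this kind separates the frequency of ordered pairs from the frequency of individual values: degrees 7 and 1 could each be common without the order 7 then 1 being favoured.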

Much psychological research has demonstrated that information is more easily encoded, organized, perceived and remembered when it is hierarchically ordered (Restle, 1970; Deutsch & Feroe, 1981). One might postulate that this is, then, a desirable property of a stimulus system. There are two types of hierarchy that are often referred to here. One is a hierarchy of dominance or stability relations (what Simon, 1962/1982, refers to as a "formal hierarchy") in which some elements dominate other elements and are thus given greater structural prominence. The other type is one where combinations of elements at lower levels give rise to emergent properties at higher levels that are not easily derived from the properties of the individual constituents. Pitch relations in a scale are primarily of the first type, while the relations between pitch, chord, and key are of the second type. A third type, often used in connection with event hierarchies (see next section), is based on the notion of parts within parts. This latter is the main type referred to in Lerdahl & Jackendoff (1983) in their well-formedness rules and structural trees. Hierarchization depends to a large extent on some of the criteria listed previously, such as the existence of reference points or focal values.

The derivation of harmonic function from the tonal organization of pitch is an example of the richness of hierarchization possible in a musical dimension (cf. Krumhansl, 1987; Lerdahl, 1988), though one wonders to what extent this richness might be limited to pitch and duration. The derivation of harmony from scale structure illustrates the fact that certain properties emerge at certain levels in a hierarchical system. A number of criteria for harmonic relations that mirror to some extent those of pitch relations listed above have been proposed by Krumhansl (1987). I won't examine these here, but will summarize her reflections by saying that the relations among the "emergent values" (chord types) at this higher level of the hierarchy are organized in strikingly similar ways to the simpler values (pitches) at the lower level. This coherence between levels of the hierarchy distinguishes the Western tonal pitch system and gives it such a high structuring potential.

The abandonment or weakening by many contemporary composers of the tonal pitch hierarchy, with its incumbent structural economy, and the resistance of some to using any other kind of hierarchy to replace it, might place a greater cognitive burden on the listener. To date little experimentation or psychologically oriented theory has been directed at trying to understand what listeners actually hear and understand in this kind of music, or to understand the extent to which one can learn to process these new combinatorial structures (though see Lerdahl, this volume). The work that has been done suggests that most listeners are more sensitive to contour than to precise interval structure in atonal and serial music (Francès, 1972/1988, chaps. 3, 4; Dowling & Harwood, 1986, chap. 5). Krumhansl, Sandell & Sergeant (1987) produced evidence that both inexperienced listeners and those highly trained in contemporary musical idioms, when confronted with fragments of 12-tone serial music (Schoenberg's Wind Quintet, op. 26, 1924, and String Quartet, no. 4, op. 37, 1936), tend to interpret the pitches according to tonal implications in the fragments. The judgments of trained listeners tended to be negatively correlated with these implications. The authors interpreted these results as indicating that trained musicians hear the tonal implication and then give low ratings to tones that fit with it since such relations are not supposed to be present in this music: a kind of post hoc decision rather than an immediate perceptual experience. At any rate, it should be emphasized that relatively little work has been done on music outside the Western tonal idiom. Nor do we yet have a large population of people who have been as exposed to these new musical organizations as they are to tonal/metric music. Some of the strong pronouncements about the ecological invalidity of contemporary musical idioms are certainly premature, though perhaps composers should also take a stronger interest in the speculations of music psychologists (McAdams, 1988).

I have confined myself primarily to pitch in this section, but it is worthwhile briefly considering other dimensions. Another crucial form-bearing dimension is duration, upon which very elaborate metric and rhythmic systems have been developed. A relatively small number of well-defined relative durations are used. A system of strong and weak beats is often organized hierarchically (and represented mentally as abstract knowledge) to which duration patterns are anchored (cf. Gabrielsson, 1979; Lerdahl & Jackendoff, 1983, chap. 4; Longuet-Higgins & Lee, 1984; Povel & Essens, 1985; Dowling & Harwood, 1986, chap. 7). Some of the reflections on the relations between tonal and event hierarchies suggest that the temporal dimension is crucial for the establishment of relations that are encoded as abstract knowledge on other dimensions.

To my knowledge no systematic experimental or musical research has yet been done on the possibilities of "scale" systems of timbre, or how these might interact with pitch and duration systems. It has been demonstrated that musical instrument timbres can be easily discriminated. Timbre relations have been shown to have similar mental representations across several listeners and can be more or less predicted on the basis of acoustic properties (Grey, 1977; Risset & Wessel, 1982). Listeners can also make consistent judgments of analogous timbral vectors (intervals through more than one dimension, possessing both distance and direction components). This demonstrates that the notion of vector is relevant and that transposed vectors are perceived as being equivalent when distance and direction relations are held constant between the two timbres (Wessel, 1979). The existence of the vector is already an important step toward developing patterns and scales in timbre space. Some preliminary attempts at making hierarchically organized sequences of timbres are encouraging (Lerdahl, 1987). What is not yet known is the extent to which variation along the dimensions of timbre can maintain perceptual invariance in the face of changes along the pitch and duration dimensions, or the extent to which listeners can acquire stable abstract representations of an ordered system of timbre relations.
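The idea of a transposed timbral vector can be sketched geometrically: the vector from one timbre to another is applied at a third timbre, and the timbre lying closest to the resulting point completes the analogy. The two-dimensional coordinates and the labels A to E below are hypothetical placeholders, not measured positions in any published timbre space, and the function names are likewise invented for the illustration.

```python
def timbre_vector(start, end):
    """Vector (distance and direction) from one timbre to another."""
    return tuple(e - s for s, e in zip(start, end))

def transpose_vector(start, vector):
    """Point reached by applying the same vector from a new start."""
    return tuple(s + v for s, v in zip(start, vector))

def nearest_timbre(target, space):
    """Name of the timbre whose coordinates lie closest to target."""
    def squared_distance(name):
        return sum((p - q) ** 2 for p, q in zip(space[name], target))
    return min(space, key=squared_distance)

# Hypothetical coordinates in a two-dimensional timbre space.
space = {
    "A": (0.0, 0.0),
    "B": (2.0, 1.0),
    "C": (1.0, 3.0),
    "D": (3.0, 4.0),
    "E": (4.0, 2.0),
}

vector_ab = timbre_vector(space["A"], space["B"])
ideal_point = transpose_vector(space["C"], vector_ab)

print(vector_ab)                           # (2.0, 1.0)
print(nearest_timbre(ideal_point, space))  # D
```

In this toy layout the pair C to D has the same distance and direction as the pair A to B, which is the condition under which the text reports that transposed vectors are judged equivalent.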

It will also be necessary to try to generalize the use of emergent properties of a hierarchical system (such as the relation between pitch and harmony) to the timbre dimension. A combination of values must possess an emergent property that derives from the group configuration rather than being a new value along the dimension. In the case of pitch, a chord can have the quality of being major or minor, for example, and one can still hear out the individual pitches. The pitches do not fuse into a new pitch which replaces them (in spite of the fact that we are often limited by masking processes in hearing out inner voices). It is frequently the case, as much twentieth-century music testifies, that multiple timbres fuse into a composite timbre, the individual identities being replaced by the newly emerged one. The non-fusion criterion may, in many cases, limit the possibilities of a superordinate system of timbral combinations. But then it may be pushing the rational urge too far to expect all form-bearing dimensions to behave in the same way or to have similar structural properties at all levels of combination.

A lexicon of patterns

Another type of knowledge we might expect experienced listeners to possess, and which would influence the ability of a perceptual dimension to bear form, is that of classes of patterns and forms. There are, in any given culture, certain patterns of pitch, duration, and perhaps timbre relations that occur frequently and in many different specific circumstances. Some music theorists propose a relatively restricted lexicon of pitch patterns that are the "genetic code" from which melodies are built (Narmour, this volume; in press), or that are a kind of archetypal substructure from which melodies are elaborated (Meyer's "melodic process", 1973).

Given that these patterns are sequential objects, it seems that the notion of "event schema" as the representation of a category may be appropriate. These abstract event schemata of stereotypic patterns and forms would correspond to some extent to the notion of "scripts" as developed by Schank & Abelson (1977). These researchers propose that we have abstract scripts for various kinds of macro-events, such as going to a concert or taking a bath. The script involves the main kinds of actions that are necessary such as leaving home, going to the concert hall, buying the ticket, sitting down, listening attentively, and going home. This is evidently very abstract and allows for all kinds of variation in its real-life manifestation. The same would hold for the notion of a melodic process. A "changing-note process", for example, starts at an important pitch (such as a tonic), descends a step below this pitch on the scale, skips up to the pitch above and then comes down to the main pitch again. In its specific manifestation, each of these elements may be elaborated into several bars of music. What Meyer's theory proposes however is that we recognize these processes as basic categories of melodic structure of which there are a very small number. This idea has been given some credence in analyses and experimental studies by Rosner & Meyer (1982, 1986).
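The abstractness of such a schema can be illustrated with a deliberately crude sketch in which the changing-note process is written as a skeleton of scale-step offsets from the main pitch (0, one step below, one step above, 0) and a melody is said to instantiate it if the skeleton occurs in order, with any number of elaborating tones in between. Treating recognition as simple in-order matching is an assumption of this illustration, as are the names and the example melodies. Deciding which tones are structurally important is precisely what the theory has to explain and is not modelled here.

```python
# Skeleton of the changing-note process in scale steps relative to the
# main pitch: main pitch, step below, step above, main pitch.
CHANGING_NOTE = [0, -1, 1, 0]

def contains_schema(tones, schema=CHANGING_NOTE):
    """True if the schema occurs in order within tones (gaps allowed)."""
    remaining = iter(tones)
    # `target in remaining` consumes the iterator up to the first match,
    # so each schema element must be found after the previous one.
    return all(target in remaining for target in schema)

bare = [0, -1, 1, 0]
elaborated = [0, 2, 0, -1, -2, -1, 1, 2, 1, 0]
rising_line = [0, 1, 2, 3, 4]

print(contains_schema(bare))         # True
print(contains_schema(elaborated))   # True
print(contains_schema(rising_line))  # False
```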

This has several implications for form-bearing dimensions. Such patterns, if used extensively in the music of a culture, must be abstracted and generalized through experience by listeners. This means that for a lexicon of stereotypic event schemata to be established in long-term memory, the process of encoding and abstraction must be possible for that dimension. This is an area that certainly deserves more serious consideration and experimentation on several musical dimensions.

Event Structure Processing

Musical events, after passing through elementary grouping processes, are then processed in such a way as to recover aspects of their larger-scale structure. This more time-bound part of structure processing, specific to the information being received, may be contrasted with the abstract knowledge discussed in the previous section. This contrast mirrors that proposed by Bharucha (1984b) and Deutsch (1984) between tonal and event hierarchies. I prefer to use the term "event structure" here, since there are aspects of musical form, represented through event structures, that are not purely hierarchical, such as the relatively little understood associative relations established by similar patterns in different parts of a piece. The area of event structures concerns the processes that underlie the perceptual encoding of musical events and patterns within the context of evoked systems of relations, the perception of invariance and transformation of musical patterns, and the establishment of associative and hierarchical relations across time in the building of a mental representation of a musical form. I will discuss here the problems of pattern encoding and the perception of invariance and transformation.

Encoding values and relations

The kinds of relations and patterns that are encoded include pitch intervals, pitch contours, chord qualities, rhythmic intervals between event onsets, rhythmic contours of sequences of long and short events, and vectors between points in timbre space. The process of encoding the patterns of values and relations among them as they occur in time is not neutral. Perceived relations are constrained by grouping processes, and by expectations about events that are likely to occur. These expectations result from anticipatory schemata representing abstract knowledge acquired through previous experience and which are activated by the incoming events. Such schemata have been shown to facilitate the perception of certain tones over others in the cases both of tonal pitch relations (Bharucha & Stoeckig, 1986) and of rhythmic sequences (Bharucha & Pryor, 1986). This evoking, or activation, of mental representations of tonality and meter may have the effect of orienting perception toward a set of context-constrained alternatives (Bartlett & Dowling, 1988). "Events are thus expected, implied, erroneously judged to have occurred, and rendered more consonant, to the extent that their mental representations have been activated in anticipation of their occurrence" (Bharucha, 1987, p. 3). Following this, one might conjecture that perceptual dimensions for which listeners are capable of acquiring abstract structural knowledge, and which can subsequently be used to elicit more or less strong expectations in listening, would be good candidates for form-bearing dimensions.

An important property of the activation of a structural schema is the "assignment" of relative stability, dominance, or salience relations among the perceived events. The fixing of events within an interpretive framework gives them musical significance and to some extent forms the dynamic musical flow by the anchoring or assimilation of less stable or salient elements to more stable and salient elements. But the temporal order of tones can also strongly influence which ones are interpreted as more stable (and, by implication, what the activated framework is). For example, in the tonally ambiguous sequence B3-C4-D#4-E4-F#4-G4, Bharucha (1984a) found that listeners preferred C major as an accompanying chord over B major, though the pitches of both chords are present in the sequence. Conversely, when the sequence is played in reverse, listeners preferred B major. Bharucha concludes that two constraints are in operation in the anchoring of an unstable tone to a more stable one: the stable tone normally follows the unstable one, and the tones must be either diatonic or chromatic neighbors. Thus, the skips between C-D# and E-F# cause the sequence to be interpreted as three sequential pairs and the stability relations to be established within each pair. This is a point where grouping processes and knowledge structures strongly interact (Deutsch, 1978; Krumhansl, 1979; Bharucha, 1984a,b). One wonders what responses listeners would give if the range of possible accompaniments presented to them were greater, or if the listener had had extensive experience listening to Arabic music, where such a scale pattern is quite common.
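The logic of the two anchoring constraints can be made concrete with a small sketch. It is only a toy illustration under several assumptions that are not part of Bharucha's account: pitches are written as MIDI note numbers (B3 = 59), a "neighbor" is taken to be any step of one or two semitones, and pairs are formed greedily from left to right. The function names are hypothetical.

```python
# Toy illustration of the two anchoring constraints: the stable tone
# follows the unstable one, and the two tones must be neighbors.
# Assumptions (not from the source): MIDI note numbers, a neighbor is a
# step of 1 or 2 semitones, and pairs are formed greedily left to right.

PITCH_CLASS_NAMES = ["C", "C#", "D", "D#", "E", "F",
                     "F#", "G", "G#", "A", "A#", "B"]


def anchor_tones(sequence, max_step=2):
    """Return the tones that serve as anchors under the toy model."""
    anchors = []
    i = 0
    while i < len(sequence) - 1:
        step = abs(sequence[i + 1] - sequence[i])
        if 0 < step <= max_step:
            anchors.append(sequence[i + 1])  # second tone of the pair is the anchor
            i += 2  # both tones of the pair are consumed
        else:
            i += 1  # no neighbor follows, so this tone stays unanchored
    return anchors


def pitch_class_names(midi_notes):
    """Map MIDI note numbers to pitch-class names (sharps only)."""
    return [PITCH_CLASS_NAMES[n % 12] for n in midi_notes]


forward = [59, 60, 63, 64, 66, 67]   # B3-C4-D#4-E4-F#4-G4
backward = list(reversed(forward))

print(pitch_class_names(anchor_tones(forward)))   # ['C', 'E', 'G']
print(pitch_class_names(anchor_tones(backward)))  # ['F#', 'D#', 'B']
```

Under these assumptions the anchors of the forward sequence spell a C major triad and those of the reversed sequence spell a B major triad, which is consistent with the reported preferences. The sketch says nothing about how listeners actually arrive at them.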

Additional constraints on sequence order have been reported by Butler & Brown (1984), who found that the sequential position of the tritone in a melodic pattern affected the degree of activation of a tonal center. This indicates that distinctive intervals in a scale structure are not sufficient in and of themselves to activate the representation of a whole system of relations. The import of the process of anchoring or assimilation is that the possibility of establishing such sequential tendencies is a strong factor contributing to the form-bearing capacity of a dimension.

Memory limits also need to be considered in the encoding of musical material. In order to be easily encoded and then to contribute to the perception of connectedness between groups of musical events separated in time, a musical pattern must satisfy a number of constraints. It should be small enough to fit within the perceptual present (about 2-5 seconds; Fraisse, 1978). It should be unified enough to be grouped into a chunk in short-term memory. (A limit on short-term storage is classically set at about 5-9 chunks in Miller, 1956, though this depends on the nature of the pattern.) Patterns that are organized according to easily discernible rules of construction, such as hierarchical patterns, are generally remembered more easily and can have more elements in short-term memory than patterns organized in other ways (Restle, 1970; Deutsch & Feroe, 1981; Deutsch, 1982). Memorability may also be greater for patterns that have some kind of inherent stability or well-formedness as well as for patterns that are more easily assimilated to an existing knowledge structure. This latter is suggested by work on recognition memory of tonal and atonal sequences (see Francès, 1972/1988; Dowling & Harwood, 1986). The question of "good formation" has not been considered much of late. What "well-formedness" means may well be quite different for different dimensions and it is not clear to what extent it would be independent of cultural convention, and thus of acquired knowledge structures.

Invariance and transformation

Pattern similarity perception may be considered an important basis for musical development, which involves the abstraction of invariances across transformations of musical patterns. The transformation of musical patterns figures among the quasi-universal characteristics of the world's music systems. A strong form-bearing dimension should allow a richness of pattern transformations that are perceived as related to the original material to a greater or lesser degree (Slawson, 1985). This raises the question of what remains the same and what varies when a musical pattern is transformed. Within the dimension of pitch, for example, transposition maintains exact interval pattern and contour, but can change key. Harmonic modulation maintains contour and interval class pattern (allowing for equivalence of major and minor intervals, etc. across keys). Inversion maintains interval size while inverting direction (and thus contour). In Music for Strings, Percussion, and Celeste (1936), Bartók used expansive and compressive transforms that enlarge or reduce all intervals (constrained by some desired pitch set) while maintaining pitch contour. Theme and variations treatment often uses hierarchical elaboration wherein notes of the original melody are ornamented or developed with melodic figures. It is hierarchical in the sense that a "reduction" of the elaborated melody would yield the original.

Perceptual invariance means that certain relations between categories along a stimulus dimension must remain constant after transformation. A perfect fifth (or a whole melody) maintains a relatively constant quality regardless of the register in which it is played (within the range of musical pitch). This means that transposition is an operation that is easily afforded by the pitch dimension. If patterns composed along a dimension are not predisposed to being varied and still being perceived as similar, then the dimension cannot make a strong contribution to musical form (White, 1960). In cases of inversion and expansive transformations, we would intuit that the degree of similarity would be less than in the case of transposition where both contour and interval content were maintained. A simple retrograde transformation, on the other hand, reverses absolute pitch sequence, interval sequence and contour; though the pitch set remains constant, its order is completely changed and people tend not to perceive it as being very similar to the original sequence (Francès, 1972/1988, Exp. 6). From these cases, we might hypothesize that limits on the perceived relatedness of a pattern to its transformed version indicate the limits of viable musical transformations. Krumhansl, Sandell & Sergeant (1987) have shown that listeners are capable of classifying mirror forms related to distinctive original pitch patterns when there are only two sets of them. However, no estimate was made of how similar these forms were perceived as being with respect to their originals.
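What each of these pitch transformations preserves can be checked mechanically. The following is a minimal sketch under stated assumptions: pitches are MIDI note numbers, inversion mirrors the pattern around its first note, and expansion multiplies every interval by an integer factor without the constraint to a desired pitch set mentioned above. The function names and the example theme are hypothetical, and the checks concern formal properties only, not perceived similarity.

```python
# Formal invariants of some pitch transformations (a sketch, not a
# model of perception). Pitches are MIDI note numbers by assumption.

def intervals(pitches):
    """Signed intervals in semitones between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def contour(pitches):
    """Direction of each step: +1 up, -1 down, 0 repeated note."""
    return [(i > 0) - (i < 0) for i in intervals(pitches)]


def transpose(pitches, semitones):
    return [p + semitones for p in pitches]


def invert(pitches):
    """Mirror every pitch around the first note (an assumed axis)."""
    axis = pitches[0]
    return [2 * axis - p for p in pitches]


def expand(pitches, factor):
    """Scale every interval by an integer factor, ignoring any pitch-set constraint."""
    first = pitches[0]
    return [first + factor * (p - first) for p in pitches]


def retrograde(pitches):
    return pitches[::-1]


theme = [60, 62, 64, 60, 67]  # hypothetical example pattern

# Transposition keeps the exact interval pattern (and thus the contour).
print(intervals(transpose(theme, 5)) == intervals(theme))           # True
# Inversion keeps interval sizes but flips every direction.
print(contour(invert(theme)) == [-c for c in contour(theme)])       # True
# Expansion changes interval sizes but keeps the contour.
print(contour(expand(theme, 2)) == contour(theme))                  # True
# Retrograde keeps the pitch set but not the contour of this theme.
print(sorted(retrograde(theme)) == sorted(theme),
      contour(retrograde(theme)) == contour(theme))                 # True False
```

The ordering suggested in the text (transposition closest to the original, retrograde furthest) corresponds here to how many of these formal properties survive. Whether perceived similarity follows the same ordering is an empirical question.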

There would appear to be two basic classes of transformation: linear operations along a given dimension or set of dimensions (such as translation, expansion, rotation, etc.), and structural modifications of the pattern (such as changing a single element, splitting an internal beat in two, hierarchically elaborating a melodic process by developing musical figures around its main notes, or adapting a pattern to a different meter). In the case of linear dimensional operations, the comparison between original and transformed versions would remain at the same structural level. For operations like hierarchical elaboration, the similarity judgment would necessarily be made at appropriate levels of the patterns depending on the degree of elaboration (Deutsch & Feroe, 1981; Rosner & Meyer, 1986).
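The difference between the two classes can be illustrated for hierarchical elaboration, where comparison has to be made at the level of a reduction rather than note against note. The sketch below is only a toy: each note carries a structural level that is assigned by construction (0 for notes of the original melody, 1 for ornaments), whereas a listener would have to infer such levels. The names `elaborate` and `reduce_to_level` and the ornament offsets are hypothetical.

```python
# Toy sketch of hierarchical elaboration and reduction.
# Assumption: each event is a (pitch, level) pair, with level 0 for
# structural notes and level 1 for ornamental notes added around them.

def elaborate(melody, ornament_offsets=(2, 0)):
    """Follow each structural note with ornamental notes at the given offsets."""
    events = []
    for pitch in melody:
        events.append((pitch, 0))  # structural note from the original melody
        events.extend((pitch + offset, 1) for offset in ornament_offsets)
    return events


def reduce_to_level(events, level=0):
    """Keep only the notes at or above the requested structural level."""
    return [pitch for pitch, note_level in events if note_level <= level]


theme = [60, 64, 67]
variation = elaborate(theme)

surface = [pitch for pitch, _ in variation]
print(surface == theme)                     # False: the surfaces differ
print(reduce_to_level(variation) == theme)  # True: the reduction matches
```

A note-by-note comparison of the surfaces fails, while a comparison at the reduced level recovers the original, which is the sense in which the similarity judgment depends on the degree of elaboration.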

It is not only desirable that listeners be able to recognize transformed material as being similar or related, but that they appreciate the nature of the transformation as well. This can contribute to a sense of direction in the musical development. The recognition and comparison of a more or less similar pattern after some kind of transformation at a later point in a piece implies the existence of a mental representation of the original pattern that maintains certain structural properties during transformation. These properties may be perceptible as such: a contour, for example. The existence of such representations suggests limits on the structuring of transformed musical materials. The majority of research in this area has been done in experiments on recognition memory for pitch patterns. Not much work has been done on other stimulus dimensions or on the contribution of perceived similarity to associative structuring. A more extended exploration may eventually open a rich domain of functional replacements for the classical variation process and propose new kinds of musical development (see, for example, Reynolds, 1987). It may well turn out, however, that certain kinds of transformation are limited to specific dimensions.

Conclusion

I have tried to set out in this article a number of ideas about constraints of a psychological nature on perceptual dimensions that are either well-known bearers of form, such as pitch and duration, or serious candidates, such as timbre. These constraints fall into three areas, which are summarized below.

A potential form-bearing dimension should be closely correlated with the sensory dimensions that effect perceptual grouping, whether it be of a simultaneous, sequential or segmentational nature. Current research indicates that the dimensions of timbral brightness, pitch, duration, dynamics and spatial location have this capacity. Future work should examine interactions and competitions among the dimensions with respect to perceptual grouping potential. Another important avenue of research would be the interaction between knowledge structures and grouping processes in order to understand the extent to which these latter, primarily bottom-up processes can be affected by previous knowledge and ongoing expectations.

A form-bearing dimension should be susceptible to being organized into perceptual categories and relations among these categories should be easily encoded. A system of salience or stability relations should be learnable through mere exposure and should affect the perception of patterns along this dimension. Certain recurrent sequential patterns of values should be easily learned as a kind of lexicon of forms. In other words, relations along the dimension must be susceptible to being acquired as abstract knowledge. Experimental research has focussed almost exclusively on pitch in this area though duration is beginning to receive more attention. In pitch, the vast majority of work is confined to Western tonal music, with some notable side trips to India and Indonesia. Future work should look at existing pitch systems of other non-Western cultures and at some of the new approaches to the compositional organization of pitch found in the work of our living composer colleagues. The same could be said of rhythm and meter. To my mind the next most important candidate for exploration and experimentation is timbre. Along which of its dimensions can we perceive, organize and remember musical relations? To what extent can they compete with structures of pitch and duration? It may be that the different dimensions have different general characteristics and that their relative contributions to grouping processes and knowledge structures will be varied.

Finally, there is much work of both theoretical and experimental natures to be done on the contribution of form-bearing dimensions to the building of hierarchical and associative event structures. A serious effort is needed to clarify theoretically the notion of associative structure and to develop experimental methodologies to verify its psychological reality. Other problems would include some of the following. To what extent can different form-bearing dimensions contribute to associative and hierarchical structures? What is the relative contribution of the different dimensions in these structures to the direction of attentional processes and to the development of expectation? How do the different dimensions interact in the accumulation of a musical form when the implications of their individual structures converge on structural coherence or diverge toward structural ambiguity? Perhaps with some clearer ideas of the mental representation and processing of musical structure and their result in an experience of musical form, we can approach some of the more fundamental questions of individual musical experience that have to this point eluded experimental and theoretical efforts.

Acknowledgments

This article has benefitted enormously from insightful discussions with Carol Krumhansl. I would also like to thank Eric Clarke, Fred Lerdahl, Eugene Narmour and an anonymous reviewer for helpful critiques of an earlier version of the manuscript.

References

[Balzano, G.J. (1980)]: The group-theoretic description of twelvefold and microtonal pitch systems, Computer Music Journal, 4(4), 66-84.
[Bartlett, J.C. & Dowling, W.J. (1988)]: Scale structure and similarity of melodies, Music Perception, 5, 285-314.
[Bartók, B. (1936)]: Music for Strings, Percussion, and Celeste, Vienna : Universal Editions/Philharmonia.
[Bharucha, J.J. (1984a)]: Anchoring effects in music: The resolution of dissonance, Cognitive Psychology, 16, 485-518.
[Bharucha, J.J. (1984b)]: Event hierarchies, tonal hierarchies and assimilation: A reply to Deutsch and Dowling, Journal of Experimental Psychology: General, 113, 421-425.
[Bharucha, J.J. (1987)]: Music cognition and perceptual facilitation: A connectionist framework, Music Perception, 5, 1-30.
[Bharucha, J.J. & Pryor, (1986)]: Disrupting the isochrony underlying rhythm: An asymmetry in discrimination, Perception & Psychophysics, 40, 137-141.
[Bharucha, J.J. & Stoeckig, K. (1986)]: Reaction time and musical expectancy: Priming of chords, Journal of Experimental Psychology: Human Perception & Performance, 12, 1-8.
[Boulez, P. (1987)]: Timbre and composition - timbre and language. In "Music and Psychology: A Mutual Regard," S. McAdams (ed.), Contemporary Music Review, 2(1), 161-172.
[Bregman, A.S. (1977)]: Perception and behavior as compositions of ideals, Cognitive Psychology, 9, 250-292.
[Bregman, A.S. (1978)]: The formation of auditory streams. In Attention and Performance VII, J. Requin (ed.), Hillsdale, N.J. : Lawrence Erlbaum Associates.
[Bregman, A.S. (1981)]: Asking the "what for" question in auditory perception. In Perceptual Organization, M. Kubovy & J.R. Pomerantz (eds.), pp. 99-118, Hillsdale, N.J. : Lawrence Erlbaum Associates.
[Bregman, A.S. (in press)]: Auditory Scene Analysis, Cambridge, Mass. : Bradford Books, MIT Press.
[Brown, H. (1988)]: The interplay of set content and temporal context in a functional theory of tonality perception, Music Perception, 5, 219-249.
[Browne, R. (1981)]: Tonal implications of the diatonic set, In Theory Only, 5(6-7), 3-21.
[Burns, E.D. & Ward, W.D. (1982)]: Intervals, scales, and tuning. In The Psychology of Music, D. Deutsch (ed.), pp. 241-269, New York : Academic Press.
[Butler, D. & Brown, H. (1984)]: Tonal structure versus function: Studies of the recognition of harmonic motion, Music Perception, 2, 6-24.
[Castellano, Bharucha, J.J. & Krumhansl, C.L. (1984)]: Tonal hierarchies in the music of North India, Journal of Experimental Psychology: General, 113, 394-412.
[Clarke, E.F. (1987)]: Levels of structure in musical time. In "Music and Psychology: A Mutual Regard," S. McAdams (ed.), Contemporary Music Review, 2(1), 211-238.
[Cross, I., Howell, P. & West, R. (1985)]: Structural relationships in the perception of musical pitch. In Musical Structure and Cognition, P. Howell, I. Cross & R. West (eds.), pp. 121-142.
[Deliège, I. (1987)]: Grouping conditions in listening to music: An approach to Lerdahl and Jackendoff's grouping preference rules, Music Perception, 4, 325-360.
[Deutsch, D. (1978)]: Delayed pitch comparisons and the principle of proximity, Perception & Psychophysics, 23, 227-230.
[Deutsch, D. (1982)]: The processing of pitch combinations. In The Psychology of Music, D. Deutsch (ed.), pp. 271-316, New York : Academic Press.
[Deutsch, D. (1984)]: Two issues concerning tonal hierarchies: Comment on Castellano, Bharucha and Krumhansl, Journal of Experimental Psychology: General, 113, 413-416.
[Deutsch, D. & Feroe, J. (1981)]: The internal representation of pitch sequences in tonal music, Psychological Review, 88, 503-522.
[Dowling, W.J. & Harwood, D.L. (1986)]: Music Cognition, New York : Academic Press.
[Fant, G. (1973)]: Speech Sounds and Features, Cambridge, Mass. : MIT Press.
[Fraisse, P. (1963)]: Psychology of Time, New York : Harper, trans. from Psychologie du temps, Paris : Presses Universitaires de France, 1957.
[Francès, R. (1988)]: The Perception of Music, Hillsdale, N.J. : Lawrence Erlbaum Associates; trans. by W.J. Dowling from La perception de la musique, 2nd ed., Paris : Vrin, 1972.
[Gabrielsson, A. (1979)]: Experimental research on rhythm, The Humanities Association Review, 30(1/2), 69-92.
[Green, D.M. (1976)]: An Introduction to Hearing, Hillsdale, N.J. : Lawrence Erlbaum Associates.
[Grey, J.M. (1977)]: Multidimensional perceptual scaling of musical timbres, Journal of the Acoustical Society of America, 61, 1270-1277.
[Helmholtz, H. von (1885)]: On the Sensations of Tone, republ. 1954, New York : Dover; trans. by A.J. Ellis from Die Lehre von den Tonempfindungen, 4th ed., 1877.
[Jordan, D.S. (1987)]: Influence of the diatonic tonal hierarchy at microtonal intervals, Perception & Psychophysics, 41, 482-488.
[Kessler, E.J., Hansen, C. & Shepard, R.N. (1984)]: Tonal schemata in the perception of music in Bali and in the West, Music Perception, 2, 131-165.
[Krumhansl, C.L. (1979)]: The psychological representation of musical pitch in a tonal context, Cognitive Psychology, 11, 346-374.
[Krumhansl, C.L. (1983)]: Perceptual structures for tonal music, Music Perception, 1, 28-62.
[Krumhansl, C.L. (1987)]: General properties of musical pitch systems: Some psychological considerations. In Harmony and Tonality, J. Sundberg (ed.), pp. 33-52, Stockholm : Royal Swedish Academy of Music, publ. no. 54.
[Krumhansl, C.L., Sandell, G.J. & Sergeant, D.C. (1987)]: The perception of tone hierarchies and mirror forms in twelve-tone serial music, Music Perception, 5, 31-78.
[Lerdahl, F. (1987)]: Timbral hierarchies. In "Music and Psychology: A Mutual Regard," S. McAdams (ed.), Contemporary Music Review, 2(1), 135-160.
[Lerdahl, F. (1988)]: Cognitive constraints on compositional systems. In Generative Processes in Music, J. Sloboda (ed.), Oxford : Oxford University Press.
[Lerdahl, F. & Jackendoff, R. (1983)]: A Generative Theory of Tonal Music, Cambridge, Mass. : MIT Press.
[Longuet-Higgins, H.C. & Lee, C.S. (1984)]: The rhythmic interpretation of monophonic music, Music Perception, 1, 424-441.
[Mathews, M.V. & Pierce, J.R. (1980)]: Harmony and non-harmonic partials, Journal of the Acoustical Society of America, 68, 1252-1257.
[Mathews, M.V., Pierce, J.R. & Roberts, L.A. (1987)]: Harmony and new scales. In Harmony and Tonality, J. Sundberg (ed.), pp. 59-84, Stockholm : Royal Swedish Academy of Music, publ. no. 54.
[McAdams, S. (1984)]: The auditory image: A metaphor for musical and psychological research on auditory organization. In Cognitive Processes in the Perception of Art, W.R. Crozier & A.J. Chapman (eds.), pp. 289-323, Amsterdam : North-Holland.
[McAdams, S. (1987)]: Music: A science of the mind? In "Music and Psychology: A Mutual Regard," S. McAdams (ed.), Contemporary Music Review, 2(1), 1-61.
[McAdams, S. (1988)]: Perception et intuition: Calculs tacites [Perception and intuition: Tacit computations], InHarmoniques, 3, 86-103.
[McAdams, S. & Bregman, A.S. (1979)]: Hearing musical streams, Computer Music Journal, 3(4), 26-43.
[Meyer, L.B. (1973)]: Explaining Music, Berkeley : University of California Press.
[Miller, G.A. (1956)]: The magical number seven, plus or minus two: Some limits on our capacity for processing information, Psychological Review, 63, 81-97.
[Narmour, E. (in press)]: vol. 1 -- The Analysis and Perception of Basic Melodic Structures: The Implication-Realization Model; vol. 2 -- The Analysis and Perception of Melodic Complexity: The Implication-Realization Model,
[Neisser, U. (1976)]: Cognition and Reality, San Francisco : W.H. Freeman.
[Partch, H. (1974)]: Genesis of a Music, 2nd ed., New York : Da Capo Press.
[Povel, D.J. & Essens, P. (1985)]: The perception of temporal patterns, Music Perception, 2, 411-440.
[Restle, F. (1970)]: Theories of serial pattern learning: Structural trees, Psychological Review, 77, 481-495.
[Reynolds, R. (1987)]: A perspective on form and experience. In "Music and Psychology: A Mutual Regard," S. McAdams (ed.), Contemporary Music Review, 2(1), 277-308.
[Risset, J.C. & Wessel, D.L. (1982)]: Exploration of timbre by analysis and synthesis. In The Psychology of Music, D. Deutsch (ed.), pp. 26-58, New York : Academic Press.
[Rosch, E. (1973)]: Natural categories, Cognitive Psychology, 4, 328-350.
[Rosch, E. (1978)]: Principles of categorization. In Cognition and Categorization, E. Rosch & B.B. Lloyd (eds.), pp. 28-71, Hillsdale, N.J. : Lawrence Erlbaum Associates.
[Rosner, B.S. & Meyer, L.B. (1982)]: Melodic processes and the perception of music. In The Psychology of Music, D. Deutsch (ed.), pp. 317-341, New York : Academic Press.
[Rosner, B.S. & Meyer, L.B. (1986)]: The perceptual roles of melodic process, contour, and form, Music Perception, 4, 1-40.
[Schank, R. & Abelson, R.P. (1977)]: Scripts, plans, goals, and understandings: An inquiry into human knowledge structures, Hillsdale, N.J. : Lawrence Erlbaum Associates.
[Schoenberg, A. (1924)]: Wind Quintet, op. 26, Vienna : Universal Editions/Philharmonia.
[Schoenberg, A. (1936)]: String Quartet, no. 4, op. 37, New York : Schirmer.
[Schoenberg, A. (1975)]: Composition with twelve tones: (1) & (2). In Style and Idea, L. Stein (ed.), pp. 214-249, New York : Saint Martin's Press; trans. by L. Black, (1) c. 1941, (2) c. 1948.
[Shepard, R.N. (1984)]: Ecological constraints on internal representation: Resonant kinematics of perceiving, imagining, thinking and dreaming, Psychological Review, 91, 417-447.
[Shepard, R.N. & Jordan, D.S. (1984)]: Auditory illusions demonstrating that tones are assimilated to an internalized musical scale, Science, 226, 1333-1334.
[Simon, H.A. (1962)]: The architecture of complexity, Proceedings of the American Philosophical Society, 106, 467-482, republ. in The Sciences of the Artificial, 2nd ed., Cambridge, Mass. : MIT Press, 1982.
[Slawson, W. (1985)]: Sound Color, Berkeley : University of California Press.
[Wessel, D.L. (1979)]: Timbre space as a musical control structure, Computer Music Journal, 3(2):
[White, B.W. (1960)]: Recognition of distorted melodies, American Journal of Psychology, 73, 100-107.
[Wright, J.K. & Bregman, A.S. (1987)]: Auditory stream segregation and the control of dissonance in polyphonic music. In "Music and Psychology: A Mutual Regard," S. McAdams (ed.), Contemporary Music Review, 2(1), 1-61.