Server © IRCAM - Centre Pompidou 1996-2005. All rights reserved for all countries.
This article is to be published as a chapter of the 4th Volume, Hearing, of the Handbook of Perception (Academic Press). It gives an overview and a substantial bibliography of the following subjects :
Musical Acoustics faces the challenge of dealing scientifically with music, an aesthetic activity which raises philosophical questions and involves a Weltanschauung (cf. Lesche, in Music and Technology, 1971, p. 39-55). Science is not a normative activity like aesthetics : however it can clarify some musical issues and show that they are burdened with many myths. There is a pressing need for a better understanding of musical behavior : it would be valuable to gain insight into such a human experience, deeply involving perceptual and motor abilities, and capable of rousing strong motivations ; moreover such understanding is needed to make good musical use of technological progress and of the new and powerful tools it can provide.
An acoustic signal becomes music only when it is perceived as such, through ear and brain. So hearing and auditory perception are central to Musical Acoustics. However hearing has been studied mostly in terms of reactions to simple stimuli (cf. Handbook of Perception, volume 4, section III) : this is understandable, but musical sounds are not -- and should not be -- simple stimuli, so psychoacoustic results have to be qualified. More important, the stimulus and response situation of many sensory experiments is remote from the experience of music. To obtain significant results on music, one must take into account very complex situations. Must one agree with Eddington that a science with more than seven variables is an art, and conclude that musical acoustics is either irrelevant or unscientific ?
There are some specific research subjects which are relevant to music and which can legitimately be isolated from the intricate context involved in music. Moreover powerful new tools and new methods are available to study more and more complex situations and to isolate what is musically relevant in this complexity. Thanks to recording, evanescent sounds become objects which can be scrutinized at leisure. New equipment, especially the general purpose digital computer, makes it possible to analyse complex data -- psychological judgments as well as sounds -- or to embody elaborate models -- of perception, of musical sounds, of musical composition.
Recent progress in Musical Acoustics is encouraging, although it is slowed by the lack of support ; important results often come as byproducts of research in other fields. So a number of questions stated thirty years ago (R.W. Young, 1944) are still unanswered (cf. also Small and Martin, 1957). This chapter reviews some important topics in Musical Acoustics, but it can merely scratch the surface : the reader looking for a more complete treatment is directed to a number of references.
Perception implies generalization and discrimination. The discrimination power of hearing determines the smallest differences that can be used significantly in music : for instance the steps of a musical scale should not be smaller than the just noticeable pitch difference. Hence the psychoacoustic study of the listener's response to separate parameters of sound is of interest : however it provides only a bounding of the area of musical perception, rather than determining what happens in that area (Poland, 1963). There is evidence that limens, measured in laboratory conditions, are smaller than the differences which can be detected in complex listening tasks (Plomp, 1966, p.19) like attending to music (Jacobs and Wittmann, 1964). Interaction between separate parameters is important in musical perception, which takes place in a complex context (Francès, 1958) (cf. below, II, F).
In Musical Acoustics treatises, one generally considers four parameters -- or attributes -- of auditory events : pitch, duration (related to rhythm), loudness and timbre. Pitch and duration are carefully marked in conventional occidental music notation, which resembles a system of cartesian coordinates (pitch versus time) ; loudness is marked more crudely, and timbre is determined by the musical instruments for which the music is written (e.g. violin) and the way they are played (e.g., pizzicato). In some manuscripts by Bach, the instruments are not even specified and no dynamic marking is provided. Duration and especially pitch are regarded as the most important musical parameters : a review entitled Musical Perception (Ward, 1970) deals almost only with pitch perception. Yet other parameters are also important, especially in certain types of music.
One might object to the choice of the above parameters of sound. As Seashore (1938, p. 16) clearly states, this choice stems from the assumption that the hearing of tones is an imperfect copy of the physical characteristics of the sound : periodic sound waves can be described in terms of frequency, amplitude, duration and form, hence the auditory correlates should be the four characteristics of musical tones. This assumption is disputable : it is now clear that periodic sound is not synonymous with musical tone (cf. below, II, D) ; also, pitch and timbre are correlated (cf. below, II, B, 6). The description of musical sounds in terms of pitch, duration, loudness and timbre nevertheless remains widespread and most useful, although one must remember that these parameters are seldom treated independently in music.
Music calls for appreciation of pitch intervals. Musicians consistently judge that a constant musical interval between two pitches corresponds in a wide range to a constant ratio between the two corresponding frequencies. This seems to be at variance with psychophysical experiments of pitch scaling (Stevens and Volkmann, 1940) : however pitch scaling involves a very special task for Ss. As Ward (1970) strongly states, "it is nonsense to give musical pitch a cavalier dismissal simply because an octave interval has a different size in mels depending on the frequency range".
Precise appreciation of pitch intervals is not possible throughout the entire frequency range of audible sounds. According to Watt (1917, p. 63), the musical range of pitch may be defined as the range within which the octave is perceived. Using this definition, Guttman and Pruzansky (1962) find that the lower limit of musical pitch is around 60 Hz (the frequency of the lowest note of the piano is about 27 Hz) ; similarly there is a break-down in the precision and the consistency of intervallic judgments above 4,000 Hz, which corresponds roughly to the highest note of the piano. This does not mean that frequencies below 60 Hz or above 4,000 Hz are never used in music ! (Cf. Attneave and Olson, 1971). In the range of musical pitch, the octave relationship is very strong -- this will be discussed below (cf. II, B, 6).
The pitch of a sine wave depends upon the intensity (Stevens and Davis, 1952) ; this effect varies with listeners, it is usually quite small (A. Cohen, 1961), and it is weaker for complex tones or for modulated sine tones than for unmodulated sine tones (cf. Seashore, 1938, p. 64) ; thus the effect of intensity on pitch is -- fortunately -- of little significance in music.
Presbycusis -- that is, the normal hearing loss at high frequencies occurring with age -- does not interfere too much with musical pitch discrimination, although it influences judgments on the tone quality of the sounds.
Most music uses scales (Sachs, 1943). A musical scale is a set of tone steps (or notes) selected from the continuum of frequencies and bearing a definite interval relation to each other. There exists a considerable amount of literature on the frequency ratios corresponding to the scales used in music (Ellis, 1885 ; Partch, 1949 ; Barbour, 1953 ; Lloyd and Boyle, 1963 ; Billeter, 1970).
The brief discussion which follows is restricted to occidental music. Pythagoras is credited with the design of a scale in which the frequencies of the tone steps are deduced in turn from the frequency of one of them by multiplications by 3/2 -- corresponding to an interval of a "perfect fifth" (and divisions by 2, to bring the notes back into the proper octave). Aristoxenus is sometimes credited with a slightly different scale (also called just intonation or Zarlino scale) with simple frequency ratios : the notes of the scale are the tones obtained on a string instrument by dividing the string into 2, 3, 4, etc. equal sections, and bringing them back into the proper octave. Early occidental music used so-called diatonic scales comprising seven (unequal) intervals per octave, which were slightly different in the Pythagoras and Aristoxenus designs. However these scales were impractical for "modulation", that is, change of "tonality", tonality meaning here the note chosen as the origin of the successive intervals of the scale : a keyboard instrument would have required many dozens of keys per octave to give the proper intervals in several tonalities. With the increasing use of modulation between the XVIth and XVIIIth centuries, compromises had to be adopted for the tuning of fixed pitch instruments (especially keyboard instruments) ; "equal temperament" is the best known : the octave is divided into 12 equal intervals -- named semitones -- forming a so-called chromatic scale (a semitone thus corresponds to a frequency ratio of $2^{1/12}$, the twelfth root of 2, about 1.0595). From the tones of the chromatic scale, one can select 12 major and 12 minor tempered scales corresponding to different tonalities : expressed in number of semitones, the succession of intervals of the major scales is (2,2,1,2,2,2,1) (Cf. Barbour, 1953 ; Backus, 1969, p. 127).
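The two constructions described above can be compared numerically. The following is a minimal sketch, not taken from the chapter : the function names are hypothetical, the Pythagorean scale is assumed to be built from the chain of fifths F-C-G-D-A-E-B with C as tonic, and interval sizes are expressed in cents (1200 cents per octave, so that a tempered semitone is 100 cents).

```python
from fractions import Fraction
from math import log2


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(float(ratio))


def fold_into_octave(ratio):
    """Multiply or divide by 2 until the ratio lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def pythagorean_major():
    """Seven notes from a chain of fifths (F C G D A E B), tonic C (assumed)."""
    return sorted(fold_into_octave(Fraction(3, 2) ** k) for k in range(-1, 6))


MAJOR_STEPS = (2, 2, 1, 2, 2, 2, 1)  # semitones, as given in the text


def equal_tempered_major():
    """Frequency ratios of the major scale degrees in equal temperament."""
    semitone = 2 ** (1 / 12)
    degrees = [0]
    for step in MAJOR_STEPS[:-1]:  # the last step only closes the octave
        degrees.append(degrees[-1] + step)
    return [semitone ** d for d in degrees]


if __name__ == "__main__":
    # Columns: Pythagorean ratio, its size in cents, tempered size in cents.
    for p, e in zip(pythagorean_major(), equal_tempered_major()):
        print(f"{str(p):>8}  {cents(p):8.2f}  {cents(e):8.2f}")
```

Under these assumptions the Pythagorean major third comes out as 81/64, about 408 cents, against 400 cents for the tempered third, while the fifth differs from the tempered fifth by only about 2 cents.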
Bach's "well-tempered clavier" illustrates the use of all tonalities : however the question whether this "well-tempered clavier" really used the "equally-tempered" tuning is not settled (Billeter, 1970). Equal temperament was widely used in classical, romantic and contemporary occidental music : Schoenberg's 12 tone technique uses the notes of the chromatic scale. Yet several XXth century composers, especially Varèse, were not satisfied with what they considered a somewhat arbitrary and aurally unsatisfactory pitch system (Ouelette, 1968). Special instruments were designed for other scales (Van Esbroeck and Montfort, 1946 ; Partch, 1949). The development of electronic sources of sound (cf. below, V) makes it easier for musicians to escape equal temperament and to choose their own scales freely.
Despite considerable speculation about the justification of the frequency ratios used in various scales (cf. below, II, B, 3), there are few data about tuning, intonation and intervals used in actual musical practice or preferred by listeners. The tuning of pianos is usually "stretched", that is, the high notes are higher and the lower notes are lower than what would correspond to the tempered scale. This can be ascribed partly to the inharmonicity of piano strings (Schuck and Young, 1943), yet stretched tuning seems to take place also in the gongs of the Indonesian Gamelans (Hood, 1966), so it may relate to universal perceptual characteristics such as the stretching of the subjective octave (Sundberg and Lindqvist, 1973). Stretched tuning is preferred over unstretched by listeners (Ward and Martin, 1961). Violinists have been said to follow a pythagorean scale (Greene, 1937 ; Van Esbroeck, 1949), which may result from practice on an instrument tuned by fifths : however the departure from equal temperament is not very significant, except that notes other than the tonic (in tonal music) tend to be played sharp ; also contextual effects in tonal music have a significant influence on intonation -- e.g. the interval C to F sharp (an augmented fourth) is played larger than the interval C to G flat (a diminished fifth), although these intervals are identical in equal temperament (Small, 1936 ; Shackford, 1961, 1962). Extensive listening tests, performed with the help of a specially designed organ, failed to give clear trends about the preferences of listeners for a given type of tuning (Van Esbroeck, 1946) ; well-tempered intervals tend to be preferred in melodies, which may be due to familiarity with the piano, whereas results are unclear in chords (cf. Ward and Martin, 1961). Listeners tend to prefer intonations in which the notes are sharp as compared with a tempered scale (Boomsliter and Creel, 1963). 
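The remark above that C to F sharp and C to G flat coincide in equal temperament but not in performance can be illustrated with a small calculation. This is only a sketch under a stated assumption : it uses Pythagorean tuning, in which F sharp lies six fifths above C and G flat six fifths below, as one possible reference ; the text does not claim that performers actually use these exact ratios.

```python
from fractions import Fraction
from math import log2


def fold_into_octave(ratio):
    """Multiply or divide by 2 until the ratio lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(float(ratio))


# Assumed Pythagorean spellings: six fifths up or down from C.
augmented_fourth = fold_into_octave(Fraction(3, 2) ** 6)    # C to F sharp
diminished_fifth = fold_into_octave(Fraction(3, 2) ** -6)   # C to G flat
tempered_tritone = 2 ** (6 / 12)                            # both, in equal temperament

if __name__ == "__main__":
    print(augmented_fourth, round(cents(augmented_fourth), 2))
    print(diminished_fifth, round(cents(diminished_fifth), 2))
    print(round(cents(tempered_tritone), 2))
```

In this reference tuning the augmented fourth (729/512) is wider than the tempered 600 cents and the diminished fifth (1024/729) is narrower, the two differing by roughly 23 cents, which goes in the same direction as the performance data quoted above.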
In the performance of solo music, wind instruments are played with a stretched frequency scale, which agrees with the fact that the subjective musical octave systematically exceeds a 2 : 1 frequency ratio (Sundberg and Lindqvist, 1973). (Cf. also Fransson et al., 1974 ; Music, Room and Acoustics, 1977).
Non-occidental music uses other scales : Indian Music, for instance, has a larger variety of scales than western music (Ellis, 1885). Many of these scales have five or seven steps per octave : this has been linked to "the magical number seven plus or minus two" which, according to G.A. Miller (1969), measures in "items" (digits, letters, notes) the span of immediate memory. Also the intervals of octave, fifth and fourth seem to be present in many exotic scales (cf. Sachs, 1943).
A consonance is a combination of two or more simultaneous tones which is judged pleasing to listen to.
Pythagoras noticed that consonant intervals like the octave, the fifth, the fourth, correspond to simple frequency ratios like 2/1, 3/2, 4/3 (actually Pythagoras measured ratios of string length on a string instrument, the monochord). Aristoxenus held a more relativist view on consonance. Later, numerological explanations were advocated by Leibniz (who asserted that unconscious counting by the brain was the basis for the feeling of consonance or dissonance : cf. Revesz, 1954, p. 50) ; Euler ; Schopenhauer ; M. Meyer (1898) ("the brain dislikes large numbers"). Some authors still claim that the ear appreciates the numerical complexity of an interval, either melodic -- notes played in succession -- or harmonic -- simultaneous notes (Boomsliter and Creel, 1963) (cf. Cazden, 1959). Attempts have been made to attribute to a given frequency ratio a multiple-valued complexity, in order to explain that the same frequency ratio can be heard as two different intervals (Tanner, in Canac 1959, p. 83-102 ; also Tanner, 1972) (for example, on a piano keyboard, F sharp and G flat correspond to the same note, yet one can mentally switch and hear the interval this note forms with the C below either as diminished (C - G flat) or as augmented (C - F sharp), implying different harmonic resolutions).
Numerological theories of consonance suffer difficulties. Because of the ear's tolerance, intervals corresponding to 3:2 (a simple ratio) and 300 001 / 200 000 (a complex ratio) are not discriminated. Also psychophysiological evidence indicates that numerical ratios should not be taken for granted. The subjective octave corresponds to a frequency ratio a little larger than 2, and reliably different for different individuals (Ward, 1954 ; Sundberg and Lindqvist, 1973) ; this effect is increased by sleep-deprivation (Elfner, 1964).
There are also physical theories of consonance. Helmholtz (1877) links the degree of dissonance to the audibility of beats between the partials of the tones. (Intervals judged consonant like the octave, fifth, fourth, usually evoke no beat -- although the fifth corresponding to frequencies 32 Hz and 48 Hz does evoke an audible beat). This theory is hardly tenable, because the pattern of beats, for a given interval, depends very much on the placement of the interval within the audible frequency range. Recent observations (Plomp, 1966) suggest an improved physical explanation of consonance : listeners find that the dissonance of a pair of pure tones is maximum when the tones are about a quarter of a critical bandwidth apart ; the tones are judged consonant when they are more than one critical bandwidth apart. Based on this premise, Pierce (1966) (cf. von Foerster and Beauchamp, 1969, pp. 129-132) has used tones made up of non-harmonic partials, so that the ratios of fundamentals leading to consonance are not the conventional ones ; Kameoka et al. (1969) have developed an involved method to calculate the magnitude of dissonance by taking into account the contribution of separate frequency components.
Whereas the explanation put forth by Plomp can be useful to evaluate the "consonance", "smoothness", or "roughness", of a combination of tones, it is certainly insufficient to account for musical consonance. In a laboratory study, Van de Geer et al. (1962) found that intervals judged the most consonant by laymen do not correspond to the ones usually termed consonant. This result is elaborated by recent work by Fyda and Wessel (1977). The term consonance seems ambiguous, since it refers at the same time to an elemental level, where "smoothness" and "roughness" are evaluated, and to a higher aesthetic level, where consonance can be functional in a given style. The two levels are related in a culture-bound fashion. In music, one does not judge only the consonance of isolated tones : as Cazden (1945) states, "context is the determining factor.(...) The resolution of intervals does not have a natural basis ; it is a common response acquired by all individuals within a culture area" (cf. also Lundin, 1947). Musical consonance is relative to a musical style (Guernsey, 1928) : ninth chords, dissonant in Mozart's music, are treated as consonant by Debussy (Chailley, 1951 ; Cazden, 1962, 1968, 1972). The cultural and contextual aspects of musical consonance are so important that, despite nativists' claims to the contrary, purely mathematical and/or physical explanations can only be part of the story (cf. Costère, 1962).
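The size of the difference between the two ratios quoted above is easy to work out. The sketch below is purely illustrative : the choice of 440 Hz for the lower tone and the use of cents (1200 per octave) as the unit are assumptions made here, not values from the text.

```python
from math import log2


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)


simple_fifth = 3 / 2
complex_fifth = 300001 / 200000

# Difference between the two intervals, in cents.
difference_cents = cents(complex_fifth) - cents(simple_fifth)

# Difference in Hz of the upper tone, for an assumed lower tone of 440 Hz.
lower_hz = 440.0
difference_hz = lower_hz * (complex_fifth - simple_fifth)

if __name__ == "__main__":
    print(f"{difference_cents:.5f} cents, {difference_hz:.5f} Hz")
```

The two intervals differ by less than one hundredth of a cent, a few thousandths of a hertz at this pitch, which is consistent with the statement that they are not discriminated.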
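The two facts reported from Plomp (maximum dissonance near a quarter of a critical bandwidth, consonance beyond one critical bandwidth) can be turned into a toy model. Everything else in the sketch below is an assumption introduced for illustration : the critical bandwidth is approximated with the analytic expression of Zwicker and Terhardt (1980), which is not part of this chapter, and the triangular shape of the curve is hypothetical, chosen only because it is the simplest shape honoring both facts.

```python
def critical_bandwidth(f_hz):
    """Approximate critical bandwidth in Hz (Zwicker-Terhardt fit, an assumption)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


def pure_tone_dissonance(f1_hz, f2_hz):
    """Toy dissonance of two pure tones, between 0 (consonant) and 1 (maximum).

    Hypothetical triangular curve: 0 at unison, 1 at a quarter of a critical
    bandwidth, back to 0 at one critical bandwidth and beyond.
    """
    centre = (f1_hz + f2_hz) / 2.0
    x = abs(f2_hz - f1_hz) / critical_bandwidth(centre)
    if x >= 1.0:
        return 0.0
    if x <= 0.25:
        return x / 0.25
    return (1.0 - x) / 0.75


if __name__ == "__main__":
    # The same interval (a fifth, ratio 3:2) at two places in the frequency range.
    print(pure_tone_dissonance(440.0, 660.0))
    print(pure_tone_dissonance(32.0, 48.0))
```

In this toy model a fifth between pure tones at 440 and 660 Hz is fully consonant, whereas the fifth at 32 and 48 Hz falls well inside one critical bandwidth and is rated rough, in line with the observation above that the low fifth evokes an audible beat. The model ignores partials, which a treatment of complex tones such as that of Kameoka et al. would have to include.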
Relative pitch, akin to intervallic sense (Revesz, 1954), is a basic ingredient of musical aptitude (cf. Lundin, 1967). Given a pitch reference (e.g. the A of a tuning fork or a pitchpipe), an individual with good relative pitch will be able to sing or identify any note of the scale. As Ward (1970) puts it, the Western musician "apparently has an internal scale of relative pitch, a movable conceptual grid or template so to speak, that specifies the pitch relations among the notes of our Western scale".
Relative pitch is an acquired ability. Apparently the child first appreciates the general contour of melodies -- only in a later stage of development can he appreciate the intervals accurately, granted sufficient exposure to music (Brehmer, 1925 ; Francès, 1958 ; Zenatti, 1969). Intervallic sense is not immediate but abstracted from familiarity with melodies (Teplov, 1966). Experiments (Dowling and Fujitani, 1971) confirm that subjects have good longterm memory for exact interval sizes in the context of familiar tunes, and establish (Dowling, 1971) that interval size and melodic contour are handled differently by the listener. A recent experiment (Bever and Chiarello, 1974) indicates that the perceptual processing of melodies can switch from the right to the left hemisphere of the brain while children are learning music, which confirms the acquisition of a more analytic ability. A melody is not perceived as a sequence of isolated tones, but as a pattern of grouped sounds ; perception involves grouping of tones in streams and segregation of different streams : for example, a simple monodic instrument, rapidly alternating high and low tones, can be heard as playing two melodies (Bregman and Campbell, 1971) (cf. Pikler, 1966a ; van Noorden, 1975 ; Warren, 1972). Similar processes, as well as interaction between pitch and localization, probably intervene in Deutsch's striking musical illusions, obtained by presenting certain sequences of tones to both ears (Deutsch, 1975). For example, when a sequence consisting of a high tone (800 Hz) alternating with a low tone (400 Hz) is presented so that when one ear receives the high tone, the other ear receives the low tone, and vice versa, most right-handed subjects hear a high tone in the right ear alternating with a low tone in the left ear -- this is not changed by reversing the earphones.
Active processes play a part in melodic perception. As Creel et al. (1970) say, "the singer sings a note flat -- flatter than what ? than our internal pattern of expectations" (cf. also Boomsliter and Creel, 1963). The constructive aspect of perception has been stressed in other fields (Neisser, 1967) ; according to Licklider (1967), "one is proposing and testing hypotheses, actually hearing, not the sensory data, but the winning hypothesis". Genetic and cultural factors favor the supply of certain types of hypotheses (cf. Francès, 1958 ; Risset, 1968, p. 102) : this can explain the reference grid or template, constituted by familiarity with a cultural pitch system. Such a grid makes pitch perception more consistent, systematic and easier to remember, but also particularized and biased. Large frequency deviations in artistic singing are not perceived as such : the pitches heard correspond to the conventional scale (cf. Seashore, pp. 256-272). Different musical civilizations use different scales, but intervals are "naturalized" (Francès, 1958, pp. 47-49) : a western musician will interpret intervals of an Eastern scale in terms of the Western scale template ; for instance, listening to two different steps, he may assimilate the steps to his template, and say : this is a low F, this is a sharp F. This categorization into scale steps is analogous to phoneme perception. Apparently only children can efficiently accommodate (cf. Piaget, 1969) the template to different scales. There is some evidence that the development of pitch sense can be inhibited at an early age by culture-bound conditions (W.P. Tanner and Rivette, 1964). Clearly relative pitch perception is not naive : it involves unconscious abstraction, influenced by the cultural history of the listener (Francès, 1958 ; Allen, 1967).
A scale, defined as a succession of intervals, must be anchored on a note of fixed frequency, called standard of pitch, or diapason. It is usual to take the frequency of A above middle C as the standard of reference (Backus, 1969).
As can be inferred from the tuning of organs, three centuries ago, the frequency of A above middle C varied between some 375 and 500 Hz (Ellis, 1885). Later the latitude was reduced, and an official standard pitch was adopted first at A = 435 Hz, then at A = 440 Hz. However there are many complaints that the standard of pitch actually used is rising, which imposes difficulties on musicians -- especially singers -- and on instrument-makers. Responsibility for this rise is often ascribed to instrumentalists and instrument-makers who want a more brilliant sound, but other reasons intervene. The pitch given by a pipe depends upon the temperature, and counter-adjustments in pipe length affect the pitches given by the holes of the instrument ; a wind instrument player can compensate for this only to a small extent, at the expense of ease of playing and quality of intonation. Hence the standard of pitch should be prescribed at a specified temperature (Young and Webster, 1958). Yet the intonation of different instruments does not depend on temperature in the same way. Recent measurements performed at the Opera of Paris (Leipp, 1971, p. 132 ; pp. 328-329) indicate that many factors affect the orchestral pitch, including the tempo and the excitement of the performance.
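The figures quoted above can be put on a common musical scale by converting them to cents (1200 cents per octave). The sketch below is illustrative only : the function names are hypothetical, and the piano frequencies assume an 88-key instrument tuned in strict, unstretched equal temperament on A = 440 Hz, which real pianos depart from, as noted earlier in the chapter.

```python
from math import log2


def cents_between(f_low_hz, f_high_hz):
    """Interval between two frequencies in cents (1200 cents per octave)."""
    return 1200 * log2(f_high_hz / f_low_hz)


def note_frequency(semitones_from_a, a_hz=440.0):
    """Equal-tempered frequency of the note lying a given number of
    semitones above (positive) or below (negative) the A above middle C."""
    return a_hz * 2 ** (semitones_from_a / 12)


if __name__ == "__main__":
    # Change of official standard, and the historical spread of organ pitch.
    print(round(cents_between(435.0, 440.0), 1))
    print(round(cents_between(375.0, 500.0), 1))
    # Lowest and highest notes of an 88-key piano (assumed): 48 semitones
    # below and 39 semitones above the reference A.
    print(round(note_frequency(-48), 2), round(note_frequency(39), 2))
```

On these assumptions the move from 435 to 440 Hz amounts to about 20 cents, a fifth of a tempered semitone, whereas the historical range of 375 to 500 Hz spans about 498 cents, close to a perfect fourth. The computed piano extremes, 27.5 Hz and about 4186 Hz, agree with the approximate values of 27 Hz and 4,000 Hz mentioned earlier.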
Good musicians possess good relative pitch. But an aura of mystery surrounds "perfect pitch", a rare ability even among musicians. The choir leader gifted with perfect pitch does not need a pitchpipe or a tuning fork to give the reference tone A to his chorus : somehow he has an internal reference of standard pitch. The name perfect pitch is favored by musicians ; many scientists prefer the term Absolute Pitch (A.P.), since it refers to identification of tones on an absolute basis.
There has been much controversy on A.P. : is it a specific ability ? is it an inborn, hereditary faculty, or can it be learned ?
Neu (1947) claims that A.P. is nothing more than a fine degree of accuracy of pitch discrimination. This view seems untenable : Oakes (1955) has shown that pitch naming ability and differential pitch sensitivity are independent. Also the absolute note recognition seems to involve a different process : Ss with A.P. have much faster response (Baird, 1917 ; Gough, 1922). According to Revesz (1954) or Bachem (1950), there is a clearcut difference between Ss with genuine A.P. and Ss with "regional pitch", who simply narrow the interval of errors. Other experimenters (Oakes, 1955) contend there is a continuous distribution of Ss in A.P. test performance (cf. Ward, 1963).
While Seashore (1938), Revesz (1954), Bachem (1950) contend that A.P. is hereditary, Neu (1947) denies that A.P. is inborn, and holds that it is a product of environment. Pitch discrimination can indeed improve with practice (Lundin, 1967). Yet the results of attempts by adults to acquire A.P. by training have been poor : although their performance is improved, it is far from that of a recognized possessor of A.P. (M. Meyer, 1899, 1956 ; Lundin and Allen, 1962 ; Cuddy, 1968 ; Brady, 1970). According to Copp (1916), pitch-naming is easy to develop in children -- early years are critical. This suggests that A.P. can be acquired at some critical stage of development -- a process similar to imprinting (Jeffress, 1962). The nature-nurture argument is hard to settle : possessors of A.P. tend to cluster in families, but this does not prove that A.P. is hereditary, since it is in these families that A.P. is more likely to be recognized, valued and fostered by musical exposure. Clearly, a child can learn to name notes only if he is told their names, which will not occur in every environment. In a most commendable review on A.P., Ward (1963) advocates a viewpoint put forth by Watt (1917) : A.P. is perhaps an inborn ability in all of us, which may be "trained out of us in early life" (Ward, 1970) -- it is normally not reinforced but rather inhibited by the generalization involved in tasks such as recognition of transposed melodies. There is some indication that A.P. is related to a special ability for memory retrieval rather than storage.
Although A.P. is exceptional, absolute identification of vowels, using timbre cues, is commonplace. It has been suggested that A.P. is mediated by a proficient use of timbre cues : the performance of Ss in A.P. tests often depends highly on the musical instrument on which the pitches are produced (Baird, 1917). On the other hand, Brentano (1907), Revesz (1954), Bachem (1950) claim that A.P. involves recognition of a certain quality, called tone chroma or tonality, which endows all Cs with a certain Cness, all Ds with a certain Dness, regardless of the octave placement and timbre (hence octave errors are not surprising). This implies that pitch is a compound attribute. Other authors have supported a two-component theory of pitch (Stumpf, 1890 ; M. Meyer, 1904 ; W. Köhler, 1915 ; cf. Teplov, 1966) : not all theories are identical. According to Revesz (1954) the pitch continuum can be represented as a helix drawn on a cylinder (cf. Pikler, 1966b) : the vertical coordinate, along the axis, corresponds to an attribute named tone height, which varies monotonically with the frequency ; when the frequency is increased, the other attribute, the chroma, represented by the position around a circular section, varies cyclically : a rise of an octave corresponds to one turn around the circle. Such a view was often not taken seriously -- until Shepard (1964) succeeded in demonstrating circularity of relative pitch, using specially contrived computer-synthesized sounds comprising only octave components : Shepard synthesized twelve tones, forming a chromatic scale which seems to go up endlessly when the twelve tones are repeated. One can go further in contriving the stimuli to divorce chroma and tone height : sounds have been generated which go down the scale while becoming shriller ; different listeners can make different pitch judgments about such stimuli, depending on whether they give more weight to one aspect or the other : e.g. stimuli made with stretched octaves are heard by most listeners to go down in pitch when their frequencies are doubled, because chroma cues dominate the conflicting tone height cue (Risset, 1971). Moreover tone chroma (like speech) appears to be processed by the left brain hemisphere, whereas tone height is processed by the right hemisphere (Charbonneau and Risset, 1975). The basis for chroma is the strength of the octave relation : octave similarity is perceived by all subjects, although more strongly by trained musicians (Allen, 1967), and there are striking indications that tonal pitch memory is organized in a way that allows disruptions to generalize across octaves (Deutsch, 1973). The above mentioned demonstrations involved artificial stimuli with only octave components, which creates special situations ; the octave is not always "the Alpha and the Omega of... the musical ear" (Revesz, 1954, p. 61) : tune recognition is very much disturbed by playing the different notes at different octaves (Deutsch, 1972 ; cf. also Bregman and Campbell, 1971). Yet pitch perception certainly seems to involve both a focalized aspect (associated with the periodicity of the waveform) and a distributed aspect (associated with the spectral distribution and closely related to timbre) (Watt, 1917 ; Licklider, 1956). Only the focalized aspect lends itself to precise pitch-matching (Köhler, 1915). Although the two aspects are not normally divorced, in some cases the cues for the two aspects are contradictory : it is then clear that Ss rely mostly on one or the other (Risset, 1971) ; for instance many non-trained subjects do not immediately perceive the low pitch of the residue (cf. Plomp, 1966), but rather a high "formant pitch" (Meyer-Eppler, in Die Reihe, 1958, p. 60). To some extent, one can learn to improve pitch-naming from cues of the distributed aspect (Köhler, 1915 ; Gough, 1922), but, like intervallic sense, characteristic cases of A.P. seem to involve a special ability to deal with the focalized aspect, regardless of timbre cues (Bachem, 1950).
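The construction of sounds "comprising only octave components" can be sketched in a few lines. This is a hedged illustration of the principle rather than a reproduction of Shepard's stimuli : the number of components, the lowest frequency of 20 Hz, and the raised-cosine amplitude envelope over log-frequency are all assumptions made here, and the function names are hypothetical.

```python
import math


def shepard_components(step, n_octaves=10, f_min=20.0):
    """Return (frequency in Hz, amplitude) pairs for one tone of the scale.

    step      -- position in the chromatic scale, in semitones (0 to 12)
    n_octaves -- number of octave-spaced components (assumed value)
    f_min     -- frequency of the lowest component at step 0 (assumed value)

    The amplitude envelope is fixed on the log-frequency axis (assumed
    raised cosine), so components fade in at the bottom and out at the top.
    """
    components = []
    for k in range(n_octaves):
        position = (k + step / 12) / n_octaves          # 0 at bottom, 1 at top
        frequency = f_min * 2 ** (k + step / 12)
        amplitude = 0.5 * (1 - math.cos(2 * math.pi * position))
        components.append((frequency, amplitude))
    return components


def audible(components, floor=1e-9):
    """Keep only components whose amplitude exceeds a small floor."""
    return [(round(f, 6), round(a, 6)) for f, a in components if a > floor]


if __name__ == "__main__":
    # After twelve semitone steps the audible spectrum is back where it started.
    print(audible(shepard_components(12)) == audible(shepard_components(0)))
```

Each semitone step raises every component, so successive tones are heard as rising, yet after twelve steps the set of audible components is the same as at the start : chroma has gone once around the circle while the overall spectral placement, and hence tone height, has not changed.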
A.P. can help the intonation of singers, specially in atonal music ; it is also useful to the conductor. It can have drawbacks : possessors of A.P. can be annoyed by changes in the pitch standard. Apparently a number of great composers did not have A.P., among them Schumann and Wagner, although such data are hard to check (Teplov, 1966, p. 194). Many points on A. P. are not settled ; cross-cultural data might be enlightening. Passion pervades the issues of A.P. (Ward, 1963) this shows even in the expression "perfect pitch".
The music listener does not, in most cases, attend to durations of isolated events. The rhythmical organization of music often involves a metric grid somewhat similar to the pitch scale, which helps judging duration relationships in a precise and consistent way Ss which have much difficulty to halve a duration can solve complex rhythmic problems in the context of music (cf. Meumann, 1894, and Teplov, 1966, pp. 358-376) ; listeners have an amazing inability to detect temporal order -- except in the highly structured patterns of speech and music (Warren et al. 1969 ; cf. also Bregman and Campbell, 1971). Like in pitch, there are intended deviations from a mathematically accurate division of time. Rhythming organization and "chunking" of the musical flux seems critical to its memorization (Dowling, 1973). For more information on rhythm, meter and duration, cf. Cooper and Meyer, 1960 also, Stetson, 1905 ; Ruckmick, 1918 ; Jacques-Dalcroze, 1921 ; Farnsworth, 1934 ; Sachs, 1953 ; Fraisse, 1956 ; Teplov, 1966 ; Fraisse et al, 1953 ; Simon, 1972 ; Restle, 1970 ; Jones, 1976. To do justice to musical rhythm, one should be careful to keep in mind its relation to the other aspects of the music (for instance the harmonic aspect for tonal western music).
Timbre, the attribute of a tone that can distinguish it from other tones of the same pitch and loudness, is also called tone color or tone quality. For periodic tones, timbre has been ascribed to the form of the wave, and subsequently to the harmonic spectrum, assuming that the ear is insensitive to phase relations between harmonics (cf. Helmholtz, 1877 ; Olson, 1967, p. 206, and Backus, 1969, pp. 94-100). Actually phase changes can alter the timbre in certain conditions, yet their aural effect on periodic tones is very weak, especially in a normally reverberant room where phase relations are smeared (Wightman and Green, 1974 ; Schroeder, 1975 ; Cabot et al., 1976).
Even a sine wave changes quality from the low to the high end of the musical range (Stumpf, 1890 ; Köhler, 1915). In order to keep the timbre of a periodic tone approximately invariant when the frequency is changed, should the spectrum be transposed so as to keep the same amplitude ratios between the harmonics, or should the spectral envelope be kept invariant ? This question raised a debate between Helmholtz and Hermann (cf. Winckel, 1967, p. 13). In speech, a vowel corresponds approximately to a specific formant structure (a formant is a peak of the spectral envelope, often associated with a resonance in the sound-producing system). One may say that a fixed formant structure gives a timbre varying less with frequency than a fixed spectrum (Slawson, 1968). To some extent, one can correlate a particular spectrum and the resulting timbre (Stumpf, 1890 ; Schaeffer, 1952). The concept of critical bandwidth, linked to the spectral resolution of the ear (Plomp, 1966), may permit a better understanding of this correlation. In particular, if many high order harmonics lie close together, the sound becomes very harsh : hence, for instance, antiresonances in the frequency response of the string instruments play an important part in ensuring acceptable timbres.
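The two alternatives in the Helmholtz-Hermann debate can be illustrated by a minimal sketch. Everything numeric in it is an assumption made for illustration only : the single Gaussian "formant" centred at 1000 Hz, its 400 Hz width and the function names are hypothetical and are not taken from the works cited above.

```python
import math


def formant_envelope(freq_hz, center_hz=1000.0, bandwidth_hz=400.0):
    """Hypothetical single-formant spectral envelope (Gaussian bump, linear amplitude)."""
    return math.exp(-0.5 * ((freq_hz - center_hz) / bandwidth_hz) ** 2)


def fixed_spectrum(f0, ratios):
    """Transposed spectrum: harmonic n always keeps the same relative amplitude."""
    return [(n * f0, a) for n, a in enumerate(ratios, start=1)]


def fixed_formant(f0, n_harmonics):
    """Fixed envelope: each harmonic takes the envelope value at its own frequency."""
    return [(n * f0, formant_envelope(n * f0)) for n in range(1, n_harmonics + 1)]


def strongest_partial_hz(spectrum):
    """Frequency of the partial with the largest amplitude."""
    return max(spectrum, key=lambda partial: partial[1])[0]


# Amplitude ratios "measured" on a 200 Hz tone shaped by the assumed envelope.
ratios = [amp for _, amp in fixed_formant(200.0, 10)]

for f0 in (200.0, 250.0):
    print(f0,
          strongest_partial_hz(fixed_spectrum(f0, ratios)),
          strongest_partial_hz(fixed_formant(f0, 10)))
```

With the fixed spectrum the strongest partial moves up with the fundamental ; with the fixed envelope it stays near the assumed formant frequency, which is the sense in which a fixed formant structure varies less with frequency.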
So, for periodic tones, the timbre depends upon the spectrum. It has long been thought that musical tones were periodic (cf. below, III, B) : however manipulations of sound, including simple tape reversal (Stumpf, 1926 ; Schaeffer, 1952) and electronic and computer sound synthesis (Risset and Mathews, 1969 ; Chowning, 1972), show that strictly periodic tones are dull, and that timbre is very sensitive to temporal factors. It is not surprising that hearing does not rely only on the exact structure of the spectrum to evaluate timbre and identify the origin of the sound, since this structure is severely distorted during the propagation of the sound (cf. Wente, 1935 ; R.W. Young, 1957) : timbre perception involves a pattern recognition process, which resorts to factors resistant to distortion. Specific variations of parameters throughout the sound are often not perceived as such, but rather as characteristic tone qualities (cf. below, III, B) ; this is especially true for rapid variations during the attack (Stumpf, 1926), but also for the slow frequency modulation known as vibrato, whose rate is around 6 or 7 Hz (cf. Seashore, 1938, p. 256). Departures from regular periodicity can make the sound livelier (cf. Boomsliter and Creel, 1970) ; many musical sounds are inharmonic, i.e., the partials' frequencies are not exact multiples of the fundamental frequency : this contributes to subjective warmth (Fletcher et al., 1962). The presence of many simultaneous players or singers affects the tone quality as well as the intensity : this so-called chorus effect leads to the widening of spectral lines ; in fact, there is no clear-cut barrier between musical tones and noise (cf. Winckel, 1967). The acoustics of the listening room can exercise an important influence on the tone quality (cf. Handbook of Perception, IV, Ch. 16 ; Schroeder, 1966). As for pitch, timbre perception depends upon the internal timbre references of the listener, which are related to the sounds (e.g. to the musical instruments) he is used to hearing and identifying. Abilities in musical instrument recognition vary widely among individuals.
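As an illustration of the slow frequency modulation mentioned above, the following sketch generates a sine tone with vibrato. The 6.5 Hz rate is chosen inside the 6 to 7 Hz range quoted in the text ; the 440 Hz centre frequency, the 4 Hz depth, the sample rate and the function name are assumptions made for the example.

```python
import math


def vibrato_tone(f0=440.0, rate_hz=6.5, depth_hz=4.0,
                 duration_s=1.0, sample_rate=8000):
    """Return (samples, instantaneous frequencies) of a sine tone with vibrato.

    The phase is accumulated sample by sample so that the instantaneous
    frequency really is f0 + depth_hz * sin(2*pi*rate_hz*t).
    """
    samples, freqs = [], []
    phase = 0.0
    for i in range(int(duration_s * sample_rate)):
        t = i / sample_rate
        f_inst = f0 + depth_hz * math.sin(2.0 * math.pi * rate_hz * t)
        phase += 2.0 * math.pi * f_inst / sample_rate
        samples.append(math.sin(phase))
        freqs.append(f_inst)
    return samples, freqs


samples, freqs = vibrato_tone()
print(len(samples), min(freqs), max(freqs))
```

Setting depth_hz to zero gives the strictly periodic tone that the text describes as dull, which makes the two cases easy to compare by ear once the samples are written to a sound file.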
A single spectrum is thus inadequate to characterize the timbre of an arbitrary sound. Useful representations of sounds are provided by the sound spectrograph (Potter et al., 1947), which displays the temporal evolution of the spectrum in a crude but often meaningful way (cf. Leipp, 1971). Other modalities for analysis -- especially for a running analysis -- are helpful for the investigation of timbre (Moorer, 1977). The inspection of analysis results reveals physical features that may affect timbre ; however this enumeration of features remains speculative until their aural significance has been ascertained by synthesis (Risset and Mathews, 1969).
Multidimensional scaling promises to clarify the concept of timbre, by providing geometrical models of subjective timbral space which reveal the principal dimensions of timbre differentiation (Wessel, 1973 ; Miller et al., 1975 ; Grey, 1977). The initial results tend to support the hypothesis (Huggins, 1952) that in order to identify the source of a sound, hearing attempts to separate in the sound the temporal properties, attributable to the excitation, and the structural properties, unchanged with time and related to a response in the sound source.
Further details will be given below (cf. III, B and V, C) : understanding of the physical correlates of timbre is of great relevance to the production of better instruments and better synthetic sounds ; timbral maps may help explore and organize new sound material (Ehresman and Wessel, 1977 ; Wessel and Smith, 1977).
Tests have been developed to evaluate musical aptitude. Such tests embody the conceptions of their authors on the measurement of musical aptitude and the factors of musical talent : these conceptions are hard to evaluate objectively. But the tests can be appraised in terms of reliability -- reproducibility of the results -- and validity -- does the test measure what it claims to measure ? It is straightforward -- although tedious -- to determine the reliability ; the validity can be evaluated by comparing the results with other appraisals of the subjects' musical aptitude, or by matching the results with those of an assumed valid test, or by careful study of the test's procedures. Other criteria can be of importance : ease of administration ; objectivity (the person scoring the test should not influence the results) ; economy (in terms of time and money) ; standardization.
One can find an appraisal of some tests in Lundin (1967) and Lehman (1968). In general, musical tests leave much to be desired in reliability and relevance. Their indications may help to make a better than random evaluation, but the appraisals they warrant should not be taken for granted. Tests reflect the conceptions of their authors, and they should not be regarded as absolute, especially tests of appreciation. The factors measured may not be the important ones in all conditions. There is still much research to be done on the relevance of the testing procedures.
One can already infer from the previous sections that the perception of music involves a wealth of natural and cultural factors, and that it depends in an intricate way on the context and on the listener's attitude (cf. Frances, 1958 ; Poland, 1963). Several volumes could not exhaust this topic : this section only gives some leads which may help in exploring the tangle of subjective musical preference.
Strong hopes have been placed in Information Theory (cf. Shannon, 1948 ; Cherry, 1961 ; Pierce, 1962) as a framework suitable to study musical perception. Information theory provides a quantitative definition of "information", related to the complexity and the unpredictability of the message ; the theory has been successful in determining the maximum information that could be conveyed on communication channels, and also in predicting that proper "redundancy" (related to internal structure lowering the information rate) made it possible to code messages so as to protect them against the detrimental effects of noisy channels. The language of information theory helps to clarify some concepts relevant to musical communication (Meyer-Eppler, 1952, 1965 ; Moles, 1966 ; L. B. Meyer, 1967 ; J. E. Cohen, 1962 ; Le Caine, 1956 ; Hiller and Isaacson, 1959, pp. 22-35 ; Berlyne, 1971), especially message intelligibility. A message with a very low information rate arouses no interest in the listener : it is too predictable and banal (like the endless repetition of a note or a pattern). But if the information rate is too high, the message is not intelligible : it is too original and unpredictable (like white noise). Compositional redundancy reduces the information rate -- provided the listener possesses the "code", that is, has a knowledge of the rules of the game, of the constraints of the compositional style. As listeners learn a style through familiarity with the music, the redundancy becomes clear to them, and the music gets banal -- especially for musicians, who overlearn the style : hence, historically, music tends to get more complex (L.B. Meyer, 1967, p. 117). New music has often been termed noise, from the time of Ars Nova (XIVth century) to present times : it was too original for the listeners until they had become familiarized with the new style.
But contemporary music is in an unprecedented situation in that respect ; most of the music that listeners are massively exposed to -- classical music, background music, pop music and jazz heard on radio, television, in shopping centers, in factories (Soibelman, 1948) -- has a tonal syntax ; listeners overlearn this syntax, which prepares them for appreciating XVIIth to XIXth century music, but neither for early occidental music and the music of other civilizations, nor for contemporary music. (This does not mean that tonal and non-tonal music can be perceived in the same way : cf. Francès, 1958).
Yet the previous theses are only qualitative, and they could have been formulated without the help of information theory. Few data are available on the correlation between information rate and perceived complexity. Pitts and de Charms (1963) indicate that this correlation can be strong, but that unexpected factors intervene : e.g. an intricate piece of music can be judged simple because it is written in waltz tempo. Pitts and de Charms also gathered some evidence supporting a model proposed by McClelland (1953), whose application to music resembles the above information theory considerations : according to this model, positive affective arousal results from small deviations in perceived complexity from the level to which the listener is adapted. Similarly, Berlyne (1971) claims that complexity and novelty interact to determine listeners' preference for stimuli. This is supported by experiments on the relation between auditory complexity and preference (Vitz, 1966 ; Duke and Gullickson, 1970). However the measurement of information, designed to deal with independent or statistically related events, is inadequate for organized signals (Green and Courtis, 1966) like music, and it cannot take fully into account the effect of previous musical experiences. Yet information theory has been an incentive for computer statistical analysis of musical styles (Fucks, 1962 ; Hiller and Bean, 1966 ; Bowles, 1966 ; Lefkoff, 1967 ; Lincoln, 1970 ; cf. Cohen, 1962) and for proposing models of musical composition that can be implemented with computers (cf. below, V, C).
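The notion of information rate used in this discussion can be made concrete with a minimal sketch of a zero-order Shannon entropy estimate on a note sequence. The note names and the two toy sequences are hypothetical, and the estimate deliberately treats notes as independent events, which is precisely the limitation pointed out above for organized signals like music.

```python
import math
from collections import Counter


def entropy_bits_per_note(notes):
    """Zero-order Shannon entropy, in bits per note, of a sequence of symbols.

    Only the relative frequency of each symbol is used ; the order of the
    notes (and hence any musical structure) is ignored.
    """
    counts = Counter(notes)
    total = len(notes)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Endless repetition of one note: completely predictable.
repeated = ["C"] * 16
# Eight different pitches used equally often: maximal for an 8-symbol alphabet.
varied = ["C", "C#", "D", "E", "F", "G", "A", "B"] * 2

print(entropy_bits_per_note(repeated))
print(entropy_bits_per_note(varied))
```

A scale played twice and a random shuffle of the same notes receive the same value here, although listeners would hardly judge them equally complex ; accounting for that would require conditional (higher-order) statistics and a model of the listener's knowledge of the style.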
Different aspects come into play in the perception of music. Lee (1918 ; 1932) made an enquiry on about 200 musicians from England, France and Germany : about half of them said that the meaning of music resided in music itself, and half said music implied for them an extramusical message. There have been heated debates between the "absolutists", who insist that musical meaning lies in the perception of musical form (e.g. Hanslick, 1891 ; Schoen, 1940 ; Stravinsky, 1947), and the "referentialists", who contend that music conveys extra-musical feelings, concepts, meaning (e.g. Teplov, 1966) (cf. L.B. Meyer, 1956). According to absolutists, extramusical associations are not natural and universal. But referentialists can retort that musical symbolism is present in all musical civilizations, and quite precise in the Orient (cf. Sachs, 1943). These opposing viewpoints coexist rather than being mutually exclusive (L.B. Meyer, 1956 ; Francès, 1958) ; one can distinguish two aspects in the perception of music : a formal, syntactical, cognitive aspect, and a referential, emotional, affective aspect (Tischler, 1956). These aspects are linked to "definite" and "indefinite" listening (Vernon, 1934) (cf. Poland, 1963 ; Moles, 1966) : definite listening implies active attention to perceive relations ; indefinite listening is more passive, and vague. Different aspects of pitch (cf. above II, B, 6) or consonance (cf. above, II, B, 3) relate to either definite or indefinite listening.
The viewpoint of L. B. Meyer (1956, 1967) bridges the gap between the formalist and the referentialist aspects of music. According to this viewpoint, music perception involves anticipation : continuous fulfillment of expectations would cause boredom ; music arouses tendencies of affect when an expectation is inhibited or blocked (expectations are, of course, relevant to the particular style of a composer). This view is akin to McClelland's model cited above ; Meyer believes that these cerebral operations evoke all responses to music, and that it is the listener with his cultural history who brings to musical experience a disposition that leads to intellectual, emotional or referential response.
The above considerations pay little attention to the materialization of music into sound. Varèse complained that "the corporealization" of musical intelligence (also significant to Stravinsky) was neglected in occidental music (cf. Schwartz and Childs, 1967, p. 198) : this has somewhat changed, and contemporary composers are often interested in controlling sound structure in music (Erickson, 1975). In primitive civilizations, some practices (e.g. pounding of drums) probably contributed to the magic and therapeutic effects of music (cf. Schullian and Schoen, 1948). Much of the organ's impact comes from low frequencies, some of them felt by the body although inaudible. A number of rock groups now achieve an almost intoxicating effect on listeners by using very high sound intensities -- which can even be detrimental to hearing (Lebo et al., 1967). Despite the existence of this sensual aspect, experiments on the "pleasantness" of isolated sounds, chords or sequences (Butler and Daston, 1968) are probably of little relevance to musical preference (Langer, 1951, p. 171).
According to Osmond-Smith (1971), the response to music can be studied from the viewpoint of two disciplines : experimental psychology (cf. Francès, 1958 ; Lundin, 1967), and semiotics. Although linguistic concepts and methods should not be applied to music without caution, some musical phenomena are clearly of a semiological nature (Springer, 1956 ; Harweg, 1968 ; Nattiez, 1972 ; Ruwet, 1972). Musical semiotics is still in its infancy. The bridge between the viewpoint of experimental psychology and semiotics is perhaps to be found in an approach focusing on the cognitive processes involved in the perception of music (Harwood, 1972).
At the time of Gregorian chant the question of musical preference did not often arise : sacred music was meant for rituals, not for comparisons. Now a great variety of music is available through recordings. Record companies take their view of musical preference into account in making marketing evaluations that influence the availability and advertisement of records, which in turn prejudice the buyers' preferences. The relevance of "musical taste" depends upon the function of music : it is probably highest in our occidental civilization where music has lost most of its magic, therapeutic or ritual significance (cf. Schullian and Schoen, 1948).
It is not easy to evaluate musical preferences. Record sales as a measure of preference are clearly biased ; also they do not permit analysis of the effect of factors like listeners' origin or personality ; analyses of concerts or radio music programs (cf. Moles, 1966) do not either. Out of an urge for objectivity, it has been advised to measure physiological responses to music (Phares, 1934) : but these responses are not clearly and reproducibly related to music appreciation (cf. Schoen, 1927 ; Podolsky, 1954 ; Eaton, 1971). Most investigators resort to recordings and comparisons of the listeners' verbalized reactions. (It might be interesting, although perhaps misleading, to study the effect of music on faces and behavior : cf. Lundin, 1967).
If one compares crudely the music of various civilizations, it appears that most classical western music favors stable rhythmic pulses, very elaborate harmony, involving attractions and resolutions correlated with rhythmical accents, and developed at the expense of tuning subtlety and of melodic richness. Africans use complicated rhythmic patterns, whose organization is often not perceived by Occidentals, and a large variety of instruments, often with indefinite pitch. Oriental music, in addition to elaborate rhythmic systems, often calls for varied tuning systems and significant pitch deviations around the notes (Sachs, 1943 ; Leipp, 1971). Efforts are now exerted to preserve musical traditions and to protect them against a rapid contamination (cf. Farnsworth, 1950, 1958) by western tonal music. Contemporary occidental music often tries to incorporate features of other musical civilizations.
More specific data about the effect of various factors on music preference in the western culture are reported by Lundin (1967). Preference is clearly a complex function of the musical "stimulus", but also of the listener, of his history of contacts with this kind of music, of his conception of the function of music (background music ? dance music ? "pure" music ?). Sometimes a particular type of music becomes a token of a subculture (e.g. pop music, free jazz, folk song) : this is a factor of its acceptance or refusal ; in this case, superficial characteristics (in instrumentation, rhythm or melody, for example) may suffice for many listeners. There are also definite cases of specific associations -- for instance, some listeners claim to be oppressed by siren-like glissandi sometimes used in contemporary music (cf. Xenakis, 1971) because these bring back war recollections.
Despite such specific effects, a wide survey performed half a century ago and analyzing 20,000 verbal reports indicated that affective reactions to musical compositions are strikingly similar for a large majority of listeners regardless of training, age or experience (Schoen, 1927). The conformism of concert programs (cf. Mueller and Hevner, 1942) and the uniformity of radio (and now television) exposure can be partly responsible for that ; also the effect of reverence for highly regarded cultural values should not be underestimated : Farnsworth (1950) finds a strong agreement between musicologists, students and high school students, on composers judged eminent -- even though many of their responses are based on little personal listening experience.
This is not of course the only effect of education. Early musical training strongly develops musical perception abilities (Copp, 1916 ; Teplov, 1966). In teen-agers, musical sophistication is related to socio-economic status (Bauman, 1960). The effects of variations in the structure of the music on the affective response and the significations elicited have been studied (Mueller and Hevner, 1942). The association of the minor mode with sadness is frequent (although not constant) but for western listeners only, and it seems to be learned (Heinlein, 1928). With repetitions, classical selections gain more in enjoyment than popular selections -- the latter often lose (Schoen, 1927 ; cf. also Getz, 1966) ; popular tunes get quickly "worn out" : satiety is hastened with the new media. This can be understood in terms of the conceptions of McClelland or L.B. Meyer (cf. above, II, F, 1).
In an experiment on the preference of harmonic sequences (cf. Lundin, 1967), it was found that musically sophisticated subjects rejected the sequences which are too predictable, while unsophisticated subjects rejected the too unpredictable ones. The aesthetic response depends upon attention and attitude : some listeners prefer a familiar, reassuring music ; others have a strong curiosity for novelty and can impose a form on a random pattern. It was reported (Mikol, 1960) that in an appreciation study, the more receptive and less dogmatic students showed improved taste for modern music over repeated hearings (cf. also Mull, 1957).
The acoustical study of musical instruments has progressed along two main lines : understanding their physical behavior ; determining the cues used to identify and evaluate them. Both approaches can be useful to improve instrument making : the former, to correlate variations in instrument design and building with variations in the physical parameters of the sounds produced ; the latter, to indicate which configurations of physical parameters should be achieved in the sound.
Musical instruments are very complex mechanical systems, far from being thoroughly understood (cf. Backus, 1969 ; Benade, 1976 ; for data on instrument tone spectra and instrument directivity, cf. D.C. Miller, 1916 ; E. Meyer and Buchmann, 1931 ; C.A. Culver, 1947 ; J. Meyer, 1966 ; Olson, 1967). Instrument design has evolved mostly empirically, yet in many cases remarkable skill is evidenced in the way instrument making takes into account properties of sound, characteristics of hearing and necessities of music (cf. Leipp, 1971 ; Benade, 1976).
This is well exemplified by string instruments like the violin, considered to have been brought to a high level of perfection by the Italian masters, at a time when its physical behavior was not well understood. The bow catches the string and pulls it to one side, until the string separates from the bow and flies back : it is then caught again, and so on. This behavior takes place because friction between bow and string is higher at slow speeds ; it is relatively well known after the works of Helmholtz (1877), Raman and a number of others (cf. Hutchins, 1962 ; Leipp, 1971). Bowed string vibrations have been filmed, picked up magnetically (Kohut and Mathews, 1971) and simulated by computer from the differential equations of motion (Hiller and Ruiz, 1971) : the waveform is approximately triangular ; the flyback time depends upon the position along the string. The plucked string's behavior is simpler. In both cases the period of the vibration is determined by the time it takes a perturbation to travel two string lengths (this time depends on string length, tension and material). The violin body is necessary to efficiently convey vibratory motion from the string to the air -- to match impedances ; it is not a "resonator" (if it were, the instrument would emphasize the resonant frequency too much) : in fact the response curve of a violin body has many peaks and valleys (cf. Meinel, 1957 ; Hutchins, 1962), and it can be approximated by a number of resonances at different frequencies (cf. Mathews and Kohut, 1971). Fine variations in the structure of the body and the elastic properties of the wood affect the tone and its evenness ; it is of utmost importance to control adequately the lowest air and wood resonances, which good violin-makers do. One is close to being able to make better (or less expensive) instruments thanks to acoustical progress (cf. Hutchins, 1967 ; Agren and Stetson, 1972 ; Jansson, 1973).
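The statement that the period equals the time needed for a perturbation to travel two string lengths can be sketched numerically for an ideal flexible string, where the wave speed is the square root of tension divided by mass per unit length. The numerical values below (length, tension, linear density) are hypothetical round numbers chosen for illustration, not measurements of an actual violin string.

```python
import math


def string_fundamental_hz(length_m, tension_n, mass_per_length_kg_m):
    """Fundamental frequency of an ideal flexible string fixed at both ends.

    The period is the round-trip time of a perturbation, i.e. the time
    needed to travel two string lengths at the transverse wave speed.
    """
    wave_speed = math.sqrt(tension_n / mass_per_length_kg_m)  # m/s
    period_s = 2.0 * length_m / wave_speed
    return 1.0 / period_s


# Hypothetical string: 0.325 m long, 60 N tension, 0.6 g per metre.
base = string_fundamental_hz(0.325, 60.0, 0.0006)
print(base)
# Stopping the string at half its length raises the pitch by an octave.
print(string_fundamental_hz(0.325 / 2, 60.0, 0.0006) / base)
```

The sketch ignores stiffness and the compliance of the end supports, so it only gives the dependence on length, tension and material in its simplest form.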
The piano could be considered a percussion instrument, since its strings are struck by hammers. The hammers are set into motion by the keys. The action (i.e., the mechanical transmission between keyboard and hammers) is quite elaborate : the hammer must escape the string after striking ; the inertia must be small ; a damper must stop the vibration when the key is released. Each hammer strikes one string (in the low range) to three strings (in the medium and high range). The string vibrations are transmitted to the soundboard, which radiates into the air. The strings exert a considerable tension on the frame (up to 30 tons) : piano frames are made of iron (cf. Blackham, 1965). The area struck by the hammer affects the spectral pattern of the string vibration, which also depends upon the action, the hammer's speed and its surface (soft or hard). Due to the stiffness of the piano string, the vibration is inharmonic, that is, partials are not quite harmonic (cf. above, II, B, 2) (Schuck and Young, 1943) : thus piano sounds are not quite periodic. The string vibrations decay because of friction and radiation : the higher the pitch, the shorter the decay. The initial rate of decay is faster than the decay rate of the later part of the sound (Martin, 1947). When a pianist plays an isolated note, he can only control the velocity of the hammer and subsequently the damping (Hart et al., 1934) -- this leaves, however, considerable room for differences between different pianists' "touch" (levels of individual notes, overall level, staccato-legato and pedal skill) (cf. Ortmann, 1962).
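The effect of string stiffness on the partials can be sketched with the usual stiff-string approximation, in which partial n lies at n times the ideal fundamental multiplied by the square root of (1 + B n²). The formula is the standard textbook model rather than something stated in the text above, and the inharmonicity coefficient B = 0.0004 and the 220 Hz fundamental are hypothetical values chosen for illustration.

```python
import math


def stiff_string_partials(f0_ideal, inharmonicity_b, n_partials):
    """Partial frequencies of a stiff string: f_n = n * f0 * sqrt(1 + B * n**2)."""
    return [n * f0_ideal * math.sqrt(1.0 + inharmonicity_b * n * n)
            for n in range(1, n_partials + 1)]


def cents_sharp(partials):
    """Deviation of each partial, in cents, from an exact multiple of the first partial."""
    f1 = partials[0]
    return [1200.0 * math.log2(f / (n * f1))
            for n, f in enumerate(partials, start=1)]


partials = stiff_string_partials(220.0, 0.0004, 8)
for n, (f, c) in enumerate(zip(partials, cents_sharp(partials)), start=1):
    print(n, round(f, 2), round(c, 2))
```

The deviation grows with the partial number, so the upper partials are progressively sharper than true harmonics, which is why the waveform does not repeat exactly from one period to the next.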
In wind instruments, a resonant air column is coupled to a valve-mechanism (a "reed") which modulates a steady air stream at audiofrequencies. Complex interactions between the valve mechanism and the air column determine the frequency and waveform of the sound (Benade, 1960).
In the case of the so-called woodwind instruments, the air column "dominates" the reed to determine the frequency ; holes in the tube make it possible to vary the vibration wavelength, hence the frequency. Woodwinds use tubes which have an almost cylindrical (flute, clarinet) or conical (oboe, bassoon) bore : this permits using the same holes in two registers (Benade, 1959). Switching to a high register, that is, to a higher resonance mode of the air column, is done by overblowing ; it can be helped by a special "register" hole. The material of woodwind instruments has very little influence on the tone (Backus, 1969, p. 208) : flutes, formerly made of wood, are now made of metal. In the flute or the recorder, the air column is excited by a stream of air striking against a sharp edge at the embouchure : the valve operates by air deflection in and out of the embouchure hole, under the influence of the vibrating air column. The flute was greatly improved more than one century ago by Boehm (Boehm, 1964 ; Benade, 1960) (cf. Coltman, 1966). In the clarinet or the saxophone, a single vibrating reed modulates the flow of air from the lungs : its behavior partly explains the fact that the clarinet tone does not contain only odd numbered partials, as a simplistic theory would state ; it also explains that the playing frequencies of the clarinet are lower than the resonance frequencies of the tube (Backus, 1969, pp. 193 and 200). The oboe and the bassoon use a double reed. The intonation of woodwinds still raises many problems (Young and Webster, 1958) : the physical situation is very complex (Bouasse, 1929), and despite ingenious compromises the player has to adjust the pitch by a delicate control of the lips ; according to Backus (1969, p. 202), the bassoon "badly needs an acoustical working-over".
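The "simplistic theory" alluded to above can be written down in a few lines : an ideal cylinder closed at one end (a crude clarinet) resonates at odd multiples of c/4L, while an ideal cylinder open at both ends (a crude flute) resonates at all multiples of c/2L. The sketch below assumes a sound speed of 343 m/s and a hypothetical 0.6 m tube, and it ignores the reed, the tone holes and end corrections, which is exactly why real instruments depart from it.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def closed_open_modes(length_m, n_modes):
    """Resonances of an ideal cylinder closed at one end: odd multiples of c/(4L)."""
    base = SPEED_OF_SOUND_M_S / (4.0 * length_m)
    return [(2 * k - 1) * base for k in range(1, n_modes + 1)]


def open_open_modes(length_m, n_modes):
    """Resonances of an ideal cylinder open at both ends: all multiples of c/(2L)."""
    base = SPEED_OF_SOUND_M_S / (2.0 * length_m)
    return [k * base for k in range(1, n_modes + 1)]


clarinet_like = closed_open_modes(0.6, 3)
flute_like = open_open_modes(0.6, 3)
print(clarinet_like)
print(flute_like)
# Ratio between the second and first resonance, i.e. the overblowing interval
# predicted by the idealized model for each kind of tube.
print(clarinet_like[1] / clarinet_like[0], flute_like[1] / flute_like[0])
```

In this idealization the closed tube overblows at three times the fundamental and the open tube at twice the fundamental ; the text's remarks on even partials and lowered playing frequencies in the clarinet describe how the reed modifies that picture.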
In the brass instruments, the reed is formed of the player's lips, which are heavy ; they are influenced, although not dominated, by the air column (Benade, 1969) : the player produces different frequencies by adjusting the tension of his lips. The instruments take advantage of the resonance peaks of the air column, which helps intonation ; the length of the tube can be modified by using valves (e.g. in the trumpet) or a sliding piece of tubing (e.g. in the trombone). To position the peaks close to musically useful frequencies, the tube departs from a cylindrical shape ; the adjustment is done empirically, but it is now possible to calculate the impedance corresponding to various horn shapes with the help of a computer (F.J. Young, 1960 ). The sound radiation comes from a large area at the flaring end (the bell) : this increases the output level and accounts for the marked directivity of the instrument at high frequencies (Martin, 1942). The oscillations within the horn comprise amounts of nth harmonic growing as the nth power of the 1st component pressure (Worman, 1971). Mutes can be inserted into the bell, to soften and / or modify the tone.
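Taking the power-law statement above at face value, the amplitude of the nth harmonic varies as the nth power of the fundamental pressure, so each decibel added to the fundamental adds n decibels to harmonic n. The sketch below only illustrates that arithmetic ; the proportionality coefficients and pressure values are hypothetical, and the range of playing levels over which the law holds is not addressed.

```python
import math


def harmonic_pressures(p1, coefficients):
    """Pressure of harmonic n modelled as k_n * p1**n (k_1 is the first coefficient)."""
    return [k * p1 ** n for n, k in enumerate(coefficients, start=1)]


def level_db(pressure, reference=1.0):
    """Level in decibels relative to an arbitrary reference pressure."""
    return 20.0 * math.log10(pressure / reference)


coefficients = [1.0, 0.5, 0.25, 0.1]  # hypothetical k_1 .. k_4
soft = harmonic_pressures(0.1, coefficients)
loud = harmonic_pressures(0.2, coefficients)  # fundamental raised by about 6 dB

for n, (a, b) in enumerate(zip(soft, loud), start=1):
    print(n, round(level_db(b) - level_db(a), 2))
```

Under this model the spectrum becomes richer in high harmonics as the instrument is played louder, since the upper components grow faster than the fundamental.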
The organ is a special kind of wind instrument. Organ pipes are excited by vibrating reeds or by edge tones. Sets of pipes are grouped in various ways to form so-called stops. The organ console comprises stop knobs and a variable number of keyboards. The complex organ action has evolved -- there are organs with mechanical, pneumatic, electric actions ; the action can affect the tone transients (cf. Richardson, 1954) as well as the ease of playing. The revival of interest in ancient organs has resulted in a better knowledge of the various styles of organ-making (Lottermoser, 1957), an art involving considerable know-how. The reverberant environment of the organ is of utmost importance.
The human voice is perhaps the oldest musical instrument. It can be considered a wind instrument : the glottis (the vocal cords) acts somewhat like lip reeds. However, unlike brass instruments, the coupling is very weak between the vocal cords and the vocal tract : hence the fundamental frequency is determined by control of the vocal cords, while the vocal tract resonances determine formants, responsible for the vocalic quality (Flanagan, 1972). A fixed formant near 3 000 Hz, regardless of the vowel being sung, has been evidenced in operatic singing (Vennard, 1967) : this had been ascribed to a special glottis behavior, but recent investigations suggest that it is due to special vocal tract configurations adopted in singing, and that it helps the singer to resist being masked by the accompanying orchestra (Sundberg, 1974, 1977 ; Music, room, acoustics, 1977). Vibrato (cf. above, II, D) has a considerable importance in operatic singing (Seashore, 1938, p. 42 and p. 256) ; it seems to imply a control loop (Deutsch and Clarkson, 1959).
There is a huge variety of percussion instruments, so-called because the sound is produced by hitting them (with sticks, soft or hard mallets, or with various other tools, including the hands). Most of them involve vibrations of membranes, bars or plates : such vibrations are complex ; the partials are not harmonic (Bouasse, 1927 ; Olson, 1967, pp. 75-83), so it may be difficult to ascribe a pitch to the sound. When one partial or a group of closely or equally spaced partials are dominant, a definite pitch can be identified. Among percussion instruments with definite pitch, the timpani (or kettledrums) use a membrane stretched over a hemispherical bowl ; the xylophone comprises tuned wooden bars coupled to resonant air columns ; the vibraphone is similar, but uses metal bars, and the columns can be closed periodically to yield a characteristic amplitude modulation ; the bells are empirically shaped so that the frequencies of the first partials are tuned according to euphonious intervals (Van Heuven, 1949). Among instruments with indefinite pitch, the snare drum consists of a short cylinder onto which a membrane is stretched at both ends : metallic snares can be set along the bottom membrane, to add a rattle quality to the sound ; gongs and cymbals are circular plates which have many closely spaced resonances.
There are many more instruments (cf. Baines, 1969) ; interest is growing in non-occidental instruments. Despite an understandable inertia in the composition of the western orchestra, acoustical progress leads to modification of existing instruments and to design of new acoustical instruments : this is evidenced by inspection of numerous patents (reviewed in the Journal of the Acoustical Society of America) ; however electronic or hybrid musical instruments (cf. below, V) seem to develop at a faster rate, and they now promise to provide interesting musical possibilities (cf. Mathews and Kohut, 1973).
It has long been believed that musical tones were periodic, and that the tone quality was associated solely with the waveform or with the relative amounts of the various harmonics present in the frequency spectrum (cf. above, II, D). Many analyses of musical instrument tones have been performed (D.C. Miller, 1926 ; C.A. Culver, 1947 ; H.F. Olson, 1966). However most of these analyses did not adequately characterize the instrument timbre. A successful analysis should yield a physical description of the sound from which one could synthesize a sound that, to a listener, is nearly indistinguishable from the original. In many cases the previously mentioned descriptions of musical instrument tones fail the fool-proof synthesis test (Risset and Mathews, 1969).
Only recently has it been possible to analyse rapidly-evolving phenomena. Older analysis instruments gave steady-state analyses, yielding either the frequency spectrum averaged over some duration or the spectrum of a particular pitch period (assumed to be repeated throughout the note). Helmholtz (1885) was aware that "certain characteristic particularities of the tones of several instruments depend on the mode in which they begin and end" ; yet he studied only "the peculiarities of the musical tones which continue uniformly", considering that they determined the "musical quality of the tone". The temporal characteristics of the instruments were averaged out by the analyses -- yet different instruments had different average spectra, so it was thought that this difference in average spectrum was solely responsible for timbre differences. Helmholtz's followers do not appear to have felt the importance of temporal changes for tone quality ; there were a few exceptions, like Seashore (1938, p. 102) and Stumpf (1926) : the latter had found that removing the initial segment of notes played by various instruments impaired the recognition of these instruments. This motivated analyses of the attack transients of instrument tones (E. Meyer and Buchmann, 1931 ; Backhaus, 1932 ; Richardson, 1954). However transients are complex and they are not quite reproducible from one tone to another, even for tones that sound very similar (Schaeffer, 1966). Most analyses have been restricted to a limited set of tones, and their authors have tended to generalize conclusions that may well be valid only for that set of tones. These shortcomings have produced many discrepancies in the literature and cast an aura of doubt on the entire body of acoustic data.
It is thus necessary to isolate, from complex physical structures, those significant features that are both regular and relevant to timbre. There are now various ways to check the psychoacoustical and musical relevance of the features extracted from the analysis. The most elegant one is the synthesis approach : these features are used to synthesize tones ; listeners judge how similar the synthetic and real tones sound, with results that indicate whether the physical description is sufficient. If it is not, additional analysis work has to be performed to find the proper parameters. Then systematic variations in the parameters (one at a time) enable listeners to evaluate the aural relevance of each of these parameters -- the irrelevant parameters can then be discarded to simplify the description (Fletcher et al., 1962 ; Beauchamp, 1967 ; Freedman, 1967 ; Strong and Clark, 1967 ; Risset and Mathews, 1969 ; Risset, 1969 ; Grey and Moorer, 1977). Methods starting from actual instrument tones and studying the effect of various alterations of these tones on listeners' recognition have also provided insight : it has thus been shown that the attack of the tone is an important recognition cue for a number of instruments (Berger, 1964 ; Saldanha and Corso, 1964 ; Schaeffer, 1966). The analysis of confusions between speeded-up instrument tones suggests there is a perceptual basis for grouping the instruments into families like the string or the brass family (Clark et al., 1964).
Temporal changes can be essential : a fast attack followed by a slow decay gives a plucked quality to any waveform. Schaeffer (1966) distinguishes between the sound "material" ("matière"), corresponding to a spectral cross-section, and the sound "shape" ("forme") corresponding to the evolution in time (similar concepts had been introduced under the names timbre and sonance by Seashore, 1938, p. 103). By appropriate modifications of material and / or shape (that is, of spectrum and / or temporal envelope), it is possible to transmute, e.g., a piano tone into a guitar-like tone, or an oboe-like tone into a harpsichord-like tone (Schaeffer, 1966 ; Risset, 1969).
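The effect of the temporal envelope described above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions : the function names, the 5 ms linear attack, the 0.4 s exponential decay and the 220 Hz square wave are all arbitrary choices made for this example, not values taken from the studies cited.

```python
import math

def plucked_envelope(t, attack=0.005, decay=0.4):
    """Illustrative envelope: linear rise during `attack` seconds,
    then exponential decay with time constant `decay` seconds."""
    if t < attack:
        return t / attack
    return math.exp(-(t - attack) / decay)

def shape(waveform, sample_rate, attack=0.005, decay=0.4):
    """Impose the envelope on an arbitrary sequence of samples."""
    return [s * plucked_envelope(n / sample_rate, attack, decay)
            for n, s in enumerate(waveform)]

SAMPLE_RATE = 8000  # assumed rate, kept low so the example runs quickly

# Any steady waveform will do; a 220 Hz square wave lasting one second.
square = [1.0 if math.sin(2 * math.pi * 220 * n / SAMPLE_RATE) >= 0 else -1.0
          for n in range(SAMPLE_RATE)]
plucked = shape(square, SAMPLE_RATE)
```

The spectral "material" of the square wave is untouched ; only its "shape" in time is changed, which is the distinction drawn by Schaeffer.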
In most cases, one cannot isolate a single physical invariant characteristic of the timbre of a musical instrument. Throughout the pitch and loudness range, the physical parameters of the sound of a given instrument vary considerably, to the extent that the perceptual invariance, the unity of timbre of an instrument like the clarinet, seems to be a learned concept. However a property, a law of variation, rather than an invariant, often appears to play an important part in the characterization of an instrument (or a class of similar instruments). In the piano, from treble to bass, the attack gets less abrupt while the spectrum gets richer (Schaeffer, 1966). Throughout part of the range, the sound of the clarinet can be grossly imitated by a symmetrical non-linear scheme giving a sine wave at low amplitude and odd partials at higher intensity, provided the temporal envelope is smooth enough (Risset, 1969 ; Beauchamp, 1975). The violin's quality is ascribed to a triangular source waveform modified by a steady and complex spectral response with many peaks and valleys (Mathews and Kohut, 1973) : this scheme explains the richness of the vibrato, which modulates the spectrum in a complex way ; the presence of vibrato makes violin recognition easier (Saldanha and Corso, 1964 ; Fletcher, 1967). The brassy quality seems to be primarily associated with spectral variation with loudness : increasing loudness causes a strong increase in the intensity of higher partials (Risset and Mathews, 1969) ; within the brass family, different instruments present different spectral patterns and different temporal envelopes (Luce and Clark, 1967). That these features indeed characterize to a large extent stringed and brassy timbres can be demonstrated with Mathews' electronic violin (Mathews and Kohut, 1973) : this instrument, played with a bow, can sound like an ordinary violin, but also like a trumpet if it is given the spectral characteristic of the brass.
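The symmetrical non-linear scheme mentioned above for the clarinet can be illustrated with a small Python sketch. It is not the scheme actually used by the cited authors : the cubic transfer curve `x - x**3 / 3`, the function names and the drive levels are assumptions chosen only because an odd-symmetric curve, whatever its exact form, generates odd partials alone, with a weight that grows with input amplitude.

```python
import math

def waveshape(x):
    """Odd-symmetric transfer curve, f(-x) == -f(x).
    The cubic form is an illustrative assumption."""
    return x - x ** 3 / 3.0

def harmonic_amplitudes(drive, n_harmonics=5, n_samples=256):
    """Amplitudes of harmonics 1..n_harmonics of a sine of peak
    amplitude `drive` passed through `waveshape`, measured by
    correlation with sines and cosines over exactly one period."""
    period = [waveshape(drive * math.sin(2 * math.pi * n / n_samples))
              for n in range(n_samples)]
    amplitudes = []
    for k in range(1, n_harmonics + 1):
        re = sum(y * math.cos(2 * math.pi * k * n / n_samples)
                 for n, y in enumerate(period))
        im = sum(y * math.sin(2 * math.pi * k * n / n_samples)
                 for n, y in enumerate(period))
        amplitudes.append(2.0 * math.hypot(re, im) / n_samples)
    return amplitudes

soft = harmonic_amplitudes(0.1)  # low amplitude: nearly a pure sine
loud = harmonic_amplitudes(1.0)  # high amplitude: third harmonic appears
```

With this particular curve the second and fourth harmonics vanish at every level, while the ratio of third harmonic to fundamental rises as the drive is increased.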
Very often idiosyncrasies of sound production result in tone particularities which are strong cues for instrument recognition : frequency glides between notes in the trombone, because of the slide ; intonation troubles in the horn, because of the difficulty of hitting the right mode ; initial erratic vibration in string instruments, when the string is first set in motion by the bow ; burst of tonguing noise at the beginning of recorder sounds. Such particularities help make imitative synthesis very realistic (Morrill, 1977).
Some instruments can be grossly characterized by the combination of a few factors : e.g. a smooth temporal envelope, a spectrum with a predominant fundamental, and a proper amplitude modulation with a regular and an irregular component yield a flute-like sound ; a rapid attack followed by a short decay, imposed on a low frequency inharmonic spectrum plus a high frequency noise band, is reminiscent of a snare drum. On the other hand, the naturalness and tone of a bell-like sound is critically dependent on the harmonic composition and on the decay characteristics (Risset, 1969). Attempts have been made to characterize economically the tones of various wind instruments -- the respective importance of the temporal and the spectral envelopes for one instrument has been assessed by exchanging these envelopes between instruments (Strong and Clark, 1967).
The importance of a given cue can depend on context. For instance details of the attack of trumpet-like tones (especially the rate at which various partials rise with time) are more significant for timbre in long sustained tones than in brief or evolving tones (Risset and Mathews, 1969). In the case of a very short rise time (as in the piano), the subjective feeling for the attack is actually correlated with the shape of the beginning of the amplitude decay (Schaeffer, 1966). The influence of the room acoustics is quite important and complex (Schroeder, 1966 ; Benade, 1976). Details of a fast decay can be masked in a reverberant environment like that of the organ. For percussion instruments with low damping (like bells, gongs, low piano tones) the instrument's decay usually dominates the room's reverberation. In such cases, the details of the decay have a strong bearing on the timbre : if the partials decayed synchronously, the sound would be unnatural, "electronic" (Risset, 1969) ; generally, the lower the component's frequency, the longer its decay time. Liveliness and warmth of the sound are increased by a complex pattern of beats between the components of the sound -- warmth has been ascribed to inharmonicity in the piano tones (Fletcher et al., 1962).
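The rule that lower components decay more slowly can be sketched as follows. This Python fragment is a minimal illustration, not a reconstruction of any published bell synthesis : the inverse proportionality between decay time and frequency, the frequency ratios, the amplitudes and all names are assumptions made for the example.

```python
import math

def decay_time(frequency_hz, base_frequency_hz=220.0, base_decay_s=1.5):
    """Assumed rule: decay time inversely proportional to frequency."""
    return base_decay_s * base_frequency_hz / frequency_hz

def additive_tone(partials, duration=2.0, sample_rate=8000):
    """Sum of exponentially decaying sinusoids.
    `partials` is a list of (frequency_hz, amplitude, decay_time_s)."""
    samples = []
    for n in range(int(duration * sample_rate)):
        t = n / sample_rate
        samples.append(sum(
            amp * math.exp(-t / tau) * math.sin(2 * math.pi * freq * t)
            for freq, amp, tau in partials))
    return samples

# Illustrative inharmonic ratios and amplitudes (hypothetical values).
RATIOS = [1.0, 2.76, 5.40, 8.93]
AMPLITUDES = [1.0, 0.6, 0.4, 0.25]
partials = [(220.0 * r, a, decay_time(220.0 * r))
            for r, a in zip(RATIOS, AMPLITUDES)]
tone = additive_tone(partials)
```

Giving every partial the same decay time instead would correspond to the synchronous decay that the text describes as sounding unnatural.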
Although much remains to be done, the recent work on analysis and synthesis of musical instrument tones has brought substantial insight into the correlates of musical instrument timbre (cf. Grey and Moorer, 1977), to the extent that it is now possible to simulate a number of instruments using simplified descriptions, e.g. descriptions in terms of Chowning's powerful frequency modulation technique for sound synthesis (Chowning, 1973).
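As an indication of how compact a frequency modulation description can be, here is a minimal Python sketch of a single carrier modulated by a single sine wave, in the usual form sin(2π·fc·t + I·sin(2π·fm·t)). The function name, the sample rate and the parameter values are assumptions for illustration ; a usable instrument would also vary the amplitude and the modulation index I in time.

```python
import math

def fm_tone(carrier_hz, modulator_hz, index,
            duration=0.25, sample_rate=8000, amplitude=1.0):
    """Simple frequency modulation: one sinusoidal carrier whose phase
    is modulated by one sinusoid. `index` sets the depth of modulation,
    and hence the spread of energy into sidebands."""
    samples = []
    for n in range(int(duration * sample_rate)):
        t = n / sample_rate
        phase = (2 * math.pi * carrier_hz * t
                 + index * math.sin(2 * math.pi * modulator_hz * t))
        samples.append(amplitude * math.sin(phase))
    return samples

plain = fm_tone(440, 440, 0.0)   # index 0: an unmodulated sine wave
bright = fm_tone(440, 440, 5.0)  # larger index: a richer spectrum
```

Only three numbers (carrier frequency, modulator frequency, index) describe the spectrum here, which is the kind of economy the text refers to.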
Most music heard nowadays comes from loudspeakers. Electroacoustic transducers -- the microphone and the loudspeaker -- have made it possible to use electronic amplification and modification of signals, and to achieve high standards of quality in sound recording and reproduction.
Present reproduction systems can be conceptually divided into four parts (some of which may be assembled in one piece of equipment) : the signal sources (phonograph, radio tuner, tape deck), which supply electrical signals corresponding to audio information ; the pre-amplifier (or control amplifier), which modifies the electrical signals, performing amplification and equalization (that is, restoring of a proper spectral balance) ; the power amplifier, which boosts the power of the signals to a level adequate for driving the loudspeakers ; the loudspeakers, which convert the electrical signals into sound.
The main defects of a reproduction system are linear distortion -- non-uniform response to various frequencies in the reproduced band (it can be evaluated from the frequency response of the separate components, which should be as smooth and as flat as possible) ; non-linear distortion -- which creates additional harmonics and very objectionable intermodulation distortion ; addition of noise (hiss, hum). Irregular rotation of a turntable results in wow and flutter, also very objectionable. Various other imperfections can result from the recording (depending on microphone placement and balance, on recording room acoustics) ; from an imperfect rendering of the spatial distribution of sound ; from the acoustics of the listening room ; from an inappropriate listening level (because of the hearing characteristics shown by the Fletcher and Munson curves (cf. Stevens and Davis, 1952), high and low frequencies are quite weakened by low-level listening).
In the present state of the art (cf. Villchur, 1965 ; Olson, 1967 ; Crowhurst, 1971), power amplifiers and even preamplifiers can be made near perfect (although cheap ones can be very poor). The weak points are the storage medium (disk records, tape) and the transducers (phonograph pick-up, loudspeaker). Disk records cannot accommodate the full range of dynamics of orchestral music (about 80 dB between the softest and the loudest passages), although this has been improved by intensity compression before recording and compensating expansion at playback. The curvature of the groove introduces distortion, especially near the centre of the disk. Records also wear considerably with use. Tape storage is much better in that respect, yet it is prone to deterioration with time ; also it suffers from noise -- there is a tradeoff between noise (at low recording levels) and distortion (at high levels), which can be improved by compression and expansion (Dolby, 1967 ; Blackmer, 1972). Although still in its infancy and expensive, digital recording promises to offer higher quality and durability : digital encoding affords substantial protection against the deficiencies and fragility of the recording medium (David et al., 1959 ; Stockham, 1971 ; Kriz, 1975).
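The principle of compression before recording and compensating expansion at playback can be shown with a toy calculation. The Python sketch below assumes a simple 2 : 1 power-law compander acting on instantaneous signal values ; this is an illustrative simplification and not a description of the systems cited above. All function names are hypothetical.

```python
import math

def compress(x, ratio=2.0):
    """Reduce the dynamic range (in dB) by `ratio`, keeping the sign."""
    return math.copysign(abs(x) ** (1.0 / ratio), x)

def expand(y, ratio=2.0):
    """Inverse of `compress`: restore the original dynamic range."""
    return math.copysign(abs(y) ** ratio, y)

def level_db(x):
    """Level in decibels relative to full scale (x = 1.0)."""
    return 20.0 * math.log10(abs(x))

loud, soft = 1.0, 1e-4  # two levels about 80 dB apart
range_before = level_db(loud) - level_db(soft)
range_on_medium = level_db(compress(loud)) - level_db(compress(soft))
restored = expand(compress(soft))
```

Under these assumptions an 80 dB range occupies only 40 dB on the medium, so the softest passages sit further above the noise of the medium, and expansion at playback brings back the original levels.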
The components of a musical reproduction system should be properly matched in quality. It is often the speaker which limits the quality of the set. The electrodynamic speaker is still the most widely accepted ; multiple speakers are generally used, small ones (tweeters) for high frequencies and large ones (woofers) for low frequencies (the signal is divided into low and high frequency parts by a cross-over network before feeding the speakers). Speaker enclosures are necessary, and their design and construction have a strong effect upon the sound. The art of making loudspeakers is still evolving. The principle of motional feedback can be applied advantageously. Also the desirable spatial distribution of the loudspeaker sounds still needs careful study.
Some of the spatial quality of music is rendered through the use of stereophony, which needs two recording channels and two speakers to create the illusion of auditory perspective. The next step is quadriphony (use of four channels). It has been shown that powerful control of the position and movement of the virtual sound source in the horizontal plane could be achieved with four channels (Chowning, 1971). However this may not be necessary for the rendering of classical music ; some quadriphony systems only use two recording channels, which are properly delayed in time and reverberated before being fed to the additional speakers, in an attempt to approximate the auditory environment of a concert hall in a listening room.
Hi-fi is an art of illusion. Faithful objective reproduction is impossible ; music reproduction must take advantage of the limitations of hearing and maintain the defects and distortion of the system at a low enough level to satisfy the listener (Jacobs and Wittman, 1964). Relevant to this problem are tests performed by Olson (1967, p. 388), which indicate that listeners are much more tolerant of distortion if the high frequency cutoff is lower (they also show that speech is less impaired by distortion than music). Chinn and Eisenberg (1945) tested the frequency-range preference for reproduced speech and music : surprisingly, they found that a majority of subjects preferred a narrow band system to a medium and a wide band system. Olson (1967, p. 393) objected that the amount of distortion may have been significant, so that Chinn and Eisenberg may have measured in fact the effect of bandwidth on distortion tolerance ; he tested the preferred frequency range of live music, using acoustical filters, and found that subjects preferred the full frequency range. Kirk (1956) showed that the judgment of subjects was biased by their expectations and that unfamiliarity with wide range reproduction systems could explain Chinn and Eisenberg's results : repeated listening to the wide (narrow) band system shifted subjects' preference toward the wide (narrow) band system. Indeed AM radio and television teach many listeners to expect a narrow frequency range in music reproduction. There are also instances of hi-fi fans disappointed by listening to live music.
Music reproduction is not always used in a neutral way ; the balance can be intentionally changed, some instruments can be emphasized, special effects can be used : many pop music recordings have necessitated much sound processing. So far, music reproduction equipment has been designed for signals having the frequency distribution of instrumental music (cf. Bauer, 1970) : inferior quality equipment has specifications chosen to minimize aural impairment for this type of music ; it can be quite detrimental to synthetic music with much energy in the low or high frequency range.
Musical automata already existed at the end of the Middle Ages. In the XVIIth century, Kircher (1650) built music machines using pneumatic action : the score was recorded on punched tape (Kircher also built devices which implemented automatically certain rules of musical composition). There appeared a wealth of pianos, organs, wind and percussion instruments, activated by a clock or a pneumatic mechanism ; there were even automated "orchestras", like the Panharmonicon of Maelzel (the inventor of the metronome), for which Beethoven specially composed in 1813 "La Bataille de Vittoria".
Recording and electrical transmission of sound appeared at the end of the XIXth century : the techniques involved have also been used for the creation of new instruments and new music. In 1897, Cahill built an enormous "electrical factory of sounds", the telharmonium (cf. Russcol, 1972). A few composers were calling for new instruments, especially Varèse, who wrote in 1917 : "Music (...) needs new means of expression, and science alone can infuse it with youthful sap" (cf. Ouelette, 1968). Varèse wanted to escape the scale limitations and timbre restrictions imposed by the conventional instruments : between the two world wars, he tried in vain to originate research towards "the liberation of sound". A number of electronic instruments were built at that time, but most of them were designed or used with a traditional turn of mind, e.g. the electronic organ (cf. Douglas, 1957, 1962 ; Dorf, 1963). Among the most novel electronic instruments were Mager's Sphaerophon, Termen's Theremin, Trautwein's Trautonium and the Ondes Martenot (cf. Dorf, 1957 ; Crowhurst, 1971) : such instruments were used by composers like Strauss, Hindemith, Honegger, Messiaen, Jolivet ; the performer used a keyboard or another device (e.g. the displacement of a ribbon, or the capacitance effect of the hand) to modify the adjustments of electronic circuits producing electrical waves (sent to a loudspeaker). During that period, several composers, including Milhaud, Hindemith, Varèse, experimented with phonograph manipulations ; Varèse proposed the generalization of music to organized sound ; Stokowski foresaw the direct creation into tone, not paper ; Cage predicted "the synthetic production of music... through the aid of electrical instruments which will make available for musical purposes any and all sounds that can be heard" (cf. Russcol, 1972, p. 72), and realized in 1939 Imaginary Landscape n°1, a piece using recorded sound produced by instruments and oscillators : it can be considered the first piece of electronic music -- at any rate, the first musical work to exist only as a recording.
After the second world war, the progress of electronics and recording techniques fostered new and important developments (cf. Beranek, 1970 ; Music and Technology, 1971).
Thanks to recording on magnetic tape, sounds could be dealt with as material objects. In 1948, Schaeffer (1952, 1966) started at the French Radio systematic experiments on modifications of sounds (recording, performed so as to achieve special effects rather than fidelity ; use of loops, tape reversal, tape splicing, mixing, filtering, ... : cf. Ussachevsky, 1958 ; Olson, 1967). These experiments led to production of "musique concrète" using processed recorded sounds of natural origin, including those of musical instruments ; the name musique concrète refers to the process of building the music directly by sound manipulation, instead of using an abstract medium -- the score -- between the conception and the realization of the music. The Schaeffer group attempted to empirically uncover some criteria helpful in musically assembling diversified and complex sounds. It was found essential to disguise the nature of the sound source by proper transformations : a priori any sound producing object is adequate for musique concrète, be it a piano, played in any conceivable way, or a garbage can lid -- but the identification of the piano or the garbage can lid would hinder listening to the sounds for themselves.
"Electronic music" appeared a couple of years later, first in Germany, then in Italy, Holland and Japan. The promotors, Meyer-Eppler and Eimert, were joined by composers like Stockhausen, Pousseur, Koenig (cf. Die Reihe, 1958 ; Russcol, 1972) who adopted a formalist approach quite different from the empirical approach of musique concrète : instead of relying on aural control to assemble complex sounds, the emphasis was on the precise building and organization of compositions calling for sounds of well-known physical parameters ; often a precise graphic score was produced before the composition was realized into sound. Initially the sounds were mostly combinations of sinusoidal tones and band-limited noises, produced by a battery of electronic equipment.
Tape-Music, started at Columbia University in 1952 by Ussachevsky and Luening, used freely electronic or natural sounds ; in 1953 a work for orchestra and magnetic tape was presented. The theoretical and technical gap between the original electronic music and musique concrète has gradually been bridged : the term "electronic music" now refers to essentially any process using electronic equipment for the production of sound. Varèse used both electronic and natural sounds in Déserts (1954) and in Poème électronique (1957) ; for this latter work, a number of loudspeakers were used, so that the sound could travel in space. Cage and Stockhausen later used real-time electronic processing of musical instrument sounds.
Many studios devoted to the production of electronic music were built for universities, radio stations (cf. Le Caine, 1956) and even private composers (cf. Mumma, 1964) ; five thousand electronic music compositions are listed in a catalog compiled several years ago (Davies, 1968) (cf. also Cross, 1967). Originally electronic music studios used recording equipment plus electronic equipment, like wave generators or filters, not specifically designed for music. Later music synthesizers appeared. The R.C.A. synthesizer designed in 1955 by Olson (1967, p. 415-426) is an enormous but powerful machine comprising many electronic circuits for the production of sound, controlled by instructions recorded on punched paper tape. Voltage control of oscillators and amplifiers was first applied to the design of electronic music equipment around 1964 : instead of requiring manual adjustment of a knob, parameters like frequency can thus be set by an input voltage (Moog, 1967) ; the equipment can be controlled by electrical signals, produced by a keyboard, a knob, or possibly other sources like biological sensors (Eaton, 1971) or computers (cf. below, V, C ; Mathews and Moore, 1970). Voltage controlled oscillators, amplifiers, modulators and filters are now often assembled in compact and convenient music synthesizers, which can be used as performing instruments as well as part of electronic music studios (cf. Electronic Music Review ; Strange, 1972 ; Appleton and Perera, 1973) : for this reason they had an immediate appeal, especially in the field of pop music, and they helped live electronic music come of age, after pioneering realizations like those of the Barrons (L. and B. Barron, 1959). However one may object to the rather mechanical sound that synthesizers often produce.
Electronic music has produced striking effects and powerful pieces of music. There is however some dissatisfaction with electronic music among many composers, who feel that they cannot exert subtle enough control upon the elaboration of the sound structure. The lack of variety and richness is a strong limitation of purely electronic sounds, which are not complex enough (cf. above, II, D) unless they are manipulated in complex ways : but the user then loses control of their parameters. Natural recorded sounds are varied and often rich, but one cannot easily exercise fine compositional control over them, since there is a disparity between the rich sounds and the rudimentary means of transforming them.
The digital computer can be used to relieve the composer of mechanical tasks, like transposing or inverting a melody (Smith, 1972) ; it can even be given a demanding duty : that of composing the music. If musical composition is regarded as the assembling of elements of a symbolic repertoire in some structured way (Moles, 1966), it can be performed automatically, provided the rules for selecting and assembling the elements are embodied in a computer program. Rules of counterpoint, as well as other rules, have thus been programmed (Hiller and Isaacson, 1959 ; Barbaud, 1966 ; Koenig, 1970). Statistical constraints, based on statistical analyses of existing music, make it possible to grossly imitate a style or a composer (Olson, 1967, p. 430) ; also, the composition of stochastic music, where compositional control is only statistical, can be carried out by computer (Tenney, 1969 ; Xenakis, 1971). It is easy to produce automatically random compositions with structure imposed as a sequential dependency : but this does not yield readily perceivable long range structures (Denes and Mathews, 1968). Indeed Markov processes, used to generate music in which the probabilities of successive notes depend on the preceding ones, are inadequate to generate certain musical structures, like self-embedded structures ; processes similar to Chomsky's generative grammars were proposed by Schenker half a century ago as models of tonal composition (cf. Kassler, 1964). Automatic musical composition faces problems as difficult as those of artificial intelligence in general ; it is hard to make all the criteria of compositional choice explicit in a computer program : but the very deficiencies of automatic composition bring insight into the creative process. Moreover the computer, instead of taking composition completely in charge, can efficiently help the composer in specific compositional tasks, or in testing compositional rules.
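The sequential dependency discussed above can be made concrete with a first-order Markov chain, in which each note is drawn according to which notes followed the current one in a source melody. The Python sketch below is an assumption-laden illustration : the source melody, the note names, the circular treatment of the melody (so that every note has a successor) and the function names are all invented for the example.

```python
import random
from collections import defaultdict

def learn_transitions(melody):
    """For each note, list the notes that follow it in `melody`.
    The melody is treated as circular so no note lacks a successor."""
    table = defaultdict(list)
    for current, following in zip(melody, melody[1:] + melody[:1]):
        table[current].append(following)
    return table

def generate(table, start, length, seed=0):
    """Random walk through the table; repeated successors in a list
    are proportionally more likely to be chosen."""
    rng = random.Random(seed)
    notes = [start]
    while len(notes) < length:
        notes.append(rng.choice(table[notes[-1]]))
    return notes

# Hypothetical source material.
source_melody = ["C", "D", "E", "C", "E", "G", "E", "D", "C", "G"]
table = learn_transitions(source_melody)
generated = generate(table, "C", 16)
```

Every pair of adjacent notes in the output occurs somewhere in the source, yet nothing constrains the sequence beyond one step, which is why such processes do not produce long range structure.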
A computer is much more versatile than ordinary electronic equipment. Computer synthesis of sound can offer varied and precisely controllable sounds. The computer also makes it possible to carry out complex mathematical or logical operations, including automatic musical composition.
Direct digital synthesis was introduced in 1958 and developed by Mathews (1963 ; 1969). The computer directly calculates the sound waveform -- it computes samples of the waveform, e.g. values of the acoustical pressure at equally spaced time intervals. The numbers are then put out and converted into electrical pulses by a digital to analog converter. The pulses are smoothed by an appropriate low-pass filter to produce a continuously varying voltage which drives a loudspeaker. The sampling theorem (cf. Shannon, 1948) states that, provided the sampling rate is high enough (e.g. 40 000 Hz), one can thus produce any bandlimited waveform (e.g. up to 20 000 Hz). In essence, the computer directly controls the motion of the loudspeaker : direct digital synthesis is the most general sound synthesis process available.
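The computation of samples, and the limit set by the sampling rate, can be sketched in a few lines of Python. This is only an illustration under assumptions : the function names, the 16-bit word length and the test frequencies are choices made for the example, and the conversion to a voltage and the smoothing filter are not modelled.

```python
import math

def sample_sine(frequency_hz, sample_rate, n_samples, amplitude=1.0):
    """Values of a sine wave at equally spaced time intervals."""
    return [amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(n_samples)]

def quantize(samples, bits=16):
    """Round samples in [-1, 1] to the integers a converter would accept.
    The 16-bit word length is an assumption for this example."""
    full_scale = 2 ** (bits - 1) - 1
    return [round(s * full_scale) for s in samples]

SAMPLE_RATE = 40000  # the rate quoted in the text

in_band = sample_sine(10000, SAMPLE_RATE, 64)   # below half the sampling rate
too_high = sample_sine(30000, SAMPLE_RATE, 64)  # above half the sampling rate
codes = quantize(in_band)
```

The samples of the 30 000 Hz tone turn out to be those of the 10 000 Hz tone with the sign reversed : a component above half the sampling rate cannot be distinguished from one below it, which is why the waveform must be bandlimited.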
To use it efficiently, however, two problems must be solved. First, one needs a convenient program to control the computer. Programs like Music V enable the user to produce a wide variety of sounds, even very complex ones, provided their physical structure is thoroughly specified (Mathews, 1969). Second, the user -- the composer -- must be able to provide such thorough descriptions of sound : it was soon realized after the first experiments (cf. Mathews, 1963 ; H. V. Foerster and Beauchamp, 1969) that a body of psychoacoustical knowledge, relating the physical parameters of musical sounds and their subjective effect, was needed and lacking. Fortunately, computer sound synthesis is invaluable to make progress in this field (it is also useful for psychoacoustics and speech synthesis research) : every result of interest can be retained and pooled between the users, through examples of synthetic sounds comprising a listing of the synthesis data -- which provide a complete and precise description of the sounds -- and a recording of the sounds -- which users can listen to in order to subjectively evaluate the timbres (Risset, 1969 ; Chowning, 1973 ; Morrill, 1977). So psychoacoustic knowhow can build up cooperatively to increase the gamut of musically useful sounds available by computer synthesis -- and even by other processes of electronic synthesis. Digital synthesis has helped to understand the physical correlates of the timbre of traditional instruments : consequently it can generate sounds reminiscent of those instruments (cf. above, III, B). It has also already made it possible to achieve unprecedented control over various aspects of the sound, for example controlling independently and / or precisely various cues for pitch (Risset, 1971) or space (Chowning, 1971), or interpolating between instrumental timbres (Grey, 1977), thus yielding novel musical possibilities.
Among the most promising areas which can be explored through computer synthesis, the use of tones built up from arbitrarily chosen inharmonic frequency components may favour new melodic and harmonic relationships between the tones, as indicated by Pierce (1966). The computer thus affords sonic resources of unprecedented diversity and ductility : the musician can now envision working in a refined way with the continuum of sound.
However direct digital synthesis is difficult. A lot of computing is involved, so the computer cannot work in real time : a complete sound description must be specified in advance. This does not permit the user to react and modify the sound while listening. However this possibility, whereby the composer can introduce performance nuance in real time, is provided by hybrid systems which interpose a computer between the composer-performer and an electronic sound synthesizer. This way, the computer is freed from the computation of all the temporal details of the waveform ; it only has to provide control signals for the synthesizer : thus real-time operation is possible, and the user can control the sound in elaborate ways with various devices attached to the computer, like "programmed" knobs or keyboards. One can thus program the performance of electronic music without tape recorders ; the computer storage and editing facilities considerably extend the possibilities that would be available with the synthesizer alone ; new situations can be set up where the user of the system can conduct, perform and improvise. The performance can be perfected with the help of the computer (e.g., the intonation can be corrected, the voice played may be harmonized) : this may contribute to reviving musical practice among non-expert amateurs (cf. Mathews and Moore, 1970 ; Music and Technology, 1971 ; Mathews et al., 1973). Of course, the sounds produced by hybrid systems are inherently limited by the possibilities of the synthesizer attached to the computer. However it is now possible to build digital synthesizers which are stable, accurate and powerful (Alonso et al., 1975 ; Di Giugno, in press ; Moorer, 1977), so that one will soon be able to take advantage of real-time operation while benefiting from the rich sonic possibilities of direct digital synthesis.
Electronic music has rapidly grown as a new medium of expression. From an economic standpoint, computer prices now compare with the prices of the analog equipment in a traditional electronic music studio. It seems that we are at the beginning of a new era for the development of electronic music, due to the considerable progress of digital microelectronics and digital sound processing techniques : electronic music will probably use more and more computers (or microprocessors) in conjunction with a battery of digital circuits acting as powerful special-purpose sound processing computers (cf. Music and Technology, 1971, p. 129 ; Moorer, 1977). However progress in psychoacoustical knowhow and music theory as well as ingenuity in musical system design are required, if such systems are to be musically efficient.
The author is indebted to E.C. Carterette, J. Grey, D. Harwood, M.V. Mathews, and D. Wessel for helpful comments and criticism.