IRCAM - Centre Pompidou. Server © IRCAM - Centre Pompidou 1996-2005.
All rights reserved in all countries.

Real-Time Synthesizer Control

Max V. Mathews, Gerald Bennett

Rapport Ircam 5/78, 1978
Copyright © Ircam - Centre Georges-Pompidou 1978

Control of real-time sound synthesizers

It seems clear that, apart from its computer, IRCAM should have the possibility of real-time sound synthesis. This task might be done by a digital synthesizer, since digital techniques appear to be cheaper and better than analogue methods. It seems somewhat less clear that IRCAM should actually build its own synthesizer, for at the moment we have much more musical than engineering expertise, and it would seem only logical to exploit this expertise to the fullest. Because at least one extremely convincing design for a very general and powerful digital synthesizer exists already, it is not clear how necessary it is that we invest great amounts of time in this project. On the other hand, no very suitable or compelling control devices exist to render synthesizers musically useful. We feel that the most urgent priority should be given to developing powerful and supple control devices which could be used with any synthesizer IRCAM decides to buy or build.

The development of such devices will be a two-fold project: the more obvious aspect will be middle-term in length and will concern various sorts of external control imposed on or giving rise to musical material. Some ideas for these sorts of control will be discussed at greater length below.

The other aspect of the project will be of longer duration and cannot be precisely defined yet. The point, however, is this: Just as the digital synthesizer is faster and more economical than the general purpose computer for the specific task of making sound, so it is reasonable to imagine that increased knowledge in psychoacoustics may make it possible to design synthesizers more specifically tailored to certain musical functions. For the moment it is more realistic to consider general techniques for controlling sound synthesis. However, the possibility of developing a synthesizer whose very structure would, for example, simplify complex additive synthesis should be kept in mind.

We shall now consider some specific kinds of devices which seem appropriate for controlling sound synthesis.

Controlling many sound sources with few musicians

How can we control a group of sound sources which is large and complex enough to be musically interesting with the control signals that can be produced by one or by a few performers? We believe this is an essential unsolved problem which is appropriate for IRCAM.

As an example to make the problem clear, the thousand oscillators proposed by Berio appear to be a very interesting sound source. It now seems that they can be built for a reasonable cost and should operate reliably. But how could one man with only ten fingers, one vocal tract, two legs and feet, two arms, one body, and one head control more than a small fraction of these oscillators?

In general, we see several approaches to this problem, all of which are worth studying, none of which is guaranteed to be useful.

  1. Controlling many oscillators with each human action. If many oscillators are controlled in identical ways by one action, the result will almost certainly be uninteresting. A more promising possibility is to transform the action in individually different ways before applying it to the individual oscillators.
  2. Controlling some oscillators with "recorded" actions. A complex process can be controlled by making a number of passes and in each pass controlling only part of the process. The control signals can be recorded and combined with control signals from the next pass so that at the last pass everything is controlled. This is a technique used by "Groove".
  3. Using many aspects of a complex human activity for control. It is possible to derive many signals from a complex process such as human speech or singing or performance on an instrument, and to use each signal for control purposes. Pitch, amplitude, and formant frequencies are obvious possibilities if the activity is speech. Pitch, amplitude and spectrum are possibilities for instrumental performance.
  4. Using many performers. This possibility is obvious but we do not wish to overlook it.
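
Approaches 1 and 2 can be illustrated with a minimal Python sketch. Every name in it (`make_transforms`, `fan_out`, `layer_passes`, the harmonic spacing and the depth factor) is hypothetical and is not taken from "Groove" or any other existing system. The linear transform is only one simple choice of an "individually different" mapping, and the gesture is assumed to be normalised to the range 0 to 1.

```python
from typing import Callable, Dict, List

# A transform turns one gesture value (assumed normalised to 0..1)
# into a control value for a single oscillator.
Transform = Callable[[float], float]


def make_transforms(n: int, base_hz: float = 100.0) -> List[Transform]:
    """Build n different linear transforms, one per oscillator (illustrative choice)."""
    transforms: List[Transform] = []
    for k in range(n):
        # Oscillator k is centred on harmonic k+1 of base_hz and responds
        # more strongly to the gesture the higher its index.
        centre = base_hz * (k + 1)
        depth = 0.01 * (k + 1)
        transforms.append(lambda g, c=centre, d=depth: c * (1.0 + d * g))
    return transforms


def fan_out(gesture: float, transforms: List[Transform]) -> List[float]:
    """Approach 1: one action drives many oscillators, each through its own transform."""
    return [t(gesture) for t in transforms]


def layer_passes(passes: List[Dict[str, List[float]]]) -> Dict[str, List[float]]:
    """Approach 2: merge control signals recorded on successive passes.

    Each pass maps a parameter name to its recorded control signal; a later
    pass replaces an earlier one for the same parameter.
    """
    combined: Dict[str, List[float]] = {}
    for recorded in passes:
        combined.update(recorded)
    return combined


# One gesture value produces four different oscillator frequencies.
freqs = fan_out(0.5, make_transforms(4))

# Two passes: the performer controls pitch first, then loudness.
score = layer_passes([
    {"pitch": [0.1, 0.2, 0.3]},
    {"loudness": [0.9, 0.8, 0.7]},
])
```

After the last pass, `score` holds a control signal for every parameter, although the performer only ever handled one of them at a time.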

Matching human physiological characteristics

The control devices must fit the physiological characteristics of people. We believe that at least two kinds of gestures may need separate consideration. The first we call facile gestures: those which can be done rapidly, for example to control individual notes. For these gestures the fingers, lips, tongue, breath, vocal cords, and possibly other things associated with the vocal tract seem most promising. The second kind we call grand or smooth gestures, such as the sweeping arm movements used in conducting. These may involve the inertia of large parts of the body to achieve a smooth motion appropriate to control of phrases and large musical units.

We believe that there is often a conflict between precision and freedom in gestures - finger motions being an example of precise motions and arm-hand motions being an example of free motions. Because of this conflict, it is important to choose gestures appropriate to the demands of musical parameters for precision and freedom. It is also important to seek gestures which have both good precision and good freedom to as great an extent as is possible.

Using natural complexity

The everyday world of sound is remarkably complex: even a simple metal bar when struck vibrates in ways which would take great amounts of time to reproduce by additive synthesis. In thinking about control devices for sound synthesis, one goal should be to harness the complexity of the world to make and control complex structures in music.

We already have extremely delicate control over some of this complexity: here the voice is the first example which comes to mind. The muscles of the vocal tract can reproduce exceedingly small changes in complex speech sounds with startling fidelity. It should be possible to put the refinement of speech production to work to control complex musical structures in an equally refined and precise way.

Certainly pitch and amplitude, and perhaps the formant structure of speech, can be used to control any aspect of sound synthesis one chooses. One can imagine a composer constructing sounds modeled on, or at least determined by, the formant structure of a vowel sound. Timbral changes might be determined by a diphthong, while consonants might give information about attack, decay, and inharmonicity of components.


One device which should be given special consideration as an input mechanism is the piano or organ keyboard. It seems likely that present keyboards are the most rapid devices with which people can control things. It may or may not be possible to significantly improve keyboards by changing their shape. But even if their shape remains the same, their function should be examined. In particular, should the keyboard be an off-on device as in an organ or a touch sensitive device? If it is touch sensitive, how many dimensions should be sensed for each key? The piano probably senses one velocity for each keystroke plus how long the key is depressed. Can a person effectively control the entire time course of the keystroke? If not, how much can he control? Can a person effectively control lateral pressure on the keys and, if so, how many independent lateral pressures can he control?

Real-time graphical input device

A device such as a light pen and cathode ray tube, on which one can draw or trace a graph or picture and use the picture for real-time control of sound parameters, seems interesting. Specifically, the x and y coordinates of the tip of the pen could provide two time functions to control two musical parameters.
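
As a rough illustration of how two coordinates become two control functions, the Python sketch below maps a traced path onto frequency and amplitude. The choice of those two parameters, the 100 to 1000 Hz range, and the function names `scale` and `pen_to_controls` are all assumptions made for the example. The coordinates are assumed to be normalised to the range 0 to 1.

```python
from typing import List, Tuple


def scale(value: float, lo: float, hi: float) -> float:
    """Map a normalised coordinate (0..1) linearly onto lo..hi, clamping at the edges."""
    clamped = min(1.0, max(0.0, value))
    return lo + clamped * (hi - lo)


def pen_to_controls(trace: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Turn a trace of (x, y) pen positions into (frequency_hz, amplitude) pairs."""
    controls = []
    for x, y in trace:
        frequency_hz = scale(x, 100.0, 1000.0)  # assumed frequency range
        amplitude = scale(y, 0.0, 1.0)          # assumed amplitude range
        controls.append((frequency_hz, amplitude))
    return controls


# Three successive pen positions, read as the pen moves across the screen.
trace = [(0.0, 0.0), (0.5, 0.25), (1.0, 1.0)]
controls = pen_to_controls(trace)
```

Each element of `controls` is one point in time, so the list as a whole gives the two time functions described above.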

Generality and memory

To both facilitate research where great flexibility is needed and to provide for novel performance modes, we believe that it should be possible to attach almost any control device to any musical parameter. Obviously, some control devices will be less suited to some musical parameters or, at least, will produce unusual performances, but we feel that it is important to confront the musician with these novel possibilities.
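
One way to picture this generality is a patch table in which any device name may be routed to any parameter name. The Python sketch below is only an illustration: the class `PatchBay`, the device and parameter names, and the rule of summing readings that share a parameter are all assumptions.

```python
from typing import Dict, List, Tuple


class PatchBay:
    """Connects named control devices to named musical parameters."""

    def __init__(self) -> None:
        self._routes: List[Tuple[str, str]] = []

    def connect(self, device: str, parameter: str) -> None:
        """Route a device to a parameter; any pairing is allowed."""
        if (device, parameter) not in self._routes:
            self._routes.append((device, parameter))

    def apply(self, readings: Dict[str, float]) -> Dict[str, float]:
        """Given the current reading of each device, return the parameter values.

        If several devices feed one parameter, their readings are summed
        (an arbitrary choice for this sketch).
        """
        values: Dict[str, float] = {}
        for device, parameter in self._routes:
            if device in readings:
                values[parameter] = values.get(parameter, 0.0) + readings[device]
        return values


patch = PatchBay()
patch.connect("keyboard", "pitch")
patch.connect("light_pen_x", "vibrato_rate")
patch.connect("keyboard", "loudness")  # an unusual pairing is still permitted
parameters = patch.apply({"keyboard": 0.5, "light_pen_x": 0.2})
```

Because the routes are plain data, a musician can re-patch a device to a different parameter without changing either the device or the synthesizer.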

It is also important to directly record the control signals so that they can be analyzed and so they can be later reproduced. Fortunately, people move slowly (compared to sound waves) and low sampling rates (about 100 samples/second) are enough to record the signals produced by human gestures.
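
The storage and playback this implies can be sketched in a few lines of Python. Only the rate of 100 samples per second comes from the text. The function names `value_at` and `samples_needed`, and the use of linear interpolation to read between stored samples, are assumptions made for illustration.

```python
from typing import List

CONTROL_RATE_HZ = 100  # sampling rate quoted in the text


def value_at(recording: List[float], t_seconds: float,
             rate_hz: int = CONTROL_RATE_HZ) -> float:
    """Read a recorded control signal at time t using linear interpolation."""
    if not recording:
        raise ValueError("recording is empty")
    position = t_seconds * rate_hz
    if position <= 0:
        return recording[0]
    if position >= len(recording) - 1:
        return recording[-1]
    index = int(position)
    fraction = position - index
    return recording[index] + fraction * (recording[index + 1] - recording[index])


def samples_needed(duration_seconds: float, channels: int,
                   rate_hz: int = CONTROL_RATE_HZ) -> int:
    """Number of stored values for a recording of the given length and channel count."""
    return int(round(duration_seconds * rate_hz)) * channels


recording = [0.0, 1.0, 2.0, 3.0]       # four samples taken 10 ms apart
midpoint = value_at(recording, 0.015)  # halfway between the second and third samples
storage = samples_needed(60.0, 10)     # one minute of ten control channels
```

At this rate one minute of ten simultaneous gestures occupies 60,000 stored values, which shows how modest the storage for control signals is.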

What to build and what to buy

We hope the points we have presented make a convincing argument that the real-time control problem is much better suited for IRCAM's attention than the real-time sound synthesis problem. However, one can control nothing if one has nothing to control. IRCAM must either build or buy an adequate synthesizer. On this choice we see the following points:

  1. We can probably purchase exactly the same synthesizer that we would build. There are plenty of people who would build something according to our specifications.
  2. If we build a machine, we will learn a lot more about making synthesizers than if we buy a machine. Our people will become expert in this kind of circuit construction instead of the contractor's people.
  3. It will probably take us longer to get a synthesizer if we build it ourselves.
  4. The costs of building may turn out to be almost the same as the costs of buying. However, the kinds of money involved may be different - more operating budget being used if we choose to build.
