This thesis proposes new analytical and modeling solutions to better understand and represent how we perform music. Musical performance can be considered as the conversion of a sequence of discrete, abstract cognitive concepts into a sequence of continuous physical actions. While the former are time-invariant and context-independent, the sequencing of actions imposes spatio-temporal variations to accommodate the continuous flow of movement. These variations, referred to as coarticulation, are the consequence of cognitive and biomechanical constraints. A better characterization of these phenomena is motivated by the wish to understand the embodied relation to music, and it also provides important cues for the manipulation of digital media, both in compositional processes and in the real-time control of musical interfaces. We measured the pressure levels inside the mouth, the facial muscular activity, the force applied against the mouthpiece and the radiated sound during the performance of trumpet players. The experimental protocol formulated for these measurements aimed at identifying the effects of dynamics and sequencing in the performance of musical tones. We first characterized the gesture control employed for the production of single, isolated notes, and then compared these findings with those obtained from the performance of series of concatenated notes. We then detail a novel framework for the synthesis of gesture control aimed at offline compositional tools with a virtual trumpet, as well as a design methodology for the real-time control of a virtual trumpet through physical interfaces. The latter is based on measurements obtained by mimicking sound-producing gestures through different control modalities.
The analysis showed that the level of effort and the rate at which musical tones are performed affect the timing of tonguing and breathing activity, as well as the control strategy employed for the breathing patterns. In particular, the spatial amplitude of the breathing gestures determined whether these were organized in a ballistic or a feedback type of control. Different speeds in performing consecutive tones revealed discordant scalings in the durations of tongue and breathing gestures. These results suggest that the synergetic constellation of gestural activities involved in the production of tones is not regulated by relational timing invariance but presents different contextual dependencies. The sequencing of consecutive tones resulted in the superposed activity of the control variables, and this coproduction imposed coarticulation effects. Specifically, these unified the motor control strategy adopted for the control of the pressure level and affected, though not systematically, the duration of tongue release. The planning of tone sequences affected the distribution of muscular effort. These results suggest that different control strategies are employed locally in transitions between gesture units and globally for the suprasegmental planning of sequences. Some of these spatio-temporal variations were subsequently simulated and discussed from a modeling perspective. The framework that has been developed makes it possible to represent, parametrically and in sufficient detail, some of the spatio-temporal contextual variations that emerged from the experimental curves.