Ircam-Centre Pompidou



    Document category: Thesis
    Title: Data-Driven Concatenative Sound Synthesis
    Main author: Diemo Schwarz
    Discipline: Computer science
    University or institution: Université Paris 6 - Pierre et Marie Curie
    Advisor: Xavier Rodet
    Copyright: Diemo Schwarz
    Year: 2004
    Editorial status: Unpublished

    Concatenative data-driven sound synthesis methods use a large database of source sounds, segmented into heterogeneous units, and a unit selection algorithm that finds the units that best match the sound or musical phrase to be synthesised, called the target. The selection is performed according to the features of the units. These are characteristics extracted from the source sounds, e.g. pitch, or attributed to them, e.g. instrument class. The selected units are then transformed to fully match the target specification, and concatenated. If the database is sufficiently large, however, the probability is high that a matching unit will be found, so the need to apply transformations is reduced. Conventional synthesis methods are based on a model of the sound signal, and it is very difficult to build a model that preserves all the fine details of sound. Concatenative synthesis preserves these details by using actual recordings. This data-driven approach (as opposed to a rule-based approach) takes advantage of the information contained in the many sound recordings. For example, very natural-sounding transitions can be synthesised, since unit selection is aware of the context of the database units. In speech synthesis, concatenative methods are the most widely used and have brought a considerable gain in naturalness and intelligibility. Results in other fields, for instance speech recognition, confirm the general superiority of data-driven approaches. Concatenative data-driven approaches have made their way into some musical synthesis applications, which are briefly presented. The Caterpillar software system developed in this thesis allows data-driven musical sound synthesis from a large database. Unlike speech synthesis, however, musical creation is an artistic activity and thus not based on clearly definable criteria. The system therefore supports flexible, interactive use, which allows composers to obtain new sounds.
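As a minimal sketch of feature-based selection, assume each unit carries two numeric features and the target is given by the same features. The `Unit` class, the feature names, and the weights below are hypothetical illustrations, not taken from the Caterpillar system. The code picks the unit with the smallest weighted distance to the target and reports the pitch transposition that would still have to be applied.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Unit:
    """A database unit with illustrative (hypothetical) features."""
    name: str
    pitch: float     # MIDI note number
    loudness: float  # level in dB


def target_distance(unit, target, weights):
    """Weighted Euclidean distance between a unit and the target features."""
    return sqrt(
        weights["pitch"] * (unit.pitch - target["pitch"]) ** 2
        + weights["loudness"] * (unit.loudness - target["loudness"]) ** 2
    )


def select_unit(database, target, weights):
    """Return the best-matching unit and the transposition still needed."""
    best = min(database, key=lambda u: target_distance(u, target, weights))
    transposition = target["pitch"] - best.pitch  # semitones left to correct
    return best, transposition


# Toy database and target; all values are made up for illustration.
database = [
    Unit("a", pitch=60.0, loudness=-20.0),
    Unit("b", pitch=64.0, loudness=-12.0),
    Unit("c", pitch=67.0, loudness=-15.0),
]
target = {"pitch": 65.0, "loudness": -12.0}
weights = {"pitch": 1.0, "loudness": 0.1}

best, transposition = select_unit(database, target, weights)
```

In this toy setting, the denser the database is around the target, the smaller the remaining transposition, which is the effect the abstract describes for large databases.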
To constitute a unit database, alignment of music to a score is used to segment musical instrument recordings. The alignment is based on spectral peak structure matching, and two approaches, using Dynamic Time Warping and Hidden Markov Models, are compared. Descriptor extraction analyses the sounds for their signal, spectral, harmonic, and perceptual characteristics, and temporal modelling techniques characterise the temporal evolution of the units uniformly. In addition, score information such as playing style, or arbitrary information, can be attributed to the units and later used for selection. The database is implemented using a relational SQL database management system for optimal flexibility and reliability. A database interface cleanly separates the synthesis system from the database. The best matching sequence of units is found by a Viterbi unit selection algorithm. To allow a more flexible specification of the resulting sequence of units, the constraint solving algorithm of adaptive local search has been applied to unit selection as an alternative. Both algorithms are based on two distance functions: the target distance expresses the similarity of a target unit to the database units, and the concatenation distance expresses the quality of the join between two database units. Data-driven concatenative synthesis is then applied to instrument synthesis with high level control, explorative free synthesis from arbitrary sound databases, resynthesis of a recording with sounds from the database, and artistic speech synthesis. For these applications, unit corpora of violin sounds, environmental noises, and speech have been built.
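The interplay of the two distance functions in a Viterbi search can be sketched as follows. This is a generic dynamic-programming illustration under stated assumptions, not the thesis implementation: the function name `viterbi_select`, the dictionary-based units, and both toy distance functions are hypothetical. Here the target distance is the pitch difference, and the concatenation distance is zero only when two units were consecutive in the source recording.

```python
def viterbi_select(targets, units, target_dist, concat_dist):
    """Return indices into `units` minimising total target + concatenation cost."""
    n = len(units)
    # cost[j]: cheapest cost of any path ending in unit j at the current target
    cost = [target_dist(targets[0], u) for u in units]
    back = []  # back[k][j]: best predecessor of unit j at target k + 1

    for t in targets[1:]:
        new_cost, pointers = [], []
        for u in units:
            # Cheapest predecessor i for joining units[i] to u
            i_best = min(range(n), key=lambda i: cost[i] + concat_dist(units[i], u))
            join = cost[i_best] + concat_dist(units[i_best], u)
            new_cost.append(join + target_dist(t, u))
            pointers.append(i_best)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final unit
    j = min(range(n), key=lambda k: cost[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


# Toy distances (assumptions for illustration only)
def target_dist(target_pitch, unit):
    return abs(target_pitch - unit["pitch"])


def concat_dist(a, b):
    # Free join if b directly followed a in the source recording
    return 0.0 if b["pos"] == a["pos"] + 1 else 1.0


units = [
    {"pitch": 60, "pos": 0},
    {"pitch": 62, "pos": 1},
    {"pitch": 64, "pos": 2},
    {"pitch": 62, "pos": 10},
]

path = viterbi_select([60, 62, 64], units, target_dist, concat_dist)
short_path = viterbi_select([62, 64], units, target_dist, concat_dist)
```

Units 1 and 3 have the same pitch, so the target distance alone cannot separate them. In `short_path` the search prefers unit 1 because it joins unit 2 without a concatenation penalty. The search runs in time proportional to the number of targets times the square of the number of units, so a real system would need to prune candidates.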

    Keywords: alignment / music to score alignment / segmentation / dynamic time warping / polyphonic alignment / spectral distance / evaluation of alignment / sound synthesis / concatenative synthesis / data-driven synthesis / unit selection / database / c
    Teams: Analyse et synthèse sonores, Interactions musicales temps-réel
    Call number: Schwarz04a
    Address of the online version:

    © Ircam - Centre Pompidou 2005.