Ircam-Centre Pompidou

Detailed view

Document type	Thesis
Title	Data-Driven Concatenative Sound Synthesis
Main author	Diemo Schwarz
Discipline	Computer science
University or institution	Université Paris 6 - Pierre et Marie Curie
Advisor	Xavier Rodet
Copyright	Diemo Schwarz
Year	2004
Publication status	Unpublished
Abstract	Concatenative data-driven sound synthesis methods use a large database of source sounds, segmented into heterogeneous units, and a unit selection algorithm that finds the units that best match the sound or musical phrase to be synthesised, called the target. The selection is performed according to the features of the units. These are characteristics extracted from the source sounds, e.g. pitch, or attributed to them, e.g. instrument class. The selected units are then transformed to fully match the target specification, and concatenated. However, if the database is sufficiently large, the probability is high that a matching unit will be found, so the need to apply transformations is reduced. Usual synthesis methods are based on a model of the sound signal. It is very difficult to build a model that would preserve all the fine details of sound. Concatenative synthesis achieves this by using actual recordings. This data-driven approach (as opposed to a rule-based approach) takes advantage of the information contained in the many sound recordings. For example, very natural-sounding transitions can be synthesised, since unit selection is aware of the context of the database units. In speech synthesis, concatenative synthesis methods are the most widely used. They have resulted in a considerable gain in naturalness and intelligibility. Results in other fields, for instance speech recognition, confirm the general superiority of data-driven approaches. Concatenative data-driven approaches have made their way into some musical synthesis applications, which are briefly presented. The Caterpillar software system developed in this thesis allows data-driven musical sound synthesis from a large database. However, musical creation is an artistic activity and thus, unlike speech synthesis, not based on clearly definable criteria. For this reason, the system supports flexible, interactive use, allowing composers to obtain new sounds.
To constitute a unit database, alignment of music to a score is used to segment musical instrument recordings. It is based on spectral peak structure matching, and two approaches, using Dynamic Time Warping and Hidden Markov Models, are compared. Descriptor extraction analyses the sounds for their signal, spectral, harmonic, and perceptual characteristics, and temporal modelling techniques characterise the temporal evolution of the units uniformly. In addition, score information such as playing style, or arbitrary information, can be attributed to the units and later used for selection. The database is implemented using a relational SQL database management system for optimal flexibility and reliability. A database interface cleanly separates the synthesis system from the database. The best matching sequence of units is found by a Viterbi unit selection algorithm. To allow a more flexible specification of the resulting sequence of units, the constraint-solving algorithm of adaptive local search has alternatively been applied to unit selection. Both algorithms are based on two distance functions: the target distance expresses the similarity of a target unit to the database units, and the concatenation distance expresses the quality of the join between two database units. Data-driven concatenative synthesis is then applied to instrument synthesis with high-level control, explorative free synthesis from arbitrary sound databases, resynthesis of a recording with sounds from the database, and artistic speech synthesis. For these applications, unit corpora of violin sounds, environmental noises, and speech have been built.
Keywords	alignment / music to score alignment / segmentation / dynamic time warping / polyphonic alignment / spectral distance / evaluation of alignment / sound synthesis / concatenative synthesis / data-driven synthesis / unit selection / database / c
Teams	Sound analysis and synthesis, Real-time musical interactions
Call number	Schwarz04a
Online version URL	http://articles.ircam.fr/textes/Schwarz04a/index.pdf
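
The abstract describes unit selection as a Viterbi search driven by a target distance and a concatenation distance. The following is a minimal sketch of that general idea, not the thesis's implementation: the function name `select_units`, the weights `w_target` and `w_concat`, and the toy distance functions are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")  # target unit description (hypothetical type)
U = TypeVar("U")  # database unit (hypothetical type)


def select_units(
    targets: Sequence[T],
    database: Sequence[U],
    target_distance: Callable[[T, U], float],
    concat_distance: Callable[[U, U], float],
    w_target: float = 1.0,  # assumed weight, not from the source
    w_concat: float = 1.0,  # assumed weight, not from the source
) -> List[U]:
    """Return the database unit sequence with minimal total cost (Viterbi)."""
    if not targets or not database:
        return []
    n = len(database)
    # cost[j]: cheapest cumulative cost of any path ending in database unit j
    cost = [w_target * target_distance(targets[0], u) for u in database]
    back: List[List[int]] = []  # back[k][j]: best predecessor of j at step k+1
    for t in targets[1:]:
        new_cost, pointers = [], []
        for u in database:
            # Cheapest way to arrive at u from any unit chosen at the previous step
            trans = [cost[i] + w_concat * concat_distance(database[i], u)
                     for i in range(n)]
            best_i = min(range(n), key=trans.__getitem__)
            new_cost.append(trans[best_i] + w_target * target_distance(t, u))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final unit
    j = min(range(n), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [database[j] for j in path]


# Toy example: each unit is (MIDI pitch, position in its source recording).
Unit = Tuple[int, int]
units: List[Unit] = [(60, 10), (62, 20), (60, 0), (62, 1), (64, 2), (64, 30)]


def pitch_distance(target_pitch: int, unit: Unit) -> float:
    # Toy target distance: absolute pitch difference
    return float(abs(unit[0] - target_pitch))


def join_cost(a: Unit, b: Unit) -> float:
    # Toy concatenation distance: free if b directly followed a in the source
    return 0.0 if b[1] == a[1] + 1 else 1.0


selected = select_units([60, 62, 64], units, pitch_distance, join_cost)
print(selected)
```

In this toy corpus several units match each target pitch exactly, so the target distance alone cannot decide between them. The concatenation distance makes the search prefer the three units that were contiguous in the source recording, which is the context-awareness the abstract credits for natural-sounding transitions. The search is quadratic in the database size per target unit, so a real system would presumably prune candidates first.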

© Ircam - Centre Pompidou 2005.