Ircam-Centre Pompidou

Detailed view

Document category	Contribution to a conference or congress
Title	Towards Glottal Source Controllability in Expressive Speech Synthesis
Main author	Jaime Lorenzo-Trueba
Co-authors	Roberto Barra-Chicote, Tuomo Raitio, Nicolas Obin, Paavo Alku, Junichi Yamagishi, Juan M. Montero
Conference / congress	Interspeech. Portland: 2012
Peer reviewed	Yes
Year	2012
Editorial status	Accepted - publication in progress
Abstract	To obtain more human-like-sounding human-machine interfaces, we must first be able to give them expressive capabilities, in the form of emotional and stylistic features, so as to adapt them closely to the intended task. To replicate those features, it is not enough to merely replicate the prosodic information of fundamental frequency and speaking rhythm. The proposed additional layer is the modification of the glottal model, for which we make use of the GlottHMM parameters. This paper analyzes the viability of such an approach by verifying that the expressive nuances are captured by the aforementioned features, obtaining 95% recognition rates on styled speaking and 82% on emotional speech. Then we evaluate the effect of speaker bias and recording environment on the source modeling in order to quantify possible problems when analyzing multi-speaker databases. Finally, we propose a speaking styles separation for Spanish based on prosodic features and check its perceptual significance.
Keywords	expressive speech synthesis / speaking style / glottal source modeling
Team	Analyse et synthèse sonores (Sound Analysis and Synthesis)
Call number	LorenzoTrueba12a
Online version URL	http://articles.ircam.fr/textes/LorenzoTrueba12a/index.pdf

© Ircam - Centre Pompidou 2005.