Ircam-Centre Pompidou

Detailed view

Document type	Contribution to a conference or symposium
Title	On Automatic Voice Casting for Expressive Speech: Speaker Recognition vs. Speech Classification
Main author	Nicolas Obin
Co-authors	Xavier Roebel, Grégoire Bachman
Conference	IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Florence, 2014
Peer-reviewed	Yes
Year	2014
Editorial status	Accepted - publication in progress
Abstract	This paper presents the first large-scale automatic voice casting system, and explores the adaptation of speaker recognition techniques to measure voice similarities. The proposed system is based on the representation of a voice by classes (e.g., age/gender, voice quality, emotion). First, a multi-label system is used to classify speech into classes. Then, the output probabilities for each class are concatenated to form a vector that represents the vocal signature of a speech recording. Finally, a similarity search is performed on the vocal signatures to determine the set of target actors that are the most similar to a speech recording of a source actor. In a subjective experiment conducted in the real context of voice casting for video games, the multi-label system clearly outperforms standard speaker recognition systems. This provides evidence that speech classes successfully capture the principal directions that are used in the perception of voice similarity.
Keywords	voice casting / voice similarity / speaker recognition / speech classification
Team	Sound analysis and synthesis (Analyse et synthèse sonores)
Call number	Obin14c
Online version URL	http://architexte.ircam.fr/textes/Obin14c/index.pdf
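
The pipeline summarized in the abstract (per-class probabilities concatenated into a vocal signature, followed by a similarity search over target actors) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the label group names and sizes, the probability values, the actor names, the helper functions `vocal_signature` and `most_similar`, and in particular the use of cosine similarity, since the abstract does not state which similarity measure the authors use.

```python
import numpy as np

# Hypothetical label groups; the abstract only names them as examples.
LABEL_GROUPS = ("age_gender", "voice_quality", "emotion")


def vocal_signature(class_probabilities):
    """Concatenate the per-group probability vectors in a fixed group order."""
    return np.concatenate(
        [np.asarray(class_probabilities[g], dtype=float) for g in LABEL_GROUPS]
    )


def most_similar(source_signature, target_signatures, k=2):
    """Rank targets by cosine similarity to the source (metric is an assumption)."""
    names = list(target_signatures)
    matrix = np.stack([target_signatures[n] for n in names])
    src = source_signature / np.linalg.norm(source_signature)
    tgt = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = tgt @ src
    order = np.argsort(-scores)[:k]  # indices of the k highest scores
    return [(names[i], float(scores[i])) for i in order]


# Made-up classifier outputs standing in for a real multi-label system.
source = vocal_signature(
    {"age_gender": [0.1, 0.7, 0.2], "voice_quality": [0.8, 0.2], "emotion": [0.6, 0.3, 0.1]}
)
targets = {
    "actor_a": vocal_signature(
        {"age_gender": [0.1, 0.6, 0.3], "voice_quality": [0.7, 0.3], "emotion": [0.5, 0.4, 0.1]}
    ),
    "actor_b": vocal_signature(
        {"age_gender": [0.8, 0.1, 0.1], "voice_quality": [0.2, 0.8], "emotion": [0.1, 0.2, 0.7]}
    ),
    "actor_c": vocal_signature(
        {"age_gender": [0.2, 0.6, 0.2], "voice_quality": [0.9, 0.1], "emotion": [0.6, 0.2, 0.2]}
    ),
}

ranking = most_similar(source, targets, k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these made-up values, the two targets whose class profiles resemble the source are returned and the dissimilar one is left out. A real system would obtain the probabilities from trained classifiers rather than hard-coded lists.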

© Ircam - Centre Pompidou 2005.