Ircam-Centre Pompidou

Recherche

  • Recherche simple
  • Recherche avancée

    Panier électronique

    Votre panier ne contient aucune notice

    Connexion à la base

  • Identification
    (Identifiez-vous pour accéder aux fonctions de mise à jour. Utilisez votre login-password de courrier électronique)

    Entrepôt OAI-PMH

  • Soumettre une requête

    Consulter la notice détailléeConsulter la notice détaillée
    Version complète en ligneVersion complète en ligne
    Version complète en ligne accessible uniquement depuis l'IrcamVersion complète en ligne accessible uniquement depuis l'Ircam
    Ajouter la notice au panierAjouter la notice au panier
    Retirer la notice du panierRetirer la notice du panier

  • English version
    (full translation not yet available)
  • Liste complète des articles

  • Consultation des notices


    Vue détaillée Vue Refer Vue Labintel Vue BibTeX  

    %0 Conference Proceedings
    %A Obin, Nicolas
    %T Cries and Whispers - Classification of Vocal Effort in Expressive Speech
    %D 2012
    %B Interspeech
    %C Portland
    %F Obin12d
    %K speech recognition
    %K vocal effort
    %K voice quality
    %K glottal source
    %K GMM-UBM/SVM
    %X The expansion of the video games industry raises innovative and challenging issues for speech technologies, e.g. the development of automatic content-based speech processing and speech recognition systems in the context of video games post-production and voice casting. This paper presents a large-scale study on the classification of vocal effort in expressive speech for video games. Changes in vocal effort conduct to substantial modifications in the configuration of voice production mechanisms. In particular, registers of vocal effort affect especially voice quality which reflects qualitative modifications of the source excitation characteristics. This study introduces robust source characteristics to measure various types of voice quality (e.g., breathy, creaky, tense) for the classification of vocal effort into whispered, normal, and shouted speech. The system is evaluated in the real scenario of video games production with the complete speech recordings of a massive role-playing video game. The proposed features significantly improve the classification from 81.1% to 87% over conventional MFCCs. These advancements confirm the role of the source and voice quality for the description of changes in vocal effort.
    %1 6
    %2 1
    %U http://architexte.ircam.fr/textes/Obin12d/

    © Ircam - Centre Pompidou 2005.