Ircam-Centre Pompidou

Recherche

  • Recherche simple
  • Recherche avancée

    Panier électronique

    Votre panier ne contient aucune notice

    Connexion à la base

  • Identification
    (Identifiez-vous pour accéder aux fonctions de mise à jour. Utilisez votre login-password de courrier électronique)

    Entrepôt OAI-PMH

  • Soumettre une requête

    Consulter la notice détailléeConsulter la notice détaillée
    Version complète en ligneVersion complète en ligne
    Version complète en ligne accessible uniquement depuis l'IrcamVersion complète en ligne accessible uniquement depuis l'Ircam
    Ajouter la notice au panierAjouter la notice au panier
    Retirer la notice du panierRetirer la notice du panier

  • English version
    (full translation not yet available)
  • Liste complète des articles

  • Consultation des notices


    Vue détaillée Vue Refer Vue Labintel Vue BibTeX  

    %0 Thesis
    %A Obin, Nicolas
    %T MeLos: Analysis and Modelling of Speech Prosody and Speaking Style
    %D 2011
    %C Paris
    %I Ircam-UPMC
    %F Obin11e
    %K speech prosody
    %K speaking style
    %K speech synthesis
    %K discrete-continuous HMMs
    %K information fusion
    %K stylization
    %K trajectory modelling
    %K linguistics
    %X This thesis addresses the issue of modelling speech prosody for speech synthesis and presents MeLos: a complete system for the analysis and modelling of speech prosody, “the music of speech”. The objective of this thesis is to model the strategy, alternatives, and speaking style of a speaker for natural, expressive, and varied speech synthesis. The present study presents original contributions with special attention paid to the combination of theoretical linguistic and statistical modelling to provide a complete speech prosody system. A unified discrete/continuous context-dependent HMM is presented to model the symbolic and the acoustic characteristics of speech prosody: 1) A rich description of the text characteristics based on a linguistic processing chain that includes surface and deep syntactic parsing is proposed to refine the modelling of the speech prosody in context. 2) Segmental HMMs and Dempster-Shafer fusion are used to balance linguistic and metric constrains in the production of a pause. 3) A trajectory model is proposed based on the stylization and the simultaneous modelling of short and long-term F0 variations over various temporal domains. The proposed system is used to model the strategies, alternatives and speaking style of a speaker, and is extended to model the speaking style of any arbitrary number of speakers using shared-context-dependent modelling and speaker normalization techniques.
    %1 8
    %2 1
    %U http://architexte.ircam.fr/textes/Obin11e/

    © Ircam - Centre Pompidou 2005.