| Document category |
Contribution to a conference or congress |
| Title |
Real-Time Audio-to-Score Alignment of Singing Voice Based on Melody and Lyric Information |
| Lead author |
Rong Gong |
| Co-authors |
Philippe Cuvillier, Nicolas Obin, Arshia Cont |
| Conference / congress |
Interspeech. Dresden: 2015 |
| Peer-reviewed |
Yes |
| Year |
2015 |
| Editorial status |
Accepted - publication in progress |
| Abstract |
Singing voice is specific in music: a vocal performance conveys both music (melody/pitch) and lyrics (text/phoneme) content. This paper aims at exploiting the advantages of melody and lyric information for real-time audio-to-score alignment of singing voice. First, lyrics are added as a separate observation stream into a template-based hidden semi-Markov model (HSMM), whose observation model is based on the construction of vowel templates. Second, early and late fusion of melody and lyric information are processed during real-time audio-to-score alignment. An experiment conducted with two professional singers (male/female) shows that the performance of a lyrics-based system is comparable to that of melody-based score following systems. Furthermore, late fusion of melody and lyric information substantially improves the alignment performance. Finally, maximum a posteriori adaptation (MAP) of the vowel templates from one singer to the other suggests that lyric information can be efficiently used for any singer. |
| Keywords |
singing voice / real-time audio-to-score alignment / lyrics / spectral envelope / information fusion / singer adaptation |
| Teams |
Analyse et synthèse sonores (Sound Analysis and Synthesis), Systèmes temps-réel (Real-Time Systems) |
| Call number |
Gong15a |
| Online version URL |
http://architexte.ircam.fr/textes/Gong15a/index.pdf |
|