Abstract |
This paper describes a semi-parametric method for synthesizing speaker-like laughter. A large corpus of spontaneous laughter is presented, and an attempt to apply traditional automatic segmentation to the data is discussed. Significant results from the statistical analysis of the corpus are then presented, concerning the static and dynamic acoustic characteristics of bouts and syllables. Interestingly, laughter prosody seems to be guided by the same physiological constraints as speech. Following this analysis, a method is described for synthesizing laughter from any neutral utterance using these results. A TTS algorithm selects phones, which are duplicated to create a homotype series. Finally, speech processing modifies the prosody of this series, yielding a realistic, high-quality, speaker-like bout of laughter. |