Abstract |
In the analysis of the prosody/discourse interface, there is a demonstrated need for sharable prosodic notation systems for segmenting continuous speech into discourse units (Mettouchi et al. 2007, Simon & Degand 2011) that can be used independently of any particular discourse genre (Cresti & Moneglia 2005). The fundamental issue for the transcription of prosody in discourse is to define stable prosodic units (phrasing and boundaries, accents, others?) which can be, first, identified in common by experts on perceptual bases, regardless of their theoretical background; second, manipulated by novices (linguists from other domains, students); and third, used for easy data exchange, comparison, and corpus-based learning methods for automatic prosodic labelling (Avanzi et al. 2010). French is usually considered a boundary language (Lacheret-Dujour & Beaugendre 1999), so transcriptions of French prosody generally rely on the identification of different degrees of boundaries. However, in a bottom-up approach, the identification of different degrees of prosodic breaks is far from consensual, even among experts. The present study addresses the definition of the prosodic units that can be used for the transcription of French prosody in discourse. The aim of this communication is to present the successive steps of the prosodic transcription work conducted over three years within the Rhapsodie project in order to propose a reference transcription system for the segmentation of French discourse into prosodic units. The methodology adopted, including linguistic prerequisites, speech database, and experiments, is presented as follows: 1. The general context and the different modules involved in the analysis of the Rhapsodie speech database, and the linguistic justification of the methodology chosen for the prosodic processing (a bottom-up, data-driven approach, independent of any particular theory and of functional cues). 2. The data, i.e. the different genres of discourse to be annotated (about 3 hours of speech, monologues and dialogues). 3. The different experiments conducted to provide a prosodic transcription framework (transcription methodology and transcription reliability measures) and to establish a reference speech database with prosodic transcription in French. This third section presents, first, two pilot experiments conducted with a consortium of 15 French experts in order to define the optimal, i.e. the most sharable, transcription unit (boundaries vs. syllabic prominences); second, the prosodic transcription by 5 novices; and third, the final transcription, referred to as the reference transcription, by 3 experts. For the first two steps, we will present the transcription procedure and the resulting inter-transcriber agreement. For steps 2 and 3, the guidelines used for the transcription (transcription by novices and correction by experts) will be presented. |
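As an illustration of what an inter-transcriber agreement measure can look like, the following is a minimal sketch of Fleiss' kappa, a common chance-corrected agreement statistic for more than two annotators. The abstract does not state which reliability measure was used in the Rhapsodie project, so the choice of Fleiss' kappa is an assumption, as are the function name `fleiss_kappa`, the labels `"P"` (prominent) and `"0"` (non-prominent), and the toy data.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings    -- list of lists; ratings[i] holds the labels given to item i
    categories -- the set of possible labels (hypothetical: "P" / "0")
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: proportion of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement by chance, from the overall label proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        # Only one label was ever used: agreement is trivially perfect.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data (invented): 4 syllables, each labelled by 3 transcribers.
toy = [
    ["P", "P", "P"],
    ["0", "0", "0"],
    ["P", "P", "0"],
    ["0", "0", "P"],
]
print(fleiss_kappa(toy, categories=["P", "0"]))
```

In a setting like the one described above, each item would be a syllable (for prominence) or an inter-word position (for boundaries), and the same computation could be run separately for the expert and novice groups to compare how sharable each unit is.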