Record display
%0 Conference Proceedings
%A Rossignol, Stéphane
%A Rodet, Xavier
%A Soumagne, Joel
%A Colette, Jean-Louis
%A Depalle, Philippe
%T Feature extraction and temporal segmentation of acoustic signals
%D 1998
%B ICMC: International Computer Music Conference
%C Ann Arbor
%F Rossignol98a
%K segmentation
%K signal representation
%K feature extraction
%K coding
%K multimedia
%X This paper deals with temporal segmentation of acoustic
signals and feature extraction. Segmentation and feature
extraction are intended as a first step toward sound signal
representation, coding, transformation and multimedia.
Three interdependent levels of segmentation are defined. They
correspond to different levels of signal attributes. The
Source level distinguishes speech, singing voice, instrumental
parts and other sounds, such as street sounds, machine
noise... The Feature level deals with characteristics such
as silence/sound, transitory/steady, voiced/unvoiced, harmonic,
vibrato and so forth. The last level is the segmentation into
Notes and Phones.
A large set of features is first computed: the derivative and
relative derivative of f0 and of the energy, a voicing coefficient,
a measure of the inharmonicity of the partials, the spectral
centroid, the spectral "flux", higher-order statistics, energy
modulation, etc. A decision function built on this feature set
provides the segmentation marks; it also depends on the current
application and the required result. For example, in the case of
the singing voice, segmentation according to pitch differs from
segmentation into phones. A graphical interface allows visualization
of these features, the results of the decisions, and the final result.
For the Source level, some features are predominant: spectral
centroid, spectral flux, energy modulation and their variance
computed on a sound segment of one second or more.
Segmentation starts with the Source level, but the three levels
are not independent. Therefore, information obtained at a given
level is propagated towards the other levels. For example, in the
case of instrumental music and the singing voice, if vibrato is
detected at the Feature level, the amplitude and frequency of the
vibrato are estimated and taken into account at the Notes and
Phones level. The vibrato is removed from the f0 trajectory, and
the high frequencies of the signal are not used in the spectral
flux computation.
A complete segmentation and feature extraction system is demonstrated.
Applications and results on various examples such as a movie sound
track are presented.
%1 7
%2 3
%U http://articles.ircam.fr/textes/Rossignol98a
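
The abstract above lists the spectral centroid and the spectral "flux"
among the computed features. The following is a minimal sketch, not the
authors' code, of how these two features can be obtained frame by frame
with NumPy; the sample rate, frame length and hop size are illustrative
assumptions.

    import numpy as np

    def spectral_features(signal, sr=16000, frame_len=1024, hop=512):
        """Per-frame spectral centroid (Hz) and spectral flux."""
        window = np.hanning(frame_len)
        freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
        centroids, fluxes = [], []
        prev_mag = None
        for start in range(0, len(signal) - frame_len, hop):
            frame = signal[start:start + frame_len] * window
            mag = np.abs(np.fft.rfft(frame))
            total = mag.sum() + 1e-12
            # Centroid: magnitude-weighted mean frequency of the spectrum.
            centroids.append(float((freqs * mag).sum() / total))
            if prev_mag is not None:
                # Flux: change of the normalized spectrum between frames.
                fluxes.append(float(np.sum((mag / total - prev_mag) ** 2)))
            prev_mag = mag / total
        return np.array(centroids), np.array(fluxes)

For the Source level, the abstract uses the variance of such features over
segments of one second or more; with the assumed parameters this would be
roughly np.var(centroids[i:i + sr // hop]) over a sliding one-second window.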
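
The decision function described in the abstract combines many features and
depends on the application. Purely as an illustration of turning a feature
trajectory into segmentation marks, the sketch below thresholds one feature
(short-time energy) to obtain silence/sound boundaries; the threshold and
frame parameters are assumptions, not values from the paper.

    import numpy as np

    def energy_marks(signal, frame_len=1024, hop=512, threshold_db=-40.0):
        """Frame indices where the signal crosses a silence threshold."""
        energy_db = []
        for start in range(0, len(signal) - frame_len, hop):
            frame = signal[start:start + frame_len]
            energy_db.append(10.0 * np.log10(np.mean(frame ** 2) + 1e-12))
        energy_db = np.array(energy_db)
        voiced = energy_db > threshold_db      # silence/sound label per frame
        # A mark is placed wherever the label changes between frames.
        return np.flatnonzero(np.diff(voiced.astype(int)) != 0) + 1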
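
The abstract also states that detected vibrato is removed from the f0
trajectory before the Notes and Phones level. One simple way to do this,
assuming vibrato is a roughly 4-8 Hz modulation (the paper's exact method
is not reproduced here), is to smooth the f0 track with a moving average
spanning about one vibrato period.

    import numpy as np

    def remove_vibrato(f0, f0_rate=100.0, vibrato_hz=6.0):
        """Smooth an f0 track (sampled at f0_rate Hz) to suppress vibrato."""
        win = max(1, int(round(f0_rate / vibrato_hz)))  # ~one vibrato period
        kernel = np.ones(win) / win
        # Averaging over a full vibrato cycle cancels the oscillation and
        # leaves only the slower pitch contour.
        return np.convolve(f0, kernel, mode="same")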