%0 Conference Proceedings
%A Régnier, Lise
%A Peeters, Geoffroy
%T Singing Voice Detection in Music Tracks using Direct Voice Vibrato Detection
%D 2009
%B ICASSP
%F Regnier09b
%K Singing voice detection
%K vibrato detection
%K voice segmentation
%K vibrato and tremolo parameters extraction
%K feature extraction
%X In this paper we investigate the problem of locating singing voice in music tracks. As opposed to most existing methods for this task, we rely on the extraction of characteristics specific to the singing voice. In our approach we assume that the singing voice is characterized by harmonicity, formants, vibrato and tremolo. In the present study we deal only with the vibrato and tremolo characteristics. To this end, we first extract sinusoidal partials from the musical audio signal. The frequency modulation (vibrato) and amplitude modulation (tremolo) of each partial are then studied to determine whether the partial corresponds to singing voice, in which case the corresponding segment is assumed to contain singing voice. For each partial we estimate the rate (frequency of the modulation) and the extent (amplitude of the modulation) of both vibrato and tremolo. Partials are then selected based on these values. A second criterion based on harmonicity is also introduced. Based on this, each segment can be labelled as singing or non-singing. Post-processing is then applied to the segmentation in order to remove short-duration segments. The proposed method is evaluated on a large manually annotated test set. The results of this evaluation are compared to those obtained with a standard machine learning approach (MFCC and SFM modeling with GMM). The proposed method achieves results very close to those of the machine learning approach: 76.8% compared to 77.4% F-measure (frame classification). This result is very promising, since the two approaches are orthogonal and can therefore be combined.
%1 6
%2 3
%U http://articles.ircam.fr/textes/Regnier09b/
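
The abstract describes estimating, for each sinusoidal partial, the rate and extent of its frequency modulation (vibrato) and amplitude modulation (tremolo), then selecting partials from these values. The sketch below illustrates one possible way to do this and is not the authors' implementation: it takes the per-frame track of a partial, removes its mean, and picks the strongest component of a naive DFT within a modulation band. The function names, the 1 to 20 Hz search band, the 4 to 8 Hz acceptance range and the 2 Hz minimum extent are all assumptions chosen for illustration; none of them come from the paper.

```python
import math


def modulation_rate_and_extent(track, frame_rate, min_rate=1.0, max_rate=20.0):
    """Estimate the dominant modulation of one partial's track.

    track: per-frame values (frequency in Hz for vibrato, amplitude for tremolo).
    frame_rate: number of track frames per second.
    Returns (rate_hz, extent), where extent is the amplitude of the strongest
    sinusoidal modulation, in the same units as track. Returns (0.0, 0.0) if
    no modulation is found in the search band (hypothetical defaults).
    """
    n = len(track)
    mean = sum(track) / n
    centered = [v - mean for v in track]
    best_rate, best_extent = 0.0, 0.0
    # Naive DFT over the bins strictly below Nyquist that fall in the band.
    for k in range(1, (n - 1) // 2 + 1):
        rate = k * frame_rate / n
        if rate < min_rate or rate > max_rate:
            continue
        re = sum(c * math.cos(2 * math.pi * k * i / n) for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * k * i / n) for i, c in enumerate(centered))
        extent = 2.0 * math.hypot(re, im) / n  # amplitude of the sinusoid at bin k
        if extent > best_extent:
            best_rate, best_extent = rate, extent
    return best_rate, best_extent


def has_voice_like_vibrato(freq_track, frame_rate, rate_range=(4.0, 8.0), min_extent_hz=2.0):
    """Partial selection rule with illustrative (assumed) thresholds."""
    rate, extent = modulation_rate_and_extent(freq_track, frame_rate)
    return rate_range[0] <= rate <= rate_range[1] and extent >= min_extent_hz


# Synthetic partials: one second at 100 frames per second.
FRAME_RATE = 100
sung = [440.0 + 10.0 * math.sin(2 * math.pi * 6.0 * i / FRAME_RATE) for i in range(100)]
steady = [440.0] * 100

print(modulation_rate_and_extent(sung, FRAME_RATE))
print(has_voice_like_vibrato(sung, FRAME_RATE), has_voice_like_vibrato(steady, FRAME_RATE))
```

The same function can be applied to a partial's amplitude track to obtain tremolo rate and extent. A real system would also need the partial tracking, the harmonicity criterion and the segment post-processing mentioned in the abstract, which this sketch leaves out.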