Ircam-Centre Pompidou



    Document type: Journal article
    Title: On the use of a spatial cue as prior information for stereo sound source separation based on spatially weighted non-negative tensor factorization
    Main author: Yuki Mitsufuji
    Co-author: Axel Roebel
    Published in: EURASIP Journal on Advances in Signal Processing, March 2014, Vol. 2014, no. 1
    Peer-reviewed: Yes
    Pages: 1-9
    Year: 2014
    Editorial status: Published

    This paper proposes a new method to enhance the performance of non-negative tensor factorization (NTF), one of the most prevalent source separation techniques today. The enhancement is achieved mainly by introducing weights on bin-wise NTF cost functions; the weights differentiate NTF target components from other components so that the target is approximated more precisely than the others. Assuming sources are distributed sparsely in a 2-D sound field, the target components approximating a target source are selected exclusively either by a user or from accompanying images, by providing a spatial cue to the NTF framework. The spatial cue is given in a format similar to the well-known binaural feature, inter-channel level difference (IID). This makes it easy to incorporate the spatial cue into the system, since features in this format can be calculated from every spectrogram bin. The weighting functions are designed to take into account the distance between the spatial cue and the calculated features: the largest values are assigned to the spectrogram bins whose features are most similar to the spatial cue, and the value decreases in proportion to the distance between them. The method is evaluated in terms of separation quality by comparing the proposed algorithm with the conventional NTF technique, PARAFAC-NTF, as well as with other source separation techniques. The evaluation results, measured by signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and signal-to-artifact ratio (SAR), demonstrate the effectiveness of the new method, which is improved primarily by the weighting function and the IID-based initialization, while also reducing computational cost, a significant problem with NTF.
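    The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact feature, weighting function, or cost, so the panning-style feature in [0, 1], the linear decay with a floor, the squared-error cost, and all function and parameter names below are assumptions chosen for illustration.

```python
import numpy as np


def spatial_feature(mag_left, mag_right, eps=1e-12):
    """Per-bin level-based feature in [0, 1] (assumed form, not the paper's).

    0 means the bin's energy is entirely in the left channel,
    1 means it is entirely in the right channel.
    """
    return mag_right / (mag_left + mag_right + eps)


def spatial_weights(feature, cue, floor=0.1):
    """Weight each spectrogram bin by its closeness to the spatial cue.

    The weight is largest (1.0) where the feature equals the cue and
    decreases linearly with the distance. The floor is an assumption
    that keeps distant bins from being ignored completely.
    """
    return np.maximum(1.0 - np.abs(feature - cue), floor)


def parafac_model(Q, W, H):
    """PARAFAC-style model: V_hat[c, f, t] = sum_k Q[c, k] * W[f, k] * H[t, k]."""
    return np.einsum("ck,fk,tk->cft", Q, W, H)


def weighted_cost(V, V_hat, weights):
    """Bin-wise weighted squared error (assumed cost, for simplicity).

    V, V_hat: arrays of shape (channels, freq, time).
    weights:  array of shape (freq, time), shared by both channels.
    """
    return float(np.sum(weights[None, :, :] * (V - V_hat) ** 2))
```

    In this sketch, a cue of 0.5 would favor bins whose energy is balanced between the two channels, so approximation errors in those bins cost more than errors in bins panned hard left or right. A factorization driven by such a cost is pushed to model the target direction more precisely, which is the effect the abstract attributes to the weights.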

    Keywords: non-negative tensor factorization / sound source separation / spatial cue
    Team: Analyse et synthèse sonores (Sound Analysis and Synthesis)
    Call number: Mitsufuji14a

    © Ircam - Centre Pompidou 2005.