Abstract |
Our daily life takes place in dense acoustic environments. This is especially true of train stations, where soundscapes are usually very complex. In this paper, we investigate how Non-negative Matrix Factorization (NMF) methods can be used to obtain a low-rank approximation of the spectrogram, composed of spectral templates that can be related to salient events such as footsteps, whistles, etc. We thus assume that the scene can be characterized by a few salient events that occur several times within the scene. We also assume that, even if the acoustic realizations of those events cannot be observed in isolation, those realizations have similar spectro-temporal properties. We consider 66 recordings made in French train stations, in which individual salient events have been manually annotated. We then assess the ability of the methods to extract meaningful components by comparing the activations of those components within the scene to the manually annotated ones. Experiments demonstrate that enforcing sparsity on the activations, i.e. constraining only a few components to be active at a time, has a positive effect. |
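To make the idea of a low-rank spectrogram approximation with sparse activations concrete, here is a minimal sketch in Python. It is not the method evaluated in the paper: the abstract does not state the cost function or the form of the sparsity constraint, so the sketch assumes a Euclidean reconstruction cost with an L1 penalty on the activations, optimized with heuristic multiplicative updates. The function name `sparse_nmf`, its parameters, and the toy matrix `V` are all hypothetical.

```python
import numpy as np


def sparse_nmf(V, n_components, sparsity=0.1, n_iter=200, seed=0):
    """Approximate a non-negative matrix V (freq x frames) as W @ H.

    Assumptions (not taken from the paper): Euclidean cost, L1 penalty of
    weight `sparsity` on H, multiplicative updates, unit-norm templates.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12  # guards against division by zero
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + eps    # spectral templates
    H = rng.random((n_components, n_frames)) + eps  # activations over time

    for _ in range(n_iter):
        # Activation update; the L1 penalty appears as a constant
        # added to the denominator, which shrinks small activations.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Template update (plain Euclidean multiplicative rule).
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Rescale so each template has unit L2 norm; without this, the
        # penalty could be evaded by shrinking H and inflating W.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


# Toy stand-in for a magnitude spectrogram: an exactly rank-3 matrix.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H = sparse_nmf(V, n_components=3, sparsity=0.1)
```

In this sketch, each column of `W` plays the role of a spectral template and each row of `H` is the corresponding activation over time; it is such rows that would be compared against the manual annotations. The renormalization step is a common heuristic and does not guarantee that the penalized cost decreases at every iteration.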