This work presents a set of signal processing methods for blindly estimating a low-level description of the room effect in a recorded audio scene, with the aim of deriving a perceptual characterization of its spatial features. Here, "blind" means that only the recording itself is available, with no additional information such as a geometrical description of the room or a set of room impulse responses. Because the early and late parts of the room effect differ in nature, two single-channel tools are proposed, each estimating the relevant information in one of these parts. The estimates of the early parts of the responses are then processed jointly by a coincidence detector to estimate the binaural cues of the sound scene.
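
As an illustration of the last step only, a coincidence detector can be approximated by a normalized cross-correlation between the two early-response estimates. The sketch below is a minimal stand-in under that assumption and is not the detector proposed in this work: the function name `binaural_cues`, the 1 ms lag range, and the choice of cues (interaural time difference, interaural level difference, and peak interaural correlation) are all hypothetical choices made for the example.

```python
import math


def binaural_cues(left, right, fs, max_lag_s=1e-3):
    """Return (itd_seconds, ild_db, peak_correlation) for two channels.

    Hypothetical helper: `left` and `right` stand for the early-part
    estimates of the two ear signals, sampled at `fs` Hz.
    A positive ITD means the right channel lags the left one.
    """
    if len(left) != len(right) or not left:
        raise ValueError("channels must be non-empty and of equal length")
    n = len(left)
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    if energy_l == 0.0 or energy_r == 0.0:
        raise ValueError("both channels must have non-zero energy")
    norm = math.sqrt(energy_l * energy_r)

    # Assumed search range: lags within +/- max_lag_s seconds.
    max_lag = min(int(round(max_lag_s * fs)), n - 1)
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # c(lag) = sum_t left[t] * right[t + lag], normalized to [-1, 1].
        start = max(0, -lag)
        stop = min(n, n - lag)
        c = sum(left[t] * right[t + lag] for t in range(start, stop)) / norm
        if c > best_val:
            best_lag, best_val = lag, c

    itd = best_lag / fs                          # lag of the correlation peak
    ild = 10.0 * math.log10(energy_l / energy_r)  # energy ratio in dB
    return itd, ild, best_val


if __name__ == "__main__":
    # Toy input: the right channel is a delayed, attenuated copy of the left.
    left = [0.0] * 64
    right = [0.0] * 64
    left[10] = 1.0
    right[13] = 0.5
    print(binaural_cues(left, right, fs=48000))
```

The direct double loop keeps the example dependency-free; for responses of realistic length, an FFT-based correlation would be the more practical choice.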