Perceptual Atoms: Wefts, Harmonics and Noise Bursts

Computational Auditory Scene Analysis is an emerging field that is producing new, non-FFT-based representations of audio with a view to solving difficult auditory scene analysis problems. Frequency components are grouped using Gestalt principles, such as synchrony of onset and temporal proximity, as well as psychoacoustic principles, such as harmonicity and critical-band masking effects.
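
As a rough illustration of two of these grouping cues, the sketch below collects frequency components that start at about the same time (synchrony of onset) and lie close to integer multiples of a common fundamental (harmonicity). The names Partial, is_harmonic and group_partials, and the tolerance values, are assumptions made for this example only; they are not taken from any particular auditory scene analysis system, and a real system would also have to estimate the fundamental and the onset time rather than receive them as arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """A single frequency component (hypothetical structure for this sketch)."""
    freq_hz: float   # centre frequency of the component
    onset_s: float   # time at which the component starts


def is_harmonic(freq_hz, f0_hz, tol=0.03):
    """True if freq_hz is within a relative tolerance of an integer multiple of f0_hz."""
    n = round(freq_hz / f0_hz)          # nearest harmonic number
    return n >= 1 and abs(freq_hz - n * f0_hz) <= tol * n * f0_hz


def group_partials(partials, f0_hz, onset_s, onset_tol_s=0.02, harm_tol=0.03):
    """Split partials into (grouped, rest).

    A partial joins the group only if it satisfies both cues:
    its onset is close to onset_s and it is harmonically related to f0_hz.
    """
    grouped, rest = [], []
    for p in partials:
        synchronous = abs(p.onset_s - onset_s) <= onset_tol_s
        if synchronous and is_harmonic(p.freq_hz, f0_hz, harm_tol):
            grouped.append(p)
        else:
            rest.append(p)
    return grouped, rest


if __name__ == "__main__":
    partials = [
        Partial(220.0, 0.10),
        Partial(440.0, 0.11),
        Partial(661.0, 0.105),   # slightly mistuned third harmonic
        Partial(500.0, 0.10),    # synchronous but not harmonic
        Partial(880.0, 0.50),    # harmonic but starts much later
    ]
    grouped, rest = group_partials(partials, f0_hz=220.0, onset_s=0.10)
    print([p.freq_hz for p in grouped])
    print([p.freq_hz for p in rest])
```

Requiring both cues at once is a deliberate simplification; other cues named above, such as temporal proximity and critical-band masking, are not modelled here.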

Such representations are used to partition the time-frequency space, as represented in a constant-Q spectrogram, into perceptually grouped regions. The potential of these grouped representations is enormous: all the salient components of an auditory signal are encoded in three co-existing representations. At present, however, they require a prohibitive amount of computation, as well as specialist software tools, to form groups and tracks robustly.
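
For readers unfamiliar with the constant-Q layout of the time-frequency space, the sketch below computes the geometrically spaced centre frequencies of such an analysis and the quality factor Q that is shared by every bin. The function names and the example values (a 55 Hz lowest bin, 12 bins per octave) are assumptions chosen for illustration; the text does not specify the parameters of the spectrogram it refers to.

```python
def constant_q_frequencies(f_min_hz, bins_per_octave, n_bins):
    """Centre frequencies f_k = f_min * 2**(k / bins_per_octave) for k = 0 .. n_bins - 1."""
    return [f_min_hz * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def quality_factor(bins_per_octave):
    """Q = f_k / bandwidth_k, identical for every bin when bins are geometrically spaced."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


if __name__ == "__main__":
    freqs = constant_q_frequencies(f_min_hz=55.0, bins_per_octave=12, n_bins=25)
    q = quality_factor(12)
    for f in freqs[::12]:
        # Bandwidth grows in proportion to centre frequency, so Q stays fixed.
        print(f"centre {f:8.2f} Hz, bandwidth {f / q:6.2f} Hz")
```

Because the bandwidth of each bin is proportional to its centre frequency, the frequency resolution is fine at low frequencies and coarse at high frequencies, unlike the uniform bin spacing of an FFT.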



Michael Casey
Fri Mar 22 15:49:22 EST 1996