Short-Time Spectral Frames

In the late 1970s and early 1980s the Short-Time Fast Fourier Transform (SFFT) became an important representation for sound. Digital spectrograms based on SFFT data were used for the interpretation of speech data and for the analysis of acoustic signals. The main flaw of the FFT as a spectral estimator is that its analysis bins are spaced linearly in frequency, whereas perceptual frequency resolution is not linear. The representation therefore devotes as much resolution to the high-frequency components of a sound as to its low-frequency components.
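The effect of linear bin spacing can be illustrated by counting how many FFT bins fall into each successive octave. The following is a minimal sketch, not part of the original analysis; the function names, the sample rate and the FFT length are assumptions chosen for illustration.

```python
def fft_bin_frequencies(sample_rate, n_fft):
    """Center frequencies (Hz) of the non-negative FFT bins, 0 .. Nyquist."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


def bins_per_octave(sample_rate, n_fft, f_low, n_octaves):
    """Count the FFT bins inside each octave [f, 2f), starting at f_low."""
    freqs = fft_bin_frequencies(sample_rate, n_fft)
    counts = []
    lo = f_low
    for _ in range(n_octaves):
        hi = 2 * lo
        counts.append(sum(1 for f in freqs if lo <= f < hi))
        lo = hi
    return counts
```

With an assumed sample rate of 8192 Hz and a 1024-point FFT the bins are 8 Hz apart, so `bins_per_octave(8192, 1024, 64, 4)` gives 8, 16, 32 and 64 bins for the octaves starting at 64, 128, 256 and 512 Hz: each higher octave receives twice as many bins as the one below it.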

The cochlear nerve and nucleus operate on a roughly logarithmic frequency scale, so more information is devoted to the lower frequencies than to the higher frequencies. The frequency space is divided into bands, called critical bands, that maintain a roughly constant ratio (Q) between their center frequency (CF) and their bandwidth.
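A constant-Q division of the frequency axis can be sketched as a set of geometrically spaced center frequencies whose bandwidths grow in proportion to CF. This is an idealized illustration only: the function name, the number of bands per octave and the frequency limits are assumptions, and measured critical bands follow the constant-Q rule only roughly.

```python
def constant_q_bands(f_min, f_max, bands_per_octave):
    """Return (center_frequency, bandwidth) pairs with a fixed CF/bandwidth ratio.

    Center frequencies are spaced geometrically; the bandwidth of each band is
    chosen as CF / Q with Q = 1 / (2**(1/bands_per_octave) - 1), so that the
    bandwidth equals the distance to the next center frequency.
    """
    ratio = 2.0 ** (1.0 / bands_per_octave)
    q = 1.0 / (ratio - 1.0)
    bands = []
    k = 0
    while True:
        cf = f_min * ratio ** k
        if cf > f_max:
            break
        bands.append((cf, cf / q))
        k += 1
    return bands
```

For example, `constant_q_bands(100.0, 1000.0, 3)` yields third-octave bands whose bandwidths widen with frequency while CF divided by bandwidth stays the same for every band.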

There is another problem with FFT-based spectral representations of sound: they contain little information about the distribution of noise in the signal. The Fast Fourier Transform (FFT) is based on the Discrete Fourier Transform (DFT), which is a generalization of the Discrete Fourier Series (DFS), which represents periodic signals. The DFT and the FFT are both based on the assumption that the analysis signal is infinitely periodic, with the period being the length of the analysis window. Since noise is, by definition, non-periodic, other methods must be sought for estimating the noise component of the signal. Smith and Serra proposed a bi-partite system in which the noise spectrum is estimated in conjunction with an FFT-based analysis technique; their representation is thus both stochastic and periodic.
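The periodicity assumption can be made concrete with a small experiment. The sketch below uses a direct DFT written for clarity rather than an FFT; the function names and the window length are assumptions for illustration. It compares a sinusoid that completes a whole number of cycles in the analysis window with one that does not.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the DFT of x for bins 0 .. N/2 (direct O(N^2) sum)."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def sinusoid(cycles_per_window, n):
    """A cosine that completes cycles_per_window cycles in n samples."""
    return [math.cos(2 * math.pi * cycles_per_window * t / n) for t in range(n)]


def peak_energy_fraction(x):
    """Fraction of the one-sided spectral energy held by the largest bin."""
    energies = [m * m for m in dft_magnitudes(x)]
    return max(energies) / sum(energies)
```

With a 64-sample window, `peak_energy_fraction(sinusoid(8, 64))` is essentially 1, because the signal really is periodic in the window length and all of its energy lands in bin 8. For `sinusoid(8.5, 64)` the assumption fails and the energy spreads over many bins, so the largest bin holds well under half of the total. A signal that is not periodic in the window, as noise is not, has no single bin that describes it.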

Michael Casey
Fri Mar 22 15:49:22 EST 1996