Measurements were made using a Macintosh Quadra computer equipped with an Audiomedia II DSP card, which has 16-bit stereo A/D and D/A converters that operate at a 44.1 kHz sampling rate. One of the audio output channels was sent to an amplifier that drove a Realistic Optimus Pro 7 loudspeaker, a small two-way loudspeaker with a 4-inch woofer and a 1-inch tweeter. The KEMAR was equipped with Etymotic ER-11 microphones and Etymotic ER-11 preamplifiers. The outputs of the microphone preamplifiers were connected to the stereo inputs of the Audiomedia card.
From the standpoint of the Audiomedia card, a signal sent to the audio outputs results in a corresponding signal appearing at the audio inputs. Measuring the impulse response of this system yields the impulse response of the combined system consisting of the Audiomedia D/A and A/D converters and anti-alias filters, the amplifier, the speaker, the room in which the measurements are made, and most importantly, the response of the KEMAR with its associated microphones and preamps. We can avoid interference due to room reflections by ensuring that any reflections occur well after the head response time, which is several milliseconds. We can compensate for a non-uniform speaker response by measuring the speaker response separately and creating an inverse filter. The inverse filter, when applied to an HRTF measurement, equalizes the speaker response to be flat.
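The idea of equalizing the speaker with an inverse filter can be illustrated with a small sketch. This is not necessarily how the inverse filter for these measurements was designed; it assumes a regularized frequency-domain inversion, a made-up three-tap speaker response, and hypothetical names (`regularized_inverse`, `eps`, `N_FFT`). A naive DFT keeps the example dependency-free.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) DFT; adequate for the short illustrative lengths used here."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def regularized_inverse(speaker_ir, n_fft, eps=1e-6):
    """Return an n_fft-point inverse filter for speaker_ir.

    eps is a hypothetical regularization constant that limits the gain
    at frequencies where the speaker response is very small.
    """
    padded = list(speaker_ir) + [0.0] * (n_fft - len(speaker_ir))
    spectrum = dft(padded)
    inv_spectrum = [s.conjugate() / (abs(s) ** 2 + eps) for s in spectrum]
    return [v.real for v in dft(inv_spectrum, inverse=True)]


def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]


# Made-up speaker impulse response, for illustration only.
speaker_ir = [1.0, 0.5, 0.2]
N_FFT = 64

inverse_ir = regularized_inverse(speaker_ir, N_FFT)
padded_speaker = speaker_ir + [0.0] * (N_FFT - len(speaker_ir))

# Speaker followed by its inverse filter should be close to a unit impulse,
# i.e. a flat frequency response.
equalized = circular_convolve(padded_speaker, inverse_ir)
```

In this sketch the same `inverse_ir` would be convolved with each measured response to remove the speaker's contribution. The regularization trades a small equalization error for bounded gain.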
The impulse responses were obtained using ML sequences. The sequence length was N = 16383 samples, corresponding to a 14-bit generating register. Two copies of the sequence were concatenated to form a 2*N-sample sound, which was played from the Audiomedia card. Simultaneously, 2*N samples were recorded on both the left and right input channels (we wrote software for the Audiomedia to simultaneously play and record stereo sounds). For each input channel, the following technique was used to recover the impulse response. The first N samples of the result were discarded, and the remaining N samples were duplicated to form a 2*N-sample sequence. This was cross-correlated with the original N-sample ML sequence using FFT-based block convolution, forming a 3*N - 1 sample result. The N-sample impulse response was extracted starting at N - 1 samples into this result.
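The recovery procedure above can be sketched at a small scale. The following is a minimal illustration, not the measurement software: it assumes a 7-bit register (N = 127) with feedback taps (7, 6) rather than the 14-bit register used for the measurements, replaces the playback and recording chain with a made-up impulse response `true_h`, uses direct cross-correlation instead of FFT-based block convolution, and scales the result by 1/(N + 1), a normalization the text does not specify. All function names are hypothetical.

```python
def mls(nbits, taps):
    """Return one period (2**nbits - 1 samples) of a +/-1 ML sequence."""
    state = [1] * nbits  # any non-zero starting state works
    out = []
    for _ in range(2 ** nbits - 1):
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        out.append(1.0 if state[-1] else -1.0)
        state = [feedback] + state[:-1]
    return out


def simulate_system(x, h):
    """Stand-in for the play/record chain: linear convolution, truncated to len(x)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for m, hm in enumerate(h):
            if n - m >= 0:
                acc += hm * x[n - m]
        y.append(acc)
    return y


def cross_correlate_full(d, s):
    """Full linear cross-correlation of d with s; zero lag is at index len(s) - 1."""
    out = []
    for j in range(len(d) + len(s) - 1):
        shift = j - (len(s) - 1)
        acc = 0.0
        for n in range(len(s)):
            idx = n + shift
            if 0 <= idx < len(d):
                acc += s[n] * d[idx]
        out.append(acc)
    return out


def recover_impulse_response(recorded, s):
    """Apply the steps from the text to a 2*N-sample recording."""
    n = len(s)
    tail = recorded[n:2 * n]               # discard the first N samples
    doubled = tail + tail                  # duplicate to 2*N samples
    corr = cross_correlate_full(doubled, s)  # 3*N - 1 samples
    # Extract N samples starting at N - 1; the 1/(N + 1) scaling is an assumption.
    return [c / (n + 1) for c in corr[n - 1:2 * n - 1]]


seq = mls(7, (7, 6))
N = len(seq)
true_h = [0.0, 0.0, 1.0, 0.5, -0.25, 0.125]  # made-up system response
recorded = simulate_system(seq + seq, true_h)
estimate = recover_impulse_response(recorded, seq)
```

With this scaling, `estimate` equals the true response minus a constant offset of sum(true_h)/(N + 1), which comes from the -1 off-peak value of the ML sequence autocorrelation and shrinks as N grows. At N = 16383 the direct correlation loop would be slow, which is presumably why FFT-based block convolution was used.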
Noise in the ML sequence impulse responses can be attributed to measurement noise, non-linearities in the system, and time aliasing. Measurement noise can be averaged out by using longer ML sequences. This is completely analogous to averaging shorter measurements. For instance, averaging two independent N point impulse response measurements should achieve a 3 dB signal-to-noise ratio (SNR) improvement over either of the measurements considered alone. Similarly, using a 2*N(+1) point ML sequence should achieve a 3 dB SNR improvement over using an N point ML sequence. However, noise caused by non-linearities in the system will not be reduced by repeated averaging over ML sequence measurements, because the noise is correlated between measurements. It is necessary either to use longer ML sequences or to average the responses resulting from different ML sequences (i.e. from different masks) to reduce noise caused by non-linearities (see ). Time aliasing can be eliminated by using ML sequences which are longer than the reverberation time of the measurement space. Since the measurements were done in an anechoic chamber and the ML sequences were sufficiently long, time aliasing was not a problem. We chose 16383 point measurements to give good signal-to-noise ratios without excessive storage requirements or computation time. The measured SNR was 65 dB, as discussed later.
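The 3 dB figure for averaging two independent measurements follows from the noise power dropping by a factor of M when M uncorrelated measurements are averaged, a gain of 10 log10(M) dB. The sketch below checks this numerically. It assumes unit-variance Gaussian noise and uses hypothetical function names and parameters; it models only uncorrelated measurement noise, not the non-linearity noise described above, which averaging the same sequence does not reduce.

```python
import math
import random


def averaging_gain_db(m):
    """Theoretical SNR gain from averaging m measurements with uncorrelated noise."""
    return 10.0 * math.log10(m)


def simulated_gain_db(m, n_points=20000, seed=0):
    """Estimate the same gain from simulated unit-variance Gaussian noise."""
    rng = random.Random(seed)
    single = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
    averaged = [
        sum(rng.gauss(0.0, 1.0) for _ in range(m)) / m
        for _ in range(n_points)
    ]
    power_single = sum(v * v for v in single) / n_points
    power_averaged = sum(v * v for v in averaged) / n_points
    return 10.0 * math.log10(power_single / power_averaged)


theory_two = averaging_gain_db(2)      # about 3.01 dB
simulated_two = simulated_gain_db(2)   # close to theory, up to sampling error
```

The same relation applies to sequence length: doubling the ML sequence length doubles the number of samples contributing to each point of the cross-correlation, which is why the text treats the two cases as equivalent.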