As described earlier, each HRTF measurement yielded a 16383 point impulse response at a 44.1 kHz sampling rate. Most of this data is irrelevant. The 1.4 meter air travel corresponds to approximately 180 samples, and there is an additional delay of 50 samples inherent in the playback/recording system. Consequently, in each impulse response, there is a delay of approximately 230 samples before the head response occurs. The head response persists for several hundred samples (subject to interpretation) and is followed by various reflections off objects in the anechoic chamber (such as the KEMAR turntable). In order to reduce the size of the data set without eliminating anything of potential interest, we decided to discard the first 200 samples of each impulse response and save the next 512 samples. Each HRTF response is thus 512 samples long. Most researchers will no doubt truncate this data further.
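The delay figures and the truncation above can be checked with a short calculation. The following is a minimal sketch, not the code used to prepare the data: the speed of sound (343 m/s) is an assumption that the text does not state, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 44100       # sampling rate stated in the text
SPEED_OF_SOUND_M_S = 343.0   # assumption: air near 20 C; not given in the text


def travel_delay_samples(distance_m, rate_hz=SAMPLE_RATE_HZ, c=SPEED_OF_SOUND_M_S):
    """Return the acoustic travel time over distance_m, expressed in samples."""
    return distance_m / c * rate_hz


def trim_response(full_response, skip=200, keep=512):
    """Discard the first `skip` samples and keep the next `keep` samples."""
    return full_response[skip:skip + keep]


air_delay = travel_delay_samples(1.4)   # roughly 180 samples
total_delay = air_delay + 50            # plus the system delay: roughly 230 samples
```

Because only 200 samples are discarded while the head response begins at roughly sample 230, each trimmed response should begin with about 30 samples of lead-in before the head response.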
The impulse responses are stored as 16-bit signed integers, with the most significant byte stored in the low address (i.e., big-endian, Motorola 68000 format). The dynamic range of the 16-bit integers (96 dB) exceeds the signal-to-noise ratio of the measurements, which we conservatively measured to be 65 dB. To obtain this figure, we took the 16383 point measurement for the left ear at 0 degrees elevation and 0 degrees azimuth, and compared the energy in 100 samples centered on the head response to the energy in the first 100 samples of the response, which should ideally be zero. The ratio of the two energies gave the 65 dB SNR.
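On a little-endian machine the byte order must be handled explicitly when reading the files. The sketch below shows one way to decode the samples and to form an energy ratio in dB of the kind described above. The function names are hypothetical, and the window positions used for the 65 dB figure are not reproduced here.

```python
import math
import struct


def decode_hrtf(raw):
    """Decode big-endian (most significant byte first) signed 16-bit samples."""
    count = len(raw) // 2
    return list(struct.unpack(">%dh" % count, raw[:2 * count]))


def read_hrtf(path):
    """Read one response file and return its samples as a list of ints."""
    with open(path, "rb") as handle:
        return decode_hrtf(handle.read())


def energy_ratio_db(signal, noise):
    """Return 10*log10 of the energy ratio between two sample windows.

    The noise window must contain at least one nonzero sample.
    """
    signal_energy = sum(s * s for s in signal)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10(signal_energy / noise_energy)


# Dynamic range of a 16-bit word: 20*log10(2**16), about 96 dB.
dynamic_range_db = 20.0 * math.log10(2 ** 16)
```

A 512 point file should decode to 512 values from 1024 bytes; a different count suggests a truncated or mis-transferred file.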
The HRTF data is stored in directories by elevation. Each directory name has the format ``elevEE'', where EE is the elevation angle. Within each directory each filename has the format ``XEEeAAAa.dat'' where X is either ``L'' or ``R'' for left and right ear response, respectively, EE is the elevation angle of the source in degrees, from -40 to 90, and AAA is the azimuth of the source in degrees, from 0 to 355. Elevation and azimuth angles indicate the location of the source relative to the KEMAR, such that elevation 0 azimuth 0 is directly in front of the KEMAR, elevation 90 is directly above the KEMAR, elevation 0 azimuth 90 is directly to the right of the KEMAR, etc. For example, the file ``R-20e270a.dat'' is the right ear response, with the source 20 degrees below the horizontal plane and 90 degrees to the left of the head. Note that three digits are always given for azimuth so that the files appear in sorted order in each directory.
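The naming convention can be expressed as a small helper. This is a sketch under two assumptions: the function name is hypothetical, and ``/'' is used as the directory separator. It only formats a name; it does not check that a measurement exists at the requested azimuth.

```python
def hrtf_path(ear, elevation, azimuth):
    """Build 'elevEE/XEEeAAAa.dat' for the given ear, elevation, and azimuth."""
    if ear not in ("L", "R"):
        raise ValueError("ear must be 'L' or 'R'")
    if not -40 <= elevation <= 90:
        raise ValueError("elevation must be between -40 and 90 degrees")
    if not 0 <= azimuth <= 355:
        raise ValueError("azimuth must be between 0 and 355 degrees")
    # The azimuth is zero-padded to three digits; the elevation is not padded.
    return "elev%d/%s%de%03da.dat" % (elevation, ear, elevation, azimuth)


example = hrtf_path("R", -20, 270)   # the example file named in the text
```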
To select a pair of HRTF responses, we recommend using symmetrical responses obtained from one of the KEMAR ears. For instance, to obtain the HRTF responses for a source 45 degrees to the right of the head at 0 degrees elevation, use ``L0e045a.dat'' for the left ear and ``L0e315a.dat'' for the right ear, or use ``R0e315a.dat'' for the left ear and ``R0e045a.dat'' for the right ear. Note that this approach eliminates binaural localization cues in the median plane.
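The selection rule amounts to mirroring the azimuth about the median plane. The following sketch uses a hypothetical function name and assumes that a measurement exists at the mirrored azimuth for the chosen elevation.

```python
def symmetric_pair(elevation, azimuth, ear="L"):
    """Return (left_file, right_file) built from one KEMAR ear's responses."""
    mirrored = (360 - azimuth) % 360   # same source reflected across the median plane
    if ear == "L":
        left_az, right_az = azimuth, mirrored
    elif ear == "R":
        left_az, right_az = mirrored, azimuth
    else:
        raise ValueError("ear must be 'L' or 'R'")
    name = "%s%de%03da.dat"
    return (name % (ear, elevation, left_az), name % (ear, elevation, right_az))


pair_from_left_ear = symmetric_pair(0, 45, "L")
pair_from_right_ear = symmetric_pair(0, 45, "R")
```

For a source at azimuth 0 or 180 the two file names coincide, so both ears receive the same response. This is the loss of binaural cues in the median plane noted above.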
The largest-magnitude sample value in the left ear HRTF data is -26793, in the file ``L40e289a.dat''. In the right ear HRTF data the largest-magnitude value is 29877, in the file ``R40e039a.dat''.
The speaker impulse response and headphone impulse responses are stored in the directory ``headphones+spkr''. An inverse filter for the Optimus Pro 7 speaker is included. The inverse filter was designed by zero-padding the measured impulse response and taking the DFT of the zero-padded sequence. The resulting complex spectrum was inverted by negating the phase and inverting the magnitude. This was done over the range from DC to 18 kHz; beyond 18 kHz the inverse spectrum was made flat by repeating the 18 kHz magnitude value. The inverse filter was obtained by computing the inverse DFT of this spectrum. A minimum phase version of this inverse filter was also computed using the real cepstrum (see ). The files in the ``headphones+spkr'' directory are listed in Table 3.
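The inversion procedure can be illustrated as follows. This is a minimal sketch and not the code used to produce the included filter. It uses a naive DFT to stay self-contained, where an FFT routine would be used in practice. The function names and the DFT length are hypothetical. The phase used beyond 18 kHz is an assumption (the negated measured phase is kept), since the text specifies only the magnitude there. The minimum phase version is not covered.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT (or inverse DFT); adequate for a short illustration."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def design_inverse_filter(impulse, n_fft, rate_hz=44100.0, cutoff_hz=18000.0):
    """Invert a measured response up to cutoff_hz and hold the magnitude above it.

    Assumes the measured spectrum has no exact zeros at or below the cutoff.
    """
    padded = list(impulse) + [0.0] * (n_fft - len(impulse))   # zero-pad
    spectrum = dft(padded)
    k_cut = int(cutoff_hz * n_fft / rate_hz)   # last bin at or below the cutoff
    half = n_fft // 2
    inverse = [0j] * n_fft
    for k in range(half + 1):
        # Invert the magnitude; above the cutoff repeat the value at the cutoff bin.
        magnitude = 1.0 / abs(spectrum[min(k, k_cut)])
        # Negate the phase (assumption: also done above the cutoff).
        phase = -cmath.phase(spectrum[k])
        inverse[k] = cmath.rect(magnitude, phase)
    # Fill the negative-frequency bins by conjugate symmetry so the result is real.
    for k in range(1, (n_fft + 1) // 2):
        inverse[n_fft - k] = inverse[k].conjugate()
    return [value.real for value in dft(inverse, inverse=True)]
```

Below the cutoff, the product of the measured spectrum and the inverse spectrum is unity, so the cascade of the speaker and the inverse filter is flat over that range. The inverse is circular at the chosen DFT length, so the amount of zero-padding determines how much time-domain aliasing the inverse filter contains.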
Table 3: Contents of ``headphones+spkr'' directory
The 512 point impulse responses and speaker and headphone data may be found in the tar archive ``full.tar.Z''.