Next: Implementation of transaural filter Up: Visually Steered 3-D Audio Previous: Performance of binaural spatializer

Principles of transaural audio

Transaural audio is a method used to deliver binaural signals to the ears of a listener using stereo loudspeakers. The basic idea is to filter the binaural signal such that the subsequent stereo presentation produces the binaural signal at the ears of the listener. The technique was first put into practice by Schroeder and Atal [22, 21] and later refined by Cooper and Bauck [10], who referred to it as ``transaural audio''. The stereo listening situation is shown in figure 12, where $\hat{x}_L$ and $\hat{x}_R$ are the signals sent to the speakers, and $y_L$ and $y_R$ are the signals at the listener's ears. The system can be fully described by the vector equation:

\[ \mathbf{y} = \mathbf{H} \hat{\mathbf{x}} \]

where:

\[ \mathbf{y} = \begin{bmatrix} y_L \\ y_R \end{bmatrix}, \quad \mathbf{H} = \begin{bmatrix} H_{LL} & H_{RL} \\ H_{LR} & H_{RR} \end{bmatrix}, \quad \hat{\mathbf{x}} = \begin{bmatrix} \hat{x}_L \\ \hat{x}_R \end{bmatrix} \]

and $H_{XY}$ is the transfer function from speaker X to ear Y. The frequency variable has been omitted.

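If $\mathbf{x}$ is the binaural signal we wish to deliver to the ears, then we must invert the system transfer matrix $\mathbf{H}$ such that $\hat{\mathbf{x}} = \mathbf{H}^{-1} \mathbf{x}$. The inverse matrix is:

\[ \mathbf{H}^{-1} = \frac{1}{H_{LL} H_{RR} - H_{LR} H_{RL}} \begin{bmatrix} H_{RR} & -H_{RL} \\ -H_{LR} & H_{LL} \end{bmatrix} \]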
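The inversion can be checked numerically at a single frequency. The following is a minimal sketch, not the authors' implementation: the complex transfer-function values and the helper names `invert_2x2` and `apply_2x2` are hypothetical, chosen only for illustration. It computes speaker signals from a desired binaural signal and confirms that the ear signals reproduce it.

```python
# Sketch: 2x2 crosstalk cancellation at one frequency (all values hypothetical).

def invert_2x2(h_ll, h_rl, h_lr, h_rr):
    """Closed-form inverse of H = [[h_ll, h_rl], [h_lr, h_rr]]."""
    det = h_ll * h_rr - h_lr * h_rl
    if abs(det) < 1e-12:
        raise ZeroDivisionError("H is (nearly) singular at this frequency")
    return ((h_rr / det, -h_rl / det),
            (-h_lr / det, h_ll / det))

def apply_2x2(m, v):
    """Multiply a 2x2 matrix m by a length-2 vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

# Hypothetical speaker-to-ear transfer functions (row = ear, column = speaker).
H = ((1.0 + 0.0j, 0.5 - 0.3j),
     (0.4 - 0.2j, 0.9 + 0.1j))
H_inv = invert_2x2(H[0][0], H[0][1], H[1][0], H[1][1])

binaural = (0.7 + 0.2j, -0.3 + 0.5j)   # signal we want at the ears
speakers = apply_2x2(H_inv, binaural)  # signals sent to the speakers
ears = apply_2x2(H, speakers)          # signals arriving at the ears
```

In a real filter this inversion would be carried out at every frequency, and the singular case guarded by the determinant check would need separate treatment.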

This leads to the general transaural filter shown in figure 13. This is often called a crosstalk cancellation filter, because it eliminates the crosstalk between channels. When the listening situation is symmetric, the inverse filter can be specified in terms of the ipsilateral ($H_i = H_{LL} = H_{RR}$) and contralateral ($H_c = H_{LR} = H_{RL}$) responses:

\[ \mathbf{H}^{-1} = \frac{1}{H_i^2 - H_c^2} \begin{bmatrix} H_i & -H_c \\ -H_c & H_i \end{bmatrix} \]
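As an illustration of the symmetric case, the sketch below builds the inverse from an ipsilateral and a contralateral response and checks that it really inverts the symmetric system matrix. The head model is an assumption made only for this example: the ipsilateral response is taken as unity and the contralateral response as an attenuated, delayed copy, with hypothetical gain and delay values.

```python
import cmath

def contralateral_response(freq_hz, gain=0.7, delay_s=0.0002):
    """Hypothetical model: attenuation plus pure delay relative to the near ear."""
    return gain * cmath.exp(-2j * cmath.pi * freq_hz * delay_s)

def symmetric_inverse(h_i, h_c):
    """Inverse of [[h_i, h_c], [h_c, h_i]] in closed form."""
    det = h_i * h_i - h_c * h_c
    return ((h_i / det, -h_c / det),
            (-h_c / det, h_i / det))

def matmul_2x2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

identity = ((1.0, 0.0), (0.0, 1.0))
max_error = 0.0
for freq_hz in (100.0, 1000.0, 4000.0):
    h_i = 1.0 + 0.0j                       # assumed unity ipsilateral response
    h_c = contralateral_response(freq_hz)
    h = ((h_i, h_c), (h_c, h_i))
    product = matmul_2x2(h, symmetric_inverse(h_i, h_c))
    for r in range(2):
        for c in range(2):
            max_error = max(max_error, abs(product[r][c] - identity[r][c]))
```

With a contralateral gain below one, the determinant stays away from zero at every frequency, so the inverse is well defined under this assumed model.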

Cooper and Bauck proposed using a ``shuffler'' implementation of the transaural filter [10], which involves forming the sum and difference of $x_L$ and $x_R$, filtering these signals, and then undoing the sum and difference operation. The sum and difference operation is accomplished by the unitary matrix $\mathbf{D}$ below, called a shuffler matrix or MS matrix:

\[ \mathbf{D} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \]

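It is easy to show that the shuffler matrix $\mathbf{D}$ diagonalizes the matrix $\mathbf{H}^{-1}$ via a similarity transformation:

\[ \mathbf{D}^{-1} \mathbf{H}^{-1} \mathbf{D} = \begin{bmatrix} \dfrac{1}{H_i + H_c} & 0 \\ 0 & \dfrac{1}{H_i - H_c} \end{bmatrix} \]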

Thus, in shuffler form, the transaural filters are the inverses of the sum and the difference of $H_i$ and $H_c$. Note that $\mathbf{D}$ is its own inverse. This leads to the transaural filter shown in figure 14. The two normalizing gains of $1/\sqrt{2}$ can be commuted to a single gain of 1/2 for each channel, or can be ignored.
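The equivalence between the shuffler form and the direct symmetric inverse can be sketched at a single frequency as follows. The function names and the numeric values of the ipsilateral and contralateral responses are hypothetical, and the code illustrates the algebra only, not the filter implementation described in the paper.

```python
import math

def shuffle(v):
    """Normalized sum and difference of a two-channel value (its own inverse)."""
    s = 1.0 / math.sqrt(2.0)
    return (s * (v[0] + v[1]), s * (v[0] - v[1]))

def transaural_shuffler(x, h_i, h_c):
    """Shuffle, filter sum by 1/(h_i+h_c) and difference by 1/(h_i-h_c), unshuffle."""
    total, diff = shuffle(x)
    total /= (h_i + h_c)
    diff /= (h_i - h_c)
    return shuffle((total, diff))

def transaural_direct(x, h_i, h_c):
    """Apply the symmetric 2x2 inverse filter directly."""
    det = h_i * h_i - h_c * h_c
    return ((h_i * x[0] - h_c * x[1]) / det,
            (-h_c * x[0] + h_i * x[1]) / det)

# Hypothetical responses and binaural input at one frequency.
h_i = 1.0 + 0.0j
h_c = 0.5 - 0.3j
x = (0.7 + 0.2j, -0.3 + 0.5j)

out_shuffler = transaural_shuffler(x, h_i, h_c)
out_direct = transaural_direct(x, h_i, h_c)
```

The shuffler form needs only two filters instead of four, which is the practical appeal of the diagonalization.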

In practice, the transaural filters are often based on a simplified head model. Here we list a few possible models in order of increasing complexity:

At high frequencies, where pinna response becomes important (> 8 kHz), the head effectively blocks the crosstalk between channels. Furthermore, the variation in head response for different people is greatest at high frequencies [19]. Consequently, there is little point in modeling pinna response when constructing a transaural filter.



Michael Casey
Mon Mar 4 18:47:28 EST 1996