Based on psychophysical findings in the auditory literature, we propose a set of conditions that a representation of sound must meet before it can be considered perceptually valid. We consider perceptual validity to be more than a modeling philosophy: there are very good reasons why the ear/brain system represents sound the way it does, and any attempt to model results from the perceptual literature benefits from extracting these component features. Desirable attributes of a sound representation include the following:
a) The representation should be based on log-frequency in order to reflect the relative salience of low-frequency components with respect to high-frequency components.
b) There must be temporal information that gives precedence to the transient portion of the signal, that is, the first 100 milliseconds, while the representation also encodes the steady-state portion of the signal.
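As an illustration of attributes (a) and (b), the following is a minimal sketch under stated assumptions, not the method used in this work. The function names, the geometric bin spacing, and the fixed `boost` factor are all hypothetical choices made for the example; the text above specifies only that the frequency axis be logarithmic and that the first 100 milliseconds take precedence, not how either is implemented.

```python
import math


def log_frequency_centers(f_min, f_max, bins_per_octave):
    """Return center frequencies spaced evenly in log2(frequency).

    Hypothetical helper: each octave gets the same number of bins, so low
    frequencies are resolved more finely in Hz than high frequencies.
    """
    n_octaves = math.log2(f_max / f_min)
    # Small epsilon guards against floating-point error at an exact bin edge.
    n_bins = int(math.floor(n_octaves * bins_per_octave + 1e-9)) + 1
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def transient_weights(n_frames, hop_s, transient_s=0.1, boost=2.0):
    """Return one weight per analysis frame.

    Frames that start inside the transient window (assumed here to be the
    first `transient_s` seconds) get weight `boost`; steady-state frames
    keep weight 1.0, so they are still encoded rather than discarded.
    The value of `boost` is an arbitrary assumption.
    """
    return [boost if i * hop_s < transient_s else 1.0 for i in range(n_frames)]


if __name__ == "__main__":
    centers = log_frequency_centers(55.0, 880.0, bins_per_octave=1)
    weights = transient_weights(n_frames=6, hop_s=0.04)
    print(centers)
    print(weights)
```

With one bin per octave, adjacent center frequencies differ by a constant ratio of 2 rather than a constant number of hertz, which is the sense in which the axis is logarithmic. The weighting step is kept separate from the frequency axis so that the transient window and the boost can be varied independently.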