Internal states and external representation for input-output synthesis
Temporal models to handle the sequence of states over time.
Clusters to handle the complexity of the state.
Output representations to handle the sampled (observed) audio: a perceptual model and a synthesis model.
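
One way to read the three components together is sketched below. This is only an illustration under stated assumptions, not a method the notes prescribe: clustering is assumed to be a nearest-centroid assignment of feature frames, the temporal model is assumed to be a first-order transition table between cluster indices, and the output representation is assumed to be the centroid itself. The function names, the two-dimensional frames, and the centroid values are all hypothetical.

```python
from collections import Counter, defaultdict


def nearest_cluster(frame, centroids):
    """Clusters: map a feature frame to the index of its closest centroid
    (squared Euclidean distance). Assumed stand-in for the state clustering."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((x - c) ** 2 for x, c in zip(frame, centroids[k])),
    )


def transition_probabilities(states):
    """Temporal model: estimate first-order transition probabilities
    from an observed sequence of state indices."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(states, states[1:]):
        counts[prev][nxt] += 1
    return {
        s: {t: n / sum(c.values()) for t, n in c.items()}
        for s, c in counts.items()
    }


def most_likely_path(start, probs, length):
    """Generate a state sequence by always following the most probable transition."""
    path = [start]
    while len(path) < length and path[-1] in probs:
        successors = probs[path[-1]]
        path.append(max(successors, key=successors.get))
    return path


# Hypothetical data: two centroids and four observed feature frames.
centroids = [(0.0, 0.0), (1.0, 1.0)]
frames = [(0.1, 0.0), (0.9, 1.1), (0.2, -0.1), (1.0, 0.9)]

states = [nearest_cluster(f, centroids) for f in frames]  # internal states
probs = transition_probabilities(states)                  # model in time
path = most_likely_path(states[0], probs, len(frames))    # synthesized state sequence
output = [centroids[s] for s in path]                     # output representation per state
```

In this reading, the internal state is a small discrete index, while the external representation is whatever each index is mapped to on output; a real system would replace the centroid lookup with a perceptual analysis stage on input and a synthesis stage on output.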