Prosodic Font
The Space Between the Spoken
and the Written
Sunset.wav
Wow.wav
These are composited images created with different frames from two
separate Prosodic Font animated utterances. Click on these images to see
the Prosodic Font animation. You can also hear the original sound files
used to make these Prosodic Font animations.
A Prosodic Font gains form, shape and substance
through a particular someone, somewhere, talking about something. The alphabetic
forms, glyphs, are given shape and motion through a coded interpretation
of the speech signal. Mappings between voice and glyph design might go
something like this: Vocal timbre controls the glyph rendering choices.
The speech dynamic controls the dynamic glyph size. The relative pitch
range controls the anti-aliasing, curvature, weighting and squash-n-stretch
of the glyph figures. The rate of speech causes the words to appear for
longer or shorter durations. The explosiveness of a phonetic "t" causes
the letter to shake violently...
Prosody is emotionally specific. Even across telephone lines, listeners can tell (with about 60% accuracy) when someone is angry and when they are sad. By adulthood, our ability to read finer gradations of emotion - and even mixtures of emotions - from voice stimuli alone is well developed; so is our unconscious, skillful manipulation of endless variations of spoken tunes and rhythms.
Prosody is also informationally specific. The syntactic "point" of a sentence is often distinguished from the rest of the sentence through prosodic means. However, intonational contours and rhythms do not lend themselves to causal interpretations. There is no "angry" intonational contour or rhythm. Rather, vocal characteristics act as a gestalt - within a contextualized communication situation - to communicate fine degrees of speaker state, meaning and intention.
Modelling typography after speech requires introducing
a temporal design element. I have been privileged to inherit the design
legacy of the Media Lab, where many people have designed temporal typographic
forms. Yin Yin Wong's and David Small's work in Temporal Typography, Suguru
Ishizaki's work in Kinetic Typography, Professor John Maeda's course in
Digital Typography, and Peter Cho's typographical work have all strongly
influenced my design.
There are many examples of continuously parameterized fonts. Don Knuth's
METAFONT project (left image) used in excess of seventy parameters to create
the differences in typographical style shown. Adobe's Multiple Master font
(right image) likewise can change font parameters continuously to create
glyphs that look as if they belong to different font families entirely.
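As a rough illustration of what "continuously parameterized" can mean, the sketch below blends two master outlines point by point, which is the general idea behind interpolating between master designs. The function name `interpolate_outline` and the stem coordinates are hypothetical, and this is not Adobe's or Knuth's actual implementation; METAFONT in particular works differently, driving glyph-drawing programs from its parameters rather than blending finished outlines.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_outline(master_a: List[Point], master_b: List[Point], t: float) -> List[Point]:
    """Blend two compatible outlines point by point.

    t = 0 returns master_a, t = 1 returns master_b, values in between
    give intermediate designs.
    """
    if len(master_a) != len(master_b):
        raise ValueError("masters must have the same number of control points")
    return [
        (ax + t * (bx - ax), ay + t * (by - ay))
        for (ax, ay), (bx, by) in zip(master_a, master_b)
    ]


# Hypothetical control points for a vertical stem in a light and a bold master.
light_stem = [(0.0, 0.0), (10.0, 0.0), (10.0, 100.0), (0.0, 100.0)]
bold_stem = [(0.0, 0.0), (30.0, 0.0), (30.0, 100.0), (0.0, 100.0)]

# A design halfway along the (assumed) weight axis.
medium_stem = interpolate_outline(light_stem, bold_stem, 0.5)
```

Because `t` is a continuous value, it could in principle be driven by any signal that changes over time, including one derived from speech.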
Prosodic Font uses an object-oriented approach to create fonts that can move easily. Each primitive (above) can be placed within a typographer's grid given two lines of constraint. Primitives are composited with other primitives to form every letter of the alphabet. In this way, each primitive can maintain its own shape integrity as the font transforms over time.
These two Prosodic Font 'g' examples differ only in weight. Each is
composited from a circle and a straight line with a left-facing curved
tail connected to it.
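A minimal sketch of this compositing idea, assuming a simplified model in which each primitive is defined only by a shape name, two named constraint lines on the grid, and a stroke weight. The class names, the grid values, and the shape labels are all hypothetical and are not taken from the Prosodic Font implementation.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple

# Hypothetical typographer's grid: named horizontal lines and their heights.
GRID: Dict[str, float] = {
    "ascender": 1.0,
    "x_height": 0.5,
    "baseline": 0.0,
    "descender": -0.4,
}


@dataclass(frozen=True)
class Primitive:
    shape: str           # e.g. "circle" or "stem_with_left_tail" (assumed labels)
    top: str             # name of the upper constraint line
    bottom: str          # name of the lower constraint line
    stroke: float = 1.0  # relative stroke weight

    def extent(self, grid: Dict[str, float]) -> Tuple[float, float]:
        """Vertical span (bottom, top) of this primitive on the grid."""
        return grid[self.bottom], grid[self.top]

    def with_weight(self, factor: float) -> "Primitive":
        """Return a copy with a scaled stroke; the shape itself is untouched."""
        return replace(self, stroke=self.stroke * factor)


@dataclass(frozen=True)
class Glyph:
    char: str
    parts: Tuple[Primitive, ...]

    def with_weight(self, factor: float) -> "Glyph":
        """Change weight by delegating to each primitive."""
        return Glyph(self.char, tuple(p.with_weight(factor) for p in self.parts))


# A 'g' composited from a circle and a stem with a left-facing curved tail.
g_regular = Glyph("g", (
    Primitive("circle", top="x_height", bottom="baseline"),
    Primitive("stem_with_left_tail", top="x_height", bottom="descender"),
))
g_heavy = g_regular.with_weight(2.0)
```

Here `g_regular` and `g_heavy` share the same primitives and constraint lines and differ only in stroke weight, which mirrors the relationship between the two 'g' examples.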
| Prosody | Font |
| Amplitude | Scalar size (per syllable) |
| Average Amplitude | Background graphic rectangle to show "normal" voice size |
| Abstraction of fundamental frequency curve (TILT system) | Width, Height (inversely related), Weight (inversely related), vertical translation (not yet implemented) |
| Syllable duration | Duration of syllable activity |
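The table can be read as a per-syllable function from prosodic measurements to glyph parameters. The sketch below is one possible reading, not the actual Prosodic Font code. It assumes that the pitch abstraction is reduced to a single tilt value in [-1, 1] (fall to rise), that "inversely related" means height and weight shrink as width grows, and that a `strength` constant controls how far the glyph deforms. All names (`Syllable`, `GlyphState`, `map_syllable`, `strength`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    text: str
    amplitude: float  # mean amplitude of the syllable (linear units)
    tilt: float       # assumed single-number pitch abstraction in [-1, 1]
    duration: float   # seconds


@dataclass
class GlyphState:
    text: str
    scale: float      # 1.0 equals the "normal" size shown by the background rectangle
    width: float
    height: float
    weight: float
    duration: float   # seconds the syllable stays active


def map_syllable(s: Syllable, average_amplitude: float, strength: float = 0.5) -> GlyphState:
    """Map one syllable's prosody onto glyph parameters (assumed linear mapping)."""
    if average_amplitude <= 0:
        raise ValueError("average_amplitude must be positive")
    scale = s.amplitude / average_amplitude   # amplitude -> scalar size
    tilt = max(-1.0, min(1.0, s.tilt))        # clamp to the assumed range
    width = 1.0 + strength * tilt             # pitch abstraction -> width
    height = 1.0 - strength * tilt            # assumed inverse of width
    weight = 1.0 - strength * tilt            # assumed inverse of width
    return GlyphState(s.text, scale, width, height, weight, s.duration)


# A syllable spoken at twice the speaker's average amplitude with a rising pitch.
state = map_syllable(Syllable("wow", amplitude=0.5, tilt=0.5, duration=0.3),
                     average_amplitude=0.25)
```

Nothing in this mapping names an emotion; it only turns measurements into shape. Vertical translation is left out because the table marks it as not yet implemented.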
Neither I (the font algorithm designer) nor the speaker (the prosodic font designer) needs to label speech as exhibiting particular emotions. Rather, the speech signals are systematically "mapped" onto visual characteristics. It is the recipient's (the hearer/reader's) job to interpret the speaker's expression - just as it is in conversational settings. If prosody is a system, then we might understand nuances of expression and emotion in a systematically animated font. This method has more potential to work as a tool of communication than one that required sorting the entire range of human vocal expressiveness into specific discrete categories. Additionally, the job of creating algorithmic spatio-temporal mappings remains a creative act of design.
To read more about this idea and project, you can download my eighty-page thesis in Adobe Acrobat format or a two-page Computer Human Interaction paper, also in Acrobat format: