Analysis and Synthesis of PalestrinaStyle Counterpoint Using Markov Chains
Excerpts from this page were published as:
Summary:A novel approach to computergenerated Palestrinastyle counterpoint using probabilistic Markov chains is presented. It is shown how Markov chains adequately capture the rules of species counterpoint and how they can be used to synthesize species counterpoint given a cantus firmus. It is also shown how such rules can be inferred from given counterpoint examples. 

Editio Medicaea. This revised edition of Gregorian chant was published in Rome in 1614. Contributors to this important revision of the liturgical chant included Palestrina. 

IntroductionSixteenthcentury counterpoint, with which the name of Palestrina has almost become synonymous, has long been held by musicians and theoreticians as an exceptionally elegant form of composition which has both musical and pedagogical value. Modeled after this style, species counterpoint was first introduced by J. J. Fux in his 1725 treatise, Gradus ad Parnassum. Fux codified the study of Palestrinastyle counterpoint by presenting five categories of instruction, called species. Gradus was a standard counterpoint text studied by countless musicians during the eighteenth and nineteenth centuries. However, from a stylistic viewpoint, it did not present an adequate approximation of Palestrina's music [SS89]. This goal was eventually realized by Knud Jeppesen, who believed that counterpoint must be studied with as much correspondence between written exercises and composition as possible [Jep31]. In other words, counterpoint had to be learned within the context of a specific musical style. This project, called Palestrina, presents a framework for generating music in the style of Palestrina using Markov models. A set of species counterpoint rules as outlined by Jeppesen is implemented in the form of probability tables which are then used as statetransition matrices for the Markov models. In the next step, the probability tables are estimated from given counterpoint examples. Palestrina uses Markov chains, a subclass of graphical models to provide the the necessary structure for representing the compositional process of generating counterpoint. Markov chains provide the flexibility of integrating deterministic and probabilistic rules in the same model and can be trained on experimental data. Prior WorkIn 1955, Hiller and Isaacson [HI58] began investigating techniques to generate music using the ILLIAC, a computer located at the University of Illinois. Their first experiment used some basic counterpoint rules to generate fourpart first species counterpoint given a cantus firmus. Their objective, however, was not to compose great counterpoint  it was to "demonstrate that technical musical concepts could be translated into computer language to produce musical output." Since their initial experiments in the 1950's, there have been a number of other people who have implemented systems that generate species counterpoint.Kemal Ebcioglu (1980) created a rulebased system to generate fifthspecies counterpoint given a cantus firmus that encapsulates each rule in a simple LISP function [Ebc80]. An enumeration algorithm is used to generate as many solutions as possible given a time limit. The enumeration algorithm uses additional rules to output the more "musical" solutions first (added due to the fact that solutions blindly following the rules were deemed "grossly insufficient"). These include rules for producing better musical contours such as the following: If the melody skips, and if the notes within the scope of this skip have not already been sounded, then they must eventually be sounded before the end of the melody.The system, in the author's own words, seems to output solutions which "compare quite well to that of an average conservatory student." David Lewin (1983) has written a program that generates firstspecies counterpoint using his own "Global Rule" which resembles Ebcioglu's musical countour rule [Lew83]: For every note X of the counterpoint line lying above (below) the cadence tone, some note lying one step lower (higher) than X must appear in the line at some point subsequent to X.His program generates the counterpoint backwards from the cadence note in order to more easily implement the Global Rule. Although this method is more computationally efficient, it is, as argued by Robert Gjerdingen, an unmusical approach since it treats the counterpoint as a solution to a problem rather than an "aesthetic utterance." Gjerdingen (1988) approaches the problem from a musician's point of view, as opposed to a programmer's [Gje88]. His system, PRAENESTE, is based on the idea that, just like a sixteenthcentury singer improvising on a cantus, the computer should work forward in time without the benefit of being able to back up and start over. In response to each new contrapuntal situation, PRAENESTE uses a small number of concrete musical schemata to provide for itself a selection of correct melodic patterns. Given the preceding material, it selects a single melodic path that best satisfies a number of the higherlevel aesthetic constraints. For PRAENESTE, being correct is a matter of course; its main concern is with style (as described by Jeppesen). PRAENESTE is quite successful, and the results resemble Jeppesen's examples (especially those of the first three species) remarkably. In the case of fifth species, however, the musical differences are still considerable. William Schottstaedt (1989) implemented all five species with up to six voices [Sch89]. His program follows Fux's guidelines and rules as closely as possible. These rules were extended and modified until the results acceptably resembled Fux's examples. Since most of the rules are not absolute, a penalty system is used to assign them relative degrees of importance. If at any given time the total penalty exceeds a certain amount, that particular path is abandoned. For example, an absolute rule such as the restriction of parallel fifths has a infinite penalty. A "very bad infraction" like direct motion to an octave has a penalty of 200, while a stylistic infraction such as a repeated pattern of three notes only has 4. The penalty values arise from both Fux's commentary as well as experience running the program. The search time is optimized by taking the branch of the search tree which has the least penalty. If the path fails, then the program backs up and tries the next best branch. Unfortunately, the first solution found may not be very good, in which case a new search is made with a different starting interval in order to maximize the chances of finding a truly new solution. Even with this approach, the computetime for fifth species solutions still remains high. Schottstaedt also admits that the program has no provision for starting a melody with a rest. Neither does it favor invertible counterpoint and imitation. It tends to "let voices get entangled in each other" and make "inadequate judgments about overall melodic shapes."
The approach used in Synthesizing the counterpointDynamic ProgrammingPalestrina uses a dynamic programming approach to find the most likely solution out of a multitude of possible solutions given a cantus firmus [Ber95].
The first step is the construction of a state trellis diagram where the states describe the possible chromatic notes in the counterpoint (Fig. 1). The counterpoint is constrained to be above or below the cantus and the permissible note range spans a major tenth above (or below) the highest (or lowest) cantus note. Hence the number of states in the trellis equals the chromatic range of the cantus plus 16 (a major tenth is equivalent to 16 minor seconds). Musical constraints on the counterpoint are then formalized by a mix of transition probabilities and state probabilities conditioned on the cantus. For example, the unconditional likelihood of a harmonic interval and the probability of a melodic interval given the previous melodic interval are specified. While most rules can be realized based on a firstorder Markov chain, some require a secondorder system. A basic firstorder structure has been implemented, but the model also allows for secondorder considerations by looking one step ahead and choosing optimal solutions with respect to the next time step. In the synthesis application, the most likely path through the trellis structure given the underlying cantus and transition rules is calculated. Evaluating every possible path would be computationally prohibitive. However, the problem is made tractable by using a dynamic programming approach commonly known as the Viterbi algorithm. The Viterbi algorithm is based on the insight that the most likely past state sequence leading into any one state is independent from future states. There are a number of variants that are used for different problems, one of the most common applications is finding the most likely state sequence in a Hidden Markov model  that is, a Markov Chain with undefined states at the time of building the model. Another common application is for demodulating convolved communication channels [Vit67].
Going forward in the chain, previous path probabilities are multiplied with the transition probability of the new update, and the path that generated the highest joint probability is stored. At any time step τ, the only considerations are the path leading into state τ1, the probability associated with that sequence, and the transition probabilities at time τ. After the last transition occurs, the sequence with the highest probability is selected as the solution [Ber95]. Probabilistic counterpoint rulesEach rule is implemented as a probability table where illegal transitions are described by probability zero. The transition probabilities for generating a counterpoint line are obtained by multiplying the individual values from each table, assuming the rules are independent:
Analysis of counterpointPalestrina can infer a probabilistic description of counterpoint rules from existing compositions. For the purpose of experimentation, a set of firstspecies counterpoint strictly conforming to the rules were composed. Twelve examples were composed by humans and 44 examples were generated by a machine using the algorithm described above. This database of speciescounterpoint examples was used as training data for a Markov model. The framework was set up for a state sequence identical to the last section. The transition tables were then inferred by counting the number of occurrences of certain transitions and renormalizing. Table 2 compares two tables which were used in the generating algorithm with those that were inferred from humangenerated examples and machinegenerated examples.Experimental resultsA software application for Windows was written that (a) infers probability tables from given counterpoint examples, (b) composes firstspecies counterpoint given a cantus firmus, (c) plays and displays the composed music (Figure 3).
Much of the experimentation centered around deciding appropriate transition probabilities for the rule tables. While it was easy to make transitions legal or illegal, based on the outlined rules, it was more interesting and timeconsuming to weight the probabilities properly in order to get musical results. Subtle changes in the table values resulted in very different solutions. Also probabilities had to be weighted not just with respect to other values in the same table but with respect to competing rules. For example, one possible situation where there are two acceptable solutions might be the following: (1) the penultimate note is approached in stepwise motion (ideal) but four consecutive parallel sixths result (acceptable, but not ideal) vs. (2) the penultimate note is not approached in stepwise motion (acceptable, but not ideal) but there are fewer consecutive parallel sixths. This is just one example of an aesthetic decision which can be reflected in the probability values. During the course of implementation, it became apparent that some of the earlier tables were unnecessary. For example, there was initially a table which restricted parallel fifths and octaves. This was later rendered superfluous by the approaching motion table listed above (parallel motion is a special case of direct motion). Another influential aesthetic factor was climax placement. As recommended by Jeppesen, the climax note should be unique as well as higher in pitch than all other notes. The program either tries each possible climax time until a satisfactory (if any) solution is reached, or the user specifies where the climax point should be. This rule helps produce surprisingly musical results. Sound examplesFive recordings of output from the Palestrina Windows application. The original output was in MIDI. It was recorded as audio, then converted to MP3 format.
Conclusions and Future WorkPalestrina shows effectively that Markov chains can be used to compose a firstspecies counterpoint line given a cantus firmus. Not only does the composed line comply with the strict rules of sixteenthcentury counterpoint, but the results are also musical and comparable to those created by a knowledgeable musician. The program was capable of finding solutions identical to those presented by Jeppesen as well as some student compositions (...graded A). It furthermore shows how some of the more complex transition tables can be estimated from given counterpoint examples and that these probability tables are close to the data available for estimation. The ultimate goal of this work is to infer rules from authentic Palestrina works and to compose such counterpoint based on the inference model. Palestrina also includes a module that parses data in MIDI files and converts them into a convenient format for analysis in Matlab. I've created a MIDI database consisting of 30 threepart movements from Palestrina masses, yet the evaluation of the data must be left for future work.References[SS89] Felix Salzer and Carl Schachter. Counterpoint in Composition. Columbia University Press, New York, 1989. [Jep31] Knud Jeppesen. Counterpoint, the Polyphonic Vocal Style of the Sixteenth Century, 1931. [Jor98] Michael Jordan. Learning in Graphical Models. MIT Press, Cambridge, Massachusetts, 1998. [HI58] Lejaren Hiller and Leonard Isaacson. Musical Composition with a HighSpeed Digital Computer. Journal of the Audi o Engineering Society. 1958. [Ebc80] Kemal Ebcioglu. Computer Counterpoint. ICMC, 1980. [Lew83] David Lewin. An Interesting Global Rule for Species Counterpoint. Theory Only, 6.8. March 1983. pp. 1944. [Gjer88] Robert Gjerdingen. Concrete Musical Knowledge and a Computer Program for Species Counterpoint. Explorations in M usic, the Arts, and Ideas: Essays in Honor of Leonard B. Meyer. Pendragon Press, Stuyvesant, 1988. [Sch89] William Schottstaedt. Automatic Counterpoint. In: Current Directions in Computer Music Research. Matthews & Pierce eds. MIT Press, 1989. [Ber95] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control. Athena Scientific, Belmont Massachusetts, 1995. [RJ86] L.R. Rabiner and B.H. Juang. An Introduction to Hidden Markov Models. IEEE ASSP Magazine. 1986. pp 416. [Vit67] Viterbi, A. J. Error bounds for convolutional codes and an asymptotically optimal decoding algorithm. IEEE Transactions on Information Processing, 1967. 13:260269. 

Last modified: 2 Sept 2001