The original head alignment program from the FLIRT system used one translation, one rotation, and a vertical stretch as its degrees of freedom when aligning the four points (eyes, nose, and mouth) of the input heads to the canonical head. There were two most noticeable problems with the output of this program. First, some output heads' radii were significantly larger than others, causing some heads to look too large. Second, the resampling algorithm in this program (which creates a grid-aligned range map from the rotated data) caused the rotated heads to have quantization artifacts; these appeared as visible creases in the output heads.
To alleviate these problems, a new alignment program was written, which used solely rigid transformations (one translation and one rotation) to align these four points on the heads. The algorithm for this alignment is derived in , and can be summarized as follows:
to minimize the least-squared error between and ,
The fact that the rays are all being cast towards the center of the model imposes some constraints on the problem which can be used to improve the performance of the algorithm. The 3D space can be divided up into ``bins'', with the restriction that no ray may intersect more than one bin. In this case the geometric constraint allows subdivision of space into wedges, that is, by and y: see Figure 3.1. By putting the triangles comprising the model into bins, we can restrict the number of triangles with which we need to perform intersection tests when actually executing the ray cast.
Figure 3.1: The spatial subdivision scheme. The triangle falls into the four shaded bins, which divide space radially. The ray being cast is only intersected with triangles in the bin it intersects.
In this case, it is easy to determine which bin or bins a particular triangle falls in. The vertical span of the triangle determines the range of bins in the y dimension; its angular span determines the range of bins in the dimension. Note that this causes a triangle to be placed in any bin with which it might have an intersection; this inaccuracy is acceptable, since it will never cause an intersection to be missed, and speeds up the binning process.
Once all of the triangles of the model have been binned, rays are cast into the model by first determining which bin the ray will intersect, and then intersecting the ray with all triangles in that bin. The intersection point with the largest radius (distance from the vertical axis) is returned as the result of the ray cast. There is a tradeoff between the size of the bins and the speed of the ray casting process; if the bins are made smaller (for example, the number of bins is equal to the number of rays being cast) then the number of triangle intersections required per ray will decrease, but the startup time for the binning process will increase. Informal testing has shown that allowing four rays to intersect with each bin (two in the direction and two in the y direction) offers a good compromise.
This binning scheme decreased the running time of the resampling algorithm from 40 minutes to under 30 seconds, a factor of 100 speed increase. The resampled heads are free from the original quantization artifacts.
Tests done with the output heads from the new alignment program showed that it did not line up the eyes and mouth as well as FLIRT's version, because of the lack of a vertical stretch. For this reason, the new resampling algorithm was coupled with the original alignment program (which, in the meantime, had had the ``oversized head'' problem fixed) to generate the aligned heads for later work.