Two basic forms of experiments were performed. The first involved reconstructing heads that were themselves used as data points for computing the eigenspaces. In this situation we expected that, when all of the eigenvectors were used for reconstruction, the single eigenspace would have less reconstruction error than the modular eigenspaces (see Chapter 5), because minimizing this error is precisely the criterion an eigendecomposition optimizes. We wanted to see whether the modular eigenspaces would reduce the truncation error (incurred when the higher-frequency eigenvectors are dropped).
Our error metric was the Euclidean distance between an input scanned head and its eigenspace reconstruction. This metric has the disadvantage that it has no perceptual foundation; a reconstruction with a seemingly low error can be perceptually very different from the input. Heuristically, in the data presented here, a distance of a few hundred is relatively insignificant, while an error of a few thousand is very significant (the reconstruction bears only passing resemblance to the input).
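The reconstruction and error computation described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the thesis implementation: the data here are random stand-ins for flattened head scans, and all names are ours.

```python
import numpy as np

def build_eigenspace(heads):
    """PCA via SVD of the mean-centered data; each row of `heads`
    is a flattened scan."""
    mean = heads.mean(axis=0)
    # Rows of vt are the eigenvectors, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(heads - mean, full_matrices=False)
    return mean, vt

def reconstruct(head, mean, vt, k):
    """Project onto the first k eigenvectors and reconstruct."""
    coeffs = vt[:k] @ (head - mean)
    return mean + vt[:k].T @ coeffs

rng = np.random.default_rng(0)
heads = rng.normal(size=(8, 64))        # 8 stand-in flattened scans
mean, vt = build_eigenspace(heads)

head = heads[0]                         # a head that is in the eigenspace
full = reconstruct(head, mean, vt, 8)   # all eigenvectors kept
trunc = reconstruct(head, mean, vt, 4)  # truncated basis

err_full = np.linalg.norm(head - full)   # ~0, up to roundoff
err_trunc = np.linalg.norm(head - trunc) # truncation error
```

For a head used in building the eigenspace, the full-rank reconstruction is exact up to roundoff, while truncation introduces the error the experiments measure.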
(As expected, these differences are almost certainly due to roundoff error.)
The differences between the latter reconstructions may also be attributable to roundoff error. The modular eigenspaces' reconstructions seemed to have less error, but the perceptual differences between the reconstructions were not significant.
The other experiment involved cross-validation using the ``leave-one-out'' technique. An eigenspace (or set of eigenspaces, in the case of modular eigenspaces) was created from all but one of the available CyberWare scans; the remaining head was projected onto both the single and modular eigenspaces, and the reconstruction errors were compared. In this case we expected the modular eigenspaces to perform better than the single eigenspace, because there were separate reconstruction parameters for the eyes, nose, and mouth regions. However, this was not the case.
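The leave-one-out procedure can be sketched as follows. Again this is a schematic with random stand-in data, not the thesis code: for each scan, an eigenspace is built from the remaining scans and the held-out scan's reconstruction error is recorded.

```python
import numpy as np

def loo_errors(heads):
    """Leave-one-out cross-validation: build the eigenspace without
    scan i, then measure how well it reconstructs scan i."""
    errors = []
    for i in range(len(heads)):
        train = np.delete(heads, i, axis=0)
        mean = train.mean(axis=0)
        _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
        coeffs = vt @ (heads[i] - mean)
        recon = mean + vt.T @ coeffs
        errors.append(np.linalg.norm(heads[i] - recon))
    return errors

rng = np.random.default_rng(0)
heads = rng.normal(size=(8, 64))   # stand-in flattened scans
errs = loo_errors(heads)
```

Unlike the first experiment, the held-out head generally does not lie in the span of the training eigenvectors, so the error is nonzero even when every eigenvector is kept.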
The reconstructed heads from the cross-validation experiments were compared visually. In cases where the modular eigenspace reconstruction error differed noticeably from the single eigenspace's, the visual discrepancies between the models were negligible. For the most part, all of the reconstructed heads looked like the mean or ``average'' face, and the differences between the modular and single eigenspaces' reconstructions were limited to small variations in the eye and mouth regions.
We hypothesized that the presence of the hair and the back of the head in the eigenspace was causing it to concentrate less on the face, the region of interest. For this reason a new set of masks (eyes, nose, and mouth) was created that cropped out everything except the facial region. Because of time constraints, the cross-validation and truncation-error experiments above were not rerun with these eigenspaces; instead, data from FLIRT was used to compare the reconstructions visually. In the following experiments 8 input heads were used to compute the eigenspaces, and all 8 eigenvectors were kept.
The first set of experiments used smoothed versions of the modular eigenspace masks (Figure 6.1). One of the CyberWare scanned heads not in the eigenspaces was decimated to the dimensions FLIRT ordinarily provides, and projected into both a single and the modular eigenspaces. The results of this are shown in Figure 6.2. Both heads look approximately the same, with only minor variations in the nose and mouth regions. An input head from FLIRT was also projected into these eigenspaces; the results are shown in Figure 6.3. The obvious differences may be attributable to the non-orthogonality of the modular eigenspaces and the fact that we were combining them improperly (Chapter 5).
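The non-orthogonality mentioned above can be demonstrated with a small sketch. The 1-D "diffused" masks below are hypothetical stand-ins for the smoothed masks of Figure 6.1: because their supports overlap, eigenvectors computed from the two masked regions are not mutually orthogonal, so summing the per-region reconstructions is not a true orthogonal projection.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 32
x = np.linspace(0.0, 1.0, n)
# Two smoothed ("diffused") masks whose supports overlap in the
# middle, tapering instead of cutting off sharply.
mask_a = np.clip(2.0 - 3.0 * x, 0.0, 1.0)  # fades out left to right
mask_b = np.clip(3.0 * x - 1.0, 0.0, 1.0)  # fades in
overlap = (mask_a > 0) & (mask_b > 0)

heads = rng.normal(size=(8, n))            # stand-in flattened scans

def masked_eigvecs(data, mask):
    centered = (data - data.mean(axis=0)) * mask
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt

va = masked_eigvecs(heads, mask_a)
vb = masked_eigvecs(heads, mask_b)
# Because the mask supports overlap, basis vectors from the two
# modular eigenspaces have nonzero mutual inner products:
cross = np.abs(va @ vb.T).max()
```

With truly disjoint masks, `cross` would be zero; the overlap is what makes the naive combination of modular reconstructions improper, as discussed in Chapter 5.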
To try to remedy the reconstruction error, a new set of orthogonal masks was generated (Figure 6.4). In this situation it was expected that the modular eigenspaces would code the eigenfeatures better than the single eigenspace for the entire head, although edges might be visible along the boundaries where one eigenspace stopped coding for a region and another one started. Again, one of the scanned heads not in the eigenspaces was reconstructed (Figure 6.5). This time the modular eigenspace performed qualitatively better in the reconstruction; the eyes looked straight ahead instead of off to the side. Furthermore, no hard edges were visible in the modular reconstructed model. When input data from FLIRT was projected into this eigenspace, however, the result was not just visible edges, but fairly drastic protrusions of the eigenfeatures beyond the rest of the face (Figure 6.6). This is probably attributable to variations in the lighting of the user being tracked by FLIRT. An overhead light source was used, likely illuminating the nose more than the rest of the face and casting shadows on the cheeks. This could cause the average energy of the vector being projected into the nose eigenspace to be disproportionately larger than that of the ``rest of the face'' vector. Since we estimated the structure of the head solely from the texture, these regions protruded further in the reconstruction. However, color variations between the eigenfeatures were also visible in the actual model; these did not disappear even when the structure information was removed from the eigenspace.
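The behavior of orthogonal masks can be illustrated with a small sketch (again with hypothetical 1-D data and names of our own choosing). When binary masks partition the vector, the modular eigenspaces are mutually orthogonal, and for a head that is in the training set the per-region reconstructions sum to an exact whole:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 32
# Binary masks that partition the vector: each sample belongs to
# exactly one region, so the modular eigenspaces are orthogonal.
mask_a = (np.arange(n) < n // 2).astype(float)
mask_b = 1.0 - mask_a

heads = rng.normal(size=(8, n))   # stand-in flattened scans

def reconstruct_region(head, data, mask):
    """Build an eigenspace for one masked region and reconstruct
    that region of `head` with all eigenvectors kept."""
    mean = data.mean(axis=0) * mask
    centered = (data - data.mean(axis=0)) * mask
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    coeffs = vt @ (head * mask - mean)
    return mean + vt.T @ coeffs

head = heads[0]                   # a head that is in the eigenspaces
recon = (reconstruct_region(head, heads, mask_a)
         + reconstruct_region(head, heads, mask_b))
err = np.linalg.norm(head - recon)   # ~0: the regions tile exactly
```

The hard edges seen in practice arise not from the combination itself, which is now a proper orthogonal projection, but from the two regions being coded independently, so their reconstructions need not agree in brightness or depth at the boundary.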
Figure 6.1: The cropped and diffused modular eigenspace masks.
Figure 6.2: Cropped, diffused mask reconstruction: from left to right, original head, single eigenspace reconstruction, modular reconstruction.
Figure 6.3: Cropped, diffused mask reconstruction from FLIRT: video image is as in Figure 1.2. Left is single eigenspace reconstruction, right is modular reconstruction.
Figure 6.4: The orthogonal modular eigenspace masks.
Figure 6.5: Orthogonal mask reconstruction. Note that the eyes look straight ahead, unlike the diffused mask versions.
Figure 6.6: Orthogonal mask reconstruction from FLIRT: video image is as in Figure 1.2. Lighting variations on the face may be the cause of the obvious borders around the eigenfeatures.