Kroos, Christian (2004): A system for video-based analysis of face motion during speech. Dissertation, LMU München: Fakultät für Sprach- und Literaturwissenschaften
PDF: Kroos_Christian.pdf (5 MB)
Abstract
During face-to-face interaction, facial motion conveys information at various levels. These include a person's emotional state, their position in a discourse, and, while speaking, phonetic details about the speech sounds being produced. Trivially, measuring face motion is a prerequisite for any further analysis of its functional characteristics or information content. Precise measurements of locations on the face can be made with systems that track motion by means of active or passive markers placed directly on the face. Such systems, however, have the disadvantages of requiring specialised equipment, which restricts their use outside the lab, and of being invasive in the sense that the markers have to be attached to the subject's face. To overcome these limitations we developed a video-based system that measures face motion from standard video recordings by deforming the surface of an ellipsoidal mesh fitted to the face. The mesh is initialised manually on a reference frame and then projected onto subsequent video frames. Location changes between successive frames are determined for each mesh node adaptively within a well-defined area around the node, using a two-dimensional cross-correlation analysis on a two-dimensional wavelet transform of the frames. Position parameters are propagated in three steps from a coarser mesh and a correspondingly higher scale of the wavelet transform to the final fine mesh and lower scale of the wavelet transform. The sequential changes in the positions of the mesh nodes represent the facial motion. The method exploits inherent constraints of the facial surface, which distinguishes it from more general image-motion estimation methods, and, in contrast to feature-based methods, it returns measurement points distributed globally over the facial surface.
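To make the core measurement step concrete, the following is a minimal sketch, not the dissertation's implementation: it estimates the displacement of a single mesh node between two frames by 2D cross-correlation on the approximation sub-band of a 2D wavelet transform, the basic operation the abstract describes. The choice of NumPy, SciPy, and PyWavelets, the Haar wavelet, and the template and search window sizes are illustrative assumptions; the actual system uses a coarse-to-fine propagation over a full mesh, which is omitted here.

```python
# Illustrative sketch only (assumed libraries: NumPy, SciPy, PyWavelets).
# Window sizes, wavelet choice, and node position are hypothetical.
import numpy as np
import pywt
from scipy.signal import correlate2d

def wavelet_approximation(frame, level=2, wavelet="haar"):
    """Return the approximation sub-band of a 2D wavelet decomposition."""
    coeffs = pywt.wavedec2(frame, wavelet=wavelet, level=level)
    return coeffs[0]  # coarsest approximation coefficients

def node_displacement(ref_app, cur_app, node_rc, template=5, search=9):
    """Estimate the (row, col) shift of one mesh node between two frames.

    ref_app, cur_app : wavelet-approximation images of reference/current frame
    node_rc          : (row, col) of the node in approximation coordinates
    template, search : half-sizes of the template and search windows
    """
    r, c = node_rc
    tpl = ref_app[r - template:r + template + 1, c - template:c + template + 1]
    win = cur_app[r - search:r + search + 1, c - search:c + search + 1]
    # Zero-mean both patches so the correlation peak reflects structure, not brightness.
    tpl = tpl - tpl.mean()
    win = win - win.mean()
    corr = correlate2d(win, tpl, mode="valid")
    dr, dc = np.unravel_index(np.argmax(corr), corr.shape)
    offset = search - template  # correlation index corresponding to zero displacement
    return dr - offset, dc - offset

# Toy usage: a bright blob shifted by (2, 4) pixels at the original resolution.
level = 1
frame0 = np.zeros((128, 128))
frame0[60:68, 60:68] = 1.0
frame1 = np.roll(np.roll(frame0, 2, axis=0), 4, axis=1)
app0 = wavelet_approximation(frame0, level)
app1 = wavelet_approximation(frame1, level)
shift = node_displacement(app0, app1, node_rc=(32, 32))
print("displacement at original scale:", tuple(int(s) * 2**level for s in shift))
```

Working on the wavelet approximation rather than the raw frames means displacements are estimated at a coarser scale first and then rescaled, which is what allows the coarse-to-fine propagation over successively finer meshes described above.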
| Document type: | Dissertations (Dissertation, LMU München) |
| --- | --- |
| Keywords: | face, motion tracking, auditory-visual speech, wavelets, video |
| Subject areas: | 400 Language; 400 Language > 410 Linguistics |
| Faculties: | Fakultät für Sprach- und Literaturwissenschaften |
| Language of the thesis: | English |
| Date of the oral examination: | 16 February 2004 |
| MD5 checksum of the PDF file: | fed51a13feba05b4c1d2c498555026c2 |
| Shelf mark of the printed copy: | 0001/UMC 13698 |
| ID code: | 2145 |
| Deposited on: | 3 June 2004 |
| Last modified: | 24 October 2020, 11:38 |