Abstract
We present a method that extracts the 3-D shape and movement of the lips and tongue and displays them simultaneously. Lip movement is easily observable and can therefore be captured with a camera. Tongue movement, however, is difficult to capture accurately because the tongue is often occluded by the lips and teeth. In this paper, we use a magnetic resonance imaging (MRI) device to capture a sagittal view of the tongue during articulation. Since the frame rate of the available MRI device is very low (5 fps), we obtain a smooth video sequence (20 fps) using a new contour-based interpolation method. The overall procedure is as follows. First, fiducial color markers attached to the lips are detected, and their 3-D movement is computed using a 3-D reconstruction technique. Next, a series of simple image processing algorithms is applied to the MRI images of the tongue, and the tongue contour is extracted interactively. Finally, the lip and tongue data are synchronized and temporally interpolated. An OpenGL-based program is implemented to visualize the data interactively. We performed experiments using the basic Korean syllables, and some of the data are presented. Experiments using the results were confirmed to support theoretical and empirical observations in linguistics. The acquired data can serve not only as a fundamental database for scientific purposes but also as educational material for language rehabilitation of the hearing-impaired. They can also be used to create high-quality lip-synchronized animation that includes tongue movement.
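The abstract does not describe the contour-based interpolation in detail, so the following is only a minimal sketch of one plausible way to raise a 5 fps contour sequence to 20 fps. It assumes each tongue contour is an open polyline of (x, y) points, resamples every contour to the same number of points by arc length, and linearly blends corresponding points between consecutive frames. The function names `resample` and `upsample`, the parameter `n_points`, and the linear blend itself are illustrative assumptions, not the authors' method.

```python
import math

def resample(contour, n):
    """Resample an open polyline to n points equally spaced by arc length."""
    # Cumulative arc length at each vertex of the input polyline.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(contour) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = contour[seg], contour[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def upsample(keyframes, factor=4, n_points=50):
    """Insert factor - 1 linearly blended contours between consecutive keyframes.

    factor=4 corresponds to going from 5 fps to 20 fps (an assumption
    taken from the frame rates quoted in the abstract).
    """
    frames = [resample(c, n_points) for c in keyframes]
    out = []
    for a, b in zip(frames, frames[1:]):
        for k in range(factor):
            w = k / factor  # blend weight: 0 at keyframe a, approaching 1 at b
            out.append([((1 - w) * ax + w * bx, (1 - w) * ay + w * by)
                        for (ax, ay), (bx, by) in zip(a, b)])
    out.append(frames[-1])  # keep the final keyframe
    return out

# Two hypothetical keyframe contours, 0.2 s apart at 5 fps.
low = [(0.0, 0.0), (2.0, 0.0)]
high = [(0.0, 4.0), (2.0, 4.0)]
smooth = upsample([low, high], factor=4, n_points=3)
```

Point-to-point blending after arc-length resampling is the simplest correspondence rule; it can distort contours whose shape changes strongly between keyframes, which is presumably one reason a dedicated contour-based method is used in the paper.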
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Park, H., Hong, SW., Park, JI., Moon, SK., Ko, H. (2005). Extracting the Movement of Lip and Tongue During Articulation. In: Ho, YS., Kim, H.J. (eds) Advances in Multimedia Information Processing - PCM 2005. PCM 2005. Lecture Notes in Computer Science, vol 3767. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11581772_75
DOI: https://doi.org/10.1007/11581772_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30027-4
Online ISBN: 978-3-540-32130-9
eBook Packages: Computer Science, Computer Science (R0)