A self-calibrated speaker tracking system using both audio and video data

Document ID: 1980908820179411728
Author: Hu J; Su T; Cheng C; Liu W; Wu T
Publication year: 2002
Publication venue: Proceedings of the International Conference on Control Applications

Snippet

This paper proposes a self-calibrated speaker tracking system applied to both image tracking and sound source detection. The sound source estimate from the microphone system is supplied to the video tracking system. Conversely, the direction of the speaker detected …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00597—Acquiring or recognising eyes, e.g. iris verification
- G06K9/00604—Acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment; Cameras comprising an electronic image sensor, e.g. digital cameras, video cameras, TV cameras, camcorders, webcams, camera modules for embedding in other devices, e.g. mobile phones, computers or vehicles
- H04N5/225—Television cameras; Cameras comprising an electronic image sensor, e.g. digital cameras, video cameras, camcorders, webcams, camera modules for embedding in other devices, e.g. mobile phones, computers or vehicles

Similar Documents

Publication	Publication Date	Title
US6028960A (en)	2000-02-22	Face feature analysis for automatic lipreading and character animation
US8396282B1 (en)	2013-03-12	Method and system for computing fused saliency maps from multi-modal sensory inputs
KR100754385B1 (en)	2007-08-31	Positioning, tracking, and separating device using audio / video sensor and its method
Yang et al.	1998	Visual tracking for multimodal human computer interaction
US6404900B1 (en)	2002-06-11	Method for robust human face tracking in presence of multiple persons
AU6308799A (en)	2001-03-05	Locating an audio source
US20130272548A1 (en)	2013-10-17	Object recognition using multi-modal matching scheme
CN107820037B (en)	2021-03-26	Method, apparatus and system for audio signal and image processing
EP3275213B1 (en)	2019-12-04	Method and apparatus for driving an array of loudspeakers with drive signals
Matthews et al.	1998	A comparison of active shape model and scale decomposition based features for visual speech recognition
Brutti et al.	2016	Online cross-modal adaptation for audio–visual person identification with wearable cameras
KR20190009006A (en)	2019-01-28	Real time multi-object tracking device and method by using global motion
Kirchmaier et al.	2011	Dynamical information fusion of heterogeneous sensors for 3D tracking using particle swarm optimization
Li et al.	2012	Multiple active speaker localization based on audio-visual fusion in two stages
Wang et al.	2001	Real-time automated video and audio capture with multiple cameras and microphones
CN112508998A (en)	2021-03-16	Visual target alignment method based on global motion
Spors et al.	2001	Joint audio-video object tracking
Terissi et al.	2010	3D Head Pose and Facial Expression Tracking using a Single Camera
JP2016054409A (en)	2016-04-14	Image recognition device, image recognition method, and program
KR102894785B1 (en)	2025-12-02	System for sound source separation by using object analysis of video signals
Keyrouz et al.	2008	Three dimensional object tracking based on audiovisual fusion using particle swarm optimization
Cho et al.	2015	A New Multimodal Database for Performance Evaluation in System Level
Goyal	2021	Using Spasmodic Closure Patterns to Simplify Visual Voice Activity Detection
Decroix et al.	2016	Online audiovisual signature training for person re-identification