A self-calibrated speaker tracking system using both audio and video data
Hu et al., 2002
- Document ID
- 1980908820179411728
- Author
- Hu J
- Su T
- Cheng C
- Liu W
- Wu T
- Publication year
- 2002
- Publication venue
- Proceedings of the International Conference on Control Applications
Snippet
In this paper, a self-calibrated speaker tracking system applied to both image tracking and sound source detection is proposed. The sound source estimated by the microphone system is used to supply the video tracking system. On the other hand, the direction of the speaker detected …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00288—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00597—Acquiring or recognising eyes, e.g. iris verification
- G06K9/00604—Acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment; Cameras comprising an electronic image sensor, e.g. digital cameras, video cameras, TV cameras, video cameras, camcorders, webcams, camera modules for embedding in other devices, e.g. mobile phones, computers or vehicles
- H04N5/225—Television cameras; Cameras comprising an electronic image sensor, e.g. digital cameras, video cameras, video cameras, camcorders, webcams, camera modules for embedding in other devices, e.g. mobile phones, computers or vehicles
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US6028960A (en) | | Face feature analysis for automatic lipreading and character animation |
| US8396282B1 (en) | | Method and system for computing fused saliency maps from multi-modal sensory inputs |
| KR100754385B1 (en) | | Positioning, tracking, and separating device using audio / video sensor and its method |
| Yang et al. | | Visual tracking for multimodal human computer interaction |
| US6404900B1 (en) | | Method for robust human face tracking in presence of multiple persons |
| AU6308799A (en) | | Locating an audio source |
| US20130272548A1 (en) | | Object recognition using multi-modal matching scheme |
| CN107820037B (en) | | Method, apparatus and system for audio signal and image processing |
| EP3275213B1 (en) | | Method and apparatus for driving an array of loudspeakers with drive signals |
| Matthews et al. | | A comparison of active shape model and scale decomposition based features for visual speech recognition |
| Brutti et al. | | Online cross-modal adaptation for audio–visual person identification with wearable cameras |
| KR20190009006A (en) | | Real time multi-object tracking device and method by using global motion |
| Hu et al. | | A self-calibrated speaker tracking system using both audio and video data |
| Kirchmaier et al. | | Dynamical information fusion of heterogeneous sensors for 3D tracking using particle swarm optimization |
| Li et al. | | Multiple active speaker localization based on audio-visual fusion in two stages |
| Wang et al. | | Real-time automated video and audio capture with multiple cameras and microphones |
| CN112508998A (en) | | Visual target alignment method based on global motion |
| Spors et al. | | Joint audio-video object tracking |
| Terissi et al. | | 3D Head Pose and Facial Expression Tracking using a Single Camera. |
| JP2016054409A (en) | | Image recognition device, image recognition method, and program |
| KR102894785B1 (en) | | System for sound source separation by using object analysis of video signals |
| Keyrouz et al. | | Three dimensional object tracking based on audiovisual fusion using particle swarm optimization |
| Cho et al. | | A New Multimodal Database for Performance Evaluation in System Level |
| Goyal | | Using Spasmodic Closure Patterns to Simplify Visual Voice Activity Detection |
| Decroix et al. | | Online audiovisual signature training for person re-identification |