Hu et al., 2002 - Google Patents

A self-calibrated speaker tracking system using both audio and video data

Document ID
1980908820179411728
Author
Hu J
Su T
Cheng C
Liu W
Wu T
Publication year
2002
Publication venue
Proceedings of the International Conference on Control Applications

Snippet

This paper proposes a self-calibrated speaker tracking system for both image tracking and sound source detection. The sound source estimated by the microphone system is supplied to the video tracking system. On the other hand, the direction of the speaker detected …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00288 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00597 Acquiring or recognising eyes, e.g. iris verification
    • G06K9/00604 Acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624 Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6201 Matching; Proximity measures
    • G06K9/6202 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment; Cameras comprising an electronic image sensor, e.g. digital cameras, video cameras, TV cameras, video cameras, camcorders, webcams, camera modules for embedding in other devices, e.g. mobile phones, computers or vehicles
    • H04N5/225 Television cameras; Cameras comprising an electronic image sensor, e.g. digital cameras, video cameras, video cameras, camcorders, webcams, camera modules for embedding in other devices, e.g. mobile phones, computers or vehicles

Similar Documents

Publication Title
US6028960A (en) Face feature analysis for automatic lipreading and character animation
US8396282B1 (en) Method and system for computing fused saliency maps from multi-modal sensory inputs
KR100754385B1 (en) Positioning, tracking, and separating device using audio / video sensor and its method
Yang et al. Visual tracking for multimodal human computer interaction
US6404900B1 (en) Method for robust human face tracking in presence of multiple persons
AU6308799A (en) Locating an audio source
US20130272548A1 (en) Object recognition using multi-modal matching scheme
CN107820037B (en) Method, apparatus and system for audio signal and image processing
EP3275213B1 (en) Method and apparatus for driving an array of loudspeakers with drive signals
Matthews et al. A comparison of active shape model and scale decomposition based features for visual speech recognition
Brutti et al. Online cross-modal adaptation for audio–visual person identification with wearable cameras
KR20190009006A (en) Real time multi-object tracking device and method by using global motion
Hu et al. A self-calibrated speaker tracking system using both audio and video data
Kirchmaier et al. Dynamical information fusion of heterogeneous sensors for 3D tracking using particle swarm optimization
Li et al. Multiple active speaker localization based on audio-visual fusion in two stages
Wang et al. Real-time automated video and audio capture with multiple cameras and microphones
CN112508998A (en) Visual target alignment method based on global motion
Spors et al. Joint audio-video object tracking
Terissi et al. 3D Head Pose and Facial Expression Tracking using a Single Camera.
JP2016054409A (en) Image recognition device, image recognition method, and program
KR102894785B1 (en) System for sound source separation by using object analysis of video signals
Keyrouz et al. Three dimensional object tracking based on audiovisual fusion using particle swarm optimization
Cho et al. A New Multimodal Database for Performance Evaluation in System Level
Goyal Using Spasmodic Closure Patterns to Simplify Visual Voice Activity Detection
Decroix et al. Online audiovisual signature training for person re-identification