[go: up one dir, main page]

Wu et al., 2021 - Google Patents

Modeling the Conditional Distribution of Co-Speech Upper Body Gesture Jointly Using Conditional-GAN and Unrolled-GAN. Electronics 2021, 10, 228

Wu et al., 2021

View PDF
Document ID
9282625478903747496
Author
Wu B
Liu C
Ishi C
Ishiguro H
Publication year

External Links

Snippet

Co-speech gestures are a crucial, non-verbal modality for humans to communicate. Social agents also need this capability to be more human-like and comprehensive. This study aims to model the distribution of gestures conditioned on human speech features. Unlike previous …
Continue reading at pdfs.semanticscholar.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/203D [Three Dimensional] animation
    • G06T13/403D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/004Artificial life, i.e. computers simulating life
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065Adaptation
    • G10L15/07Adaptation to the speaker
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computer systems utilising knowledge based models
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
    • G10L2021/105Synthesis of the lips movements from speech, e.g. for talking heads
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification

Similar Documents

Publication Publication Date Title
Alexanderson et al. Listen, denoise, action! audio-driven motion synthesis with diffusion models
Nyatsanga et al. A comprehensive review of data‐driven co‐speech gesture generation
Bhattacharya et al. Text2gestures: A transformer-based network for generating emotive body gestures for virtual agents
Ghorbani et al. ZeroEGGS: Zero‐shot Example‐based Gesture Generation from Speech
Bhattacharya et al. Speech2affectivegestures: Synthesizing co-speech gestures with generative adversarial affective expression learning
Chiu et al. How to train your avatar: A data driven approach to gesture generation
Gibet et al. The signcom system for data-driven animation of interactive virtual signers: Methodology and evaluation
Qi et al. Emotiongesture: Audio-driven diverse emotional co-speech 3d gesture generation
Zhi et al. Livelyspeaker: Towards semantic-aware co-speech gesture generation
US20250131631A1 (en) Three-dimensional face animation from speech
Ondras et al. Audio-driven robot upper-body motion synthesis
Yang et al. Statistics‐based motion synthesis for social conversations
US12456244B2 (en) Autonomous animation in embodied agents
Rebol et al. Passing a non-verbal turing test: Evaluating gesture animations generated from speech
US20220215267A1 (en) Processes and methods for enabling artificial general intelligence capable of flexible calculation, prediction, planning and problem solving with arbitrary and unstructured data inputs and outputs
WO2025025822A1 (en) Method and apparatus for generating virtual object action, and computer device
Sadoughi et al. Creating prosodic synchrony for a robot co-player in a speech-controlled game for children
Wang et al. Integrated speech and gesture synthesis
CN113609301A (en) Dialogue method, medium and system based on knowledge graph
Gao et al. Gesgpt: Speech gesture synthesis with text parsing from chatgpt
Sun et al. Beyond talking–generating holistic 3d human dyadic motion for communication
Voß et al. Aq-gt: a temporally aligned and quantized gru-transformer for co-speech gesture synthesis
Zhang et al. Speech-driven personalized gesture synthetics: Harnessing automatic fuzzy feature inference
Oralbayeva et al. Data-driven communicative behaviour generation: A survey
Lee et al. Crossmodal clustered contrastive learning: Grounding of spoken language to gesture