Horvitz et al., 2000 - Google Patents
Deeplistener: harnessing expected utility to guide clarification dialog in spoken language systems. Horvitz et al., 2000
- Document ID
- 4377204221556194569
- Author
- Horvitz E
- Paek T
- Publication year
- 2000
- Publication venue
- INTERSPEECH
Snippet
We describe research on endowing spoken language systems with the ability to consider the cost of misrecognition, and using that knowledge to guide clarification dialog about a user's intentions. Our approach relies on coupling utility-directed policies for dialog with the …
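As a rough illustration of the idea in the snippet, the sketch below picks the dialog action with the highest expected utility given a probability distribution over the user's intention. It is not the authors' model: the action names, intention labels, probabilities, and utility values are all hypothetical placeholders chosen for the example.

```python
# Minimal sketch of utility-directed action selection (hypothetical values throughout).

def expected_utility(action, belief, utility):
    """Sum of utility(action, intent) weighted by the probability of each intent."""
    return sum(p * utility[(action, intent)] for intent, p in belief.items())


def choose_action(belief, utility, actions):
    """Return the action whose expected utility is highest under the belief."""
    return max(actions, key=lambda a: expected_utility(a, belief, utility))


# Assumed action set: act on the recognized command, or ask the user to clarify.
ACTIONS = ["execute", "ask_clarify"]

# Assumed utilities: acting on a misrecognized command is costly,
# while asking always costs a little of the user's time.
UTILITY = {
    ("execute", "yes"): 1.0,
    ("execute", "no"): -1.0,
    ("ask_clarify", "yes"): 0.5,
    ("ask_clarify", "no"): 0.5,
}

# Two assumed belief states about whether the user really intended the command.
confident = {"yes": 0.95, "no": 0.05}
uncertain = {"yes": 0.6, "no": 0.4}

print(choose_action(confident, UTILITY, ACTIONS))
print(choose_action(uncertain, UTILITY, ACTIONS))
```

With these assumed numbers, a confident recognition favors acting directly, while an uncertain one makes the clarification question the better choice because the cost of a mistaken action outweighs the cost of asking.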
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling
            - G10L15/1822—Parsing for meaning understanding
            - G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
              - G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
                - G10L15/197—Probabilistic grammars, e.g. word n-grams
          - G10L2015/088—Word spotting
        - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
        - G10L15/28—Constructional details of speech recognition systems
          - G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
          - G10L15/063—Training
            - G10L2015/0635—Training updating or merging of old and new templates; Mean values; Weighting
              - G10L2015/0636—Threshold criteria for the updating
        - G10L15/26—Speech to text systems
      - G10L17/00—Speaker identification or verification
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
        - G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
          - G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
            - G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
      - G10L13/00—Speech synthesis; Text to speech systems
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11250844B2 (en) | Managing agent engagement in a man-machine dialog | |
US7580908B1 (en) | System and method providing utility-based decision making about clarification dialog given communicative uncertainty | |
US10319381B2 (en) | Iteratively updating parameters for dialog states | |
Smith et al. | Interaction strategies for an affective conversational agent | |
Horvitz et al. | Models of attention in computing and communication: from principles to applications | |
KR101622111B1 (en) | Dialog system and conversational method thereof | |
US20030061029A1 (en) | Device for conducting expectation based mixed initiative natural language dialogs | |
CN114127710A (en) | Ambiguity Resolution Using Conversational Search History | |
Horvitz et al. | Harnessing models of users’ goals to mediate clarification dialog in spoken language systems | |
US20220115001A1 (en) | Method, System and Apparatus for Understanding and Generating Human Conversational Cues | |
EP4091161B1 (en) | Synthesized speech audio data generated on behalf of human participant in conversation | |
Horvitz et al. | Deeplistener: harnessing expected utility to guide clarification dialog in spoken language systems. | |
US11437039B2 (en) | Intelligent software agent | |
US11669697B2 (en) | Hybrid policy dialogue manager for intelligent personal assistants | |
CN116417003A (en) | Voice interaction system, method, electronic device and storage medium | |
CN114127694A (en) | Error recovery for the session system | |
Feng et al. | ASR-GLUE: A new multi-task benchmark for asr-robust natural language understanding | |
US11430446B1 (en) | Dialogue system and a dialogue method | |
CN115552517A (en) | Non-hotword preemption of automated assistant response presentations | |
US12148417B1 (en) | Label confidence scoring | |
CN116368562A (en) | Enabling natural conversations for automated assistants | |
CN111292749B (en) | Session control method and device of intelligent voice platform | |
EP4089569A1 (en) | A dialogue system and a dialogue method | |
Lahiri et al. | Hybrid multi purpose voice assistant | |
Paul et al. | Intent based multimodal speech and gesture fusion for human-robot communication in assembly situation |