Human voice or prompt generation? Can they co-exist in an application?

Németh et al., 2009

Document ID: 7018437754850735082
Author: Németh G; Zainkó C; Bartalis M; Olaszy G; Kiss G
Publication year: 2009
Publication venue: INTERSPEECH

Snippet

This paper describes an R&D project regarding procedures for the automatic maintenance of the interactive voice response (IVR) system of a mobile telecom operator. The original plan was to create a generic voice prompt generation system for the customer service …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services, time announcement
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4936—Speech interaction details
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition

Similar Documents

Publication	Publication Date	Title
US11594221B2 (en)	2023-02-28	Transcription generation from multiple speech recognition systems
US7292980B1 (en)	2007-11-06	Graphical user interface and method for modifying pronunciations in text-to-speech and speech recognition systems
US9418652B2 (en)	2016-08-16	Automated learning for speech-based applications
US7596499B2 (en)	2009-09-29	Multilingual text-to-speech system with limited resources
US8812314B2 (en)	2014-08-19	Method of and system for improving accuracy in a speech recognition system
Rabiner	1994	Applications of voice processing to telecommunications
Gardner-Bonneau et al.	2007	Human factors and voice interactive systems
US9401145B1 (en)	2016-07-26	Speech analytics system and system and method for determining structured speech
JP6517419B1 (en)	2019-05-22	Dialogue summary generation apparatus, dialogue summary generation method and program
Gibbon et al.	2013	Spoken language system and corpus design
Campbell	2005	Developments in corpus-based speech synthesis: Approaching natural conversational speech
Kopparapu	2015	Non-linguistic analysis of call center conversations
Kuhn et al.	2024	Measuring the accuracy of automatic speech recognition solutions
JP2020071676A (en)	2020-05-07	Speech summary generation apparatus, speech summary generation method, and program
US7428491B2 (en)	2008-09-23	Method and system for obtaining personal aliases through voice recognition
US20140278404A1 (en)	2014-09-18	Audio merge tags
Campbell	2007	Evaluation of speech synthesis: from reading machines to talking machines
JP3936351B2 (en)	2007-06-27	Voice response service equipment
Németh et al.	2009	Human voice or prompt generation? Can they co-exist in an application?
De Klerk	2002	Towards a corpus of black South African English
JP2021039293A (en)	2021-03-11	Information processing device, information processing method, and program
Gibbon et al.	2000	Consumer off-the-shelf (COTS) speech technology product and service evaluation
Hagen et al.	2003	HMM/MLP hybrid speech recognizer for the Portuguese telephone SpeechDat corpus
CN118051582A (en)	2024-05-17	Method, device, equipment and medium for identifying potential customers based on telephone voice analysis
Mac Lochlainn	2010	Sintéiseoir 1.0: a multidialectical TTS application for Irish