GB2304957A - Voice-dialog system for automated output of information - Google Patents
- Publication number
- GB2304957A (application GB9618308A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- utterance
- voice
- utterances
- user
- identifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4931—Directory assistance systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
Abstract
A voice-dialog system outputs information, in particular a telephone number. An alphabet-identifier identifies an utterance which is spelt out by the user and selects utterances that can be spelt in a similar manner from a plurality of predetermined utterances; an utterance-identifier compares the utterance input by the user with the utterances selected by the alphabet-identifier and supplies at least one utterance for output to the user. A lexicon operates on-line and stores orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and a synthesizer can access in real time.
Description
VOICE-DIALOG SYSTEM FOR AUTOMATED OUTPUT OF INFORMATION
The invention relates to a voice-dialog method for automated output of information, such as a telephone number of a user, and to a voice-dialog installation for carrying out the method, and to an apparatus for speaker-independent voice-identification, in particular for use in such an installation.
Voice-dialog systems for automated voice output of telephone numbers are known, in which the dialog between a caller, who requires certain information, and the system is conducted over the telephone. The voice-dialog systems currently in operation can, however, only identify a fixed, small to medium vocabulary of approximately 1000 words. Any texts, including the output of place names, surnames and the telephone number, are output by way of a voice synthesizer. It has, however, been shown that errors in the pronunciation of names occur, particularly if the names do not obey the usual pronunciation rules.
The underlying object of the invention is therefore to make a voice-dialog method available for automated output of information and to provide a voice-dialog installation which is suitably developed for this purpose and which can process a very large identifiable vocabulary, that is, approximately 10,000 to 100,000 words, and can still attain an acceptable identification rate and which also reduces or even totally avoids errors in the case of the voice output of foreign-language terms.
According to a first aspect of the present invention, there is provided a voice-dialog method for automated output of information, having the following steps:
(a) intermittently loading orthographic-phonetic
information for a plurality of predetermined
utterances from a lexicon which is capable of
operating on-line, with the information being
available in real time;
(b) verbally requesting the user to input an
utterance;
(c) temporarily storing the utterance which has
been input;
(d) verbally requesting the user to spell the
utterance which has been input;
(e) in response to the spelt-out utterance,
identifying and selecting a plurality of the
predetermined, spelt-out reference utterances
with the aid of the stored orthographic
information on the basis of ascertaining
similarity;
(f) feeding the selected utterances and the
temporarily stored utterance to an utterance
identifier;
(g) identifying and selecting at least one
utterance from the selected utterances on the
basis of a similarity-comparison; and
(h) sequentially outputting the utterances found
in step (g) and the associated information in
synthesized voice form.
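Purely as an illustration (not the claimed implementation), steps (d) to (g) can be sketched as a toy text-level pipeline. The lexicon contents, the phonetic notation and the use of `difflib.SequenceMatcher` as the similarity measure are all invented stand-ins:

```python
# Toy sketch of steps (d)-(g): pre-select by spelling similarity, then
# identify by phonetic similarity. Lexicon entries and phonetic strings
# are illustrative assumptions, not data from the patent.
from difflib import SequenceMatcher

# Stand-in on-line lexicon: name -> (spelt-out orthography, phonetic form)
LEXICON = {
    "Meier": ("M E I E R", "m aI 6"),
    "Mayer": ("M A Y E R", "m aI 6"),
    "Meter": ("M E T E R", "m e: t 6"),
}

def select_by_spelling(spelt: str, n: int = 2):
    """Steps (d)-(e): pre-select reference utterances with similar spelling."""
    return sorted(
        LEXICON,
        key=lambda w: SequenceMatcher(None, spelt, LEXICON[w][0]).ratio(),
        reverse=True,
    )[:n]

def identify(phonetic_input: str, candidates):
    """Steps (f)-(g): pick the candidate whose phonetic form is most similar."""
    return max(
        candidates,
        key=lambda w: SequenceMatcher(None, phonetic_input, LEXICON[w][1]).ratio(),
    )

candidates = select_by_spelling("M E I E R")  # pre-selection from spelling
result = identify("m aI 6", candidates)       # comparison with spoken input
```

A real installation would of course score acoustic feature vectors rather than character strings; the two-stage structure is the point of the sketch.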
According to a second aspect of the present invention, there is provided a voice-dialog installation, comprising:
a device for the input of an utterance by a user,
at least one synthesizer for generating voice signals for the user,
a voice-inputting device,
an alphabet-identifier which can identify an utterance which is spelt out by the user and can select orthographically similar utterances from a plurality of predetermined spelt-out reference utterances,
an utterance-identifier which compares the utterance input by the user with the utterances selected by the alphabet-identifier and on the basis of ascertaining similarity supplies at least one utterance for output to the user, and
at least one lexicon which is capable of operating on-line and stores orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and the synthesizer can access in real time.
According to a third aspect of the present invention, there is provided an apparatus for speaker-independent voice-identification, having an alphabet-identifier which can identify an utterance spelt out by a user and can select several spelt-out reference utterances from a plurality of predetermined spelt-out reference utterances on the basis of ascertaining similarity, and having an utterance-identifier which, on the basis of ascertaining similarity, compares an utterance, which is input by the user and which corresponds to the spelt-out utterance, with the utterances which are pre-selected by the alphabet-identifier and supplies at least one output utterance as a result.
The invention is able to process a very large vocabulary at an acceptable identification rate, as an utterance input by a user undergoes combined voice identification. This utterance can be a surname, a first name, a street name, a place name or even words which are joined together. The combined voice identification comprises an alphabet-identifier, which can identify an utterance spelt out by the user and thereupon can select orthographically similar utterances from a plurality of predetermined reference utterances which have been spelt out. The term "orthographically similar utterance" is used in the following to express the fact that two or more sequences of pronounced letters forming words sound alike (e.g. "es e es es e el" and "ef e es es e el").
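The example of the two letter sequences above can be made concrete with a minimal sketch that treats a spelt-out utterance as a sequence of spoken letter names and scores agreement position by position. The letter names and the scoring rule are illustrative assumptions:

```python
# Minimal sketch: similarity of two spelt-out utterances, each given as
# a space-separated sequence of spoken letter names (e.g. German "es"
# for S, "ef" for F). The positional-agreement score is an assumption.

def letter_similarity(a: str, b: str) -> float:
    """Fraction of positions where the spoken letter names agree."""
    xs, ys = a.split(), b.split()
    if len(xs) != len(ys):
        return 0.0  # a real system would align sequences of unequal length
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)

# The two sequences from the text differ only in the first letter name:
s = letter_similarity("es e es es e el", "ef e es es e el")
```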
As a second main component, the combined voice identification includes an utterance-identifier which compares the utterance input directly by the user with the reference utterances which correspond to the spelt-out reference utterances which are selected by the alphabet-identifier. On the basis of ascertaining similarity, the utterance-identifier supplies as an identification result at least one word for output to the user, which word corresponds to a reference utterance similar to the user's utterance. A lexicon capable of operating on-line is used to store orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and a synthesizer can access in real time.
Advantageously, a memory for temporary storage is provided, which memory temporarily stores the utterance directly input by the user before it is forwarded to the utterance-identifier. In addition, the installation contains a further memory in which the spelt-out reference utterances, which have been preselected by the alphabet-identifier, are loaded in the form of a list of candidates of orthographically similar names.
The utterance-identifier operates in keyword-spotting mode so that the user can, within certain limits, make additional utterances before and after the actual utterance and the utterance-identifier is still able to extract the relevant utterance.
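The tolerance of surrounding filler can be illustrated with a text-level stand-in: real keyword spotting operates on acoustic features, but the extraction behaviour looks like this (filler and candidate lists are invented):

```python
# Text-level stand-in for keyword-spotting: filler words before and
# after the target are tolerated; only a known candidate is extracted.
# The filler list and candidate set are illustrative assumptions.

FILLERS = {"er", "um", "please"}

def spot_keyword(tokens, candidates):
    """Return the first token that matches a known candidate, skipping filler."""
    for tok in tokens:
        if tok in FILLERS:
            continue
        if tok in candidates:
            return tok
    return None

# "er Meier please" still yields the relevant utterance "Meier":
hit = spot_keyword(["er", "Meier", "please"], {"Meier", "Mayer"})
```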
The orthographic-phonetic information stored in the lexicon pertains, in the first place, to the spelling of the predetermined utterances which the alphabet-identifier uses in order to identify an utterance which has been spelt out and to make therefrom a pre-selection of orthographically similar names for the utterance-identifier. In addition, phonetic transcriptions, for example for place names and surnames, are stored in the lexicon. Orthographic and phonetic transcriptions of proper names are transmitted from an electronic dictionary of pronunciation to the lexicon in an off-line process.
In this connection, only proper names which occur in the electronic telephone directory are transferred.
The electronic telephone directory is a data bank which is capable of operating in real time and which contains the addresses and telephone numbers required to output information to the user. In order to obtain a high level of quality even in the case of voice-output of names which do not obey the usual pronunciation rules, intonation-related information of the terms is also stored in addition to the phonetic information. These voice features reproduce the intonation of syllables and endings of foreign-language words as well.
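The three kinds of information the lexicon holds for each term, namely spelling for the alphabet-identifier, phonetic transcription for the utterance-identifier and intonation features for the synthesizer, can be sketched as one record; the field names and sample values are assumptions, not the patent's data format:

```python
# Sketch of a single on-line-lexicon record. Field names, the SAMPA-like
# phonetic string and the stress notation are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class LexiconEntry:
    orthography: str   # spelling, used by the alphabet-identifier
    phonetic: str      # transcription, used by the utterance-identifier
    intonation: str    # stress/ending marks, used by the voice synthesizer

entry = LexiconEntry(orthography="Meier", phonetic="m aI 6", intonation="'maI-6")
```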
In order to avoid a situation where the results of identification of the combined voice identification are affected at random on account of acoustic similarities between words and/or spoken letters, additional information for homonyms is stored in the lexicon.
This additional information allows one candidate obtained by voice identification to be supplemented by alternatives which can be pronounced in the same way and thus allows the identification rate of the installation to be increased.
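Supplementing a candidate with its identically pronounced alternatives can be sketched as a lookup from pronunciation to spellings; the mapping below is an invented example in the spirit of the "Meier" family of names discussed later:

```python
# Sketch of homonym expansion: one identified candidate is supplemented
# with all spellings stored for its pronunciation. Both tables are
# illustrative assumptions.

PRONUNCIATION = {"Meier": "m aI 6", "Meter": "m e: t 6"}

HOMONYMS = {
    "m aI 6": ["Meier", "Mayer", "Maier", "Meyer"],
}

def expand(candidate: str):
    """Return the candidate plus all spellings pronounced the same way."""
    key = PRONUNCIATION.get(candidate)
    return HOMONYMS.get(key, [candidate])

variants = expand("Meier")  # all four spellings, not just the one found
```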
Advantageously, the lexicon includes a store for general vocabulary, for names of towns and for the surnames which occur there.
The control of the voice-dialog installation is effected by means of a program-controlled microcomputer. The control software implemented therein ensures inter alia that the required orthographic and phonetic information from the lexicon is made available to the identifiers and the synthesizer in good time and that the installation requests a user in a voice-controlled manner to input the respective utterances. In addition, it monitors the time-outs occurring in the voice-identifiers, processes terminating and help commands and takes over the identification and control of errors.
Internal program loops run in the utterance-identifier and in the alphabet-identifier; these loops can reject an utterance input by the user or, at the end of a given time span, can ask the user to input his utterance anew.
The invention is explained in greater detail below with reference to an exemplary embodiment in conjunction with the enclosed drawings in which:
Figure 1 is a schematic block diagram of a voice
dialog installation having the combined
voice-identification according to the
invention and an on-line lexicon;
Figure 2 is a flow chart showing the progress of
an automated voice dialog for name
identification and output of a pertinent
call number effected by the voice-dialog
installation according to Figure 1.
Figure 1 shows the basic structure of a voice-dialog installation which can effect lexicon-controlled identification of any utterances, for example of place names or surnames, by means of a combination of voice-identifiers and can output information associated with the utterance (for example a call number) on the basis of an utterance which has been ascertained (identification result). In detail, a telephone set or apparatus 10 is represented in Figure 1, at which apparatus a caller can input the place name and the surname of a subscriber, whose telephone number he wishes to find out, or certain other utterances.
Arranged on the operational side of the voice-dialog installation there is at least one analog-to-digital converter 80 which converts the analog voice signals from the subscriber into digital signals. The output of the analog-to-digital converter can be connected to the respective input of a voice memory 20 and an alphabet-identifier or letter-identifier 30. The voice memory 20 is used for temporary storage, for later use, of the utterance directly input into the telephone apparatus 10 by the caller, that is, for example, the name "Meier". The alphabet-identifier 30 receives, by way of the analog-to-digital converter 80 as a function of the status of the voice-dialog run, a spelt-out version of the directly input utterance which was previously stored in the voice memory 20. A program-controlled microcomputer 120 ensures that the directly input utterance is loaded into the voice memory 20 and that the spelt-out utterance is fed to the alphabet-identifier 30. The output of the alphabet-identifier 30 is connected to a memory 40, stored in which there is a list of candidates of orthographically similar utterances which have been ascertained by the alphabet-identifier 30 during a pre-selection. An utterance-identifier 50 is provided with three inputs which are connected to respective outputs of the candidate memory 40, the voice memory 20 and an on-line lexicon 70. The utterance-identifier 50 operates in the so-called keyword-spotting mode which makes it possible for the actual utterance, for example "Meier", to be correctly extracted, even if additional utterances such as "er", "please" or the like precede or follow it. The output of the keyword-spotter 50 is connected to an identification-result memory 55 in which the resultant utterances, that is, similarly sounding names, are stored by the keyword-spotter 50.
The utterances which are stored in the identification- result memory 55 are fed to a synthesizer 60 which on the basis of the corresponding information from the lexicon in turn transmits the names in synthesized speech by way of a digital-to-analog converter 85 to the telephone apparatus 10 of the subscriber. The synthesizer 60 can also produce the verbal requests to be made of the caller in conjunction with a database - not shown - in which all of the texts to be announced by the installation are contained in an orthographic or phonetic form.
The on-line lexicon 70 mentioned above is distinguished above all by the fact that it can be used simultaneously and in real time by the alphabet-identifier 30 for letter-identification, by the keyword-spotter 50 and by the synthesizer 60. That is why all the information relating to the utterances to be identified and output by the installation is stored in this lexicon 70. This information is orthographic and pronunciation- or intonation-related information which is loaded from a dictionary of pronunciation 100 into the on-line lexicon 70 in an off-line process. In addition, information on homonyms is stored in the lexicon 70 in order to extend the identification result of the utterance-identifier with names which sound alike, or in order to supplement the spelt-out reference utterances of the alphabet-identifier with orthographically similar names, and thus to increase the probability of detecting the correct utterance. This also ensures an increased success rate during use and an improved total throughput through the installation, as utterances which are to be identified are more rarely rejected by the voice-identifiers 30, 50. The information on homonyms makes it possible for the utterance-identifier, for example for an utterance "Meier", to find all the spellings present in the electronic telephone directory, such as, for example, "Meier", "Mayer", "Maier" and "Meyer", and to include them in the list of identification results. On the other hand, it is thereby possible for the alphabet-identifier to map frequently occurring and possibly incorrect spelling variants, such as "MULLER" or "MUELLER", to the correct, spelt-out reference utterance even if only the spelling with "Ü" appears in the telephone directory. The on-line lexicon 70 which has been described therefore assists both the voice-identification and the voice synthesis.
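Mapping variant spellings such as "MUELLER" onto the umlaut reference spelling can be sketched as a simple substitution pass before the orthographic comparison; the variant table is an illustrative assumption covering the usual German transliterations:

```python
# Sketch of normalising transliterated spelling variants to the umlaut
# reference spelling before orthographic comparison. The variant table
# is an illustrative assumption.

VARIANTS = {"UE": "Ü", "OE": "Ö", "AE": "Ä"}

def normalise(spelling: str) -> str:
    """Replace transliterated umlaut digraphs with the reference letters."""
    for digraph, umlaut in VARIANTS.items():
        spelling = spelling.replace(digraph, umlaut)
    return spelling

ref = normalise("MUELLER")  # maps onto the directory spelling "MÜLLER"
```

A production system would apply such rules more carefully (for instance, "UE" inside a genuine vowel pair must not be rewritten), which is why the patent keeps this mapping in the lexicon rather than in a blanket rule.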
The mode of operation of the voice-dialog installation is explained in greater detail in the following with reference to a name-identification. It may be assumed that the voice-dialog installation already knows the name of the place in which the person, whose telephone number a caller would like to find out, lives. For this purpose, the installation first asked the user of the telephone apparatus 10 to input the place name (for example Darmstadt) directly, that is, in a form not spelt out. Advantageously, the microcomputer 120 controls the installation in such a way that the place name is only fed to the keyword-spotter 50 in order to identify the utterance. As already mentioned, the keyword-spotter is able to tolerate additional utterances, such as "er" or "please", and extract merely the town name as information. The voice-dialog installation can also be developed in such a way that pre-selection of orthographically similar place names is effected by the alphabet-identifier 30 for the keyword-spotter 50 when an incorrect identification result or no identification result at all has been supplied by the keyword-spotter 50. After the place name has been identified, the voice-dialog installation makes available from the on-line lexicon 70 all the surnames which are stored in the electronic telephone directory 90 for this town name. It may further be assumed that the spelling of all the proper names which are required for the spelling identification in the alphabet-identifier 30, a respective sequence of phonetic symbols for all the proper names which are required for the voice-identification in the keyword-spotter, and also a respective sequence of phonetic symbols including intonation information required for the voice synthesis are contained in the on-line lexicon 70.
In addition, references to the corresponding entries in the on-line lexicon are contained in the electronic telephone directory 90 which contains the surnames of the subscribers with corresponding telephone numbers and addresses.
The caller is now guided through a dialog, during the course of which he finds out the desired telephone number by virtue of specifying the place name and the name of the subscriber.
The following voice dialog between the caller using the telephone apparatus 10 and the voice-dialog installation is explained in the flow chart according to Figure 2.
The caller is first asked verbally by the installation by way of the synthesizer 60 to input directly the desired name, for example "Meier". This input is subsequently temporarily stored in the voice memory 20. Additional utterances, such as "er" and "please", are thereby also recorded in the voice memory 20. Subsequently, the caller is requested verbally by way of the synthesizer 60 to spell out the name previously directly input. Thereupon, the subscriber inputs the letter sequence M, E, I, E, R.
In conjunction with the orthographic information which is stored in the on-line lexicon 70, the alphabet-identifier 30 ascertains similarity and makes a pre-selection from the list of available surnames stored in the on-line lexicon 70 under the place name. On account of identification uncertainties, the alphabet-identifier 30 ascertains a plurality of candidates, for example "Neier", "Meier", "Meter", "Mieter", "Neter", "Nieter", "Meiter", "Meider", etc. This list of candidates is stored in the memory 40. The program-controlled microcomputer 120 causes the keyword-spotter 50 to read out the user utterance "Meier" previously temporarily stored in the voice memory 20 and to load the pre-selected candidates which are in the memory 40.
On the basis of ascertaining similarity, the keyword-spotter 50 compares the spoken name "Meier", which is directly input, with the list of candidates by using the phonetic information stored in the on-line lexicon 70. The keyword-spotter 50 supplies, for example, the names "Neier" and "Meier" as an identification result and stores them in the result memory 55. The voice-dialog installation, on account of the phonetic and intonation-related information stored in the on-line lexicon 70, knows how to pronounce and intonate the identification results which have been found.
Thereupon, the names which have been found, in the present case the names "Neier" and "Meier", are successively transmitted by way of the synthesizer 60 to the telephone apparatus 10 of the caller. The caller can thereupon select the correct name. With this surname and the identified place name, a data bank inquiry of the electronic telephone directory 90 is then commenced. The names and addresses which are found are read out in a user-controlled manner, that is, the user can influence when the voice-output of the names and addresses which have been found is terminated and how often a list is read out or for which name additional information is to be output. In problem cases, it is possible for the caller to be connected through to an operator. As soon as the user of the voice-dialog installation indicates that the data output by way of the voice synthesizer 60 (first name, surname, street, street number) corresponds to the data of the person whose telephone number he is seeking, the microcomputer 120 causes the installation to read out the corresponding telephone number from the telephone directory 90 and inform the caller thereof verbally.
Owing to the lexicon-controlled identification of any utterances as a result of the combination of the alphabet-identifier 30 and the keyword-spotter 50, it is possible to process a considerably larger vocabulary at an acceptable identification rate than is possible with conventional installations, which use only one voice-identifier.
The reason for this can be seen in the fact that the alphabet-identifier 30 makes a pre-selection of the words which are to be identified and only this comparatively small selection of words which come into question is fed to the keyword-spotter 50 for actual identification.
Claims (19)
1. Voice-dialog method for automated output of information, having the following steps:
(a) intermittently loading orthographic-phonetic
information for a plurality of predetermined
utterances from a lexicon which is capable of
operating on-line, with the information being
available in real time;
(b) verbally requesting the user to input an
utterance;
(c) temporarily storing the utterance which has
been input;
(d) verbally requesting the user to spell the
utterance which has been input;
(e) in response to the spelt-out utterance,
identifying and selecting a plurality of the
predetermined, spelt-out reference utterances
with the aid of the stored orthographic
information on the basis of ascertaining
similarity;
(f) feeding the selected utterances and the
temporarily stored utterance to an utterance
identifier;
(g) identifying and selecting at least one
utterance from the selected utterances on the
basis of a similarity-comparison; and
(h) sequentially outputting the utterances found
in step (g) and the associated information in
synthesized voice form.
2. Voice-dialog method according to claim 1, characterised in that step (h) is repeated until the user terminates the synthesized voice output of the utterances.
3. Voice-dialog method according to claim 1 or 2, characterised in that steps (e) and (g) are terminated at the end of a predetermined time span and the user is requested to re-input his utterance if no utterance has been identified.
4. Voice-dialog method according to claim 2 or 3, characterised in that the user identifies one of the synthesized utterances as coinciding with his utterance, and in that, in response to this, an inquiry of an electronic telephone directory is commenced, which directory is capable of operating in real time and from which directory all of the data records meeting the criterion of the identified utterance are read out and made available to the user to choose from, and in that, on the basis of a name and an address read out from the directory, the user can identify the data record whose telephone number is to be output by the installation.
5. Voice-dialog method according to one of the claims 1 to 4, characterised in that orthographicphonetic information for predetermined utterances is loaded at predetermined instants from a lexicon which is capable of operating on-line.
6. Voice-dialog installation for carrying out the method according to one of the claims 1 to 5, comprising:
a device for the input of an utterance by a user,
at least one synthesizer for generating voice signals for the user,
a voice-inputting device,
an alphabet-identifier which can identify an utterance which is spelt out by the user and can select orthographically similar utterances from a plurality of predetermined spelt-out reference utterances,
an utterance-identifier which compares the utterance input by the user with the utterances selected by the alphabet-identifier and on the basis of ascertaining similarity supplies at least one utterance for output to the user, and
at least one lexicon which is capable of operating on-line and stores orthographic-phonetic information for the plurality of predetermined utterances which the alphabet-identifier, the utterance-identifier and the synthesizer can access in real time.
7. Voice-dialog installation according to claim 6, comprising a memory for temporary storage which temporarily stores the utterance input by the user, and by a memory which receives the utterances pre-selected by the alphabet-identifier.
8. Voice-dialog installation according to claim 6 or 7, characterised in that the utterance-identifier operates in keyword-spotting mode.
9. Voice-dialog installation according to one of the claims 6 to 8, characterised in that the data which is stored in the lexicon is orthographic, phonetic and intonation-related information for the predetermined utterances.
10. Voice-dialog installation according to claim 9, characterised in that additional information on homonyms is stored in the lexicon.
11. Voice-dialog installation according to one of the claims 6 to 10, characterised in that the utterance input by the user can be a place name, a surname or a plurality of words joined together.
12. Voice-dialog installation according to one of the claims 6 to 11, characterised in that the lexicon is capable of operating on-line and includes means for the storage of a general vocabulary, place names and surnames.
13. Voice-dialog installation according to one of the claims 6 to 12, characterised in that it is controlled by a program-controlled microcomputer.
14. Voice-dialog installation according to one of the claims 6 to 13, characterised in that the utterance-identifier and the alphabet-identifier are developed in such a way that they can reject an utterance input by the user and/or at the end of a given time span can ask the user to re-input his utterance.
15. Apparatus for speaker-independent voice-identification, in particular for use in a voice-dialog installation according to one of the claims 6 to 14, having an alphabet-identifier which can identify an utterance spelt out by a user and can select several spelt-out reference utterances from a plurality of predetermined spelt-out reference utterances on the basis of ascertaining similarity, and having an utterance-identifier which, on the basis of ascertaining similarity, compares an utterance, which is input by the user and which corresponds to the spelt-out utterance, with the utterances which are pre-selected by the alphabet-identifier and supplies at least one output utterance as a result.
16. Apparatus for voice-identification according to claim 15, wherein the utterance-identifier operates in the keyword-spotting mode.
17. Apparatus for voice-identification according to claim 15 or 16, comprising a lexicon which stores orthographic and phonetic information on the plurality of predetermined utterances which the alphabet-identifier and the utterance-identifier can access in real time in order to ascertain utterances which sound alike or are orthographically similar.
18. A voice-dialog method, substantially as herein described with reference to Figure 2 of the accompanying drawings.
19. A voice-dialog installation, substantially as herein described with reference to, or as shown in,
Figure 1 of the accompanying drawings.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE1995132114 DE19532114C2 (en) | 1995-08-31 | 1995-08-31 | Speech dialog system for the automated output of information |
Publications (3)
Publication Number | Publication Date |
---|---|
GB9618308D0 GB9618308D0 (en) | 1996-10-16 |
GB2304957A true GB2304957A (en) | 1997-03-26 |
GB2304957B GB2304957B (en) | 1999-09-29 |
Family
ID=7770897
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB9618308A Expired - Fee Related GB2304957B (en) | 1995-08-31 | 1996-09-02 | Voice-dialog system for automated output of information |
Country Status (3)
Country | Link |
---|---|
DE (1) | DE19532114C2 (en) |
FR (1) | FR2738382B1 (en) |
GB (1) | GB2304957B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE19907341A1 (en) * | 1999-02-20 | 2000-08-31 | Lutz H Karolus | Processing data as query information involves comparing original and alternative data files with data in connected database, outputting coinciding data to local data processing machine |
DE19907759C2 (en) * | 1999-02-23 | 2002-05-23 | Infineon Technologies Ag | Method and device for spelling recognition |
JP2001117828A (en) * | 1999-10-14 | 2001-04-27 | Fujitsu Ltd | Electronic device and storage medium |
EP1226576A2 (en) * | 1999-11-04 | 2002-07-31 | Telefonaktiebolaget Lm Ericsson | System and method of increasing the recognition rate of speech-input instructions in remote communication terminals |
DE10207895B4 (en) * | 2002-02-23 | 2005-11-03 | Harman Becker Automotive Systems Gmbh | Method for speech recognition and speech recognition system |
AT5730U3 (en) * | 2002-05-24 | 2003-08-25 | Roland Moesl | METHOD FOR FOGGING WEBSITES |
TWI298592B (en) * | 2005-11-18 | 2008-07-01 | Primax Electronics Ltd | Menu-browsing method and auxiliary-operating system of handheld electronic device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0311414A2 (en) * | 1987-10-08 | 1989-04-12 | Nec Corporation | Voice controlled dialer having memories for full-digit dialing for any users and abbreviated dialing for authorized users |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3732849A1 (en) * | 1987-09-29 | 1989-04-20 | Siemens Ag | SYSTEM ARCHITECTURE FOR AN ACOUSTIC HUMAN / MACHINE DIALOG SYSTEM |
US5131045A (en) * | 1990-05-10 | 1992-07-14 | Roth Richard G | Audio-augmented data keying |
US5293451A (en) * | 1990-10-23 | 1994-03-08 | International Business Machines Corporation | Method and apparatus for generating models of spoken words based on a small number of utterances |
DE69232407T2 (en) * | 1991-11-18 | 2002-09-12 | Kabushiki Kaisha Toshiba, Kawasaki | Speech dialogue system to facilitate computer-human interaction |
FR2690777A1 (en) * | 1992-04-30 | 1993-11-05 | Lorraine Laminage | Control of automaton by voice recognition - uses spelling of word or part of word by the operator to aid voice recognition and returns word recognised before acting |
WO1994014270A1 (en) * | 1992-12-17 | 1994-06-23 | Bell Atlantic Network Services, Inc. | Mechanized directory assistance |
- 1995
  - 1995-08-31 DE DE1995132114 patent/DE19532114C2/en not_active Expired - Fee Related
- 1996
  - 1996-08-28 FR FR9610517A patent/FR2738382B1/en not_active Expired - Fee Related
  - 1996-09-02 GB GB9618308A patent/GB2304957B/en not_active Expired - Fee Related
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6721702B2 (en) | 1999-06-10 | 2004-04-13 | Infineon Technologies Ag | Speech recognition method and device |
GB2353887A (en) * | 1999-09-04 | 2001-03-07 | Ibm | Speech recognition system |
GB2353887B (en) * | 1999-09-04 | 2003-09-24 | Ibm | Speech recognition system |
US6629071B1 (en) | 1999-09-04 | 2003-09-30 | International Business Machines Corporation | Speech recognition system |
US6687673B2 (en) * | 1999-09-04 | 2004-02-03 | International Business Machines Corporation | Speech recognition system |
GB2362746A (en) * | 2000-05-23 | 2001-11-28 | Vocalis Ltd | Data recognition and retrieval |
US7167545B2 (en) | 2000-12-06 | 2007-01-23 | Varetis Solutions Gmbh | Method and device for automatically issuing information using a search engine |
US7343288B2 (en) | 2002-05-08 | 2008-03-11 | Sap Ag | Method and system for the processing and storing of voice information and corresponding timeline information |
US7406413B2 (en) | 2002-05-08 | 2008-07-29 | Sap Aktiengesellschaft | Method and system for the processing of voice data and for the recognition of a language |
EP1693829A1 (en) * | 2005-02-21 | 2006-08-23 | Harman Becker Automotive Systems GmbH | Voice-controlled data system |
US9153233B2 (en) | 2005-02-21 | 2015-10-06 | Harman Becker Automotive Systems Gmbh | Voice-controlled selection of media files utilizing phonetic data |
Also Published As
Publication number | Publication date |
---|---|
GB2304957B (en) | 1999-09-29 |
DE19532114C2 (en) | 2001-07-26 |
GB9618308D0 (en) | 1996-10-16 |
DE19532114A1 (en) | 1997-03-06 |
FR2738382B1 (en) | 1999-01-29 |
FR2738382A1 (en) | 1997-03-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1049072B1 (en) | Graphical user interface and method for modifying pronunciations in text-to-speech and speech recognition systems | |
US7529678B2 (en) | Using a spoken utterance for disambiguation of spelling inputs into a speech recognition system | |
US8285537B2 (en) | Recognition of proper nouns using native-language pronunciation | |
KR100453021B1 (en) | Oral Text Recognition Method and System | |
US6999931B2 (en) | Spoken dialog system using a best-fit language model and best-fit grammar | |
US5454062A (en) | Method for recognizing spoken words | |
US7974843B2 (en) | Operating method for an automated language recognizer intended for the speaker-independent language recognition of words in different languages and automated language recognizer | |
US6975986B2 (en) | Voice spelling in an audio-only interface | |
US6996528B2 (en) | Method for efficient, safe and reliable data entry by voice under adverse conditions | |
KR19990008459A (en) | Improved Reliability Word Recognition Method and Word Recognizer | |
US5995931A (en) | Method for modeling and recognizing speech including word liaisons | |
GB2304957A (en) | Voice-dialog system for automated output of information | |
EP1975923A1 (en) | Multilingual non-native speech recognition | |
US9286887B2 (en) | Concise dynamic grammars using N-best selection | |
US7406408B1 (en) | Method of recognizing phones in speech of any language | |
EP0949606B1 (en) | Method and system for speech recognition based on phonetic transcriptions | |
EP1213706B1 (en) | Method for online adaptation of pronunciation dictionaries | |
US7430503B1 (en) | Method of combining corpora to achieve consistency in phonetic labeling | |
JPH0743599B2 (en) | Computer system for voice recognition | |
EP0786132B1 (en) | A method and device for preparing and using diphones for multilingual text-to-speech generating | |
EP1397797B1 (en) | Speech recognition | |
JP2006018028A (en) | Voice interactive method, voice interactive device, voice interactive device, dialog program, voice interactive program, and recording medium | |
WO2000036591A1 (en) | Speech operated automatic inquiry system | |
JPH0361954B2 (en) | ||
JP2005534968A (en) | Deciding to read kanji |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PCNP | Patent ceased through non-payment of renewal fee |
Effective date: 20100902 |