US5452397A - Method and system for preventing entry of confusingly similar phases in a voice recognition system vocabulary list - Google Patents
- Publication number
- US5452397A
- Authority
- US
- United States
- Prior art keywords
- phrase
- received phrase
- phrases
- probability
- received
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
Definitions
- The present invention generally relates to voice recognition systems and, more particularly, to a method and system for preventing the entry of confusingly similar phrases in a vocabulary list of a speaker-dependent voice recognition system.
- Texas Instruments provides a TI System 1500 that permits voice recognition of commands for the purpose of performing numerous functions.
- An important feature of voice recognition systems such as the TI System 1500 is the ability to speed dial or access a party with only a voice command. For example, a person may issue a command to the voice recognition system and say "Call Bob Johnson." The voice recognition system will then access an associated preprogrammed database having a telephone, or other access, number associated with a certain Bob Johnson. The system will then immediately use or dial that number.
- The present invention, accordingly, provides a method and system that prevent the entry of confusingly similar phrases in a vocabulary list of a speaker-dependent voice recognition system and that overcome or reduce disadvantages and limitations associated with existing methods of building lists for voice recognition systems.
- One aspect of the present invention is a method for preventing the entry of confusingly similar phrases in a vocabulary list of a speaker-dependent voice recognition system that includes the steps of first receiving a phrase that is to be added or enrolled in the vocabulary list. The next step is to assign a first probability to all other phrases existing on the vocabulary list and a second, but lower, probability to the first-received phrase. The next step is to have the user repeat the phrase to be enrolled. The voice recognition system will then compare the repeated phrase to the entire vocabulary list that now includes the phrase as it was first-received. The next step is to indicate whether the repeated phrase matches a phrase on the vocabulary list other than the first-received phrase. Furthermore, the method includes the step of inhibiting the addition of the phrase in the event that the repeated phrase matches a phrase on the entire vocabulary list other than the first-received phrase.
- A technical advantage of the present invention is that it uses the same voice recognition algorithms that already exist in a speaker-dependent voice recognition system to distinguish between the confusingly similar phrases that a user may seek to add to the vocabulary list. This ensures that only minimal additional cost is needed to implement the present invention. At the same time, and in critical instances, the present invention may substantially improve operation of speaker-dependent voice recognition systems.
- Another technical advantage of the present invention is that by notifying the user of the confusingly similar phrase, the present invention permits the user to immediately modify its input to avoid adding the confusingly similar phrase to the vocabulary list.
- FIG. 1 conceptually illustrates an exemplary vocabulary list of a speaker-dependent voice recognition system.
- FIG. 2 provides a flow diagram of the steps and data flow of the enrollment process according to the preferred embodiment.
- FIG. 3 shows a flow diagram of the vocabulary list update process according to the preferred embodiment.
- FIG. 4 illustrates, by way of a flow diagram, the update grammar creation process according to the preferred embodiment.
- FIG. 5 is a conceptual illustration of a vocabulary list modified for the purposes of the present invention.
- FIG. 6 is a system according to one embodiment of the present invention.
- FIG. 1 conceptually illustrates an exemplary list 10 of phrases that form a vocabulary list of a speaker-dependent voice recognition system for use with the preferred embodiment.
- The vocabulary list 10 of FIG. 1 may, for example, be a repertory dialing list of names for a voice-activated speed dialing system.
- A voice recognition system that may use the preferred embodiment of the present invention is a Texas Instruments System 1500 operating with a voice recognition application system having the phrase HG. Although this system is largely dependent upon software and algorithms to perform the inventive concepts of the present invention, circuitry and components may similarly perform the functions of the present invention. The present invention, therefore, clearly contemplates the use of circuitry to perform these functions.
- Vocabulary list 10 of FIG. 1 includes phrases such as "Phrase 1" at the position indicated by reference numeral 12, "Phrase 2" at the position indicated by reference numeral 14, and continuing on down to "Phrase N" at the position indicated by reference numeral 16.
- One purpose of a speaker-dependent voice recognition system is to permit a user to add phrases to vocabulary list 10 and then call those phrases for command and control. For example, using the voice recognition system, a user may say "Call Phrase 1," at which point the system recognizes the voice command and calls the person or location associated with "Phrase 1."
- Vocabulary list 10 may end up with phrases that should not be on the vocabulary list. This causes confusion and may reduce the value or utility of vocabulary list 10.
- The preferred embodiment of the present invention provides a method and system to prevent confusingly similar phrases from existing on vocabulary list 10.
- FIG. 2 shows the process flow of enrollment process 20 of the preferred embodiment.
- The user or subscriber begins the enrollment process as indicated by block 22. This may be done by directing to the speaker-dependent voice recognition system a command such as "review list." The system will then, for example, prompt the subscriber with the command "say the speed-dial phrase" at step 24. The receiving circuitry of the voice recognition system will then receive the phrase to be added.
- The voice recognition system performs a process such as that known as Hidden Markov Modeling (HMM) enrollment, using energy-based end-pointing or another suitable technique for identifying the end points of the speech. This is performed at step 26.
- The next steps are to create an HMM model using the HMM technique and to add acoustic vectors to the subscriber template.
- The next step is to update vocabulary list 10.
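The energy-based end-pointing mentioned for step 26 can be illustrated with a minimal sketch. The text does not describe the actual end-pointing algorithm, so everything here is an assumption for illustration: the function names, the non-overlapping 80-sample frames, and the fixed energy threshold are all hypothetical choices.

```python
from typing import List, Optional, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def find_endpoints(
    samples: List[float],
    frame_len: int = 80,      # hypothetical frame size
    threshold: float = 0.01,  # hypothetical energy threshold
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the speech region, or None.

    A frame counts as speech when its energy exceeds `threshold`; the
    utterance spans from the first such frame to the end of the last one.
    """
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Silence, then a burst of signal, then silence again.
signal = [0.0] * 160 + [0.5] * 240 + [0.0] * 160
print(find_endpoints(signal))
```

A practical end-pointer would typically add smoothing and hangover time so that short pauses inside a phrase do not split it; this sketch only shows the basic thresholding idea.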
- FIG. 3 illustrates the update process 30 of the preferred embodiment.
- The first step is for the voice recognition system to indicate to the subscriber that it is updating vocabulary list 10.
- The voice recognition system then prompts the subscriber to repeat the phrase.
- The voice recognition system may, for example, use the command: "Say the speed-dial name again," as indicated at block 34.
- The voice recognition system performs the HMM algorithm to search for the best phrase based on the models loaded in a database associated with the voice recognition system at block 36. This is a comparison step that identifies whether the same or a confusingly similar phrase is on vocabulary list 10.
- The voice recognition system then returns the best model (i.e., phrase) and a score associated with that model.
- The method of the preferred embodiment queries whether the returned phrase is the same as the one that the subscriber seeks to enroll. If so, the voice recognition system communicates this information to the subscriber as a "success," as block 40 shows. The voice recognition system will then maintain the phrase on vocabulary list 10. If not, the system instructs the subscriber that the phrase he seeks to enroll is too similar to another phrase on vocabulary list 10, as block 42 shows. At this point, the method plays back to the subscriber the confusingly similar phrase that is on vocabulary list 10 and then asks for instructions. The preferred embodiment, therefore, rejects the phrase from the enrollment process.
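The update flow of FIG. 3 can be sketched as follows. This is not the HG software: the HMM search is replaced by a toy scorer (negative squared distance between small feature vectors), and the names `UpdateResult`, `score`, `best_match`, and `update_vocabulary` are hypothetical. Only the control flow follows the description: recognize the repeated phrase against the whole list, keep the new phrase if it wins, otherwise reject it and report the confusable phrase.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class UpdateResult:
    accepted: bool
    confusable_with: Optional[str]  # phrase to play back on rejection


def score(model: List[float], utterance: List[float]) -> float:
    """Toy stand-in for an HMM score: higher means a closer match."""
    return -sum((m - u) ** 2 for m, u in zip(model, utterance))


def best_match(models: Dict[str, List[float]],
               utterance: List[float]) -> Tuple[str, float]:
    """Return the best-scoring phrase and its score (blocks 36 and 38)."""
    name = max(models, key=lambda n: score(models[n], utterance))
    return name, score(models[name], utterance)


def update_vocabulary(vocab: Dict[str, List[float]], new_name: str,
                      new_model: List[float],
                      repeated: List[float]) -> UpdateResult:
    """Compare the repeated phrase with the list plus the new phrase."""
    candidates = dict(vocab)
    candidates[new_name] = new_model
    winner, _ = best_match(candidates, repeated)
    if winner == new_name:
        vocab[new_name] = new_model  # "success": keep the phrase
        return UpdateResult(True, None)
    return UpdateResult(False, winner)  # too similar: reject


vocab = {"bob": [1.0, 0.0], "alice": [0.0, 1.0]}
print(update_vocabulary(vocab, "carol", [5.0, 5.0], [5.1, 4.9]))
print(update_vocabulary(vocab, "rob", [1.0, 0.2], [1.0, 0.05]))
```

In the second call the repeated utterance lands closer to the existing "bob" model than to the new "rob" model, so the sketch rejects the enrollment and names "bob" as the phrase to play back.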
- A key process of the preferred embodiment is the creation of models that are acoustic models of all phrases on the template.
- FIG. 4 shows process 50 by which acoustic models are made for all phrases on vocabulary list 10.
- The voice recognition system updates the grammar creation.
- The system creates an HMM acoustic model for the phrase to be enrolled at block 54.
- The next step is to assign a probability, as indicated by the variable "PROB," to the model for the phrase currently being enrolled in vocabulary list 10, at block 56.
- All remaining phrases are assigned a probability of 1.
- The value for the probability parameter PROB is chosen to maximize the discrimination of similar phrases while minimizing the probability that the phrase output is indeed different from the one being enrolled.
- FIG. 5 conceptually shows a vocabulary list as modified by the process of the preferred embodiment.
- In the modified vocabulary list, alongside "PHRASE 1" at position 12, "PHRASE 2" at position 14, on through "PHRASE N" at position 16, "PHRASE i" has been enrolled.
- Column 62 indicates the associated probability for each of the models on vocabulary list 10. For example, "PHRASE 1," "PHRASE 2," and "PHRASE N" all have a 1.0 probability value.
- The probability variable PROB, having a value less than 1.0, is assigned to "PHRASE i" at position 60. This gives "PHRASE i" the probability necessary to support the discrimination that the preferred embodiment performs.
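The effect of the probabilities in column 62 can be sketched under one loud assumption: that each phrase probability is combined with the recognizer's acoustic score as a prior, added in the log domain. The text does not say how the HG software combines them, so this is an illustration only, and the function names and numeric values are hypothetical.

```python
import math
from typing import Dict, List


def enrollment_priors(existing: List[str], new_phrase: str,
                      prob: float) -> Dict[str, float]:
    """Existing phrases get 1.0; the phrase being enrolled gets PROB < 1.0."""
    if not 0.0 < prob < 1.0:
        raise ValueError("PROB must lie strictly between 0 and 1")
    priors = {name: 1.0 for name in existing}
    priors[new_phrase] = prob
    return priors


def biased_log_score(log_likelihood: float, prior: float) -> float:
    """Assumed combination: acoustic log-likelihood plus log prior."""
    return log_likelihood + math.log(prior)


def recognize(log_likelihoods: Dict[str, float],
              priors: Dict[str, float]) -> str:
    """Return the phrase with the highest biased score."""
    return max(log_likelihoods,
               key=lambda n: biased_log_score(log_likelihoods[n], priors[n]))


priors = enrollment_priors(["PHRASE 1", "PHRASE 2"], "PHRASE i", prob=0.5)

# Near tie: the penalty log(0.5) tips the result toward the existing phrase.
close = {"PHRASE 1": -10.5, "PHRASE 2": -40.0, "PHRASE i": -10.0}
print(recognize(close, priors))

# Clear separation: the new phrase still wins despite the penalty.
distinct = {"PHRASE 1": -20.0, "PHRASE 2": -40.0, "PHRASE i": -10.0}
print(recognize(distinct, priors))
```

Under this assumption, a smaller PROB makes the check stricter (more near-ties resolve in favor of existing phrases), which mirrors the trade-off the text describes for choosing PROB.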
- The first step is to do an update in the learning process.
- This update begins with an enrollment by which a model of the added phrase is created.
- The next step is to have the user repeat the phrase and attempt to superimpose the first phrase on the second one or update it to make a better model.
- What is done in an attempt to make a better model is to assign an equal probability to all phrases or, similarly, to update the single model irrespective of what other phrases are on the list.
- The preferred embodiment, instead, evaluates the repeated phrase and, in the update, assigns probabilities to all phrases so that the system favors all other phrases over the phrase that the user seeks to add to the vocabulary list.
- The preferred embodiment tests the phrase to be enrolled against all other phrases that are presently on the vocabulary list. This is done by artificially lowering the probability that the system will recognize the second spoken phrase as the first spoken phrase. This entire comparison process is performed by the voice recognition system software, such as the HG system software used in the Texas Instruments System 1500 voice recognition system.
- The alternative embodiment includes the steps of executing the recognition algorithm on the enrollment data using all phrase voice models except the one to be newly added to the system. In this situation, the recognizer will typically find the best match among the remaining models. The alternative method then employs a decision rule based on the score of the false match and the score obtained in the correct model to determine whether to accept the new phrase in vocabulary list 10. The alternative embodiment then adds the new phrase if the difference between the false match score and the correct match score is below a predetermined threshold. The voice recognition system will then inform the user that the phrase was or was not acceptable to add.
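The decision rule of the alternative embodiment can be sketched as below. The text does not state the score convention, so this sketch assumes log-likelihood-style scores where higher is better; under that assumption the difference (false match score minus correct match score) is strongly negative for a distinctive phrase, and the threshold is a negative margin. The function names and the threshold value are hypothetical.

```python
from typing import Dict, Optional, Tuple


def accept_new_phrase(false_match_score: float, correct_match_score: float,
                      threshold: float) -> bool:
    """Accept when (false match - correct match) falls below the threshold.

    Assumes higher scores mean better matches, so a well-separated phrase
    produces a large negative difference.
    """
    return (false_match_score - correct_match_score) < threshold


def evaluate_enrollment(scores_without_new: Dict[str, float],
                        correct_score: float,
                        threshold: float) -> Tuple[bool, Optional[str]]:
    """Find the best false match among existing models, then apply the rule."""
    if not scores_without_new:
        return True, None  # nothing on the list to be confused with
    closest = max(scores_without_new, key=scores_without_new.get)
    accepted = accept_new_phrase(scores_without_new[closest],
                                 correct_score, threshold)
    return accepted, closest


# Distinctive phrase: the best false match is far below the correct score.
print(evaluate_enrollment({"bob": -50.0, "alice": -80.0}, -10.0, -5.0))

# Confusable phrase: the false match is almost as good as the correct one.
print(evaluate_enrollment({"bob": -12.0}, -10.0, -5.0))
```

If the recognizer instead reported distance-like scores where lower is better, the sign of the comparison would need to be reconsidered; the sketch makes no claim about which convention the original system used.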
- The speaker-dependent voice recognition system, for example, is the TI System 1500, which, when programmed with the HG software according to the above description and the above-referenced flow charts, becomes a system with receiving circuitry for first receiving a first-received phrase for adding to a plurality of other phrases on a vocabulary list (FIG.
- The system provides maintaining circuitry for maintaining the first-received phrase on the list in the event that the comparison circuitry matches the second-received phrase to the first-received phrase.
- Circuitry assigning probabilities is such that a probability maximizes the discrimination of similarities between the second-received phrase and each of the plurality of other phrases and minimizes the likelihood the voice recognition system will match the second-received phrase with a different phrase from the first-received phrase.
- The circuitry where the second probability is unitary.
- The communication circuitry includes circuitry playing a message stating that the first-received phrase is too similar to at least one of the plurality of other phrases.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Telephonic Communication Services (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US07/989,285 US5452397A (en) | 1992-12-11 | 1992-12-11 | Method and system for preventing entry of confusingly similar phases in a voice recognition system vocabulary list |
MYPI93002522A MY115138A (en) | 1992-12-11 | 1993-11-30 | Method and system preventing entry of confusingly similar phrases in a voice recognition system vocabulary list |
JP31064093A JP3388845B2 (en) | 1992-12-11 | 1993-12-10 | Method and apparatus for preventing the input of confusingly similar words |
DE69317229T DE69317229T2 (en) | 1992-12-11 | 1993-12-10 | Method and system for preventing the entry of confusingly similar sentences in a word list of a speech recognition system |
EP93309975A EP0601876B1 (en) | 1992-12-11 | 1993-12-10 | Method and system for preventing entry of confusingly similar phrases in a voice recognition system vocabulary list |
KR1019930027148A KR100283736B1 (en) | 1992-12-11 | 1993-12-10 | Method and system for preventing confusion of similar phrases into lexicon of speech recognition system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US07/989,285 US5452397A (en) | 1992-12-11 | 1992-12-11 | Method and system for preventing entry of confusingly similar phases in a voice recognition system vocabulary list |
Publications (1)
Publication Number | Publication Date |
---|---|
US5452397A (en) | 1995-09-19 |
Family
ID=25534958
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US07/989,285 Expired - Lifetime US5452397A (en) | 1992-12-11 | 1992-12-11 | Method and system for preventing entry of confusingly similar phases in a voice recognition system vocabulary list |
Country Status (6)
Country | Link |
---|---|
US (1) | US5452397A (en) |
EP (1) | EP0601876B1 (en) |
JP (1) | JP3388845B2 (en) |
KR (1) | KR100283736B1 (en) |
DE (1) | DE69317229T2 (en) |
MY (1) | MY115138A (en) |
-
1992
- 1992-12-11 US US07/989,285 patent/US5452397A/en not_active Expired - Lifetime
-
1993
- 1993-11-30 MY MYPI93002522A patent/MY115138A/en unknown
- 1993-12-10 JP JP31064093A patent/JP3388845B2/en not_active Expired - Lifetime
- 1993-12-10 DE DE69317229T patent/DE69317229T2/en not_active Expired - Lifetime
- 1993-12-10 KR KR1019930027148A patent/KR100283736B1/en not_active IP Right Cessation
- 1993-12-10 EP EP93309975A patent/EP0601876B1/en not_active Expired - Lifetime
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
USRE33597E (en) * | 1982-10-15 | 1991-05-28 | Hidden Markov model speech recognition arrangement | |
US5218668A (en) * | 1984-09-28 | 1993-06-08 | Itt Corporation | Keyword recognition system and method using template concantenation model |
US4783804A (en) * | 1985-03-21 | 1988-11-08 | American Telephone And Telegraph Company, At&T Bell Laboratories | Hidden Markov model speech recognition arrangement |
US5142585A (en) * | 1986-02-15 | 1992-08-25 | Smiths Industries Public Limited Company | Speech processing apparatus and methods |
EP0241163A1 (en) * | 1986-03-25 | 1987-10-14 | AT&T Corp. | Speaker-trained speech recognizer |
US5129002A (en) * | 1987-12-16 | 1992-07-07 | Matsushita Electric Industrial Co., Ltd. | Pattern recognition apparatus |
US5027406A (en) * | 1988-12-06 | 1991-06-25 | Dragon Systems, Inc. | Method for interactive speech recognition and training |
US5054074A (en) * | 1989-03-02 | 1991-10-01 | International Business Machines Corporation | Optimized speech recognition system and method |
US5033087A (en) * | 1989-03-14 | 1991-07-16 | International Business Machines Corp. | Method and apparatus for the automatic determination of phonological rules as for a continuous speech recognition system |
US4977598A (en) * | 1989-04-13 | 1990-12-11 | Texas Instruments Incorporated | Efficient pruning algorithm for hidden markov model speech recognition |
US5220639A (en) * | 1989-12-01 | 1993-06-15 | National Science Council | Mandarin speech input method for Chinese computers and a mandarin speech recognition machine |
US5271088A (en) * | 1991-05-13 | 1993-12-14 | Itt Corporation | Automated sorting of voice messages through speaker spotting |
US5212730A (en) * | 1991-07-01 | 1993-05-18 | Texas Instruments Incorporated | Voice recognition of proper names using text-derived recognition models |
Non-Patent Citations (10)
Title |
---|
Bendelac et al., "Eyes free dialing for cellular telephones"; 17th Convention of Electrical and Electronics Engineers in Israel, pp. 234-237, 5-7 Mar. 1991. * |
Fissore et al., "HMM modeling for speaker independent voice dialing in car environment"; ICASSP-92, pp. 249-252, vol. 1, 23-26 Mar. 1992. * |
L. R. Rabiner, et al., A Voice Controlled, Repertory-Dialer System, The Bell System Technical Journal, vol. 59, No. 7, Sep. 1980, USA, pp. 1153-1163. * |
S. S. Awad, et al., A Voice Controlled Telephone Dialer, IEEE Transactions on Instrumentation & Measurement, vol. 38, No. 1, Feb. 1989, New York, pp. 119-125. * |
Wheatley et al., "Robust automatic time alignment of orthographic transcriptions with unconstrained speech"; ICASSP '92, pp. 533-536, vol. 1, 1992. * |
Cited By (88)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5649057A (en) * | 1989-05-17 | 1997-07-15 | Lucent Technologies Inc. | Speech recognition employing key word modeling and non-key word modeling |
US5509104A (en) * | 1989-05-17 | 1996-04-16 | At&T Corp. | Speech recognition employing key word modeling and non-key word modeling |
US5717738A (en) * | 1993-01-11 | 1998-02-10 | Texas Instruments Incorporated | Method and device for generating user defined spoken speed dial directories |
US5737724A (en) * | 1993-11-24 | 1998-04-07 | Lucent Technologies Inc. | Speech recognition employing a permissive recognition criterion for a repeated phrase utterance |
US5737723A (en) * | 1994-08-29 | 1998-04-07 | Lucent Technologies Inc. | Confusable word detection in speech recognition |
US5758322A (en) * | 1994-12-09 | 1998-05-26 | International Voice Register, Inc. | Method and apparatus for conducting point-of-sale transactions using voice recognition |
US5754977A (en) * | 1996-03-06 | 1998-05-19 | Intervoice Limited Partnership | System and method for preventing enrollment of confusable patterns in a reference database |
US5915238A (en) * | 1996-07-16 | 1999-06-22 | Tjaden; Gary S. | Personalized audio information delivery system |
US6122617A (en) * | 1996-07-16 | 2000-09-19 | Tjaden; Gary S. | Personalized audio information delivery system |
US5752230A (en) * | 1996-08-20 | 1998-05-12 | Ncr Corporation | Method and apparatus for identifying names with a speech recognition program |
US5905789A (en) * | 1996-10-07 | 1999-05-18 | Northern Telecom Limited | Call-forwarding system using adaptive model of user behavior |
US5917891A (en) * | 1996-10-07 | 1999-06-29 | Northern Telecom, Limited | Voice-dialing system using adaptive model of calling behavior |
US6167117A (en) * | 1996-10-07 | 2000-12-26 | Nortel Networks Limited | Voice-dialing system using model of calling behavior |
US5912949A (en) * | 1996-11-05 | 1999-06-15 | Northern Telecom Limited | Voice-dialing system using both spoken names and initials in recognition |
US5915001A (en) * | 1996-11-14 | 1999-06-22 | Vois Corporation | System and method for providing and using universally accessible voice and speech data files |
US6885736B2 (en) | 1996-11-14 | 2005-04-26 | Nuance Communications | System and method for providing and using universally accessible voice and speech data files |
US6400806B1 (en) | 1996-11-14 | 2002-06-04 | Vois Corporation | System and method for providing and using universally accessible voice and speech data files |
US6208713B1 (en) | 1996-12-05 | 2001-03-27 | Nortel Networks Limited | Method and apparatus for locating a desired record in a plurality of records in an input recognizing telephone directory |
US6137863A (en) * | 1996-12-13 | 2000-10-24 | At&T Corp. | Statistical database correction of alphanumeric account numbers for speech recognition and touch-tone recognition |
US6061654A (en) * | 1996-12-16 | 2000-05-09 | At&T Corp. | System and method of recognizing letters and numbers by either speech or touch tone recognition utilizing constrained confusion matrices |
US6005927A (en) * | 1996-12-16 | 1999-12-21 | Northern Telecom Limited | Telephone directory apparatus and method |
US20080071538A1 (en) * | 1997-05-27 | 2008-03-20 | Bossemeyer Robert Wesley Jr | Speaker verification method |
US6012027A (en) * | 1997-05-27 | 2000-01-04 | Ameritech Corporation | Criteria for usable repetitions of an utterance during speech reference enrollment |
US6219453B1 (en) | 1997-08-11 | 2001-04-17 | At&T Corp. | Method and apparatus for performing an automatic correction of misrecognized words produced by an optical character recognition technique by using a Hidden Markov Model based algorithm |
US6154579A (en) * | 1997-08-11 | 2000-11-28 | At&T Corp. | Confusion matrix based method and system for correcting misrecognized words appearing in documents generated by an optical character recognition technique |
US6141661A (en) * | 1997-10-17 | 2000-10-31 | At&T Corp | Method and apparatus for performing a grammar-pruning operation |
US6243677B1 (en) * | 1997-11-19 | 2001-06-05 | Texas Instruments Incorporated | Method of out of vocabulary word rejection |
US6208965B1 (en) | 1997-11-20 | 2001-03-27 | At&T Corp. | Method and apparatus for performing a name acquisition based on speech recognition |
US6205428B1 (en) | 1997-11-20 | 2001-03-20 | At&T Corp. | Confusion set-base method and apparatus for pruning a predetermined arrangement of indexed identifiers |
US6122612A (en) * | 1997-11-20 | 2000-09-19 | At&T Corp | Check-sum based method and apparatus for performing speech recognition |
US5987411A (en) * | 1997-12-17 | 1999-11-16 | Northern Telecom Limited | Recognition system for determining whether speech is confusing or inconsistent |
US6223158B1 (en) | 1998-02-04 | 2001-04-24 | At&T Corporation | Statistical option generator for alpha-numeric pre-database speech recognition correction |
US6205261B1 (en) | 1998-02-05 | 2001-03-20 | At&T Corp. | Confusion set based method and system for correcting misrecognized words appearing in documents generated by an optical character recognition technique |
US6400805B1 (en) | 1998-06-15 | 2002-06-04 | At&T Corp. | Statistical database correction of alphanumeric identifiers for speech recognition and touch-tone recognition |
US7630899B1 (en) | 1998-06-15 | 2009-12-08 | At&T Intellectual Property Ii, L.P. | Concise dynamic grammars using N-best selection |
US20110202343A1 (en) * | 1998-06-15 | 2011-08-18 | At&T Intellectual Property I, L.P. | Concise dynamic grammars using n-best selection |
US8682665B2 (en) | 1998-06-15 | 2014-03-25 | At&T Intellectual Property Ii, L.P. | Concise dynamic grammars using N-best selection |
US9286887B2 (en) | 1998-06-15 | 2016-03-15 | At&T Intellectual Property Ii, L.P. | Concise dynamic grammars using N-best selection |
US7937260B1 (en) | 1998-06-15 | 2011-05-03 | At&T Intellectual Property Ii, L.P. | Concise dynamic grammars using N-best selection |
US6233560B1 (en) | 1998-12-16 | 2001-05-15 | International Business Machines Corporation | Method and apparatus for presenting proximal feedback in voice command systems |
US8831956B2 (en) | 1998-12-17 | 2014-09-09 | Nuance Communications, Inc. | Speech command input recognition system for interactive computer display with interpretation of ancillary relevant speech query terms into commands |
US8275617B1 (en) | 1998-12-17 | 2012-09-25 | Nuance Communications, Inc. | Speech command input recognition system for interactive computer display with interpretation of ancillary relevant speech query terms into commands |
US7266498B1 (en) * | 1998-12-18 | 2007-09-04 | Intel Corporation | Method and apparatus for reducing conflicts between speech-enabled applications sharing speech menu |
US8442812B2 (en) | 1999-05-28 | 2013-05-14 | Fluential, Llc | Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface |
US8374871B2 (en) * | 1999-05-28 | 2013-02-12 | Fluential, Llc | Methods for creating a phrase thesaurus |
US10552533B2 (en) | 1999-05-28 | 2020-02-04 | Nant Holdings Ip, Llc | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces |
US8630846B2 (en) | 1999-05-28 | 2014-01-14 | Fluential, Llc | Phrase-based dialogue modeling with particular application to creating a recognition grammar |
US9251138B2 (en) | 1999-05-28 | 2016-02-02 | Nant Holdings Ip, Llc | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces |
US20020128821A1 (en) * | 1999-05-28 | 2002-09-12 | Farzad Ehsani | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces |
US20040199375A1 (en) * | 1999-05-28 | 2004-10-07 | Farzad Ehsani | Phrase-based dialogue modeling with particular application to creating a recognition grammar for a voice-controlled user interface |
US8650026B2 (en) | 1999-05-28 | 2014-02-11 | Fluential, Llc | Methods for creating a phrase thesaurus |
US7653545B1 (en) | 1999-06-11 | 2010-01-26 | Telstra Corporation Limited | Method of developing an interactive system |
US20010003173A1 (en) * | 1999-12-07 | 2001-06-07 | Lg Electronics Inc. | Method for increasing recognition rate in voice recognition system |
US7630895B2 (en) | 2000-01-21 | 2009-12-08 | At&T Intellectual Property I, L.P. | Speaker verification method |
US20050143996A1 (en) * | 2000-01-21 | 2005-06-30 | Bossemeyer Robert W.Jr. | Speaker verification method |
US6535850B1 (en) | 2000-03-09 | 2003-03-18 | Conexant Systems, Inc. | Smart training and smart scoring in SD speech recognition system with user defined vocabulary |
US8200485B1 (en) | 2000-08-29 | 2012-06-12 | A9.Com, Inc. | Voice interface and methods for improving recognition accuracy of voice search queries |
US20020069059A1 (en) * | 2000-12-04 | 2002-06-06 | Kenneth Smith | Grammar generation for voice-based searches |
US6973429B2 (en) | 2000-12-04 | 2005-12-06 | A9.Com, Inc. | Grammar generation for voice-based searches |
US20030069729A1 (en) * | 2001-10-05 | 2003-04-10 | Bickley Corine A | Method of assessing degree of acoustic confusability, and system therefor |
US7013276B2 (en) | 2001-10-05 | 2006-03-14 | Comverse, Inc. | Method of assessing degree of acoustic confusability, and system therefor |
US7299187B2 (en) * | 2002-02-13 | 2007-11-20 | International Business Machines Corporation | Voice command processing system and computer therefor, and voice command processing method |
US20030154077A1 (en) * | 2002-02-13 | 2003-08-14 | International Business Machines Corporation | Voice command processing system and computer therefor, and voice command processing method |
US20030163312A1 (en) * | 2002-02-26 | 2003-08-28 | Canon Kabushiki Kaisha | Speech processing apparatus and method |
US20060025997A1 (en) * | 2002-07-24 | 2006-02-02 | Law Eng B | System and process for developing a voice application |
US7712031B2 (en) | 2002-07-24 | 2010-05-04 | Telstra Corporation Limited | System and process for developing a voice application |
US20060203980A1 (en) * | 2002-09-06 | 2006-09-14 | Telstra Corporation Limited | Development system for a dialog system |
US8046227B2 (en) * | 2002-09-06 | 2011-10-25 | Telstra Corporation Limited | Development system for a dialog system |
AU2004211007B2 (en) * | 2003-02-11 | 2010-08-19 | Telstra Corporation Limited | System for predicting speech recognition accuracy and development for a dialog system |
US20060190252A1 (en) * | 2003-02-11 | 2006-08-24 | Bradford Starkie | System for predicting speech recognition accuracy and development for a dialog system |
US7917363B2 (en) * | 2003-02-11 | 2011-03-29 | Telstra Corporation Limited | System for predicting speech recognition accuracy and development for a dialog system |
US7729913B1 (en) | 2003-03-18 | 2010-06-01 | A9.Com, Inc. | Generation and selection of voice recognition grammars for conducting database searches |
US7840405B1 (en) | 2003-03-18 | 2010-11-23 | A9.Com, Inc. | Generation of speech recognition grammars for conducting searches |
US20110071827A1 (en) * | 2003-03-18 | 2011-03-24 | Lee Nicholas J | Generation and selection of speech recognition grammars for conducting searches |
US20080126078A1 (en) * | 2003-04-29 | 2008-05-29 | Telstra Corporation Limited | A System and Process For Grammatical Interference |
US8296129B2 (en) | 2003-04-29 | 2012-10-23 | Telstra Corporation Limited | System and process for grammatical inference |
US20050177376A1 (en) * | 2004-02-05 | 2005-08-11 | Avaya Technology Corp. | Recognition results postprocessor for use in voice recognition systems |
US7899671B2 (en) * | 2004-02-05 | 2011-03-01 | Avaya, Inc. | Recognition results postprocessor for use in voice recognition systems |
US20080208578A1 (en) * | 2004-09-23 | 2008-08-28 | Koninklijke Philips Electronics, N.V. | Robust Speaker-Dependent Speech Recognition System |
US20070005206A1 (en) * | 2005-07-01 | 2007-01-04 | You Zhang | Automobile interface |
US7826945B2 (en) * | 2005-07-01 | 2010-11-02 | You Zhang | Automobile speech-recognition interface |
US11153472B2 (en) | 2005-10-17 | 2021-10-19 | Cutting Edge Vision, LLC | Automatic upload of pictures from a camera |
US11818458B2 (en) | 2005-10-17 | 2023-11-14 | Cutting Edge Vision, LLC | Camera touchpad |
US20110137638A1 (en) * | 2009-12-04 | 2011-06-09 | Gm Global Technology Operations, Inc. | Robust speech recognition based on spelling with phonetic letter families |
US8195456B2 (en) | 2009-12-04 | 2012-06-05 | GM Global Technology Operations LLC | Robust speech recognition based on spelling with phonetic letter families |
US20190079919A1 (en) * | 2016-06-21 | 2019-03-14 | Nec Corporation | Work support system, management server, portable terminal, work support method, and program |
US20190147853A1 (en) * | 2017-11-15 | 2019-05-16 | International Business Machines Corporation | Quantized dialog language model for dialog systems |
US10832658B2 (en) * | 2017-11-15 | 2020-11-10 | International Business Machines Corporation | Quantized dialog language model for dialog systems |
Also Published As
Publication number | Publication date |
---|---|
EP0601876B1 (en) | 1998-03-04 |
MY115138A (en) | 2003-04-30 |
EP0601876A1 (en) | 1994-06-15 |
KR940015969A (en) | 1994-07-22 |
DE69317229T2 (en) | 1998-06-25 |
DE69317229D1 (en) | 1998-04-09 |
JPH06282291A (en) | 1994-10-07 |
KR100283736B1 (en) | 2001-03-02 |
JP3388845B2 (en) | 2003-03-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5452397A (en) | | Method and system for preventing entry of confusingly similar phases in a voice recognition system vocabulary list |
US5832063A (en) | | Methods and apparatus for performing speaker independent recognition of commands in parallel with speaker dependent recognition of names, words or phrases |
US5983177A (en) | | Method and apparatus for obtaining transcriptions from multiple training utterances |
US6823307B1 (en) | | Language model based on the speech recognition history |
US6766295B1 (en) | | Adaptation of a speech recognition system across multiple remote sessions with a speaker |
US6925154B2 (en) | | Methods and apparatus for conversational name dialing systems |
KR100383352B1 (en) | | Voice-operated service |
US5717738A (en) | | Method and device for generating user defined spoken speed dial directories |
US5857169A (en) | | Method and system for pattern recognition based on tree organized probability densities |
US5895448A (en) | | Methods and apparatus for generating and using speaker independent garbage models for speaker dependent speech recognition purpose |
US6356868B1 (en) | | Voiceprint identification system |
US7974843B2 (en) | | Operating method for an automated language recognizer intended for the speaker-independent language recognition of words in different languages and automated language recognizer |
US5937383A (en) | | Apparatus and methods for speech recognition including individual or speaker class dependent decoding history caches for fast word acceptance or rejection |
EP0621532A1 (en) | | Password verification system |
WO2001046945A1 (en) | | Learning of dialogue states and language model of spoken information system |
US5832429A (en) | | Method and system for enrolling addresses in a speech recognition database |
WO2006101673A1 (en) | | Voice nametag audio feedback for dialing a telephone call |
JPH05181494A (en) | | Apparatus and method for identifying audio pattern |
JP2007124686A (en) | | Method and system for enrolling address in speech recognition database |
US7401023B1 (en) | | Systems and methods for providing automated directory assistance using transcripts |
US5995926A (en) | | Technique for effectively recognizing sequence of digits in voice dialing |
JP3945187B2 (en) | | Dialog management device |
JP3790038B2 (en) | | Subword type speakerless speech recognition device |
EP1160767B1 (en) | | Speech recognition with contextual hypothesis probabilities |
JP2000148178A (en) | | Speech recognision system using composite grammar network |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: TEXAS INSTRUMENTS INCORPORATED, TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST.;ASSIGNORS:ITTYCHERIAH, ABRAHAM P.;WHEATLEY, BARBARA J.;REEL/FRAME:006354/0574 Effective date: 19921211 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FPAY | Fee payment | Year of fee payment: 4 |
FPAY | Fee payment | Year of fee payment: 8 |
FPAY | Fee payment | Year of fee payment: 12 |
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
AS | Assignment | Owner name: INTEL CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TEXAS INSTRUMENTS INCORPORATED;REEL/FRAME:041383/0040 Effective date: 20161223 |