JPH03157696A

JPH03157696A - Voice responding and recognizing system

Info

Publication number: JPH03157696A
Application number: JP1297832A
Authority: JP
Inventors: Junji Kojima; 小島　順治
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1989-11-16
Filing date: 1989-11-16
Publication date: 1991-07-05

Abstract

PURPOSE: To improve recognition performance, and to minimize any deterioration in it, by identifying the speaker from the caller's telephone number, using a specific-speaker standard pattern only for the callers who need one, and changing to the unspecified-speaker standard pattern when the recognition result is unsatisfactory. CONSTITUTION: When the caller's telephone number is received via the switching network by the caller number receiving section 5 of the network control section 1, the network control section 1 identifies the caller as a specific speaker. The voice recognizing section 2 recognizes the caller's speech using its stored standard pattern for that specific speaker and reports the result to the main control section 4, which carries out the voice response, such as the required guidance; recognition performance is thereby improved. When the recognition result is unsatisfactory, the voice recognizing section 2 changes to the standard pattern for unspecified speakers and performs recognition with it, so that deterioration in recognition performance is minimized.

Description

[Detailed Description of the Invention]

[Industrial Application Field]

The present invention relates to a recognition method for a voice response recognition device that is connected to a network such as an ISDN network and provides services, and in particular to a voice response recognition method that sends out voice guidance, recognizes the speech the user utters in response, and provides a service based on the recognition result.

[Prior Art]

Replacing voice-based information services that used to be provided by human operators with machine voice response yields substantial savings in labor and cost and improved efficiency, so demand has been growing in recent years. Voice response recognition services, which recognize speech and respond according to its content, are widely used for automated telephone number notification services, seat reservation services, stock price notification services of securities companies, bank balance inquiry services, and the like.

Speech recognition methods can be classified, by how they cope with differences between individuals, into two types: specific-speaker recognition and unspecified-speaker recognition. In specific-speaker recognition, the standard patterns must be re-registered every time the speaker changes. Training speech is therefore needed before recognition can begin, and each speaker has to record it in advance. In unspecified-speaker recognition, by contrast, recognition can begin immediately for any speaker. Unspecified-speaker recognition therefore requires more advanced recognition techniques than specific-speaker recognition, because it must allow for the variation in speech between speakers. Voice response recognition devices using unspecified-speaker recognition technology have already been commercialized and are used for services such as bank balance inquiries and transfer notifications, and stock price inquiries at securities companies.

In the first method of word speech recognition, the time series of short-time spectra of the spoken word is itself registered as the standard pattern, and the similarity is computed directly by matching the time series of short-time spectra of the input speech against the standard pattern. In the second method, the short-time spectra of phonemes are registered as standard patterns, and each word is registered in the word dictionary as a sequence of phoneme symbols. During recognition, the time series of short-time spectra of the input speech is first recognized using the phoneme standard patterns, and the word similarity is then computed between the phoneme recognition result and the phoneme symbol sequences in the word dictionary.
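
The first method compares two time series of spectral frames directly. The text does not name a matching algorithm; dynamic time warping (DTW) is one common way to do this, and the following is a minimal sketch under that assumption. The function names and the Euclidean frame distance are illustrative choices, not taken from the patent.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two spectral frames (lists of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq, template):
    """Accumulated DTW distance between two frame sequences.

    A smaller value means the input is more similar to the standard pattern.
    """
    n, m = len(seq), len(template)
    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning seq[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # input frame repeated
                                 cost[i][j - 1],      # template frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Because the alignment may repeat frames, an utterance spoken more slowly than the registered pattern can still match it closely.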

Speech recognition methods are described in, for example, "Electronics, Information and Communication Handbook, Vol. 1" (電子情報通信ハンドブック第１分冊), published by Ohmsha, Ltd., March 30, 1988, pp. 1191-1207.

[Problem to be solved by the invention]

Unspecified-speaker recognition performs very well for some speakers, but its performance degrades markedly for certain particular speakers, so there are cases, such as the so-called sheep-and-goats phenomenon, that cannot be evaluated by average recognition performance alone. Resolving this degradation for particular speakers is therefore a challenge for systems that use speech recognition.

However, the devices put into practical use so far employ unspecified-speaker speech recognition, and a standard pattern cannot be added individually even when recognition performance is poor for a particular speaker.

Preparing an individual standard pattern for every user is also conceivable, but the memory required would be enormous, which makes this infeasible in a large-scale system.

Installing a voice response recognition device equipped with standard patterns for both unspecified and specific speakers is also conceivable. With the current analog network, however, even if specific-speaker standard patterns were prepared, the speaker would still have to be identified by some means, which made this difficult to realize. One conceivable identification method is to provide a fixed input opportunity after the call arrives at the voice response recognition device and have the caller enter a speaker identification signal (for example, a subscriber number), but this requires input means (a signal transmitting/receiving device or the like) and also complicates the sequence.

Standard patterns for unspecified speakers are normally designed so that recognition performance is high on average across many people, so they pose no particular problem in general; improved recognition performance is wanted only for certain particular speakers.

An object of the present invention is to solve these conventional problems and to provide a voice response recognition method that can improve recognition performance even for those particular speakers, among unspecified speakers, for whom recognition performance degrades markedly.

[Means for Solving the Problems]

To achieve the above object, the voice response recognition method of the present invention is characterized in that a standard pattern for unspecified speakers and standard patterns for a plurality of specific speakers are registered in the voice response recognition device; when the caller's telephone number is not notified, the standard pattern for unspecified speakers is selected for recognition processing; when the caller's telephone number is notified, either the standard pattern for unspecified speakers or the specific-speaker standard pattern corresponding to that telephone number is selected for recognition processing; and when recognition with the specific-speaker standard pattern fails to meet a predetermined performance, the pattern is automatically changed to the unspecified-speaker pattern, recognition is performed again, and the subsequent service is continued.

[Operation]

In the present invention, the caller's telephone number notified from the network is used as one way of identifying the other party, and recognition performance is improved by using a specific-speaker standard pattern for the user so identified. That is, the other party is first identified by receiving the caller's telephone number notified from the network, and it is judged whether the identified user needs a specific-speaker standard pattern. If so, the telephone number-to-standard pattern correspondence table is searched, the specific-speaker pattern is read out, and recognition is performed with it. If the recognition performance is then good, recognition continues with the specific-speaker pattern; if it is poor, recognition is performed with the unspecified-speaker pattern.

[Embodiments]

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 2 is a block diagram of a device to which the voice response recognition method of the present invention is applied.

The voice response recognition device comprises a network control unit 1 that controls a network such as ISDN, a voice recognition unit 2 that recognizes the speaker's voice, a voice response unit 3 that responds to incoming calls by voice, and a main control unit 4 that controls these units.

The network control unit 1 has a call origination/reception function for making and receiving calls and a caller number receiving unit 5 for receiving the caller's telephone number, and exchanges the necessary information with the main control unit 4 over a communication line. When the voice recognition unit 2 recognizes the user's speech, it reports the result to the main control unit 4. The voice response unit 3 responds to the user with voice guidance, outputting various guidance messages as instructed by the main control unit 4. Besides controlling each unit, the main control unit 4 performs application processing such as managing the progress of the service and searching stored data.

FIG. 3 is a diagram showing an example of the configuration of the voice recognition unit in FIG. 2.

In FIG. 3, 6 is a voice analysis unit that analyzes the input speech; 7 is a standard pattern memory that stores the standard patterns; 8 is a matching operation unit that compares the input speech against the standard patterns; 9 is a result determination unit that judges from the comparison whether there is a match and checks the dictionary to determine which word was spoken; and 10 is a dictionary memory that stores the word dictionary.

The voice analysis unit 6 first extracts feature parameters from the input speech. The standard pattern memory 7 holds standard speech stored in advance in the form of feature parameters. The matching operation unit 8 matches the input speech against the standard patterns, calculates the distance between the input data and each standard pattern, and outputs the result to the result determination unit 9. Based on the distance calculation, the result determination unit 9 checks the result against the contents of the dictionary memory 10, determines which word was spoken, and notifies the main control unit 4.
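
The division of work between the matching operation unit 8 and the result determination unit 9 can be sketched as below. This is an illustration only: the two-element feature vectors, the Euclidean distance, the rejection threshold, and all names are assumptions, since the text does not specify the features or the distance measure.

```python
import math

# Hypothetical contents of the standard pattern memory (7): pattern id -> features.
STANDARD_PATTERNS = {0: [0.1, 0.9], 1: [0.8, 0.2]}
# Hypothetical contents of the dictionary memory (10): pattern id -> word.
DICTIONARY = {0: "yes", 1: "no"}

def match(features, patterns):
    """Matching operation unit (8): distance from the input to every standard pattern."""
    return {pid: math.dist(features, ref) for pid, ref in patterns.items()}

def decide(distances, dictionary, reject_above=0.5):
    """Result determination unit (9): word of the closest pattern,
    or None when no pattern is close enough to count as a match."""
    best = min(distances, key=distances.get)
    if distances[best] > reject_above:
        return None
    return dictionary[best]
```

For example, `decide(match([0.2, 0.8], STANDARD_PATTERNS), DICTIONARY)` selects the word whose pattern lies nearest to the input features.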

FIG. 4 is a diagram showing an example of the structure of the standard patterns stored in the standard pattern memory of FIG. 3.

Standard patterns can be created in various ways. Here it is assumed that standard patterns for specific speakers and for unspecified speakers have already been created.

As shown in (a), two unspecified-speaker standard patterns are prepared, one for men and one for women. Because it cannot be known at first whether the user is male or female, both are used for recognition. As shown in (b), specific-speaker standard patterns are prepared for as many speakers as needed, n in total.
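
Using both unspecified-speaker sets at once can be sketched as follows. The text only says that both are used; taking the single closest pattern across the two sets is an assumption, as are the function name and the Euclidean distance.

```python
import math

def recognize_with_both(features, male_patterns, female_patterns):
    """Compare the input with every word in both unspecified-speaker sets.

    Each pattern set maps word -> feature vector.
    Returns (distance, word, set_name) for the closest pattern.
    """
    candidates = []
    for set_name, patterns in (("male", male_patterns), ("female", female_patterns)):
        for word, reference in patterns.items():
            candidates.append((math.dist(features, reference), word, set_name))
    return min(candidates)  # the smallest distance wins
```

The returned set name also indicates which of the two sets fits the caller better, which a system could use to narrow later comparisons.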

FIG. 1 is a processing flowchart of the voice response recognition method according to an embodiment of the present invention.

When a call arrives, the device checks whether the caller's telephone number was notified with the call (step 101). If it was not, the unspecified-speaker standard pattern is selected for processing (step 107). If it was, the main control unit 4 first judges whether that telephone number needs a specific-speaker standard pattern (step 102). If not, recognition is performed with the unspecified-speaker pattern (step 107). If so, the standard pattern corresponding to the telephone number is obtained from the table (step 103) and recognition is performed with that specific-speaker standard pattern (step 104). If the result (step 105) shows good recognition performance, the specific-speaker pattern is selected and all subsequent recognition uses it (step 106); if recognition performance is poor, the unspecified-speaker pattern is selected for the subsequent recognition processing (step 107). The need for a specific-speaker standard pattern may be judged either by having the main control unit 4 check the telephone number itself or by using a table that lists telephone numbers together with whether a pattern is needed.
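
Steps 101 to 107 can be sketched as a single selection function. The function and parameter names are hypothetical, and `recognition_ok` stands in for whatever performance check the device applies in step 105, which the text does not define.

```python
UNSPECIFIED = "unspecified"

def select_pattern(caller_number, needs_specific, pattern_table, recognition_ok):
    """Return the standard pattern to use for the rest of the call.

    caller_number  -- notified telephone number, or None if not notified
    needs_specific -- set of numbers judged to need a specific-speaker pattern
    pattern_table  -- telephone number -> specific-speaker pattern id
    recognition_ok -- callable(pattern) -> bool; runs recognition with the
                      pattern and reports whether performance was good
    """
    if caller_number is None:                  # step 101: number not notified
        return UNSPECIFIED                     # step 107
    if caller_number not in needs_specific:    # step 102: no specific pattern needed
        return UNSPECIFIED                     # step 107
    pattern = pattern_table[caller_number]     # step 103: table lookup
    if recognition_ok(pattern):                # steps 104-105: recognize and check
        return pattern                         # step 106: keep the specific pattern
    return UNSPECIFIED                         # step 107: fall back
```

Every path that does not end in step 106 ends in step 107, so the caller is always served with some pattern.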

Thus the present invention improves overall speech recognition performance by using the caller's telephone number notified from the network to select the standard pattern. Even where the network provides caller number notification, however, the user can choose whether or not to send the number, so the case in which it is not notified is also handled. Moreover, although the calling number generally identifies the caller, the line may also be used by, for example, family members, so the speaker is not always the same person.

With these issues in mind, the voice response recognition device of the present invention, which advances the service while recognizing the user's speech and responding automatically, selects the unspecified-speaker standard pattern for recognition when the user's calling number is not notified, and selects the specific-speaker standard pattern corresponding to the telephone number only when the calling number is notified and the system judges that a specific-speaker standard pattern is needed.

Even when the user's calling number is received, the unspecified-speaker standard pattern is selected if it already gives good recognition performance for that user. Furthermore, in case someone other than the expected user speaks from the same telephone number, when the pattern preselected on the basis of the calling number does not meet the predetermined performance, it is automatically replaced by the unspecified-speaker pattern and the service continues.

FIG. 5 is a diagram showing an example of the correspondence table between telephone numbers and standard patterns used in the present invention.

Here it is assumed that n standard patterns are prepared, corresponding to the telephone numbers that need a specific-speaker standard pattern. The table may simply list the start address of each standard pattern. Although this differs from the processing of FIG. 1, a single table can also be used to handle both the lookup and the judgment of whether a specific-speaker standard pattern is needed.

FIG. 6 is a diagram showing an example of a table used in another embodiment of the present invention.

In FIG. 6, the telephone number column lists the telephone numbers of all subject users, and against each is shown the storage address of the standard pattern to be selected. Where the unspecified-speaker standard pattern is to be used, the entry says so; here pattern number 0 denotes the unspecified-speaker pattern. In this example, the standard pattern for telephone number 1 is stored at address 1000, the standard pattern for telephone number 4 at address 2000, and so on.

With the table of FIG. 6, whether a specific-speaker standard pattern is needed can be judged for every telephone number by whether the entry is 0 or non-zero; the device then takes the storage address of the specific-speaker standard pattern for that telephone number, accesses memory at that address, and reads out the standard pattern.
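
A table of the FIG. 6 kind combines the need judgment and the address lookup in one step. The sketch below uses the two entries given in the text (telephone number 1 at address 1000, telephone number 4 at address 2000); the entry for number 2, the treatment of unlisted numbers as 0, and the names are assumptions for illustration.

```python
# Telephone number -> storage address; 0 means "use the unspecified-speaker pattern".
# Entries "1" and "4" follow the example in the text; "2" is made up.
PATTERN_TABLE = {"1": 1000, "2": 0, "4": 2000}

def lookup_pattern_address(phone_number, table=PATTERN_TABLE):
    """Return the storage address of the specific-speaker pattern,
    or None when the unspecified-speaker pattern should be used."""
    address = table.get(phone_number, 0)  # assumption: unlisted numbers count as 0
    return address if address != 0 else None
```

A single dictionary lookup thus answers both questions that FIG. 1 handles in steps 102 and 103.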

FIG. 7 is a diagram showing an example of a standard pattern change as used in the present invention.

It shows a concrete example of changing the standard pattern as a result of a recognition performance check.

Because the present invention identifies the user by telephone number, the person speaking is not necessarily the expected user. In addition, a specific-speaker standard pattern copes poorly with sudden changes in voice quality, such as those caused by a cold. Therefore, when the specific-speaker standard pattern corresponding to the telephone number has been selected and used, recognition performance is checked, and if the result is poor the device changes back to the unspecified-speaker standard pattern and proceeds with recognition.

For example, as shown in FIG. 7, the output guidance first says "Please say your PIN" to the caller, and the caller's input speech is "1234". Suppose that recognition with the specific-speaker standard pattern does not go well. The device then asks through the output guidance, "That is 2285, correct?", quoting the recognized PIN. The caller answers "No", so the guidance asks, "Please say your PIN once more." When the caller says the PIN "1234" again, recognition is performed this time using the unspecified-speaker standard pattern. By checking recognition performance in this way and changing from the specific-speaker standard pattern to the unspecified-speaker standard pattern as appropriate, the degradation caused by an inappropriate pattern selection is kept to the minimum. A command is issued to the recognition device to instruct recognition; because only a few bytes of standard pattern number need to be appended to that command, the standard pattern can be changed in real time.

Conventionally, services were provided using only unspecified-speaker standard patterns, so recognition performance degraded markedly for certain people. The present invention identifies the speaker to a reasonable degree from the calling number and improves recognition performance by using a specific-speaker standard pattern only for those who need one. In case the expected speaker for the calling number is not the one on the line, or voice quality has changed because of a cold or the like, recognition performance continues to be checked after the specific-speaker standard pattern is selected, and the device can change back to the unspecified-speaker standard pattern when performance is insufficient.
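
The confirmation dialog of FIG. 7 can be sketched as a loop that starts with the specific-speaker pattern and falls back once the caller rejects a result. The callables `recognize` and `confirm`, the attempt limit, and the return convention are hypothetical; the text describes only a single fallback after one "No".

```python
def ask_pin(recognize, confirm, specific_pattern, unspecified_pattern, max_attempts=3):
    """Ask for the PIN, switching patterns after a rejected recognition result.

    recognize -- callable(pattern) -> recognized PIN string for one utterance
    confirm   -- callable(pin) -> True if the caller answers "yes" to
                 "That is <pin>, correct?"
    Returns (pin, pattern_used), or (None, last_pattern) if never confirmed.
    """
    pattern = specific_pattern
    for _ in range(max_attempts):
        pin = recognize(pattern)       # guidance plus recognition with the current pattern
        if confirm(pin):               # caller accepts the recognized PIN
            return pin, pattern
        pattern = unspecified_pattern  # caller said "No": change the pattern
    return None, pattern
```

With a recognizer that returns "2285" under the specific pattern and "1234" under the unspecified one, the loop reproduces the exchange described for FIG. 7.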

[Effect of the invention]

As described above, according to the present invention, the speaker is identified from the calling number and recognition is performed with a specific-speaker standard pattern only for those who need one, which improves recognition performance; and when recognition performance is insufficient, the pattern is changed to the unspecified-speaker standard pattern, which keeps the degradation in recognition performance to a minimum.

[Brief explanation of the drawing]

FIG. 1 is a processing flowchart of the voice response recognition method according to an embodiment of the present invention; FIG. 2 is a block diagram of a voice response recognition device using the recognition method of the present invention; FIG. 3 is a configuration diagram of the voice recognition unit in FIG. 2; FIG. 4 is a diagram showing an example of the structure of the standard patterns used in the present invention; FIG. 5 is a diagram showing an example of the telephone number-to-standard pattern correspondence table used in the present invention; FIG. 6 is an explanatory diagram of a correspondence table according to another embodiment of the present invention; and FIG. 7 is an explanatory diagram of a case in which the standard pattern is changed as a result of a recognition check in the present invention.

1: network control unit, 2: voice recognition unit, 3: voice response unit, 4: main control unit, 5: caller number receiving unit, 6: voice analysis unit, 7: standard pattern memory, 8: matching operation unit, 9: result determination unit, 10: dictionary.

Claims

[Claims]

(1) A voice response recognition method for a voice response recognition device that receives the caller's telephone number notified from the network, recognizes the caller's speech by comparing it with a standard pattern, and advances a service by responding to the caller with voice guidance based on the recognition result, characterized in that: a standard pattern for unspecified speakers and standard patterns for a plurality of specific speakers are registered in the voice response recognition device; when the caller's telephone number is not notified, the standard pattern for unspecified speakers is selected and recognition processing is performed; when the caller's telephone number is notified, either the standard pattern for unspecified speakers or the specific-speaker standard pattern corresponding to that telephone number is selected and recognition processing is performed; and when recognition processing with the specific-speaker standard pattern does not satisfy a predetermined performance, the pattern is automatically changed to the unspecified-speaker pattern, recognition processing is performed again, and the subsequent service is continued.