JPH06130989A - Voice recognizing device

Info

Publication number: JPH06130989A
Application number: JP30632092A
Authority: JP
Inventors: Takeshi Kanai; 健金井
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1992-10-20
Filing date: 1992-10-20
Publication date: 1994-05-13

Abstract

PURPOSE: To provide a voice recognizing device that lets the user select the correct candidate from plural candidates with simple operations. CONSTITUTION: A single-syllable recognizing device 2 outputs plural recognition results for each single syllable of the voice input from a microphone 1. A language processor 3 performs language processing on those results and outputs clause candidates, each with a score expressing its certainty. The clause candidates are displayed on a display part 5 in order of certainty score, together with symbols for distinguishing the candidates. When the user utters the symbol displayed alongside a candidate in order to specify it, the utterance is recognized by the recognizing device 2 under the control of a controller 4.

Description

Detailed Description of the Invention

[0001]

TECHNICAL FIELD The present invention relates to a voice recognition device and, more particularly, to a Japanese input device based on monosyllable recognition.

[0002]

PRIOR ART Japanese input by voice is an extremely natural means of entering documents, and monosyllable recognition in particular has attracted attention because it requires a smaller stored voice dictionary than word recognition. However, as shown in JP-A-3-221999, monosyllable recognition generally has a low recognition rate, so plural candidates must be output as the recognition result. Consequently, when a phrase (bunsetsu) is long, the number of phrase candidates becomes extremely large even after language processing, and the problem is how to select the correct candidate from many candidates without burdening the user.

[0003]

PURPOSE As described above, Japanese input by voice is an extremely natural means of entering documents, and monosyllable recognition in particular has attracted attention because it requires a smaller stored voice dictionary than word recognition. However, monosyllable recognition has a low recognition rate, so plural candidates must be output as the recognition result. Consequently, when a phrase (bunsetsu) is long, the number of phrase candidates becomes extremely large even after language processing, and the problem is how to select the correct candidate from many candidates without burdening the user. An object of the present invention is therefore to provide a voice recognition device that selects the correct candidate from plural candidates by a simple operation.

[0004]

CONSTITUTION To achieve the above object, the present invention is characterized by (1) comprising a monosyllabic recognition device that outputs a plurality of recognition results for one monosyllable, a language processing device that performs language processing on the recognition results and outputs phrase candidates, each with a score indicating its certainty, a display device that displays the phrase candidates in order of certainty score together with symbols so that the candidates can be distinguished, and a voice recognition device that, when the symbol displayed alongside a candidate is uttered in order to specify that candidate, recognizes the utterance. The invention is further characterized in that (2) a symbol corresponding to a scroll function is determined for use while candidates are displayed, and when that symbol is uttered, the next group of candidates in order of certainty score is displayed, or in that (3) a symbol corresponding to a correction function is determined for the case where the voice recognition device misrecognizes the utterance when a candidate is confirmed, and when that symbol is uttered, the device returns to the candidate display and selection state. The invention is described below on the basis of embodiments.

[0005] FIG. 1 is a block diagram for explaining an embodiment of a voice recognition device according to the present invention. In the figure, 1 is a microphone, 2 is a monosyllabic recognition device, 3 is a language processing device, 4 is a control device, and 5 is a display device. First, the monosyllabic recognition device 2 takes in voice from the microphone 1 and outputs a plurality of candidates for each monosyllable. Next, the language processing device 3 derives a plurality of phrase candidates from the recognition results of the monosyllabic recognition device 2 and outputs them, each with a score indicating its certainty. The display device 5 then displays these phrase candidates in score order, each alongside a symbol that distinguishes it. Finally, under the control of the control device 4, the monosyllabic recognition device 2 recognizes the symbol that the user utters for the intended phrase, and the correct phrase is thereby specified.

[0006] Next, the process of inputting a phrase with this device is described using a specific example. For example, when "メカトロニクス" (mechatronics) is uttered into the microphone 1, the monosyllabic recognition device 2 generates the syllable lattice shown in Table 1.

[0007]

[Table 1]

[0008] In Table 1, for example, when "メ" (me) is uttered, the first-ranked candidate by recognition rate is "め" (me), the second is "ね" (ne), and the third is "ん" (n). Next, the language processing device 3 expands this syllable lattice into phrases, evaluates each phrase on the basis of recognition rate and linguistic validity, and outputs it with a certainty score. An example of the output is shown in Table 2.

[0009]

[Table 2]
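The lattice expansion and scoring step can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the description does not say how recognition rate and linguistic validity are combined, so the product of per-syllable scores and the dictionary bonus used here are assumptions, as are the names `LATTICE`, `DICTIONARY`, and `expand_lattice`. Only the first-position candidates (め, ね, ん) come from the description. Every score and the second lattice position are invented for illustration.

```python
from itertools import product

# Hypothetical syllable lattice: each position holds (syllable, score) pairs.
# Only the first position's syllables appear in the description; the scores
# and the whole second position are made up for this sketch.
LATTICE = [
    [("め", 0.6), ("ね", 0.3), ("ん", 0.1)],
    [("か", 0.7), ("た", 0.3)],
]

# Hypothetical word list standing in for the "linguistic validity" check.
DICTIONARY = {"めか", "ねた"}


def expand_lattice(lattice, dictionary, bonus=2.0):
    """Enumerate every path through the lattice and attach a score.

    Assumed scoring: product of the per-syllable scores, multiplied by
    `bonus` when the resulting string is found in the dictionary.
    """
    candidates = []
    for path in product(*lattice):
        text = "".join(syllable for syllable, _ in path)
        score = 1.0
        for _, syllable_score in path:
            score *= syllable_score
        if text in dictionary:
            score *= bonus
        candidates.append((text, score))
    # Highest score first, matching the display order in the description.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates


ranked = expand_lattice(LATTICE, DICTIONARY)
for text, score in ranked:
    print(f"{text}\t{score:.2f}")
```

Exhaustive enumeration grows exponentially with phrase length, which is the candidate explosion the description points out. A practical system would prune the search, but the pruning strategy is outside what the source describes.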

[0010] Next, the control device 4 causes the display device 5 to display the group of scored phrase candidates. An example of the display is shown in Table 3. The candidates are listed from the top in descending order of score, and the symbols "あ", "い" … "え" are shown at the left edge to distinguish them. The control device 4 also puts the monosyllabic recognition device 2 into a state in which it can accept voice input. In this example "い" corresponds to the correct candidate. When the user utters "い", the recognition device 2 recognizes it and the candidate is confirmed.

[0011]

[Table 3]
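The labelled display and spoken selection can be sketched as below. The function names `label_candidates` and `select` and the placeholder candidates are hypothetical. The description lists the symbols as "あ", "い" … "え", so the intermediate symbol "う" is an assumption made to complete the sequence.

```python
# Selection symbols shown at the left edge of the display. "う" is assumed;
# the description elides the symbols between "い" and "え".
SYMBOLS = ["あ", "い", "う", "え"]


def label_candidates(ranked):
    """Pair the top-scoring candidates with selection symbols, best first."""
    return list(zip(SYMBOLS, ranked))


def select(labelled, uttered):
    """Return the candidate whose symbol was uttered, or None if none match."""
    for symbol, (text, _score) in labelled:
        if symbol == uttered:
            return text
    return None


# Placeholder candidates, already sorted by descending score (invented data).
ranked = [
    ("candidate-1", 0.9),
    ("candidate-2", 0.8),
    ("candidate-3", 0.5),
    ("candidate-4", 0.2),
    ("candidate-5", 0.1),
]

labelled = label_candidates(ranked)
for symbol, (text, score) in labelled:
    print(symbol, text, score)

chosen = select(labelled, "い")
```

Because every selection symbol is itself a single syllable, the same monosyllabic recognizer can be reused for selection, which matches the arrangement in the description.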

[0012] The embodiment corresponding to claim 2 extends the embodiment described above. "す" is determined in advance as the symbol corresponding to the scroll function. If the correct candidate is not among the phrase candidates being displayed and the user utters "す", the voice recognition device 2 recognizes it and the group of phrases with the next highest scores is displayed.
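A minimal sketch of the scroll behaviour, assuming the candidates are shown four at a time. The page size, the function names `page` and `handle_utterance`, and the placeholder phrases are all hypothetical. Only the scroll symbol "す" comes from the description.

```python
SCROLL_SYMBOL = "す"  # scroll symbol named in the description
PAGE_SIZE = 4         # assumed number of candidates shown at once


def page(ranked, page_index, size=PAGE_SIZE):
    """Return the slice of score-ordered candidates for the given page."""
    start = page_index * size
    return ranked[start:start + size]


def handle_utterance(ranked, page_index, uttered, size=PAGE_SIZE):
    """Move to the next page when the scroll symbol is heard.

    The index stays unchanged for any other utterance, or when no further
    candidates remain (an assumed policy; the description does not say what
    happens at the end of the list).
    """
    if uttered == SCROLL_SYMBOL and (page_index + 1) * size < len(ranked):
        return page_index + 1
    return page_index


# Placeholder phrases standing in for a score-ordered candidate list.
ranked = [f"phrase-{i}" for i in range(1, 11)]

current = 0
print(page(ranked, current))
current = handle_utterance(ranked, current, "す")
print(page(ranked, current))
```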

[0013] The embodiment corresponding to claim 3 also extends the embodiment described above. "て" is determined in advance as the symbol corresponding to the correction function. Suppose that, when confirming a phrase candidate, the user utters "い" for the correct candidate but the recognition device 2 misrecognizes it as "あ", so that the wrong candidate is confirmed. When the user then utters "て", the voice recognition device 2 recognizes it, and the device displays the phrase candidate group again and returns to the state of selecting a candidate.
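The confirm-then-correct flow can be modelled as a two-state machine. The sketch below is one possible reading. The class `Selector`, the `State` names, and the placeholder phrases are hypothetical. Only the correction symbol "て" and the misrecognition scenario ("い" heard as "あ") come from the description.

```python
from enum import Enum


class State(Enum):
    SELECTING = "selecting"   # candidates displayed, waiting for a symbol
    CONFIRMED = "confirmed"   # a candidate has been confirmed


CORRECTION_SYMBOL = "て"  # correction symbol named in the description


class Selector:
    """Tracks candidate selection and allows a confirmed choice to be undone."""

    def __init__(self, labelled):
        self.labelled = dict(labelled)  # symbol -> phrase
        self.state = State.SELECTING
        self.confirmed = None

    def hear(self, recognized):
        """Process one recognized syllable and return the resulting state."""
        if self.state is State.SELECTING and recognized in self.labelled:
            self.confirmed = self.labelled[recognized]
            self.state = State.CONFIRMED
        elif self.state is State.CONFIRMED and recognized == CORRECTION_SYMBOL:
            # Release the confirmed candidate and go back to selection.
            self.confirmed = None
            self.state = State.SELECTING
        return self.state


selector = Selector({"あ": "phrase-A", "い": "phrase-B"})
selector.hear("あ")   # user said "い" but the recognizer heard "あ"
wrong = selector.confirmed
selector.hear("て")   # user cancels the wrong confirmation
selector.hear("い")   # this time the utterance is recognized correctly
final = selector.confirmed
```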

[0014]

EFFECTS As is apparent from the above description, the present invention has the following effects.
(1) In the voice recognition device of claim 1, phrase candidates can be selected by voice, so input requires simpler operations.
(2) In the voice recognition device of claim 2, the next group of phrase candidates can be displayed and selected by voice, so the user can choose from a wider range of candidates.
(3) In the voice recognition device of claim 3, a candidate that has been confirmed can be released by voice, so the correct candidate can still be input even if the recognition device misrecognizes the utterance at confirmation.

[Brief description of drawings]

FIG. 1 is a block diagram for explaining an embodiment of a voice recognition device according to the present invention.

[Explanation of symbols]

1 ... microphone, 2 ... monosyllabic recognition device, 3 ... language processing device, 4 ... control device, 5 ... display device.

Claims

[Claims]

1. A voice recognition device comprising: a monosyllabic recognition device that outputs a plurality of recognition results for one monosyllable; a language processing device that performs language processing on the recognition results and outputs phrase candidates, each with a score indicating its certainty; a display device that displays the phrase candidates in order of certainty score together with symbols so that the candidates can be distinguished; and a voice recognition device that, when the symbol displayed alongside a candidate is uttered in order to specify that displayed candidate, recognizes the utterance.

2. The voice recognition device according to claim 1, wherein a symbol corresponding to a scroll function is determined for use while candidates are displayed, and when that symbol is uttered, the next group of candidates in order of certainty score is displayed.

3. The voice recognition device according to claim 1, wherein a symbol corresponding to a correction function is determined for the case where the voice recognition device misrecognizes an utterance when a candidate is confirmed, and when that symbol is uttered, the device returns to the candidate display and selection state.