JPS61145600A - Word recognition equipment - Google Patents

Word recognition equipment

Info

Publication number
JPS61145600A
Authority
JP
Japan
Prior art keywords
word
phoneme
phoneme sequence
storage unit
likelihood
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP59269079A
Other languages
Japanese (ja)
Inventor
森井 秀司
藤井 諭
二矢田 勝行
昌克 星見
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Priority to JP59269079A
Publication of JPS61145600A
Legal status: Pending

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION
Field of Industrial Application
The present invention relates to a word recognition device for use in a speech recognition apparatus.

Configuration of the Conventional Example and Its Problems
In a known conventional method of recognizing spoken words, the input speech is first recognized in units of phonemes. The similarity between the recognized phoneme sequence and the phoneme sequence of each recognition target word stored in a word dictionary is then computed, and the word corresponding to the dictionary phoneme sequence with the highest similarity is taken as the recognized word. This method is described in Miwa et al., "Word speech recognition system using the outline form of the speech spectrum and its dynamic characteristics," Journal of the Acoustical Society of Japan 34 (1978).

The conventional word recognition method is described below with reference to the drawings. FIG. 1 shows the configuration of a conventional word recognition device. In the figure, 1 is a similarity calculation unit, 2 is a word dictionary storing the phoneme sequences of the recognition target words, 3 is an inter-phoneme likelihood storage unit storing the likelihoods between phonemes, and 4 is a word determination unit that selects the word with the highest similarity. The recognition method of the word recognition device configured in this way is described below.

A phoneme recognition device (not shown) located upstream of the word recognition device of FIG. 1 converts the input speech signal into a phoneme sequence and sends it to the similarity calculation unit 1 of FIG. 1. The similarity calculation unit 1 calculates the word similarity between the phoneme sequence of the input speech and the phoneme sequence of each of the 166 recognition target words stored in the word dictionary 2, using the inter-phoneme likelihoods stored in the inter-phoneme likelihood storage unit 3. The resulting 166 word similarities are sent to the word determination unit 4, which selects the largest one and outputs, as the recognized word, the word corresponding to the dictionary phoneme sequence that produced it.

Next, the method of calculating the word similarity in the word determination unit 4 is described. Let the phoneme sequence of a dictionary entry in the word dictionary 2 be $D = (D_1, D_2, \ldots, D_I)$ and the input phoneme sequence be $W = (W_1, W_2, \ldots, W_J)$, where $I$ and $J$ are the numbers of phonemes in the dictionary entry $D$ and the input phoneme sequence $W$, respectively. The word similarity $S(D, W)$ is obtained from the recurrence formula shown in Equation 1.

ここで、L=qCi−1*1−1) La=q(i−1,j−2)+1a(j−1)Laa=
 g(i−1,j−3)+1aa(j−2)+1aa(
j−1) Lo = g(i−2,1−1)+4o(1−1)Lo
o ==g (i −3、1−1)+Joo(i−2)
十10o(i−1)g(o、o)= J(1+1 、J
+1 )=Oq(t、o)=q(o+j)=−ω (ただしi≠’+]≠0) qCi 、 j+1>=q(1+1 * i )=−■
(ただしi≠I+1 、 i≠J+1)式1においてj
i(i、j)は辞書項目の音素系列りのi番目の音素D
iと入力音素系列WOj番目の音素Wj との尤度を示
す。同根にハ(j)はwiの音素が付加する尤度、 !
10(i)はDiの音素が脱落する尤度を示す。また、
AaaN)、1oo(i)l:tWiの音素が連続して
付加する尤度とDiの音素が連続して脱落する尤度を示
す。式1は■単語境界同志は必ず対応する。■音素の付
加又は脱落は2連続以内である。■付加と脱落は連続し
て生起しない。という制限を加えて辞書項目の音素系列
りと入力音素系列Wの各音素Di 、 Wjを対応させ
た場合における最適な対応の結果得らnる類似度を表わ
している。また式1におけるβ(i9口。
Here, L=qCi-1*1-1) La=q(i-1,j-2)+1a(j-1)Laa=
g(i-1,j-3)+1aa(j-2)+1aa(
j-1) Lo = g(i-2,1-1)+4o(1-1)Lo
o ==g (i -3, 1-1) + Joo (i-2)
10o(i-1)g(o,o)=J(1+1,J
+1)=Oq(t,o)=q(o+j)=-ω (however, i≠'+]≠0) qCi, j+1>=q(1+1*i)=-■
(However, i≠I+1, i≠J+1) In equation 1, j
i (i, j) is the i-th phoneme D in the phoneme series of the dictionary entry
The likelihood between i and the input phoneme sequence WOj-th phoneme Wj is shown. Ha (j) is the likelihood that the wi phoneme is added to the same root, !
10(i) indicates the likelihood that the phoneme of Di is dropped. Also,
AaaN), 1oo(i)l: Shows the likelihood that the phonemes of tWi are added consecutively and the likelihood that the phonemes of Di are consecutively dropped. Equation 1 is: ■ Word boundaries always correspond. ■ Addition or omission of phonemes is within two consecutive additions or omissions. ■Addition and omission do not occur consecutively. It represents the degree of similarity obtained as a result of the optimal correspondence when the phoneme sequence of the dictionary entry is made to correspond to each phoneme Di, Wj of the input phoneme sequence W with the following restriction. Also, β(i9mouth) in Formula 1.

ga(i)、gaa(i)、6o(i)、1oo(i)
の各尤度の値はあらかじめ多数の音声の音素認Rを行な
った結果から得られる音素の付加や脱落を含む音素認識
の誤りの確率を表わすConfusion Matri
xの各成分の対数値として求められ、音素間尤度格納部
3に格納さn−’cいる。このConfusion M
atrixでは1つの入力音素に対する全ての認識音素
(脱落を含む)の出現確率の和は1となっている。
ga(i), gaa(i), 6o(i), 1oo(i)
Each likelihood value is a Confusion Matri that represents the probability of an error in phoneme recognition, including the addition or omission of phonemes, obtained from the results of performing phoneme recognition R on a large number of voices in advance.
It is calculated as the logarithm value of each component of x and stored in the inter-phoneme likelihood storage unit 3. This Confusion M
In atrix, the sum of the appearance probabilities of all recognized phonemes (including omissions) for one input phoneme is 1.
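The recurrence terms above can be sketched in Python. Equation 1 itself is not reproduced in the text, so the combining step used here, $g(i, j) = \ell(i, j) + \max(L, L_a, L_{aa}, L_o, L_{oo})$ with $S(D, W) = g(I+1, J+1)$, is an assumption inferred from the listed terms and boundary conditions. The function name `word_similarity` and the callable likelihood arguments are hypothetical.

```python
import math

NEG_INF = -math.inf


def word_similarity(D, W, l, l_a, l_aa, l_o, l_oo):
    """Sketch of S(D, W) under an ASSUMED form of Equation 1:
    g(i, j) = l(i, j) + max(L, L_a, L_aa, L_o, L_oo), S = g(I+1, J+1).
    l(d, w) scores a match; l_a/l_aa score added input phonemes;
    l_o/l_oo score dropped dictionary phonemes (all log-likelihoods).
    """
    I, J = len(D), len(W)
    # g covers 0..I+1 by 0..J+1; every cell is -inf except g(0, 0) = 0.
    g = [[NEG_INF] * (J + 2) for _ in range(I + 2)]
    g[0][0] = 0.0

    def d(i):  # D_i with the 1-based index used in the text
        return D[i - 1]

    def w(j):  # W_j with the 1-based index used in the text
        return W[j - 1]

    for i in range(1, I + 2):
        for j in range(1, J + 2):
            # Word-end boundary: index I+1 pairs only with index J+1.
            if (i == I + 1) != (j == J + 1):
                continue
            match = 0.0 if i == I + 1 else l(d(i), w(j))
            candidates = [g[i - 1][j - 1]]  # L
            if j >= 2:  # L_a: one added input phoneme
                candidates.append(g[i - 1][j - 2] + l_a(w(j - 1)))
            if j >= 3:  # L_aa: two added input phonemes in a row
                candidates.append(
                    g[i - 1][j - 3] + l_aa(w(j - 2)) + l_aa(w(j - 1)))
            if i >= 2:  # L_o: one dropped dictionary phoneme
                candidates.append(g[i - 2][j - 1] + l_o(d(i - 1)))
            if i >= 3:  # L_oo: two dropped dictionary phonemes in a row
                candidates.append(
                    g[i - 3][j - 1] + l_oo(d(i - 2)) + l_oo(d(i - 1)))
            g[i][j] = match + max(candidates)
    return g[I + 1][J + 1]
```

With toy likelihoods (for example 0 for a matching phoneme pair and a negative constant otherwise), one or two extra phonemes in front of an otherwise matching input only cost the corresponding addition likelihoods, while a third extra phoneme forces at least one mismatched pairing.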

In other words, the conventional word recognition method calculates the word similarity given by Equation 1 between the input phoneme sequence and every entry in the word dictionary, and takes the word corresponding to the entry with the highest word similarity as the recognized word.
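The selection step described here amounts to an argmax over the dictionary. The following minimal sketch shows that step only. The names `recognize` and `toy_similarity` are hypothetical, and `toy_similarity` is a deliberately crude stand-in for the word similarity of Equation 1, not the patent's measure.

```python
def toy_similarity(dict_phonemes, input_phonemes):
    """Hypothetical stand-in score: 0 for identical sequences, lower otherwise."""
    mismatches = sum(a != b for a, b in zip(dict_phonemes, input_phonemes))
    return -(mismatches + abs(len(dict_phonemes) - len(input_phonemes)))


def recognize(input_phonemes, dictionary, similarity):
    """Score every dictionary entry and return (word, score) of the best one."""
    best_word, best_score = None, float("-inf")
    for word, dict_phonemes in dictionary.items():
        score = similarity(dict_phonemes, input_phonemes)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score
```

Passing the similarity function as an argument keeps the selection logic independent of how the score is computed, so a dynamic-programming score could be substituted without changing `recognize`.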

However, the conventional method often fails to work effectively when noise is added at the beginning or end of the input speech, such as breath sounds or sighs produced from the mouth, or meaningless utterances such as "um". FIG. 2 shows an example of the phoneme sequence obtained by phoneme recognition when breath noise precedes the spoken word "Ichikawa". The breath noise is recognized as /ha/, and the silent interval from that noise to the start of the speech is recognized as /Q/ (the sokuon, or geminate consonant), so the three phonemes /haQ/ are added consecutively at the beginning of the word. When noise from the mouth is added at the beginning or end of a word in this way, the noise portion tends to be recognized as two or more phonemes, and the silent interval between the noise and the start of the speech, or between the end of the speech and the noise, tends to be recognized as the sokuon /Q/. As a result, three or more phonemes are often added consecutively. Because the recurrence formula of Equation 1 assumes that at most two phonemes are added in a row, when three or more are added consecutively the dictionary phonemes $D_i$ and the input phonemes $W_j$ cannot be optimally matched, as shown in FIG. 3. The word similarity with the dictionary entry of the correct word then becomes small, which makes a correct recognition result hard to obtain.

Object of the Invention
The present invention eliminates the above drawbacks of the prior art and provides a word recognition device whose performance degrades little even when noise is added at the beginning or end of the spoken word.

Structure of the Invention
The basic configuration of the present invention comprises: a word dictionary storage unit storing the phoneme sequences of the recognition target words; an inter-phoneme likelihood storage unit storing the inter-phoneme likelihoods; a phoneme sequence verification unit that determines whether noise may have been added at the beginning or end of the input phoneme sequence; a word boundary re-determination unit that, when noise may have been added, sets a word boundary that excludes the noise portion and generates a modified input phoneme sequence; a word similarity calculation unit that calculates the word similarity between the input phoneme sequence, or both the input phoneme sequence and the modified input phoneme sequence, and the phoneme sequences in the word dictionary storage unit; and a word determination unit that selects the largest of the word similarities and outputs, as the recognized word, the word corresponding to the dictionary phoneme sequence used to calculate that largest similarity. When noise may have been added at the beginning or end of the speech, the word similarity with the dictionary phoneme sequences is obtained for two input phoneme sequences: the original input phoneme sequence and the modified input phoneme sequence with the corrected word boundary.

Description of the Embodiment
An embodiment of the present invention is described below with reference to the drawings. FIG. 4 is a block diagram of a word recognition device incorporated in a speech recognition apparatus according to an embodiment of the present invention. In FIG. 4, 5 is a word similarity calculation unit, which calculates the word similarity between the phoneme sequence of the input speech, as recognized by an upstream phoneme recognition device (not shown), and the phoneme sequences of the recognition target words stored in the word dictionary storage unit 6.

7 is an inter-phoneme likelihood storage unit, which stores the inter-phoneme likelihoods used by the word similarity calculation unit 5 when calculating the word similarity. These inter-phoneme likelihoods are obtained by running phoneme recognition on a large amount of speech in advance and taking the logarithm of each component of the resulting confusion matrix, which represents the probabilities of phoneme recognition errors, including additions and drops of phonemes. 8 is a phoneme sequence verification unit, which checks whether the phoneme sequence sent from the upstream phoneme recognition device contains the sokuon (geminate consonant) /Q/. If the input phoneme sequence contains the sokuon /Q/, the word boundary re-determination unit 9 re-determines the word boundary, and the re-determined phoneme sequence is sent back to the word similarity calculation unit 5. 10 is a word determination unit, which finds the largest of the word similarities calculated by the word similarity calculation unit 5 and outputs, as the recognized word, the word corresponding to the phoneme sequence of the dictionary entry in the word dictionary storage unit 6 that produced that largest similarity.
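As a rough illustration of how such a likelihood table could be derived, the sketch below turns recognition counts into log-probabilities. The function name `log_likelihoods`, the nested-dictionary layout, and the `"<drop>"` label are assumptions made for this example. Skipping zero counts is also an assumption, since the text does not say how unseen confusions are handled.

```python
import math


def log_likelihoods(counts):
    """Convert counts[spoken][recognized] into log-probabilities.

    Each row is normalized so its probabilities sum to 1, matching the
    property stated for the confusion matrix. A "<drop>" column can hold
    the count of times the spoken phoneme was not recognized at all.
    Zero counts are skipped because log(0) is undefined (an assumption).
    """
    table = {}
    for spoken, row in counts.items():
        total = sum(row.values())
        table[spoken] = {
            recognized: math.log(n / total)
            for recognized, n in row.items()
            if n > 0
        }
    return table
```

Working with logarithms lets the similarity of a whole alignment be accumulated by addition instead of multiplying many small probabilities.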

The operation of the word recognition device configured as above is described below. Speech input from a microphone or the like is converted into a phoneme sequence by the phoneme recognition device and sent to the word recognition device of FIG. 4. The input phoneme sequence is sent to both the word similarity calculation unit 5 and the phoneme sequence verification unit 8 of FIG. 4. The word similarity calculation unit 5 calculates the word similarity between the input phoneme sequence and the phoneme sequence of each recognition target word stored in the word dictionary storage unit 6, using the inter-phoneme likelihoods stored in the inter-phoneme likelihood storage unit 7. This word similarity is computed with Equation 1, and it is calculated for every phoneme sequence stored in the word dictionary storage unit 6.

Meanwhile, the phoneme sequence input to the phoneme sequence verification unit 8 is checked for the presence of the sokuon /Q/. If the input phoneme sequence contains no sokuon, the processing below is skipped. If it does contain a sokuon, noise may have been added at the beginning or end of the word, so the input phoneme sequence is sent to the word boundary re-determination unit 9, where the word boundary is corrected. When the sokuon in the input phoneme sequence is closer to the beginning of the word than to the end, the word boundary re-determination unit 9 assumes that noise may have been added at the beginning and moves the word boundary so that the phoneme following the sokuon becomes the beginning of the word. Conversely, when the sokuon is closer to the end than to the beginning, noise may have been added at the end, so the word boundary is moved so that the phoneme preceding the sokuon becomes the end of the word. The phoneme sequence with the corrected word boundary is sent from the word boundary re-determination unit 9 to the word similarity calculation unit 5, which calculates its word similarity with the phoneme sequences in the word dictionary storage unit 6. In other words, when the input phoneme sequence contains a sokuon, the word similarity between the boundary-corrected input phoneme sequence and the dictionary phoneme sequences is calculated in addition to that between the original input phoneme sequence and the dictionary phoneme sequences. The calculated word similarities are sent to the word determination unit 10, which outputs, as the recognized word, the word corresponding to the dictionary phoneme sequence with the largest word similarity.
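The boundary correction described above can be sketched as follows. The names `redetermine_boundary` and `candidate_sequences`, the list-of-strings representation, and the `"Q"` symbol are assumptions for illustration. The text does not say how a sokuon exactly in the middle, or several sokuon in one sequence, should be handled; this sketch looks only at the first `"Q"` and treats a tie as end-of-word noise.

```python
SOKUON = "Q"  # assumed symbol for the sokuon phoneme


def redetermine_boundary(phonemes):
    """Return the boundary-corrected sequence, or None if there is no sokuon."""
    if SOKUON not in phonemes:
        return None
    pos = phonemes.index(SOKUON)  # first sokuon only (assumption)
    distance_to_head = pos
    distance_to_tail = len(phonemes) - 1 - pos
    if distance_to_head < distance_to_tail:
        # Suspected noise at the beginning: start at the phoneme after Q.
        return phonemes[pos + 1:]
    # Suspected noise at the end (ties included, by assumption):
    # end at the phoneme before Q.
    return phonemes[:pos]


def candidate_sequences(phonemes):
    """Original sequence plus the corrected one, when a sokuon is present."""
    candidates = [phonemes]
    corrected = redetermine_boundary(phonemes)
    if corrected is not None:
        candidates.append(corrected)
    return candidates
```

Both candidates are then scored against every dictionary entry and the overall maximum is taken. Keeping the original sequence as a candidate presumably protects words in which /Q/ is a genuine part of the pronunciation.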

According to this embodiment, even when noise such as breath sounds, throat clearing, or meaningless utterances such as "um" is added at the beginning or end of the speech, the word boundary re-determination unit 9 removes the noise, so a correct recognition result can be obtained. This exploits the fact that a silent interval often exists between such noise and the speech, and that the phoneme recognition result for this silent interval is the sokuon /Q/.

An evaluation experiment was conducted with a speech recognition apparatus incorporating the word recognition device of this embodiment, using 274 words uttered by a total of 40 male and female speakers. The device operated effectively even when noise of the kind described above was added, and a good result was obtained: an average word recognition rate of 95.6%.

Effects of the Invention
As described above, the present invention is a word recognition device comprising: a word dictionary storage unit storing the phoneme sequences of the recognition target words; an inter-phoneme likelihood storage unit storing the likelihoods between phonemes; a phoneme sequence verification unit that determines whether noise may have been added to the input phoneme sequence; a word boundary re-determination unit that corrects the word boundary and generates a modified input phoneme sequence; a word similarity calculation unit that calculates the word similarity between the input phoneme sequence, or both the input phoneme sequence and the modified input phoneme sequence, and the phoneme sequences stored in the word dictionary storage unit; and a word determination unit that selects the largest of the calculated word similarities and outputs, as the recognition result, the word corresponding to the phoneme sequence in the word dictionary storage unit used to calculate it. The invention exploits the fact that, when noise such as breath sounds, throat clearing, or meaningless utterances such as "um" is added at the beginning or end of the speech, a silent interval often exists between the noise and the speech. The phoneme sequence verification unit determines whether noise may have been added to the input phoneme sequence. If so, the word boundary re-determination unit corrects the word boundary, and the word similarity is obtained from two input phoneme sequences: the original one and the one with the corrected word boundary. This has the advantage that correct recognition results can be obtained even for speech with added noise.

[Brief Description of the Drawings]
FIG. 1 is a block diagram showing the functional configuration of a conventional word recognition device. FIG. 2 shows an example of a phoneme sequence when noise is added at the beginning of a word. FIG. 3 shows an example of an incorrect correspondence between input phonemes and dictionary phonemes. FIG. 4 is a functional block diagram of a word recognition device according to an embodiment of the present invention.
5: word similarity calculation unit; 6: word dictionary storage unit; 7: inter-phoneme likelihood storage unit; 8: phoneme sequence verification unit; 9: word boundary re-determination unit; 10: word determination unit.

Claims (1)

[Claims]
A word recognition device comprising: a word dictionary storage unit storing the phoneme sequences of words to be recognized; an inter-phoneme likelihood storage unit storing likelihoods between phonemes; a phoneme sequence verification unit that determines whether noise has been added to an input speech sequence obtained from speech; a word boundary re-determination unit that, when noise has been added, corrects the word boundary and generates a modified input phoneme sequence; a word similarity calculation unit that calculates, using the inter-phoneme likelihoods stored in the inter-phoneme likelihood storage unit, the word similarity between the input speech sequence, or the input phoneme sequence and the modified input phoneme sequence, and the phoneme sequences stored in the word dictionary storage unit; and a word determination unit that selects the largest of the word similarities calculated by the word similarity calculation unit and outputs, as the recognition result, the word corresponding to the phoneme sequence in the word dictionary storage unit used to calculate that word similarity.
JP59269079A 1984-12-19 1984-12-19 Word recognition equipment Pending JPS61145600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP59269079A JPS61145600A (en) 1984-12-19 1984-12-19 Word recognition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP59269079A JPS61145600A (en) 1984-12-19 1984-12-19 Word recognition equipment

Publications (1)

Publication Number Publication Date
JPS61145600A (en) 1986-07-03

Family

ID=17467370

Family Applications (1)

Application Number Title Priority Date Filing Date
JP59269079A Pending JPS61145600A (en) 1984-12-19 1984-12-19 Word recognition equipment

Country Status (1)

Country Link
JP (1) JPS61145600A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0289100A (en) * 1988-09-26 1990-03-29 Sharp Corp Voice recognizing device

Similar Documents

Publication Publication Date Title
JP4169921B2 (en) Speech recognition system
US5167004A (en) Temporal decorrelation method for robust speaker verification
JP2692581B2 (en) Acoustic category average value calculation device and adaptation device
US4882759A (en) Synthesizing word baseforms used in speech recognition
CN1251194A (en) Recognition system
JPH08234788A (en) Method and equipment for bias equalization of speech recognition
US20040236577A1 (en) Acoustic model creation method as well as acoustic model creation apparatus and speech recognition apparatus
CN1148720C (en) Speaker recognition
JP3001037B2 (en) Voice recognition device
US7181395B1 (en) Methods and apparatus for automatic generation of multiple pronunciations from acoustic data
JP2780676B2 (en) Voice recognition device and voice recognition method
JP2000250576A (en) Feature extracting method for speech recognition system
KR20170134115A (en) Voice recognition apparatus using WFST optimization and method thereof
JP3397568B2 (en) Voice recognition method and apparatus
JPH08106296A (en) Word recognition system
CN1251193A (en) Speech analysis system
US5220609A (en) Method of speech recognition
JP2002366192A (en) Voice recognition method and voice recognition device
JPH10149191A (en) Model adaptation method, apparatus and storage medium
JPS61145600A (en) Word recognition equipment
US8229739B2 (en) Speech processing apparatus and method
JP3437492B2 (en) Voice recognition method and apparatus
JP3100180B2 (en) Voice recognition method
JP2961916B2 (en) Voice recognition device
JP3008520B2 (en) Standard pattern making device