
JPH08328583A - Speech recognition device - Google Patents

Speech recognition device

Info

Publication number
JPH08328583A
JPH08328583A JP7136725A JP13672595A
Authority
JP
Japan
Prior art keywords
likelihood
cumulative likelihood
cumulative
frame
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP7136725A
Other languages
Japanese (ja)
Other versions
JP2853731B2 (en)
Inventor
Shinsuke Sakai
信輔 坂井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Priority to JP7136725A priority Critical patent/JP2853731B2/en
Publication of JPH08328583A publication Critical patent/JPH08328583A/en
Application granted granted Critical
Publication of JP2853731B2 publication Critical patent/JP2853731B2/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Abstract

PURPOSE: To provide a high-speed speech recognition device that requires only a small amount of processing to determine the pruning threshold, while keeping search efficiency stable against fluctuations in the ambient noise environment and changes in the beam-width setting. CONSTITUTION: A feature extraction unit 101 converts the speech input into a time series of feature vectors and outputs it to a recurrence formula calculation unit 104. A standard pattern storage unit 102 stores standard patterns. A cumulative likelihood storage unit 103 stores the cumulative likelihoods output from a cumulative likelihood output unit 105. The recurrence formula calculation unit 104 obtains the cumulative likelihoods up to the i-th frame from the feature vector and standard patterns of the i-th frame and the cumulative likelihoods up to the (i-1)-th frame. The cumulative likelihood output unit 105 selects the candidates up to the M-th rank from the set of input cumulative likelihoods over an initial partial section of K frames, and thereafter selects candidates using the mean value, computed over that K-frame section, of the cumulative likelihood difference between the most likely candidate and the M-th candidate, outputting the result to the cumulative likelihood storage unit 103. A result output unit 106 outputs the recognition result based on the cumulative likelihoods up to the final frame.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech recognition device.

[0002]

[Prior Art] Since a speech recognition device requires a very large amount of computation, reduction of the computation by beam search has conventionally been attempted. As methods of setting the beam width for candidate pruning in a beam search, two approaches are well known: at each pruning step, keeping a fixed number of candidates in descending order of likelihood, or keeping every candidate whose likelihood lies within a fixed range of the maximum likelihood.
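For concreteness, the two conventional pruning rules can be sketched as follows in Python; the Candidate container and the function names are illustrative only and do not come from the patent.

```python
import heapq
from dataclasses import dataclass

@dataclass
class Candidate:
    path: tuple        # hypothesised alignment / word sequence
    likelihood: float  # cumulative likelihood of that hypothesis

def prune_top_m(candidates, m):
    """Keep the M best-scoring candidates; needs a partial sort every frame."""
    return heapq.nlargest(m, candidates, key=lambda c: c.likelihood)

def prune_fixed_width(candidates, width):
    """Keep every candidate within `width` of the best score; only a max and a scan."""
    best = max(c.likelihood for c in candidates)
    return [c for c in candidates if c.likelihood >= best - width]
```

The first rule keeps the candidate count constant but pays for the ranking step in every frame; the second avoids sorting but lets the candidate count drift with the acoustic conditions, which is exactly the trade-off the invention addresses.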

[0003] In the paper "Beam Search in Continuous Speech Recognition" by Ito et al., Proceedings of the Acoustical Society of Japan Research Meeting, October 1993, pp. 73-74, it is reported that the method of keeping a fixed number of candidates gives search efficiency that is more stable against changes in the beam-width setting. Furthermore, the number of candidates whose likelihood lies within a fixed range of the maximum likelihood can be expected to fluctuate with the ambient noise environment at the time of utterance, whereas no such fluctuation of the candidate count occurs when a fixed number of candidates is kept.

[0004]

[Problems to be Solved by the Invention] However, the above method of keeping a fixed number of candidates (say M candidates) has the drawback of a large processing load, because a ranking operation is required at each pruning step to find the M-th candidate.

[0005]

[Means for Solving the Problems] According to the invention of claim 1, there is provided a speech recognition device characterized in that, in the first several frames of the input speech, the difference between the maximum likelihood and the likelihood of the candidate at a fixed rank is determined, and thereafter this difference is used to set the threshold for candidate pruning in the beam search.

[0006] According to the invention of claim 2, there is provided a speech recognition device comprising: a feature extraction unit which analyzes a speech signal and outputs a time series of feature vectors; a standard pattern storage unit which stores standard patterns created in advance; a cumulative likelihood storage unit which holds cumulative likelihoods; a recurrence formula calculation unit which obtains new cumulative likelihoods from the cumulative likelihoods stored in the cumulative likelihood storage unit, the time series of feature vectors, and the standard patterns; a cumulative likelihood output unit which, for a certain partial sequence of the feature vector time series, outputs a fixed number of the cumulative likelihoods obtained by the recurrence formula calculation unit and accumulates the difference between the maximum cumulative likelihood and the likelihood at that fixed rank, and which, for the subsequent partial sequences, determines the cumulative likelihoods to be output by means of a threshold derived from the accumulated likelihood differences; and a result output unit which obtains the recognition result for the speech signal from the cumulative likelihoods output by the cumulative likelihood output unit.

[0007] According to the invention of claim 3, there is provided a threshold setting method for a speech recognition device characterized in that, for each of an arbitrary number of partial sequences of the input speech, the mean value of the difference between the cumulative likelihood of the M-th candidate and the maximum cumulative likelihood is determined, and during the following partial sequence the candidate pruning threshold is set using the mean difference obtained in the preceding partial sequence.

[0008]

[Embodiments] Next, the present invention will be described with reference to the drawings.

[0009] FIG. 1 is a block diagram showing one embodiment of the present invention. Referring to FIG. 1, the embodiment comprises a feature extraction unit 101, a standard pattern storage unit 102, a cumulative likelihood storage unit 103, a recurrence formula calculation unit 104, a cumulative likelihood output unit 105, and a result output unit 106.

[0010] The feature extraction unit 101 converts the speech input into a time series of feature vectors and outputs it to the recurrence formula calculation unit 104. The standard pattern storage unit 102 stores the standard patterns. The cumulative likelihood storage unit 103 stores the cumulative likelihoods output from the cumulative likelihood output unit 105; before processing starts, it holds the initial cumulative likelihood value 1.0 for all recognition path candidates. The recurrence formula calculation unit 104 obtains the cumulative likelihoods up to the i-th frame from the feature vector of the i-th frame, the standard patterns, and the cumulative likelihoods up to the (i-1)-th frame. The cumulative likelihood output unit 105 selects, from the set of input cumulative likelihoods, those to be used in the cumulative likelihood calculation of the next frame, and outputs them to the cumulative likelihood storage unit 103. The result output unit 106 outputs the recognition result based on the cumulative likelihoods up to the final frame.
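As a rough orientation, the per-frame data flow between the blocks of FIG. 1 can be sketched as follows; the `recognizer` object and every method name on it are hypothetical stand-ins for blocks 101 to 106, not an interface defined by the patent.

```python
def recognize(audio, recognizer):
    """Illustrative glue code for the block diagram of FIG. 1."""
    features = recognizer.extract_features(audio)        # 101: feature vector time series
    cumulative = recognizer.initial_likelihoods()        # 103: every path starts at 1.0
    for i, feat in enumerate(features, start=1):
        updated = recognizer.recurrence_step(i, feat, cumulative)   # 104: DP update
        cumulative = recognizer.select_candidates(i, updated)       # 105: pruning
    return recognizer.best_result(cumulative)            # 106: recognition result
```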

[0011] Next, the operation of this embodiment will be described with reference to FIGS. 1 and 2.

[0012] The input speech is converted by the feature extraction unit 101, at fixed time intervals, into feature vectors representing the frequency spectrum of the speech, which are output to the recurrence formula calculation unit 104. This fixed time interval is hereafter called a frame. At the i-th frame, the recurrence formula calculation unit 104 uses the standard patterns held in the standard pattern storage unit 102,

REF = {R_1, ..., R_N}, where R_w = {r_w(1), ..., r_w(J_w)},

to compute the local likelihood of the current frame's feature vector against each standard pattern,

l_w(i, j)  (w = 1, ..., N; j = 1, ..., J_w),

where N is the number of standard patterns and J_w is the frame length of the w-th standard pattern. Next, from these local likelihoods and the cumulative likelihood set of the (i-1)-th frame held in the cumulative likelihood storage unit 103,

G = {g_1(i-1, 1), ..., g_1(i-1, J_1), ..., g_N(i-1, 1), ..., g_N(i-1, J_N)},

the recognition path candidates of the current frame and their cumulative likelihoods are obtained by a maximization process based on dynamic programming, expressed as Equation 1 below (step 1 in FIG. 2).
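Equation 1 is not reproduced here, so the sketch below assumes a common frame-synchronous update in which state j of pattern w can be reached from state j or state j-1 of the same pattern; the patent's actual recurrence may differ. Likelihoods are kept in the linear domain to match the initial value of 1.0 mentioned above.

```python
import numpy as np

def recurrence_step(local_likelihood, prev_cumulative):
    """One frame of the DP update in block 104.

    local_likelihood[w] : NumPy array of l_w(i, j) for j = 1..J_w
    prev_cumulative[w]  : NumPy array of g_w(i-1, j), with pruned states set to 0.0
    Returns the arrays of g_w(i, j).
    """
    new_cumulative = {}
    for w, l_w in local_likelihood.items():
        g_prev = prev_cumulative[w]
        g_new = np.zeros_like(l_w)
        for j in range(len(l_w)):
            # Assumed transitions: stay in state j or advance from state j-1.
            best_pred = g_prev[j] if j == 0 else max(g_prev[j], g_prev[j - 1])
            g_new[j] = l_w[j] * best_pred
        new_cumulative[w] = g_new
    return new_cumulative
```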

[0013]

[Equation 1]

The cumulative likelihood output unit 105 compares the frame index i with a predetermined K. If i ≤ K, it finds the M-th largest cumulative likelihood, takes it as the threshold θ for candidate pruning, and computes the difference d between this threshold and the maximum likelihood. So that the average can be computed later, the accumulated value S_d of d is updated as S_d = S_d + d (steps 2, 6, and 7).

[0014] Note that S_d is initialized to 0 before the first frame.

[0015] When i = K, the average D = S_d / K of the differences over the K frames between the maximum likelihood and the candidate pruning threshold is computed (step 3).

[0016] When i > K, the threshold for candidate pruning is set to θ = g_max - D, where g_max is the maximum cumulative likelihood in the i-th frame (step 4).

[0017] In each frame, the cumulative likelihood output unit 105 outputs to the cumulative likelihood storage unit only those recognition path candidates whose likelihood is greater than the cumulative likelihood threshold θ (step 5).
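Steps 2 to 7 of FIG. 2, described in paragraphs [0013] to [0017], can be summarised in the following sketch; the class name and attribute names are illustrative, and the strict ">" comparison in the last line follows the wording of step 5.

```python
import heapq

class AdaptivePruner:
    """Candidate pruning of block 105: learn the beam width over the first K frames,
    then reuse it as a fixed offset below the per-frame maximum."""

    def __init__(self, K, M):
        self.K = K          # number of learning frames
        self.M = M          # rank defining the beam during the learning frames
        self.sum_d = 0.0    # S_d, initialised to 0 before the first frame
        self.D = None       # average gap, fixed once frame K is reached

    def prune(self, i, likelihoods):
        """Return the cumulative likelihoods kept at frame i (1-based)."""
        g_max = max(likelihoods)
        if i <= self.K:
            # Steps 2, 6, 7: the M-th largest likelihood is the threshold theta,
            # and the gap d = g_max - theta is accumulated.
            theta = heapq.nlargest(min(self.M, len(likelihoods)), likelihoods)[-1]
            self.sum_d += g_max - theta
            if i == self.K:
                self.D = self.sum_d / self.K          # step 3: D = S_d / K
        else:
            theta = g_max - self.D                    # step 4: theta = g_max - D
        return [g for g in likelihoods if g > theta]  # step 5
```

A call sequence such as `pruner = AdaptivePruner(K=10, M=100)` followed by `kept = pruner.prune(i, scores)` in every frame reproduces the loop of FIG. 2 for an illustrative choice of K and M.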

[0018] If the current frame is the final frame, the cumulative likelihood output unit 105 outputs to the result output unit 106 all recognition path candidates that have reached the end point of a standard pattern. The result output unit 106 finds the recognition path candidate with the largest cumulative likelihood and outputs the recognition result (steps 8 and 9).

[0019] The present embodiment has been described above using the example in which the mean value of the difference between the cumulative likelihood of the M-th candidate and the maximum cumulative likelihood is determined over the first K frames of the input speech. More generally, however, for each of an arbitrary number L_max of partial sequences li_1, li_2, ..., li_Lmax (L_max ≥ 1) of the input speech (these partial sequences will tentatively be called learning intervals), the above mean value can be determined, and between learning interval li_k and the next learning interval li_(k+1) the candidate pruning threshold can be set using the mean difference obtained in li_k.
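A sketch of this generalised scheme follows; the interval representation (a list of (start, end) frame indices) and the function name are illustrative assumptions, and the first learning interval is assumed to begin at the first frame so that D is always defined when it is needed.

```python
def pruning_thresholds(scores_per_frame, learning_intervals, M):
    """Compute the pruning threshold theta for every frame.

    learning_intervals : list of (start, end) frame index pairs li_1 .. li_Lmax,
                         assumed non-overlapping and given in temporal order.
    """
    thresholds = []
    D = None
    running = 0.0
    for i, scores in enumerate(scores_per_frame):
        g_max = max(scores)
        interval = next(((s, e) for s, e in learning_intervals if s <= i <= e), None)
        if interval is not None:
            s, e = interval
            if i == s:
                running = 0.0                         # start of a new learning interval
            theta = sorted(scores, reverse=True)[min(M, len(scores)) - 1]
            running += g_max - theta
            if i == e:
                D = running / (e - s + 1)             # re-estimate the average gap
        else:
            theta = g_max - D                         # reuse D from the last interval
        thresholds.append(theta)
    return thresholds
```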

[0020]

[Effects of the Invention] As described above, in the speech recognition device according to the present invention, even when the difference between the cumulative likelihood of the M-th candidate and the maximum cumulative likelihood varies greatly with fluctuations in the ambient noise environment or with changes in the setting of the beam width M, the mean value of this difference is determined from a portion of the input and then used to decide the pruning threshold. Candidate pruning comparable to keeping the candidates up to the M-th cumulative likelihood rank is therefore performed over all sections of the input, so that stable search efficiency is obtained without increasing the amount of processing needed to determine the pruning threshold.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the speech recognition device of the present invention.

FIG. 2 is a flowchart showing the processing flow of the embodiment of the speech recognition device shown in FIG. 1.

[Explanation of Symbols]

101 Feature extraction unit
102 Standard pattern storage unit
103 Cumulative likelihood storage unit
104 Recurrence formula calculation unit
105 Cumulative likelihood output unit
106 Result output unit

Claims (3)

[Claims]

1. A speech recognition device characterized in that, in the first several frames of the input speech, the difference between the maximum likelihood and the likelihood of the candidate at a fixed rank is determined, and thereafter said difference is used to set a threshold for candidate pruning in a beam search.
2. A speech recognition device comprising: a feature extraction unit which analyzes a speech signal and outputs a time series of feature vectors; a standard pattern storage unit which stores standard patterns created in advance; a cumulative likelihood storage unit which holds cumulative likelihoods; a recurrence formula calculation unit which obtains new cumulative likelihoods from the cumulative likelihoods stored in said cumulative likelihood storage unit, the time series of feature vectors, and the standard patterns; a cumulative likelihood output unit which, for a certain partial sequence of the feature vector time series, outputs a fixed number of the cumulative likelihoods obtained by the recurrence formula calculation unit and accumulates the difference between the maximum cumulative likelihood and the likelihood at said fixed rank, and which, for the subsequent partial sequences, determines the cumulative likelihoods to be output by means of a threshold derived from the accumulated likelihood differences; and a result output unit which obtains the recognition result for the speech signal from the cumulative likelihoods output by said cumulative likelihood output unit.
3. A threshold setting method for a speech recognition device, characterized in that, for each of an arbitrary number of partial sequences of the input speech, the mean value of the difference between the cumulative likelihood of the M-th candidate and the maximum cumulative likelihood is determined, and during the following partial sequence the candidate pruning threshold is set using the mean difference obtained in the preceding partial sequence.
JP7136725A 1995-06-02 1995-06-02 Voice recognition device Expired - Lifetime JP2853731B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP7136725A JP2853731B2 (en) 1995-06-02 1995-06-02 Voice recognition device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP7136725A JP2853731B2 (en) 1995-06-02 1995-06-02 Voice recognition device

Publications (2)

Publication Number Publication Date
JPH08328583A true JPH08328583A (en) 1996-12-13
JP2853731B2 JP2853731B2 (en) 1999-02-03

Family

ID=15182045

Family Applications (1)

Application Number Title Priority Date Filing Date
JP7136725A Expired - Lifetime JP2853731B2 (en) 1995-06-02 1995-06-02 Voice recognition device

Country Status (1)

Country Link
JP (1) JP2853731B2 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6461660B2 (en) 2015-03-19 2019-01-30 株式会社東芝 Detection apparatus, detection method, and program

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7054935B2 (en) 1998-02-10 2006-05-30 Savvis Communications Corporation Internet content delivery network
US9203636B2 (en) 2001-09-28 2015-12-01 Level 3 Communications, Llc Distributing requests across multiple content delivery networks based on subscriber policy
US8924466B2 (en) 2002-02-14 2014-12-30 Level 3 Communications, Llc Server handoff in content delivery network
US9167036B2 (en) 2002-02-14 2015-10-20 Level 3 Communications, Llc Managed object replication and delivery
US9992279B2 (en) 2002-02-14 2018-06-05 Level 3 Communications, Llc Managed object replication and delivery
US10979499B2 (en) 2002-02-14 2021-04-13 Level 3 Communications, Llc Managed object replication and delivery
US8930538B2 (en) 2008-04-04 2015-01-06 Level 3 Communications, Llc Handling long-tail content in a content delivery network (CDN)
US9762692B2 (en) 2008-04-04 2017-09-12 Level 3 Communications, Llc Handling long-tail content in a content delivery network (CDN)
US10218806B2 (en) 2008-04-04 2019-02-26 Level 3 Communications, Llc Handling long-tail content in a content delivery network (CDN)
US10924573B2 (en) 2008-04-04 2021-02-16 Level 3 Communications, Llc Handling long-tail content in a content delivery network (CDN)

Also Published As

Publication number Publication date
JP2853731B2 (en) 1999-02-03

Similar Documents

Publication Publication Date Title
US8612225B2 (en) Voice recognition device, voice recognition method, and voice recognition program
US9536525B2 (en) Speaker indexing device and speaker indexing method
US7039588B2 (en) Synthesis unit selection apparatus and method, and storage medium
US6980955B2 (en) Synthesis unit selection apparatus and method, and storage medium
US7054810B2 (en) Feature vector-based apparatus and method for robust pattern recognition
US8630853B2 (en) Speech classification apparatus, speech classification method, and speech classification program
US20020184020A1 (en) Speech recognition apparatus
US7319960B2 (en) Speech recognition method and system
JP4531166B2 (en) Speech recognition method using reliability measure evaluation
US20110077943A1 (en) System for generating language model, method of generating language model, and program for language model generation
US20030200086A1 (en) Speech recognition apparatus, speech recognition method, and computer-readable recording medium in which speech recognition program is recorded
US20130185068A1 (en) Speech recognition device, speech recognition method and program
JPH08234788A (en) Method and equipment for bias equalization of speech recognition
US7010483B2 (en) Speech processing system
JPH0372999B2 (en)
JPH0962291A (en) Pattern adaptive method using describing length minimum reference
JP2751856B2 (en) Pattern adaptation method using tree structure
JPH08328583A (en) 1996-12-13 Speech recognition device
Gangadharaiah et al. A novel method for two-speaker segmentation.
Shinozaki et al. Hidden mode HMM using bayesian network for modeling speaking rate fluctuation
JP4659541B2 (en) Speech recognition apparatus and speech recognition program
US7912715B2 (en) Determining distortion measures in a pattern recognition process
JP3353334B2 (en) Voice recognition device
KR101134450B1 (en) Method for speech recognition
JPH09305195A (en) Speech recognition device and speech recognition method

Legal Events

Date Code Title Description
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 19981021