JPS58159595A - Monosyllabic voice recognition system - Google Patents

Monosyllabic voice recognition system

Info

Publication number
JPS58159595A
Authority
JP
Japan
Prior art keywords
monosyllabic
matching
speech
candidate
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP57030033A
Other languages
Japanese (ja)
Inventor
教幸 藤本
佐藤 泰雄
大山 隆之
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP57030033A
Publication of JPS58159595A

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention] (a) Technical field of the invention: The present invention relates to a monosyllabic speech recognition method for recognizing speech.

(b) Background of the technology: As speech recognition technology has improved in recent years, there has been demand for speech recognition devices that make few errors when recognizing a speaker's voice. Speech recognition methods mainly convert the speaker's monosyllabic speech into feature parameters and store them in advance, then match the feature parameters of the unknown input monosyllabic speech against the stored feature parameters and recognize the most similar entry as the corresponding monosyllabic speech. However, even for the same monosyllabic speech, the feature parameters change with the manner of utterance, and even if the same monosyllabic speech is registered several times with different manners of utterance, it is difficult to reduce errors to zero. In particular, for monosyllabic speech whose feature parameters are prone to recognition errors, the recognition rate cannot be improved unless the matching method is chosen with care. For this reason, a monosyllabic speech recognition method has been proposed that aims to improve the recognition rate as follows: all monosyllabic speech registered in advance is matched against the speaker's monosyllabic speech; based on the matching results, a plurality of re-matching candidates are selected from the registered monosyllabic speech in order, starting from the one most similar to the unknown input monosyllabic speech; and the unknown input monosyllabic speech is re-matched against those candidates using re-matching parameters determined by the combination of the candidates. However, this re-matching method has room for improvement, and countermeasures are desired.

(c) Object of the invention: In response to the above demand, the object of the present invention is, in a monosyllabic speech recognition method that uses the above re-matching method, to narrow down the number of re-matching candidates so as to shorten the time required for re-matching, and to simplify the configuration of the device so as to improve economy.

(d) Structure of the invention: In the present invention, monosyllabic speech is registered in advance. The feature parameters of the unknown input monosyllabic speech are matched by dynamic programming (DP) matching against the feature parameters of all registered monosyllabic speech, and a plurality of re-matching candidates are selected from the registered monosyllabic speech in descending order of similarity. The unknown input monosyllabic speech is then re-matched against the re-matching candidates using re-matching parameters determined by the combination of the candidates, and the most similar re-matching candidate is recognized as the corresponding monosyllabic speech. When the re-matching candidates are selected, however, if the similarity between the first-ranked re-matching candidate from DP matching and the unknown input monosyllabic speech is at or above the threshold defined for that first-ranked candidate, the re-matching step is omitted and the first-ranked candidate is output as the recognition result. This shortens the monosyllabic speech recognition time and simplifies the re-matching circuit.
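As an illustration only, the control flow described in (d) can be sketched in Python as follows. The function name `recognize`, the similarity callbacks `first_pass` and `second_pass`, the per-candidate `thresholds` table, and the default of three candidates are all assumptions made for this sketch; the publication describes the flow but does not specify these details.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]


def recognize(
    unknown: Features,
    templates: Dict[str, Features],
    first_pass: Callable[[Features, Features], float],
    second_pass: Callable[[Features, Features, List[str]], float],
    thresholds: Dict[str, float],
    n_candidates: int = 3,
) -> Tuple[str, bool]:
    """Return (label, rematched). Both callbacks return a similarity,
    where a higher value means the two inputs are more alike."""
    # First pass: score the unknown input against every registered template.
    scores = {label: first_pass(unknown, feats) for label, feats in templates.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    candidates = ranked[:n_candidates]
    top = candidates[0]

    # Shortcut: if the first-ranked candidate reaches the threshold defined
    # for that candidate, skip re-matching and output it directly.
    if scores[top] >= thresholds[top]:
        return top, False

    # Otherwise re-match the unknown input against the candidate set only.
    # The candidate labels are passed so the second pass can pick
    # parameters that depend on the combination of candidates.
    best = max(
        candidates,
        key=lambda label: second_pass(unknown, templates[label], candidates),
    )
    return best, True
```

The second element of the returned tuple only records whether the re-matching pass ran, which makes the time saving from the shortcut easy to observe in an experiment.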

(e) Embodiment of the invention: The figure is a block diagram of a circuit showing one embodiment of the present invention.

First, to register monosyllabic speech in advance, the speaker has the control unit 8 connect the switching unit 3 to the parameter storage unit 4, and then supplies monosyllabic speech at the input.

The preprocessing unit 1 performs audio level adjustment, analog-to-digital conversion, and similar processing, and sends the result to the parameter extraction unit 2. The parameter extraction unit 2 extracts the feature parameters of the monosyllabic speech and stores them in the parameter storage unit 4. Next, to have monosyllabic speech recognized, the speaker has the control unit 8 connect the switching unit 3 to the storage unit 5, and utters a monosyllable. By the same operation as above, the feature parameters of the unknown input monosyllabic speech pass through the preprocessing unit 1, the parameter extraction unit 2, and the switching unit 3 into the storage unit 5. Under the control of the control unit 8, the matching unit 6 then DP-matches them against the feature parameters of all monosyllabic speech stored in the parameter storage unit 4. The monosyllabic speech whose feature parameters are most similar is selected as the first-ranked re-matching candidate, further re-matching candidates are selected in order after it, and the candidates are sent to the determination unit 7.
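The DP matching and candidate ranking performed by the matching unit 6 can be pictured with a minimal dynamic time warping sketch. This is an assumption-laden illustration, not the circuit in the figure: the publication does not give the local distance, the path constraints, or the number of candidates, so the city-block frame distance, the three-way step pattern, and the names `dp_distance` and `rank_candidates` are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """City-block distance between two feature frames (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dp_distance(x: Sequence[Frame], y: Sequence[Frame]) -> float:
    """Accumulated distance along the best time alignment of x and y."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] is the best accumulated distance aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_candidates(
    unknown: Sequence[Frame],
    templates: Dict[str, Sequence[Frame]],
    n_candidates: int = 3,
) -> List[Tuple[str, float]]:
    """Return the n closest registered templates, smallest distance first."""
    scored = [(label, dp_distance(unknown, frames)) for label, frames in templates.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:n_candidates]
```

Because the alignment may repeat frames, an utterance that is merely slower than its registered template still scores a distance of zero against it, which is the property that makes DP matching suitable for the first pass.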

The determination unit 7 judges similarity from the distance, calculated by the matching unit 6, between the unknown input monosyllabic speech and each re-matching candidate. That is, if the similarity between the first-ranked re-matching candidate selected by the matching unit 6 and the unknown input monosyllabic speech is greater than the threshold predetermined for that first-ranked candidate, so that the candidate can almost certainly be taken to be the unknown input monosyllabic speech, the candidate is sent to the output via the control unit 8 as the recognition result. If, however, the similarity is smaller than the threshold, so that it would be risky to take the first-ranked candidate to be the unknown input monosyllabic speech, recognition is carried out by the re-matching operation. In that case the control unit 8 has the feature parameters corresponding to the re-matching candidates sent from the parameter storage unit 4 to the multiplier 10, and the feature parameters of the unknown input monosyllabic speech held in the storage unit 5 sent to the multiplier 11. The determination unit 7 has the re-matching parameters determined by the re-matching candidates sent from the frequency weight storage unit 12 to the multipliers 10 and 11; these parameters are frequency weights that emphasize the frequency band components suited to distinguishing the candidates from one another and attenuate the other frequency band components. The determination unit 7 also has a threshold, a parameter that determines the optimal matching interval for the re-matching candidates, sent from the threshold storage unit 13 to the re-matching unit 9. The re-matching unit 9 performs re-matching using the outputs of the multipliers 10 and 11 and the threshold from the threshold storage unit 13. The re-matching candidates are re-matched against the unknown input monosyllabic speech in order, starting from the first-ranked candidate, and the most similar candidate is sent from the control unit 8 to the output as the recognition result.

(f) Effect of the invention: As described in detail above, in a monosyllabic speech recognition method that uses re-matching, the present invention narrows down the number of re-matching candidates, shortens the time required for re-matching, and simplifies the components involved in the re-matching operation, which improves economy. Its effect is therefore considerable.
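The re-matching step of the embodiment, in which the multipliers 10 and 11 apply the same frequency weights to the candidate and to the unknown input before they are compared, can be sketched as follows. This is a simplified illustration under stated assumptions: each utterance is reduced to a single vector of per-band values, the weight table is keyed by the set of candidate labels, and the matching-interval threshold from the threshold storage unit 13 is left out because the publication does not say how it is applied. The names `apply_weights`, `rematch`, and `weight_table` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence


def apply_weights(features: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Scale each frequency band by its weight (the role of a multiplier)."""
    return [f * w for f, w in zip(features, weights)]


def rematch(
    unknown: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    weight_table: Dict[FrozenSet[str], Sequence[float]],
) -> str:
    """Return the candidate closest to the unknown input after weighting."""
    # The weights depend on which candidates are being told apart.
    weights = weight_table[frozenset(candidates)]
    weighted_unknown = apply_weights(unknown, weights)

    def distance(label: str) -> float:
        weighted_candidate = apply_weights(candidates[label], weights)
        return sum(abs(u - c) for u, c in zip(weighted_unknown, weighted_candidate))

    return min(candidates, key=distance)
```

With weights that emphasize the one band in which two similar syllables differ, the weighted comparison can overturn a first-pass ranking that was dominated by bands the two syllables share.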

[Brief description of the drawing]

The figure is a block diagram of a circuit showing one embodiment of the present invention. 1 is the preprocessing unit, 2 the parameter extraction unit, 3 the switching unit, 4 the parameter storage unit, 5 the storage unit, 6 the matching unit, 7 the determination unit, 8 the control unit, 9 the re-matching unit, 10 and 11 the multipliers, 12 the frequency weight storage unit, and 13 the threshold storage unit.

Claims (1)

[Claims] A monosyllabic speech recognition method for a speech recognition device that matches all monosyllabic speech registered in advance against unknown input monosyllabic speech, selects a plurality of re-matching candidates from the registered monosyllabic speech based on the matching results, and selects a re-matching parameter for each combination of the re-matching candidates in order to re-match them against the unknown input monosyllabic speech, characterized in that, when the re-matching candidates are selected, re-matching is omitted if the similarity of the first-ranked re-matching candidate is at or above a threshold defined for each re-matching candidate.
JP57030033A 1982-02-26 1982-02-26 Monosyllabic voice recognition system Pending JPS58159595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57030033A JPS58159595A (en) 1982-02-26 1982-02-26 Monosyllabic voice recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57030033A JPS58159595A (en) 1982-02-26 1982-02-26 Monosyllabic voice recognition system

Publications (1)

Publication Number Publication Date
JPS58159595A (en) 1983-09-21

Family

ID=12292501

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57030033A Pending JPS58159595A (en) 1982-02-26 1982-02-26 Monosyllabic voice recognition system

Country Status (1)

Country Link
JP (1) JPS58159595A (en)
