JPS59216198A

JPS59216198A - Voiced/unvoiced determination system for speech

Info

Publication number: JPS59216198A
Application number: JP9213583A
Authority: JP
Inventors: 清水　雅久
Original assignee: Sanyo Electric Co Ltd; Sanyo Denki Co Ltd
Current assignee: Sanyo Electric Co Ltd; Sanyo Denki Co Ltd
Priority date: 1983-05-24
Filing date: 1983-05-24
Publication date: 1984-12-06
Also published as: JPH0420198B2

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

(a) Field of Industrial Application

The present invention relates to a voiced/unvoiced determination method for speech, used in a speech analysis apparatus.

(b) Prior Art

At present, the PARCOR (partial autocorrelation) method is the mainstream approach for speech synthesizers; its outline is shown in Fig. 1. In the figure, (1) is a voiced sound source generating circuit that generates a periodic pulse source signal simulating the periodicity of human vocal cord vibration, and (2) is an unvoiced sound source generating circuit that generates a noise source signal simulating the aperiodic turbulent vibration in the human trachea. (3) is a selection switch that selects either the periodic pulse source signal from the voiced sound source generating circuit (1) or the noise source signal from the unvoiced sound source generating circuit (2). (4) is a multiplier that gives the source signal from the switch (3) its original amplitude component. (5) is a digital filter that simulates the acoustic characteristics of the human vocal tract; by filtering the voiced or unvoiced source signal from the multiplier (4), it outputs a voiced or unvoiced speech waveform signal. (6) is a D-A converter that converts the digital speech waveform signal from the digital filter (5) into an analog value, and (7) is a speaker that utters synthesized speech based on the speech waveform signal from the D-A converter (6). (8) is a parameter memory. For every frame period it stores the pitch parameter P, which sets the period of the periodic pulse source signal in the voiced sound source generating circuit (1) and also directs the source selection at the selection switch (3); the amplitude parameter A, which sets the amplitude component at the multiplier (4); and the PARCOR coefficients K (= (K1, K2, ..., K10)), which determine the filter characteristics of the digital filter (5).

Accordingly, when unvoiced speech is synthesized, the value of the pitch parameter P is "0". The selection switch (3) detects this value "0" and introduces the noise source signal from the unvoiced sound source generating circuit (2) into the multiplier (4), where it is multiplied by the value of the amplitude parameter A to give the unvoiced source signal. When voiced speech is synthesized, on the other hand, the pitch parameter P indicates the pitch period by a nonzero value. From this value P the voiced sound source generating circuit (1) sets the period of its periodic pulse source signal, while the selection switch (3) detects that P is not "0" and introduces the periodic pulse source signal from the voiced sound source generating circuit (1) into the multiplier (4), where it is multiplied by the value of the amplitude parameter A to give the voiced source signal.

The voiced or unvoiced source signal obtained in this way is filtered by the digital filter (5), whose filter characteristics are controlled by the PARCOR coefficients K1, K2, ..., K10, and is then D-A converted (6), so that voiced or unvoiced synthesized speech is uttered from the speaker (7).
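The synthesizer of Fig. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the uniform noise source, the unit pulse height and the sign convention of the lattice filter are all hypothetical choices, since the text specifies none of them.

```python
import random


def excitation(pitch_p, n_samples, rng):
    """Mimic switch (3): P == 0 selects noise, otherwise a pulse train of period P."""
    if pitch_p == 0:
        return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    return [1.0 if n % pitch_p == 0 else 0.0 for n in range(n_samples)]


def lattice_synthesis(source, k):
    """All-pole lattice filter controlled by PARCOR (reflection) coefficients k.

    The sign convention is an assumption; conventions differ between texts.
    """
    order = len(k)
    b = [0.0] * (order + 1)  # backward errors left over from the previous sample
    out = []
    for e in source:
        f = e
        for i in range(order - 1, -1, -1):
            f = f - k[i] * b[i]         # forward error of the next lower stage
            b[i + 1] = b[i] + k[i] * f  # backward error kept for the next sample
        b[0] = f
        out.append(f)
    return out


def synthesize_frame(pitch_p, amp_a, k, n_samples, rng=None):
    """One frame: pick the source from P, scale by A (multiplier 4), filter with K (filter 5)."""
    rng = rng or random.Random(0)
    scaled = [amp_a * s for s in excitation(pitch_p, n_samples, rng)]
    return lattice_synthesis(scaled, k)
```

With all coefficients set to zero the filter is transparent, so `synthesize_frame(4, 2.0, [0.0] * 10, 8)` simply returns a pulse train of height 2 and period 4, which makes the role of P and A easy to see.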

In such a speech synthesizer, the parameters P, A and K1 to K10 to be stored in the parameter memory (8) must be extracted in advance by analyzing the waveform signal of the original speech at every frame period of about 10 msec. The conventional apparatus for this analysis, shown in Fig. 2, obtained the parameters P, A and K1 to K10 using the following units: a correlator (11) that calculates the 0th- to 140th-order autocorrelation function values V0 to V140 of the speech waveform signal S; a PARCOR coefficient extractor (12) that derives the 1st- to 10th-order PARCOR coefficients K1 to K10 from the 0th- to 10th-order autocorrelation function values V0 to V10 supplied by the correlator (11); a modified correlator (13) that, separately, performs modified correlation processing based on the autocorrelation function values V11 to V140 from the correlator (11) and the ten PARCOR coefficients K1 to K10 from the PARCOR coefficient extractor (12), and derives the modified correlation function W(τ) of the speech waveform signal; a maximum value detector (14) that finds the maximum value ρm of the correlation function W(τ) from the modified correlator (13) and outputs the delay time τ at that maximum as the pitch parameter P; and an amplitude parameter extractor that calculates the amplitude parameter A by distributing, per pitch period according to the pitch parameter P from the maximum value detector (14), the residual power E per frame period obtained in the course of extracting the PARCOR coefficients K1 to K10 in the PARCOR coefficient extractor (12).

The voiced/unvoiced determination, that is, whether or not to set the value of the pitch parameter P to "0", is made by a voiced/unvoiced determination unit. As described in detail in Japanese Patent Publication No. 55-34956, the determination condition is as follows. The maximum value ρm of the modified correlation function W(τ) from the maximum value detector (14) is added to 0.5 times the first-order PARCOR coefficient K1 from the PARCOR coefficient extractor (12), and the resulting value $\rho_m + 0.5 K_1$ is compared with a specific threshold t. When $\rho_m + 0.5 K_1 \ge t$ the speech is determined to be voiced; conversely, when $\rho_m + 0.5 K_1 < t$ it is determined to be unvoiced, and a determination signal U based on this result is output.

In this conventional PARCOR-type speech analysis method, however, the PARCOR coefficients K1 to K10, which are the principal parameters and carry the most information, require the correlator (11) to calculate only the 0th- to 10th-order autocorrelation function values V0 to V10. By contrast, the pitch parameter P and the voiced/unvoiced determination signal U, which carry comparatively little information, require the correlator (11) to calculate in addition the 11th- to 140th-order autocorrelation function values V11 to V140. This enormous amount of computation, together with the even larger amount of computation in the modified correlator (13) that uses the values V11 to V140, accounted for the greater part of the computation required by the analysis process.
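The conventional criterion and the cost argument can be made concrete with a small Python sketch. The names, the threshold value used in the example and the 256-sample analysis window are assumptions for illustration; the text gives neither a value for t nor a window length.

```python
def conventional_is_voiced(rho_m, k1, t):
    """Conventional rule: voiced when rho_m + 0.5 * K1 reaches the threshold t."""
    return rho_m + 0.5 * k1 >= t


def autocorr_multiplications(window_len, max_lag):
    """Multiply count for a direct autocorrelation over lags 0..max_lag."""
    return sum(window_len - lag for lag in range(max_lag + 1))


# Hypothetical 256-sample analysis window.
cost_parcor_only = autocorr_multiplications(256, 10)    # V0..V10
cost_with_pitch = autocorr_multiplications(256, 140)    # V0..V140
```

Under these assumptions the correlator alone needs 2761 multiplications for V0 to V10 but 26226 for V0 to V140, roughly 9.5 times as many, before the modified correlator adds its own load.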

Consequently, there are limits to how far a speech analysis apparatus using this conventional method can be miniaturized and how far its analysis time can be shortened.

With a microcomputer-class computing system, real-time speech analysis was impossible.

On the other hand, methods have long been proposed for extracting the pitch parameter P or the voiced/unvoiced determination signal U directly from the speech waveform signal in real time. Adopting one of these makes the modified correlator (13) of the conventional speech analysis method unnecessary and also greatly reduces the amount of computation in the correlator (11). The conventional voiced/unvoiced determination methods in particular, however, have low determination accuracy, so that voiced and unvoiced regions of speech are frequently misidentified, which has the disadvantage of degrading the quality of the synthesized speech obtained from a speech synthesizer such as that shown in Fig. 1.

(c) Object of the Invention

The present invention has been made in view of the above points, and provides a voiced/unvoiced determination method for speech that is highly accurate and can perform the determination in real time.

(d) Structure of the Invention

In the voiced/unvoiced determination method of the present invention, let Lmax be the maximum value of the low-band component signal of the speech waveform signal, V0 the 0th-order autocorrelation function value of the speech waveform signal, K1 the first-order PARCOR coefficient, and a, b, c and d mutually different constants. The speech is determined to be unvoiced when any one of the following expressions is satisfied, and voiced otherwise:

(i) $L_{\max} \le a$
(ii) $L_{\max} > a$ and $K_1 < b$
(iii) $L_{\max} > a$ and $b \le K_1 < c$ and $V_0 < d$
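The three conditions translate directly into a short decision function. The Python sketch below is illustrative only; the function name and the constants used in the example call are hypothetical, since the values of a, b, c and d are to be chosen experimentally.

```python
def is_unvoiced(l_max, k1, v0, a, b, c, d):
    """Return True when any of conditions (i) to (iii) holds, False (voiced) otherwise."""
    if l_max <= a:                   # (i) low-band component is small
        return True
    if k1 < b:                       # (ii) low one-sample correlation: noise-like
        return True
    if b <= k1 < c and v0 < d:       # (iii) moderately correlated but low power
        return True
    return False


# Hypothetical constants, for illustration only.
frame_is_unvoiced = is_unvoiced(l_max=0.5, k1=0.3, v0=0.001,
                                a=0.1, b=0.0, c=0.875, d=0.01)
```

Because each test is a single comparison on values the analyzer already holds, the decision costs a handful of operations per frame.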

(e) Embodiment

Fig. 3 shows a speech analysis apparatus employing the voiced/unvoiced determination method of the present invention. In the figure, (21) is a correlator that calculates the 0th- to 10th-order autocorrelation function values V0 to V10 of the PCM-coded speech waveform signal S, and (22) is a PARCOR coefficient extractor that derives the 1st- to 10th-order PARCOR coefficients from the autocorrelation function values V0 to V10 supplied by the correlator (21). (23) is a low-pass filter of digital-filter construction that cuts off the high-band components of the PCM-coded speech waveform signal S; it passes the low-band component signal L at or below 100 Hz, where the pitch, the fundamental frequency of speech, lies, and removes the influence of the formants lying above 100 Hz. (24) is a pitch parameter extractor. It converts the difference values of the low-band component signal L from the low-pass filter (23) into a ternary signal "+1", "-1" or "0" according to whether each difference is positive, negative or zero, calculates the autocorrelation function value W(τ) of this ternary signal, finds the delay time τ at which this value reaches the maximum closest to "1", and outputs this value τ as the pitch parameter P. (25) is an amplitude parameter extractor that calculates the amplitude parameter A by distributing, per pitch period according to the pitch parameter P from the pitch parameter extractor (24), the residual power E per frame period obtained in the course of extracting the PARCOR coefficients K1 to K10.

(26) is the voiced/unvoiced determination unit according to the present invention. It makes the voiced/unvoiced determination using three quantities: the power value V0 of the speech waveform signal, given by the 0th-order value among the autocorrelation function values V0 to V10 calculated by the correlator (21); the correlation K1 of the speech waveform signal at a delay of one sample, given by the first-order coefficient among the PARCOR coefficients K1 to K10 derived by the PARCOR coefficient extractor (22); and the low-band component signal L of the speech waveform signal S obtained from the low-pass filter (23). In other words, the determination is made without any special function or parameter, using only the power V0 and the correlation K1, which are indispensable to PARCOR-type speech analysis, and the low-band component signal L, which is needed to extract the pitch parameter P.

The voiced/unvoiced determination unit (26) comprises a maximum value detector (27) that detects the maximum value Lmax of the low-band component signal L within one frame; a first comparator (28) that compares the maximum value Lmax from the maximum value detector (27) with a specific value a; a second comparator (29) that compares the correlation K1 from the PARCOR coefficient extractor (22) with a specific value b; a third comparator (30) that likewise compares the correlation K1 with a specific value c (b < c); and a fourth comparator (31) that compares the power V0 from the correlator (21) with a specific value d. The speech waveform signal S of the frame is determined to be unvoiced in any of the following cases: when the first comparator (28) detects Lmax ≤ a, so that the low-frequency component of the speech waveform signal S is judged to be small; when the second comparator (29) detects K1 < b, so that the randomness of the speech waveform signal S is judged to be high; or when the third comparator (30) detects K1 < c and the fourth comparator (31) detects V0 < d, so that the power is small even though the randomness of the speech waveform signal is somewhat low.

The optimum ranges of the constants were determined experimentally (the ranges, and the example values of a and d, are illegible in this text). For example, with b = 0 and c = 0.875:

(i) $L_{\max} \le a$
(ii) $L_{\max} > a$ and $K_1 < 0$
(iii) $L_{\max} > a$ and $0 \le K_1 < 0.875$ and $V_0 < d$

When any one of these three expressions is satisfied the frame is determined to be unvoiced, and otherwise voiced, with an accuracy close to 100%, and the determination signal U is output.
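The quantities V0 and K1 used by the determination unit fall out of the ordinary PARCOR analysis. The Python sketch below shows one common way to obtain them, the Levinson-Durbin recursion; the text does not say which algorithm the PARCOR coefficient extractor uses, so the recursion, the sign convention and the function names are assumptions.

```python
def autocorr(x, max_lag):
    """Autocorrelation values V0..V_max_lag of one frame (role of correlator 21)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def parcor(v):
    """Levinson-Durbin recursion: returns (K1..Kn, residual power E) from V0..Vn."""
    order = len(v) - 1
    a = []        # predictor coefficients of the current order
    k = []        # PARCOR (reflection) coefficients
    err = v[0]    # prediction error power, starts at V0
    for i in range(1, order + 1):
        if err <= 0.0:
            # Silent or degenerate frame: pad with zeros instead of dividing by zero.
            k.extend([0.0] * (order - i + 1))
            break
        acc = v[i] - sum(a[j] * v[i - 1 - j] for j in range(i - 1))
        ki = acc / err
        a = [a[j] - ki * a[i - 2 - j] for j in range(i - 1)] + [ki]
        err *= 1.0 - ki * ki
        k.append(ki)
    return k, err
```

With this convention K1 equals V1 / V0, the normalized correlation at a delay of one sample, and the returned residual power corresponds to the quantity E that the amplitude parameter extractor distributes per pitch period.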

For reference, Fig. 4 shows a concrete configuration of the low-pass filter (23) and the pitch parameter extractor (24). The low-pass filter (23) in the figure has a digital-filter configuration comprising two adders, one multiplier (41) and two delay elements (42). Its transfer function contains a constant e that is determined from the cutoff frequency fc and the sampling frequency fs [the expressions for the transfer function and for e are not reproduced in this text]. Accordingly, if fc is set to about 100 Hz, for example, the formant components of the PCM-coded speech waveform signal S are considerably reduced, and errors in extracting the pitch parameter P can be reduced.
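Because the filter expressions are missing from the text, the following Python sketch substitutes a generic single-pole low-pass filter. It is not the filter of Fig. 4: the recursion, the formula for the coefficient, the 8 kHz sampling rate and all names are assumptions chosen only to show how a cutoff near 100 Hz suppresses higher-frequency content.

```python
import math


def lowpass_coefficient(fc, fs):
    """Coefficient of a generic one-pole low-pass (an assumed formula, not the patent's)."""
    return 1.0 - math.exp(-2.0 * math.pi * fc / fs)


def lowpass(x, e):
    """y[n] = y[n-1] + e * (x[n] - y[n-1])."""
    y_prev = 0.0
    out = []
    for sample in x:
        y_prev = y_prev + e * (sample - y_prev)
        out.append(y_prev)
    return out


# Hypothetical setting: 100 Hz cutoff at 8 kHz sampling.
e = lowpass_coefficient(100.0, 8000.0)
dc_response = lowpass([1.0] * 2000, e)
nyquist_response = lowpass([1.0 if n % 2 == 0 else -1.0 for n in range(2000)], e)
```

A constant input passes essentially unchanged, while an input alternating at half the sampling rate is reduced to a few percent of its amplitude, which is the behaviour the pitch extractor relies on.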

The pitch parameter extractor (24) is provided with a comparison circuit (43) that compares the low-band component signal of the speech waveform signal, obtained from the low-pass filter (23), with the same signal delayed by one sample. As a result of this comparison it outputs a ternary signal, which is autocorrelated in a ternary-signal correlator (44), and the delay time τ at which the correlation function value is greatest is output as the pitch parameter P. The amount of computation needed to autocorrelate the ternary signal is far smaller than that needed to autocorrelate directly a speech waveform signal of, for example, 12 bits per sample, and yet there are almost no errors in extracting the pitch parameter P.
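The ternary pitch extraction can be sketched as follows in Python. The normalization by the zero-lag value, the lag search range and the function names are assumptions; the text states only that the delay giving the maximum correlation closest to "1" is taken as P.

```python
import math


def ternarize(low_band):
    """Sign of each sample-to-sample difference: +1, -1 or 0 (comparison circuit 43)."""
    out = []
    for prev, cur in zip(low_band, low_band[1:]):
        diff = cur - prev
        out.append(1 if diff > 0 else -1 if diff < 0 else 0)
    return out


def pitch_from_ternary(tern, min_lag, max_lag):
    """Return (lag, W) where the normalized autocorrelation W is largest."""
    energy = sum(t * t for t in tern)
    if energy == 0:
        return 0, 0.0  # flat frame: no pitch found
    best_lag, best_w = 0, -1.0
    for lag in range(min_lag, min(max_lag, len(tern) - 1) + 1):
        # Products of ternary values are only -1, 0 or +1, so no real multiplier is needed.
        w = sum(tern[i] * tern[i + lag] for i in range(len(tern) - lag)) / energy
        if w > best_w:
            best_lag, best_w = lag, w
    return best_lag, best_w


# Hypothetical test signal: a sinusoid with a period of 50 samples.
signal = [math.sin(2.0 * math.pi * (n + 0.3) / 50.0) for n in range(400)]
lag, w = pitch_from_ternary(ternarize(signal), 20, 140)
```

For this test signal the search recovers the 50-sample period. In hardware the products reduce to sign logic and counting, which is where the saving over a 12-bit autocorrelation comes from.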

(f) Effects of the Invention

As is clear from the above description, in the voiced/unvoiced determination method of the present invention, let Lmax be the maximum value of the low-band component signal of the speech waveform signal, V0 the 0th-order autocorrelation function value of the speech waveform signal, K1 the first-order PARCOR coefficient, and a, b, c and d mutually different constants. The speech is determined to be unvoiced when any one of the following expressions is satisfied, and voiced otherwise:

(i) $L_{\max} \le a$
(ii) $L_{\max} > a$ and $K_1 < b$
(iii) $L_{\max} > a$ and $b \le K_1 < c$ and $V_0 < d$

Unlike the conventional method, therefore, there is no need for high-order autocorrelation processing of the speech waveform signal, which requires an enormous amount of computation, and the determination can be carried out in real time with a small amount of computation.

Moreover, since several effective quantities, Lmax, K1 and V0, are used as the determination conditions as described above, highly accurate determination is possible.

In addition, since the voiced/unvoiced determination reuses data already handled in the speech analysis process to derive the characteristic parameters of the speech, the amount of computation required for the determination is extremely small, and the speech analysis process can be executed in real time on a small microcomputer-class computing system.

[Brief Description of the Drawings]

Fig. 1 is a block diagram showing the configuration of a general speech synthesizer; Fig. 2 is a block diagram showing the configuration of a conventional speech analysis apparatus; Fig. 3 is a block diagram showing the configuration of a speech analysis apparatus employing the voiced/unvoiced determination method of the present invention; and Fig. 4 is a block diagram of the main part of the speech analysis apparatus according to the method of the present invention.

(21) is the correlator, (22) the PARCOR coefficient extractor, (23) the low-pass filter, (24) the pitch parameter extractor, (25) the amplitude parameter extractor, (26) the voiced/unvoiced determination unit, (27) the maximum value detection circuit, and (28), (29), (30) and (31) the comparison circuits.

Claims

[Claims]

(1) A voiced/unvoiced determination method for speech, characterized in that, using the maximum value Lmax of a low-band component signal obtained from a low-pass filter that cuts off the high-band components of a speech waveform signal, the 0th-order autocorrelation function value V0 obtained from a correlator that calculates the autocorrelation function of the speech waveform signal, the first-order PARCOR coefficient K1 obtained from a PARCOR coefficient extractor that derives the 1st- to nth-order PARCOR coefficients K1 to Kn from the 0th- to nth-order autocorrelation function values obtained by said correlator, and constants a, b, c and d, the speech waveform signal is determined to be an unvoiced speech signal when any one of the following expressions is satisfied, and conversely is determined to be a voiced speech signal when none of them is satisfied:

(i) $L_{\max} \le a$
(ii) $L_{\max} > a$ and $K_1 < b$
(iii) $L_{\max} > a$ and $b \le K_1 < c$ and $V_0 < d$