JPH04248722A - Data coding method

Info

Publication number: JPH04248722A
Application number: JP3014402A
Authority: JP
Inventors: Yasunaga Miyazawa; 宮沢康永
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 1991-02-05
Filing date: 1991-02-05
Publication date: 1992-09-04

Abstract

PURPOSE: To encode input data at high speed in a data coding method that uses vector quantization. CONSTITUTION: The code vectors in a codebook are divided into M(1) kinds of categories 302, 303, and the code vectors belonging to each of those M(1) categories are further divided into M(2) kinds of categories. The division is repeated in the same way up to the N-th stage. The feature vector of each category is the centroid vector of the code vectors belonging to that category. During coding, the tree structure is searched by calculating the distance between the input vector and the feature vector of each category, and the input is coded to the optimum code vector.

Description

[Detailed description of the invention]

[0001]

[Industrial Application Field] The present invention relates to fields that use data compression, such as speech recognition devices, image recognition devices, and digital communications.

[0002]

[Prior Art] Data compression of digital signals by vector quantization has been known, as described in "An Algorithm for Vector Quantizer Design" by Linde, Buzo and Gray (IEEE Transactions on Communications, Vol. COM-28, No. 1, January 1980).

[0003]

[Problems to be Solved by the Invention] However, in conventional vector quantization, encoding input data into one of the T code vectors in the codebook requires T distance calculations, one between the input data and each code vector in the codebook. Encoding therefore takes time, which makes it difficult to perform recognition processing or data communication in real time. The object of the present invention is to solve this problem and speed up the computation of the data compression process.
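The cost described above can be made concrete with a minimal sketch of exhaustive (full-search) vector quantization. The function names, the toy codebook, and the use of squared Euclidean distance are assumptions made for illustration; the patent does not specify a distance measure.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_exhaustive(x, codebook):
    """Return (code, number of distance calculations) for input vector x.

    Every one of the T code vectors is compared with x, so the cost is T.
    """
    best_code, best_dist = None, float("inf")
    for code, vector in enumerate(codebook):
        d = squared_distance(x, vector)
        if d < best_dist:
            best_code, best_dist = code, d
    return best_code, len(codebook)


# Hypothetical 2-dimensional codebook with T = 4 code vectors.
codebook = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0), (5.0, 5.0)]
code, n_calcs = encode_exhaustive((4.2, 3.9), codebook)
```

The number of distance calculations grows linearly with the codebook size, which is the bottleneck the invention targets.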

[0004]

[Means for Solving the Problems] The data encoding method of the present invention is characterized by the following. As a first-stage classification, the T code vectors in the codebook are divided into M(1) kinds of categories. As a second-stage classification, each of the M(1) categories of the first stage is divided into M(2) kinds of categories, giving M(1)*M(2) categories. In the same manner, as an N-th stage classification, each of the M(1)*M(2)*...*M(N-1) categories of the (N-1)-th stage is divided into M(N) kinds of categories, giving M(1)*M(2)*...*M(N) categories. The number of stages N is 3 or more. The feature vector of each category is the centroid vector of the code vectors contained in that category.

When input data is encoded into one of the code vectors in the codebook, a first search selects the one first-stage category, among the M(1) categories, whose feature vector has the minimum distance to the input data. A second search selects the one second-stage category, among the M(2) categories belonging to the selected first-stage category, whose feature vector has the minimum distance to the input data. In the same manner, an N-th search selects the one N-th stage category, among the M(N) categories belonging to the selected (N-1)-th stage category, whose feature vector has the minimum distance to the input data. A final search selects the one code vector, among the code vectors belonging to the selected N-th stage category, that has the minimum distance to the input data, and the input data is associated with the code of that code vector. If the n-th stage category selected in the n-th search (n ≤ N) contains only one code vector, the n-th search is treated as the final search and the input data is associated with the code of that code vector.

[0005]

[Embodiments] (Embodiment 1) An embodiment in which the data encoding method of the present invention is applied to a speech recognition device for word recognition is described below with reference to the drawings.

[0006] FIG. 1 is a system configuration diagram of a speech recognition device that uses the data encoding method of the present invention. Speech uttered by a speaker is input through microphone 1 and converted by A/D converter 2 into a 12-bit digital signal sampled at 16 kHz. Feature extraction unit 3 takes 20 ms as one frame, applies Hamming window processing and linear prediction analysis to each frame, and obtains 14th-order LPC cepstral coefficients as feature parameters. The frame shift is 10 ms. Using the 14th-order feature parameters obtained in this way as the input vector, data compression unit 4 encodes the input vector into one of the code vectors in codebook 5 using the data encoding method of the present invention. Word recognition unit 6 recognizes words by pattern matching, using the HMM method, between the time series of codes produced by data compression unit 4 and the previously trained standard patterns of X words. The standard patterns of the X words are registered in word dictionary 7.
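The framing parameters given above (16 kHz sampling, 20 ms frames, 10 ms shift, Hamming window) translate into sample counts as in the sketch below. It covers only framing and windowing; the linear prediction analysis and LPC cepstrum computation are omitted. All names are illustrative assumptions, and the window formula is the common textbook Hamming definition, which the patent does not spell out.

```python
import math

SAMPLE_RATE_HZ = 16_000
FRAME_MS = 20
SHIFT_MS = 10

FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000    # samples per frame
FRAME_SHIFT = SAMPLE_RATE_HZ * SHIFT_MS // 1000  # samples between frame starts


def hamming(n):
    """Textbook Hamming window of length n (assumed definition)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def windowed_frames(samples):
    """Cut the signal into overlapping frames and apply the window to each."""
    window = hamming(FRAME_LEN)
    frames = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_SHIFT):
        frame = samples[start:start + FRAME_LEN]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# One second of silence as a stand-in for real speech samples.
one_second = [0.0] * SAMPLE_RATE_HZ
frames = windowed_frames(one_second)
```

Each windowed frame would then go through LPC analysis to produce one 14-dimensional input vector for the data compression unit.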

[0007] In this embodiment, codebook 5 contains a total of 256 code vectors, which are classified in six stages. The 256 code vectors were obtained with the LBG algorithm from a set of several hundred thousand frames of feature parameters. These were produced from speech uttered by 20 speakers, sampled at 16 kHz with 12 bits, with 20 ms frames and a 10 ms shift, by applying Hamming window processing and linear prediction analysis to each frame and taking 14th-order LPC cepstral coefficients as the feature parameters. Each code vector has 14 dimensions. The LBG algorithm is the algorithm described in "An Algorithm for Vector Quantizer Design" by Linde, Buzo and Gray (IEEE Transactions on Communications, Vol. COM-28, No. 1, January 1980).

[0008] The six-stage classification of the code vectors in codebook 5 is briefly explained with reference to FIG. 3. In FIG. 3, category (0, 1) 301 is the category containing all 256 code vectors, that is, the codebook itself. The first-stage classification divides the 256 code vectors in category (0, 1) 301 into two categories, category (1, 1) 302 and category (1, 2) 303. The feature vector of each of categories 302 and 303 is the centroid vector of the code vectors belonging to that category. The second-stage classification divides the code vectors belonging to each of the two first-stage categories 302 and 303 into two categories. The code vectors belonging to category 302 are divided into categories 304 and 305, and the code vectors belonging to category 303 are divided into categories 306 and 307. Thus, in the second-stage classification the code vectors are divided into four categories. As in the first stage, the feature vector of each category is the centroid vector of the code vectors belonging to it. Similarly, in the third-stage classification the code vectors are divided into eight categories, from category (3, 1) 308 to category (3, 8) 309. In this embodiment the classification is continued in the same way up to the sixth stage. In the sixth-stage classification the code vectors are divided into 64 categories, from category (6, 1) 310 to category (6, 64) 311. Category 310 contains four code vectors, code vectors 312, 313, 314 and 315, and the feature vector of category 310 is the centroid vector of code vectors 312, 313, 314 and 315.

[0009] The classification algorithm is explained with reference to FIG. 2.

[0010] First, the symbols are defined. N is the total number of classification stages, which is 6 in this embodiment. n is the index of the classification stage. I is the number of categories in the (n-1)-th stage classification, and i is the index of a category. Cn(i) is the centroid vector of category i newly created in the n-th stage classification, that is, the feature vector of category i of the n-th stage.

[0011] Operation 21 initializes n and I to 1. This reflects that the first classification is the first-stage classification and that the initial number of categories (the number of categories in the 0th-stage classification) is one (the codebook itself).

[0012] Loop 22 executes operations 23 through 27 I times, in order to split the code vectors belonging to each of the I categories into two.

[0013] Operation 23 initializes the split of the code vectors belonging to category i of the (n-1)-th stage: the two code vectors that are farthest apart among the code vectors belonging to that category are selected as the initial values of the two centroid vectors.

[0014] Operation 24 calculates the distance between each code vector belonging to category i of the (n-1)-th stage and the two centroid vectors, and assigns each code vector to the category of the nearer centroid, thereby dividing the code vectors into two categories.

[0015] Operation 25 computes, for each of the two categories formed in operation 24, the centroid vector of the code vectors belonging to that category.

[0016] Branch 26 checks, for each of the two categories formed in operation 24, whether the sum of the distances between the centroid vector obtained in operation 25 and the code vectors belonging to that category satisfies the convergence condition, and decides accordingly whether to proceed to operation 27 or to repeat operations 24 and 25. In this embodiment, convergence is judged to have occurred when this sum of distances settles to a constant value, and operation 27 is then executed.

[0017] Operation 27 stores the two centroid vectors calculated in operation 25 as the feature vectors of categories (2*i-1) and (2*i) of the n-th stage classification.

[0018] Operation 28 increments the stage index n by one and assigns the number of categories of the (n-1)-th stage to I.

[0019] At branch 29, if n is less than or equal to N, loop 22 through operation 28 are executed again; if n exceeds N, the classification calculation ends. Since N is 6 in this embodiment, the calculation is carried out up to the sixth-stage classification, and the feature vector of each category is obtained.
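The classification procedure of operations 21 through 29 can be sketched roughly as follows. This is an illustrative reading, not the patent's implementation: the function names, the squared Euclidean distance, the tolerance-based convergence test, and the tiny one-dimensional codebook are all assumptions. The sketch also assumes every category being split holds at least two distinct code vectors.

```python
import itertools


def sqdist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def split_in_two(vectors, tol=1e-9, max_iter=100):
    """Split one category into two (operations 23 to 27)."""
    # Operation 23: seed with the two code vectors that are farthest apart.
    c1, c2 = max(itertools.combinations(vectors, 2), key=lambda p: sqdist(*p))
    prev_total = float("inf")
    for _ in range(max_iter):
        # Operation 24: assign each code vector to the nearer centroid.
        g1 = [v for v in vectors if sqdist(v, c1) <= sqdist(v, c2)]
        g2 = [v for v in vectors if sqdist(v, c1) > sqdist(v, c2)]
        # Operation 25: recompute the centroid of each group.
        c1, c2 = centroid(g1), centroid(g2)
        # Branch 26: stop once the summed distance no longer decreases.
        total = sum(sqdist(v, c1) for v in g1) + sum(sqdist(v, c2) for v in g2)
        if prev_total - total <= tol:
            break
        prev_total = total
    # Operation 27: the two centroids become the new feature vectors.
    return (g1, c1), (g2, c2)


def build_tree(codebook, n_stages):
    """Return levels[n] = list of (members, feature_vector) for stage n."""
    levels = [[(list(codebook), centroid(codebook))]]  # stage 0: whole codebook
    for _ in range(n_stages):                          # operation 28 / branch 29
        next_level = []
        for members, _feature in levels[-1]:           # loop 22
            next_level.extend(split_in_two(members))
        levels.append(next_level)
    return levels


# Hypothetical codebook of 8 one-dimensional code vectors, split in 2 stages.
codebook = [(0.0,), (1.0,), (10.0,), (11.0,),
            (100.0,), (101.0,), (110.0,), (111.0,)]
levels = build_tree(codebook, n_stages=2)
```

In the embodiment the same loop would run for six stages over 256 code vectors of 14 dimensions, producing 64 final categories.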

[0020] The method of encoding the input vector of one frame of input data using the feature vectors of the categories obtained by the above calculation is explained with reference to FIG. 3.

[0021] As the first search, of the two first-stage categories 302 and 303, the one whose feature vector, C1(1) or C1(2), is closer to the input vector is selected.

[0022] As the second search, of the two categories belonging to the selected first-stage category, the one whose feature vector is closer to the input vector is selected. For example, if category 303 was selected in the first search, the second search selects whichever of categories 306 and 307 has its feature vector, C2(3) or C2(4), closer to the input vector.

[0023] The search continues in the same way up to the sixth search, which selects one category of the sixth-stage classification.

[0024] As the final search, the one code vector with the minimum distance to the input vector is selected from the code vectors belonging to the selected sixth-stage category, and the input vector is associated with the code of the selected code vector. This completes the encoding of the input vector.
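The staged search just described can be sketched as a walk down a tree of categories. The `Category` class, the helper names, the squared Euclidean distance, and the hand-built two-stage tree are assumptions chosen to keep the example small; the embodiment itself uses six stages.

```python
from dataclasses import dataclass, field


def sqdist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class Category:
    feature: tuple                                 # centroid of the member code vectors
    children: list = field(default_factory=list)   # sub-categories of the next stage
    codes: list = field(default_factory=list)      # (code, vector) pairs, last stage only


def encode_tree(x, root):
    """Return (code, number of distance calculations) for input vector x."""
    n_calcs = 0
    node = root
    # Searches 1..N: pick the child whose feature vector is nearest to x.
    while node.children:
        n_calcs += len(node.children)
        node = min(node.children, key=lambda c: sqdist(x, c.feature))
    # Final search: compare x only with the code vectors of the chosen category.
    n_calcs += len(node.codes)
    code, _vector = min(node.codes, key=lambda cv: sqdist(x, cv[1]))
    return code, n_calcs


# Hypothetical tree over 8 one-dimensional code vectors (codes 0..7).
leaves = [
    Category((0.5,), codes=[(0, (0.0,)), (1, (1.0,))]),
    Category((10.5,), codes=[(2, (10.0,)), (3, (11.0,))]),
    Category((100.5,), codes=[(4, (100.0,)), (5, (101.0,))]),
    Category((110.5,), codes=[(6, (110.0,)), (7, (111.0,))]),
]
root = Category((55.5,), children=[
    Category((5.5,), children=leaves[:2]),
    Category((105.5,), children=leaves[2:]),
])
code, n_calcs = encode_tree((10.8,), root)
```

Because only one branch is followed at each stage, a tree search of this kind is not guaranteed to return the same code vector as an exhaustive search over the whole codebook; it trades a small risk of a slightly worse match for far fewer distance calculations.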

[0025] Each sixth-stage category contains four code vectors on average. Therefore, in this embodiment, encoding an input vector requires 2*6 = 12 distance calculations against the category feature vectors of the six stages and, on average, 4 distance calculations against the code vectors belonging to the sixth-stage category, for a stated total of 18 on average. With the conventional method, distances are calculated against all 256 code vectors, so 256 distance calculations are needed. The data encoding method of the present invention therefore makes encoding about 14 times faster than the conventional method in this embodiment.
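The count in the paragraph above can be reproduced with a few lines of arithmetic. The function name `tree_search_cost` and its arguments are assumptions for illustration, and the sketch assumes code vectors are spread evenly over the final categories. Under the stated parameters (six stages, two categories examined per stage, 256 code vectors) it yields 12 feature-vector comparisons plus 4 final comparisons, 16 in all, which is close to but not identical to the average of 18 quoted in the text, so the quoted speed-up is best read as approximate.

```python
def tree_search_cost(branching, codebook_size):
    """Distance calculations per input vector for a staged search.

    branching: number of categories examined at each stage, e.g. [2] * 6.
    Assumes code vectors are spread evenly over the final categories.
    """
    n_final_categories = 1
    for m in branching:
        n_final_categories *= m
    feature_calcs = sum(branching)                       # one per category examined
    final_calcs = codebook_size / n_final_categories     # average final-search size
    return feature_calcs, final_calcs, feature_calcs + final_calcs


feature_calcs, final_calcs, total = tree_search_cost([2] * 6, 256)
speedup = 256 / total  # ratio against an exhaustive search of 256 code vectors
```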

[0026] Also, if the n-th stage category selected in the n-th search (n ≤ N) contains only one code vector, the n-th search is treated as the final search and the input vector is associated with the code of that code vector.
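The early-termination rule can be illustrated with a small unbalanced tree. This is a sketch under assumptions: the dictionary layout (`members`, `feature`, `children`), the function name, and the squared Euclidean distance are hypothetical choices, not taken from the patent.

```python
def sqdist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_with_early_stop(x, node):
    """Return (code, number of stages searched) for input vector x.

    Each node is a dict with "members" (list of (code, vector) pairs),
    "feature" (centroid) and "children" (sub-category nodes).
    """
    stages_searched = 0
    # Keep descending while the current category can still be narrowed down.
    while node["children"] and len(node["members"]) > 1:
        node = min(node["children"], key=lambda c: sqdist(x, c["feature"]))
        stages_searched += 1
    if len(node["members"]) == 1:
        # Only one code vector left: this search is treated as the final one.
        return node["members"][0][0], stages_searched
    code, _vector = min(node["members"], key=lambda cv: sqdist(x, cv[1]))
    return code, stages_searched


# Hypothetical codebook: code 0 is isolated, codes 1 and 2 are close together.
a = {"members": [(0, (0.0,))], "feature": (0.0,), "children": []}
b1 = {"members": [(1, (10.0,))], "feature": (10.0,), "children": []}
b2 = {"members": [(2, (11.0,))], "feature": (11.0,), "children": []}
b = {"members": [(1, (10.0,)), (2, (11.0,))], "feature": (10.5,), "children": [b1, b2]}
root = {"members": a["members"] + b["members"], "feature": (7.0,), "children": [a, b]}

near_zero = encode_with_early_stop((1.0,), root)
near_eleven = encode_with_early_stop((10.9,), root)
```

An input near the isolated code vector is resolved after a single stage, while an input near the crowded region needs the second stage as well.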

[0027]

[Effects of the Invention] As explained above, using the data encoding method of the present invention has the effect of speeding up the encoding of input data. For example, if the codebook size is 256, the classification of the data encoding method of the present invention has six stages, and the number of categories in the n-th stage classification is 2^n, the conventional method requires 256 distance calculations for encoding, whereas the data encoding method of the present invention requires only 10 distance calculations on average, making it about 25 times faster.

[0028] Furthermore, because encoding becomes faster in this way, the codebook size can be increased to reduce the quantization error that arises in encoding.

[Brief explanation of the drawing]

FIG. 1 is a system configuration diagram of a speech recognition device to which the data encoding method of the present invention is applied.

FIG. 2 is a diagram showing the algorithm for dividing code vectors in the data encoding method of the present invention.

FIG. 3 is a diagram showing the classification of code vectors in the data encoding method of the present invention.

[Explanation of symbols]

1 Microphone
2 A/D converter
3 Feature extraction unit
4 Data compression unit
5 Codebook
6 Word recognition unit
7 Word dictionary
21 Operation
22 Loop
23 Operation
24 Operation
25 Operation
26 Branch
27 Operation
28 Operation
29 Branch
301 to 311 Category
312 to 315 Code vector

Claims

[Claims]

Claim 1. A data encoding method characterized by: dividing the T code vectors in a codebook into M(1) kinds of categories as a first-stage classification; dividing, as a second-stage classification, each of the M(1) categories of the first-stage classification into M(2) kinds of categories, giving M(1)*M(2) kinds of categories; dividing in the same manner, as an N-th stage classification, each of the M(1)*M(2)*...*M(N-1) categories of the (N-1)-th stage classification into M(N) kinds of categories, giving M(1)*M(2)*...*M(N) kinds of categories; setting the number of stages N to 3 or more; taking as the feature vector of each category the centroid vector of the code vectors contained in that category; when encoding input data into one of the code vectors in the codebook, selecting, as a first search, the one first-stage category, among the M(1) first-stage categories, whose feature vector has the minimum distance to the input data; selecting, as a second search, the one second-stage category, among the M(2) categories belonging to the selected first-stage category, whose feature vector has the minimum distance to the input data; selecting in the same manner, as an N-th search, the one N-th stage category, among the M(N) categories belonging to the selected (N-1)-th stage category, whose feature vector has the minimum distance to the input data; selecting, as a final search, the one code vector, among the code vectors belonging to the selected N-th stage category, that has the minimum distance to the input data, and associating the input data with the code of the selected code vector; and, when only one code vector belongs to the n-th stage category selected in the n-th search (n ≤ N), treating the n-th search as the final search and associating the input data with the code of that code vector.

Claim 2. The data encoding method according to claim 1, wherein the code vectors in the codebook are code vectors obtained by vector quantization of training data using the LBG algorithm.