JP3361564B2

JP3361564B2 - High-speed matching method and apparatus

Info

Publication number: JP3361564B2
Application number: JP10033893A
Authority: JP
Inventors: 秀明田中
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1993-04-02
Filing date: 1993-04-02
Publication date: 2003-01-07
Anticipated expiration: 2018-01-07
Also published as: JPH06290272A

Description

【発明の詳細な説明】Detailed Description of the Invention

【０００１】[0001]

【産業上の利用分野】本発明は、高速マッチング方式に
関し、より詳細には、文字や音声の認識装置などのよう
に、辞書ベクトルと入力ベクトルとのマッチング計算を
行なうようにした高速マッチング方式に関する。BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a high-speed matching method, and more particularly to a high-speed matching method for performing a matching calculation between a dictionary vector and an input vector such as a character or voice recognition device. .

【０００２】[0002]

【従来の技術】図１６は、従来の認識装置におけるマッ
チング部のフローチャートを示す図である。一般的に、
入力特徴ベクトルが与えられた場合、大分類処理（step
１）、小分類処理（step２）、詳細分類処理（step３）
を経て認識結果（候補）が求められる。各分類処理には
別々のマッチング辞書と特徴ベクトルが用いられ、入力
ベクトルとのマッチング計算（採用されている認識手法
に依存する）を行ない、計算結果をソーティングして認
識候補を作成する。このように処理を分割している理由
は高速化のためである。一般的に、詳細識別用の認識辞
書は、高認識率の実現のため、より多次元のベクトルを
採用している。そのため、全認識対象カテゴリに対し
て、この詳細分類辞書のみでマッチング計算を行なった
場合、その計算量が莫大となり、速度の低下を生じる。
これを解決するため、従来手法では、より少ない次元数
の特徴ベクトルによる認識辞書（ただし、その認識性能
は低下する）を用いた分類処理により、認識性能が保証
できるまで候補数を減少させ、次の（より多次元辞書を
用いた）分類処理へ送るといった多段階処理により高速
性を実現していた。2. Description of the Related Art FIG. 16 is a diagram showing a flowchart of a matching section in a conventional recognition apparatus. Typically,
Given an input feature vector, a large classification process (step
1), small classification process (step 2), detailed classification process (step 3)
After that, the recognition result (candidate) is obtained. Separate matching dictionaries and feature vectors are used for each classification process, and matching calculation with input vectors (depending on the adopted recognition method) is performed, and the calculation results are sorted to create recognition candidates. The reason for dividing the processing in this way is to increase the processing speed. In general, a recognition dictionary for detailed identification employs a multidimensional vector in order to achieve a high recognition rate. Therefore, when matching calculation is performed only for this detailed classification dictionary for all recognition target categories, the amount of calculation becomes enormous and the speed is reduced.
In order to solve this, in the conventional method, the number of candidates is reduced until the recognition performance can be guaranteed by the classification process using the recognition dictionary (however, the recognition performance is deteriorated) using the feature vector with a smaller number of dimensions, High speed was realized by multi-step processing such as sending to classification processing (using a more multi-dimensional dictionary).

【０００３】図１６の従来方式の例では、大分類から詳
細分類へ行くに従い、用いるベクトル次元数は増加す
る。この従来手法の例として、各処理で用いる辞書の１
カテゴリ当たりのベクトル次元数と、入力候補数、出力
候補数、計算量を以下の表１に示す。In the example of the conventional system shown in FIG. 16, the number of vector dimensions to be used increases from the large classification to the detailed classification. As an example of this conventional method, one of the dictionaries used in each process
The number of vector dimensions per category, the number of input candidates, the number of output candidates, and the amount of calculation are shown in Table 1 below.

【０００４】[0004]

【表１】 [Table 1]

【０００５】この例は、対象カテゴリ数が５０００の認
識装置（日本語ＯＣＲなど）についてのものであり、詳
細分類辞書のみで計算した場合の計算量１２８００００
（＝２５６×５０００）を３９６８００に減少させてい
る（約１／３）。また従来法では、これと演算方法の簡
略化を組み合わせる方法も行なわれている。例えば、詳
細分類などではマッチング処理にユークリッド距離や類
似度など乗算を必要とする演算を用い、大分類などでは
絶対差などの演算を用いる方法である。一般的には乗算
よりも減算のほうが高速に実行できるため、大分類で高
価な演算を用いるよりも高速化が実現できる。This example is for a recognition device with a target category number of 5000 (such as Japanese OCR), and the calculation amount 12800000 when calculated only with the detailed classification dictionary.
(= 256 × 5000) is reduced to 396800 (about 1/3). In the conventional method, a method combining this with simplification of the calculation method is also used. For example, a method that uses multiplication such as Euclidean distance or similarity in the matching process is used in the detailed classification and the like, and an operation such as absolute difference is used in the large classification and the like. In general, subtraction can be executed faster than multiplication, so that higher speed can be realized than using expensive operation in large classification.

【０００６】[0006]

【発明が解決しようとする課題】前述のように、従来の
マッチング方式では、各分類部において必ず入力候補数
分のマッチング計算を行なう必要があり、劇的な高速化
とはならないという問題点が生じる。例えば、前記表１
の例で、仮に大分類処理で５０００カテゴリ全てに対し
ての計算は行なわず、ある手法により１０００カテゴリ
に対してのみ行なったとすれば、トータルの計算量は１
４０８００となり、約１桁の高速化となる。As described above, in the conventional matching method, it is necessary for each classification unit to perform matching calculation for the number of input candidates, which is not a drastic speedup. Occurs. For example, Table 1 above
In this example, if the calculation is not performed for all 5000 categories in the large classification process, but only for 1000 categories by a certain method, the total calculation amount is 1
It is 40800, which is about one digit faster.

【０００７】本発明は、このような実情に鑑みてなされ
たもので、辞書の各カテゴリベクトル間の距離関係が既
知であることを利用し、その距離関係を保存したフラグ
テーブルを各カテゴリ毎に予め求めておき、注目カテゴ
リとのマッチング計算で得られた距離（または類似度）
をもとに、フラグテーブルとの論理演算により、以降の
カテゴリに対して候補となる可能性があるものを決定
し、可能性があると判断されたカテゴリを次の注目カテ
ゴリとし、以降同様の処理を操り返すことにより、可能
性の高いカテゴリのみをマッチング計算することによ
り、高速化を図るようにした高速マッチング方式を提供
することを目的としている。The present invention has been made in view of such circumstances, and utilizes the fact that the distance relationship between category vectors in the dictionary is known, and a flag table storing the distance relationship is stored for each category. Distance (or similarity) obtained in advance and obtained by matching calculation with the target category
Based on, the logical operation with the flag table is used to determine what may be candidates for the subsequent categories, and the category determined to be possible is set as the next target category, and the same thereafter. It is an object of the present invention to provide a high-speed matching method that aims at speeding up by performing matching calculation only on a category having a high possibility by repeating processing.

【０００８】[0008]

【課題を解決するための手段】本発明は、上記目的を達
成するために、（１）認識辞書内の辞書カテゴリ毎に設
定された辞書番号に対応し、各辞書カテゴリを１ビット
の情報として表した、すなわち辞書カテゴリ数分のビッ
ト情報を有するバイナリデータを１つのフラグテーブル
とし、辞書カテゴリ毎に、後記類似度合いをある数で分
割した個数だけのフラグテーブルを有するデータ構成に
おいて、予め、全辞書カテゴリのフラグテーブルの全ビ
ットを０に初期化しておき、１つのフラグテーブルと同
じ構成の認識結果フラグを有し、入力ベクトルに対して
認識辞書内の各辞書カテゴリベクトルとの類似度合いを
計算する後記計算手段前に、予め、認識結果フラグの全
ビットを０に初期化しておき、前記辞書カテゴリベクト
ルと照合すべき入力ベクトルに対して、認識辞書内の各
辞書カテゴリベクトルとの類似度合いを計算する計算手
段と、前記計算手段により求められた入力ベクトルと各
辞書カテゴリベクトルとの類似度合いを基にソーティン
グを行い、認識候補を作成する認識候補作成手段と、前
記認識候補に選ばれた各辞書カテゴリの辞書番号に対応
する認識結果フラグのビットを１とし、その他の、すな
わち前記認識候補に選ばれていない、各辞書カテゴリの
辞書番号に対応する認識結果フラグのビットを０とし
て、認識結果フラグを作成するフラグ作成手段と、認識
辞書内の全ての辞書カテゴリを順に走査し、各辞書カテ
ゴリベクトルと前記入力ベクトルに対して前記計算手段
で求めた類似度合いを基に、当該辞書カテゴリの有する
フラグテーブル内の前記類似度合いで選ばれる１つのフ
ラグテーブルを注目フラグテーブルとして決定する注目
フラグテーブル決定手段と、当該注目フラグテーブルと
前記認識結果フラグとの論理和を求めることでフラグテ
ーブルを作成するフラグテーブル作成手段とを有する事
を特徴とし、前記計算手段、認識候補作成手段、フラグ
作成手段、注目フラグテーブル決定手段、およびフラグ
テーブル作成手段を、入力ベクトルに対して実施するこ
とによりフラグテーブルを作成すること、（２）前記（１）において、前記認識辞書は辞書カテゴ
リ毎に設定された辞書番号順に配置されており、ある辞
書番号の辞書カテゴリは、それより以前の、すなわち当
該辞書カテゴリより辞書番号が小さい全ての辞書カテゴ
リに対して、最も類似度合いが低い辞書カテゴリを選択
し順次並び替えを行う認識辞書並び替え手段を有し、予
め、前記認識辞書並び替え手段により並び替えられた認
識辞書を使用すること、（３）認識辞書内の辞書カテゴリ毎に設定された辞書番
号に対応し、各辞書カテゴリを１ビットの情報として表
した、すなわち辞書カテゴリ数分のビット情報を有する
バイナリデータを１つのフラグテーブルとし、辞書カテ
ゴリ毎に、後記類似度合いをある数で分割した個数だけ
のフラグテーブルを有するデータ構成において、１つの
フラグテーブルと同じ構成の可能性フラグテーブルを有
し、入力データに対する各認識処理前に、予め、前記可
能性フラグテーブルの全ビットを１に初期化し、入力デ
ータに対して認識用の特徴ベクトルとなる入力ベクトル
を作成する特徴ベクトル作成手段と、入力ベクトルに対
して、認識辞書内の各辞書カテゴリベクトルとの類似度
合いを計算する計算手段と、前記可能性フラグテーブル
の各ビットを順に走査し、当該ビットが１であるかどう
かを判定する計算手段と、前記判定手段においてビット
が１である、すなわち、認識候補となる可能性があると
判断された、当該ビットに対応する辞書カテゴリに対し
て、前記入力ベクトルと辞書カテゴリベクトルとの類似
度合いを前記計算手段で計算し、得られた類似度合いを
基に当該辞書カテゴリの有するフラグテーブル内の前記
類似度合いで選ばれる１つのフラグテーブルを注目フラ
グテーブルとして決定する注目フラグテーブル決定手段
と、前記注目フラグテーブル決定手段により決定された
注目フラグテーブルと、前記可能性フラグテーブルとの
論理積を求め、結果を可能性フラグテーブルに格納する
ことにより可能性フラグテーブルの更新を行う更新手段
とを有することを特徴とし、前記判定手段、計算手段、
注目フラグテーブル決定手段、更新手段を、可能性フラ
グテーブルの全ビットに対して、すなわち、全ての辞書
カテゴリに対して実施することにより、高速にマッチン
グ処理を行うこと、（４）認識辞書内の辞書カテゴリ毎に設定された辞書番
号に対応し、各辞書カテゴリを１ビットの情報として表
した、すなわち辞書カテゴリ数分のビット情報を有する
バイナリデータを１つのフラグテーブルとし、辞書カテ
ゴリ毎に、後記類似度合いをある数で分割した個数だけ
のフラグテーブルを有するデータ構成において、予め、
全辞書カテゴリのフラグテーブルの全ビットを０に初期
化しておき、１つのフラグテーブルと同じ構成の認識結
果フラグを有し、入力ベクトルに対して認識辞書内の各
辞書カテゴリベクトルとの類似度合いを計算する後記計
算ステップ前に、予め、認識結果フラグの全ビットを０
に初期化しておき、前記辞書カテゴリベクトルと照合す
べき入力ベクトルに対して、認識辞書内の各辞書カテゴ
リベクトルとの類似度合いを計算する計算ステップと、
前記計算ステップにより求められた入力ベクトルと各辞
書カテゴリベクトルとの類似度合いを基にソーティング
を行い、認識候補を作成する認識候補作成ステップと、
前記認識候補に選ばれた各辞書カテゴリの辞書番号に対
応する認識結果フラグのビットを１とし、その他の、す
なわち前記認識候補に選ばれていない、各辞書カテゴリ
の辞書番号に対応する認識結果フラグのビットを０とし
て、認識結果フラグを作成するフラグ作成ステップと、
認識辞書内の全ての辞書カテゴリを順に走査し、各辞書
カテゴリベクトルと前記入力ベクトルに対して前記計算
ステップで求めた類似度合いを基に、当該辞書カテゴリ
の有するフラグテーブル内の前記類似度合いで選ばれる
１つのフラグテーブルを注目フラグテーブルとして決定
する注目フラグテーブル決定ステップと、当該注目フラ
グテーブルと前記認識結果フラグとの論理和を求めるこ
とでフラグテーブルを作成するフラグテーブル作成ステ
ップとを有する事を特徴とし、前記計算ステップ、認識
候補作成ステップ、フラグ作成ステップ、注目フラグテ
ーブル決定ステップ、およびフラグテーブル作成ステッ
プを、入力ベクトルに対して実施することによりフラグ
テーブルを作成すること、（５）前記（４）において、前記認識辞書は辞書カテゴ
リ毎に設定された辞書番号順に配置されており、ある辞
書番号の辞書カテゴリは、それより以前の、すなわち当
該辞書カテゴリより辞書番号が小さい全ての辞書カテゴ
リに対して、最も類似度合いが低い辞書カテゴリを選択
し順次並び替えを行う認識辞書並び替えステップを有
し、予め、前記認識辞書並び替えステップにより並び替
えられた認識辞書を使用すること、（６）認識辞書内の辞書カテゴリ毎に設定された辞書番
号に対応し、各辞書カテゴリを１ビットの情報として表
した、すなわち辞書カテゴリ数分のビット情報を有する
バイナリデータを１つのフラグテーブルとし、辞書カテ
ゴリ毎に、後記類似度合いをある数で分割した個数だけ
のフラグテーブルを有するデータ構成において、１つの
フラグテーブルと同じ構成の可能性フラグテーブルを有
し、入力データに対する各認識処理前に、予め、前記可
能性フラグテーブルの全ビットを１に初期化し、入力デ
ータに対して認識用の特徴ベクトルとなる入力ベクトル
を作成する特徴ベクトル作成ステップと、入力ベクトル
に対して、認識辞書内の各辞書カテゴリベクトルとの類
似度合いを計算する計算ステップと、前記可能性フラグ
テーブルの各ビットを順に走査し、当該ビットが１であ
るかどうかを判定する判定ステップと、前記判定ステッ
プにおいてビットが１である、すなわち、認識候補とな
る可能性があると判断された、当該ビットに対応する辞
書カテゴリに対して、前記入力ベクトルと辞書カテゴリ
ベクトルとの類似度合いを前記計算ステップで計算し、
得られた類似度合いを基に当該辞書カテゴリの有するフ
ラグテーブル内の前記類似度合いで選ばれる１つのフラ
グテーブルを注目フラグテーブルとして決定する注目フ
ラグテーブル決定ステップと、前記注目フラグテーブル
決定ステップにより決定された注目フラグテーブルと前
記可能性フラグテーブルとの論理積を求め、結果を可能
性フラグテーブルに格納することにより可能性フラグテ
ーブルの更新を行う更新ステップとを有することを特徴
とし、前記判定ステップ、計算ステップ、注目フラグテ
ーブル決定ステップ、更新ステップを、可能性フラグテ
ーブルの全ビットに対して、すなわち、全ての辞書カテ
ゴリに対して実施することにより、高速にマッチング処
理を行うこと、を特徴としたものである。In order to achieve the above object, the present invention provides (1) a setting for each dictionary category in a recognition dictionary.
1 bit for each dictionary category corresponding to the specified dictionary number
Information, that is, the number of bits
One binary flag table containing binary information
Then, for each dictionary category, the degree of similarity described below is divided by a certain number.
In the data structure that has the flag table of the number of divided
In advance, all the words in the flag table of all dictionary categories are
Is initialized to 0 and the same as one flag table.
It has a recognition result flag of the same structure,
The degree of similarity with each dictionary category vector in the recognition dictionary
Note that all recognition result flags must be
The bit is initialized to 0, and the dictionary category vector is set.
For each input vector in the recognition dictionary
Calculator to calculate the degree of similarity with dictionary category vector
Stage, the input vector obtained by the calculation means, and
Sorting based on similarity to dictionary category vector
And a recognition candidate creating means for performing recognition and creating recognition candidates.
Corresponds to the dictionary number of each dictionary category selected as a recognition candidate
Set the bit of the recognition result flag to 1 and
In each dictionary category that is not selected as a recognition candidate
The bit of the recognition result flag corresponding to the dictionary number is set to 0
The recognition result flag,
It scans all dictionary categories in the dictionary in order,
The calculation means for the Gori vector and the input vector
Based on the degree of similarity found in
One flag selected by the degree of similarity in the flag table.
Attention to determine the lag table as the attention flag table
Flag table determining means,
The flag table is obtained by calculating the logical sum with the recognition result flag.
A flag table creating means for creating a table
And a calculation means, a recognition candidate creation means, and a flag.
Creation means, attention flag table determination means, and flags
A table creation method can be applied to the input vector.
Creating a flag table by a, in (2) above (1), the recognition dictionary Dictionary category
Are arranged in the order of the dictionary numbers set for each
The dictionary category for the book number is earlier,
All dictionary categories with dictionary numbers smaller than the dictionary category
Select the dictionary category with the lowest similarity to
It has a recognition dictionary sorting means that sorts sequentially,
Therefore, the recognition dictionary rearranged by the recognition dictionary rearrangement means
The use of識辞statement, (3) set for each dictionary categories in the recognition dictionary dictionary number
Corresponding to the No., each dictionary category is displayed as 1-bit information.
That is, it has bit information for the number of dictionary categories
Binary data is used as one flag table and dictionary
The number of similarities divided by a certain number
In the data structure having the flag table of
Possibility of having the same configuration as the flag table
However, before each recognition process for input data,
All bits in the capability flag table are initialized to 1 and the input data is
Input vector which is the feature vector for recognition to the data
The feature vector creation means for creating
Then, the similarity with each dictionary category vector in the recognition dictionary
Calculating means for calculating a match and the possibility flag table
Scan each bit in sequence and check if the bit is 1.
And a bit in the determining means
Is 1, that is, it may be a recognition candidate.
For the determined dictionary category corresponding to the bit
And the similarity between the input vector and the dictionary category vector
The degree is calculated by the calculation means, and the obtained degree of similarity is calculated.
Based on the flag table of the dictionary category
One flag table selected according to the degree of similarity
Flag table determining means for determining a table
Is determined by the attention flag table determination means
Of the attention flag table and the possibility flag table
Calculate the logical product and store the result in the possibility flag table
Update means for updating the possibility flag table by
And a determination means, a calculation means,
Set the attention flag table determination means and update means to the possibility flag.
All bits in the table, ie all dictionaries
Matching at high speed by executing for categories
Carrying out the grayed processing, (4) is set for each dictionary categories in the recognition dictionary the dictionary number
Corresponding to the No., each dictionary category is displayed as 1-bit information.
That is, it has bit information for the number of dictionary categories
Binary data is used as one flag table and dictionary
The number of similarities divided by a certain number
In the data structure having the flag table of
Initialize all bits in flag table of all dictionary categories to 0
It is converted to a recognition result with the same structure as one flag table.
Has a result flag, and each in the recognition dictionary for the input vector
A postscript to calculate the similarity with the dictionary category vector
Before the calculation step, all bits of the recognition result flag are set to 0 in advance.
It is initialized to and matched with the dictionary category vector
For each power input vector, each dictionary category in the recognition dictionary
Calculation step for calculating the degree of similarity with the re-vector,
The input vector obtained by the calculation step and each word
Sorting based on the degree of similarity with the calligraphy category vector
And a recognition candidate creating step for creating a recognition candidate,
It corresponds to the dictionary number of each dictionary category selected as the recognition candidate.
The corresponding recognition result flag bit is set to 1, and other
That is, each dictionary category not selected as a recognition candidate
The bit of the recognition result flag corresponding to the dictionary number of
And a flag creation step of creating a recognition result flag,
Scan all dictionary categories in the recognition dictionary in turn,
The calculation for the category vector and the input vector
Based on the similarity calculated in step, the dictionary category
Is selected by the degree of similarity in the flag table of
Determine one flag table as the flag table of interest
Attention flag table determination step
Table and the recognition result flag.
Create a flag table with and
And the calculation step, recognition
Candidate creation step, flag creation step, attention flag table
Table determination step and flag table creation step.
Flag on the input vector by
Creating a table (5) in the (4), the recognition dictionary Dictionary category
Are arranged in the order of the dictionary numbers set for each
The dictionary category for the book number is earlier,
All dictionary categories with dictionary numbers smaller than the dictionary category
Select the dictionary category with the lowest similarity to
There is a recognition dictionary sorting step that sorts sequentially.
In advance, the recognition dictionary is rearranged by the rearrangement step.
Use the obtained recognition dictionary, (6) Dictionary number set for each dictionary category in the recognition dictionary
Corresponding to the No., each dictionary category is displayed as 1-bit information.
That is, it has bit information for the number of dictionary categories
Binary data is used as one flag table and dictionary
The number of similarities divided by a certain number
In the data structure having the flag table of
Possibility of having the same configuration as the flag table
However, before each recognition process for input data,
All bits in the capability flag table are initialized to 1 and the input data is
Input vector which is the feature vector for recognition to the data
Feature vector creation step and input vector
For each dictionary category vector in the recognition dictionary
A calculation step for calculating the degree of similarity, and the possibility flag
Each bit of the table is scanned in turn and the bit is 1
The determination step for determining whether or not
Bit is 1, that is, it is not a recognition candidate.
Which corresponds to the bit that was determined to be
For the calligraphy category, the input vector and dictionary category
Calculate the degree of similarity with the vector in the calculation step,
Based on the obtained degree of similarity, the dictionary
One flag selected by the degree of similarity in the lag table
Table to determine the table as the flag table of interest.
Lag table determination step and the attention flag table
The attention flag table determined by the determination step and the previous
It is possible to obtain the result by calculating the logical product with the recordability flag table.
The possibility flag table by storing it in the
And an update step for updating the cable.
, The determination step, the calculation step, the attention flag table
Table determination step and update step.
Table for all bits, that is, all dictionary categories
High-speed matching process by performing on the gori
It is characterized by doing the work .

【０００９】[0009]

【作用】本発明は、文字や音声の認識装置において、入
力されたパターン（文字や音声）と、予め備えてある認
識辞書とのマッチング処理を高速に行なうもので、高速
マッチング処理を実現するため、各辞書ベクトル間の距
離関係が既知であることを利用し、その距離関係を保存
したフラグテーブルを各辞書ベクトル毎に予め設定して
おき、入力ベクトルと注目カテゴリベクトルとのマッチ
ング結果（距離や類似度など）をもとにフラグテーブル
を用いた論理演算により、以降のカテゴリに対して、認
識候補としての可能性があるカテゴリ（可能性カテゴ
リ）のみを決定し、次の可能性カテゴリを注目カテゴリ
として、逐次、同様のマッチング処理およびフラグによ
る決定処理を操り返して高速に処理する。According to the present invention, in a character or voice recognition device, a matching process between an input pattern (characters or voice) and a recognition dictionary provided in advance is performed at high speed. , The distance relationship between each dictionary vector is known, a flag table storing the distance relationship is set in advance for each dictionary vector, and the matching result (distance or distance) between the input vector and the target category vector is set. Based on the similarity, etc.), a logical operation using a flag table is used to determine only the categories (possibility categories) that are possible candidates for recognition, and to focus on the next possibility category. As a category, the matching process and the determination process using a flag are sequentially repeated and processed at high speed.

【００１０】[0010]

【実施例】実施例について、図面を参照して以下に説明
する。なお、以下の説明においては、簡単なため、マッ
チング計算には通常のユークリッド距離を用いた場合に
ついて説明する。まず、図１２〜図１５に基づいて、視
覚的な方式について説明する。図１２は、辞書ベクトル
と入力ベクトルとの位置関係、図１３は従来方式の計算
例、図１４は本発明方式の第１カテゴリとの計算結果と
可能性カテゴリ、図１５は本発明方式の第２カテゴリと
の計算結果と可能性カテゴリを各々示している。以下、
順次説明する。Embodiments will be described below with reference to the drawings. In the following description, for simplicity, the case where a normal Euclidean distance is used for the matching calculation will be described. First, a visual method will be described with reference to FIGS. FIG. 12 is a positional relationship between a dictionary vector and an input vector, FIG. 13 is a calculation example of a conventional method, FIG. 14 is a calculation result with a first category of the present invention method and a possibility category, and FIG. 15 is a first example of the present invention method. The calculation results for two categories and the possibility categories are shown. Less than,
This will be explained sequentially.

【００１１】図１２は、辞書の各カテゴリベクトルと入
力ベクトルとの位置関係を２次元平面上にプロットした
ものである（カテゴリベクトルは〇で、入力ベクトルは
◎で表示）。従来手法では、図１３のように、入力ベク
トルと全ての各カテゴリベクトルとの距離計算を行な
い、それらをソーティング（距離の小さいものから並べ
換えて）して認識候補を求める。なぜ全てのカテゴリベ
クトルに対して距離計算するかというと、それは入力ベ
クトルが未知であるため、各辞書ベクトルとの距離関係
が不明であるためである。しかし、入力ベクトルは未知
であっても、各カテゴリベクトル間の位置関係は既知で
ある。本発明の方式ではこのことを利用し、まず、図１
４に示すように、辞書の第１カテゴリとの距離計算結果
をもとに、可能性のあるカテゴリ（入力ベクトルが第１
カテゴリに近ければ近いカテゴリ、遠ければ遠いカテゴ
リ）を選び、それらのうち１つを第２カテゴリとする。
次に、図１５に示すように、この第２カテゴリとの距離
計算結果をもとに、まだマッチング計算を行なっていな
く、かつ第１カテゴリでの計算結果から可能性があると
判定されたものの中から、再び可能性のあるカテゴリを
決定する。以降、この処理を操り返すことにより、可能
性のあるカテゴリのみをマッチング処理し、高速にマッ
チング処理を行なう。図１４及び図１５は、本発明の方
式での第１カテゴリにより処理結果と、第２カテゴリに
よる処理結果である。可能性のあるカテゴリは図の斜線
カテゴリとなる。FIG. 12 is a plot of the positional relationship between each category vector of the dictionary and the input vector on a two-dimensional plane (category vector is indicated by ◯, input vector is indicated by ⊚). In the conventional method, as shown in FIG. 13, distance calculation is performed between the input vector and all the category vectors, and the recognition candidates are obtained by sorting them (sorting from the smallest distance). The reason why distances are calculated for all category vectors is that the input vector is unknown and the distance relationship with each dictionary vector is unknown. However, even if the input vector is unknown, the positional relationship between each category vector is known. The method of the present invention takes advantage of this fact by first referring to FIG.
As shown in FIG. 4, there is a possibility that the category (the input vector is the first category) based on the calculation result of the distance from the first category of the dictionary.
A category closer to the category and a category farther from the category) are selected, and one of them is set as the second category.
Next, as shown in FIG. 15, the matching calculation has not yet been performed based on the distance calculation result with the second category, and it is determined that there is a possibility from the calculation result in the first category. From within, again determine the possible categories. After that, by repeating this processing, only the possible categories are subjected to the matching processing, and the matching processing is performed at high speed. 14 and 15 show the processing result by the first category and the processing result by the second category in the method of the present invention. Possible categories are the shaded categories in the figure.

【００１２】次に、本発明の高速マッチング方式に用い
るフラグテーブルの構成と作成方法について説明する。
フラグテーブルは辞書のカテゴリ番号に対応したバイナ
リーテーブルである。各ビットが辞書の各カテゴリに対
応し、ビット値が１ならば可能性があり、０ならば可能
性なしとして作成したテーブルである。Next, the structure and method of creating the flag table used in the high speed matching method of the present invention will be described.
The flag table is a binary table corresponding to the category number of the dictionary. Each bit corresponds to each category of the dictionary, and if the bit value is 1, there is a possibility, and if 0, there is no possibility.

【００１３】[0013]

【表２】 [Table 2]

【００１４】表２は、日本語ＯＣＲのフラグテーブルの
例である。このフラグテーブルからは、国、田、困、
口、団などのカテゴリが可能性のあるカテゴリとして選
択される。１つのフラグテーブルは認識対象カテゴリ−
ビット分の容量が必要となる。従って、そのメモリ容量
（バイト数）はカテゴリ数÷８で概算できる。例えば、
５０００カテゴリのものであれば、１つのフラグテーブ
ルは６２５バイトとなる。本発明ではまず、マッチング
計算により求められる距離を均等の範囲ごとにｍ個に分
割し、辞書の１カテゴリ当たり、その個数分（ｍ個）フ
ラグテーブルを用意する。最大距離を１０００、分割数
ｍを１０とした場合の、１カテゴリ当たりのフラグテー
ブル構成例を以下の表３に示す。Table 2 is an example of a Japanese OCR flag table. From this flag table, country, field, trouble,
Mouth, band, etc. categories are selected as possible categories. One flag table is the recognition target category-
Bit capacity is required. Therefore, the memory capacity (the number of bytes) can be roughly estimated by the number of categories ÷ 8. For example,
In the case of 5000 categories, one flag table has 625 bytes. In the present invention, first, the distance calculated by the matching calculation is divided into m pieces for each equal range, and the number (m pieces) of flag tables is prepared for each category of the dictionary. Table 3 below shows an example of the flag table configuration per category when the maximum distance is 1000 and the number of divisions m is 10.

【００１５】[0015]

【表３】 [Table 3]

【００１６】この最大距離は用いる特徴ベクトルの次元
数や正規化手法などにより異なってくる。また、分割数
ｍはより大きくとれば、各カテゴリ間の距離関係の最良
近似となるが、メモリ容量も増加するので、システムに
より随意に決定されるものである。あるシステムでは、
特徴ベクトルに対してある種の正規化処理を施こしてな
いものがある（かもしれない）。この場合論理的には、
最大距離の設定ができないが、実験により（頻度情報な
どをもとに）最大距離を設定し、それをｍ個に分割し、
ｍ＋１個の範囲として「最大距離以上」を設定すること
により、本発明の方式を適用することができる。This maximum distance varies depending on the number of dimensions of the feature vector used and the normalization method. Further, if the number of divisions m is larger, the distance relationship between the categories becomes the best approximation, but the memory capacity also increases, so it is arbitrarily determined by the system. In one system,
Some feature vectors may not have undergone some kind of normalization. In this case, logically,
I can not set the maximum distance, but I set the maximum distance by experiment (based on frequency information etc.), divide it into m,
The method of the present invention can be applied by setting "more than the maximum distance" as the m + 1 range.

【００１７】次に、辞書カテゴリの並び換えについて説
明する。フラグテーブルを作成する前に、辞書カテゴリ
の並び換えを行なう。この理由は、本発明ではある注目
カテゴリと入力ベクトルとの距離により、以降の可能性
カテゴリの決定を行なうため、辞書カテゴリの並びで、
カテゴリ間の距離が近いベクトル（例えば０とｏなどの
類似文字）が連続している場合、可能性カテゴリ判定が
効果的に機能しないためである。これを解決するため、
本発明では、以前のカテゴリに対して最も遠いカテゴリ
ベクトルを順に並べていく方法で、辞書の各カテゴリベ
クトルを並び換える。Next, rearrangement of dictionary categories will be described. Before creating the flag table, the dictionary categories are rearranged. The reason for this is that in the present invention, the possibility category is determined thereafter by the distance between a certain target category and the input vector.
This is because the possibility category determination does not function effectively when vectors with similar distances between categories (for example, similar characters such as 0 and o) are continuous. To solve this,
In the present invention, the category vectors in the dictionary are rearranged by a method of sequentially arranging the category vector farthest from the previous category.

【００１８】図１は、本発明の高速マッチング方式の辞
書並び換えのブロック図で、図中、１は辞書並び換え処
理部、２は平均ベクトル作成部、３は最近傍カテゴリ決
定部、４はカテゴリベクトルコピー部、５はコピーフラ
グ制御部、６は辞書並び換え制御部、７は平均ベクトル
バッファ、８は入力辞書バッファ、９は結果辞書バッフ
ァ、１０は結果辞書カウンタ、１１は選択番号バッフ
ァ、１２は入力辞書カウンタ、１３は最大累積距離バッ
ファ、１４は累積距離バッファ、１５はカウンタ、１６
は距離バッファ、１７はカテゴリ選択制御部、１８はカ
テゴリ選択処理部である。FIG. 1 is a block diagram of dictionary rearrangement according to the high speed matching system of the present invention. In the figure, 1 is a dictionary rearrangement processing unit, 2 is an average vector generation unit, 3 is a nearest neighbor category determination unit, and 4 is Category vector copy unit, 5 copy flag control unit, 6 dictionary rearrangement control unit, 7 average vector buffer, 8 input dictionary buffer, 9 result dictionary buffer, 10 result dictionary counter, 11 selection number buffer, 12 is an input dictionary counter, 13 is a maximum cumulative distance buffer, 14 is a cumulative distance buffer, 15 is a counter, 16
Is a distance buffer, 17 is a category selection control unit, and 18 is a category selection processing unit.

【００１９】図２（ａ),(ｂ）は、入力辞書及び結果辞
書の各々の構成図である。図３は辞書並び換えのフロー
チャートである。以下、各ステップに従って順に説明す
る。なおフローチャート中の記号の意味は、以下の表４
のとおりである。FIGS. 2A and 2B are block diagrams of the input dictionary and the result dictionary, respectively. FIG. 3 is a flowchart of dictionary rearrangement. Hereinafter, each step will be described in order. The meanings of the symbols in the flowchart are shown in Table 4 below.
It is as follows.

【００２０】[0020]

【表４】 [Table 4]

【００２１】step１：まず、入力辞書の全カテゴリック
ベクトルのコピーフラグ（既に結果辞書バッファへコピ
ーしたかを示すフラグ）を０クリアする。step２：入力辞書の全カテゴリベクトルで平均ベクトル
を求める。step３，４：平均ベクトルに最も近い入力辞書のカテゴ
リベクトルを求め（最近傍カテゴリ）、それを第１カテ
ゴリとして、結果バッファの第１番目にコピーする。こ
の時、第１カテゴリに選択された入力辞書カテゴリのコ
ピーフラグを１にする。step５：結果辞書カウンタを１に初期化する。step６：結果辞書カウンタの値が認識対象カテゴリ数で
あるかチェックする。もし、認識対象カテゴリ数であれ
ば辞書並び換え処理を終了する。Step 1 : First, the copy flags of all categorical vectors of the input dictionary (flags indicating whether or not they have already been copied to the result dictionary buffer) are cleared to 0. step2 : Obtain an average vector from all category vectors in the input dictionary. step3,4 : Obtain the category vector of the input dictionary closest to the average vector (nearest neighbor category), and copy it to the first in the result buffer as the first category. At this time, the copy flag of the input dictionary category selected as the first category is set to 1. step5 : Initialize the result dictionary counter to 1. step6 : It is checked whether the value of the result dictionary counter is the number of recognition target categories. If it is the number of categories to be recognized, the dictionary rearrangement process is terminated.

【００２２】step７：全てのコピーフラグが０（まだ、
結果辞書にコピーされていない）カテゴリの中で、既に
コピー済の結果辞書の各カテゴリに対して、最も遠いカ
テゴリを次カテゴリとして選択する（選択フローヘ）。step８：次カテゴリを結果辞書にコピーし、対応する入
力辞書のコピーフラグを１にする。step９：結果辞書カウンタをインクリメントする。以降、同様の処理を全ての入力辞書カテゴリが、結果辞
書にコピーされるまで（結果辞書カウンタが認識対象カ
テゴリとなるまで）操り返す。[0022] step7: all of the copy flag is 0 (still,
Among the categories that have not been copied to the result dictionary), the farthest category is selected as the next category for each category of the result dictionary that has already been copied (to the selection flow). step8 : Copy the next category to the result dictionary and set the copy flag of the corresponding input dictionary to 1. step9 : The result dictionary counter is incremented. After that, the same processing is repeated until all the input dictionary categories are copied to the result dictionary (until the result dictionary counter becomes the recognition target category).

【００２３】図４は、カテゴリ選択のフローチャートで
ある。以下、各ステップに従って順に説明する。なお、
フローチャート中の記号の意味は、以下の表５のとおり
である。FIG. 4 is a flow chart of category selection. Hereinafter, each step will be described in order. In addition,
The meanings of the symbols in the flowchart are as shown in Table 5 below.

【００２４】[0024]

【表５】 [Table 5]

【００２５】step１，２：制御に必要な入力辞書カウン
タと最大累積距離バッファを０クリアする。step３，４：入力辞書カテゴリの中で、コピーフラグが
０のものを探す。step５〜１０：評価に用いる累積距離バッファを０クリ
アし、この入力カテゴリベクトルと、（既にコピー済み
の）結果辞書の各カテゴリ間で距離を計算し、求められ
た距離を累積距離バッファに加算していく。step１１〜１４：最大累積距離バッファとこの累積距離
バッファの内容を比較し、累積距離バッファの方が小さ
ければ、次の入力辞書カテゴリの走査へ移る。もし、累
積距離バッファの方が大きければ、その内容を最大累積
距離バッファへコピーし、選択番号バッファに現在の入
力辞書カテゴリ番号（入力辞書カウンタの内容）をコピ
ーする。以降、同様の処理を全ての入力辞書カテゴリに対して行
ない、最終的に選択番号バッファに格納されている番号
が、次の結果辞書へコピーされるカテゴリ番号となる。 Steps 1, 2 : Clear the input dictionary counter required for control and the maximum cumulative distance buffer to zero. step3,4 : Search the input dictionary category for which the copy flag is 0. step5-10 : The cumulative distance buffer used for evaluation is cleared to 0, the distance between this input category vector and each category of the (already copied) result dictionary is calculated, and the obtained distance is added to the cumulative distance buffer. To go. Steps 11 to 14 : The maximum cumulative distance buffer is compared with the contents of this cumulative distance buffer, and if the cumulative distance buffer is smaller, the process moves to the scanning of the next input dictionary category. If the cumulative distance buffer is larger, its contents are copied to the maximum cumulative distance buffer, and the current input dictionary category number (contents of the input dictionary counter) is copied to the selection number buffer. After that, the same processing is performed for all the input dictionary categories, and the number finally stored in the selection number buffer becomes the category number to be copied to the next result dictionary.

【００２６】次に、フラグテーブルの作成方法について
説明する。フラグテーブルは実験的に作成し、作成に用
いられる入力データは、辞書作成に用いられる大量で多
類の特徴ベクトルである。図５は、フラグテーブル作成
部のブロック図で、図中、２１はフラグテーブル作成処
理部、２２は距離計算部、２３はソーティング部、２４
は認識結果フラグ設定部、２５は注目フラグ決定部、２
６は注目フラグ更新部、２７はフラグテーブル作成制御
部、２８は入力特徴データバッファ部、２９は距離バッ
ファ、３０は認識結果（修補）バッファ、３１は認識結
果フラグバッファ、３２は認識結果バッファ、３３はカ
テゴリカウンタである。Next, a method of creating the flag table will be described. The flag table is created experimentally, and the input data used for the creation is a large number and many kinds of feature vectors used for creating the dictionary. FIG. 5 is a block diagram of the flag table creation unit. In the figure, 21 is a flag table creation processing unit, 22 is a distance calculation unit, 23 is a sorting unit, and 24.
Is a recognition result flag setting unit, 25 is an attention flag determination unit, 2
6 is an attention flag update unit, 27 is a flag table creation control unit, 28 is an input feature data buffer unit, 29 is a distance buffer, 30 is a recognition result (correction) buffer, 31 is a recognition result flag buffer, 32 is a recognition result buffer, 33 is a category counter.

【００２７】図６（ａ）〜（ｃ）は、認識結果フラグ、
認識辞書、フラグテーブル部の各々の構成図である。図
７は、フラグテーブル作成のフローチャートである。以
下、各ステップに従って順に説明する。なお、フロチャ
ート中の記号の意味は、以下の表６のとおりである。FIGS. 6A to 6C show recognition result flags,
It is a block diagram of each of a recognition dictionary and a flag table unit. FIG. 7 is a flowchart for creating the flag table. Hereinafter, each step will be described in order. The meanings of the symbols in the flow chart are as shown in Table 6 below.

【００２８】[0028]

【表６】 [Table 6]

【００２９】step１：まず、辞書の全てのフラグテーブ
ルを０クリアする。step２〜４：入力データに対して、各辞書カテゴリベク
トルとの距離計算（この演算には採用される認識手法を
用いる）を行なった後、ソーティングにより認識候補を
作成する。この処理は通常のマッチング処理をそのまま
用いる。従って、ソーティングにより選ばれる候補数も
従来から採用している（あるいはその辞書により認識性
能が保証される）数を用いる。この時、各辞書べクトル
との距離は距離バッファに格納される。step５，６：ソーティング結果をもとに、認識候補とし
て選ばれた候補番号に対応するビットを１、認識候補外
を０として、認識結果フラグを作成する。この認識結果
フラグは、表２のフラグテーブルと同様の構成でカテゴ
リ数分のバイナリーデータである。Step 1 : First, all flag tables in the dictionary are cleared to 0. step2-4 : After calculating the distance between each input data and each dictionary category vector (using the recognition method adopted for this calculation), a recognition candidate is created by sorting. This processing uses the normal matching processing as it is. Therefore, as the number of candidates selected by sorting, the number conventionally adopted (or the recognition performance is guaranteed by the dictionary) is used. At this time, the distance to each dictionary vector is stored in the distance buffer. Steps 5 and 6 : Based on the sorting result, a bit corresponding to the candidate number selected as a recognition candidate is set to 1, and a bit other than the recognition candidate is set to 0 to create a recognition result flag. This recognition result flag is binary data for the number of categories with the same configuration as the flag table of Table 2.

【００３０】step７，８：これらのデータ設定終了後、
各辞書カテゴリ毎にフラグテーブルの作成を行なう。ま
ず、順次辞書カテゴリを注目カテゴリとし、結果バッフ
ァに格納された距離をもとに１カテゴリ当たりの複数の
フラグテーブル（表３）の中から範囲に該当するフラグ
テーブルを決定し、それを注目フラグテーブルとする。 step９，１０：このフラグテーブルと認識結果フラグと
の論理和（ＯＲ）を求め、論理和結果をこの注目フラグ
テーブルの値として格納する（フラグテーブルの更新処
理：表７）。以降、同様の処理を入力ベクトルがなくなるまで操り返
す。[0030]step7,8: After setting these data,
A flag table is created for each dictionary category. Well
Instead, the dictionary category is sequentially set as the attention category, and the result buffer is
Based on the distance stored in
Flags corresponding to the range from the flag table (Table 3)
The table is determined and it is set as the attention flag table. step9,10 : This flag table and recognition result flag
The logical sum (OR) of the
Store as table value (flag table update process
Reason: Table 7). After that, the same processing is repeated until there are no input vectors.
You

【００３１】[0031]

【表７】 [Table 7]

【００３２】この処理の結果、求められたフラグテーブ
ルは、あるカテゴリベクトルに対して、入力ベクトルと
の距離が求められた場合、最終的に候補として採用され
るべきカテゴリ位置のビットが１となったバイナリーデ
ータで、距離が小さい範囲のフラグテーブルは、そのカ
テゴリベクトルに近いカテゴリが１でその他は０とな
り、ある程度距離が大きい範囲のフラグテーブルは、そ
の距離に対応したカテゴリの位置が１となり、非常に近
いカテゴリや遠いカテゴリに対しては０となっている。
この情報は既知の辞書の各カテゴリベクトル間の距離関
係を保存したものであり、本発明の重要な要素は、この
情報に保存しつつ、学習ベクトル（フラグテーブル作成
に用いた特徴ベクトル）に対しては、認識性能の低下が
論理的に発生しないことである。従って、距離関係の保
存の他、学習ベクトルに対する認識性能も完全に保存さ
れている。As a result of this processing, in the flag table thus obtained, when the distance from a certain category vector to the input vector is obtained, the bit of the category position to be finally adopted as a candidate becomes 1. In the binary data, the flag table in the range with a small distance is 1 in the category close to the category vector, and 0 in the others. In the flag table in the range with a relatively large distance, the position of the category corresponding to the distance is 1. It is 0 for very close categories and distant categories.
This information saves the distance relationship between each category vector of the known dictionary, and the important element of the present invention is to save the information and to the learning vector (feature vector used to create the flag table). That is, the reduction in recognition performance does not logically occur. Therefore, the recognition performance for the learning vector is completely preserved in addition to the preservation of the distance relation.

【００３３】次に、高速マッチング方法について説明す
る。図８は、高速マッチング処理のブロック図で、図
中、４１は距離計算部、４２はフラグテーブル判定部、
４３は高速マッチング制御部、４４は距離バッファ、４
５は可能性フラグテーブル、４６はカテゴリカウンタ、
４７は入力特徴ベクトル、４８は認識辞書、４９は注目
フラグテーブル番号バッファ、５０は先頭アドレスレジ
スタ、５１は終了アドレスレジスタ、５２はカウンタ、
５３は論理演算部、５４は更新制御部である。Next, the high-speed matching method will be described. FIG. 8 is a block diagram of the high-speed matching process, in which 41 is a distance calculation unit, 42 is a flag table determination unit,
43 is a high-speed matching control unit, 44 is a distance buffer, 4
5 is a possibility flag table, 46 is a category counter,
47 is an input feature vector, 48 is a recognition dictionary, 49 is an attention flag table number buffer, 50 is a start address register, 51 is an end address register, 52 is a counter,
Reference numeral 53 is a logical operation unit, and 54 is an update control unit.

【００３４】図９（ａ）〜（ｃ）は、可能性フラグテー
ブル、認識辞書、フラグテーブル部の各々の構成図であ
る。図１０は、高速マッチング処理のフローチャートで
ある。以下、各ステップに従って順に説明する。なおフ
ローチャート中の記号の意味は、次の表８のとおりであ
る。FIGS. 9A to 9C are block diagrams of the possibility flag table, the recognition dictionary, and the flag table section. FIG. 10 is a flowchart of the high speed matching process. Hereinafter, each step will be described in order. The meanings of the symbols in the flowchart are as shown in Table 8 below.

【００３５】[0035]

【表８】 [Table 8]

【００３６】step１，２：最初に可能性フラグテーブル
の全ビットを１にセットする。この司能性フラグテーブ
ルは、表２のフラグテーブルと同様の構成であり、各ビ
ット位置が辞書の各カテゴリ番号に対応している。カテ
ゴリカウンタを０に初期化する。step３，４：カテゴリカウンタで示される可能性フラグ
テーブルのビットが１か（そのカテゴリをマッチング計
算してよいか）どうかを判定する。もし０ならば、マッ
チング処理を行なわず、カデゴリカウンタのインクメン
トを行ない、次の辞書のカテゴリへ処理を進める。step５：可能性フラグテーブルのビットが１の場合、そ
の辞書カテゴリベクトルと入力ベクトルとの距離計算を
行なう。step６：求められた距離をもとに、その辞書カテゴリの
注目のフラグテーブルを決定する。 Steps 1, 2 : First, all bits of the possibility flag table are set to 1. This versatility flag table has the same structure as the flag table of Table 2, and each bit position corresponds to each category number in the dictionary. Initialize the category counter to 0. step 3,4 : It is judged whether the bit of the possibility flag table indicated by the category counter is 1 (whether the category may be subjected to matching calculation). If it is 0, the matching process is not performed, the categorical counter is incremented, and the process is advanced to the category of the next dictionary. step5 : When the bit of the possibility flag table is 1, the distance between the dictionary category vector and the input vector is calculated. step 6 : Determine the attention flag table of the dictionary category based on the obtained distance.

【００３７】step７：この注目フラグテーブルと可能性
フラグテーブルとの論理積（ＡＮＤ）を求め、その結果
を可能性フラグテーブルへコピーする（可能性フラグテ
ーブル更新処理：後述）。step８：カテゴリカウンタをインクリメントし、次の辞
書カテゴリへ処理を進める。以降、同様の処理をカテゴリカウンタが認識対象カテゴ
リ数となるまで操り返す。以上の処理により、注目のカ
テゴリにより計算された距離をもとに、以降の可能性の
あるカテゴリの決定を遂次行ない、無駄なカテゴリとの
マッチング計算をパスすることにより、高速マッチング
処理を実現できる。[0037] step7: logically ANDed with the attention flag table and likelihood flag table (the AND), the result is copied to the likelihood flag table (likelihood flag table update process: see below). step8 : increment the category counter and proceed to the next dictionary category. After that, similar processing is repeated until the category counter reaches the number of recognition target categories. With the above processing, based on the distance calculated by the category of interest, the possibility of subsequent categories is determined successively, and the matching calculation with unnecessary categories is passed, realizing high-speed matching processing. it can.

【００３８】最後に、可能性フラグの更新処理を説明す
る。可能性フラグは、注目フラグとの論理積（ＡＮＤ）
により遂時更新される。前述した、高速マッチング処理
動作においては、この論理積範囲はフラグの全て（認識
対象カテゴリビット数分）であってもかまわない。しか
し、更新された可能性フラグの必要範囲は、現カテゴリ
番号以降の範囲である。従って、更新処理の論理積（Ａ
ＮＤ）処理もその範囲のみで行なえばよいことになる。
この範囲を限定した更新処理の説明を行なう。フラグテ
ーブルは、バイナリーデータであるので、通常はカテゴ
リ数÷８で概算できるバイト数のデータである。論理演
算はバイト単位で行なえるので、更新処理の論理積（Ａ
ＮＤ）もバイト単位で行なう。このバイト単位の処理は
メモリアクセス長の問題であるので、システムにより１
６ビットまたは３２ビットなどのアクセスが可能の場
合、本発明と同様に処理できる。またアクセス幅が大き
くなるほど処理は高速となる。Finally, the update processing of the possibility flag will be described. The possibility flag is a logical product (AND) with the flag of interest.
Will be updated by In the above-described high-speed matching processing operation, this logical product range may be all of the flags (corresponding to the number of recognition target category bits). However, the necessary range of the updated possibility flag is the range after the current category number. Therefore, the logical product (A
The ND) process only needs to be performed within that range.
The update process in which this range is limited will be described. Since the flag table is binary data, it is usually data of the number of bytes which can be roughly estimated by the number of categories / 8. Since logical operations can be performed in byte units, the logical product (A
ND) is also performed in byte units. Since this byte unit processing is a problem of memory access length,
If 6-bit or 32-bit access is possible, it can be processed in the same manner as the present invention. Further, the larger the access width, the faster the processing.

【００３９】図１１は、可能性フラグテーブル更新のフ
ローチャートである。以下、各ステップに従って順に説
明する。なお、フローチャート中の記号の意味は、以下
の表９のとおりである。FIG. 11 is a flowchart for updating the possibility flag table. Hereinafter, each step will be described in order. The meanings of the symbols in the flowchart are as shown in Table 9 below.

【００４０】[0040]

【表９】 [Table 9]

【００４１】step１：まず、現カテゴリ番号により、フ
ラグの先頭バイトアドレスを計算し（これはカテゴリ番
号÷８で求められる）、アドレスカウンタに代入する。step２〜５：アドレスカウンタで示される可能性フラグ
のバイトデータと注目フラグのバイトデータの論理積
（ＡＮＤ）演算を行ない、結果を可能性フラグに代入す
る。同様にアドレスカウンタが終了となるまで操り返
す。この処理により、高速マッチング処理に必要な範囲のみ
の可能性フラグの更新が行なえる。Step 1 : First, the head byte address of the flag is calculated based on the current category number (this is calculated by category number / 8), and the result is substituted into the address counter. Steps 2 to 5 : A logical product (AND) operation is performed on the byte data of the possibility flag indicated by the address counter and the byte data of the attention flag, and the result is assigned to the possibility flag. Similarly, the operation is repeated until the address counter reaches the end. By this processing, the possibility flag can be updated only in the range necessary for the high-speed matching processing.

【００４２】次に、距離尺度以外のマッチング手法の適
用について説明する。本発明は、通常の絶対差距離やユ
ーリッド距離など、全ての距離尺度を用いたマッチング
手法を採用しているものに対して有効である。この他、
認識装置には類似度なる尺度も多く採用されている。類
似度（Ｓ）は以下の通りの数１の計算式で求められる。Next, application of a matching method other than the distance measure will be described. INDUSTRIAL APPLICABILITY The present invention is effective for those adopting a matching method using all distance measures such as normal absolute difference distance and Eulide distance. Besides this,
Many measures of similarity are also used in the recognition device. The degree of similarity (S) is calculated by the following formula (1).

【００４３】[0043]

【数１】 [Equation 1]

【００４４】これは単に２つのベクトルの内積値をそれ
ぞれのべクトルのノルムで正規化していることで、各ベ
クトルのノルムを１に正規化した次の数２の（２）式と
等価になる。This is because the inner product value of two vectors is normalized by the norm of each vector, and is equivalent to the following equation (2) of the following equation 2 in which the norm of each vector is normalized to 1. .

【００４５】[0045]

【数２】 [Equation 2]

【００４６】このベクトルを用いたユークリッド距離
（Ｄ）を考えると、以下の数３の（３）式となる。Considering the Euclidean distance (D) using this vector, the following equation (3) is obtained.

【００４７】[0047]

【数３】 [Equation 3]

【００４８】結局、類似度とユークリッド距離とは等価
（裏表）な関係にある。従って、類似度法に対しても本
発明の適用は可能となる。After all, the degree of similarity and the Euclidean distance are in an equivalent (front and back) relationship. Therefore, the present invention can be applied to the similarity method.

【００４９】このように、本発明は、高速マッチング処
理に使用するための、辞書の各カテゴリベクトルの距離
関係を保存したフラグテーブルの構成とその作成方法で
あり、以下の点を特徴としている。（１）１つのフラグテーブルは辞書の各カテゴリ番号に
対応したバイナリーデータであり、フラグテーブルのビ
ット値が１の場合は、対応するカテゴリが候補となる可
能性あり（マッチング処理する必要あり）、ビット値が
０の場合は、対応するカテゴリ候補となる可能性なし
（マッチング処理の必要なし）と意味付けられたもので
ある。（２）１カテゴリ当たりのフラグテーブルの構成は、認
識手法に用いられる尺度（距離など）の最大値をある分
割数で分割し、分割された各距離範囲に対応した意味付
けのフラグテーブルを、分割した個数個備えたもので構
成されており、入力ベクトルと注目カテゴリとのマッチ
ング計算結果（距離など）により、注目カテゴリの注目
フラグテーブルが１つ決定できる構成である。As described above, the present invention is a flag table configuration for storing the distance relation of each category vector of the dictionary for use in high-speed matching processing and a method of creating the flag table, and is characterized by the following points. (1) One flag table is binary data corresponding to each category number in the dictionary, and if the bit value of the flag table is 1, the corresponding category may be a candidate (needs matching processing), When the bit value is 0, it means that there is no possibility of being a corresponding category candidate (no need for matching processing). (2) The configuration of the flag table per category is such that the maximum value of the scale (distance etc.) used in the recognition method is divided by a certain number of divisions, and a flag table with meaning corresponding to each divided distance range is It is configured by including the divided number of pieces, and one attention flag table of the attention category can be determined based on the matching calculation result (distance or the like) between the input vector and the attention category.

【００５０】（３）本発明を効果的に動作させるため、
フラグテーブル作成前に、以前に決定された辞書の各カ
テゴリベクトルに対して、最も遠い関係にあるものから
順番に、辞書のカテゴリベクトルを並び換える操作を行
なう。（４）辞書作成に用いた多種大量の特徴ベクトルを入力
データとし、採用されているマッチング手法と候補数を
もとに、それぞれマッチング処理とソーティング処理の
結果得られた認識結果候補と、辞書の各カテゴリベクト
ルとのマッチング結果（距離など）をもとに、各カテゴ
リの対応する注目フラグテーブルの更新処理を、入力特
徴ベクトルがなくなるまで操り返すことにより、各カテ
ゴリベクトルの距離関係を保存したフラグテーブルを作
成する方法である。(3) In order to operate the present invention effectively,
Before creating the flag table, an operation of rearranging the category vectors of the dictionary is performed for each category vector of the dictionary determined previously in order from the one having the farthest relationship. (4) The recognition result candidates obtained as a result of the matching process and the sorting process and the dictionary of the dictionary are obtained based on the matching method and the number of candidates that have been adopted, using a large amount of various feature vectors used to create the dictionary as input data. Based on the matching result with each category vector (distance, etc.), the flag that saves the distance relationship of each category vector by repeating the update process of the corresponding flag table of each category until there is no input feature vector. How to create a table.

【００５１】また、辞書の各カテゴリベクトルの距離関
係を保存したフラグテーブルを用いて、高速にマッチン
グ処理する方法であり、以下の点を特徴としている。（１）１つのフラグテーブルと同様の構成をした可能性
フラクテーブルを備え、マッチング処理前にそれの全ビ
ット値を１に初期化する。（２）カテゴリカウンタで示されるカテゴリに対応した
可能性フラグビット値が１つであるか判定することによ
り、マッチング処理すべき注目カテゴリを決定する。（３）入力ベクトルとその注目カテゴリに対して、採用
されているマッチング計算を行ない、計算結果（距離な
ど）をもとに、注目カテゴリ注目フラグテーブルを決定
する。Further, this is a method for performing a high-speed matching process using a flag table that stores the distance relationship of each category vector of the dictionary, and is characterized by the following points. (1) A possibility fractal table having the same structure as one flag table is provided, and all bit values of the possibility fractal table are initialized to 1 before the matching process. (2) The attention category to be subjected to the matching process is determined by determining whether the possibility flag bit value corresponding to the category indicated by the category counter is one. (3) The matching calculation adopted is performed for the input vector and its attention category, and the attention category attention flag table is determined based on the calculation result (distance etc.).

【００５２】（４）その注目フラグテーブルと可能性フ
ラグテーブルに対して、現カテゴリカウンタ以降のビッ
トデータに対してのみ、論理演算を用いて可能性フラグ
テーブルの更新を行なう。（５）以降、カテゴリカウンタが認識対象カテゴリ数と
なるまで、同様の処理を操り返すことにより、可能性の
あるカテゴリベクトルとのみマッチング計算を行なうこ
とで、高速にマッチング処理を行なう。(4) With respect to the attention flag table and the possibility flag table, the possibility flag table is updated using a logical operation only for bit data after the current category counter. (5) After that, the same process is repeated until the category counter reaches the number of categories to be recognized, so that the matching calculation is performed only with a possible category vector, thereby performing the matching process at high speed.

【００５３】[0053]

【発明の効果】以上の説明から明らかなように、本発明
によると、以下のような効果である。すなわち、本発明
は文字や音声の認識装置において、入力されたパターン
（文字や音声）と、予め備えてある認識辞書とのマッチ
ング処理を高速に行なう方法である。本発明では高速マ
ッチング処理を実現するため、各辞書ベクトル間の距離
関係が既知であることを利用し、その距離関係を保存し
たフラグテーブルを各辞書ベクトル毎に予め設定してお
き、入力ベクトルと注目カテゴリベクトルとのマッチン
グ結果（距離や類似度など）をもとにフラグテーブルを
用いた論理演算により、以降カテゴリに対して、認識候
補としての可能性があるカテゴリ（可能性カテゴリ）の
みを決定し、次の可能性カテゴリを注目カテゴリとし
て、遂次、同様のマッチング処理およびフラグによる決
定処理を操り返し高速に処理することができる。As is apparent from the above description, the present invention has the following effects. That is, the present invention is a method for performing high-speed matching processing between an input pattern (characters or voice) and a recognition dictionary provided in advance in a character or voice recognition device. In order to realize the high-speed matching process in the present invention, the fact that the distance relationship between the dictionary vectors is known is used, and a flag table storing the distance relationship is set in advance for each dictionary vector and Based on the matching result with the category vector of interest (distance, similarity, etc.), a logical operation using a flag table is used to determine only the category (possible category) that may be a recognition candidate for subsequent categories. However, with the next possibility category as the attention category, the matching process and the determination process by the flag can be repeated and processed at high speed.

【図面の簡単な説明】[Brief description of drawings]

【図１】本発明による高速マッチング方式の辞書並び換
え処理部の一実施例を説明するための構成図である。FIG. 1 is a block diagram for explaining an embodiment of a high-speed matching type dictionary rearrangement processing unit according to the present invention.

【図２】図１における入力辞書及び結果辞書の構成図で
ある。FIG. 2 is a configuration diagram of an input dictionary and a result dictionary in FIG.

【図３】本発明による高速マッチング方式の辞書並び換
え処理のフローチャートである。FIG. 3 is a flowchart of a dictionary rearrangement process of a high speed matching system according to the present invention.

【図４】本発明による高速マッチング方式のカテゴリ選
択処理のフローチャートである。FIG. 4 is a flowchart of category selection processing of a high speed matching method according to the present invention.

【図５】本発明による高速マッチング方式のフラグテー
ブル作成部の構成図である。FIG. 5 is a configuration diagram of a flag table creation unit of a high-speed matching method according to the present invention.

【図６】図５における認識結果フラグ，認識辞書，フラ
グテーブル部の構成図である。FIG. 6 is a configuration diagram of a recognition result flag, a recognition dictionary, and a flag table unit in FIG.

【図７】本発明による高速マッチング方式のフラグテー
ブル作成処理のフローチャートである。FIG. 7 is a flowchart of a flag table creation process of a high speed matching method according to the present invention.

【図８】本発明による高速マッチング方式の高速マッチ
ング処理部の構成図である。FIG. 8 is a configuration diagram of a high-speed matching processing unit of a high-speed matching system according to the present invention.

【図９】図８における可能性フラグテーブル，認識辞
書，フラグテーブル部の構成図である。9 is a configuration diagram of a possibility flag table, a recognition dictionary, and a flag table unit in FIG.

【図１０】本発明による高速マッチング方式の高速マッ
チング処理部のフローチャートである。FIG. 10 is a flowchart of a high-speed matching processing unit of a high-speed matching method according to the present invention.

【図１１】本発明による高速マッチング方式の可能性フ
ラグテーブル更新処理のフローチャートである。FIG. 11 is a flowchart of a possibility flag table updating process of the high speed matching method according to the present invention.

【図１２】本発明の辞書ベクトルと入力ベクトルとの位
置関係を示す図である。FIG. 12 is a diagram showing a positional relationship between a dictionary vector and an input vector according to the present invention.

【図１３】本発明を説明するための従来方式の計算例を
示す図である。FIG. 13 is a diagram showing a calculation example of a conventional method for explaining the present invention.

【図１４】本発明の第１カテゴリとの計算結果と可能性
カテゴリを示す図である。FIG. 14 is a diagram showing calculation results and a possibility category with the first category of the present invention.

【図１５】本発明の第２カテゴリとの計算結果と可能性
カテゴリを示す図である。FIG. 15 is a diagram showing a calculation result and a possibility category with the second category of the present invention.

【図１６】従来のマッチング処理のフローチャートであ
る。FIG. 16 is a flowchart of a conventional matching process.

[Explanation of symbols]

１…辞書並び換え処理部、２…平均ベクトル作成部、３
…最近傍カテゴリ決定部、４…カテゴリベクトルコピー
部、５…コピーフラグ制御部、６…辞書並び換え制御
部、７…平均ベクトルバッファ、８…入力辞書バッフ
ァ、９…結果辞書バッファ、１０…結果辞書カウンタ、
１１…選択番号バッファ、１２…入力辞書カウンタ、１
３…最大累積距離バッファ、１４…累積距離バッファ、
１５…カウンタ、１６…距離バッファ、１７…カテゴリ
選択制御部、１８…カテゴリ選択処理部。1 ... Dictionary rearrangement processing unit, 2 ... Average vector creation unit, 3
... nearest category determination unit, 4 ... category vector copy unit, 5 ... copy flag control unit, 6 ... dictionary rearrangement control unit, 7 ... average vector buffer, 8 ... input dictionary buffer, 9 ... result dictionary buffer, 10 ... result Dictionary counter,
11 ... Selection number buffer, 12 ... Input dictionary counter, 1
3 ... maximum cumulative distance buffer, 14 ... cumulative distance buffer,
15 ... Counter, 16 ... Distance buffer, 17 ... Category selection control unit, 18 ... Category selection processing unit.

───────────────────────────────────────────────────── フロントページの続き (58)調査した分野(Int.Cl.⁷，ＤＢ名) G06T 7/00 - 7/60 G06K 9/62 - 9/82 G10L 15/00 - 17/00 ＪＩＣＳＴファイル（ＪＯＩＳ)─────────────────────────────────────────────────── ─── Continuation of the front page (58) Fields surveyed (Int.Cl. ⁷ , DB name) G06T ⁷ /00-7/60 G06K 9/62-9/82 G10L 15/00-17/00 JISST file ( JOIS)

Claims

(57) [Claims]

1. It is set for each dictionary category in the recognition dictionary.
1-bit information for each dictionary category corresponding to the dictionary number
Represented as, that is, bit information for the number of dictionary categories
Binary data with is used as one flag table, and the similarity degree described below is divided by a certain number for each dictionary category.
Data structure with a flag table of the number of items
Beforehand, all bits of the flag table of all dictionary categories
Is initialized to 0, and a recognition result flag with the same structure as one flag table
And each dictionary category in the recognition dictionary for the input vector
Before the calculation means described below to calculate the degree of similarity with the vector,
Initialize all bits of the recognition result flag to 0 in advance.
The input vector to be matched with the dictionary category vector
In contrast, it is similar to each dictionary category vector in the recognition dictionary
Calculating means for calculating the degree, the input vector obtained by the calculating means, and each dictionary
Sorting is performed based on the degree of similarity with the Tegori vector.
The recognition candidate creating means for creating a recognition candidate and the dictionary number of each dictionary category selected as the recognition candidate.
The corresponding recognition result flag bit is set to 1, and other
That is, each dictionary category not selected as a recognition candidate
The bit of the recognition result flag corresponding to the dictionary number of
, A flag creating means for creating a recognition result flag and all dictionary categories in the recognition dictionary are sequentially scanned to
The calculation for the category vector and the input vector
Based on the degree of similarity calculated by
One selected according to the degree of similarity in the flag table
Flag table of interest as flag table of interest
Attention flag table determination means , logic of the attention flag table and the recognition result flag
A flag table that creates a flag table by finding the sum
And a calculation means, a recognition candidate creation means, a flag creation means, and a note creation means.
Eye flag table determination means and flag table creation
Means by implementing the means on the input vector.
A flag table creating device for creating a table.

2. The recognition dictionary is set for each dictionary category.
Are arranged in the order of the dictionary numbers
The categories are earlier, that is, the dictionary category
For all dictionary categories with dictionary numbers smaller than
Select the dictionary category with the lowest degree of similarity and sort them sequentially
It has a recognition dictionary rearranging means for
Use the recognition dictionary sorted by the dictionary sorting means
The flag table production according to claim 1, characterized in that
Equipment.

3. It is set for each dictionary category in the recognition dictionary.
1-bit information for each dictionary category corresponding to the dictionary number
Represented as, that is, bit information for the number of dictionary categories
Binary data with is used as one flag table, and the similarity degree described below is divided by a certain number for each dictionary category.
Data structure with a flag table of the number of items
Te, the possibility of the same configuration as one flag table Furagutebu
Before each recognition process for input data,
All bits of the possibility flag table are initialized to 1, and an input vector that becomes a feature vector for recognition is input data.
A feature vector creating means for creating a vector and an input vector for each dictionary category in the recognition dictionary.
A calculation means for calculating the degree of similarity with the cutout, and each bit of the possibility flag table is scanned in order,
Calculating means for determining whether or not the bit is 1;
If the bit is judged to be a candidate for knowledge,
For the corresponding dictionary category,
The degree of similarity with the call category vector is calculated by the calculation means.
Based on the calculated similarity, the dictionary category
One selected according to the degree of similarity in the flag table
Flag table of interest as flag table of interest
Attention flag table determination means and the attention flag table determination means
Logic of flag table and possibility flag table
Find the product and store the result in the possibility flag table
Update means to update the possibility flag table by
Characterized by having the determination means, the calculation means, the attention flag table determiner
Stage, the updating means, pairs all bits possibility flag table
, Ie, for all dictionary categories
By doing so, high-speed matching that performs high-speed matching processing
Processing equipment.

4. It is set for each dictionary category in the recognition dictionary.
1-bit information for each dictionary category corresponding to the dictionary number
Represented as, that is, bit information for the number of dictionary categories
Binary data with is used as one flag table, and the similarity degree described below is divided by a certain number for each dictionary category.
Data structure with a flag table of the number of items
Beforehand, all bits of the flag table of all dictionary categories
Is initialized to 0, and a recognition result flag with the same structure as one flag table
And each dictionary category in the recognition dictionary for the input vector
Before the calculation step described below to calculate the degree of similarity with the vector
In advance, all bits of the recognition result flag are initialized to 0.
Every input vector to be matched with the dictionary category vector
In contrast, it is similar to each dictionary category vector in the recognition dictionary
The calculation step for calculating the degree, the input vector obtained by the calculation step, and the respective words
Sorting based on the degree of similarity with the calligraphy category vector
Recognition candidate creation step of creating a recognition candidate, and the dictionary number of each dictionary category selected as the recognition candidate.
The corresponding recognition result flag bit is set to 1, and other
That is, each dictionary category not selected as a recognition candidate
The bit of the recognition result flag corresponding to the dictionary number of
, A flag creation step of creating a recognition result flag, and sequentially scans all dictionary categories in the recognition dictionary,
The calculation for the category vector and the input vector
Based on the similarity calculated in step, the dictionary category
Is selected by the degree of similarity in the flag table of
Determine one flag table as the flag table of interest
Attention flag table determination step
Table and the recognition result flag.
Create a flag table with and
And the calculation step, the recognition candidate creation step, and the flag creation.
Step, attention flag table determination step, and
Perform the lag table creation step for the input vector.
A flag table that creates a flag table by applying
Bull making method.

5. The recognition dictionary is set for each dictionary category.
Are arranged in the order of the dictionary numbers
The categories are earlier, that is, the dictionary category
For all dictionary categories with dictionary numbers smaller than
Select the dictionary category with the lowest degree of similarity and sort them sequentially
And a recognition dictionary rearrangement step for performing
Recognition dictionaries sorted by the recognition dictionary sorting step
5. The flag table as set forth in claim 4, characterized in that
Cable creation method.

6. A method is set for each dictionary category in the recognition dictionary.
1-bit information for each dictionary category corresponding to the dictionary number
Represented as, that is, bit information for the number of dictionary categories
Binary data with is used as one flag table, and the similarity degree described below is divided by a certain number for each dictionary category.
Data structure with a flag table of the number of items
Te, the possibility of the same configuration as one flag table Furagutebu
Before each recognition process for input data,
All bits of the possibility flag table are initialized to 1, and an input vector that becomes a feature vector for recognition is input data.
The feature vector creation step for creating a vector and the dictionary vector for each dictionary in the recognition dictionary for the input vector.
Calculation step for calculating the degree of similarity to the cutout, and scanning each bit of the possibility flag table in order,
A determination step for determining whether the bit is 1
And the bit is 1 in the determination step, that is,
If there is a possibility that the
Input dictionary for the dictionary category corresponding to
The degree of similarity between the
Calculated with the step, and based on the degree of similarity obtained, the dictionary
Select by the degree of similarity in the flag table of the category
One flag table exposed is the flag table of interest.
Determined by the attention flag table determination step and the attention flag table determination step
A discussion of the attention flag table and the possibility flag table
It is possible to obtain the logical product and store the result in the possibility flag table.
Update step to update possibility flag table by
Characterized by having a flop, said determination step, calculation step, attention flag table
Decision step, update step, possibility flag table
For all bits of, that is, for all dictionary categories
By performing this, it is possible to perform matching processing at high speed.
High-speed matching processing method.