JPH1139344A - Character string retrieval method using two-dimensional array code

Info

Publication number: JPH1139344A
Application number: JP9209932A
Authority: JP
Inventors: Yoichi Tomiyama; 洋一冨山
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1997-07-18
Filing date: 1997-07-18
Publication date: 1999-02-12
Anticipated expiration: 2017-07-18
Also published as: JP3129248B2

Abstract

PROBLEM TO BE SOLVED: To reduce the number of comparisons and speed up retrieval even when multiple designations are present, by encoding a condition string with a two-dimensional array and comparing that code with the comparison target characters, including for multiple-designation condition characters. SOLUTION: Character strings that satisfy a given condition string 1 are retrieved from character string data 4 and stored in a match data file 6. Comparison process 5 performs the comparison using condition character code 3, which is obtained by encoding condition string 1. Encoding splits condition string 1 into individual condition characters and assigns each a first-dimension storage number (primary subscript) of the array. Each character is then classified by condition type, and the type number is assigned to the head of the second-dimension storage number (secondary subscript). For a multiple designation, the applicable data group is stored as flags, per character type, in the element of the corresponding secondary subscript. The result is condition character code 3.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a character string search method, and more particularly to a character string search method performed by a computer.

[0002]

2. Description of the Related Art

In a character search method that compares a plurality of character string data items stored in a file with a condition string, the first character of the comparison target string is compared with the first character of the condition string. If they match, the subsequent characters of the two strings are compared in sequence to determine whether the target string matches the condition string as a whole. Besides ordinary characters, a condition string may contain special characters: one that represents any single character, one that represents any character (or character string), and one that represents several characters by listing them or giving a range (this way of specifying is called "multiple designation").

[0003] To perform a comparison against a multiple designation, information on every character that satisfies the condition under that designation (the "applicable character group") must be held and compared individually with the character under comparison.

[0004] A conventional character search method is described below with reference to FIG. 10. In the conventional method, encoding process 102 is applied to condition string 101 and, as shown by condition character code 103, the information on the applicable character group is stored one character per array element. As a result, the number of comparisons in comparison process 104 against comparison target string 105 increases.

[0005]

PROBLEMS TO BE SOLVED BY THE INVENTION

As described above, the conventional character search method has the following problems.

[0006] The first problem is that performance degrades markedly when a multiple designation is used.

[0007] The reason is as follows. Let n be the number of applicable characters in the designation. For each character string data item under comparison, the multiple-designation part takes n comparisons when the target character is not in the applicable character group, and on average (1 + n) / 2 comparisons even when it is. The number of comparisons therefore grows with the number of applicable characters and, above all, with the amount of character string data.
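The two counts in this paragraph can be checked with a short sketch. It assumes the conventional code is scanned one element at a time and that each applicable character is equally likely to be the one that matches; the function name `count_comparisons` and the sample characters are illustrative only and do not come from the specification.

```python
def count_comparisons(candidates, target):
    """Scan candidates one by one; return (matched, number of comparisons made)."""
    comparisons = 0
    for ch in candidates:
        comparisons += 1
        if ch == target:
            return True, comparisons
    return False, comparisons


# Illustrative applicable character group with n = 9 members.
candidates = list("56789ABCX")
n = len(candidates)

# Miss: every candidate is examined, so n comparisons are made.
miss = count_comparisons(candidates, "E")

# Hit: averaged over all n possible matching characters.
hits = [count_comparisons(candidates, ch)[1] for ch in candidates]
average_hit = sum(hits) / n

print(miss)         # (False, 9)
print(average_hit)  # 5.0, which equals (1 + 9) / 2
```

The average follows from 1 + 2 + ... + n = n(n + 1) / 2, divided by the n equally likely hit positions.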

[0008] The second problem is that there is no suitable way to determine the size (number of elements) of the array that holds the individual characters needed to store the applicable character group of a multiple designation.

[0009] The reason is as follows. Under the approach described above, the array that stores the applicable character groups of the multiple-designation parts must hold (number of condition characters × number of permitted characters) elements. In practice, however, a condition string usually contains at most one or two multiple designations, and the number of applicable characters is, on average, likely to be only a fraction of the total number of usable characters. Most of the array may therefore be wasted.

[0010] If, on the other hand, a smaller array is prepared, some designations will need more elements than the array provides, which can cause an out-of-bounds array reference.

[0011] Accordingly, an object of the present invention is to provide a character search method that reduces the number of comparisons and speeds up searching even when multiple designations are present.

[0012] A further object of the present invention is to provide a character search method in which the size of the array needed to store the applicable-character information of a multiple designation is uniquely determined and involves little waste.

[0013]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above objects, the present invention, in outline, encodes the condition string using a two-dimensional array. The character search method of the present invention is characterized in that, in searching for a character string by condition designation, each condition character is encoded using a two-dimensional array that represents the character types and the applicable-character flags within each character type, and the resulting code is compared with the comparison target characters, including for multiple-designation condition characters.

[0014] The present invention is further characterized by a two-dimensional array having the condition type as the row-direction subscript and the character type as the column-direction subscript. The corresponding condition character type code is stored at the head of the column-direction subscript. In the case of a multiple designation, the elements that follow in the column direction are encoded in sequence with the applicable flags of the character types. A character search is performed by comparing the code value obtained for a comparison target character, from the applicable flag of its character type, with the code stored at the subscript of the two-dimensional array that corresponds to that character type.

[0015] In the present invention, the condition code consists of an array, one per condition character, that stores the applicable-character flag information for each character type, and an array that manages these arrays for all the condition characters.

[0016]

EMBODIMENTS OF THE INVENTION

Next, embodiments of the present invention are described with reference to the drawings.

[0017] FIG. 1 is a diagram for explaining an embodiment of the present invention. It shows a system that searches character string data 4 for character strings satisfying a given condition string 1 and stores them in match data file 6. Comparison process 5 in this system performs the comparison using condition character code 3, which is obtained by applying the encoding process (2 in FIG. 1) to condition string 1.

[0018] Next, condition character code 3 in the embodiment of the present invention is described in detail with reference to FIG. 2.

[0019] In encoding, the given condition string 1 is split into individual condition characters, and each is assigned a storage number in the first dimension of the array (the "primary subscript"). Each character is then classified by condition type, and the type number is assigned to the head of the second-dimension storage number (the "secondary subscript"). In particular, for a multiple designation, the applicable data group is stored as flags, per character type, in the element of the corresponding secondary subscript (7 in FIG. 2). The result is condition character code 3.

[0020] FIG. 3 is a flowchart for explaining the operation of the embodiment of the present invention. The operation is described below with reference to FIG. 3.

[0021] A given condition string is first classified (step S11), and it is determined whether the string as a whole is a unique designation or a prefix match (step S12). If it is, encoding would only add processing, so the comparison is performed with standard functions or the like (step S14).

[0022] Otherwise (the No branch of step S12), each constituent condition character is classified by condition type. For a multiple designation, the applicable data group is converted to flags per character type and stored in the elements of the array (step S13).

[0023] Next, a comparison target string is obtained from character string data file 4 (step S15), and it is determined whether the character string data has been exhausted (step S16).

[0024] If the character string data has not been exhausted, the character-by-character comparison is performed (step S17), followed by the final match determination (step S18), and it is determined whether the strings match (step S19). If they match, the matching character string data is written out (step S20), and the next character string data item is read (step S15). Steps S15 to S20 are performed for every character string data item in the character string data file.

[0025] FIG. 4 is a flowchart showing the character-by-character comparison of step S17 in FIG. 3. The comparison with the comparison target string is described below with reference to FIG. 4.

[0026] In the comparison with the comparison target string, whether the condition character is a multiple designation is determined from the first element of the secondary subscript of the corresponding condition character code (step S42). If it is not a multiple designation, encoding would only add processing, so the conventional comparison for that condition type is performed (step S44).

[0027] If it is a multiple designation, the character type of the (i + 1)th character of the comparison character data is determined, and the character is encoded for that character type only (step S43).

[0028] Next, the comparison is made by taking the logical AND of this comparison target character code and the element (flags) of the corresponding condition character code for the same character type (step S45).

[0029] Whether the characters match is determined from the result of this comparison (step S46). Specifically, a nonzero result in step S45 means a match, and a result of 0 means a mismatch.

[0030] If the comparison shows a mismatch, the next character string data item is read (step S15 in FIG. 3).

[0031] If the characters match, the counter (i) is incremented to move on to the next condition character and the next comparison target character (step S47). If the comparison string has been exhausted at this point, processing moves to the final match determination (step S18 in FIG. 3).

[0032] If it has not been exhausted, steps S42 to S48 are repeated. The processing of each of the above steps can be realized under the control of a program executed on a computer.

[0033]

EXAMPLE

Next, an example of the present invention is described. The usable characters are the following, with the character types shown.

[0034] . (period): character type 1.

[0035] 0 to 9: character type 2.

[0036] A to Z: character type 3.

[0037] There are four ways of specifying a condition character, each corresponding to the condition type number shown.

[0038] - Unique designation (condition type 0): specifies one character from among the usable characters.

[0039] - Any-one-character designation (condition type 1): represents any single usable character.

[0040] - Any-character designation (condition type 2): represents any usable character or character string.

[0041] - Multiple designation (condition type 3): specifies several usable characters, for example by listing them or by giving a range.

[0042] Examples of multiple designation are given briefly below.

[0043] [ABC] → any one of A, B, C.

[0044] [D-F] → any one of D, E, F.

[0045] [AC-E] → any one of A, C, D, E.

[0046] A range may also span character types; for this purpose the character types are treated as continuous in the order 1 to 3.
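The designation rules above can be sketched as a small expansion routine. This is a minimal illustration, not the method of the specification: the function name `expand_designation`, the use of an ASCII hyphen for the range mark, and the lack of error handling are all assumptions. The only facts taken from the text are the three character types and their order (period, then 0 to 9, then A to Z).

```python
# Usable characters in character-type order 1, 2, 3, treated as one continuous sequence.
ALPHABET = "." + "0123456789" + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def expand_designation(body):
    """Expand the inside of a bracketed designation into its applicable characters.

    body is the text between '[' and ']', for example 'AC-E'.
    """
    chars = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # Range 'x-y': take every usable character from x to y inclusive.
            lo = ALPHABET.index(body[i])
            hi = ALPHABET.index(body[i + 2])
            chars.extend(ALPHABET[lo:hi + 1])
            i += 3
        else:
            # Single listed character.
            chars.append(body[i])
            i += 1
    return chars


print(expand_designation("ABC"))   # ['A', 'B', 'C']
print(expand_designation("D-F"))   # ['D', 'E', 'F']
print(expand_designation("AC-E"))  # ['A', 'C', 'D', 'E']
print(expand_designation("5-C"))   # ['5', '6', '7', '8', '9', 'A', 'B', 'C']
```

The last call shows a range that crosses from character type 2 into character type 3, which works because the types are laid out as one continuous sequence.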

[0047] As an example of an actual condition string, consider A?*[5-CX].

[0048] This represents a character string of three or more characters whose first character is A and whose last character is one of 5, 6, 7, 8, 9, A, B, C, X.

[0049] In this case there are four condition types and four condition characters, so the character code is a 5 × 5 matrix.

[0050] Next, the operation of the example of the present invention is described in detail with reference to FIG. 5.

[0051] When condition string 51 is encoded, the values of the condition code array are as shown by condition code 52. The head of each secondary subscript holds the corresponding condition character type code. In the case of a multiple designation (D[n][0] = 3), the elements that follow under the same primary subscript (D[n][1], D[n][2], ...) hold, in sequence, the applicable flags of the character types. The other elements of condition code 52 are not used for condition string 51. D[0][0] = 0 indicates condition type 0, D[1][0] = 1 indicates condition type 1, and D[2][0] = 2 indicates condition type 2.
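The head elements D[n][0] described here can be reproduced with a short sketch. It assumes that `?` stands for the any-one-character designation and `*` for the any-character designation, which is inferred from the meaning the text gives for A?*[5-CX]; the function names are illustrative and do not appear in the specification.

```python
def split_condition(pattern):
    """Split a condition string into condition characters.

    A bracketed multiple designation such as '[5-CX]' counts as one condition character.
    """
    tokens = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "[":
            end = pattern.index("]", i)
            tokens.append(pattern[i:end + 1])
            i = end + 1
        else:
            tokens.append(pattern[i])
            i += 1
    return tokens


def condition_type(token):
    """Return the condition type number (0 to 3) of one condition character."""
    if token.startswith("["):
        return 3  # multiple designation
    if token == "?":
        return 1  # any one character (assumed symbol)
    if token == "*":
        return 2  # any character or character string (assumed symbol)
    return 0      # unique designation


tokens = split_condition("A?*[5-CX]")
first_column = [condition_type(t) for t in tokens]

print(tokens)        # ['A', '?', '*', '[5-CX]']
print(first_column)  # [0, 1, 2, 3], i.e. D[0][0], D[1][0], D[2][0], D[3][0]
```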

[0052] Next, the encoding of the multiple-designation condition character [5-CX] in condition string 51 is described with reference to FIG. 6.

[0053] To encode a multiple-designation condition character, the characters of each character type are numbered, and whether the character at each number is among the applicable characters is expressed as 0 or 1. Each character type is described in turn below.

[0054] The condition character [5-CX] does not include the period, which is character type 1. Accordingly, 0 is stored in D[3][1] (61 in FIG. 6).

[0055] Within character type 2, the characters 5 to 9 are applicable, so their flags are set to 1. The flags arranged in order from 0 to 9 (62 in FIG. 6) are treated as a binary number, which is stored in D[3][2]. In the description that follows, these binary numbers are given in decimal.

[0056] Character type 3 is processed in the same way (63 in FIG. 6), and the result is stored in D[3][3].
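A minimal sketch of this flag encoding follows. The bit order is an assumption: the n-th character of a character type (counting from 0) is given the bit value 2^n, chosen because it reproduces the code 64 that the description gives for the character "6". FIG. 6 itself is not reproduced here, so the resulting decimal values are those of this sketch rather than values quoted from the figure; `encode_flags` is an illustrative name.

```python
# Character types from the example: 1 = period, 2 = digits, 3 = capital letters.
CHARACTER_TYPES = {
    1: ".",
    2: "0123456789",
    3: "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
}


def encode_flags(chars):
    """Return {character type: flags} for a collection of applicable characters.

    Assumption: the character at position n within its type sets bit n (value 2**n).
    """
    flags = {t: 0 for t in CHARACTER_TYPES}
    for ch in chars:
        for t, members in CHARACTER_TYPES.items():
            pos = members.find(ch)
            if pos >= 0:
                flags[t] |= 1 << pos
    return flags


# Applicable characters of the multiple designation [5-CX].
row = encode_flags("56789ABCX")

print(row[1])       # 0: no period, stored in D[3][1]
print(bin(row[2]))  # 0b1111100000: digits 5 to 9, stored in D[3][2]
print(row[2])       # 992
print(row[3])       # 8388615: letters A, B, C and X, stored in D[3][3]
```

Because each character type has its own element, the row for one condition character needs only as many flag elements as there are character types, regardless of how many characters the designation covers.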

[0057] Next, the actual comparison is described in detail with reference to FIG. 7. Comparison for anything other than multiple-designation condition characters uses conventional processing, so the description is limited to the comparison against a multiple-designation condition character (steps S43 and S45 in FIG. 4).

[0058] As a first concrete example, take the condition character [5-CX] above and the comparison target character "6". The character type of "6" is 2, and encoding "6" in the same way as the condition character gives 64 (71 in FIG. 7).

[0059] Next, the logical AND of this code and the corresponding code (D[3][2]) in condition character code 72 is taken. The result is 64; because the logical AND is not 0, the characters match.

[0060] When "E" is compared against the same multiple-designation condition character, its character type and position give the code shown at 73 in FIG. 7. The logical AND with the corresponding code (D[3][3]) in condition character code 72 is 0, so the result is a mismatch.
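The two comparisons above can be sketched as follows. As before, the bit order (position n within a character type maps to 2^n) is an assumption chosen to agree with the stated code 64 for "6", and the names `encode_char`, `matches` and `D3` are illustrative only.

```python
TYPES = {1: ".", 2: "0123456789", 3: "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}

# Row of the condition code for [5-CX] under the assumed bit order:
# index 0 holds condition type 3, indices 1 to 3 hold the flags per character type.
D3 = [3, 0, 0b1111100000, (1 << 23) | 0b111]


def encode_char(ch):
    """Return (character type, single-bit code) for one comparison target character."""
    for t, members in TYPES.items():
        pos = members.find(ch)
        if pos >= 0:
            return t, 1 << pos
    raise ValueError(f"not a usable character: {ch!r}")


def matches(row, ch):
    """One logical AND against the flags of the character's own type decides the match."""
    t, code = encode_char(ch)
    return (row[t] & code) != 0


print(encode_char("6"))  # (2, 64)
print(D3[2] & 64)        # 64, nonzero, so "6" matches
print(matches(D3, "6"))  # True
print(matches(D3, "E"))  # False: the AND with D3[3] is 0
```

However many characters the designation lists, each target character costs one type lookup and one AND.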

[0061] Next, the difference from the conventional code is described in detail with reference to FIG. 11.

[0062] Expressing condition string 51 of FIG. 11(A) in the conventional code gives 111 in FIG. 11(B). Here, the brackets ([ and ]) at primary subscripts 3 and 13 mark the start and end of the applicable character data of the multiple designation.

[0063] In terms of array size, the conventional code needs 14 elements and the code of this example needs 25, so the conventional code is smaller. However, the number of comparisons in the multiple-designation part per character string data item is only 1 with the code of this example, against 9 (the number of applicable characters) with the conventional code.

[0064] Further, as shown in FIG. 12, for the condition string A?-*[5-X] the conventional condition code needs 36 elements, more than the 25 of the code of this example. The number of comparisons in the multiple-designation part per character string data item is 1, far fewer than the 31 of the conventional code. Thus, with the code of this example, the number of array elements stays the same whatever the designation, as long as the number of condition characters is the same, and the multiple-designation part always takes exactly one comparison.

[0065] Although the number of elements the conventional code actually uses varies with the condition string, the number of array elements needed to rule out out-of-bounds references, for four condition characters, is 4 × {(number of characters of character type 1) + (number of characters of character type 2) + (number of characters of character type 3)} = 4 × (1 + 10 + 26) = 148.
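The element counts quoted for the condition string A?*[5-CX] can be restated as a short calculation. The figures 14, 25 and 148 are the ones given in the description; the variable names are illustrative.

```python
# Usable characters per character type, as listed in the example.
USABLE = {1: 1, 2: 10, 3: 26}   # ".", "0" to "9", "A" to "Z"
CONDITION_CHARACTERS = 4        # A, ?, *, [5-CX]

# Conventional code, safe allocation: any condition character may list every usable character.
worst_case_conventional = CONDITION_CHARACTERS * sum(USABLE.values())

# Conventional code, elements actually used for A?*[5-CX]:
# three single characters, two bracket markers, nine applicable characters.
used_conventional = 3 + 2 + 9

# Code of this example: a 5 x 5 array, as stated in the description, whatever the designation.
proposed = 5 * 5

print(worst_case_conventional, used_conventional, proposed)  # 148 14 25
```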

[0066] Another embodiment of the present invention is described with reference to FIG. 8. It concerns searching in a system in which, when input file 81 is stored in storage device 83, the data is classified by data type in advance from upper index file 88 on the basis of the keywords in the data, and which includes index files 89 that store, for each data type, the keywords and data on pointers to data file 87.

[0067] Conventionally, the list data was searched once for each specified keyword. In this embodiment, the keywords belonging to each data type are numbered and managed in upper index file 88 (91 in FIG. 9), so that at search time search data 84 is converted to flags for each data type on the basis of the information in upper index file 88. In FIG. 9, 92 shows the correspondence between upper search file 91 and data type 1. In the comparison with the index file for each data type, each data item in the file is converted to flags in the same way, the logical AND with the corresponding data type of the input condition code is taken, and the match is decided from the resulting value.
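The keyword variant can be sketched in the same style. Everything concrete below is hypothetical: the data type names, the keywords, the in-memory dictionary standing in for the upper index file, and the function names are invented for illustration, since the specification does not give the file layout. The sketch only shows the idea of numbering keywords per data type and deciding a match with one logical AND per data type.

```python
# Hypothetical stand-in for the upper index file: for each data type,
# the keywords that belong to it, numbered by their position in the list.
UPPER_INDEX = {
    "type1": ["alpha", "beta", "gamma"],
    "type2": ["red", "green"],
}


def flag_keywords(keywords):
    """Convert a set of keywords to one flag word per data type."""
    flags = {data_type: 0 for data_type in UPPER_INDEX}
    for kw in keywords:
        for data_type, known in UPPER_INDEX.items():
            if kw in known:
                flags[data_type] |= 1 << known.index(kw)
    return flags


def matches_in_type(query_flags, entry_flags, data_type):
    """A nonzero AND means the entry shares at least one queried keyword of this type."""
    return (query_flags[data_type] & entry_flags[data_type]) != 0


query = flag_keywords(["beta", "gamma"])     # search data, flagged once
entry_a = flag_keywords(["gamma"])           # an index entry
entry_b = flag_keywords(["alpha", "red"])    # another index entry

print(matches_in_type(query, entry_a, "type1"))  # True
print(matches_in_type(query, entry_b, "type1"))  # False
```

With this layout the cost per index entry does not grow with the number of keywords in the query, which is the point made in the paragraph above.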

[0068]

EFFECTS OF THE INVENTION

As described above, the present invention provides the following effects.

[0069] The first effect of the present invention is that it eliminates the loss of search performance that the conventional method suffers when multiple designations are used, and so speeds up searching.

[0070] The reason is that, in the present invention, the multiple-designation condition characters are converted to data once, at the start of the search process, so that only one comparison is needed for each character string data item.

[0071] The second effect of the present invention is that it avoids both waste in the array that stores the applicable character group information and the possibility of out-of-bounds array references.

[0072] The reason is that, in the present invention, the size of the array needed to store the applicable character group information is set by the number of character types rather than the number of usable characters, so the array does not become very large even if every condition character code carries the full information.

[Brief description of the drawings]

FIG. 1 is a diagram for explaining an embodiment of the present invention.

FIG. 2 is a diagram for explaining the two-dimensional code in the embodiment of the present invention.

FIG. 3 is a flowchart showing the processing flow of the embodiment of the present invention.

FIG. 4 is a flowchart showing the processing flow of the character-by-character comparison in the embodiment of the present invention.

FIG. 5 is a diagram for explaining an example of the present invention, showing an example of the code for a whole condition string.

FIG. 6 is a diagram for explaining an example of the present invention, showing an example of the code for a multiple-designation character part.

FIG. 7 is a diagram for explaining an example of the present invention, showing an example of how a multiple-designation character is compared with a comparison target character.

FIG. 8 is a diagram showing the configuration of another embodiment of the present invention.

FIG. 9 is a diagram showing an example of applying the encoding in another embodiment of the present invention.

FIG. 10 is a diagram for explaining a conventional character search method.

FIG. 11 shows the code of condition string 51 in the conventional code.

FIG. 12 shows the conventional code and the code of this example for another example of a condition string.

[Explanation of symbols]

1: Condition string
2: Encoding process
3: Condition character code
4: Character string data
5: Comparison process with the condition string
6: Matching character string data
7: Information for each condition character
51: Example of a condition string
52: Condition code corresponding to condition string 51
61: Code table for character type 1 of condition string 51
62: Code table for character type 2 of condition string 51
63: Code table for character type 3 of condition string 51
71: Code table for character type 2 of comparison target character "6"
72: Multiple-designation data part in the condition code of condition string 51
73: Code table for character type 3 of comparison target character "E"
81: Input file
82: Input device
83: Storage device
84: Search data
85: Search device
86: Match data storage file
87: Data file
88: Upper search file
89: Search file for each data type
91: Example of upper search file 88
92: Example of a code table for data type 1
101: Code of condition string 51 in the conventional code
111: Example of a condition string
112: Code of condition string 111 in the conventional code
113: Code of condition string 111 in the code of the present invention

Claims


1. A character string search method characterized in that, in searching for a character string by condition designation, a condition character is encoded using a two-dimensional array representing character types and the applicable-character flags within each character type, and the code is compared with a comparison target character, including for multiple-designation condition characters.

2. A character string search method characterized by comprising a two-dimensional array having the condition type as the row-direction subscript and the character type as the column-direction subscript, wherein the corresponding condition character type code is stored at the head of the column-direction subscript; in the case of a multiple designation, the elements that follow in the column direction are encoded in sequence with the applicable flags of the character types; and a character search is performed by comparing the code value obtained for a comparison target character, from the applicable flag of its character type, with the code stored at the subscript of the two-dimensional array corresponding to that character type.

3. A recording medium on which is recorded a program for executing a process that refers to a two-dimensional array having the condition type as the row-direction subscript and the character type as the column-direction subscript, in which the corresponding condition character type code is stored at the head of the column-direction subscript and which is encoded in sequence with the applicable flags of the character types, and that performs a character search by comparing the code value obtained for a comparison target character, from the applicable flag of its character type, with the code stored at the subscript of the two-dimensional array corresponding to that character type.