JP2000276487A

JP2000276487A - Method and device for instance storage and retrieval, computer readable recording medium for recording instance storage program, and computer readable recording medium for recording instance retrieval program

Info

Publication number: JP2000276487A
Application number: JP11083027A
Authority: JP
Inventors: Yasuhiro Takayama; 泰博高山; Katsushi Suzuki; 克志鈴木; Takeyuki Aikawa; 勇之相川; Yamahiko Ito; 山彦伊藤
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1999-03-26
Filing date: 1999-03-26
Publication date: 2000-10-06

Abstract

PROBLEM TO BE SOLVED: To obtain instance storage and retrieval techniques that facilitate problem solving by similar instance retrieval even for a complicated instance consisting of plural instance sentences, where the object of instance storage and retrieval is described in a natural language. SOLUTION: A processing object sentence extraction means 45, a similar sentence clustering means 49, and a similar instance retrieval means 52 are provided. The means 45 segments the individual instance sentences of each classification from an electronic document. A similar sentence collation means 48 uses an area ontology (knowledge dependent on the area that is the object of instance storage and retrieval) 44 to obtain syntactic-structural and semantic similarities between instance sentences. On the basis of these similarities, the means 49 classifies similar instance sentences, generates similar instance data, and stores it in an instance data base. The means 52 retrieves instance sentences resembling an input retrieval sentence from the instance data base on the basis of the similarities, obtained by the similar sentence collation means 48, between the retrieval sentence and the individual instance sentences.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a case storage and retrieval technique that stores cases and retrieves similar cases for use in, for example, a help desk system in which "cases", created from documents recording pairs of a description of a past problem and its solution (such as inquiry records at a call center), are used to solve new problems.

[0002]

[Description of the Related Art] There is strong demand to reuse data in which exchanges arising in inquiry handling at consultation desks and the like have been stored electronically. To meet this demand, a technique that registers cases expressed in natural language in a case database in advance and retrieves from that database cases similar to a newly input search case is disclosed in, for example, Japanese Patent Application Laid-Open No. 9-73464, "Similar Case Search Device". Hereinafter, this technique is referred to as Prior Art 1.

[0003] FIG. 26 is a block diagram showing the configuration of the similar case search device according to Prior Art 1. In FIG. 26, 1 denotes a case that is described in natural language and consists of a problem part and a solution part; 2 denotes a case database storing cases 1; 3 denotes attribute information that is generated from a case 1 and consists of a case number, a category number, keyword numbers, and weights; and 4 denotes an attribute database storing the attribute information 3.

[0004] Further, 5 denotes a keyword number table that stores each keyword extracted from the cases 1 together with a unique number assigned to it; 6 denotes a keyword table that stores each case number in association with the keyword numbers of the keywords extracted from the case having that case number; and 7 denotes a category table that stores each case number in association with the category number of the category (the kinds of category are set in advance) that classifies the case having that case number.

[0005] Further, 8 denotes a case count table, that is, a table with category numbers in the row direction and keyword numbers in the column direction, each cell of which stores the number of cases that belong to the category corresponding to the category number and contain the keyword corresponding to the keyword number; 9 denotes a memory storing the keyword number table 5, the keyword table 6, the category table 7, and the case count table 8; 10 denotes attribute information generating means for generating the attribute information 3 of a case 1; and 11 denotes similarity generating means for generating the similarity between cases stored in the case database 2.

[0006] Next, the operation of the similar case search device according to Prior Art 1 is described with reference to FIG. 26. First, the operation at the time of case registration is described.

[0007] As shown in FIG. 26, in addition to storing in each cell the number of cases that belong to the category of that category number and contain the keyword of that keyword number, the case count table 8 stores: the per-keyword sum of the case counts stored in the cells of each row (the total number of cases containing the keyword corresponding to the keyword number); the per-category sum of the case counts stored in the cells of each column (the total number of cases belonging to the category corresponding to the category number); and the total number of all cases stored in the case database 2.

[0008] In its initial state, the case count table 8 contains only the category numbers corresponding to the preset kinds of category. Each time a case 1 is stored, the keyword numbers corresponding to the keywords extracted from that case are added in the row direction, and the following are updated: the case count in the cell corresponding to the category number and keyword number, the per-category sum for the category, the per-keyword sum for the keyword, and the total number of all cases.

[0009] At the time of case registration, the attribute information generating means 10 executes the following processing. First, it extracts keywords from the problem part of the given case, assigns keyword numbers to them, and stores them in the keyword number table 5. Next, it retrieves the keyword numbers stored in the keyword number table 5 for the extracted keywords and stores them in the keyword table 6 in association with the case number assigned to the case.

[0010] Next, in accordance with the user's instruction, the category number of the category assigned to the case is stored in the category table 7 in association with the case number of the case. Keyword numbers are then added to the case count table 8 as necessary, and the case count in the relevant cell of the case count table 8, the per-keyword sum, the per-category sum, and the total number of all cases are updated.
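The registration-time bookkeeping described for the case count table 8 can be pictured with the following minimal Python sketch. It is an illustration only, not the implementation of Prior Art 1: the class name `CaseCountTable`, its attribute names, and the use of dictionaries are assumptions; keyword extraction is taken as already done; and the per-category sum is read here as the number of cases in the category.

```python
from collections import defaultdict


class CaseCountTable:
    """Hypothetical stand-in for the case count table 8."""

    def __init__(self, categories):
        # Initially only the preset categories exist; keywords are added later.
        self.cell = {c: defaultdict(int) for c in categories}  # cases per (category, keyword)
        self.per_category = {c: 0 for c in categories}         # cases per category
        self.per_keyword = defaultdict(int)                    # cases per keyword
        self.total = 0                                         # all registered cases

    def register(self, category, keywords):
        """Update every count for one newly stored case."""
        self.total += 1
        self.per_category[category] += 1
        for kw in set(keywords):  # count each keyword once per case
            self.cell[category][kw] += 1
            self.per_keyword[kw] += 1


# Made-up example data
table = CaseCountTable(["printer", "network"])
table.register("printer", ["paper", "jam"])
table.register("printer", ["paper", "toner"])
table.register("network", ["cable"])
```

Keeping the three kinds of sums alongside the cells means that the weight calculation performed later needs no further pass over the stored cases.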

[0011] Next, weights are generated from the case count table 8. The following quantities are obtained from the case count table 8: the number of categories n; the number of keywords m; the total number of all cases s; the number of cases s_i belonging to category c_i (1 ≤ i ≤ n); the number of cases t_j in which the keyword with keyword number j (1 ≤ j ≤ m) appears; and the number of cases t_ij that belong to category c_i and in which the keyword with keyword number j appears. For each keyword number j with t_ij ≠ 0, the weight ω_ij of that keyword in category c_i is generated as

ω_ij = (t_ij / t_j − s_i / s) + (t_ij / s_i − t_j / s).

Next, attribute information consisting of the case number, the category number, and the weight of each keyword is created for the case and stored in the attribute database 4.
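As a worked illustration of the weight ω_ij defined in this paragraph, the following Python sketch evaluates the formula for one category and one keyword. The function name `keyword_weight` and the sample counts are assumptions made for illustration; only the formula itself comes from the text.

```python
def keyword_weight(t_ij, t_j, s_i, s):
    """Weight of keyword j in category c_i; None when the keyword never occurs there."""
    if t_ij == 0:
        return None  # the text defines weights only for t_ij != 0
    return (t_ij / t_j - s_i / s) + (t_ij / s_i - t_j / s)


# Hypothetical counts: 10 cases in total, 4 in the category; the keyword
# appears in 5 cases overall and in 3 cases of the category.
w = keyword_weight(t_ij=3, t_j=5, s_i=4, s=10)
```

With these made-up counts the two bracketed terms are 0.6 − 0.4 and 0.75 − 0.5, so the weight is about 0.45. Both terms are positive exactly when the keyword occurs more often in the category than in the case set as a whole.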

[0012] Next, the operation at the time of case retrieval is described. At retrieval time, the similarity generating means 11 executes the following processing. First, it extracts keywords from the "problem" part of the newly input search case. Next, it searches the attribute database 4 and retrieves the attribute information, consisting of the category number and the weight of each keyword, of the cases that contain a keyword identical to an extracted keyword.

[0013] Next, similarities are generated from the retrieved attribute information. The similarity between the given case and the case corresponding to the retrieved attribute information is calculated from the set of per-keyword weights of the retrieved case as: similarity ω = (sum of the weights of the matched keywords) × (number of matched keywords / length of the keyword list retrieved from the attribute database 4). The cases are then sorted in descending order of the similarity ω, and the category, similarity, data content, and so on of each case are output, starting from the case with the highest similarity, and displayed or printed.
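The similarity calculation and ranking of Prior Art 1 can be sketched in Python as follows. This is a sketch under stated assumptions rather than the disclosed implementation: the names `case_similarity`, `rank_cases`, and `attribute_db` are hypothetical, the attribute data is reduced to a keyword-to-weight mapping per case, and the "length of the keyword list" is read as the number of keywords stored for that case.

```python
def case_similarity(query_keywords, case_weights):
    """case_weights maps each keyword of one stored case to its weight."""
    matched = [kw for kw in case_weights if kw in query_keywords]
    if not matched:
        return 0.0
    weight_sum = sum(case_weights[kw] for kw in matched)
    # (sum of matched weights) x (matched keywords / keywords stored for the case)
    return weight_sum * (len(matched) / len(case_weights))


def rank_cases(query_keywords, attribute_db):
    """Return (case number, similarity) pairs, most similar first."""
    query = set(query_keywords)
    scored = [
        (case_no, case_similarity(query, weights))
        for case_no, weights in attribute_db.items()
        if query & set(weights)  # only cases sharing at least one keyword
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical attribute data: case number -> {keyword: weight}
attribute_db = {
    1: {"printer": 0.5, "jam": 0.25},
    2: {"printer": 0.5, "toner": 0.5, "empty": 1.0, "lamp": 0.25},
    3: {"cable": 0.75},
}
ranking = rank_cases(["printer", "jam"], attribute_db)
```

In this made-up example, case 1 matches both of its keywords and scores 0.75, case 2 matches one of four and scores 0.125, and case 3 shares no keyword and is not retrieved.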

[0014] Further, a technique for providing a case base that holds past failure details, failure handling details, failure attribute data, and the like for tasks such as handling complaints about equipment is disclosed in, for example, Japanese Patent Application Laid-Open No. 6-309172, "Knowledge Base Device and Knowledge Base Construction Method". Hereinafter, this technique is referred to as Prior Art 2.

[0015] FIG. 27 is a block diagram showing the configuration of the knowledge base device according to Prior Art 2.

[0016] In FIG. 27, 12 denotes a user interface comprising instruction and input means, with which the user instructs inference, similarity search, case editing, knowledge editing, and the like and inputs the accompanying data such as inference conditions and search conditions, and display means for presenting data such as search results and menus to the user; 13 denotes a knowledge base storing knowledge consisting of a condition part, a conclusion part, and a case number; 14 denotes a synonym dictionary storing a general thesaurus; 15 denotes case data consisting of management items (such as customer name), failure details, a keyword group, a failure cause, and failure handling details; 16 denotes a case base storing the case data 15; and 17 denotes a keyword relation database that has a network structure and consists of keywords, the number of occurrences of each keyword in the case data, the co-occurrence relations between keywords within a single piece of case data, and the number of co-occurrences of keywords within a single piece of case data.

[0017] Further, 18 denotes case editing means for editing the case data 15 stored in the case base 16 through interaction with the user via the user interface 12; 19 denotes inference means that receives inference conditions from the user interface 12 and performs inference using the knowledge stored in the knowledge base 13; 20 denotes similarity search means that receives search conditions from the user interface 12, generalizes them using the general thesaurus stored in the synonym dictionary 14, and performs a similarity search on the case data 15 stored in the case base 16; and 21 denotes keyword relation data creating means that acquires keyword data from the case data 15 stored in the case base 16 and creates relation data between keywords from the acquired keyword data.

[0018] Further, 22 denotes menu creating means for creating menus from the keyword relation data stored in the keyword relation database 17; 23 denotes case/knowledge connecting means for generating knowledge 24 from the keyword relation data stored in the keyword relation database 17 and the case data 15 stored in the case base 16; and 25 denotes knowledge editing means that, in response to a knowledge editing instruction from the user interface 12, edits the knowledge 24 generated by the case/knowledge connecting means 23 and stores it in the knowledge base 13.

[0019] Next, the operation of the knowledge base device according to Prior Art 2 is described with reference to FIG. 27. First, the operation at the time of case registration is described.

[0020] At the time of case registration, the case editing means 18 is first used to input a customer name (one of the management items), failure details, and a keyword group, and case data 15 is created and stored in the case base 16. The keyword group is input either directly or by selecting from a menu created by the menu creating means 22. In addition, the keyword relation data creating means 21 stores the keywords, the number of occurrences of each keyword, and the number of co-occurrences between keywords in the keyword relation database 17.

[0021] Next, the case/knowledge connecting means 23 generates combinations of keywords with a high co-occurrence frequency, stored in the keyword relation database 17, as condition parts of candidate knowledge 24; retrieves from the case base 16 the case data containing each generated keyword combination; and generates knowledge 24 whose conclusion part consists of the failure cause, failure handling details, and case number of the retrieved case data. The user edits the generated knowledge 24 with the knowledge editing means 25, and it is stored in the knowledge base 13.
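The generation of candidate knowledge from co-occurring keywords can be illustrated with the following Python sketch. It is a simplified reading, not the disclosed implementation: the function name `build_candidate_knowledge`, the dictionary layout of the case data, the restriction to keyword pairs, and the threshold `min_cooccurrence` are all assumptions, and the user's editing step is omitted.

```python
from collections import Counter
from itertools import combinations


def build_candidate_knowledge(cases, min_cooccurrence=2):
    """Turn frequently co-occurring keyword pairs into condition/conclusion records."""
    pair_counts = Counter()
    for case in cases:
        # Count each unordered keyword pair once per case.
        for pair in combinations(sorted(set(case["keywords"])), 2):
            pair_counts[pair] += 1

    knowledge = []
    for pair, count in pair_counts.items():
        if count < min_cooccurrence:
            continue  # not frequent enough to become a condition part
        for case in cases:
            if set(pair).issubset(case["keywords"]):
                knowledge.append({
                    "condition": pair,
                    "conclusion": (case["cause"], case["action"], case["id"]),
                })
    return knowledge


# Made-up case data
cases = [
    {"id": 1, "keywords": ["power", "noise"], "cause": "worn fan", "action": "replace fan"},
    {"id": 2, "keywords": ["noise", "power", "heat"], "cause": "dust", "action": "clean unit"},
    {"id": 3, "keywords": ["heat"], "cause": "blocked vent", "action": "clear vent"},
]
candidates = build_candidate_knowledge(cases)
```

Here only the pair ("noise", "power") co-occurs in two cases, so it becomes the condition part of two candidate records, one for each case that contains it.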

[0022] Next, the operation at the time of case retrieval is described. At retrieval time, the inference means 19 first receives search condition data from the user via the user interface 12 and searches the knowledge stored in the knowledge base 13 on the basis of that data. If the received search condition data is found in the condition part of some knowledge 24, the conclusion part of that knowledge is presented to the user via the user interface 12. If the knowledge base 13 contains no knowledge whose condition part contains the received search condition data, the similarity search means 20 broadens the search conditions using the synonym dictionary 14, which stores a general thesaurus, and performs a similarity search on the case data 15 stored in the case base 16.
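The two-stage retrieval of Prior Art 2, knowledge lookup first and thesaurus-broadened similarity search as a fallback, can be sketched as follows. The function name `retrieve`, the data layouts, and the matching rule (a condition part matches when all of its keywords appear in the search conditions) are assumptions made for illustration.

```python
def retrieve(query_keywords, knowledge, cases, thesaurus):
    """Try the knowledge base first; fall back to a thesaurus-expanded case search."""
    query = set(query_keywords)
    conclusions = [
        k["conclusion"] for k in knowledge if set(k["condition"]).issubset(query)
    ]
    if conclusions:
        return "knowledge", conclusions

    expanded = set(query)
    for kw in query:
        expanded.update(thesaurus.get(kw, ()))  # broaden the conditions with synonyms
    similar = [c["id"] for c in cases if expanded & set(c["keywords"])]
    return "similar_cases", similar


# Made-up knowledge, cases, and thesaurus
knowledge = [{"condition": ("noise", "power"), "conclusion": ("worn fan", "replace fan", 1)}]
cases = [
    {"id": 1, "keywords": ["power", "noise"]},
    {"id": 3, "keywords": ["heat"]},
]
thesaurus = {"overheating": ["heat"]}

direct = retrieve(["power", "noise"], knowledge, cases, thesaurus)
fallback = retrieve(["overheating"], knowledge, cases, thesaurus)
```

The first query is answered from the knowledge base; the second matches no condition part, so it is broadened to include "heat" and answered by the case search.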

[0023] Further, a technique aimed at making it easier to judge the validity of search results in document retrieval is disclosed in, for example, Japanese Patent Application Laid-Open No. 7-192020, "Document Information Retrieval Device". Hereinafter, this technique is referred to as Prior Art 3.

[0024] The document information retrieval device according to Prior Art 3 performs morphological analysis and syntactic analysis on an input text, generates a search command consisting of keywords and a list of "viewpoints", and searches document data with this search command. By displaying the degree of similarity between the search command and the document data along the axes of multiple viewpoints, it makes the validity of the search results easier to judge. Prior Art 3 does perform syntactic analysis of natural language, but it uses the analysis result only to extract viewpoints; the actual search uses only the information composed of keywords and viewpoints.

[0025] FIG. 28 is a block diagram showing the configuration of the document information retrieval device according to Prior Art 3. In FIG. 28, 26 denotes an input unit with which the user inputs keywords or natural language text specifying a search; and 28 denotes an input analysis unit that analyzes the input text 27 entered by the user, converts it into a search command, and outputs the resulting search command 29 to the input unit 26.

[0026] Further, 30 denotes a document data recording unit that stores document data; 31 denotes a search unit that takes the search command 29 from the input unit 26 as input, searches for related documents by referring to the document data stored in the document data recording unit 30, and outputs the resulting search results 32; 33 denotes a search history storage unit that receives the search command 29 from the input unit 26 and the search results 32 output by the search unit 31 and stores them as a search history; and 35 denotes a search history display unit that takes the search history stored in the search history storage unit 33 as input and displays it as a tree structure.

[0027] Further, 36 denotes a search result display unit that takes the search results 32 obtained by the search unit 31 as input, displays the set of retrieved documents multidimensionally (for example, displaying the title, bibliographic data, and so on of each document as a tree structure, table, or list), and allows a document to be selected from the displayed set; 37 denotes a browse unit that displays the content (full text) of the document selected in the search result display unit 36; and 34 denotes a history management unit that controls the search history storage unit 33, the search history display unit 35, and the search result display unit 36 to control and manage the storage and display of the search history.

[0028] FIG. 29 is a block diagram showing the configuration of the input analysis unit 28 in FIG. 28. In FIG. 29, 38 denotes a morphological analysis unit that takes individual sentences as input, performs morphological analysis, and outputs the result; 39 denotes a syntactic analysis unit that takes the result of the morphological analysis as input, performs syntactic analysis, and outputs the result; and 40 denotes an input analysis control unit that takes the input text 27 as input and invokes the morphological analysis unit 38 and the syntactic analysis unit 39 to obtain a syntactic analysis result.

[0029] Further, 41 denotes viewpoint extraction rules describing the rules used to extract viewpoints; 42 denotes a viewpoint extraction unit that takes the syntactic analysis result as input and creates viewpoint information by referring to the viewpoint extraction rules 41; and 43 denotes a search command generation unit that takes the viewpoint information extracted by the viewpoint extraction unit 42 as input, extracts independent words from the individual sentences, and composes a search command. In FIG. 29, parts identical or corresponding to those in FIG. 28 are given the same reference numerals, and their description is omitted.

[0030] Next, the operation of the document information retrieval device according to Prior Art 3 is described with reference to FIGS. 28 and 29. In the input analysis unit 28, the input analysis control unit 40 shown in FIG. 29 invokes the morphological analysis unit 38 and the syntactic analysis unit 39 to obtain a syntactic analysis result for the input text 27. Next, the viewpoint extraction unit 42 generates a list of viewpoints using the viewpoint extraction rules 41, which assign patterns of the form (predicate, modality, case structure) to viewpoints, and the search command generation unit 43 extracts independent words from the syntactic analysis result, thereby generating a search command 29 consisting of ＜keywords＞ and a ＜list of viewpoints＞.

[0031] The search unit 31 shown in FIG. 28 searches the document data stored in the document data recording unit 30 using the search command 29 and obtains, as the search results 32, the documents that are similar for each viewpoint, so that the search results (sets of documents) for each viewpoint can be displayed on the search result display unit 36. The history management unit 34, the search history storage unit 33, and the search history display unit 35 manage and store the history of past search commands and search results and can display the stored history. Furthermore, by selecting a document from the set of retrieved documents displayed on the search result display unit 36, the full text of the selected document can be displayed on the browse unit 37.

[0032]

[Problems to Be Solved by the Invention] Prior Art 1 and Prior Art 2 assume a simple similarity search in which the natural language description in a case is a single simple sentence. Consequently, in complex situations where the part of the case data to be stored and searched consists of multiple sentences, they cannot extract and classify the sentences from the cases in advance and thereby make problem solving by similarity search easier.

[0033] Further, Prior Art 1 generates its index by keyword extraction alone and calculates similarity using only the category classification and frequency information of keywords. It therefore cannot judge detailed natural language similarity that takes into account, for example, the modal expressions (negation, conjecture, and so on) of the sentences making up a case.

[0034] Further, although Prior Art 2 uses a synonym dictionary storing a general thesaurus, it generates its index only from combinations of frequently occurring keywords, and the input for similarity search likewise uses only keywords extracted from the input content. Processing of detailed natural language similarity, including the modal expressions of sentences, is therefore not considered, and cases containing sentences that are syntactically and semantically similar to the input sentence cannot be retrieved.

[0035] Further, Prior Art 3 performs syntactic analysis of the search input text but uses the analysis result only to extract viewpoints; the actual search uses only the information composed of keywords and viewpoints. Therefore, as with Prior Art 1 and Prior Art 2, documents containing sentences that are syntactically and semantically similar to the input sentence cannot be retrieved.

[0036] Further, unlike the present invention, Prior Art 3 has no means for storing cases in advance together with the relations between document data, and therefore cannot perform searches that take the relations between data into account.

[0037] The present invention has been made to solve the problems described above. Its object is to provide a case storage and retrieval device, a case storage method and a case retrieval method, a computer-readable recording medium on which a case storage program is recorded, and a computer-readable recording medium on which a case retrieval program is recorded, each of which makes problem solving by similarity search easier even for complex cases in which the object of storage and retrieval consists of multiple case sentences.

[0038] A further object of the present invention is to enable retrieval of cases containing syntactically and semantically similar sentences by taking detailed similarity into account across diverse natural language expressions, including the modal expressions (negation, conjecture, and so on) of the sentences making up a case.

[0039] A further object of the present invention is to enable searches of document data that take into account relations between data, such as similarity and contradiction.

[0040]

[Means for Solving the Problems] A first case storage and retrieval device according to the present invention comprises: an area ontology in which knowledge about the terms that depend on the area targeted for case storage and retrieval, and about the relations between those terms, is stored in advance; processing target sentence extracting means for cutting out, from an electronic document described in natural language, each case sentence to be stored and retrieved; search input means for inputting a search sentence for retrieving a case sentence desired by the user; sentence analysis means that takes as input each case sentence cut out by the processing target sentence extracting means, or the search sentence input by the search input means, and performs morphological analysis and syntactic analysis with reference to the knowledge stored in the area ontology to create the structure of each case sentence or of the search sentence; similar sentence matching means that takes as input the structures of the case sentences or of the search sentence created by the sentence analysis means and, with reference to the knowledge stored in the area ontology, obtains the similarity between case sentences or the similarity between each case sentence and the search sentence; similar sentence clustering means that classifies the case sentences on the basis of the similarities between case sentences obtained by the similar sentence matching means to form case clusters, and creates case data consisting of information on the case clusters and information on the relations between the case clusters; a case database that stores the case data created by the similar sentence clustering means; similar case retrieval means that, on the basis of the similarities between each case sentence and the search sentence obtained by the similar sentence matching means, retrieves from the case data stored in the case database similar case sentences that resemble the search sentence input by the search input means; and search result display means for displaying the similar case sentences retrieved by the similar case retrieval means.

【００４１】A first case accumulation method according to the present invention comprises: a processing target sentence extraction step of cutting out, from a digitized document written in natural language, each case sentence to be accumulated; a sentence analysis step of taking as input each case sentence cut out in the processing target sentence extraction step and performing morphological analysis and syntactic analysis, with reference to an area ontology that stores in advance knowledge about terms dependent on the area targeted for case accumulation and about the relations between those terms, to create the structure of each case sentence; a similar sentence matching step of taking as input the structure of each case sentence created in the sentence analysis step and obtaining the similarity between the case sentences with reference to the knowledge stored in the area ontology; and a similar sentence clustering step of classifying the case sentences into case clusters based on the similarity between the case sentences obtained in the similar sentence matching step, creating case data consisting of information on the case clusters and relation information between the case clusters, and storing the case data in a case database.

【００４２】A second case accumulation method according to the present invention is characterized in that, in the processing target sentence extraction step, a case sentence type is assigned to each extracted case sentence.

【００４３】A third case accumulation method according to the present invention is characterized in that, in the similar sentence clustering step, the number of hierarchy levels of the clusters formed by classifying the case sentences, and/or the threshold on inter-cluster similarity used to judge whether case sentences are similar to each other, is specified.

【００４４】A fourth case accumulation method according to the present invention is characterized in that the area ontology describes at least one of the following kinds of knowledge: IS-A relation knowledge describing semantic superordinate-subordinate relations; HAS-A relation knowledge describing semantic part-whole relations; case relation knowledge describing relations between concepts; paraphrase knowledge describing a single meaning, concept, or relation in multiple expressions; and knowledge describing mutually exclusive relations that cannot hold at the same time.
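The kinds of knowledge listed in this paragraph could be held in a structure along the following lines. This is only a minimal sketch: the class `AreaOntology`, its field names, and the sample terms are hypothetical and are not taken from the specification, which does not fix a storage format here.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass
class AreaOntology:
    """Hypothetical container for the five kinds of knowledge named above."""
    is_a: Dict[str, str] = field(default_factory=dict)            # term -> broader term
    has_a: Dict[str, Set[str]] = field(default_factory=dict)      # whole -> its parts
    case_relations: Set[Tuple[str, str, str]] = field(default_factory=set)  # (head, role, dependent)
    paraphrases: Dict[str, str] = field(default_factory=dict)     # surface form -> canonical form
    exclusive: Set[FrozenSet[str]] = field(default_factory=set)   # pairs that cannot hold together

    def canonical(self, term: str) -> str:
        """Map a paraphrase to its canonical form (identity if unknown)."""
        return self.paraphrases.get(term, term)

    def ancestors(self, term: str) -> List[str]:
        """Follow IS-A links upward, nearest ancestor first; stops on a cycle."""
        chain: List[str] = []
        seen: Set[str] = set()
        current = self.canonical(term)
        while current in self.is_a and current not in seen:
            seen.add(current)
            current = self.is_a[current]
            chain.append(current)
        return chain

    def is_exclusive(self, a: str, b: str) -> bool:
        """True if the two terms are recorded as mutually exclusive."""
        return frozenset((self.canonical(a), self.canonical(b))) in self.exclusive


# Illustrative entries only; the terms are invented for this example.
onto = AreaOntology(
    is_a={"laser printer": "printer", "printer": "device"},
    has_a={"printer": {"toner cartridge"}},
    paraphrases={"PC": "personal computer"},
    exclusive={frozenset(("power on", "power off"))},
)
```

A matcher could then call `onto.ancestors(...)` to treat a term and its broader term as related, or `onto.is_exclusive(...)` to keep contradictory sentences apart.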

【００４５】A fifth case accumulation method according to the present invention is characterized in that, in the similar sentence matching step, the similarity is obtained by matching the semantic structures of the case sentences against each other based on the attributes of the syntactic elements in the sentence structures created in the sentence analysis step.

【００４６】A sixth case accumulation method according to the present invention is characterized in that, in the similar sentence matching step, the level of detail of the matching used to obtain the similarity between the case sentences is specified.

【００４７】A seventh case accumulation method according to the present invention is characterized in that the sentence analysis step creates the structure of each case sentence as a tree structure, and, in the similar sentence matching step, the level of detail of the matching used to obtain the similarity between the case sentences is specified by a depth in the tree structure of the case sentences.
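To illustrate how a tree depth can act as the level of detail of the matching, the sketch below compares two sentence trees only down to a chosen depth. The `Node` class and the Jaccard-style score are assumptions made for illustration; this paragraph does not specify the similarity formula, and the example omits the area ontology entirely.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    """Hypothetical node of a sentence structure tree."""
    label: str
    children: List["Node"] = field(default_factory=list)


def labels_to_depth(node: Node, max_depth: int, depth: int = 0) -> Set[Tuple[int, str]]:
    """Collect (depth, label) pairs for nodes no deeper than max_depth (root = 0)."""
    if depth > max_depth:
        return set()
    found = {(depth, node.label)}
    for child in node.children:
        found |= labels_to_depth(child, max_depth, depth + 1)
    return found


def tree_similarity(a: Node, b: Node, max_depth: int) -> float:
    """Stand-in score: overlap of the (depth, label) pairs visible at max_depth."""
    sa, sb = labels_to_depth(a, max_depth), labels_to_depth(b, max_depth)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Two invented sentences that differ only at depth 2.
s1 = Node("print", [Node("printer", [Node("laser")]), Node("cannot")])
s2 = Node("print", [Node("printer", [Node("inkjet")]), Node("cannot")])
coarse = tree_similarity(s1, s2, max_depth=1)  # differences below depth 1 are ignored
fine = tree_similarity(s1, s2, max_depth=2)    # the differing modifiers now count
```

With a shallow depth the two sentences look identical; increasing the depth exposes the differing modifiers and lowers the score.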

【００４８】An eighth case accumulation method according to the present invention is characterized in that the case database describes similarity relations and/or mutually exclusive relations as the relations between case clusters.

【００４９】A first case retrieval method according to the present invention comprises: a search sentence input step of inputting a search sentence for retrieving a case sentence desired by the user from a case database storing the case data targeted for case retrieval; a sentence analysis step of taking as input the search sentence input in the search sentence input step and performing morphological analysis and syntactic analysis, with reference to the knowledge stored in an area ontology that stores in advance knowledge about terms dependent on the area targeted for case retrieval and about the relations between those terms, to create the structure of the search sentence; a similar sentence matching step of taking as input the structure of the search sentence created in the sentence analysis step and obtaining the similarity between each case sentence and the search sentence with reference to the knowledge stored in the area ontology; a similar case retrieval step of retrieving, from the case data stored in the case database, similar case sentences resembling the search sentence input in the search sentence input step, based on the similarity between each case sentence and the search sentence obtained in the similar sentence matching step; and a search result display step of displaying the similar case sentences retrieved in the similar case retrieval step.

【００５０】A computer-readable storage medium recording a first case accumulation program according to the present invention records a case accumulation program comprising: a processing target sentence extraction procedure of cutting out, from a digitized document written in natural language, each case sentence to be accumulated; a sentence analysis procedure of taking as input each case sentence cut out by the processing target sentence extraction procedure and performing morphological analysis and syntactic analysis, with reference to an area ontology that stores in advance knowledge about terms dependent on the area targeted for case accumulation and about the relations between those terms, to create the structure of each case sentence; a similar sentence matching procedure of taking as input the structure of each case sentence created by the sentence analysis procedure and obtaining the similarity between the case sentences with reference to the knowledge stored in the area ontology; and a similar sentence clustering procedure of classifying the case sentences into case clusters based on the similarity between the case sentences obtained by the similar sentence matching procedure, creating case data consisting of information on the case clusters and relation information between the case clusters, and storing the case data in a case database.

【００５１】A computer-readable storage medium recording a first case retrieval program according to the present invention records a case retrieval program comprising: a search sentence input procedure of inputting a search sentence for retrieving a case sentence desired by the user from a case database storing the case data targeted for case retrieval; a sentence analysis procedure of taking as input the search sentence input by the search sentence input procedure and performing morphological analysis and syntactic analysis, with reference to an area ontology that stores in advance knowledge about terms dependent on the area targeted for case retrieval and about the relations between those terms, to create the structure of the search sentence; a similar sentence matching procedure of taking as input the structure of the search sentence created by the sentence analysis procedure and obtaining the similarity between each case sentence and the search sentence with reference to the knowledge stored in the area ontology; a similar case retrieval procedure of retrieving, from the case data stored in the case database, similar case sentences resembling the search sentence input by the search sentence input procedure, based on the similarity between each case sentence and the search sentence obtained by the similar sentence matching procedure; and a search result display procedure of displaying the similar case sentences retrieved by the similar case retrieval procedure.

【００５２】[0052]

[Embodiments of the Invention] Embodiment 1. The case accumulation and retrieval technology according to the present invention is described below with reference to the drawings.

【００５３】FIG. 1 is a block diagram showing the configuration of the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention.

【００５４】In FIG. 1, reference numeral 44 denotes an area ontology that describes and stores in advance knowledge about the terms dependent on the target area of the cases handled by the apparatus's accumulation and retrieval processing, and about the relations between those terms. Reference numeral 45 denotes a processing target sentence extraction unit that cuts out, field by field, each sentence subject to case accumulation processing (hereinafter called a "case sentence") from a digitized document file (not shown) storing documents in which cases are written in natural language and divided into fields.

【００５５】Reference numeral 46 denotes a search sentence input unit for inputting a sentence used for case retrieval (hereinafter called a "search sentence"). Reference numeral 47 denotes a sentence analysis unit that performs morphological analysis and syntactic analysis on each case sentence cut out by the processing target sentence extraction unit 45, or on the search sentence input through the search sentence input unit 46, with reference to a word dictionary, grammar rules (neither shown), the area ontology 44, and the like. Reference numeral 48 denotes a similar sentence matching unit that takes the structures of two sentences as input and obtains the similarity between the sentences with reference to the area ontology 44, similarity calculation rules (not shown), and the like.

【００５６】Reference numeral 49 denotes a similar sentence clustering unit. It takes as input the set of case sentence structures obtained by the sentence analysis unit 47 through morphological and syntactic analysis of each case sentence; classifies (clusters) mutually similar sentences, based on the similarity that the similar sentence matching unit 48 obtains for any two case sentence structures in the set, to form case clusters for each field; and creates information on the case clusters so formed and relation information between the case clusters. Reference numeral 50 denotes a case database that stores, with an index, the case cluster information and inter-cluster relation information created for each field by the similar sentence clustering unit 49. Reference numeral 51 denotes a case cluster editing unit that edits the hierarchical structure of the case clusters as a problem-solving tree, based on the cluster information and inter-cluster relation information stored in the case database 50.
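As a rough illustration of grouping sentences from pairwise similarities, the sketch below links any two sentences whose similarity reaches a threshold and returns the resulting groups (single-link grouping via union-find). The grouping strategy, the function names, and the word-overlap score are all assumptions standing in for the clustering unit 49 and the matching unit 48; the specification does not state the clustering algorithm at this point.

```python
from itertools import combinations
from typing import Callable, Dict, List


def word_overlap(a: str, b: str) -> float:
    """Stand-in similarity: Jaccard overlap of whitespace-separated words."""
    wa, wb = set(a.split()), set(b.split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def cluster_by_threshold(
    items: List[str],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Group items so that any pair scoring >= threshold ends up in one cluster."""
    parent = list(range(len(items)))

    def find(i: int) -> int:
        # Walk to the representative, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if similarity(items[i], items[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters: Dict[int, List[str]] = {}
    for i, item in enumerate(items):
        clusters.setdefault(find(i), []).append(item)
    return list(clusters.values())


# Invented example sentences.
sentences = [
    "printer does not print",
    "printer does not print color",
    "screen is blank",
]
groups = cluster_by_threshold(sentences, word_overlap, threshold=0.5)
```

Raising or lowering `threshold` changes how readily sentences are merged, which is the role the similarity threshold plays in the clustering step.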

【００５７】Reference numeral 52 denotes a similar case retrieval unit that receives the structure of the search sentence obtained by the sentence analysis unit 47 through morphological and syntactic analysis, and retrieves case sentences similar to the search sentence based on the similarity that the similar sentence matching unit 48 obtains between the structure of each case sentence stored in the case database 50 and the structure of the input search sentence. Reference numeral 53 denotes a search result display unit that displays the results of retrieval by the similar case retrieval unit 52 and, for confirmation, the contents of the case clusters stored in the case database 50.

【００５８】Next, the operation of the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention is described.

【００５９】The functions of the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention fall broadly into two: a case accumulation function that extracts cases from digitized documents and stores them in the case database, and a case retrieval function that retrieves similar cases from among the cases stored in the case database.

【００６０】First, an outline of the operation of the case accumulation function in the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention is described with reference to FIG. 2.

【００６１】FIG. 2 is a flowchart showing the processing flow of the case accumulation function in the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention. In FIG. 2, parts identical or equivalent to those in FIG. 1 are given the same reference numerals, and their description is omitted.

【００６２】In FIG. 2, the processing target sentence extraction process (step S1) is executed by the processing target sentence extraction unit 45 of the case accumulation and retrieval apparatus shown in FIG. 1. The sentence analysis process (step S2) is executed by the sentence analysis unit 47. The similar sentence clustering process (step S3) is executed by the similar sentence clustering unit 49. The case cluster editing process (step S4) is executed by the case cluster editing unit 51.

【００６３】First, in the processing target sentence extraction process (step S1), a digitized document file is input, and each case sentence subject to case accumulation and retrieval processing is cut out. Next, in the sentence analysis process (step S2), each case sentence cut out in step S1 is input and subjected to morphological and syntactic analysis to generate its structure (specifically, the dependency structure described later). Next, in the similar sentence clustering process (step S3), the similar sentence matching unit 48 is called on the case sentence structures generated in step S2 to obtain the similarity between any pair of case sentences; the case sentences cut out in step S1 are classified (clustered) based on the obtained similarities to create, for each field, cluster information and inter-cluster relation information; and the created information is stored in the case database 50. Finally, in the case cluster editing process (step S4), the cluster information and inter-cluster relation information stored in the case database 50 are input, and the hierarchical structure of the case clusters is edited as a problem-solving tree.

【００６４】In FIG. 2, the similar sentence clustering process (step S3) is configured to call the similar sentence matching unit 48 of the case accumulation and retrieval apparatus shown in FIG. 1; the similar sentence matching unit 48 may instead be configured as a module (a set of steps) called from the similar sentence clustering process (step S3) and implemented as a program.

【００６５】Next, an outline of the operation of the case retrieval function, the other function of the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention, is described with reference to FIG. 24.

【００６６】FIG. 24 is a flowchart showing the processing flow of the case retrieval function in the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention. In FIG. 24, parts identical or equivalent to those in FIG. 2 are given the same reference numerals, and their description is omitted.

【００６７】In FIG. 24, the search sentence input process (step S41) is executed by the search sentence input unit 46 of the case accumulation and retrieval apparatus shown in FIG. 1. The similar case retrieval process (step S42) is executed by the similar case retrieval unit 52. The search result display process (step S43) is executed by the search result display unit 53.

【００６８】First, in the search sentence input process (step S41), a search sentence for retrieving cases stored in the case database 50 is input. Next, in the sentence analysis process (step S2), morphological and syntactic analysis is performed on the search sentence input in step S41 to generate the structure of the search sentence (specifically, the dependency structure described later). Next, in the similar case retrieval process (step S42), the similar sentence matching unit 48 is called on the structure of the search sentence generated in step S2 and the structure of each case sentence stored in the case database 50 to obtain the similarity between the search sentence and each case sentence, and, based on the obtained similarities, similar case sentences resembling the search sentence input in step S41 are retrieved from the case database 50. Finally, in the search result display process (step S43), the similar case sentences obtained by the similar case retrieval process (step S42) and the contents of the case clusters stored in the case database 50 are displayed.
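The retrieval flow (score every stored case sentence against the search sentence, then return the most similar ones) might be sketched as follows. The function names, the `top_k` and `min_score` parameters, and the word-overlap score are hypothetical; in the apparatus the score would come from the similar sentence matching unit operating on analyzed sentence structures, not on raw strings.

```python
from typing import Callable, List, Tuple


def word_overlap(a: str, b: str) -> float:
    """Stand-in similarity: Jaccard overlap of whitespace-separated words."""
    wa, wb = set(a.split()), set(b.split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def retrieve_similar(
    query: str,
    cases: List[str],
    similarity: Callable[[str, str], float],
    top_k: int = 3,
    min_score: float = 0.0,
) -> List[Tuple[float, str]]:
    """Score each case against the query and return the best matches first."""
    scored = [(similarity(query, case), case) for case in cases]
    # Dropping scores at or below min_score is an assumption of this sketch.
    scored = [(score, case) for score, case in scored if score > min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Invented case sentences and query.
case_sentences = [
    "printer does not print",
    "screen is blank",
    "printer prints blank pages",
]
results = retrieve_similar("printer does not print color", case_sentences, word_overlap)
ranked = [case for _, case in results]
```

The ranked list corresponds to what the search result display process would present to the user.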

【００６９】In FIG. 24, the similar case retrieval process (step S42) is configured to call the similar sentence matching unit 48 of the case accumulation and retrieval apparatus shown in FIG. 1; the similar sentence matching unit 48 may instead be configured as a module (a set of steps) called from the similar case retrieval process (step S42) and implemented as a program.

【００７０】Below, the operation of the case accumulation function in the case accumulation and retrieval apparatus according to Embodiment 1 of the present invention is described in detail with a concrete example, following the flowchart shown in FIG. 2 and referring to the drawings as appropriate.

【００７１】First, the processing target sentence extraction process (step S1) is described. In step S1, a digitized document file, such as a file of inquiry records, that is the target of case accumulation processing is input, and the case sentences to be processed are extracted from it.

【００７２】FIG. 3 shows an example of the structure of a digitized document file. In FIG. 3, reference numeral 54 denotes one digitized document file subject to case accumulation processing; 55 and 56 each denote a document that exists in the digitized document file 54 and has multiple fields; 57 denotes a fixed-form field, that is, a field in the document 55 whose content (as an attribute) is expressed as a number or a simple character string; 58 denotes a free-form field, that is, a field in the document 55 whose content (as an attribute) is described by multiple sentences expressed in natural language; 59 denotes a document start tag marking the beginning of the document 55; and 60 denotes a document end tag marking the end of the document 55.

【００７３】Reference numeral 61 denotes a document field start tag marking the beginning of each field in the document 55 (hereinafter called a "document field"), whether fixed-form or free-form; 62 denotes a document field end tag marking the end of a document field in the document 55.

【００７４】In the digitized document file 54 shown in FIG. 3, the reference numerals 61 and 62 specifically point to the document field start tag and the document field end tag, respectively, of the document field "customer name".

【００７５】As shown in FIG. 3, a single digitized document file 54 may contain multiple documents. In the example of FIG. 3, document boundaries are marked by the document start tag 59, "<DOC>", and the document end tag 60, "</DOC>".

【００７６】A document consists of multiple document fields. For example, as shown in FIG. 3, the document 55 consists of fixed-form fields 57, such as "customer name", "customer telephone", "model", and "subject", whose content is formulaic and can be expressed as numbers or simple character strings, and free-form fields 58, such as "question" and "answer", whose content is not formulaic and is described by multiple sentences expressed in natural language.

【００７７】In the example of the digitized document file shown in FIG. 3, the document field start tag 61 of a document field is the character string that is enclosed by the symbols "<" and ">" and includes those symbols; the character string obtained by removing "<" and ">" from the document field start tag 61 is the field name. Likewise, the document field end tag 62 is the character string that is enclosed by the symbols "</" and ">" and includes those symbols; the character string obtained by removing "</" and ">" from the document field end tag 62 is the field name.
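The tag convention described in this paragraph can be expressed with two regular expressions. This is a minimal sketch: the names `START_TAG`, `END_TAG`, `is_start_tag`, and `field_name` are hypothetical, and the sample tags use ASCII "<", "/", and ">" with English field names for readability.

```python
import re

# A start tag is "<name>" where the character after "<" is not "/".
START_TAG = re.compile(r"<([^/<>][^<>]*)>")
# An end tag is "</name>".
END_TAG = re.compile(r"</([^<>]+)>")


def is_start_tag(tag: str) -> bool:
    """True if the whole string is a document field start tag."""
    return START_TAG.fullmatch(tag) is not None


def field_name(tag: str) -> str:
    """Strip the enclosing symbols from a start or end tag to get the field name."""
    match = END_TAG.fullmatch(tag) or START_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a document field tag: {tag!r}")
    return match.group(1)


name_from_start = field_name("<customer name>")
name_from_end = field_name("</customer name>")
```

Both calls yield the same field name, which is what lets a start tag and an end tag be recognized as belonging to the same document field.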

【００７８】FIG. 4 is a flowchart showing the flow of the processing target sentence extraction process (step S1) in the flowchart shown in FIG. 2. The process for an input digitized document file 54 such as that shown in FIG. 3 is described below with reference to the flowchart of FIG. 4.

【００７９】First, the character strings of the digitized document file 54 are skipped over sequentially from the beginning (step S5) until the document start tag 59, "<DOC>", is found (step S6). If the document start tag 59 is not found before the end of the digitized document file 54 is detected (denoted "EOF" in the flowchart of FIG. 4), the process ends. If the document start tag 59 is found, the process proceeds to step S7, and the character strings of the digitized document file 54 are read in sequentially (step S7) until the document end tag 60, "</DOC>", is found (step S8). If the document end tag 60 is not found before the end of the digitized document file 54 is detected, the process ends. If the document end tag 60 is found, the process proceeds to step S9.

【００８０】In step S9, if the character string read in at step S7 is not empty (here the document start tag 59 and the document end tag 60 are skipped over and are not included in the read-in string), the string is judged to be one document 55 and is cut out, and, so that the cut-out document 55 can be processed as a case, a uniquely determined case number is assigned to it (step S9). Next, the processing target sentence extraction process for one document (step S10), whose flow is described later, is executed on the document 55 cut out in step S9; the process then returns to step S5, and the above is repeated.
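The loop of steps S5 to S9 can be sketched as a scan for successive document start and end tags. The function name `extract_documents`, the use of sequential integers as case numbers, and the ASCII tag strings are assumptions for illustration; the per-document processing of step S10 and the warning cases are left out.

```python
from typing import List, Tuple

DOC_START = "<DOC>"
DOC_END = "</DOC>"


def extract_documents(text: str) -> List[Tuple[int, str]]:
    """Cut out each non-empty document body and pair it with a case number."""
    documents: List[Tuple[int, str]] = []
    pos = 0
    case_number = 1  # sequential numbering is an assumption; any unique number would do
    while True:
        start = text.find(DOC_START, pos)        # steps S5/S6: skip ahead to the start tag
        if start == -1:
            break                                # end of file reached without a start tag
        body_start = start + len(DOC_START)
        end = text.find(DOC_END, body_start)     # steps S7/S8: read in up to the end tag
        if end == -1:
            break                                # end of file reached without an end tag
        body = text[body_start:end]              # the tags themselves are not included
        if body:                                 # step S9: only a non-empty string is a document
            documents.append((case_number, body))
            case_number += 1
        pos = end + len(DOC_END)                 # resume scanning after this document
    return documents


sample = "junk<DOC>first</DOC>\n<DOC></DOC><DOC>second</DOC><DOC>unterminated"
docs = extract_documents(sample)
```

In this sample the leading "junk", the empty document, and the unterminated final document are silently skipped, whereas the apparatus would raise a warning in such situations.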

【００８１】In the following three cases (1) to (3), one of these actions is taken: (a) warn the user and continue processing; (b) warn the user, suspend processing, wait for the user's instruction, act on that instruction, and then continue processing; or (c) warn the user and end processing.
(1) In step S6, data other than the document start tag 59 and the document end tag 60 was skipped over.
(2) In step S8, the end of the digitized document file 54 was detected while a read-in character string not yet processed by steps S9 and S10 remains.
(3) In step S9, the read-in character string was empty.

【００８２】FIG. 5 is a flowchart showing the flow of the processing-target sentence extraction process for one document (step S10) in the flowchart shown in FIG. 4. Next, the processing-target sentence extraction process for one document (step S10) is described using the flowchart shown in FIG. 5.

【００８３】In the processing-target sentence extraction process for one document (step S10), the character string of one document in the digitized document file 54 is received as input. First, the input character string for one document is read forward from the beginning (step S11) until a character string that is enclosed by the symbols "<" and ">" and in which the character following "<" is not "/", that is, a document field start tag 61, is found (step S12). If no document field start tag 61 is found before the end of the input character string for one document is detected (denoted "EOD" in the flowchart shown in FIG. 5), the process ends.

【００８４】If the document field start tag 61 is found, the process proceeds to step S13, and the input character string for one document is read sequentially (step S13) until a character string enclosed by "</" and ">", that is, a document field end tag 62, is found (step S14). If no document field end tag 62 is found before the end of the input character string for one document is detected, the process ends. If the document field end tag 62 is found, the process proceeds to step S15.

【００８５】In step S15, if the character string read in step S7 is not empty (here, the document field start tag 61 and the document field end tag 62 are saved separately and are not included in the read character string), the read character string is judged to be the content of one field and is cut out, and the field name is extracted from the document field start tag 61 and/or the document field end tag 62 to identify the field.

【００８６】FIG. 6 is a diagram showing an example of document field information, which describes in advance information about document fields such as whether a field is a fixed field or an atypical field. In FIG. 6, 63 denotes the document field information; 64 denotes a "field name" column in which the field names of all fields handled by this case storage and retrieval method are registered; 65 denotes a "fixed/atypical" column in which the attribute of whether each field registered in the "field name" column 64 is a fixed field or an atypical field is defined and registered in advance; 66 denotes an "option" column in which the options that can be set for each field registered in the "field name" column 64 are registered; 67 denotes a "category" attribute set in order to classify the set of case data broadly; and 68 denotes a "title" attribute used to identify each case, for example when a list of case data is displayed.

【００８７】In the field name extraction process, if the character string obtained by removing the symbols "<" and ">" from the document field start tag 61 and the character string obtained by removing the symbols "</" and ">" from the document field end tag 62 are identical and not empty, that character string is judged to be the field name and is extracted.
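The field cut-out and field name extraction described for steps S11 to S15 can be sketched as below. This is a minimal sketch under stated assumptions: the function name `extract_fields` and the use of a regular expression in place of the character-by-character scan are choices made for illustration, and the warning cases are collected in a list instead of interrupting processing.

```python
import re

# A start tag is "<name>" whose first character after "<" is not "/";
# an end tag is "</name>". The content between them is matched lazily.
FIELD_RE = re.compile(r"<([^/<>][^<>]*)>(.*?)</([^<>]*)>", re.DOTALL)


def extract_fields(document):
    """Return ([(field_name, content), ...], [warning, ...]) for one document."""
    fields, warnings = [], []
    for match in FIELD_RE.finditer(document):
        start_name = match.group(1)
        content = match.group(2).strip()
        end_name = match.group(3)
        if start_name != end_name:
            # The names taken from the start tag and the end tag differ.
            warnings.append(f"tag mismatch: <{start_name}> vs </{end_name}>")
            continue
        if not content:
            # The character string read between the tags is empty.
            warnings.append(f"empty field: <{start_name}>")
            continue
        fields.append((start_name, content))
    return fields, warnings
```

A field is accepted only when both tag names agree and are non-empty, which mirrors the acceptance condition of the field name extraction process.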

【００８８】Next, the field identification process refers to the document field information 63 shown in FIG. 6. If the field name extracted in the field name extraction process is registered in the "field name" column 64, the attribute value in the "fixed/atypical" column 65 corresponding to the registered field name is retrieved, and the field is identified as either a fixed field or an atypical field.

【００８９】Next, if the retrieved attribute value is "fixed", the process proceeds to step S16 and the processing for a fixed field is executed. If the retrieved attribute value is "atypical", the process proceeds to step S17 and the processing for an atypical field is executed.

【００９０】In step S16, which processes a fixed field, the content of the field extracted in step S15 is registered as is in the corresponding column of the case database 50. In the example of the digitized document file 54 shown in FIG. 3, when the field is "<customer name>", the character string "Taro Yamada" is registered as is in the field content column that corresponds to the column of the case database 50 whose field name is "customer name".

【００９１】In step S17, which processes an atypical field, the "atypical field processing" described later is called. In the example of the digitized document file 54 shown in FIG. 3, this "atypical field processing" is called for the "<question>" field and the "<answer>" field.
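The branch between step S16 and step S17 amounts to a table lookup followed by a dispatch, which can be sketched as below. The table `FIELD_INFO` is a hypothetical stand-in for the document field information 63 of FIG. 6 (its entries are taken from the field names mentioned in the text, the "option" column is omitted), and `process_field` and `atypical_handler` are names invented for this sketch.

```python
# Hypothetical stand-in for the document field information 63 (FIG. 6):
# field name -> "fixed" or "atypical".
FIELD_INFO = {
    "customer name": "fixed",
    "model": "fixed",
    "question": "atypical",
    "answer": "atypical",
}


def process_field(name, content, case_record, atypical_handler):
    """Route one field by its registered attribute.

    Returns False when the field name is not registered, which the
    text treats as a warning case.
    """
    kind = FIELD_INFO.get(name)
    if kind == "fixed":
        # Step S16: register the field content as is.
        case_record[name] = content
    elif kind == "atypical":
        # Step S17: hand the content to the atypical field processing.
        case_record[name] = atypical_handler(name, content)
    else:
        return False
    return True
```

Passing the atypical field processing in as a callable keeps the dispatch independent of how sentences are later cut out and tagged.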

【００９２】After the processing of step S16 or step S17 is finished, the process returns to step S11 and the above steps are repeated.

【００９３】In the following five cases (1) to (5), one of the following actions is taken: (a) a warning is issued to the user and processing continues; (b) a warning is issued to the user, processing is suspended until the user gives an instruction, the instruction is carried out, and processing then continues; or (c) a warning is issued to the user and processing ends.

【００９４】(1) In step S12, data other than the document field start tag 61 and the document field end tag 62 was read past.
(2) In step S8, the end of the digitized document file 54 was detected while a read character string not yet processed in step S16 or step S17 remains.
(3) In step S15, the read character string was empty.
(4) In step S15, for the cut-out character string (corresponding to one field), the field name extracted from the document field start tag 61 does not match the field name extracted from the document field end tag 62.
(5) The field name extracted in step S15 does not exist in the document field information 63 shown in FIG. 6. This includes the case where the extracted field name was empty.

【００９５】FIG. 7 is a flowchart showing the flow of the processing for an atypical field (step S17 in the flowchart shown in FIG. 5) in the digitized document file 54 shown in FIG. 3.

【００９６】In FIG. 7, 69 denotes the case number set for each document; 70 denotes the field name of the atypical field being processed; 71 denotes a case sentence cut out one sentence at a time in the sentence cut-out process (step S18) described later; 72 denotes a serial sentence number assigned to each case sentence 71; 73 denotes a sentence tag assigned to each case sentence 71 in the sentence tag assignment process (step S20) described later; and 74 denotes a serial per-tag sentence number assigned, for each sentence tag 73, to each of the case sentences 71 that have the same sentence tag 73.

【００９７】Next, the flow of the atypical field processing is described using the flowchart shown in FIG. 7.

【００９８】First, the case number 69 of the document being processed (for example, the document 55 in FIG. 3) and the field name 70 of the atypical field being processed (for example, the "<question>" field among the atypical fields 58 in FIG. 3) are passed in as input. One sentence is then cut out from among the multiple sentences written in the atypical field being processed in the digitized document file 54, judging from the full stop "。", the middle dot "・", character strings that indicate itemization such as chapter numbers, and the like (step S18).

【００９９】In the example of the digitized document file 54 shown in FIG. 3, the sentence "The air conditioner makes a noise." is cut out first from the "<question>" field. Here, a sentence cut out in the sentence cut-out process (step S18) is called a "case sentence", which makes up the case data. If a case sentence 71 could be cut out in step S18, the extracted case sentence 71 is given the case number 69 of the case to which it belongs, the field name 70 of the field to which it belongs, and a sentence number 72 indicating its position within the field being processed (for example, the "<question>" field shown in FIG. 3).
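The sentence cut-out of step S18 and the labels attached to each case sentence can be sketched as below. This is a simplified sketch: only the full stop "。" and the middle dot "・" are treated as boundaries (itemization markers such as chapter numbers are left out), the record layout is an assumption, and `extract_case_sentences` is a name invented for the example. Splitting on zero-width matches requires Python 3.7 or later.

```python
import re

# Split after a full stop "。" and before a middle dot "・".
BOUNDARY_RE = re.compile(r"(?<=。)|(?=・)")


def extract_case_sentences(case_number, field_name, field_text):
    """Cut a field into case sentences and label each one.

    Each record carries the case number (69), the field name (70),
    the sentence itself (71) and a serial sentence number (72).
    """
    pieces = (piece.strip() for piece in BOUNDARY_RE.split(field_text))
    sentences = [piece for piece in pieces if piece]
    return [
        {"case": case_number, "field": field_name, "number": i, "text": s}
        for i, s in enumerate(sentences, start=1)
    ]
```

Sentence numbers restart at 1 for every field because the function is called once per field.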

【０１００】Next, it is checked whether a case sentence could be cut out (step S19). If no case sentence could be cut out, extraction of all sentences in the field being processed is judged to be complete, the presence of the document field end tag (for example, "</question>" in FIG. 3) is confirmed (step S21), and the atypical field processing ends. Control returns to step S17, the caller of the atypical field processing in the flowchart shown in FIG. 5, and then proceeds to step S11. If the presence of the document field end tag cannot be confirmed in step S21, a warning is issued.

【０１０１】On the other hand, if a case sentence could be cut out in step S19, a "sentence tag" 73, which is a label indicating the type of the case sentence 71, is assigned to the cut-out case sentence 71 (step S20).

【０１０２】FIG. 8 is a diagram showing an example of a sentence tag list. In FIG. 8, 75 denotes the sentence tag list. In the sentence tag assignment process (step S20), the sentence tag list 75 shown in FIG. 8, for example, is referred to as the information used to assign the sentence tag 73.

【０１０３】As shown in FIG. 8, the sentence tag list 75 defines in advance, for each "field", a "condition" part, which is a key expression within a sentence, and the "sentence tag" that corresponds to that "condition" part.

【０１０４】For example, in the digitized document file 54 shown in FIG. 3, consider the first case sentence of the <question> field, "The air conditioner makes a noise." (エアコンから音がする。). In the sentence tag list shown in FIG. 8, the "condition" part for the "field" "<question>" contains keywords such as "で使用" ("used in") and "がする" ("makes", as in "makes a noise"). Because the keyword "がする" appears in this case sentence, the "sentence tag" corresponding to the "condition" part "がする" in the sentence tag list 75 is retrieved, and the sentence tag of the case sentence becomes "symptom".

【０１０５】Similarly, in the digitized document file 54 shown in FIG. 3, consider the last case sentence of the <answer> field, "It is not abnormal, so no countermeasure is needed." (異常ではないので対策不要。). In the sentence tag list shown in FIG. 8, the "condition" part for the "field" "<answer>" contains keywords such as "ため" ("because"), "ようだ" ("it seems"), "思われる" ("it is thought"), "要" ("needed"), and "不要" ("not needed"). Because the keyword "不要" appears in this case sentence, the "sentence tag" corresponding to the "condition" part "不要" in the sentence tag list 75 is retrieved, and the sentence tag of the case sentence becomes "countermeasure".
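The surface-string lookup of step S20 can be sketched as below. The table `SENTENCE_TAG_LIST` is a hypothetical excerpt of the sentence tag list 75: it contains only the two keyword-to-tag pairs that the text states explicitly, and the tags for the other keywords are not assumed. The function name and the first-match-wins rule are also assumptions of this sketch.

```python
# Hypothetical excerpt of the sentence tag list 75 (FIG. 8).
# Per field: an ordered list of (condition keyword, sentence tag).
SENTENCE_TAG_LIST = {
    "question": [("がする", "symptom")],
    "answer": [("不要", "countermeasure")],
}


def assign_sentence_tag(field_name, sentence):
    """Return the tag of the first keyword found in the sentence, else None."""
    for keyword, tag in SENTENCE_TAG_LIST.get(field_name, []):
        if keyword in sentence:
            return tag
    return None
```

Because the table is scanned in order, entry order matters once overlapping keywords are added: "不要" contains "要", so a rule for "不要" would need to precede a rule for "要" to take effect.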

【０１０６】To simplify the explanation, FIG. 8 shows the case where sentence tags are determined from surface character strings (the characters that appear in the sentence). However, the "condition" part may also be made able to specify syntactic patterns of sentences, modal expressions such as negation and conjecture, and the like. In that case, the sentence analysis process described later is performed on the cut-out case sentence, and the sentence tag is then determined by matching the result against the sentence tag list 75.

【０１０７】FIG. 9 is a diagram showing an example of a "case" generated by the processing-target sentence extraction process (step S1) from the document 55 of the digitized document file 54 shown in FIG. 3. In FIG. 9, 76 denotes a case. In FIG. 9, parts identical or corresponding to those in FIG. 7 are given the same reference numerals and their description is omitted.

【０１０８】In the example shown in FIG. 9, the case number 69 ("0001" in this example) is assigned to the case 76.

【０１０９】For the document fields "<customer name>", "<customer telephone>", "<subject>", and "<model>" of the fixed fields 57 shown in FIG. 3, the case 76 contains the field names "customer name", "customer number", "subject", and "model", together with the fixed field contents associated with those field names: "Taro Yamada", "0□□□-△▽-○○○○", "refrigerant noise of air conditioner", and "air conditioner".

【０１１０】For the fields "<question>" and "<answer>" of the atypical fields 58 shown in FIG. 3, the case 76 contains the following: the field names 70, such as "question" and "answer"; the case sentences 71 belonging to each field, associated with the field name 70, from "The air conditioner makes a noise." through "It is not abnormal, so no countermeasure is needed."; the serial sentence numbers 72 assigned to the case sentences 71 within each field, namely "1", "2", "3", "4" (<question> field) and "1", "2" (<answer> field); and the sentence tags 73, namely "symptom", "environment", "symptom", "symptom" (<question> field) and "cause", "countermeasure" (<answer> field).

【０１１１】The sentence tag 73 "symptom" is given to the case sentences 71 with sentence numbers 1, 3, and 4 in the <question> field, and each of the other case sentences 71 is given a different sentence tag 73. Accordingly, the per-tag sentence numbers 74 "1", "1", "2", "3", "1", and "1" are assigned to the case sentences 71 with sentence numbers 1, 2, 3, 4 (<question> field) and 1, 2 (<answer> field), respectively.
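The per-tag sentence numbers 74 are running counts of each sentence tag, which can be sketched as below. The function name `per_tag_numbers` is invented for the example, and counting across the whole case (not separately per field) is an assumption; the FIG. 9 example gives the same result either way because no tag occurs in both fields.

```python
from collections import Counter


def per_tag_numbers(tags):
    """Give each sentence its serial number among sentences with the same tag."""
    seen = Counter()
    numbers = []
    for tag in tags:
        seen[tag] += 1          # how many times this tag has occurred so far
        numbers.append(seen[tag])
    return numbers


# Sentence tags of the FIG. 9 example, in sentence order.
tags = ["symptom", "environment", "symptom", "symptom", "cause", "countermeasure"]
print(per_tag_numbers(tags))    # [1, 1, 2, 3, 1, 1]
```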

【０１１２】FIG. 9 shows the case where the sentence cut-out process (step S18 in the flowchart shown in FIG. 7) and the sentence tag assignment process (step S20 in the same flowchart) both succeeded completely. However, these two processes may handle case sentences incorrectly because sentence boundaries and sentence tag assignment conditions can be ambiguous. To deal with this, a display function can be provided in the processing-target sentence extraction unit 45 of the case storage and retrieval device shown in FIG. 1, so that a case is presented (displayed) to the user in a format such as that of FIG. 9 and editing or correction input from the user is accepted. This makes manual editing and correction possible.

【０１１３】This completes the processing-target sentence extraction process (step S1) in the flowchart shown in FIG. 2. Next, the sentence analysis process (step S2) is executed.

【０１１４】In the sentence analysis process (step S2), morphological analysis and syntactic analysis are performed on the set of processing-target sentences (case sentences) obtained in the processing-target sentence extraction process (step S1), and the structure of each sentence is generated.

【０１１５】FIG. 10 is a flowchart showing the flow of the sentence analysis process (step S2) in the flowchart shown in FIG. 2. In FIG. 10, 77 denotes an analysis word dictionary used in the morphological analysis (step S22) described later. In FIG. 10, parts identical or corresponding to those in FIG. 2 are given the same reference numerals and their description is omitted. Next, the flow of the sentence analysis process (step S2) is described using the flowchart shown in FIG. 10.

【０１１６】As shown in FIG. 10, the sentence analysis process (step S2) consists of a morphological analysis process (step S22) and a syntactic analysis process (step S23). The morphological analysis process (step S22) refers to the analysis word dictionary 77, and the syntactic analysis process (step S23) refers to the area ontology 44. Alternatively, the analysis word dictionary 77 may be included in the area ontology 44 so that the morphological analysis process (step S22) refers to the area ontology 44.

【０１１７】FIG. 11 is a flowchart showing the flow of the morphological analysis process (step S22) in the flowchart shown in FIG. 10, and is also a configuration diagram showing the configuration of the morphological analysis unit (not shown in FIG. 1) within the sentence analysis unit 47 of the case storage and retrieval device shown in FIG. 1.

【０１１８】In FIG. 11, 78 denotes the morphological analysis unit, which resides in the sentence analysis unit 47 of the case storage and retrieval device shown in FIG. 1 and performs morphological analysis; 79 denotes an attached word dictionary in which the attached words used in morphological analysis are stored in advance; 80 denotes an attached word connection table, stored in the attached word dictionary 79, in which the connection relations of the attached words stored in the attached word dictionary 79 are described in advance; 81 denotes an input sentence to be processed by the morphological analysis process (step S22); 82 denotes an intermediate analysis result obtained while the input sentence 81 is being morphologically analyzed; 83 denotes the morphological analysis result obtained by executing the morphological analysis process (step S22); and 84, 85, and 86 show the structure of the analysis word dictionary 77, being the "headword", the "part of speech" information, and the "semantic symbol", respectively. In FIG. 11, parts identical or corresponding to those in FIG. 10 are given the same reference numerals and their description is omitted.

【０１１９】Next, the operation of the morphological analysis unit 78 and the flow of the morphological analysis process (step S22) are described using the configuration diagram and flowchart of the morphological analysis unit 78 shown in FIG. 11.

【０１２０】The morphological analysis unit 78 executes the morphological analysis process (step S22). In the morphological analysis process (step S22), morphological analysis is performed with reference to the analysis word dictionary 77, the attached word dictionary 79, and the attached word connection table 80, and the input sentence 81 is divided into a sequence of morphemes. The method of morphological analysis, the attached word dictionary 79, and the attached word connection table 80 are described in detail in many publications, so their description is omitted here. In the following description, morpheme boundaries are abbreviated with "/", as shown in the intermediate analysis result 82 in FIG. 11.

【０１２１】The intermediate analysis result 82 shown in FIG. 11 is structured so that the information described in the analysis word dictionary 77 (the headword 84, the part-of-speech information 85, and the semantic symbol 86, which is conceptual information representing the type of the word) can be referenced from the independent word portion. For this purpose, the intermediate analysis result 82 may hold pointer information for referencing the analysis word dictionary 77. Alternatively, if referencing takes time, for example because the analysis word dictionary 77 resides in secondary storage, the information of the analysis word dictionary 77 may be copied into primary storage. In the following description, as shown in FIG. 11, a semantic symbol 86 is written as a word enclosed by the symbols "<" and ">".

【０１２２】FIG. 12 is a flowchart showing the flow of the syntactic analysis process (step S23) in the flowchart shown in FIG. 10, and is also a configuration diagram showing the configuration of the syntactic analysis unit (not shown in FIG. 1) within the sentence analysis unit 45 of the case storage and retrieval device shown in FIG. 1.

【０１２３】In FIG. 12, 87 denotes the syntactic analysis unit, which resides in the sentence analysis unit 45 and performs syntactic analysis; 88 denotes grammar rules stored in advance for use in the dependency analysis process (step S25); 89 denotes a phrase structure generated by the phrase structure generation process (step S24) with the morphological analysis result 83 as input; and 90 denotes a dependency structure generated by the dependency analysis process (step S25) with the phrase structure 89 as input.

【０１２４】Next, the operation of the syntactic analysis unit and the flow of the syntactic analysis process (step S23) are described using the configuration diagram and flowchart of the syntactic analysis unit 87 shown in FIG. 12.

【０１２５】The syntactic analysis unit 87 consists of a phrase structure generation unit (not shown), which generates the phrase structure 89 that serves as the basic unit of dependency analysis, and a dependency analysis unit (not shown), which performs dependency analysis with reference to the grammar rules 88.

【０１２６】In the syntactic analysis process (step S23) shown in FIG. 10, the phrase structure generation process (step S24) first takes the morphological analysis result 83 shown in FIG. 11 as input and generates the phrase structure 89, which is the basic unit of dependency analysis. A phrase structure 89 consists either of one or more independent word morphemes alone, or of one or more independent word morphemes followed by one or more attached word morphemes.

【０１２７】FIG. 13 is a diagram showing an example of the phrase structure 89. In FIG. 13, 91 denotes a dependent attribute, which is the attribute of the phrase as the modifying side of a dependency; 92 denotes a receiving attribute, which is the attribute of the phrase as the receiving side of a dependency; 93 denotes independent word information, which stores pointers to the morpheme information of the one or more independent words that make up the phrase; and 94 denotes attached word information, which stores pointers to the morpheme information of zero or more attached words.
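The phrase structure of FIG. 13 and the grouping rule of step S24 can be sketched as below. This is an illustrative sketch only: the class and function names are invented, the morpheme records are reduced to a surface string and an independent/attached flag, the attributes 91 and 92 are left unset, and merging consecutive independent words into one phrase is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Morpheme:
    surface: str
    independent: bool  # True: independent word, False: attached word


@dataclass
class Phrase:
    independent_words: list = field(default_factory=list)  # information 93
    attached_words: list = field(default_factory=list)     # information 94
    dependent_attr: str = ""   # attribute 91, not filled in by this sketch
    receiving_attr: str = ""   # attribute 92, not filled in by this sketch


def build_phrases(morphemes):
    """Group morphemes into phrases: independent word(s), then attached word(s)."""
    phrases = []
    for m in morphemes:
        if m.independent:
            # An independent word opens a new phrase once the current
            # phrase has already received an attached word.
            if not phrases or phrases[-1].attached_words:
                phrases.append(Phrase())
            phrases[-1].independent_words.append(m)
        elif phrases:
            phrases[-1].attached_words.append(m)
    return phrases
```

For a morpheme sequence such as センサ/が/目詰まり/を/認識する, with the two particles flagged as attached words, this yields three phrases, the last of which has no attached word.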

【０１２８】In the dependency analysis process (step S25), dependency analysis is performed on the phrase structures 89 in accordance with the grammar rules 88, and the dependency structure 90 is generated as the syntactic analysis result. Methods of dependency analysis are explained in many publications, so a detailed description is omitted here. In general, many ambiguities arise in the course of dependency analysis, and the area ontology 44 is referred to as needed to resolve them. The grammar rules 88 may also be embedded in the program of the dependency analysis process (step S25).

【０１２９】This completes the sentence analysis process (step S2) in the flowchart shown in FIG. 2. Next, the similar sentence clustering process (step S3) is executed. This process classifies (clusters) the set of syntactic analysis results 90 of the processing-target sentences (case sentences) into groups of similar sentences, calling the similar sentence matching unit 48, which matches similar sentences with reference to the area ontology 44.

【０１３０】FIG. 14 is a diagram showing an example of a problem solving tree 95 made up of clusters, each containing a set of similar case sentences generated from the digitized document file 54 shown in FIG. 3.

【０１３１】In FIG. 14, 95 denotes the problem solving tree; 96 denotes a cluster labeled "worrying noise"; 97 denotes a cluster labeled "main unit lamp not lit"; 98 denotes a cluster labeled "main unit indicator lamp blinking"; 99 denotes a cluster labeled "input timer operating"; 100 denotes the number of cases (sentences) associated with the cluster; 101 denotes a confirmation check box used at retrieval time; 102 denotes the display label of a cluster, set for displaying the cluster; and 103 denotes a terminal indicator attached to a cluster located at a terminal of the problem solving tree 95 so that the cluster can be recognized as a terminal.

【０１３２】A problem solving tree 95 made up of clusters that contain sets of similar sentences, as shown in FIG. 14, is constructed. Each cluster of the resulting problem solving tree 95 holds a set of similar case sentences, and, as shown in FIG. 9, each case sentence 71 is stored in the case database 50 shown in FIG. 1 in a form that holds a pointer to its case 76. Before the flow of the similar sentence clustering process (step S3) in the flowchart shown in FIG. 2 is described, the operation of the similar sentence matching unit 48, which is called from the similar sentence clustering process (step S3) to obtain the similarity between sentences, is described first.

【０１３３】領域オントロジ４４を参照して類似文の照
合を行う類似文照合部４８における文同士の類似度の計
算は、文の係り受け構造におけるノードの階層毎の重み
付けやノード内の属性による重み付け（否定、推量など
の様相表現を計算対象にする処理）によって行う。The similar sentence matching unit 48, which matches similar sentences with reference to the area ontology 44, calculates the similarity between sentences by weighting each level of nodes in the dependency structure of a sentence and by weighting according to node attributes (that is, by including modal expressions such as negation and conjecture in the calculation).

【０１３４】図１５は、２つの文の係り受け構造の例を
示す図である。図１５において、１０４は文Ｓｏ「セン
サが目詰まりを認識する」の係り受け構造、１０５は文
Ｓｉ「センサが結露を検出する」の係り受け構造であ
る。FIG. 15 is a diagram showing an example of a dependency structure of two sentences. In FIG. 15, reference numeral 104 denotes a dependency structure of a sentence So "sensor recognizes clogging", and reference numeral 105 denotes a dependency structure of sentence Si "sensor detects dew condensation".

【０１３５】次に、図１５を用いて類似文照合部４８に
おける文同士の類似度の計算方法を説明する。類似文照
合部４８における自然言語の文同士の類似度計算関数を
Ｓｉｍ（Ａ，Ｂ，Ｄ）とする。引数ＡおよびＢは、図１
０に示した構文解析処理（ステップＳ２３）を実行した
結果得られた構文解析結果である係り受け構造９０（図
１２）であり、図１５に示すような木構造をしている。
また、引数Ｄは、文同士の類似度計算を行う際の照合の
詳細度であり、図１５に示した２つの木構造をした係り
受け構造１０４、１０５の類似度計算の際に、ルートノ
ードから何階層目までを処理対象とするかを示す値であ
る。ここでは簡単のため、Ｄ＝２として説明する。Next, the method of calculating the similarity between sentences in the similar sentence matching unit 48 is described with reference to FIG. 15. Let Sim(A, B, D) be the function that calculates the similarity between natural language sentences in the similar sentence matching unit 48. The arguments A and B are dependency structures 90 (FIG. 12), that is, the parse results obtained by executing the syntax analysis process (step S23) shown in FIG. 10, and have a tree structure as shown in FIG. 15. The argument D is the level of detail of the matching: when the similarity between the two tree-structured dependency structures 104 and 105 shown in FIG. 15 is calculated, D indicates how many levels, counted from the root node, are processed. For simplicity, D = 2 is assumed here.

【０１３６】最初に初期類似度１．０を与える。類似度
１．０は、入力された２つの文が全く同じ意味を表すと
いうことを意味する。以下の処理では、木構造を辿りな
がら各ノードの情報を比較し、異なる部分に「ペナルテ
ィ」を与え、１．０から減じていく。類似度が０または
所定の値になった時点で、比較対象は類似していないと
見なして類似度計算を打切る。First, an initial similarity of 1.0 is assigned. A similarity of 1.0 means that the two input sentences express exactly the same meaning. In the following processing, the information in each node is compared while the tree structures are traversed, a "penalty" is assigned to each differing part, and the penalty is subtracted from 1.0. When the similarity reaches 0 or a predetermined value, the sentences being compared are regarded as dissimilar and the similarity calculation is terminated.

【０１３７】図１６は、類似度計算におけるペナルティ
の計算規則の例を示す図である。FIG. 16 is a diagram showing an example of a penalty calculation rule in the similarity calculation.

【０１３８】まず、第１レベルのノード間の比較を行
う。ここでは、意味シンボルが「＜検出動作＞」で等し
く、実際の単語が異なるので図１６に示した規則Ｒ１に
従ってペナルティ値−０．０１を与える。次に、第２レ
ベルのノード間の比較を行う。このとき、左側のノード
「センサ＜モニタ装置＞」は情報が完全に一致するので
ペナルティを与えない。右側のノードは意味シンボルが
異なるので図１６の規則Ｒ４に従ってペナルティ−０．
３を与える。このようにして、類似度は、１．０−０．
０１−０．３を計算して０．６９となる。First, the first-level nodes are compared. Here the semantic symbols are both "<detection operation>" but the actual words differ, so a penalty of -0.01 is assigned according to rule R1 shown in FIG. 16. Next, the second-level nodes are compared. The left-hand nodes, "sensor <monitor device>", match completely, so no penalty is assigned. The right-hand nodes have different semantic symbols, so a penalty of -0.3 is assigned according to rule R4 of FIG. 16. The similarity is therefore 1.0 - 0.01 - 0.3 = 0.69.

【０１３９】例えば、仮に前記引数の値Ｄとして「１」
が与えられていれば、類似度計算は第１レベルのノード
だけが対象となり、類似度は１．０−０．０１を計算し
て０．９９となる。このように、Ｄの値によって類似度
計算の精度を制御できるので、検索状況に応じて柔軟な
処理（類似度計算）が可能となる。For example, if "1" were given as the value of the argument D, the similarity calculation would cover only the first-level nodes, and the similarity would be 1.0 - 0.01 = 0.99. Because the precision of the similarity calculation can be controlled through the value of D in this way, processing (similarity calculation) can be adapted flexibly to the search situation.
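
The depth-limited calculation of Sim(A, B, D) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Node` class, the function names, the position-by-position alignment of child nodes, and the semantic symbols given to the object nodes are all invented here. Only the penalty values for rules R1 and R4 and the two example sentences come from the text.

```python
from dataclasses import dataclass, field

# Penalties quoted in the text for rules R1 and R4 of FIG. 16.
R1_SAME_SYMBOL_DIFFERENT_WORD = 0.01
R4_DIFFERENT_SYMBOL = 0.3


@dataclass
class Node:
    word: str      # surface word
    symbol: str    # semantic symbol, e.g. "<detection operation>"
    children: list = field(default_factory=list)


def node_penalty(a, b):
    """Penalty for one aligned pair of nodes (rules R1 and R4 only)."""
    if a.symbol != b.symbol:
        return R4_DIFFERENT_SYMBOL
    if a.word != b.word:
        return R1_SAME_SYMBOL_DIFFERENT_WORD
    return 0.0


def sim(a, b, depth):
    """Start at 1.0 and subtract penalties for the first `depth` levels."""
    score = 1.0
    level = [(a, b)]
    for _ in range(depth):
        next_level = []
        for x, y in level:
            score -= node_penalty(x, y)
            # Assumption: child nodes are aligned by position.
            next_level.extend(zip(x.children, y.children))
        level = next_level
    return max(score, 0.0)


# The two sentences of FIG. 15; the symbols of the object nodes are invented.
s_o = Node("recognize", "<detection operation>",
           [Node("sensor", "<monitor device>"),
            Node("clogging", "<clogging>")])
s_i = Node("detect", "<detection operation>",
           [Node("sensor", "<monitor device>"),
            Node("dew condensation", "<condensation>")])

print(round(sim(s_o, s_i, 2), 2))  # 0.69
print(round(sim(s_o, s_i, 1), 2))  # 0.99
```

With D = 2 the sketch subtracts 0.01 at the first level and 0.3 at the second, and with D = 1 it stops after the first level, matching the two values worked out in the text.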

【０１４０】図１７は、領域オントロジ４４に格納され
た領域に関する知識の例を示す図である。図１７におい
て、１０６は上位概念−下位概念の関係を示すＩＳ−Ａ
関係知識、１０７は全体−部分の関係を示すＨＡＳ−Ａ
関係知識、１０８は概念間の関係を示す格関係知識、１
０９は表現の差異が大きい場合にその違いを吸収する言
換え知識、１１０は「＜タイマ表示＞」に関するＨＡＳ
−Ａ関係知識であり、同時には起り得ない背反関係を示
す知識である。なお、図１７中、図２と同一または相当
部分には同一符号を付して説明を省略する。FIG. 17 is a diagram showing an example of the domain knowledge stored in the area ontology 44. In FIG. 17, reference numeral 106 denotes IS-A relation knowledge representing superordinate-subordinate concept relations; 107 denotes HAS-A relation knowledge representing whole-part relations; 108 denotes case relation knowledge representing relations between concepts; 109 denotes paraphrase knowledge that absorbs large differences in expression; and 110 denotes HAS-A relation knowledge about "<timer display>", which expresses a mutually exclusive relation between states that cannot occur at the same time. In FIG. 17, parts identical or corresponding to those in FIG. 2 are denoted by the same reference numerals, and their description is omitted.

【０１４１】即ち、領域オントロジ４４は、事例蓄積・
検索処理の対象とする領域に依存したＩＳ−Ａ関係知識
１０６、ＨＡＳ−Ａ関係知識１０７、格関係知識１０
８、言換え知識１０９からなどから構成され、背反関係
を示す知識の記述も可能である。なお、ＩＳ−Ａ関係知
識１０６やＨＡＳ−Ａ関係知識１０７は、通常「シソー
ラス」と呼ばれるものである。That is, the area ontology 44 is composed of IS-A relation knowledge 106, HAS-A relation knowledge 107, case relation knowledge 108, paraphrase knowledge 109, and the like, all of which depend on the domain targeted by case storage and retrieval, and it can also describe knowledge expressing mutually exclusive relations. The IS-A relation knowledge 106 and the HAS-A relation knowledge 107 are what is usually called a "thesaurus".

【０１４２】また、図１５に示した係り受け構造１０４
に対して図１６に示した規則Ｒ２を適用すると、図１７
に示したＩＳ−Ａ関係知識１０６によって、単語「モニ
タ装置」の意味シンボルが「＜本体装置＞」の場合、文
「モニタ装置が目詰まりを認識する」の類似度は、１．
０−０．１を計算して０．９となる。When rule R2 shown in FIG. 16 is applied to the dependency structure 104 shown in FIG. 15 and, according to the IS-A relation knowledge 106 shown in FIG. 17, the semantic symbol of the word "monitor device" is "<main unit device>", the similarity of the sentence "the monitor device recognizes clogging" is 1.0 - 0.1 = 0.9.

【０１４３】一方、図１６に示した規則Ｒ５を図１２に
示した文「ＲＣでタイマが入らない」の係り受け構造９
０に適用すると、構文解析結果のノードの属性に否定を
含むので、図１６に示した規則Ｒ５によって文「ＲＣで
タイマが入る」の係り受け構造との類似度は、１．０−
０．９を計算して０．１となる。類似度計算を打切るた
めの所定の値（閾値）が、例えば、０．２に設定されて
いたとすると、前記２つの文は類似していないことにな
り、類似度計算は本時点で打切られる。On the other hand, when rule R5 shown in FIG. 16 is applied to the dependency structure 90 of the sentence "the timer does not turn on with the RC" shown in FIG. 12, the node attributes in the parse result include negation, so rule R5 gives a similarity of 1.0 - 0.9 = 0.1 with the dependency structure of the sentence "the timer turns on with the RC". If the predetermined value (threshold) for terminating the similarity calculation is set to, for example, 0.2, the two sentences are judged dissimilar and the similarity calculation is terminated at this point.

【０１４４】同様に、文中に「〜だろう」のような推量
表現があり、構文解析結果のノードの属性に推量を含む
場合には、図１６に示した規則Ｒ６が適用されてペナル
ティとして−０．１５を減じた類似度が与えられる。Similarly, when a sentence contains a conjectural expression such as "〜だろう" ("probably ...") and the node attributes in the parse result include conjecture, rule R6 shown in FIG. 16 is applied and the similarity is reduced by a penalty of -0.15.

【０１４５】また、同様にして、文「タイマボタンの入
力を受付けない」に対して図１６に示した規則Ｒ３を適
用すると、図１７に示したＨＡＳ−Ａ関係知識１０７を
用いて、文「リモコンの入力を受付けない」との類似度
は、１．０−０．２を計算して０．８と比較的高くな
る。Similarly, when rule R3 shown in FIG. 16 is applied to the sentence "input from the timer button is not accepted", the HAS-A relation knowledge 107 shown in FIG. 17 is used, and the similarity with the sentence "input from the remote control is not accepted" is 1.0 - 0.2 = 0.8, which is relatively high.
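
The ontology-based and modality-based penalties of rules R2, R3, R5, and R6, together with the early termination at a threshold, might be sketched as below. The penalty values and the 0.2 threshold are the ones quoted in the text. Everything else is an assumption made for illustration: the function names, the semantic symbols in the two ontology fragments, the storage of relations as unordered pairs, and the choice to apply a modality penalty when exactly one of the two nodes carries the attribute.

```python
# Penalties quoted in the text for rules R2 to R6 of FIG. 16.
R2_IS_A, R3_HAS_A, R4_DIFFERENT = 0.1, 0.2, 0.3
R5_NEGATION, R6_CONJECTURE = 0.9, 0.15

# Hypothetical ontology fragments, stored as unordered symbol pairs.
IS_A = {frozenset(("<monitor device>", "<main unit device>"))}
HAS_A = {frozenset(("<remote control>", "<timer button>"))}


def symbol_penalty(s1, s2):
    """Penalty for two semantic symbols, consulting IS-A and HAS-A knowledge."""
    if s1 == s2:
        return 0.0
    pair = frozenset((s1, s2))
    if pair in IS_A:
        return R2_IS_A
    if pair in HAS_A:
        return R3_HAS_A
    return R4_DIFFERENT


def modality_penalty(attrs1, attrs2):
    """Penalty for modal attributes present on only one of the two nodes."""
    differing = set(attrs1) ^ set(attrs2)
    penalty = 0.0
    if "negation" in differing:
        penalty += R5_NEGATION
    if "conjecture" in differing:
        penalty += R6_CONJECTURE
    return penalty


def accumulate(penalties, cutoff=0.2):
    """Subtract penalties from 1.0; stop once the score is at or below cutoff."""
    score = 1.0
    for p in penalties:
        score -= p
        if score <= cutoff:
            return score, True   # terminated early: judged dissimilar
    return score, False


# "timer does not turn on" vs. "timer turns on": R5 applies, score falls to 0.1.
score, stopped = accumulate([modality_penalty({"negation"}, set())])
print(round(score, 2), stopped)  # 0.1 True
```

Keeping the penalties in named constants makes it easy to tune them per domain, which is the kind of adjustment the rule table of FIG. 16 appears to allow.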

【０１４６】領域オントロジ４４のＨＡＳ−Ａ関係知識
１０７には、背反な情報を記述することもできる。図１
７に示した「＜タイマ表示＞」に関するＨＡＳ−Ａ関係
知識１１０の例を用いて説明する。Mutually exclusive information can also be described in the HAS-A relation knowledge 107 of the area ontology 44. This is explained using the example of the HAS-A relation knowledge 110 about "<timer display>" shown in FIG. 17.

【０１４７】「＜タイマ表示＞」には、「なし」、「切
タイマ」、「入タイマ」という状態があるが、図１７に
示した「＜タイマ表示＞」に関するＨＡＳ−Ａ関係知識
１１０は、「なし」と「切タイマ」または「入タイマ」
とは同時には表示されないので背反であることを示して
いる。本「＜タイマ表示＞」に関するＨＡＳ−Ａ関係知
識１１０によって、事例文中に文「タイマ表示はなしで
ある」と文「タイマ表示に入タイマがついている。」が
あるとき、クラスタ間の情報として事例データベース５
０の中にそれぞれの文を含むクラスタが背反であるとい
う情報を格納しておくために用いることができる。"<Timer display>" has the states "none", "off timer", and "on timer". The HAS-A relation knowledge 110 about "<timer display>" shown in FIG. 17 indicates that "none" is mutually exclusive with "off timer" and with "on timer", because they are never displayed at the same time. When the case sentences include the sentence "the timer display shows nothing" and the sentence "the on timer is lit in the timer display", this HAS-A relation knowledge 110 can be used to store in the case database 50, as inter-cluster information, the fact that the clusters containing these sentences are mutually exclusive.

【０１４８】図１７に示すように、領域オントロジ４４
の言い換え知識１０９には、同じ意味になる言葉の表現
を記述することができる。図１７に示す例では、知識１
０９は、「＜ブレーカ＞が飛ぶ」と「＜ブレーカ＞が落
ちる」とが同じ意味になるという言い換え知識１０９が
記述されており、類似度の計算時に本言い換え知識１０
９を参照することによって、文「ブレーカが飛ぶ」と文
「ブレーカが落ちる」との類似度は１．０である（意味
的に一致する）として取扱うことができる。As shown in FIG. 17, expressions that have the same meaning can be described in the paraphrase knowledge 109 of the area ontology 44. In the example of FIG. 17, the paraphrase knowledge 109 states that "<breaker> flies" and "<breaker> falls" have the same meaning. By referring to this paraphrase knowledge 109 when the similarity is calculated, the similarity between the sentence "the breaker flies" and the sentence "the breaker falls" can be treated as 1.0 (a semantic match).

【０１４９】図１８は、図２に示したフローチャートに
おける類似文クラスタリング処理（ステップＳ３）の処
理の流れを示すフローチャートである。図１８におい
て、１１１は事例番号６９、フィールド名７０、事例文
７１、文番号７２、文タグ７３、文タグ毎文番号７４、
係り受け構造９０から構成される事例文データ、１１２
は複数の事例文データ１１１から構成される事例文デー
タの集合である。なお、図１８中、図７および図１２と
同一または相当部分には同一符号を付して説明を省略す
る。FIG. 18 is a flowchart showing the flow of the similar sentence clustering process (step S3) in the flowchart shown in FIG. 2. In FIG. 18, reference numeral 111 denotes case sentence data composed of a case number 69, a field name 70, a case sentence 71, a sentence number 72, a sentence tag 73, a per-tag sentence number 74, and a dependency structure 90; 112 denotes a set of case sentence data made up of a plurality of case sentence data 111. In FIG. 18, parts identical or corresponding to those in FIGS. 7 and 12 are denoted by the same reference numerals, and their description is omitted.

【０１５０】次に、図１８を用いて、図２に示したフロ
ーチャートにおける類似文クラスタリング処理（ステッ
プＳ３）の処理の流れを説明する。類似文クラスタリン
グ処理（ステップＳ３）は、事例文データの集合１１２
を入力パラメータとして受取る。Next, the flow of the similar sentence clustering process (step S3) in the flowchart shown in FIG. 2 is described with reference to FIG. 18. The similar sentence clustering process (step S3) receives the set 112 of case sentence data as an input parameter.

【０１５１】まず、分類（クラスタリング）の対象とす
るフィールド名７０、文タグ７３の種類、クラスタの最
大階層の数ｌａｙｅｒを指定する（ステップＳ２６）。
次に、現在の階層を「１」として、指定された条件に合
う事例文の集合、最大階層数（ｌａｙｅｒ）、現在の階
層「１」を入力パラメータとしてクラスタリング処理
（ステップＳ２７）を呼出す。クラスタリング処理（ス
テップＳ２７）の処理の流れについては後述する。次
に、クラスタリング処理（ステップＳ２７）の処理結果
である図１４に示した問題解決木９５を事例データベー
ス５０に格納する。First, the field name 70 to be classified (clustered), the type of sentence tag 73, and the maximum number of cluster layers, layer, are specified (step S26). Next, with the current layer set to "1", the clustering process (step S27) is called with the set of case sentences that satisfy the specified conditions, the maximum number of layers (layer), and the current layer "1" as input parameters. The flow of the clustering process (step S27) is described later. Finally, the problem solving tree 95 shown in FIG. 14, which is the result of the clustering process (step S27), is stored in the case database 50.

【０１５２】図１９は、図１８に示したクラスタリング
処理（ステップＳ２７）の処理の流れを示すフローチャ
ートである。図１９において、１１３は後述する実際の
クラスタリングを行うクラスタリング副処理（ステップ
Ｓ３０）の実行結果である。FIG. 19 is a flowchart showing the flow of the clustering process (step S27) shown in FIG. In FIG. 19, reference numeral 113 denotes an execution result of a clustering sub-process (step S30) for performing actual clustering described later.

【０１５３】クラスタリング処理（ステップＳ２７）
は、まず、事例文の集合、最大階層数ｌａｙｅｒ、現在
の階層ｉをパラメータとして受取ると、現在の階層ｉと
最大階層数ｌａｙｅｒとを比較し（ステップＳ２９）、
現在の階層ｉが最大階層数ｌａｙｅｒよりも大きい場合
は処理を終了する。一方、現在の階層ｉが最大階層数ｌ
ａｙｅｒ以下の場合は、事例文の集合に対してクラスタ
リング副処理（ステップＳ３０）を呼出し、実際のクラ
スタリング処理を行う。クラスタリング副処理（ステッ
プＳ３０）の処理の流れは後述する。クラスタリング副
処理（ステップＳ３０）の実行結果１１３として、クラ
スタ数ｋと事例文の集合であるｋ個のクラスタ｛Ｃ１，
Ｃ２，……，Ｃｋ｝とが得られる。The clustering process (step S27) receives the set of case sentences, the maximum number of layers layer, and the current layer i as parameters. It first compares the current layer i with the maximum number of layers layer (step S29) and ends if i is greater than layer. If i is less than or equal to layer, it calls the clustering sub-process (step S30) on the set of case sentences to perform the actual clustering. The flow of the clustering sub-process (step S30) is described later. The execution result 113 of the clustering sub-process (step S30) consists of the number of clusters k and k clusters {C1, C2, ..., Ck}, each of which is a set of case sentences.

【０１５４】次に、ｊをクラスタリング処理（ステップ
Ｓ２７）の内部に設けたカウンタとし、カウンタｊの値
を初期値１に設定する（ステップＳ３１）。カウンタｊ
の値とクラスタ数ｋの値を比較し（ステップＳ３２）、
カウンタｊの値がクラスタ数ｋよりも大きい場合は処理
を終了する。一方、カウンタｊの値がクラスタ数ｋ以下
の場合は、事例文の集合Ｃｊ、最大階層ｌａｙｅｒ、現
在の階層（ｉ＋１）を入力パラメータとしてクラスタリ
ング処理（ステップＳ３７）を再帰的に呼出す。次に、
カウンタｊの値を１だけ増分してステップＳ３２へ戻
り、ステップＳ３２からの処理を繰返す。Next, j is used as a counter inside the clustering process (step S27) and is set to the initial value 1 (step S31). The value of the counter j is compared with the number of clusters k (step S32), and the process ends if j is greater than k. If j is less than or equal to k, the clustering process (step S27) is called recursively with the set of case sentences Cj, the maximum number of layers layer, and the current layer (i + 1) as input parameters. The counter j is then incremented by 1, control returns to step S32, and the processing from step S32 is repeated.
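
The recursive structure of the clustering process (step S27) can be sketched as follows. The nested-dictionary tree representation, the function names, and in particular `group_by_first_char`, a trivial stand-in for the clustering sub-process (step S30), are assumptions made so that the sketch runs on its own. Only the control flow (layer check, sub-process call, loop over the k clusters with a recursive call at layer i + 1) follows the text.

```python
def group_by_first_char(sentences):
    """Hypothetical stand-in for the clustering sub-process (step S30)."""
    groups = {}
    for s in sentences:
        groups.setdefault(s[0], []).append(s)
    return list(groups.values())


def build_tree(sentences, max_layer, layer=1, cluster_once=group_by_first_char):
    """Recursive clustering process (step S27) returning a nested tree."""
    if layer > max_layer:                   # step S29: stop below the last layer
        return []
    clusters = cluster_once(sentences)      # step S30: yields C1..Ck
    tree = []
    for cluster in clusters:                # steps S31-S32: j = 1..k
        children = build_tree(cluster, max_layer, layer + 1, cluster_once)
        tree.append({"sentences": cluster, "children": children})
    return tree


tree = build_tree(["ab", "ac", "bd"], max_layer=2)
print(len(tree))               # 2
print(tree[0]["sentences"])    # ['ab', 'ac']
```

Passing the sub-process in as a parameter keeps the recursion independent of how similarity is computed, so a similarity-based clusterer could be substituted for the stand-in.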

【０１５５】図２０は、図１９に示したフローチャート
におけるクラスタリング副処理（ステップＳ３０）の処
理の流れを示すフローチャートである。次に、図２０を
用いて、類似文照合部４８による類似度計算を用いてク
ラスタリングを行うクラスタリング副処理（ステップＳ
３０）の処理の流れを説明する。FIG. 20 is a flowchart showing the flow of the clustering sub-process (step S30) in the flowchart shown in FIG. 19. The flow of the clustering sub-process (step S30), which performs clustering using the similarity calculated by the similar sentence matching unit 48, is described next with reference to FIG. 20.

【０１５６】まず、２つの文が類似していると判定する
閾値ｔｈを設定する（ステップＳ３４）。ここでは、閾
値ｔｈの値を０．７５として説明する。閾値ｔｈはクラ
スタの作成状況に応じて変更することができるものとす
る。次に、一つのクラスタＣｉ（ｉ＝１，２，……，
ｎ；ｎは処理対象の事例文の数）がそれぞれ１つの事例
文Ｓｉからなる初期クラスタを構成する（ステップＳ３
５）。続いて、類似文照合部４８を呼出してＳｉｍ（Ｓ
ｉ，Ｓｊ，Ｄ）を求め、クラスタ間の類似度Ｓｉｍ（Ｃ
ｉ，Ｃｊ）をＳｉｍ（Ｓｉ，Ｓｊ，Ｄ）として各クラス
タ間の初期の類似度表を作成する。ここで、ｉ＝１，
２，……，ｎ、ｊ＝ｉ＋１，ｉ＋２，……，ｎである。
なお、Ｄの値は必要に応じて設定する。First, a threshold th for judging two sentences to be similar is set (step S34). Here th is assumed to be 0.75. The threshold th can be changed according to how the clusters are being formed. Next, initial clusters are formed so that each cluster Ci (i = 1, 2, ..., n, where n is the number of case sentences to be processed) consists of a single case sentence Si (step S35). The similar sentence matching unit 48 is then called to obtain Sim(Si, Sj, D), and the initial similarity table between clusters is created by setting the inter-cluster similarity Sim(Ci, Cj) to Sim(Si, Sj, D). Here i = 1, 2, ..., n and j = i + 1, i + 2, ..., n. The value of D is set as needed.

【０１５７】図２１は、事例文がＳ１，Ｓ２，……，Ｓ
５の５個の文からなり、Ｃ１＝｛Ｓ１｝，Ｃ２＝｛Ｓ
２｝，……，Ｃ５＝｛Ｓ５｝の５個のクラスタからクラ
スタリング処理を始める場合のクラスタ間の類似度を格
納した類似度表の例を示す図である。図２１において、
１１４はクラスタ間の初期の類似度表、１１５は第１回
目のクラスタリング処理を行った場合の類似度表、１１
６は第２回目のクラスタリング処理を行った場合の類似
度表である。なお、これらの類似度表１１４、１１５、
１１６はそれぞれ対象行列になるので、下半分のみに値
を格納している。FIG. 21 is a diagram showing an example of the similarity tables that store the similarities between clusters when the case sentences consist of five sentences S1, S2, ..., S5 and the clustering process starts from five clusters C1 = {S1}, C2 = {S2}, ..., C5 = {S5}. In FIG. 21, reference numeral 114 denotes the initial similarity table between clusters, 115 denotes the similarity table after the first round of clustering, and 116 denotes the similarity table after the second round of clustering. Because the similarity tables 114, 115, and 116 are each symmetric matrices, values are stored only in the lower half.

【０１５８】次に、前記初期の類似度表１１４から最大
の類似度を持つクラスタの対ＣｕとＣｖを求める（ステ
ップＳ３７）。図２１に示した類似度表１１４において
は、Ｃｕ＝Ｃ４、Ｃｖ＝Ｃ５、Ｓｉｍ（Ｃｕ，Ｃｖ）＝
Ｓｉｍ（Ｃ４，Ｃ５）＝０．９９であり、クラスタ間の
類似度が最大となっている。Next, the pair of clusters Cu and Cv with the highest similarity is found in the initial similarity table 114 (step S37). In the similarity table 114 shown in FIG. 21, Cu = C4 and Cv = C5, and Sim(Cu, Cv) = Sim(C4, C5) = 0.99 is the highest similarity between clusters.

【０１５９】次に、最大の類似度Ｓｉｍ（Ｃｕ，Ｃｖ）
と閾値ｔｈを比較し（ステップＳ３８）、最大の類似度
Ｓｉｍ（Ｃｕ，Ｃｖ）が閾値ｔｈよりも大きい場合はク
ラスタリング処理を続行し、最大の類似度Ｓｉｍ（Ｃ
ｕ，Ｃｖ）が閾値ｔｈ以下の場合はクラスタリング処理
を終了する。図２１に示した初期の類似度表１１４の例
では、最大の類似度Ｓｉｍ（Ｃ４，Ｃ５）が０．９９、
閾値ｔｈが０．７５であるから、クラスタリング処理を
続行する。Next, the highest similarity Sim(Cu, Cv) is compared with the threshold th (step S38). If Sim(Cu, Cv) is greater than th, the clustering process continues; if Sim(Cu, Cv) is less than or equal to th, the clustering process ends. In the example of the initial similarity table 114 shown in FIG. 21, the highest similarity Sim(C4, C5) is 0.99 and the threshold th is 0.75, so the clustering process continues.

【０１６０】次に、ステップＳ２４０６では、クラスタ
の対ＣｕとＣｖを１つのクラスタＣｕ，ｖ＝｛Ｓｕ，Ｓ
ｖ｝に纏める。図２０に示した初期の類似度表の例に対
して第１回目のクラスタリング処理を行った場合、クラ
スタの対Ｃ４とＣ５が１つのクラスタＣ４，５＝｛Ｓ
４，Ｓ５｝に纏められる。Next, in step S2406, the pair of clusters Cu and Cv is merged into a single cluster Cu,v = {Su, Sv}. When the first round of clustering is applied to the example of the initial similarity table shown in FIG. 20, the pair of clusters C4 and C5 is merged into a single cluster C4,5 = {S4, S5}.

【０１６１】次に、クラスタの対を纏めることによって
新しく生成されたクラスタＣｕ，ｖとその他のクラスタ
Ｃｉとの類似度Ｓｉｍ（Ｃｕ,ｖ，Ｃｉ）を（Ｓｉｍ
（Ｃｉ，Ｃｕ）＋Ｓｉｍ（Ｃｉ，Ｃｖ））／２なる式で
計算し、類似度表を再編成する（ステップＳ４０）。図
２１に示した例では、初期の類似度表１１４に対して、
第１回目のクラスタリング処理を行った場合の類似度表
１１５が得られる。Next, the similarity Sim(Cu,v, Ci) between the cluster Cu,v newly created by merging the pair and each other cluster Ci is calculated as (Sim(Ci, Cu) + Sim(Ci, Cv)) / 2, and the similarity table is rebuilt (step S40). In the example shown in FIG. 21, the similarity table 115 after the first round of clustering is obtained from the initial similarity table 114.

【０１６２】次に、ステップＳ３７に戻り、ステップＳ
３８においてクラスタ間の類似度Ｓｉｍ（Ｃｕ，Ｃｖ）
が閾値ｔｈ以下となるまでステップＳ３７からの処理を
繰返す。Next, control returns to step S37, and the processing from step S37 is repeated until the inter-cluster similarity Sim(Cu, Cv) is found to be less than or equal to the threshold th in step S38.

【０１６３】図２０に示したステップＳ４０における式
（前記式）では、２つのクラスタＣｕ、Ｃｖの類似度の
単純平均値を計算しているが、クラスタ内の事例文の数
を考慮した重み付き平均値などを用いても良い。The formula used in step S40 shown in FIG. 20 (the formula above) takes the simple average of the similarities to the two clusters Cu and Cv, but a weighted average that takes into account the number of case sentences in each cluster, for example, may be used instead.
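
The loop of steps S35 to S40 amounts to agglomerative clustering with a simple-average update, which might be sketched as below. The function name, the tuple-based cluster representation, and the pairwise similarity values are assumptions for illustration. The values of FIG. 21 are not reproduced in the text, so the numbers here were chosen only so that the run starts by merging S4 and S5 at 0.99 and ends with the same three clusters the text reports.

```python
from itertools import combinations


def cluster_sentences(pair_sim, n, th=0.75):
    """Merge the most similar pair of clusters until no similarity exceeds th.

    pair_sim maps (i, j) with i < j to the similarity of sentences Si and Sj.
    """
    clusters = [(i,) for i in range(1, n + 1)]             # step S35
    table = {frozenset((a, b)): pair_sim[(a[0], b[0])]
             for a, b in combinations(clusters, 2)}        # initial table
    while len(clusters) > 1:
        best = max(table, key=table.get)                   # step S37
        if table[best] <= th:                              # step S38
            break
        u, v = sorted(best)
        merged = u + v                                     # merge Cu and Cv
        others = [c for c in clusters if c not in (u, v)]
        new_table = {k: s for k, s in table.items()
                     if u not in k and v not in k}
        for c in others:                                   # step S40
            new_table[frozenset((merged, c))] = (
                table[frozenset((c, u))] + table[frozenset((c, v))]) / 2
        clusters = sorted(others + [merged])
        table = new_table
    return clusters


# Hypothetical pairwise similarities for five sentences S1..S5.
pair_sim = {
    (1, 2): 0.90, (1, 3): 0.40, (1, 4): 0.70, (1, 5): 0.72,
    (2, 3): 0.50, (2, 4): 0.74, (2, 5): 0.76,
    (3, 4): 0.30, (3, 5): 0.20,
    (4, 5): 0.99,
}
result = cluster_sentences(pair_sim, 5)
print(result)  # [(1, 2), (3,), (4, 5)]
```

With these values the third round finds a highest similarity of 0.73 between the merged clusters, which is not above the threshold of 0.75, so the loop stops. A weighted average could replace the simple average in the step S40 update by weighting each term with the size of `u` and `v`.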

【０１６４】図２１に示した例では、２回目の繰返しの
後、クラスタＣ１とクラスタＣ２とを纏めて作成された
第２回目のクラスタリング処理を行った場合の類似度表
１１６において、最大の類似度Ｓｉｍ（Ｃ１,２，Ｃ４,
５）が０．７３となり、閾値ｔｈの０．７５以下となる
ので、ステップＳ３８においてクラスタリング副処理
（ステップＳ３０）を終了する。このとき、図２１に示
した例では、Ｃ１，２＝｛Ｓ１，Ｓ２｝、Ｃ３＝｛Ｓ
３｝、Ｃ４，５＝｛Ｓ４，Ｓ５｝の３つのクラスタが生
成されている。In the example shown in FIG. 21, after the second iteration, the highest similarity in the similarity table 116, which was created for the second round of clustering by merging cluster C1 and cluster C2, is Sim(C1,2, C4,5) = 0.73. This is less than or equal to the threshold th of 0.75, so the clustering sub-process (step S30) ends at step S38. At this point, in the example shown in FIG. 21, three clusters have been generated: C1,2 = {S1, S2}, C3 = {S3}, and C4,5 = {S4, S5}.

【０１６５】以上で、図２に示したフローチャートにお
いて事例文の分類処理を行う類似文クラスタリング処理
（ステップＳ３）が終了する。次に、図２に示したフロ
ーチャートにおける事例クラスタ編集処理（ステップＳ
４）を実行し、事例データベース５０に格納された事例
データのクラスタの階層を図１４に示すような構造の問
題解決木９５として編集して表示する。This completes the similar sentence clustering process (step S3), which classifies the case sentences in the flowchart shown in FIG. 2. Next, the case cluster editing process (step S4) in the flowchart shown in FIG. 2 is executed, and the hierarchy of clusters of case data stored in the case database 50 is edited and displayed as a problem solving tree 95 structured as shown in FIG. 14.

【０１６６】図１４に示した問題解決木９５の例は、図
２に示したフローチャートにおける処理対象文抽出処理
（ステップＳ１）で抽出された図９に示す事例文７１の
内、図６に示したフィールド６４のフィールド名が「＜
質問＞」であるフィールドから文タグ７３として「症
状」を持つ１、２、３番目に出現した事例文７１、即
ち、文タグ毎分番号が１、２、３の事例文７１に対し
て、クラスタの階層数３を図２に示したフローチャート
における類似文クラスタリング処理（ステップＳ３）で
指定して類似文のクラスタリングを行って作成したもの
である。The example problem solving tree 95 shown in FIG. 14 was created as follows. Among the case sentences 71 shown in FIG. 9 that were extracted by the processing target sentence extraction process (step S1) in the flowchart shown in FIG. 2, the case sentences 71 that appear first, second, and third with the sentence tag 73 "symptom" in the field whose field name (field 64 shown in FIG. 6) is "<question>", that is, the case sentences 71 whose per-tag sentence number is 1, 2, or 3, were clustered with the number of cluster layers set to 3 in the similar sentence clustering process (step S3) in the flowchart shown in FIG. 2.

【０１６７】図１４に示した問題解決木９５は、図６に
示したフィールド名６４の欄のフィールド名に対応する
オプション６６の欄における「カテゴリ」属性６７の値
毎に作成する。また、図１に示した事例クラスタ編集部
５１における各クラスタの表示は、当該クラスタが含む
事例文の数が多い順に上から下に表示する。各クラスタ
は類似した事例文の集合からなり、マウスなどの操作に
よって各クラスタの表示ラベル１０２を指定して当該ク
ラスタ内の情報が参照できるものとする。また、事例ク
ラスタ編集部５１では、ユーザの操作によりクラスタの
表示ラベル１０２の文字列を設定および編集できるもの
とする。The problem solving tree 95 shown in FIG. 14 is created for each value of the "category" attribute 67 in the option 66 column corresponding to the field name in the field name 64 column shown in FIG. 6. The case cluster editing unit 51 shown in FIG. 1 displays the clusters from top to bottom in descending order of the number of case sentences they contain. Each cluster consists of a set of similar case sentences, and the information in a cluster can be viewed by selecting its display label 102 with a mouse or similar device. The case cluster editing unit 51 also allows the user to set and edit the character string of a cluster's display label 102.

【０１６８】図２２は、事例データベース５０に格納さ
れた事例文データ１１１の例を示す図である。なお、図
２２中、図１８と同一または相当部分には同一符号を付
して説明を省略する。FIG. 22 is a diagram showing an example of the case sentence data 111 stored in the case database 50. In FIG. 22, parts identical or corresponding to those in FIG. 18 are denoted by the same reference numerals, and their description is omitted.

【０１６９】図２に示したフローチャートにおける処理
対象文抽出処理（ステップＳ１）の出力、細かくは、図
７に示したフローチャートにおける文タグ付与処理（ス
テップＳ２０）の出力である事例番号６９、フィールド
名７０、事例文７１、文番号７２、文タグ７３、文タグ
毎文番号７４の上に、図２に示したフローチャートにお
ける文解析処理（ステップＳ２）、細かくは、図１２に
示したフローチャートにおける係り受け解析処理（ステ
ップＳ２５）を実行することによって得られた係り受け
構造９０が付け加えられた後、図２に示したフローチャ
ートにおける類似文クラスタリング処理（ステップＳ
３）において事例文データが属するクラスタが決定さ
れ、該クラスタのクラスタ番号が付与されて事例データ
ベース５０に格納される。このとき、事例データベース
５０には図１４に示した問題解決木９５のようなクラス
タ間の階層を表す情報も同時に格納される。The output of the processing target sentence extraction process (step S1) in the flowchart shown in FIG. 2, more precisely the output of the sentence tagging process (step S20) in the flowchart shown in FIG. 7, consists of the case number 69, field name 70, case sentence 71, sentence number 72, sentence tag 73, and per-tag sentence number 74. To this output is added the dependency structure 90 obtained by the sentence analysis process (step S2) in the flowchart shown in FIG. 2, more precisely by the dependency analysis process (step S25) in the flowchart shown in FIG. 12. The similar sentence clustering process (step S3) in the flowchart shown in FIG. 2 then determines the cluster to which the case sentence data belongs, and the data is stored in the case database 50 with that cluster's number attached. At the same time, information representing the hierarchy among clusters, such as the problem solving tree 95 shown in FIG. 14, is also stored in the case database 50.

【０１７０】また、図２３は、図２に示したフローチャ
ートにおける類似文クラスタリング処理（ステップＳ
３）において問題解決木９５（図１４参照）を事例デー
タベース５０に格納する処理において、事例データベー
ス５０に格納した階層関係以外のクラスタ間情報の例を
示す図である。FIG. 23 is a diagram showing an example of inter-cluster information, other than the hierarchical relations, that is stored in the case database 50 when the problem solving tree 95 (see FIG. 14) is stored in the case database 50 during the similar sentence clustering process (step S3) in the flowchart shown in FIG. 2.

【０１７１】クラスタ間情報は、図２３に示すように、
クラスタ番号で表される２つのクラスタ間の関係を定義
する。関係のタイプの例としては、−１（背反）と１
（類似）とがある。背反関係は、図１７における「＜タ
イマ表示＞」に関するＨＡＳ−Ａ関係知識１１０に示し
た背反関係を参照して作成する。As shown in FIG. 23, inter-cluster information defines a relation between two clusters identified by their cluster numbers. Examples of relation types are -1 (mutually exclusive) and 1 (similar). Mutually exclusive relations are created with reference to the mutually exclusive relation shown in the HAS-A relation knowledge 110 about "<timer display>" in FIG. 17.

【０１７２】類似関係は、各クラスタに当該クラスタを
代表する代表文を設定できるようにしておき、その代表
文の間の類似度をクラスタ間情報として事例データベー
ス５０に格納しておく。For similarity relations, a representative sentence can be set for each cluster, and the similarity between the representative sentences is stored in the case database 50 as inter-cluster information.

【０１７３】以上で、本発明に係る実施の形態１による
事例蓄積・検索装置における事例蓄積機能の動作の詳細
な説明を終了する。This completes the detailed description of the operation of the case storage function of the case storage and retrieval apparatus according to Embodiment 1 of the present invention.

【０１７４】以下、適宜図を参照しながら、本発明に係
る実施の形態１による事例蓄積・検索装置における事例
検索機能の動作の詳細を、図２４に示したフローチャー
トに従って具体例を用いて説明する。The operation of the case retrieval function of the case storage and retrieval apparatus according to Embodiment 1 of the present invention is described in detail below, using a concrete example and following the flowchart shown in FIG. 24, with reference to the drawings as appropriate.

【０１７５】まず、検索文入力処理（ステップＳ４１）
において、ユーザが所望の文書を検索するための検索文
（新たな問題の記述）を入力する。このとき、検索文の
入力は、キーボード、文字認識装置、音声認識装置（以
上図示せず）などの入力装置を用いて行う。First, in the search sentence input process (step S41), the user enters a search sentence (a description of a new problem) for retrieving the desired document. The search sentence is entered with an input device such as a keyboard, a character recognition device, or a speech recognition device (none of which are shown).

【０１７６】次に、文解析処理（ステップＳ２）におい
て、検索文入力処理（ステップＳ４１）によって入力さ
れた検索文の解析を行う。Next, in the sentence analysis process (step S2), the search sentence input by the search sentence input process (step S41) is analyzed.

【０１７７】文解析処理（ステップＳ２）は、前述した
事例蓄積処理における説明と同様に、図１０に示した形
態素解析処理（ステップＳ２２）と構文解析処理（ステ
ップＳ２３）の順で前記検索文を解析し、前記検索文に
対する係り受け構造を生成する。前記各処理は前述した
事例構築処理の場合と同様であるので説明を省略する。As in the case storage processing described above, the sentence analysis process (step S2) analyzes the search sentence through the morphological analysis process (step S22) and then the syntax analysis process (step S23) shown in FIG. 10, and generates a dependency structure for the search sentence. These processes are the same as in the case construction processing described above, so their description is omitted.

【０１７８】次に、類似事例検索処理（ステップＳ４
２）の処理の流れを説明する。まず、事例データベース
５０の１次検索を行う。事例データベース５０の容量が
小さい場合は入力された検索文と全ての事例文との間で
類似文照合部４８の処理を行うという方法もあるが、一
般的には事例データベース５０には大容量の事例文が格
納されているため、入力された事例文と全ての事例文と
の照合処理を行えば、処理時間に問題が生じる。そこ
で、事例文の係り受け構造に対して、当該構造の構文要
素または当該構文要素が持つ意味シンボルによる索引を
設けておき、当該索引を用いて１次検索処理を行った
後、類似文照合部４８の処理を行い、処理対象の事例文
を予め絞り込んでおく構成とする。Next, the flow of the similar case search process (step S42) is described. First, a primary search of the case database 50 is performed. If the case database 50 is small, the similar sentence matching unit 48 could be run between the input search sentence and every case sentence. In general, however, the case database 50 holds a large volume of case sentences, so matching the input sentence against all of them would cause a processing time problem. The dependency structures of the case sentences are therefore indexed by their syntactic elements or by the semantic symbols of those elements, and the similar sentence matching unit 48 is run only after a primary search using this index, so that the case sentences to be processed are narrowed down in advance.
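
The index-based primary search can be pictured as an inverted index from semantic symbols to case sentence identifiers. The sketch below is an assumption-laden illustration: the function names, the sample data, and the rule that a case sentence becomes a candidate when it shares at least one symbol with the search sentence are all invented here, since the text does not specify the index layout or the candidate criterion.

```python
from collections import defaultdict


def build_index(case_symbols):
    """Map each semantic symbol to the ids of the case sentences containing it.

    case_symbols: dict of sentence id -> set of semantic symbols found in
    that sentence's dependency structure.
    """
    index = defaultdict(set)
    for sentence_id, symbols in case_symbols.items():
        for symbol in symbols:
            index[symbol].add(sentence_id)
    return index


def primary_search(index, query_symbols):
    """Return ids of case sentences sharing at least one symbol with the query."""
    candidates = set()
    for symbol in query_symbols:
        candidates |= index.get(symbol, set())
    return candidates


# Hypothetical case sentences, reduced to their semantic symbols.
cases = {
    1: {"<monitor device>", "<detection operation>"},
    2: {"<timer display>"},
    3: {"<detection operation>"},
}
index = build_index(cases)
print(sorted(primary_search(index, {"<detection operation>"})))  # [1, 3]
```

Only the candidates returned by `primary_search` would then be passed to the more expensive similarity calculation.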

【０１７９】次に、図２４に示したフローチャートにお
ける類似事例検索処理（ステップＳ４２）では、類似文
照合部４８を起動して前記１次検索結果の各事例文の係
り受け解析結果と入力された検索文の係り受け解析結果
との類似度を求め、求めた類似度に基づいて類似文の照
合を行う。類似文照合処理では、図１７に示した領域オ
ントロジ４４を参照しながら類似文の照合を行う。Next, in the similar case search process (step S42) in the flowchart shown in FIG. 24, the similar sentence matching unit 48 is started to obtain the similarity between the dependency analysis result of each case sentence in the primary search result and the dependency analysis result of the input search sentence, and similar sentences are matched on the basis of the similarity obtained. The similar sentence matching is performed with reference to the area ontology 44 shown in FIG. 17.

【０１８０】次に、図２４に示したフローチャートにお
ける類似事例検索処理（ステップＳ４２）の処理の流れ
を図２５に示したフローチャートに従って説明する。Next, the flow of the similar case search process (step S42) in the flowchart shown in FIG. 24 will be described with reference to the flowchart shown in FIG.

【０１８１】まず、ループ処理のための初期化設定を行
う（ステップ４４）。次に、ステップＳ４５からステッ
プＳ４７までのループ処理を前記１次検索結果の各事例
文に対して実行する。First, initialization for the loop is performed (step S44). Next, the loop from step S45 to step S47 is executed for each case sentence in the primary search result.

【０１８２】まず、ステップＳ４６において、１次検索
結果のｉ番目の事例文Ｓｉ（図１５に示した事例文Ｓｉ
「センサが結露を検出する」に対する係り受け構造１０
５）と入力検索文Ｓｏ（図１５に示した事例文Ｓｏ「セ
ンサが目詰まりを認識する」に対する係り受け構造１０
４）との間で前述の類似度計算Ｓｉｍ（Ｓｉ，Ｓｏ，
Ｄ）を行う。Ｄの値は、前述したように必要に応じて設
定しておく。First, in step S46, the similarity calculation Sim(Si, So, D) described above is performed between the i-th case sentence Si of the primary search result (the dependency structure 105 for the case sentence Si "sensor detects dew condensation" shown in FIG. 15) and the input search sentence So (the dependency structure 104 for the sentence So "sensor recognizes clogging" shown in FIG. 15). The value of D is set in advance as needed, as described above.

【０１８３】最後に、検索結果を類似度順にソートし
（ステップＳ４８）、ソートした結果を図１に示した検
索結果表示部５３に出力する（ステップＳ４３）。Finally, the search results are sorted in order of similarity (step S48), and the sorted results are output to the search result display section 53 shown in FIG. 1 (step S43).

【０１８４】本発明に係る事例蓄積・検索方法によれ
ば、検索文と事例文とが事例文中に含まれるキーワード
のレベルでは完全に一致していても、検索文と事例文と
の文の意味内容が異なれば類似度が小さくなる。また、
検索文と事例文とに含まれるキーワードが異なっていて
も、領域オントロジ４４を利用して類似文照合を行うこ
とによって意味的に同じ内容の検索文と事例文との文の
類似度が高くなる。According to the case storage and retrieval method of the present invention, even if a search sentence and a case sentence match completely at the level of the keywords they contain, the similarity is low if the meanings of the two sentences differ. Conversely, even if the keywords in the search sentence and the case sentence differ, performing similar sentence matching with the area ontology 44 gives a high similarity to a search sentence and a case sentence that mean the same thing.

[0185] Conventional keyword-based case accumulation/search does not perform such fine-grained processing, so unwanted case sentences, such as negative sentences, are output near the top of the results as search noise. With the case accumulation/search method of the present invention, unwanted case sentences can be kept from being output as similar sentences, for example by configuring the system not to display case sentences whose similarity is 0.5 or less.
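The display cutoff mentioned above can be sketched as a simple filter. The function name `visible_results` and the `(sentence, similarity)` pair layout are assumptions made for illustration; only the example cutoff of 0.5 comes from the text.

```python
from typing import List, Tuple


def visible_results(scored: List[Tuple[str, float]],
                    threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Keep only entries whose similarity is strictly above the threshold,
    so that entries at 0.5 or less are not displayed."""
    return [(sentence, score) for sentence, score in scored if score > threshold]
```

The comparison is strict because the text excludes case sentences with a similarity of "0.5 or less".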

[0186] As described above, at case search time, the input search sentence is divided into words in the morphological analysis process (step S22) shown in FIG. 10, which is part of the sentence analysis process (step S2) shown in FIG. 24; a dependency structure for the input search sentence is generated in the syntax analysis process (step S23) shown in FIG. 10; a search by dependency structure is performed in the similar case search process (step S42) shown in FIG. 23; the similar sentence matching unit 48, the same unit used at case accumulation time, calculates the similarity between the input search sentence and the search results with reference to the area ontology 44; and the search results are output to the search result display unit 53 shown in FIG. 1, according to the calculated similarity, in the search result display process (step S43) shown in FIG. 23. This realizes case search with little search noise across diverse natural language expressions.

[0187] As the display method in the search result display process (step S43) shown in FIG. 24, a problem solving tree 95 such as that shown in FIG. 14 may be displayed on the search result display unit 53, with the cluster containing a case sentence similar to the input search sentence highlighted. FIG. 14 shows an example in which, for the input search sentence "I hear an unpleasant sound," the cluster 96 labeled "bothersome sound" is highlighted as the cluster containing the most similar case sentence, "I hear a strange sound."

[0188] In the search result display unit 53 shown in FIG. 1, as in the case cluster editing unit 51, the problem solving tree 95 is displayed from top to bottom in descending order of the number of case sentences each cluster contains, so that the user can easily refer to the cases that have been accumulated most often. The user can designate a cluster, refer to the cases corresponding to the case sentences contained in the designated cluster, and display a case in the format shown in FIG. 9. One cluster contains a plurality of case sentences, each corresponding to a different case. When a list of cases is displayed for reference, the "title" attribute 68 described in the option 66 column corresponding to each field name in the field name 64 column shown in FIG. 6 may be listed as the content of each case.

[0189] In addition to the case sentence data 111 shown in FIG. 22, the case database 50 also stores the inter-cluster information shown in FIG. 23. Next, how the inter-cluster information is used will be described.

[0190] The inter-cluster information defines contradictory relations and similarity relations between two clusters identified by cluster number. Suppose, for example, that for the input sentence "no timer display" the cluster 97 "main unit lamp not lit" is retrieved in the problem solving tree 95 shown in FIG. 14. If the user judges that the state of the cluster 97 is correct and selects its confirmation check 101 column, the inter-cluster information makes it possible to highlight the cluster 98 "main unit area display not lit," which is a similar cluster, and to de-emphasize the cluster 99 "on-timer operating," which is a contradictory cluster. As in Embodiment 1, storing the relations between data in the case database 50 in advance as part of the case data at case accumulation time makes searches that take those relations into account more efficient.
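The use of inter-cluster information described above can be sketched as a lookup, with no similarity calculation at display time. The table layout (a mapping from a pair of cluster numbers to a relation label), the label strings, and the function name `display_states` are assumptions for illustration; the cluster numbers 97, 98, and 99 follow the example in the text.

```python
from typing import Dict, Tuple

SIMILAR = "similar"
CONTRADICTORY = "contradictory"


def display_states(confirmed: int,
                   relations: Dict[Tuple[int, int], str]) -> Dict[int, str]:
    """Return how clusters related to the confirmed cluster should be shown."""
    states: Dict[int, str] = {}
    for (a, b), relation in relations.items():
        if confirmed not in (a, b):
            continue                      # relation does not involve the confirmed cluster
        other = b if a == confirmed else a
        if relation == SIMILAR:
            states[other] = "highlight"   # similar cluster: emphasize
        elif relation == CONTRADICTORY:
            states[other] = "dim"         # contradictory cluster: de-emphasize
    return states


relations = {(97, 98): SIMILAR, (97, 99): CONTRADICTORY, (96, 98): SIMILAR}
states = display_states(97, relations)
```

Because the relations are stored when cases are accumulated, confirming a cluster only requires scanning this table.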

[0191] The case accumulation/search processing described above may also be implemented as a program; the resulting case accumulation/search program may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be loaded into and executed by a computer such as a personal computer or a microprocessor.

[0192] The case accumulation/search program may also be stored in a computer or a disk device connected to a network, and the program may be loaded via the network.

[0193] The case accumulation/search program may also reside on a computer connected via a network, and the program may be executed by accessing that computer.

[0194]

[Effects of the Invention] According to the first case accumulation/search device of the present invention, even when the target of case accumulation and search is a case consisting of a plurality of case sentences, case data obtained by extracting and classifying the individual case sentences from an electronic document written in natural language is accumulated in a case database, and an input search sentence is matched against the case sentences classified and stored in the case database. This has the effect of making it easier to solve problems by similarity search even for complex cases.

[0195] According to the first case accumulation method of the present invention, even when the target of case accumulation and search is a case consisting of a plurality of case sentences, case data obtained by extracting and classifying the individual case sentences from an electronic document written in natural language is accumulated in a case database, and an input search sentence is matched against the case sentences classified and stored in the case database. This has the effect of making it easier to solve problems by similarity search even for complex cases.

[0196] According to the second case accumulation method of the present invention, a case sentence type is additionally assigned to each extracted case sentence in the processing target sentence extraction step, so the case sentences to be subjected to case accumulation processing can be identified. This has the effect that the classification processing can reflect the type of each sentence and the order in which the sentences are written.

[0197] According to the third case accumulation method of the present invention, the similar sentence clustering step additionally specifies the number of hierarchy levels of the clusters formed by classifying the case sentences and/or the threshold of inter-cluster similarity used to judge that case sentences are similar to each other. By setting the number of levels or the inter-cluster similarity threshold appropriately according to, for example, the "sentence tags" of the target case sentences and their order, clustering can be performed finely and accurately, and the clustering process can be made more efficient.

[0198] According to the fourth case accumulation method of the present invention, the area ontology additionally describes IS-A relation knowledge indicating semantic superordinate-subordinate relations, HAS-A relation knowledge indicating semantic part-whole relations, case relation knowledge indicating relations between concepts, paraphrase knowledge that absorbs large differences in expression, and knowledge of contradictory relations that cannot hold at the same time. Knowledge that depends on the area targeted for case accumulation can therefore be described flexibly and in fine detail, which improves the accuracy of sentence analysis and similar sentence matching and, in turn, the accuracy of similar sentence clustering and similar case search at case search time.
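The five kinds of knowledge listed above could be held in a structure along the following lines. This is only a sketch under stated assumptions: the class name `AreaOntology`, its field layout, and its helper methods are hypothetical, since this section of the description does not specify how the area ontology 44 is stored.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple


@dataclass
class AreaOntology:
    is_a: Dict[str, str] = field(default_factory=dict)        # term -> broader term
    has_a: Dict[str, Set[str]] = field(default_factory=dict)  # whole -> its parts
    # (predicate, case role, concept) triples for case relation knowledge
    case_relations: Set[Tuple[str, str, str]] = field(default_factory=set)
    paraphrases: Dict[str, str] = field(default_factory=dict)  # variant -> canonical form
    contradictions: Set[FrozenSet[str]] = field(default_factory=set)

    def is_kind_of(self, term: str, ancestor: str) -> bool:
        """Walk the IS-A chain upward; `seen` guards against cycles."""
        seen: Set[str] = set()
        while term not in seen:
            if term == ancestor:
                return True
            seen.add(term)
            if term not in self.is_a:
                return False
            term = self.is_a[term]
        return False

    def canonical(self, expression: str) -> str:
        """Map a paraphrase to its canonical expression, if one is registered."""
        return self.paraphrases.get(expression, expression)

    def contradicts(self, a: str, b: str) -> bool:
        """True if the two states are registered as unable to hold together."""
        return frozenset((a, b)) in self.contradictions
```

A matcher could call `canonical` before comparing words and use `is_kind_of` to treat a term and its broader term as close; how the described system actually weighs these relations is not stated in this section.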

[0199] According to the fifth case accumulation method of the present invention, the similar sentence matching step additionally obtains the similarity by matching semantic structures according to the attributes of the syntactic elements in the (dependency) structure of each case sentence, which is obtained by parsing the case sentence using the area ontology. Judgments therefore take into account detailed (fine-grained) similarities, such as those set at the nodes of the syntax tree produced by parsing, and accommodate diverse natural language expressions, including the modality of the sentences making up a case (negation, conjecture, and so on). This has the effect of enabling search for cases containing sentences that are syntactically and/or semantically similar.

[0200] According to the sixth case accumulation method of the present invention, the similar sentence matching step additionally specifies the level of detail of the matching used when obtaining the similarity between case sentences. This has the effect that the user can take the initiative in setting the similarity criterion, and that the criterion can be set finely according to, for example, the "sentence tags" of the target case sentences and their order.

[0201] According to the seventh case accumulation method of the present invention, the sentence analysis step additionally creates the structure of each case sentence as a tree structure, and the similar sentence matching step specifies the level of detail of the matching used when obtaining the similarity between case sentences by the depth of the tree structure to be processed. This has the effect of making the similar sentence matching process more efficient.
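Limiting the comparison by tree depth can be illustrated as follows. The sketch uses exact label matching on a hypothetical `Node` type, not the similarity calculation of the description; it only shows how a depth parameter bounds the part of each tree that is visited.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def equal_to_depth(a: Node, b: Node, depth: int) -> bool:
    """Compare two trees down to `depth` levels (the root is level 1)."""
    if depth <= 0:
        return True                   # nothing left to compare
    if a.label != b.label:
        return False
    if depth == 1:
        return True                   # children lie below the requested depth
    if len(a.children) != len(b.children):
        return False
    return all(equal_to_depth(x, y, depth - 1)
               for x, y in zip(a.children, b.children))


si = Node("detects", [Node("sensor"), Node("condensation")])
so = Node("detects", [Node("sensor"), Node("clogging")])
```

With these two trees, a depth of 1 compares only the root predicates, while a depth of 2 also compares their dependents; a smaller depth therefore costs less but distinguishes fewer sentences.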

[0202] According to the eighth case accumulation method of the present invention, the case database additionally describes similarity relations or contradictory relations as relations between clusters. When a cluster is designated, clusters similar to it can therefore be found without any special calculation such as a similarity calculation, which has the effect of making case search processing, such as similar case search processing and search result display processing, more efficient.

[0203] According to the first case search method of the present invention, even when the target of case accumulation and search is a case consisting of a plurality of case sentences, case data obtained by extracting and classifying the individual case sentences from an electronic document written in natural language is accumulated in a case database, and an input search sentence is matched against the case sentences classified and stored in the case database. This has the effect of making it easier to solve problems by similarity search even for complex cases.

[0204] According to the computer-readable recording medium on which the first case accumulation program of the present invention is recorded, executing the case accumulation program recorded on the recording medium means that, even when the target of case accumulation and search is a case consisting of a plurality of case sentences, case data obtained by extracting and classifying the individual case sentences from an electronic document written in natural language is accumulated in a case database, and an input search sentence is matched against the case sentences classified and stored in the case database. This has the effect of making it easier to solve problems by similarity search even for complex cases.

[0205] According to the computer-readable recording medium on which the first case search program of the present invention is recorded, executing the case search program recorded on the recording medium means that, even when the target of case accumulation and search is a case consisting of a plurality of case sentences, case data obtained by extracting and classifying the individual case sentences from an electronic document written in natural language is accumulated in a case database, and an input search sentence is matched against the case sentences classified and stored in the case database. This has the effect of making it easier to solve problems by similarity search even for complex cases.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing the configuration of the case accumulation/search device according to Embodiment 1 of the present invention.

FIG. 2 is a flowchart showing the processing flow of the case accumulation function in the case accumulation/search device according to Embodiment 1 of the present invention.

FIG. 3 is a diagram showing an example of the structure of an electronic document file.

FIG. 4 is a flowchart showing the flow of the processing target sentence extraction process (step S1).

FIG. 5 is a flowchart showing the flow of the processing target sentence extraction process for one document (step S10).

FIG. 6 is a diagram showing an example of document field information in which information about document fields, such as whether a field is a fixed-form field or a free-form field, is described in advance.

FIG. 7 is a flowchart showing the flow of the processing for free-form fields in the electronic document file 54 (step S17 in the flowchart shown in FIG. 5).

FIG. 8 is a diagram showing an example of a sentence tag list.

FIG. 9 is a diagram showing an example of a "case" generated from the document 55 of the electronic document file 54 shown in FIG. 3 by the processing target sentence extraction process (step S1).

FIG. 10 is a flowchart showing the flow of the sentence analysis process (step S2).

FIG. 11 is a flowchart showing the flow of the morphological analysis process (step S22) and also a configuration diagram showing the configuration of the morphological analysis unit (not shown) in the sentence analysis unit 47 of the case accumulation/search device.

FIG. 12 is a flowchart showing the flow of the syntax analysis process (step S23) in the flowchart shown in FIG. 10 and also a configuration diagram showing the configuration of the syntax analysis unit (not shown) in the sentence analysis unit 45 of the case accumulation/analysis device shown in FIG. 1.

FIG. 13 is a diagram showing an example of a phrase structure 89.

FIG. 14 is a diagram showing an example of a problem solving tree 95 made up of clusters, each containing a set of similar case sentences, generated from the electronic document file 54.

FIG. 15 is a diagram showing an example of the dependency structures of two sentences.

FIG. 16 is a diagram showing an example of the penalty calculation rules used in the similarity calculation.

FIG. 17 is a diagram showing an example of the knowledge about an area stored in the area ontology 44.

FIG. 18 is a flowchart showing the flow of the similar sentence clustering process (step S3).

FIG. 19 is a flowchart showing the flow of the clustering process (step S27).

FIG. 20 is a flowchart showing the flow of the clustering sub-process (step S30).

FIG. 21 is a diagram showing an example of a similarity table that stores the similarities between clusters when case sentences are clustered.

FIG. 22 is a diagram showing an example of inter-cluster information, other than hierarchical relations, stored in the case database 50 in the process (step S28) of storing the problem solving tree 95 (see FIG. 14) in the case database 50 (see FIG. 2) in the flowchart shown in FIG. 18.

FIG. 23 is a diagram showing an example of inter-cluster information, other than hierarchical relations, stored in the case database 50 in the process (step S28) of storing the problem solving tree 95 (see FIG. 14) in the case database 50 (see FIG. 2) in the flowchart shown in FIG. 18.

FIG. 24 is a flowchart showing the processing flow of the case search function in the case accumulation/search device according to Embodiment 1 of the present invention.

FIG. 25 is a flowchart showing the flow of the similar case search process (step S42) in the flowchart shown in FIG. 24.

FIG. 26 is a configuration diagram showing the configuration of a similar case search device according to Prior Art 1.

FIG. 27 is a configuration diagram showing the configuration of a knowledge base device according to Prior Art 2.

FIG. 28 is a configuration diagram showing the configuration of a document information search device according to Prior Art 3.

FIG. 29 is a configuration diagram showing the configuration of the input analysis unit 28 in FIG. 28.

[Explanation of symbols]

44: area ontology
45: processing target sentence extraction unit
46: search sentence input unit
47: sentence analysis unit
48: similar sentence matching unit
49: similar sentence clustering unit
50: case database
51: case cluster editing unit
52: similar case search unit
53: search result display unit
Step S1: processing target sentence extraction processing step
Step S2: sentence analysis processing step
Step S3: similar sentence clustering processing step
Step S4: case cluster editing processing step
Step S41: search sentence input processing step
Step S42: similar case search processing step
Step S43: search result display processing step

In the drawings, the same reference numerals denote the same or corresponding parts.

Continuation of front page

(72) Inventor: Takeyuki Aikawa, c/o Mitsubishi Electric Corporation, 2-2-3 Marunouchi, Chiyoda-ku, Tokyo
(72) Inventor: Yamahiko Ito, c/o Mitsubishi Electric Corporation, 2-2-3 Marunouchi, Chiyoda-ku, Tokyo
F-term (reference): 5B075 ND03 ND20 ND35 NK02 NK10 NK32 NK35 NR03 NR12 PP02 PP24 PQ02 PR06 QM08 UU06 UU40
(54) [Title of the Invention] Case accumulation/search device, case accumulation method and case search method, and computer-readable recording medium recording a case accumulation program and computer-readable recording medium recording a case search program

Claims

[Claims]

1. A case accumulation/search device comprising: an area ontology in which knowledge about terms that depend on the area targeted for case accumulation and search, and about relations between those terms, is stored in advance; processing target sentence extraction means for extracting, from an electronic document written in natural language, each case sentence to be accumulated and searched; search input means for inputting a search sentence for retrieving a case sentence desired by a user; sentence analysis means for receiving each extracted case sentence or the search sentence input by the search input means, performing morphological analysis and syntactic analysis with reference to the knowledge stored in the area ontology, and creating a structure of each case sentence or of the search sentence; similar sentence matching means for receiving the structure of each case sentence or of the search sentence created by the sentence analysis means and obtaining, with reference to the knowledge stored in the area ontology, a similarity between the case sentences or a similarity between each case sentence and the search sentence; similar sentence clustering means for classifying the case sentences on the basis of the similarity between the case sentences obtained by the similar sentence matching means to form case clusters, and creating case data composed of information on the case clusters and relation information between the case clusters; a case database that stores the case data created by the similar sentence clustering means; similar case search means for retrieving, from the case data stored in the case database, similar case sentences resembling the search sentence input by the search input means, on the basis of the similarity between each case sentence and the search sentence obtained by the similar sentence matching means; and search result display means for displaying the similar case sentences retrieved by the similar case search means.

2. A case accumulation method comprising: a processing target sentence extraction step of extracting, from an electronic document written in natural language, each case sentence to be accumulated; a sentence analysis step of receiving each case sentence extracted in the processing target sentence extraction step, performing morphological analysis and syntactic analysis with reference to an area ontology in which knowledge about terms that depend on the area targeted for case accumulation, and about relations between those terms, is stored in advance, and creating a structure of each case sentence; a similar sentence matching step of receiving the structure of each case sentence created in the sentence analysis step and obtaining a similarity between the case sentences with reference to the knowledge stored in the area ontology; and a similar sentence clustering step of classifying the case sentences on the basis of the similarity between the case sentences obtained in the similar sentence matching step to form case clusters, creating case data composed of information on the case clusters and relation information between the case clusters, and storing the case data in a case database.

3. The case accumulation method according to claim 2, wherein the processing target sentence extraction step assigns a case sentence type to each extracted case sentence.

4. The case accumulation method according to claim 2, wherein the similar sentence clustering step specifies the number of hierarchy levels of the clusters formed by classifying the case sentences and/or a threshold of inter-cluster similarity used when judging that the case sentences are similar to each other.

5. The case accumulation method according to claim 2, wherein the area ontology describes at least one kind of knowledge among: IS-A relation knowledge describing a semantic superordinate-subordinate relation; HAS-A relation knowledge describing a semantic part-whole relation; case relation knowledge describing relations between concepts; paraphrase knowledge describing a single meaning, concept, or relation in a plurality of expressions; and knowledge describing contradictory relations that cannot hold at the same time.

6. The case accumulation method according to claim 2, wherein the similar sentence matching step obtains the similarity by matching semantic structures between the case sentences on the basis of attributes of syntactic elements in the structure of each case sentence created in the sentence analysis step.

7. The case accumulation method according to claim 2, wherein the similar sentence matching step specifies a level of detail of the matching used when obtaining the similarity between the case sentences.

8. The case accumulation method according to claim …, wherein the sentence analysis step creates the structure of each case sentence as a tree structure, and the similar sentence matching step specifies the level of detail of the matching used when obtaining the similarity between the case sentences by the depth of the tree structure of the case sentence to be processed.

9. The case accumulation method according to claim 2, wherein the case database describes a similarity relation or a contradictory relation as a relation between the case clusters.

10. A case search method comprising: a search input step of inputting a search sentence for retrieving a case sentence desired by a user from a case database storing case data to be searched; a sentence analysis step of receiving the search sentence input in the search input step, performing morphological analysis and syntactic analysis with reference to an area ontology in which knowledge about terms that depend on the area targeted for case search, and about relations between those terms, is stored in advance, and creating a structure of the search sentence; a similar sentence matching step of receiving the structure of the search sentence created in the sentence analysis step and obtaining a similarity between each case sentence and the search sentence with reference to the area ontology; a similar case search step of retrieving, from the case data stored in the case database, similar case sentences resembling the search sentence input in the search input step, on the basis of the similarity between each case sentence and the search sentence obtained in the similar sentence matching step; and a search result display step of displaying the similar case sentences retrieved in the similar case search step.

11. A computer-readable recording medium on which is recorded a case accumulation program comprising: a processing target sentence extraction procedure for extracting, from an electronic document written in natural language, each case sentence to be accumulated; a sentence analysis procedure for receiving each case sentence extracted by the processing target sentence extraction procedure, performing morphological analysis and syntactic analysis with reference to an area ontology in which knowledge about terms that depend on the area targeted for case accumulation, and about relations between those terms, is stored in advance, and creating a structure of each case sentence; a similar sentence matching procedure for receiving the structure of each case sentence created by the sentence analysis procedure and obtaining a similarity between the case sentences with reference to the knowledge stored in the area ontology; and a similar sentence clustering procedure for classifying the case sentences on the basis of the similarity between the case sentences obtained by the similar sentence matching procedure to form case clusters, creating case data composed of information on the case clusters and relation information between the case clusters, and storing the case data in a case database.

12. A computer-readable recording medium on which is recorded a case search program comprising: a search input procedure for inputting a search sentence for retrieving a case sentence desired by a user from a case database storing case data to be searched; a sentence analysis procedure for receiving the search sentence input by the search input procedure, performing morphological analysis and syntactic analysis with reference to an area ontology in which knowledge about terms that depend on the area targeted for case search, and about relations between those terms, is stored in advance, and creating a structure of the search sentence; a similar sentence matching procedure for receiving the structure of the search sentence created by the sentence analysis procedure and obtaining a similarity between each case sentence and the search sentence with reference to the knowledge stored in the area ontology; a similar case search procedure for retrieving, from the case data stored in the case database, similar case sentences resembling the search sentence input by the search input procedure, on the basis of the similarity between each case sentence and the search sentence obtained by the similar sentence matching procedure; and a search result display procedure for displaying the similar case sentences retrieved by the similar case search procedure.