JP2001229177A - Method and device for structuring instance base, and recording medium with recorded instance base structuring program

Info

Publication number: JP2001229177A
Application number: JP2000038736A
Authority: JP
Inventors: Yamahiko Ito; Yasuhiro Takayama; Katsushi Suzuki
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2000-02-16
Filing date: 2000-02-16
Publication date: 2001-08-24

Abstract

PROBLEM TO BE SOLVED: Conventionally, the keywords used as classification criteria are selected from the distribution of words across documents, without considering the semantic properties of the words. Words with the same semantic property are therefore not always selected, and the criteria for classifying documents become inconsistent. SOLUTION: The device includes a term extracting means 17 that extracts terms from a text 11 and a classification pattern generating means 18 that generates a classification pattern of the text 11 from those terms; the text 11 is clustered on the basis of the classification pattern.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a case base construction method, a case base construction apparatus, and a recording medium on which a case base construction program is recorded. They are applied to systems, such as those used in help desk operations, that accumulate past inquiries and, when a new inquiry is received, let an operator search the accumulated past cases and use them to solve the problem.

[0002]

2. Description of the Related Art

FIG. 34 is a configuration diagram showing a conventional case base construction apparatus disclosed in, for example, Japanese Patent Application Laid-Open No. H8-153121. In the figure, 1 is a document classification apparatus; 2 is a data management unit that manages a document DB 8 and a keyword DB 9 and manages the input and output of documents and keyword groups; 3 is a word detection unit that, on receiving a document from the data management unit 2, refers to a word dictionary 6 and a thesaurus dictionary 7 to generate word keywords; 4 is a document classification unit that, on receiving the keyword group of each document from the data management unit 2, generates a hierarchical classification system; and 5 is a classification result output unit that, on receiving the hierarchical classification system from the document classification unit 4, displays an interface screen on a terminal 10.

[0003] Reference numeral 6 denotes a word dictionary containing general terms; 7 denotes a thesaurus dictionary containing hypernym/hyponym relations between terms, synonym information, and the like; 8 denotes a document DB storing the documents to be classified; 9 denotes a keyword DB storing keywords assigned manually in advance; and 10 denotes a terminal equipped with a CRT, a keyboard, and the like. FIG. 35 is a flowchart showing the processing of the document classification unit 4.

[0004] Next, the operation will be described. First, when the data management unit 2 receives a document from the document DB 8, it outputs the document to the word detection unit 3.

[0005] On receiving the document from the data management unit 2, the word detection unit 3 performs morphological analysis with reference to the word dictionary 6 and extracts one or more words from the document. Having extracted the words, the word detection unit 3 refers to the thesaurus dictionary 7 and generates a synonym group from the word group. It then combines the word group and the synonym group to generate word keywords and outputs the word keywords to the data management unit 2.

[0006] The data management unit 2 then additionally registers the word keywords in the keyword group of each document in the keyword DB 9. The word detection unit 3 also calculates the importance of each word as a keyword on the basis of the word's appearance frequency relative to the total appearance frequency of all words and of where it appears in the document, such as the title, a heading, or a paragraph.

[0007] On receiving the keyword group of each document from the data management unit 2, the document classification unit 4 generates a hierarchical classification system. Specifically, on receiving from the data management unit 2 documents associated with manually assigned keywords and with the word keywords generated by the word detection unit 3, the document classification unit 4 gathers, for each keyword associated with the documents, the documents that contain that keyword (hereinafter, "single keyword classification processing") (step ST1). A set of documents gathered by the single keyword classification processing is called a single keyword folder.
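The single keyword classification processing of step ST1 can be read as building an inverted index from keywords to documents. The following is only a minimal sketch under that reading; the function name, the document IDs, and the sample keywords are illustrative assumptions, not taken from the cited publication.

```python
from collections import defaultdict

def single_keyword_folders(doc_keywords):
    """Group document IDs by each keyword attached to them (step ST1 as read here)."""
    folders = defaultdict(set)
    for doc_id, keywords in doc_keywords.items():
        for keyword in keywords:
            folders[keyword].add(doc_id)
    return dict(folders)

# Hypothetical input: document ID -> keywords (manual keywords plus word keywords).
docs = {
    "doc1": {"Company A", "F failure"},
    "doc2": {"Company A"},
    "doc3": {"F failure", "G failure"},
}
folders = single_keyword_folders(docs)
```

Each value in `folders` corresponds to one single keyword folder; a document with several keywords appears in several folders.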

[0008] Next, the document classification unit 4 performs related keyword classification processing: it compares the documents in the single keyword folders with one another, merges single keyword folders that contain similar document groups, and creates related keyword folders (step ST2). Whether folders can be merged is decided by computing distances from the word vectors of the documents in each single keyword folder and comparing the distance distributions within the folders.

[0009] Next, for the related keyword folders created by the related keyword classification processing, the document classification unit 4 determines whether a pair of related keyword folders containing similar document groups can be integrated, and repeats the integration of related keyword folders for as long as integration is possible (step ST3).

[0010] The document classification unit 4 further checks whether the contents of each single keyword folder or related keyword folder can be subdivided (step ST4) and, for as long as subdivision is possible, repeats the classification hierarchically (step ST5). In step ST5, hierarchical classification is performed within all folders, and the processing ends when no unprocessed folder remains (step ST6).

[0011] On receiving the hierarchical classification system from the document classification unit 4, the classification result output unit 5 displays an interface screen on the terminal 10.

[0012]

PROBLEMS TO BE SOLVED BY THE INVENTION

Because the conventional case base construction apparatus is configured as described above, it has the following two problems.

[0013] First, the keywords used as classification criteria are selected from the distribution of words across documents, without considering the semantic properties of the words. Words with the same semantic property are therefore not always selected, and as a result the criteria for classifying documents become inconsistent. For example, when failure response records are classified, one folder may collect documents containing the same "company name" while another folder collects documents having the same "failure name". As a result, as shown in FIG. 36, clusters whose criterion words belong to different concepts are mixed together, which makes it difficult to search efficiently for a target document. In the example of FIG. 36, a cluster containing "Company A" is mixed with clusters containing "F failure" or "G failure". Ideally, as shown in FIG. 37, documents should be classified on the basis of words that share a common concept suited to the user's purpose.

[0014] Second, it is possible to set the keywords used as classification criteria by omitting the automatic extraction of keywords from documents and associating only manually assigned keywords with the documents. To do so, however, the user must accurately specify each criterion keyword one by one, which increases the user's operating burden. For example, to use "company name" as the classification criterion, the user must specify each individual company name appearing in the documents (for example, "Company A", "Company B", and "Company C"). Similarly, to use "failure name" as the criterion, the user must specify each individual failure name appearing in the documents ("D failure", "E failure", "F failure", and "G failure"). Such a specification is possible only if the user knows in advance every company name and failure name that appears in the documents.

[0015] The present invention has been made to solve the problems described above. One object is to obtain a case base construction method, a case base construction apparatus, and a recording medium on which a case base construction program is recorded that make it possible to classify on the basis of words sharing a common concept suited to the user's purpose, so that a target case can be searched for efficiently. Another object is to obtain a case base construction method, a case base construction apparatus, and a recording medium on which a case base construction program is recorded that can automatically extract the keywords used as classification criteria from documents and classify the documents, without the individual criterion keywords having to be specified.

[0016]

MEANS FOR SOLVING THE PROBLEMS

A case base construction method according to the present invention includes a term extraction step of extracting terms contained in a text and a classification pattern generation step of generating a classification pattern of the text from those terms, and clusters the text on the basis of the classification pattern.

[0017] A case base construction method according to the present invention determines the synonymy of the terms extracted in the term extraction step from the similarity of their notation, and eliminates notational variation among synonyms.

[0018] A case base construction method according to the present invention refers to a synonym dictionary to determine the synonymy of the terms extracted in the term extraction step, and eliminates notational variation among synonyms.

[0019] A case base construction method according to the present invention performs the clustering with reference to an ontology dictionary in which conceptual relations among vocabulary items are described.

[0020] A case base construction method according to the present invention uses, as the ontology dictionary, an IS-A dictionary in which relations between superordinate and subordinate concepts of words are described.

[0021] A case base construction method according to the present invention uses, as the ontology dictionary, a HAS-A dictionary in which relations between higher-level and lower-level components of equipment are described.

[0022] A case base construction method according to the present invention includes a classification target sentence extraction step of extracting, from the text input in the input step, the classification target sentences to be processed by the term extraction step, the classification pattern generation step, and the cluster generation step.

[0023] In a case base construction method according to the present invention, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, clusters are generated in order starting from the first classification target sentence of the text, and a hierarchical structure is created.

[0024] In a case base construction method according to the present invention, a classification pattern description table describing the vocabulary patterns that serve as classification criteria is referred to when the classification pattern of a text is generated, and the vocabulary patterns are described in that table separately for each hierarchy level.

[0025] A case base construction method according to the present invention includes a display step of displaying the clusters generated in the cluster generation step in descending order of the number of texts belonging to them.

[0026] In a case base construction method according to the present invention, when a cluster to be displayed is designated, the full text of the texts assigned to that cluster is displayed.

[0027] In a case base construction method according to the present invention, when a cluster to be displayed is designated, the reference target sentences assigned to that cluster are displayed.

[0028] In a case base construction method according to the present invention, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, identification information identifying the type of each classification target sentence is displayed.

[0029] In a case base construction method according to the present invention, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, an arbitrary classification target sentence is displayed at the initial levels of the hierarchical structure, and another classification target sentence is displayed from an intermediate level of the hierarchical structure onward.

[0030] A case base construction method according to the present invention performs the clustering while associating the classification target sentences extracted in the classification target sentence extraction step with the existing clusters stored in the case base.

[0031] A case base construction method according to the present invention includes an editing step of editing the clusters to be stored in the case base.

[0032] In a case base construction method according to the present invention, when a cluster generated in the cluster generation step is stored in the case base, the display processing of the display step and the editing processing of the editing step are omitted if the similarity between that cluster and an existing cluster stored in the case base is higher than a predetermined threshold.

[0033] In a case base construction method according to the present invention, when a cluster generated in the cluster generation step is stored in the case base, if the similarity between that cluster and an existing cluster stored in the case base is lower than a predetermined threshold, the display processing of the display step and the editing processing of the editing step are omitted, and the cluster is saved in a temporary storage file instead of being stored in the case base.

[0034] A case base construction apparatus according to the present invention includes term extraction means for extracting terms contained in a text and classification pattern generation means for generating a classification pattern of the text from those terms, and clusters the text on the basis of the classification pattern.

[0035] A recording medium on which a case base construction program according to the present invention is recorded includes a term extraction processing procedure for extracting terms contained in a text and a classification pattern generation processing procedure for generating a classification pattern of the text from those terms, and clusters the text on the basis of the classification pattern.

[0036]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of the present invention is described below.

Embodiment 1.

FIG. 1 is a configuration diagram showing a case base construction apparatus according to Embodiment 1 of the present invention. In the figure, 11 is a text to be processed for case construction, and 12 is a basic word dictionary used for morphological analysis of the text 11; the basic word dictionary 12 consists of two fields, notation and part of speech (see FIG. 2). 13 is a term pattern description table describing the patterns (see FIG. 3) used to extract terms and the concepts of those terms from the text 11, and 14 is a classification pattern description table describing the vocabulary patterns (see FIG. 4) that serve as criteria for classifying the text 11.

[0037] Reference numeral 15 denotes input means for inputting the text 11; 16 denotes sentence analysis means that performs morphological analysis of the text 11 with reference to the basic word dictionary 12; 17 denotes term extraction means that refers to the term pattern description table 13 and extracts the terms contained in the text 11 and the concepts of those terms; and 18 denotes classification pattern generation means that refers to the classification pattern description table 14 and generates a classification pattern of the text 11 from the terms and concepts extracted by the term extraction means 17.

[0038] Reference numeral 19 denotes cluster generation means that clusters the text 11 on the basis of the classification pattern generated by the classification pattern generation means 18; 20 denotes output means that stores the clusters generated by the cluster generation means 19 in a case base 23; 21 denotes editing means for editing the clusters to be stored in the case base 23; 22 denotes control means that controls the operation of the input means 15, the sentence analysis means 16, the term extraction means 17, the classification pattern generation means 18, the cluster generation means 19, the output means 20, and the editing means 21, and passes the data that form the inputs and outputs of each means; and 23 denotes the case base that stores the clusters. FIG. 5 is a flowchart showing the case base construction method according to Embodiment 1 of the present invention.

[0039] Next, the operation will be described. First, the input means 15 reads the text 11 (step ST11). In Embodiment 1, the text 11 that is read is assumed to be a failure response record concerning a network. FIG. 6 shows an example of a text 11 that is a failure response record.

[0040] When the input means 15 has read the text 11, the sentence analysis means 16 extracts classification target sentences from the text 11 (step ST12). In this example, among the sentences recorded in the text 11, descriptions of failure symptoms are treated as classification target sentences.

[0041] The extraction of classification target sentences refers to a sentence tag determination table (see FIG. 7). The sentence tag determination table has a sentence tag definition part and a keyword definition part: a sentence that contains a character string defined in the keyword definition part is a description of the content defined in the sentence tag definition part. For example, a sentence containing the character string "検知" (detection) is a description of a "symptom". Accordingly, among the sentences of the failure response record in FIG. 6, every sentence containing any of the character strings "検知" (detection), "発生" (occurrence), "ON", "OFF", or "復旧" (recovery) is given the sentence tag "symptom" and is determined to be a classification target sentence.

[0042] Next, the sentence analysis means 16 extracts reference target sentences from the text 11 (step ST13). In this example, among the sentences recorded in the text 11, descriptions of actions taken in response to the failure are treated as reference target sentences. The sentence tag determination table is referred to in the same way as in step ST12. Among the sentences of the failure response record in FIG. 6, every sentence containing any of the character strings "実施" (execution), "連絡" (contact), or "依頼" (request) is given the sentence tag "action" and is determined to be a reference target sentence.

[0043] FIG. 8 is an explanatory diagram showing the failure response record to which sentence tags have been added by the processing of steps ST12 and ST13. In FIG. 8, a number is appended to each sentence tag so that the tags are easy to identify in the later description; the number indicates the order in which that "symptom" or "action" appears within the same failure response record.
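The tagging of steps ST12 and ST13, together with the numbering described for FIG. 8, might be sketched as follows. This is only an illustration: the dictionary layout of the sentence tag determination table, the function name, the first-match rule when a sentence could match both tags, and every sample sentence except the first (which appears in the description of FIG. 9) are assumptions. The keywords are the ones named in the text.

```python
# Assumed in-memory form of the sentence tag determination table (FIG. 7):
# sentence tag -> keywords from the keyword definition part.
SENTENCE_TAG_TABLE = {
    "symptom": ["検知", "発生", "ON", "OFF", "復旧"],  # classification target sentences
    "action": ["実施", "連絡", "依頼"],                # reference target sentences
}

def tag_sentences(sentences):
    """Return (tag, ordinal, sentence) for every sentence that contains a keyword.

    The ordinal counts how many sentences with the same tag have appeared so far
    in this record, mirroring the numbering used in FIG. 8.
    """
    counters = {tag: 0 for tag in SENTENCE_TAG_TABLE}
    tagged = []
    for sentence in sentences:
        for tag, keywords in SENTENCE_TAG_TABLE.items():
            if any(keyword in sentence for keyword in keywords):
                counters[tag] += 1
                tagged.append((tag, counters[tag], sentence))
                break  # assumption: the first matching tag wins
    return tagged

record = [
    "フレーム監視装置にてMajorアラーム発生。",
    "A社に連絡。",      # hypothetical sentence
    "アラーム復旧。",   # hypothetical sentence
    "天気は晴れ。",     # hypothetical sentence with no keyword
]
tagged = tag_sentences(record)
```

Sentences without any keyword receive no tag and are dropped from the result.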

[0044] Next, the sentence analysis means 16 performs morphological analysis of the classification target sentences (step ST14). The morphological analysis is performed on the basis of, for example, the minimum-cost method disclosed in "Morphological Analysis of Japanese Sentences Containing Unregistered Words" (Kenji Yoshimura, Mitsuno Takeuchi, Kenzo Tsuda, Kimiaki Shuto, Transactions of the Information Processing Society of Japan, Vol. 30, No. 3, 1989; hereinafter "Document A"). With the method of Document A, analysis is possible even when the input sentence contains unregistered words that are not in the morphological analysis dictionary. In that case, unregistered words consisting of katakana strings or alphabetic strings are handled in the same way as independent words. FIG. 9 shows the word boundaries obtained by applying morphological analysis to the classification target sentences of FIG. 8.
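To make the idea of minimum-cost segmentation with unregistered words concrete, here is a toy sketch. It is not the algorithm of Document A: it uses word costs only (no connection costs), and the lexicon, the cost values, and the function names are all assumptions. Unregistered katakana or alphabetic runs are simply offered as word candidates at a fixed cost.

```python
def char_class(ch):
    """Classify a character as katakana, ASCII letter, or other."""
    if "\u30a0" <= ch <= "\u30ff":
        return "kata"
    if ch.isascii() and ch.isalpha():
        return "alpha"
    return "other"

def unknown_run(text, i):
    """Longest katakana or alphabetic run starting at i, or None."""
    cls = char_class(text[i])
    if cls == "other":
        return None
    j = i
    while j < len(text) and char_class(text[j]) == cls:
        j += 1
    return text[i:j]

def segment(text, lexicon, unknown_cost=5):
    """Return the minimum-total-cost segmentation, or None if none exists."""
    inf = float("inf")
    best = [inf] * (len(text) + 1)   # best[j]: minimum cost to cover text[:j]
    back = [None] * (len(text) + 1)  # back[j]: start index of the last word
    best[0] = 0
    for i in range(len(text)):
        if best[i] == inf:
            continue
        candidates = [(w, c) for w, c in lexicon.items() if text.startswith(w, i)]
        run = unknown_run(text, i)
        if run is not None and run not in lexicon:
            candidates.append((run, unknown_cost))  # unregistered word candidate
        for word, cost in candidates:
            j = i + len(word)
            if best[i] + cost < best[j]:
                best[j], back[j] = best[i] + cost, i
    if best[-1] == inf:
        return None
    words, j = [], len(text)
    while j > 0:
        words.append(text[back[j]:j])
        j = back[j]
    return words[::-1]

# Hypothetical lexicon: notation -> cost.
LEXICON = {"監視装置": 2, "監視": 2, "装置": 2, "にて": 1, "発生": 2, "。": 1}
words = segment("フレーム監視装置にてMajorアラーム発生。", LEXICON)
```

With these assumed costs, the single entry "監視装置" is preferred over "監視" plus "装置" because its total cost is lower.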

[0045] When the sentence analysis means 16 has performed morphological analysis of the classification target sentences, the term extraction means 17 refers to the term pattern description table 13 and extracts the terms contained in the text 11 (step ST15). The description format of the term pattern description table 13 is as follows. FIG. 10 shows the format of the term pattern description table 13 in BNF (Backus-Naur Form) notation.

[0046] A term pattern consists of a pattern definition part and a term determination part, joined by "→". The pattern definition part consists of a sequence of one or more morpheme descriptions, each representing features of the morphological analysis result. A morpheme description is expressed as AND, OR, and repetition of notation patterns, part-of-speech patterns, character type patterns, or string length patterns (collectively, "morpheme patterns").

[0047] In FIG. 10, string denotes a character string and number denotes a numeral. The notation pattern, part-of-speech pattern, character type pattern, and vocabulary concept pattern take a character string as their argument, and the string length pattern takes a numeral. For example, s(発生) is a morpheme whose notation is "発生" (occurrence), p(名詞) is a morpheme whose part of speech is noun (名詞), t(カタ) is a morpheme whose character type is katakana, l(1) is a morpheme whose string length is one character, and c(装置) is a morpheme whose vocabulary concept is "装置" (device). Morpheme descriptions are separated by ",". FIG. 11 lists the character types used in Embodiment 1, and FIG. 12 shows examples of morpheme descriptions.
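One way to picture the morpheme patterns s, p, t, l, and c is as predicates over a single morpheme that can be combined with AND and OR. The sketch below is an assumption-laden illustration: the `Morpheme` class, the `char_type` helper, and the combinators are hypothetical, only the three character types named in the text (カタ, アル, 漢字) are covered rather than the full list of FIG. 11, and repetition is not modeled.

```python
from dataclasses import dataclass

@dataclass
class Morpheme:
    surface: str       # notation
    pos: str           # part of speech
    concept: str = ""  # vocabulary concept, if any

def char_type(text):
    """Return カタ, アル, or 漢字 if every character has that type, else 'mixed'."""
    def one(ch):
        if "\u30a0" <= ch <= "\u30ff":
            return "カタ"
        if ch.isascii() and ch.isalpha():
            return "アル"
        if "\u4e00" <= ch <= "\u9fff":
            return "漢字"
        return "other"
    kinds = {one(ch) for ch in text}
    return kinds.pop() if len(kinds) == 1 else "mixed"

# Each factory returns a predicate taking a Morpheme.
def s(string):  return lambda m: m.surface == string            # notation pattern
def p(pos):     return lambda m: m.pos == pos                   # part-of-speech pattern
def t(kind):    return lambda m: char_type(m.surface) == kind   # character type pattern
def l(length):  return lambda m: len(m.surface) == length       # string length pattern
def c(concept): return lambda m: m.concept == concept           # vocabulary concept pattern

def all_of(*preds):  # AND
    return lambda m: all(pred(m) for pred in preds)

def any_of(*preds):  # OR
    return lambda m: any(pred(m) for pred in preds)

frame = Morpheme("フレーム", "名詞")
content_word = any_of(t("カタ"), t("アル"), t("漢字"))
```

Here `content_word` corresponds to the OR of three character type patterns and accepts `frame` because its notation is all katakana.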

[0048] The term determination part, in turn, is written as a sequence of a term range determination part, a term part-of-speech determination part, and a term concept determination part. The term range determination part specifies, by number, a range of the morpheme descriptions in the pattern definition part. For example, range(1,2) means that the span formed by concatenating the first and second morpheme patterns of the pattern definition part is acquired as a single term. The term part-of-speech determination part specifies the part of speech of the acquired term; for example, pos(noun) indicates that the extracted term is a noun. The term concept determination part specifies the concept of the acquired term; for example, concept(device) indicates that the concept of the extracted term is "device". A character string determined by the term range determination part can also be given as an argument of the term concept determination part. For example, concept(range(1,2)+device) indicates that the string obtained by concatenating the first and second morpheme patterns of the pattern definition part, followed by the string "device", is extracted as the concept.
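The way a term determination part turns matched text into a term can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names Term, span, and decide_term are hypothetical, and the slot texts ("ルータ" for "router", "断" for "disconnection") are assumed to be what the first and second morpheme patterns matched.

```python
from dataclasses import dataclass


@dataclass
class Term:
    notation: str
    pos: str
    concept: str


def span(slot_texts, first, last):
    """range(first, last): concatenate the text matched by slots first..last (1-based)."""
    return "".join(slot_texts[first - 1:last])


def decide_term(slot_texts, term_range, pos, concept_parts):
    """Apply a term determination part to the text matched by each pattern slot.

    concept_parts mixes literal strings with (first, last) tuples standing for
    range(first, last), so concept(range(1,1)+X) becomes [(1, 1), "X"].
    """
    notation = span(slot_texts, *term_range)
    concept = "".join(
        span(slot_texts, *part) if isinstance(part, tuple) else part
        for part in concept_parts
    )
    return Term(notation, pos, concept)


# Assumed match: slot 1 = "ルータ" (router), slot 2 = "断" (disconnection).
# range(1,2), pos(noun), concept(range(1,1)+障害) where 障害 means "failure".
term = decide_term(["ルータ", "断"], (1, 2), "noun", [(1, 1), "障害"])
print(term)
```

Representing range(...) as a tuple keeps the concept argument a flat list, which is enough for the concatenation form described here.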

[0049] FIG. 13 is a flowchart showing how the morphological analysis result is collated with the pattern definition part in the term extraction processing of step ST15. As an example, the following describes collating "frame / monitoring device / at / Major / alarm / occurrence / ." (フレーム／監視装置／にて／Ｍａｊｏｒ／アラーム／発生／。), the morphological analysis result of symptom 1 in failure response record 1 of FIG. 9, with term pattern (1) of FIG. 3.

[0050] First, in step ST31, the first morpheme description "(t(kata)||t(al)||t(kanji))+" is taken out. Next, step ST32 determines whether the morpheme description contains a repetition symbol. In this example it does, so the process proceeds to step ST33, where the morpheme "frame" (フレーム) is taken out of the analysis result.

[0051] Next, in step ST34, the processing of FIG. 15 is executed to determine whether the evaluation value is "TRUE". That is, process B is called in step ST61 of FIG. 15, process C is called in step ST65, and step ST69 determines whether the first character is "(". Because the first character of "(t(kata)||t(al)||t(kanji))+" is "(", the process proceeds to step ST70, which calls process A.

[0052] Process B is then called in step ST61, process C is called in step ST65, and step ST69 again determines whether the first character is "(". The next character is "t", so the determination is "No", and step ST71 evaluates whether the morpheme pattern "t(kata)" matches the morpheme "frame".

[0053] FIG. 16 is a flowchart showing the evaluation processing of step ST71. Because the morpheme pattern "t(kata)" is a character type pattern, the determination in step ST81 is "No", that in step ST85 is "No", and that in step ST89 is "Yes". In step ST90, "frame" (フレーム) is found to be a katakana character string, so the evaluation value is set to "TRUE" in step ST91, the flow of FIG. 16 ends, and the process returns to step ST71.
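The character type check of steps ST89 to ST91 could look roughly like the sketch below. The Unicode ranges and the type names "kata", "al", and "kanji" are assumptions made for illustration; the actual list of character types is the one in FIG. 11, which is not reproduced here.

```python
def is_katakana(ch):
    # Katakana block; this range also contains the long vowel mark "ー".
    return "\u30a0" <= ch <= "\u30ff"


def is_alphabetic(ch):
    # ASCII letters plus full-width Latin letters such as "Ｍ" and "ａ".
    return (
        (ch.isascii() and ch.isalpha())
        or "\uff21" <= ch <= "\uff3a"
        or "\uff41" <= ch <= "\uff5a"
    )


def is_kanji(ch):
    # Basic CJK Unified Ideographs block only (an assumption of this sketch).
    return "\u4e00" <= ch <= "\u9fff"


CHAR_TYPES = {"kata": is_katakana, "al": is_alphabetic, "kanji": is_kanji}


def matches_char_type(type_name, surface):
    """Evaluate t(type_name) against a morpheme: every character must be of that type."""
    check = CHAR_TYPES[type_name]
    return bool(surface) and all(check(ch) for ch in surface)


print(matches_char_type("kata", "フレーム"))   # the morpheme "frame"
print(matches_char_type("al", "フレーム"))
```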

[0054] The process then returns to step ST65, which called process C, and step ST66 checks whether the next character string is "&&". In this example the next character string is "||", so the determination is "No" and the process returns to step ST61, which called process B.

[0055] Next, step ST62 checks whether the next character string is "||". The determination is "Yes", so step ST63 evaluates whether the next morpheme description "t(al)" matches the morpheme "frame". By the same processing as for "t(kata)" and "frame" described above, the evaluation value is "FALSE"; however, step ST64 takes the OR of the expression (the OR of "TRUE" and "FALSE"), so the evaluation value becomes "TRUE".

[0056] The process then proceeds to step ST62 and checks whether the next character string is "||". The determination is "Yes", and whether the next morpheme description "t(kanji)" matches the morpheme "frame" is evaluated. The processing described above yields "FALSE", but step ST64 takes the OR of the expression (the OR of "TRUE" and "FALSE"), so the evaluation value becomes "TRUE". With this, the routine of FIG. 15 ends with the evaluation value "TRUE" and returns to step ST34 of FIG. 13.
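The nested calls of FIG. 15 behave like a small recursive-descent evaluator: an OR level (roughly process B), an AND level (roughly process C), and a primitive or parenthesized expression below them. The sketch below is written under that reading and is not the patent's own code; evaluate, match_primitive, and the dictionary form of a morpheme are hypothetical, and the repetition symbol "+" is assumed to be handled by the caller.

```python
import re

# Tokens: "||", "&&", parentheses, or a primitive such as t(kata) or s(にて).
TOKEN = re.compile(r"\|\||&&|[()]|[a-z]\([^()]*\)")


def match_primitive(pattern, morpheme):
    """Evaluate one primitive; only s(...) and t(...) are covered in this sketch."""
    kind, arg = pattern[0], pattern[2:-1]
    if kind == "s":
        return morpheme["surface"] == arg
    if kind == "t":
        return morpheme["char_type"] == arg
    raise ValueError(f"unsupported primitive: {pattern}")


def evaluate(description, morpheme):
    tokens = TOKEN.findall(description)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "||":
            pos += 1
            rhs = parse_and()      # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "&&":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            pos += 1               # skip the closing ")"
            return value
        return match_primitive(token, morpheme)

    return parse_or()


frame = {"surface": "フレーム", "char_type": "kata"}
print(evaluate("(t(kata)||t(al)||t(kanji))", frame))
```

As in the walkthrough, every alternative is evaluated and the results are ORed, so one "TRUE" among the three character type patterns is enough.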

[0057] Because the evaluation value is "TRUE", the determination in step ST34 is "Yes" and the process proceeds to step ST35. In step ST35, the morpheme description that follows the repetition part is collated. FIG. 14 is a flowchart showing the collation processing of step ST35.

[0058] In step ST51, the morpheme description "(s(nite)||s(de))" of the term pattern is taken out (にて and で are particles that both mean "at"). Next, in step ST52, the morpheme "monitoring device" (監視装置) is taken out of the analysis result. Step ST53 collates the morpheme description of the term pattern with the morpheme of the analysis result, and the processing of FIG. 15 yields "FALSE". As a result, step ST54 determines that the collation has failed, and the process returns to step ST35.

[0059] Note that taking out a morpheme description in step ST51 of FIG. 14 and taking out a morphological analysis result in step ST52 do not affect the processing of FIG. 13. That is, the morpheme description or morphological analysis result to be taken out next after step ST35 has been executed is the same as the one that was to be taken out next before step ST35 was executed.

[0060] Next, step ST36 determines whether the analysis result has been exhausted. Because an analysis result follows "frame", the determination is "No", the process proceeds to step ST33, and the next morpheme of the analysis result, "monitoring device", is taken out. Then, in step ST34, the morpheme description "(t(kata)||t(al)||t(kanji))+" is collated with the analysis result "monitoring device" according to the procedure of FIG. 15. The collation processing described above yields the evaluation value "TRUE".

[0061] Next, in step ST35, the morpheme description that follows the repetition part is collated. In step ST51 of FIG. 14, the morpheme description "(s(nite)||s(de))" of the term pattern is taken out, and in step ST52 the morpheme "nite" (にて) is taken out. Step ST53 collates the two, and the determination is "TRUE". As a result, step ST55 determines that the collation has succeeded, and the process returns to step ST35.

[0062] Next, step ST36 determines whether the analysis result has been exhausted. Because an analysis result follows "monitoring device", the determination is "No", the process proceeds to step ST33, and the next morpheme of the analysis result, "nite" (にて), is taken out. Then, in step ST34, the morpheme description "(t(kata)||t(al)||t(kanji))+" is collated with the analysis result "nite" according to the procedure of FIG. 15. The collation processing described above yields the evaluation value "FALSE".

[0063] Next, step ST43 checks whether any character string was successfully collated. Because "frame" and "monitoring device" were successfully collated against "(t(kata)||t(al)||t(kanji))+", the determination is "Yes". Step ST44 then checks whether the pattern definition part has been exhausted; because the morpheme description "(s(nite)||s(de))" follows "(t(kata)||t(al)||t(kanji))+", the determination is "No".

[0064] The process then proceeds to step ST31, and the next morpheme description "(s(nite)||s(de))" is taken out. In step ST32, the morpheme description has no repetition symbol, so the determination is "No" and the process proceeds to step ST37. In step ST37, the morpheme "nite" (にて), which follows the successfully collated "frame" and "monitoring device", is taken out.

[0065] Next, in step ST38, the determination by the processing of FIG. 15 is executed; the result is "TRUE", so the process proceeds to step ST39. Step ST39 checks whether the pattern definition part has been exhausted; because the morpheme description "(t(kata)||t(al)||t(kanji))+" follows "(s(nite)||s(de))", the determination is "No" and the process proceeds to step ST40. Step ST40 checks whether the analysis result has been exhausted; because the morpheme "Major" follows "nite", the determination is "No" and the process proceeds to step ST31.

[0066] Continuing in the same way establishes the correspondence shown in FIG. 17 between the morphological analysis result and the pattern definition part of the term pattern. Terms are then extracted according to the term determination part. From the term range determination part "range(1,1)", "frame monitoring device" (フレーム監視装置), which corresponds to "(t(kata)||t(al)||t(kanji))+", is extracted as the notation of the term; from the term part-of-speech determination part "pos(noun)", "noun" is extracted as the part of speech; and from the term concept determination part "concept(monitoring device)", "monitoring device" is extracted as the concept.
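The slot-by-slot collation walked through above can be approximated by the following sketch. It simplifies the flow of FIGS. 13 and 14: a repeated slot keeps consuming morphemes until the next slot matches the upcoming morpheme, which is one possible reading of the look-ahead in step ST35. The slot list is hypothetical (term pattern (1) of FIG. 3 is not reproduced in the text, and the fourth slot in particular is invented for illustration), as are the names char_type and match_slots.

```python
def char_type(surface):
    """Return 'kata', 'al', 'kanji', or 'other' for a whole morpheme."""
    kinds = set()
    for ch in surface:
        if "\u30a0" <= ch <= "\u30ff":
            kinds.add("kata")
        elif "\u4e00" <= ch <= "\u9fff":
            kinds.add("kanji")
        elif (ch.isascii() and ch.isalpha()) or "\uff21" <= ch <= "\uff3a" or "\uff41" <= ch <= "\uff5a":
            kinds.add("al")
        else:
            kinds.add("other")
    return kinds.pop() if len(kinds) == 1 else "other"


def content(m):          # stands for (t(kata)||t(al)||t(kanji))
    return char_type(m) != "other"


def particle(m):         # stands for (s(nite)||s(de))
    return m in ("にて", "で")


def event(m):            # hypothetical slot: "occurrence" or "detection"
    return m in ("発生", "検知")


def match_slots(slots, morphemes):
    """Return the text matched by each (predicate, repeated) slot, or None on failure."""
    matched, i = [], 0
    for k, (predicate, repeated) in enumerate(slots):
        start = i
        next_pred = slots[k + 1][0] if k + 1 < len(slots) else None
        while i < len(morphemes) and predicate(morphemes[i]):
            i += 1
            if not repeated:
                break
            # Look ahead: stop repeating once the next slot can take over.
            if next_pred and i < len(morphemes) and next_pred(morphemes[i]):
                break
        if i == start:
            return None
        matched.append("".join(morphemes[start:i]))
    return matched


SLOTS = [(content, True), (particle, False), (content, True), (event, False)]
sentence = ["フレーム", "監視装置", "にて", "Ｍａｊｏｒ", "アラーム", "発生", "。"]
print(match_slots(SLOTS, sentence))
```

With this slot list, range(1,1) corresponds to the first element of the result, "フレーム監視装置" ("frame monitoring device").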

[0067] Likewise, for "router / disconnection / detection / ." (ルータ／断／検知／。), the morphological analysis result of symptom 2 in failure response record 1 of FIG. 9, and term pattern (3) of FIG. 3, the correspondence shown in FIG. 18 is established between the morphological analysis result and the pattern definition part. In this case, "router disconnection" (ルータ断) is extracted as the notation from the term range determination part "range(1,2)", and "noun" is extracted as the part of speech from the term part-of-speech determination part "pos(noun)". From the term concept determination part "concept(range(1,1)+failure)", the concept of the term is "router failure", obtained by concatenating "router", which corresponds to "range(1,1)", with "failure". FIG. 19 is an explanatory diagram showing the terms extracted from the failure response records of FIG. 9 in step ST15.

[0068] When the term extraction processing of the term extraction means 17 is finished, the classification pattern generation means 18 resolves notation variations in all words of the analysis result of the sentence analysis means 16 (step ST16). Two kinds of variation are resolved here: the presence or absence of a long vowel mark "ー" at the end of a katakana character string, and the difference between uppercase and lowercase in alphabetic character strings. To this end, a long vowel mark at the end of a katakana string is deleted, and all lowercase alphabetic characters are converted to uppercase.
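The two rule-based conversions of step ST16 are simple enough to sketch directly. The function name normalize and the katakana test by Unicode range are assumptions of this sketch; note also that str.upper() uppercases any cased letter, not only the Latin alphabet.

```python
def normalize(word):
    """Drop one trailing long vowel mark from a katakana word and uppercase letters."""
    is_katakana = bool(word) and all("\u30a0" <= ch <= "\u30ff" for ch in word)
    if is_katakana and len(word) > 1 and word.endswith("ー"):
        word = word[:-1]
    return word.upper()


print(normalize("ルーター"))   # "router" written with a trailing long vowel mark
print(normalize("Major"))
```

After this step "ルーター" and "ルータ" compare equal, as do "Major" and "MAJOR".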

[0069] For notation variations that, unlike the presence or absence of a long vowel mark or the difference between uppercase and lowercase, cannot be resolved by rule-based conversion, the synonym dictionary shown in FIG. 20 is used. The synonym dictionary has a headword field, which holds a character string that appears in the text, and a canonical notation field, which holds the canonical notation of the headword. For example, "frame" (フレーム) and "frame relay" (フレームリレー) are both converted to "FR" and can thereby be recognized as words with the same meaning. FIG. 21 is an explanatory diagram showing the term extraction result after notation variations have been resolved.
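A synonym dictionary of this shape maps directly onto a lookup table. In the sketch below only the two entries named in the text are included; the rest of FIG. 20 is not reproduced, and the name canonical is hypothetical.

```python
# headword -> canonical notation (only the entries mentioned in the text)
SYNONYMS = {
    "フレーム": "FR",        # "frame"
    "フレームリレー": "FR",  # "frame relay"
}


def canonical(word, synonyms=SYNONYMS):
    """Return the canonical notation of a word, or the word itself if it has no entry."""
    return synonyms.get(word, word)


print(canonical("フレームリレー"))
```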

[0070] When the processing of resolving notation variations is finished, the classification pattern generation means 18 generates classification patterns with reference to the classification pattern description table 14 (step ST17). The classification pattern description table 14 is expressed as AND or OR combinations of the notations or concepts of words used for classification (hereinafter "classification elements"). A classification element can be specified with the predicates "string", "concept", and "subconcept". A character string for which no predicate is specified represents the notation of that character string. FIG. 22 is an explanatory diagram showing an example of how "string", "concept", and "subconcept" are used.

[0071] "string" takes a concept as its argument and represents the individual notations for that concept. For example, when the terms shown in FIG. 21 have been extracted, the values of "string(alarm name)" are the notation "MAJOR alarm" and the notation "MINOR alarm".

[0072] "concept" takes a concept as its argument and represents that concept itself. For example, the value of "concept(alarm name)" is the concept "alarm name". Because notations and concepts are mixed in classification elements, concepts are enclosed in "< >" to avoid confusion. For example, the notation "alarm name" is written "alarm name", and the concept "alarm name" is written "<alarm name>".

[0073] "subconcept" takes a concept as its argument and represents the subordinate concepts of that concept. Here, the relation between a superordinate concept and a subordinate concept is determined by whether one concept contains the other as a substring at its end: a superordinate concept is a substring that matches the end of the subordinate concept. For example, "<failure>" (＜障害＞) is a substring that matches the end of "<router failure>" (＜ルータ障害＞), so the concept "<failure>" is a superordinate concept of the concept "<router failure>". Conversely, the concept "<router failure>" is a subordinate concept of the concept "<failure>". Accordingly, when the terms shown in FIG. 21 have been extracted, the values of "subconcept(failure)" are "<router failure>" and "<modem failure>".
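The "string" and "subconcept" predicates can be sketched over a list of extracted (notation, concept) pairs. The term list below is hypothetical, chosen only to be consistent with the values quoted in the text (FIG. 21 itself is not reproduced), and the function names are illustrative.

```python
def string_values(concept, terms):
    """string(concept): every notation whose term carries exactly this concept."""
    return {notation for notation, c in terms if c == concept}


def subconcepts(concept, terms):
    """subconcept(concept): extracted concepts that end with, but differ from, this one."""
    return {c for _, c in terms if c != concept and c.endswith(concept)}


# Hypothetical extraction result as (notation, concept) pairs.
TERMS = [
    ("MAJORアラーム", "アラーム名"),   # "MAJOR alarm", concept "alarm name"
    ("MINORアラーム", "アラーム名"),   # "MINOR alarm", concept "alarm name"
    ("ルータ断", "ルータ障害"),        # concept "router failure"
    ("モデム障害", "モデム障害"),      # concept "modem failure"
]

print(string_values("アラーム名", TERMS))
print(subconcepts("障害", TERMS))      # 障害 means "failure"
```

Because the relation is a plain suffix test, it needs no dictionary; the ontology dictionaries introduced later cover relations that suffix matching cannot see.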

[0074] Classification pattern (1-1) in FIG. 4 indicates that sentences are classified into the same cluster if each has words whose concepts are "<monitoring device>" and "<alarm name>" and the notations of those words are the same. Classification pattern (2-1) indicates that sentences containing a word with the same subordinate concept of "failure" are classified into the same cluster. Further, classification pattern (2-2) indicates that sentences that contain a word with the same subordinate concept of "investigation", the notation "abnormality" (異常), and the notation "present" (あり) are classified into the same cluster, and that sentences that contain a word with the same subordinate concept of "investigation", the character string "abnormality", and the character string "absent" (なし) are classified into the same cluster. As shown in FIG. 4, classification patterns can be described for each hierarchy level. FIG. 23 is an explanatory diagram showing the classification patterns generated for the morphological analysis results of FIG. 9.

[0075] When the classification pattern generation means 18 has generated the classification patterns, the cluster generation means 19 performs clustering based on those patterns and generates clusters (step ST18). FIG. 24 is a flowchart showing the cluster generation processing. Clusters are generated hierarchically, starting from the first classification target sentence. First, step ST101 determines whether the current hierarchy level i is less than or equal to the maximum hierarchy level layer. The initial value of i is "1", and in this example the maximum hierarchy level layer is "4". The determination here is therefore "Yes", and the process proceeds to step ST102.

[0076] In step ST102, the classification patterns of the first hierarchy level are taken out. Here, the classification patterns corresponding to "symptom 1" of failure response records 1, 2, 3, 4, and 5 are taken out. They are as follows.
・Failure response record 1: (FR monitoring device, MAJOR alarm)
・Failure response record 2: (FR monitoring device, MAJOR alarm)
・Failure response record 3: (FR monitoring device, MAJOR alarm)
・Failure response record 4: (FR monitoring device, MAJOR alarm)
・Failure response record 5: (modem monitoring device, MINOR alarm)

[0077] Next, in step ST103, clusters are generated from classification target sentences that share a classification pattern. Because failure response records 1, 2, 3, and 4 have the same classification pattern, the following sentences form one cluster (hereinafter "cluster C1").

[0078]
・Symptom 1 of failure response record 1: "Major alarm occurred at the frame monitoring device." (フレーム監視装置にてＭａｊｏｒアラーム発生。)
・Symptom 1 of failure response record 2: "MAJOR alarm detected at the FR monitoring device." (ＦＲ監視装置にてＭＡＪＯＲアラーム検知。)
・Symptom 1 of failure response record 3: "Major alarm detected at the FR monitoring device." (ＦＲ監視装置でＭａｊｏｒアラーム検知。)
・Symptom 1 of failure response record 4: "Major alarm detected at the FR monitoring device." (ＦＲ監視装置でＭａｊｏｒアラーム検知。)

[0079] The sentence of symptom 1 of failure response record 5, "Minor alarm occurred at the modem monitoring device.", forms another cluster (hereinafter "cluster C2").
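Step ST103 at the first hierarchy level amounts to grouping records by identical classification pattern. The sketch below uses English renderings of the five patterns listed above; the function name and the dictionary layout are assumptions.

```python
from collections import defaultdict


def cluster_by_pattern(patterns):
    """Group record ids whose classification patterns are identical."""
    clusters = defaultdict(list)
    for record_id, pattern in patterns.items():
        clusters[pattern].append(record_id)
    return dict(clusters)


# First-level classification patterns of failure response records 1 to 5.
LEVEL1 = {
    1: ("FR monitoring device", "MAJOR alarm"),
    2: ("FR monitoring device", "MAJOR alarm"),
    3: ("FR monitoring device", "MAJOR alarm"),
    4: ("FR monitoring device", "MAJOR alarm"),
    5: ("modem monitoring device", "MINOR alarm"),
}

for pattern, members in cluster_by_pattern(LEVEL1).items():
    print(pattern, members)
```

Records 1 to 4 fall into one group (cluster C1) and record 5 into another (cluster C2). The grouping only works because notation variations were resolved beforehand, so that "frame monitoring device" and "FR monitoring device" yield the same pattern.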

[0080] Next, in step ST104, reference target sentences are assigned to the clusters. In this example, the "action" descriptions are the reference target sentences. The reference target sentences associated with a cluster are the "action" descriptions that appear between the "symptom" description belonging to that cluster and the next "symptom" description. The "action" descriptions that satisfy this condition are as follows.

[0081]
・Failure response record 1: action 1 "Contact the customer."; action 2 "Request an on-site investigation."
・Failure response record 2: action 1 "Report the situation to the customer."; action 2 "Request an on-site investigation."
・Failure response record 3: action 1 "Customer contact."; action 2 "Contact the site."
・Failure response record 4: none
・Failure response record 5: action 1 "Contacted the customer, but the person in charge was absent."; action 2 "Contact the site."

[0082] Accordingly, the reference target sentences associated with cluster C1 are as follows.
・Action 1 of failure response record 1: "Contact the customer."
・Action 2 of failure response record 1: "Request an on-site investigation."
・Action 1 of failure response record 2: "Report the situation to the customer."
・Action 2 of failure response record 2: "Request an on-site investigation."
・Action 1 of failure response record 3: "Customer contact."
・Action 2 of failure response record 3: "Contact the site."

[0083] The reference target sentences associated with cluster C2 are as follows.
・Action 1 of failure response record 5: "Contacted the customer, but the person in charge was absent."
・Action 2 of failure response record 5: "Contact the site."
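The rule of step ST104, collecting the "action" descriptions that appear between one "symptom" description and the next, can be sketched as below. The record layout (an ordered list of (kind, text) pairs) and the function name are assumptions, and the sample record is a hypothetical reconstruction of failure response record 4, whose full text appears only in FIG. 9.

```python
def actions_following(entries, symptom_number):
    """Return the actions between the given symptom (1-based) and the next symptom."""
    result, seen, collecting = [], 0, False
    for kind, text in entries:
        if kind == "symptom":
            if collecting:
                break                      # the next symptom ends the window
            seen += 1
            collecting = (seen == symptom_number)
        elif kind == "action" and collecting:
            result.append(text)
    return result


# Hypothetical layout of failure response record 4.
RECORD4 = [
    ("symptom", "Major alarm detected at the FR monitoring device."),
    ("symptom", "High-speed modem disconnection also detected."),
    ("action", "Modem replacement carried out."),
]

print(actions_following(RECORD4, 1))   # no action before the next symptom
print(actions_following(RECORD4, 2))
```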

[0084] Next, in step ST105, the whole texts are associated with the clusters. The full texts of failure response records 1, 2, 3, and 4 are associated with cluster C1, and the full text of failure response record 5 is associated with cluster C2. The number k of clusters generated in step ST103 is "2", and the set of clusters is C1 and C2.

[0085] Next, clusters of the next hierarchy level are generated within each of clusters C1 and C2. In step ST106, j, which indicates the ordinal of the cluster to be processed, is set to "1"; then, until step ST107 determines that child clusters have been generated for all clusters, step ST108 generates the clusters of the (i+1)-th hierarchy level. The generation of the child clusters of C1 is described here.

[0086] First, because the value of i indicating the current hierarchy level is "2", the determination in step ST101 is "Yes" and the process proceeds to step ST102. In step ST102, the classification patterns of the second hierarchy level are taken out of the texts belonging to cluster C1. Here, the classification patterns corresponding to the symptom 2 sentences of failure response records 1, 2, 3, and 4 are taken out. They are as follows.

[0087]
・Failure response record 1: (<router failure>)
・Failure response record 2: (<router failure>)
・Failure response record 3: (<modem failure>)
・Failure response record 4: (<high-speed modem failure>)

[0088] Next, in step ST103, clusters are generated from classification target sentences that share a classification pattern. An example is described here in which an ontology dictionary describing the conceptual relations between vocabulary items is used when generating clusters. The ontology dictionary comprises an IS-A dictionary, which describes relations between superordinate and subordinate concepts of vocabulary items, and a HAS-A dictionary, which describes relations between superordinate and subordinate components of equipment and the like. FIG. 25 shows an example of the IS-A dictionary, and FIG. 26 shows an example of the HAS-A dictionary. Part (A) of FIGS. 25 and 26 shows the superordinate/subordinate relations as a tree; in the actual dictionary, as in part (B) of FIGS. 25 and 26, the superordinate/subordinate relation of each word is written as a pair of two words.

[0089] Here, "high-speed modem" is a subordinate concept of "modem", so "<high-speed modem failure>" can be judged to be a kind of "<modem failure>". Accordingly, in the processing of step ST103, the following sentences form one cluster (hereinafter "cluster C11").
・Symptom 2 of failure response record 1: "Router disconnection detected."
・Symptom 2 of failure response record 2: "Router failure detected."

[0090] The following sentences form another cluster (hereinafter "cluster C12").
・Symptom 2 of failure response record 3: "Modem failure also detected."
・Symptom 2 of failure response record 4: "High-speed modem disconnection also detected."
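One way to use the IS-A dictionary when comparing classification patterns is sketched below. It is an assumption-laden illustration: the dictionary holds only the single pair stated in the text ("high-speed modem" IS-A "modem"), concepts are assumed to consist of a head word followed by a fixed tail such as "障害" ("failure"), and the names ancestors and is_kind_of are hypothetical.

```python
# child -> parent pairs; only the relation stated in the text is included.
IS_A = {"高速モデム": "モデム"}   # "high-speed modem" IS-A "modem"


def ancestors(word, is_a):
    """Walk the IS-A pairs upward and return all superordinate words."""
    found = []
    while word in is_a and is_a[word] not in found:
        word = is_a[word]
        found.append(word)
    return found


def is_kind_of(concept, other, is_a, tail="障害"):
    """True if concept equals other or its head word is a descendant of other's head."""
    if not (concept.endswith(tail) and other.endswith(tail)):
        return False
    head, other_head = concept[: -len(tail)], other[: -len(tail)]
    return head == other_head or other_head in ancestors(head, is_a)


print(is_kind_of("高速モデム障害", "モデム障害", IS_A))   # high-speed modem failure vs modem failure
print(is_kind_of("ルータ障害", "モデム障害", IS_A))       # router failure vs modem failure
```

Under this test the patterns of failure response records 3 and 4 are compatible and can share cluster C12, while the router patterns of records 1 and 2 stay in cluster C11.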

[0091] Next, in step ST104, reference target sentences are associated with the clusters. When the processing described above is executed in the same way, no reference target sentence is associated with cluster C11, and the reference target sentences associated with cluster C12 are as follows.
・Action 3 of failure response record 3: "Modem replacement carried out."
・Action 1 of failure response record 4: "Modem replacement carried out."

[0092] Next, in step ST105, the whole texts are associated with the clusters. The full texts of failure response records 1 and 2 are associated with cluster C11, and the full texts of failure response records 3 and 4 are associated with cluster C12.

[0093] Next, clusters of the next hierarchy level are generated within clusters C11 and C12. In step ST108, the (i+1)-th, that is, the third-level clusters are generated for cluster C11. Because the determination in step ST101 is "Yes", the process proceeds to step ST102, where the classification patterns corresponding to the symptom 3 sentences of failure response records 1 and 2, which belong to cluster C11, are taken out. They are as follows.
・Failure response record 1: (<power switch>, ON)
・Failure response record 2: (<router switch>, ON)

[0094] Here, referring to the HAS-A dictionary of FIG. 26, "router" has "power supply" as a lower-level component, so "router switch" and "power switch" can be judged to be the same thing. Therefore, in the processing of step ST103, the following sentences form one cluster.
- Symptom-3 sentence of failure response record 1: "Confirmed that the power switch is ON."
- Symptom-3 sentence of failure response record 2: "The router switch is ON."
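The HAS-A judgment described here can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary is modeled as a mapping from a device to its lower-level components, and two "... switch" terms are treated as the same object when their modifiers are equal or HAS-A related. The names `HAS_A`, `ALIASES`, `head_of`, and `same_object`, and the dictionary contents, are hypothetical and are not taken from FIG. 26.

```python
# Hypothetical HAS-A dictionary: device -> set of lower-level components.
HAS_A = {"router": {"power supply"}}

# Hypothetical mapping from surface forms to dictionary entries.
ALIASES = {"power": "power supply"}


def head_of(term: str, suffix: str = "switch") -> str:
    """Return the modifier in front of the suffix, e.g. 'router switch' -> 'router'."""
    words = term.split()
    if words and words[-1] == suffix:
        head = " ".join(words[:-1])
        return ALIASES.get(head, head)
    return term


def same_object(term_a: str, term_b: str) -> bool:
    """Treat two terms as the same object if their heads are equal or HAS-A related."""
    a, b = head_of(term_a), head_of(term_b)
    return a == b or b in HAS_A.get(a, set()) or a in HAS_A.get(b, set())
```

With such a check, the patterns (<power switch>, ON) and (<router switch>, ON) would compare as equal in step ST103, whereas a term such as "modem switch" would not be merged with "power switch" unless the dictionary said so.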

[0095] FIG. 27 shows the clusters generated from failure response records 1, 2, 3, 4, and 5 by repeating the same processing.

[0096] FIG. 24 was described above for the case where no cluster exists yet. When clusters already exist in the case base 23, step ST103 must also associate the newly added text with the existing clusters. The following describes the case where failure response record 6 of FIG. 28 is added when the clusters of FIG. 27 already exist.

[0097] FIG. 29 is an explanatory diagram showing, for the text of FIG. 28, the classification target sentences extracted in step ST12, the reference target sentences extracted in step ST13, the morphological analysis result of step ST14, the terms extracted in step ST15, and the classification patterns generated in step ST17. The process of generating clusters according to the classification patterns of FIG. 29(E) is described below.

[0098] In step ST101 of FIG. 24, the initial value of i, which indicates the current layer, is "1", and the maximum layer, layer, is "4", so the determination in step ST101 is "Yes". Then, in step ST102, the classification pattern (FR monitoring device, MAJOR alarm) for the first classification target sentence is extracted.

[0099] Next, in step ST103, a cluster is generated from the common classification pattern. Because this pattern is the same as that of cluster C1 in FIG. 27, the sentence is associated with cluster C1. Then, in step ST104, the reference target sentences are associated with the cluster, and in step ST105, the entire text is associated with the cluster. These steps are the same as described above, so their description is omitted.

[0100] Next, the child layer is processed. The comparison targets here are the classification patterns contained in clusters C11 and C12, which are the child clusters of cluster C1. In step ST102, the classification pattern (<modem failure>) for the second classification target sentence is extracted, and in step ST103 a cluster is generated. Because the pattern is the same as that of cluster C12, the sentence is associated with cluster C12.

[0101] For the classification pattern (<modem switch>, OFF) of the third classification target sentence of failure response record 6, classification is performed against the sentences contained in cluster C121, the child cluster of cluster C12. Because the pattern differs from the classification pattern of cluster C121, a new cluster is generated. For the classification pattern (recovery) of the fourth classification target sentence of failure response record 6, there is no cluster to compare against, so a new cluster is generated. FIG. 30 shows the clusters obtained when failure response record 6 of FIG. 28 is added to the existing clusters of FIG. 27.
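The layer-by-layer assignment of a new record to an existing cluster tree can be sketched as follows. This is a minimal illustration that assumes exact pattern matching and one classification pattern per layer; the `Cluster` class and `add_record` function are hypothetical names, and the pattern attached to cluster C121 in the usage note is invented for the example rather than read from FIG. 27.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    pattern: tuple                                  # classification pattern of this cluster
    records: list = field(default_factory=list)    # identifiers of associated texts
    children: list = field(default_factory=list)   # clusters of the next layer


def add_record(roots, record_id, patterns):
    """Descend one layer per pattern, reusing a cluster on exact match, else creating one."""
    siblings = roots
    path = []
    for pattern in patterns:
        cluster = next((c for c in siblings if c.pattern == pattern), None)
        if cluster is None:
            # No sibling has this pattern (or there are no siblings): new cluster.
            cluster = Cluster(pattern)
            siblings.append(cluster)
        cluster.records.append(record_id)
        path.append(cluster)
        siblings = cluster.children
    return path
```

For instance, if C12 has the pattern `("<modem failure>",)` and its child C121 is assumed to have `("<modem switch>", "ON")`, then adding a record whose third pattern is `("<modem switch>", "OFF")` reuses C1 and C12, creates a sibling of C121, and creates a further new cluster below it for `("recovery",)`.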

[0102] In the above description, a sentence is assigned to the same cluster only when the classification patterns match completely. Alternatively, a notion of "similarity" may be introduced so that a sentence is assigned to the same cluster when the similarity of the classification patterns is high. For example, the similarity may be defined as (number of matching classification elements) / (number of classification elements listed in the classification pattern), and a sentence whose similarity is at or above a certain value may be regarded as matching the corresponding classification pattern.
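The similarity ratio described here can be written out as a short sketch. It assumes that the denominator counts the elements listed in the cluster's classification pattern, which is one reading of the text; the function names and the default threshold of 0.5 are hypothetical.

```python
def similarity(sentence_pattern, cluster_pattern):
    """(matching classification elements) / (elements listed in the cluster's pattern)."""
    if not cluster_pattern:
        return 0.0
    matched = sum(1 for element in cluster_pattern if element in sentence_pattern)
    return matched / len(cluster_pattern)


def matches(sentence_pattern, cluster_pattern, threshold=0.5):
    """Regard the sentence as matching when the similarity reaches the threshold."""
    return similarity(sentence_pattern, cluster_pattern) >= threshold
```

Under this reading, (<modem switch>, OFF) compared with a cluster pattern (<modem switch>, ON) shares one of two elements and scores 0.5, so whether it joins that cluster depends entirely on the chosen threshold.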

[0103] When the cluster generation processing of step ST18 in FIG. 3 is completed as described above, the cluster generation means 19 sorts the clusters according to the number of cases each cluster contains (step ST19). The output means 20 displays the clustering result sorted by the cluster generation means 19 (step ST20). FIG. 31 is a display example showing a tree-structured clustering result. Each cluster is associated not only with classification target sentences (descriptions of symptoms) but also with reference target sentences (descriptions of actions) and the entire text.
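The sort of step ST19 can be illustrated with a small sketch. It assumes that clusters are ordered from the most cases to the fewest and that the same ordering is applied within every layer of the tree; the dictionary layout and the name `sort_by_case_count` are hypothetical.

```python
def sort_by_case_count(clusters):
    """Return a copy of the cluster tree with every layer ordered by descending case count."""
    ordered = sorted(clusters, key=lambda c: len(c["cases"]), reverse=True)
    return [
        {**c, "children": sort_by_case_count(c.get("children", []))}
        for c in ordered
    ]
```

Because Python's `sorted` is stable, clusters with the same number of cases keep their original relative order, which keeps the display deterministic between runs.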

[0104] By checking the contents of the clusters in order from the top of the displayed hierarchy, the user can search for the desired case. The user can also refer to the "action" taken for the "symptom" that a cluster represents. For example, for the "symptom" of cluster C2, "modem monitoring device, MINOR alarm", referring to the "action" sentence associated with C2, "Contacted the customer, but the person in charge was absent", provides guidance on the action the operator should take. Furthermore, because past cases in which the same "symptom" occurred can be referenced, the operator is supported in solving the problem when a failure occurs.

[0105] The user can edit the clustering result using the editing means 21 (step ST21). FIG. 32 is an example of the editing screen, a screen for assigning one newly generated text to the existing case base 23. In the figure, 101 is an OK button, 102 is an NG button, 103 is a clustering result display unit, 104 is a classification target sentence display unit, 105 is a sentence tag display unit, and 106 is a text display unit.

[0106] The clustering result display unit 103 shows how the new text has been assigned to the hierarchy in the existing case base 23. The hatched clusters in FIG. 32 are newly added clusters, and they are displayed together with the sibling clusters of the existing cluster to which the text was associated. If the program's determination is wrong in the classification target sentence extraction of step ST12 or the reference target sentence extraction of step ST13, it can be corrected by editing the sentence tag display unit 105 and assigning the correct sentence tag, "symptom" or "action". Classification target sentences tagged "symptom" are displayed in the classification target sentence display unit 104.

[0107] If the program has associated a sentence with the wrong existing cluster in the cluster generation processing of step ST18, the user drags and drops the sentence displayed in the classification target sentence display unit 104 onto the desired cluster, as shown in FIG. 33. The child clusters of the cluster thus associated are then recomputed according to the flow of FIG. 24, and the association is performed again.

[0108] When the OK button 101 is clicked, the display of the clustering result display unit 103 is treated as correct, and the clustering result is registered in the case base 23 in step ST22 of FIG. 5. When the NG button 102 is clicked, the display of the clustering result display unit 103 is treated as incorrect, and the result is not registered in the case base 23 but saved in a temporary storage file (not shown). The temporary storage file holds texts 11 whose registration in the case base 23 has been deferred, and it is used later when registration is performed in a batch.

[0109] The above is the processing for adding cases one at a time, each time a text such as a failure response record is generated. When the classification pattern of the text to be added is assigned to an existing cluster, the similarity is defined by the following expression.
Similarity = (number of matching classification elements) / (number of classification elements listed in the classification pattern)
If the similarity is higher than a preset threshold T1 in every layer, the cluster assignment may be judged highly reliable and the result registered automatically in the case base 23, skipping the display and editing of the clustering result. Display and editing are then performed only in the other cases.

[0110] Likewise, when the classification pattern of the text to be added is assigned to an existing cluster and the similarity is lower than a preset threshold T2, the cluster assignment may be judged unreliable. In that case, the display of the clustering result in step ST20 and its editing in step ST21 are skipped, and the text is automatically saved in the temporary storage file and registered in the case base 23 later in a batch.
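The two thresholds suggest a three-way routing decision, sketched below. This is an illustration only: it assumes one similarity value per layer, that a single layer falling below T2 is enough to defer the text, and that T2 is smaller than T1. The function name, the return labels, and the default threshold values are hypothetical.

```python
def route(similarities, t1=0.8, t2=0.3):
    """Decide how to handle a newly assigned text from its per-layer similarities.

    Returns "register" (store in the case base without display or editing),
    "defer" (save to the temporary storage file for later batch registration),
    or "review" (display the result and let the user edit it).
    """
    if not similarities:
        return "review"            # nothing to judge: leave the decision to the user
    if all(s > t1 for s in similarities):
        return "register"          # reliable in every layer
    if any(s < t2 for s in similarities):
        return "defer"             # assumed: one unreliable layer is enough to defer
    return "review"
```

For example, `route([0.9, 1.0])` returns `"register"`, `route([0.9, 0.2])` returns `"defer"`, and `route([0.9, 0.5])` returns `"review"`, so only the middle band of assignments reaches the editing screen.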

[0111] In Embodiment 1, the hierarchy was created with the "symptom" descriptions as classification target sentences and the "action" descriptions as reference target sentences. Alternatively, both the "symptom" and "action" descriptions may be treated as classification target sentences, and the hierarchy may be displayed so that the two can be distinguished when the clustering result is shown, for example by changing the cluster color. Also, instead of displaying every "symptom" and "action", only "symptoms" may be displayed at the upper levels of the hierarchy and "actions" from a certain level downward.

[0112] The case base construction program that implements the operation of the case base construction apparatus described above may be recorded on a computer-readable recording medium such as a floppy disk or CD-ROM and executed from it, with the same effects.

[0113] As is clear from the above, Embodiment 1 provides the term extraction means 17, which extracts terms contained in the text 11, and the classification pattern generation means 18, which generates a classification pattern of the text 11 from those terms, and clusters the text 11 on the basis of the classification pattern. Classification can therefore be performed on the basis of words that share a common concept suited to the user's purpose, so that the desired case can be found efficiently. Moreover, there is no need to specify individual keywords serving as classification criteria. Even a word that the user does not know to be present in the documents can serve as a classification keyword, so a case base construction method capable of classifying the target documents can be provided.

[0114]

[Effects of the Invention] As described above, the present invention provides a term extraction step of extracting terms contained in a text and a classification pattern generation step of generating a classification pattern of the text from those terms, and clusters the text on the basis of the classification pattern. Classification can therefore be performed on the basis of words that share a common concept suited to the user's purpose, so that the desired case can be found efficiently. In addition, the keywords serving as classification criteria are extracted automatically from the documents, so the documents can be classified without specifying individual keywords.

[0115] According to the present invention, the synonymy of the terms extracted in the term extraction step is determined from the similarity of their notation, and notational variation among synonyms is eliminated, which improves the accuracy of the clustering result.

[0116] According to the present invention, the synonymy of the terms extracted in the term extraction step is determined by referring to a synonym dictionary, and notational variation among synonyms is eliminated, so that variation in terms that cannot be resolved by rule-based conversion can also be eliminated.

[0117] According to the present invention, clustering is performed with reference to an ontology dictionary describing the conceptual relations of the vocabulary, so that when clusters are generated, differing classification patterns can be regarded as identical and appropriate clustering can be performed.

[0118] According to the present invention, an IS-A dictionary describing the relations between superordinate and subordinate concepts of words is used as the ontology dictionary, so that clustering can be performed while recognizing, for example, that "modem failure" and "high-speed modem failure" are failures of the same kind.

[0119] According to the present invention, a HAS-A dictionary describing the relations between higher-level and lower-level components of equipment is used as the ontology dictionary, so that clustering can be performed while recognizing, for example, that "router switch" and "power switch" refer to the same object.

[0120] According to the present invention, a classification target sentence extraction step is provided that extracts, from the text input in the input step, the classification target sentences to be processed by the term extraction step, the classification pattern generation step, and the cluster generation step. Classification can therefore be limited to sentences with a particular meaning, such as "symptom" or "action", so that cases can be searched efficiently according to the user's purpose.

[0121] According to the present invention, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, clusters are generated in order starting from the first classification target sentence of the text to build a hierarchy. The user can therefore check the cluster contents in order from the top of the hierarchy and find the desired case efficiently.

[0122] According to the present invention, when the classification pattern of a text is generated, a classification pattern description table describing the vocabulary patterns that serve as classification criteria is referenced, and this table describes the vocabulary patterns separately for each level of the hierarchy. The criterion by which each level is clustered can therefore be specified, so that cases can be searched efficiently according to the user's purpose.

[0123] According to the present invention, a display step is provided that displays the clusters generated in the cluster generation step in descending order of the number of texts belonging to them. The user can therefore start by checking the clusters that contain the most texts and find the desired case efficiently.

[0124] According to the present invention, when a cluster to be displayed is specified, the full text of the texts assigned to that cluster is displayed, so that the user can view the entire text directly from a cluster while checking the cluster contents in sequence.

[0125] According to the present invention, when a cluster to be displayed is specified, the reference target sentences assigned to that cluster are displayed, so that information useful to the user can be presented, for example which "actions" can be taken when a certain "symptom" occurs.

[0126] According to the present invention, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, identification information identifying the type of each classification target sentence is displayed, so that a case base can be built in which, for example, the causal relation between "symptom" and "action" can be seen at a glance.

[0127] According to the present invention, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, arbitrary classification target sentences are displayed at the initial levels of the hierarchy and other classification target sentences are displayed from an intermediate level onward. A case base can therefore be built that, for example, presents the "action" to be taken after the "symptom" has been narrowed down to some extent.

[0128] According to the present invention, clustering is performed while the classification target sentences extracted in the classification target sentence extraction step are associated with the existing clusters stored in the case base, so that new texts can be added consistently with the existing classification scheme.

[0129] According to the present invention, an editing step for editing the clusters to be stored in the case base is provided, so that even when the program classifies incorrectly, the result can be corrected before being registered in the case base.

[0130] According to the present invention, when a cluster generated in the cluster generation step is stored in the case base and its similarity to an existing cluster stored in the case base is higher than a predetermined threshold, the display processing of the display step and the editing processing of the editing step are skipped, which reduces the confirmation workload during case base construction.

[0131] According to the present invention, when a cluster generated in the cluster generation step is stored in the case base and its similarity to an existing cluster stored in the case base is lower than a predetermined threshold, the display processing of the display step and the editing processing of the editing step are skipped, and the cluster is saved in a temporary storage file instead of being stored in the case base. The user is thus freed from the burden of adding a case every time a text such as a failure response record is generated, and the case base can be built efficiently.

[0132] According to the present invention, term extraction means for extracting terms contained in a text and classification pattern generation means for generating a classification pattern of the text from those terms are provided, and the text is clustered on the basis of the classification pattern. Classification can therefore be performed on the basis of words that share a common concept suited to the user's purpose, so that the desired case can be found efficiently. In addition, the keywords serving as classification criteria are extracted automatically from the documents, so the documents can be classified without specifying individual keywords.

[0133] According to the present invention, a term extraction processing procedure for extracting terms contained in a text and a classification pattern generation processing procedure for generating a classification pattern of the text from those terms are provided, and the text is clustered on the basis of the classification pattern. Classification can therefore be performed on the basis of words that share a common concept suited to the user's purpose, so that the desired case can be found efficiently. In addition, the keywords serving as classification criteria are extracted automatically from the documents, so the documents can be classified without specifying individual keywords.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing a case base construction apparatus according to Embodiment 1 of the present invention.

FIG. 2 is an explanatory diagram showing an example of a basic word dictionary.

FIG. 3 is an explanatory diagram showing an example of a term pattern description table.

FIG. 4 is an explanatory diagram showing an example of a classification pattern description table.

FIG. 5 is a flowchart showing a case base construction method according to Embodiment 1 of the present invention.

FIG. 6 is an explanatory diagram showing an example of a text.

FIG. 7 is an explanatory diagram showing an example of a sentence tag determination table.

FIG. 8 is an explanatory diagram showing a failure response record to which sentence tags have been added.

FIG. 9 is an explanatory diagram showing the word boundaries obtained by performing morphological analysis on the classification target sentences of FIG. 8.

FIG. 10 is an explanatory diagram showing the format of the term pattern description table in BNF notation.

FIG. 11 is an explanatory diagram showing a list of character types.

FIG. 12 is an explanatory diagram showing an example of a morpheme description.

FIG. 13 is a flowchart showing the matching process between a morphological analysis result and a pattern definition part.

FIG. 14 is a flowchart showing the matching process of step ST35.

FIG. 15 is a flowchart showing the determination process of step ST34.

FIG. 16 is a flowchart showing the evaluation process of step ST71.

FIG. 17 is an explanatory diagram showing the correspondence between a morphological analysis result and a pattern definition part.

FIG. 18 is an explanatory diagram showing the correspondence between a morphological analysis result and a pattern definition part.

FIG. 19 is an explanatory diagram showing the terms extracted from the failure response record of FIG. 9.

FIG. 20 is an explanatory diagram showing an example of a synonym dictionary.

FIG. 21 is an explanatory diagram showing the term extraction result after notational variation has been eliminated.

FIG. 22 is an explanatory diagram showing an example of how "string", "concept", and "subconcept" are used.

FIG. 23 is an explanatory diagram showing the classification patterns generated for the morphological analysis result of FIG. 9.

FIG. 24 is a flowchart showing the cluster generation process.

FIG. 25 is an explanatory diagram showing an example of an IS-A dictionary.

FIG. 26 is an explanatory diagram showing an example of a HAS-A dictionary.

FIG. 27 is an explanatory diagram showing a cluster generation result.

FIG. 28 is an explanatory diagram showing a failure response record to be added.

FIG. 29 is an explanatory diagram showing various processing results.

FIG. 30 is an explanatory diagram showing the cluster generation result after a failure response record has been added.

FIG. 31 is an explanatory diagram showing a display example of a tree-structured clustering result.

FIG. 32 is an explanatory diagram showing an example of an editing screen.

FIG. 33 is an explanatory diagram showing the cluster editing process.

FIG. 34 is a configuration diagram showing a conventional case base construction apparatus.

FIG. 35 is a flowchart showing the processing of a document classification unit.

FIG. 36 is an explanatory diagram showing words serving as classification criteria.

FIG. 37 is an explanatory diagram showing words serving as classification criteria.

[Explanation of symbols]

11: text, 12: basic word dictionary, 13: term pattern description table, 14: classification pattern description table, 15: input means, 16: sentence analysis means, 17: term extraction means, 18: classification pattern generation means, 19: cluster generation means, 20: output means, 21: editing means, 22: control means, 23: case base, 101: OK button, 102: NG button, 103: clustering result display unit, 104: classification target sentence display unit, 105: sentence tag display unit, 106: text display unit.

Continuation of front page: (72) Inventor: Katsushi Suzuki, c/o Mitsubishi Electric Corporation, 2-2-3 Marunouchi, Chiyoda-ku, Tokyo. F-terms (reference): 5B075 ND03 ND20 ND35 NK31 NK34 NK35 NK54 NR02 NR12 NR20 PR06 QP03 UU05 UU40

Claims

1. A case base construction method comprising: an input step of inputting a text; a term extraction step of extracting terms included in the text; a classification pattern generation step of generating a classification pattern of the text from the terms extracted in the term extraction step; a cluster generation step of clustering the text based on the classification pattern generated in the classification pattern generation step; and an output step of storing the clusters generated in the cluster generation step in a case base.

2. The case base construction method according to claim 1, wherein, when generating the classification pattern of the text, the classification pattern generation step determines synonyms of the terms extracted in the term extraction step from the similarity of their notation, and eliminates notational variations among the synonyms.

3. The case base construction method according to claim 1, wherein, when generating the classification pattern of the text, the classification pattern generation step refers to a synonym dictionary to determine synonyms of the terms extracted in the term extraction step, and eliminates notational variations among the synonyms.

4. The case base construction method according to claim 1, wherein the cluster generation step performs clustering with reference to an ontology dictionary in which conceptual relationships between vocabulary items are described.

5. The case base construction method according to claim 4, wherein the cluster generation step uses, as the ontology dictionary, an IS-A dictionary in which the relationship between a superordinate concept and a subordinate concept of a word is described.

6. The case base construction method according to claim 4, wherein the cluster generation step uses, as the ontology dictionary, a HAS-A dictionary in which the relationship between a higher-order configuration and a lower-order configuration of a device is described.

7. The case base construction method according to any one of claims 1 to 6, further comprising a classification target sentence extraction step of extracting, from the text input in the input step, a classification target sentence to be processed in the term extraction step, the classification pattern generation step, and the cluster generation step.

8. The case base construction method according to claim 7, wherein, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, the cluster generation step generates clusters in order from the first classification target sentence of the text, thereby creating a hierarchical structure.

9. The case base construction method according to claim 8, wherein, when generating the classification pattern of the text, the classification pattern generation step refers to a classification pattern description table in which vocabulary patterns serving as classification references are described, and the classification pattern description table describes a vocabulary pattern for each level of the hierarchy.

10. The case base construction method according to claim 1, further comprising a display step of displaying the clusters generated in the cluster generation step in order, starting from the cluster to which the largest number of texts belong.

11. The case base construction method according to claim 10, wherein, when a cluster to be displayed is designated, the display step displays the full text of each text assigned to that cluster.

12. The case base construction method according to claim 10, wherein, when a cluster to be displayed is designated, the display step displays the reference target sentence assigned to that cluster.

13. The case base construction method according to claim 10, wherein, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, the display step displays identification information identifying the type of each classification target sentence.

14. The case base construction method according to claim 10, wherein, when a plurality of classification target sentences are extracted from one text in the classification target sentence extraction step, the display step displays an arbitrary classification target sentence at the initial level of the hierarchical structure and displays the other classification target sentences from the intermediate levels of the hierarchical structure.

15. The case base construction method according to any one of claims 7 to 14, wherein the cluster generation step executes clustering while associating the classification target sentence extracted in the classification target sentence extraction step with an existing cluster stored in the case base.

16. The case base construction method according to claim 1, further comprising an editing step of editing a cluster stored in the case base.

17. The case base construction method according to claim 16, wherein, when storing a cluster generated in the cluster generation step in the case base, the output step omits the display processing of the display step and the editing processing of the editing step if the similarity between the cluster and an existing cluster stored in the case base is higher than a predetermined threshold.

18. The case base construction method according to claim 16, wherein, when storing a cluster generated in the cluster generation step in the case base, if the similarity between the cluster and an existing cluster stored in the case base is lower than a predetermined threshold, the output step omits the display processing of the display step and the editing processing of the editing step and stores the cluster in a temporary storage file instead of the case base.

19. A case base construction apparatus comprising: input means for inputting a text; term extraction means for extracting terms included in the text; classification pattern generation means for generating a classification pattern of the text from the terms extracted by the term extraction means; cluster generation means for clustering the text based on the classification pattern generated by the classification pattern generation means; and output means for storing the clusters generated by the cluster generation means in a case base.

20. A recording medium on which is recorded a case base construction program having: an input processing procedure for inputting a text; a term extraction processing procedure for extracting terms included in the text; a classification pattern generation processing procedure for generating a classification pattern of the text from the terms extracted by the term extraction processing procedure; a cluster generation processing procedure for clustering the text based on the classification pattern generated by the classification pattern generation processing procedure; and an output processing procedure for storing the clusters generated by the cluster generation processing procedure in a case base.
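
Illustrative note (not part of the claims): the sequence of steps recited in claim 1 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the implementation described in the specification. The vocabulary `CANONICAL_TERMS`, the function names, and the choice of a sorted tuple of normalized terms as the "classification pattern" are all hypothetical; the specification instead relies on a term pattern description table and a classification pattern description table.

```python
from collections import defaultdict

# Hypothetical vocabulary mapping surface forms to canonical terms.
# It stands in for the dictionaries and tables of the specification.
CANONICAL_TERMS = {
    "printer": "printer",
    "print": "print",
    "printing": "print",
    "cannot": "cannot",
    "can't": "cannot",
    "monitor": "monitor",
    "display": "monitor",
    "blank": "blank",
}


def extract_terms(text):
    """Term extraction step: keep only tokens found in the vocabulary."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    return [CANONICAL_TERMS[t] for t in tokens if t in CANONICAL_TERMS]


def classification_pattern(terms):
    """Classification pattern generation step (assumed form):
    an order-independent, duplicate-free tuple of canonical terms."""
    return tuple(sorted(set(terms)))


def build_case_base(texts):
    """Cluster generation and output steps: texts that share a
    classification pattern are placed in the same cluster."""
    case_base = defaultdict(list)
    for text in texts:
        pattern = classification_pattern(extract_terms(text))
        case_base[pattern].append(text)
    return dict(case_base)


texts = [
    "The printer cannot print.",
    "Printing can't be done on the printer.",
    "The display is blank.",
]
case_base = build_case_base(texts)
for pattern, members in case_base.items():
    print(pattern, len(members))
```

In this sketch the two printer inquiries fall into one cluster because their surface variants ("printing", "can't") are normalized before the pattern is formed, which is loosely analogous to the elimination of notational variations in claims 2 and 3.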