JP2007102657A - Community analysis system, community analysis method, and computer program

Info

Publication number: JP2007102657A
Application number: JP2005294344A
Authority: JP
Inventors: Takashi Sonoda; 隆志園田; Noriyuki Kurabayashi; 則之倉林; Masakazu Fujimoto; 正和藤本; Nobuhiro Yamazaki; 伸宏山崎; Yuichi Ueno; 裕一上野; Keiichi Nemoto; 啓一根本; Atsushi Ito; 敦伊東
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2005-10-07
Filing date: 2005-10-07
Publication date: 2007-04-19

Abstract

PROBLEM TO BE SOLVED: To provide a system and a method that perform community analysis by analyzing an access log for an electronic document storage unit.

SOLUTION: The system records user IDs and document IDs as access information for an electronic document storage unit in association with the date and time of each access. Based on this access log, it selects and extracts different users who accessed the same document as users having a community relationship, or extracts different documents accessed by the same user as related documents. For example, the system determines a set of users interested in the same document to be users in a community relationship, and can efficiently create network diagrams and analyze communities based on these user sets.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a community analysis device, a community analysis method, and a computer program that perform community analysis, such as analysis of person-to-person relationships, by analyzing an access log for electronic documents stored in, for example, a database.

Conventionally, to investigate the characteristics of organizational activities, methods have been used that record and analyze communication in various forms, such as contact between members of an organization by in-house mail and verbal reports. For example, in the technique called social network analysis described in Non-Patent Document 1, the network among the members of an organization is represented by a network diagram. This reveals member roles and dependencies among members that are not shown in the personnel organization chart. However, because this kind of network analysis proceeds by having an analyst interview and survey the members of the organization, it is time-consuming. In addition, the analysis depends on the analyst's experience.

In recent years, with the development of the Internet, much communication has come to be carried out by e-mail. E-mail is exchanged through a computer called a mail server, which manages the sending and receiving of e-mail. The mail server retains communication records (e-mail logs), and using these makes network analysis easy to perform.

Non-Patent Document 2 proposes a network analysis technique that uses such e-mail logs. Network analysis represents who has communicated with whom as a graph and analyzes an organization's communication by examining the structure of that graph.

A technique for analyzing organizational communication by examining the structure of a graph is described with reference to FIG. 16. In FIG. 16, a communication participant is represented by a node 11, and communication between nodes 11 is represented by a link 12. The number of communications is represented by the thickness of the link 12.

The network created from the access log is a superposition of various networks. That is, the entire network (A) shown in FIG. 16 is a superposition of various networks. To apply network analysis techniques, these networks must be separated. For example, as shown in FIG. 16, networks separated on the basis of information about each node are obtained, such as (B1) a network corresponding to the organization chart, (B2) a general affairs network, and (B3) a network of employees who joined the company in the same year.

Important indicators in network analysis include the in-degree and out-degree of a node (a communication participant), the distance between two points, and betweenness centrality. The number of links pointing to a node is called its in-degree, and the number of links leaving a node is called its out-degree. The distance between two points is the number of people (nodes) that must be passed through to reach one node from another by following nodes that exchanged e-mail directly. Betweenness centrality represents the degree to which information is conveyed when a given node is removed.
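The indicators described above (the counts of links entering and leaving a node, and the distance between two nodes) can be illustrated with a minimal Python sketch. The link list, the function names, and the choice to measure distance as a hop count over undirected links are assumptions made for illustration; they are not taken from the source.

```python
from collections import defaultdict, deque

# Hypothetical directed e-mail links (sender, receiver); not data from the source.
links = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]

def degrees(links):
    """Return (in_degree, out_degree) dictionaries for a directed link list."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for src, dst in links:
        out_deg[src] += 1  # link leaving src
        in_deg[dst] += 1   # link entering dst
    return dict(in_deg), dict(out_deg)

def distance(links, start, goal):
    """Breadth-first search over the links, treated as undirected.

    Returns the number of hops from start to goal, or None if unreachable.
    Counting hops (rather than intermediate nodes) is an assumption here.
    """
    adjacent = defaultdict(set)
    for a, b in links:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for nxt in adjacent[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None

in_deg, out_deg = degrees(links)
```

Betweenness centrality is omitted from this sketch because the source gives only a one-sentence characterization of it.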

Patent Document 1 discloses a technique that stores history information on users' message exchanges and determines a user's suitability for a role in group activity from the number of people to whom the user replied by e-mail, the number of people from whom the user received replies, the total amount (length) of messages posted, and the number of threads to which the user posted.
Patent Document 1: JP 2003-216785 A
Non-Patent Document 1: Laurence Prusak and Rob Cross, translated by Naohisa Nishi, "Social Network: A Source of Organizational Vitality," DIAMOND Harvard Business Review, October 2002, pp. 96-107
Non-Patent Document 2: Yuki Yasuda, "Network Analysis," Shin-yo-sha, 1997

Thus, the use of electronic data makes it possible to analyze even large volumes of data. However, some electronic data, unlike e-mail, does not directly represent communication.

For example, an electronic filing system that stores electronic documents records information on the storage and viewing of those documents. This data is useful for analyzing the activities of people in an organization. Although it does not directly describe relationships between people, a network graph can be created by relating people who accessed the same document, and this graph can be analyzed with network analysis techniques. However, a network described by access relationships may contain various kinds of networks, and these must be separated. They can be separated by conducting interviews or by analyzing the body text of the accessed documents, but the simplicity of the processing is then lost. Furthermore, examining the contents of accessed documents may violate the privacy of the organization's members, which also makes this approach difficult to carry out.

The present invention has been made in view of the above circumstances. Its object is to provide a community analysis device, a community analysis method, and a computer program that realize community analysis from electronic document access records, in which person-to-person relationships are not directly described, while preserving the privacy of the members under study.

A first aspect of the present invention is a community analysis device comprising: an access log storage unit that records, as access information for an electronic document storage unit, a user ID identifying the accessing user and a document ID identifying the accessed document in association with access date and time information; and an access log analysis unit that analyzes the access log recorded in the access log storage unit, wherein the access log analysis unit is configured to select and extract, based on the access log recorded in the access log storage unit, different users who accessed the same document as users having a community relationship.

With this configuration, users interested in the same document can be efficiently selected and extracted based on the access log for the electronic document storage unit. A set of users interested in the same document is determined to be a set of users having a community relationship, and network diagrams can be created and communities analyzed efficiently based on these user sets.

Further, in one embodiment of the community analysis device of the present invention, the access log analysis unit is configured to generate a network diagram in which each user having the community relationship is a node and users having a community relationship are connected by links.

With this configuration, network diagrams can be created and communities analyzed efficiently by applying the information on sets of users interested in the same document, extracted based on the access log for the electronic document storage unit.
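The node-and-link structure described above can be sketched as follows. The input mapping, the function name `build_network`, and the use of the number of shared documents as a link weight are illustrative assumptions; the source states only that users become nodes and related users are joined by links.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: for each document ID, the set of user IDs that accessed it.
users_by_doc = {
    "Da": {"A", "B"},
    "Db": {"A", "B", "C"},
    "Dc": {"D"},
}

def build_network(users_by_doc):
    """Return (nodes, links) where links maps a user pair to a weight.

    The weight counts how many documents the pair has in common; this
    weighting is an assumption, not something the source specifies.
    """
    nodes = set()
    links = Counter()
    for users in users_by_doc.values():
        if len(users) < 2:
            continue  # a lone accessor yields no community relationship
        nodes.update(users)
        for pair in combinations(sorted(users), 2):
            links[pair] += 1
    return nodes, links

nodes, links = build_network(users_by_doc)
```

Sorting each user set before pairing keeps every link key in a canonical order, so ("A", "B") and ("B", "A") are never counted separately.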

Further, in one embodiment of the community analysis device of the present invention, the access log analysis unit is configured to select and extract, as users having a community relationship, different users who accessed the same document within a threshold interval, which is a predetermined time interval.

With this configuration, users who accessed the same document within a specified time are extracted based on the access log for the electronic document storage unit. It is therefore possible to select a set of users who showed interest in the same document within a specific period, such as a particular processing phase of a project, and to extract users with a deeper community relationship, create network diagrams, and analyze communities.
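A minimal sketch of the time-limited extraction described above follows. The sample log, the function name, and the interpretation of "within the threshold interval" as "the two access times differ by no more than the threshold" are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical access log entries: (access time, user ID, document ID).
log = [
    (datetime(2005, 10, 3, 9, 0), "A", "Da"),
    (datetime(2005, 10, 3, 9, 20), "B", "Da"),
    (datetime(2005, 10, 20, 9, 0), "C", "Da"),
]

def co_access_pairs_within(log, threshold):
    """Return user pairs that accessed the same document within `threshold`.

    Two accesses are treated as falling in the same interval when their
    times differ by at most `threshold` (an assumed interpretation).
    """
    pairs = set()
    for (t1, u1, d1), (t2, u2, d2) in combinations(log, 2):
        if d1 == d2 and u1 != u2 and abs(t2 - t1) <= threshold:
            pairs.add(tuple(sorted((u1, u2))))
    return pairs

close_pairs = co_access_pairs_within(log, timedelta(hours=1))
all_pairs = co_access_pairs_within(log, timedelta(days=30))
```

With a one-hour threshold only users A and B are related; widening the threshold to 30 days also relates user C to both. The pairwise comparison is quadratic in the number of log entries, which is acceptable for a sketch but would likely need grouping by document for a large log.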

Further, in one embodiment of the community analysis device of the present invention, the access log analysis unit is configured to set the threshold interval as a time interval in which the access density is high.

With this configuration, when users who accessed the same document within a specified time are extracted based on the access log for the electronic document storage unit, the specified time is set as a time interval in which the access density is high. Flexible analysis suited to the actual access activity can therefore be performed.
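This passage does not state how a high-density interval is identified. One possible reading, sketched below purely as an assumption, is to group the access times for a document into bursts whose consecutive gaps stay under a chosen limit and to take the burst containing the most accesses. The function name `densest_burst` and the `max_gap` parameter are hypothetical.

```python
from datetime import datetime, timedelta

def densest_burst(times, max_gap):
    """Return (start, end) of the burst with the most accesses, or None.

    A burst is a run of accesses in which each consecutive gap is at most
    `max_gap`. This burst-based definition of "high access density" is an
    assumption; the source does not define the criterion here.
    """
    times = sorted(times)
    if not times:
        return None
    bursts = [[times[0]]]
    for t in times[1:]:
        if t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)   # continue the current burst
        else:
            bursts.append([t])     # gap too large: start a new burst
    best = max(bursts, key=len)    # first burst wins on ties
    return best[0], best[-1]

# Hypothetical access times for one document.
times = [
    datetime(2005, 10, 3, 9, 0),
    datetime(2005, 10, 3, 9, 10),
    datetime(2005, 10, 3, 9, 25),
    datetime(2005, 10, 20, 9, 0),
    datetime(2005, 10, 20, 9, 5),
]
interval = densest_burst(times, timedelta(minutes=30))
```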

Further, a second aspect of the present invention is a community analysis device comprising: an access log storage unit that records, as access information for an electronic document storage unit, a user ID identifying the accessing user and a document ID identifying the accessed document in association with access date and time information; and an access log analysis unit that analyzes the access log recorded in the access log storage unit, wherein the access log analysis unit is configured to extract, based on the access log recorded in the access log storage unit, different documents accessed by the same user as related documents.

With this configuration, different documents accessed by the same user can be efficiently extracted as related documents based on the access log for the electronic document storage unit, and network diagrams can be created and communities analyzed efficiently based on this document relationship information.
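The related-document extraction described above mirrors the user extraction with the roles of user and document swapped. A minimal sketch, assuming a log of (time, user ID, document ID) tuples and a hypothetical function name:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical access log entries: (access time, user ID, document ID).
log = [
    ("2005-10-03 09:00", "A", "Da"),
    ("2005-10-03 09:30", "A", "Db"),
    ("2005-10-04 10:00", "B", "Db"),
    ("2005-10-04 10:15", "B", "Dc"),
]

def related_documents(log):
    """Return pairs of different documents accessed by the same user."""
    docs_by_user = defaultdict(set)
    for _time, user_id, doc_id in log:
        docs_by_user[user_id].add(doc_id)
    pairs = set()
    for docs in docs_by_user.values():
        # Sorting gives each pair a canonical order before it is stored.
        pairs.update(combinations(sorted(docs), 2))
    return pairs

related = related_documents(log)
```

Here Da and Db are related through user A, and Db and Dc through user B; Da and Dc are not directly related because no single user accessed both.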

Further, in one embodiment of the community analysis device of the present invention, the access log analysis unit is configured to extract, as related documents, different documents accessed by the same user within a threshold interval, which is a predetermined time interval.

With this configuration, different documents accessed by the same user within a specified time are extracted as related documents based on the access log for the electronic document storage unit. It is therefore possible to select related documents accessed within a specific period, such as a particular processing phase of a project.

Further, a third aspect of the present invention is a community analysis method comprising: an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID identifying the accessing user and a document ID identifying the accessed document in association with access date and time information; and an access log analysis step in which an access log analysis unit analyzes the access log recorded in the access log storage unit, wherein the access log analysis step selects and extracts, based on the access log recorded in the access log storage unit, different users who accessed the same document as users having a community relationship.

Further, in one embodiment of the community analysis method of the present invention, the community analysis method further comprises a step of generating a network diagram in which each user having the community relationship is a node and users having a community relationship are connected by links.

Further, in one embodiment of the community analysis method of the present invention, the access log analysis step selects and extracts, as users having a community relationship, different users who accessed the same document within a threshold interval, which is a predetermined time interval.

Further, in one embodiment of the community analysis method of the present invention, the access log analysis step sets the threshold interval as a time interval in which the access density is high.

Further, a fourth aspect of the present invention is a community analysis method comprising: an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID identifying the accessing user and a document ID identifying the accessed document in association with access date and time information; and an access log analysis step in which an access log analysis unit analyzes the access log recorded in the access log storage unit, wherein the access log analysis step extracts, based on the access log recorded in the access log storage unit, different documents accessed by the same user as related documents.

Further, in one embodiment of the community analysis method of the present invention, the access log analysis step extracts, as related documents, different documents accessed by the same user within a threshold interval, which is a predetermined time interval.

Further, a fifth aspect of the present invention is a computer program that causes a computer to execute community analysis processing, the program causing the computer to execute: an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID identifying the accessing user and a document ID identifying the accessed document in association with access date and time information; and an access log analysis step in which an access log analysis unit analyzes the access log recorded in the access log storage unit, wherein in the access log analysis step, different users who accessed the same document are selected and extracted as users having a community relationship based on the access log recorded in the access log storage unit.

With this configuration, users interested in the same document can be efficiently selected and extracted based on the access log for the electronic document storage unit. A set of users interested in the same document is determined to be a set of users having a community relationship, and network diagrams can be created and communities analyzed efficiently by a computer based on these user sets.

Further, a sixth aspect of the present invention is a computer program that causes a computer to execute community analysis processing, the program causing the computer to execute: an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID identifying the accessing user and a document ID identifying the accessed document in association with access date and time information; and an access log analysis step in which an access log analysis unit analyzes the access log recorded in the access log storage unit, wherein in the access log analysis step, different documents accessed by the same user are extracted as related documents based on the access log recorded in the access log storage unit.

With this configuration, different documents accessed by the same user can be efficiently extracted as related documents based on the access log for the electronic document storage unit, and network diagrams can be created and communities analyzed efficiently by a computer based on this document relationship information.

Note that the computer program of the present invention is a computer program that can be provided to, for example, a computer system capable of executing various program codes, by a storage medium or communication medium that supplies it in a computer-readable format, for example, a recording medium such as a CD, FD, or MO, or a communication medium such as a network. By providing such a program in a computer-readable format, processing corresponding to the program is realized on the computer system.

Other objects, features, and advantages of the present invention will become apparent from the more detailed description based on the embodiments of the present invention described later and the accompanying drawings. In this specification, a system is a logical collection of a plurality of devices, and the constituent devices are not necessarily housed in the same enclosure.

According to the configuration of one embodiment of the present invention, a user ID identifying the accessing user and a document ID identifying the accessed document are recorded, as access information for the electronic document storage unit, in association with access date and time information, and different users who accessed the same document are selected and extracted as users having a community relationship based on this access log. Users interested in the same document can therefore be efficiently selected and extracted. A set of users interested in the same document is determined to be a set of users having a community relationship, and network diagrams can be created and communities analyzed efficiently based on these user sets.

Further, according to the configuration of one embodiment of the present invention, different documents accessed by the same user are extracted as related documents based on the access log for the electronic document storage unit recorded in the access log storage unit. Related documents can therefore be extracted efficiently, and network diagrams can be created and communities analyzed efficiently based on this document relationship information.

Details of the community analysis device, community analysis method, and computer program of the present invention are described below with reference to the drawings.

First, an outline of the community analysis of the present invention is given. The community analysis of the present invention analyzes, for example, an access log for data files stored in a database accessible via a network.

For example, access log entries showing access to the same file are analyzed, and relationships between users, that is, between people, are classified according to the interval between access times. For example, users who access the same document are analyzed to create a network, and community analysis is performed using network analysis techniques.

The network analysis techniques presented in general network analysis textbooks are, in many cases, techniques aimed at idealized networks. For example, they present analyses of networks in which an organization contains multiple communities but the communities do not overlap. In a real organization, however, the organization shown on the organization chart coexists and overlaps with other communities, such as a community of new employees and a community of the general affairs staff belonging to each group on the chart. It is often difficult to apply conventional network analysis techniques directly to such networks.

In the present invention, analysis is performed on the assumption that people belonging to the same community, such as a community of new employees, access the same documents stored in the database in a concentrated manner during a certain period. For example, members of a community of employees who joined the company at the same time may access the materials for training held a certain number of years after joining. Based on this assumption, users who access the same document within a predetermined period are clustered and classified to create a network, which makes network analysis possible.

A configuration example of a network system serving as the execution environment for the community analysis of the present invention is described with reference to FIG. 1. The network system is, for example, a wide area network (WAN) built across multiple offices of a company, in which a plurality of user terminals (clients) 51 to 53 are connected by a network 60 as shown in FIG. 1. An electronic document filing device 70 is connected to the network 60, and the user terminals 51 to 53 can retrieve or view electronic documents stored in the electronic document filing device 70.

A community analysis device 100 is connected to the electronic document filing device 70. The community analysis device 100 analyzes the status of access to the electronic document filing device 70 from the user terminals (clients) 51 to 53.

FIG. 2 shows the configuration of the community analysis device 100. As shown in FIG. 2, the community analysis device 100 has an access log storage unit 101 and an access log analysis unit 102. The access log storage unit 101 records the log of access to the electronic document filing device 70 from the user terminals (clients) 51 to 53, and the access log analysis unit 102 analyzes the access log recorded in the access log storage unit 101.

The community analysis device 100 is implemented by an information processing device such as a PC, and the access log storage unit 101 is implemented by storage means such as a hard disk. The access log analysis unit 102 is implemented by a data processing unit having a CPU or the like that executes a computer program stored in a storage unit such as a ROM. Specific examples of the processing executed by the access log analysis unit 102 are described in detail below.

FIG. 3 shows an example of the data recorded in the access log storage unit 101. As shown in FIG. 3, the access log storage unit 101 records, as the log of access to the electronic document filing device 70 from the user terminals (clients) 51 to 53, data associating the access time, a user ID identifying the user who performed the access, and a document ID identifying the accessed document.
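The three fields of an access log entry (access time, user ID, document ID) can be modeled as a small record type. The sketch below is illustrative only: the class name `AccessRecord`, the comma-separated line format, and the timestamp format are assumptions, since the layout shown in FIG. 3 is not reproduced in this text.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class AccessRecord:
    """One access log entry: when, who, and which document."""
    accessed_at: datetime
    user_id: str
    doc_id: str

def parse_record(line: str) -> AccessRecord:
    """Parse a line of the assumed form 'YYYY-MM-DD HH:MM:SS, user, doc'."""
    timestamp, user_id, doc_id = (field.strip() for field in line.split(","))
    return AccessRecord(
        accessed_at=datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S"),
        user_id=user_id,
        doc_id=doc_id,
    )

# Hypothetical log line; not taken from the source.
record = parse_record("2005-10-03 09:00:00, A, Da")
```

Keeping only these three fields reflects the privacy aim stated earlier in the document: the analysis works from identifiers and times and never needs the document body.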

The access log analysis unit 102 analyzes the access log recorded in the access log storage unit 101. The following four processing examples of the access log analysis executed by the access log analysis unit 102 are described in order.
(Processing Example 1) Community analysis by analysis of users accessing a common document
(Processing Example 2) Community analysis by analysis of users accessing a common document within a specified time
(Processing Example 3) Community analysis by analysis of access to different documents by a common user
(Processing Example 4) Community analysis by analysis of access to different documents by the same user within a specified time

(Processing Example 1) Community analysis by analyzing the users who access a common document
First, as community analysis processing example 1, community analysis based on analyzing the users who access a common document is described. An example of the analysis executed by the access log analysis unit 102 in this processing example is described with reference to FIG. 4. The access log analysis unit 102 extracts, from the access logs recorded in the access log storage unit 101, multiple users who accessed the same document, and determines that these users are in a community relationship.

As shown in FIG. 4, when the access log storage unit 101 contains access logs indicating that user A and user B, who use the user terminals (clients) 51 to 53 shown in FIG. 1, accessed the same document (document Da) stored in the electronic document filing device 70, the access log analysis unit 102 determines from this log information that user A and user B are likely to belong to the same community.

Specifically, the number of times that different users accessed the same document is counted, and the community analysis of users is performed based on the count. For example, from the access logs recorded in the access log storage unit 101, the number of accesses by user [U1] and user [U2] to common documents is counted as A(U1, U2). The larger the count A(U1, U2), the more likely users [U1] and [U2] are judged to belong to the same community.

In this way, the access log analysis unit 102 determines that a set of users with a high number of accesses to the same documents are users belonging to the same community. This analysis sequence is described with reference to the flowchart shown in FIG. 5.

First, in step S101, the access log analysis unit 102 extracts the logs for the analysis period from the access logs (see FIG. 3) recorded in the access log storage unit 101. The operator sets an arbitrary period as the analysis period in advance.
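The extraction in step S101 amounts to a filter on the access time. A minimal sketch follows, assuming a half-open period [start, end) and entries sorted into chronological order; both choices, and the names `LogEntry` and `extract_analysis_period`, are assumptions of this example rather than details given in the description.

```python
from datetime import datetime
from typing import NamedTuple


class LogEntry(NamedTuple):
    accessed_at: datetime
    user_id: str
    document_id: str


def extract_analysis_period(logs, start, end):
    """Step S101 (sketch): keep entries with start <= access time < end."""
    selected = [entry for entry in logs if start <= entry.accessed_at < end]
    # Sorting by time is an assumption; it makes log numbers follow access order.
    return sorted(selected, key=lambda entry: entry.accessed_at)


# Illustrative data only.
logs = [
    LogEntry(datetime(2005, 10, 4, 15, 0), "U3", "D1"),
    LogEntry(datetime(2005, 10, 3, 9, 0), "U1", "D1"),
    LogEntry(datetime(2005, 10, 3, 9, 20), "U2", "D1"),
]
october_3 = extract_analysis_period(logs, datetime(2005, 10, 3), datetime(2005, 10, 4))
```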

Next, in step S102, the variable [K] indicating the log number of the access log is set to its initial value K = 1. In step S103, the Kth log is selected, and in step S104, the user ID recorded in the Kth log is taken as [U1] and the document ID as [D1].

Next, in step S105, the variable [N], which determines the log number [K+N] of the comparison log, is initialized to N = 1. In step S106, the (K+N)th log is acquired, and in step S107, the user ID recorded in the (K+N)th log is taken as [U2] and the document ID as [D2].

In step S108, the user IDs of the Kth and (K+N)th logs are compared, and in step S109, their document IDs are compared. If step S108 finds that the user IDs of the two logs are identical, both entries were recorded for the same user, so the pair is not selected as data for analyzing communities between different users, and the process proceeds to step S111.

If the user IDs of the Kth and (K+N)th logs are not identical in step S108, the process proceeds to step S109, which determines whether the document IDs of the two logs are identical. If they are identical, the Kth and (K+N)th logs are determined to record accesses to the same document by different users. That is, they exhibit the relationship described with reference to FIG. 4.

In this case, the process proceeds to step S110, which increments by one the data [A(U1, U2)] indicating the number of accesses by user [U1] and user [U2] to the same document. That is, the following update is applied:
A(U1, U2) ← A(U1, U2) + 1

[A(U1, U2)] is the number of accesses by users U1 and U2 to the same documents. The higher this value, the higher the probability that users U1 and U2 are judged to belong to the same community.

In step S111, the variable [N], which determines the log number [K+N] of the comparison log, is incremented by one. That is, the next log entry becomes the log to be compared with the Kth log. Step S112 determines whether the log number [K+N] of the comparison log exceeds the maximum log number, that is, whether that log exists among the logs acquired for analysis. If [K+N] does not exceed the maximum log number, the process returns to step S106 and the comparison of the Kth log with the (K+N)th log is repeated.

That is, the user IDs and document IDs are compared, and if the two logs record accesses to the same document by different users, the data [A(U1, U2)] indicating the number of accesses by user [U1] and user [U2] to the same document is incremented by one.

If step S112 determines that the log number [K+N] of the comparison log exceeds the maximum log number, meaning that the comparison has reached the last log, the process proceeds to step S113, which increments the log number [K] of the comparison source by one. Step S114 then determines whether [K] exceeds the maximum log number, that is, whether it has passed the log number of the last log. If [K] does not exceed the maximum log number, the process returns to step S103 and the comparison of the Kth log with the (K+N)th log is repeated.

If step S114 determines that the log number [K] of the comparison source exceeds the maximum log number and no such log exists among the acquired logs, the comparison between all acquired logs is complete, and the access log analysis unit 102 ends the processing.
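The nested loop of steps S102 to S114 can be sketched as follows. The step numbers in the comments map to the flowchart of FIG. 5 as described above; the function name `count_common_document_accesses`, the 0-based indexing, and the use of a sorted tuple as an unordered pair key (so that A(U1, U2) and A(U2, U1) share one counter) are assumptions of this example.

```python
from collections import Counter
from typing import NamedTuple


class LogEntry(NamedTuple):
    user_id: str
    document_id: str


def count_common_document_accesses(logs):
    """Count, per user pair, the log pairs that hit the same document."""
    counts = Counter()
    for k in range(len(logs)):                 # S102, S113, S114: advance K
        u1, d1 = logs[k]                       # S103, S104
        for n in range(1, len(logs) - k):      # S105, S111, S112: advance N
            u2, d2 = logs[k + n]               # S106, S107
            if u1 == u2:                       # S108: same user, skip the pair
                continue
            if d1 != d2:                       # S109: different documents
                continue
            counts[tuple(sorted((u1, u2)))] += 1   # S110: A(U1, U2) += 1
    return counts


# Illustrative data only.
logs = [
    LogEntry("A", "Da"),
    LogEntry("B", "Da"),
    LogEntry("A", "Db"),
    LogEntry("B", "Db"),
    LogEntry("C", "Da"),
]
access_counts = count_common_document_accesses(logs)
```

Because every log is compared with every later log, the running time grows quadratically with the number of log entries in the analysis period.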

Through this log analysis, the access log analysis unit 102 can identify combinations of users with a high number of accesses to the same documents and obtain such sets of users. These user sets can be presumed to be users belonging to the same community. The processing flow described with reference to FIG. 5 yields the count A(Un, Um) of accesses to the same documents for each pair of users, and a network diagram is created from these counts. The access log analysis unit 102 generates a network diagram in which each user having a community relationship is a node and users having a community relationship are connected by links.

The larger the count A(Un, Um) of accesses to common documents, the stronger the tie between the users is judged to be. The network diagram is created by using A(Un, Um) as an index of the tie between users, that is, as an index for judging whether they belong to the same community. This method produces, for example, the network diagram shown in FIG. 6(a), in which a node 201 represents a user and a link 202 represents a tie between users (nodes).

Links are set based on the access count A(Un, Um) described above. For example, a link is set between users whose access count is equal to or greater than a preset threshold. A specific example of the network construction method is described in, for example, Laurence Prusak and Rob Cross, "Social Network: The Source of Organizational Vitality" (『ソーシャル・ネットワーク：組織活力の源泉』), DIAMOND Harvard Business Review, October 2002 issue, page 96. Applying this method, the network diagram is generated by using the access count A(Un, Um) between users, calculated by the log analysis described above, as an index of the tie between users, that is, as an index for judging whether they belong to the same community. This network diagram reveals the roles of members and the dependencies among members that the personnel organization chart does not show.
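The threshold rule for links can be sketched as follows. The pair counts are assumed to be a mapping from an unordered user pair to A(Un, Um); the names `build_links` and `build_adjacency`, the adjacency-set representation, and the sample threshold of 2 are assumptions of this example.

```python
def build_links(pair_counts, threshold):
    """Keep the user pairs whose access count A(Un, Um) is >= threshold."""
    return {pair for pair, count in pair_counts.items() if count >= threshold}


def build_adjacency(links):
    """Turn the set of links into a node -> neighbors mapping (the network)."""
    adjacency = {}
    for user_n, user_m in links:
        adjacency.setdefault(user_n, set()).add(user_m)
        adjacency.setdefault(user_m, set()).add(user_n)
    return adjacency


# Illustrative counts only.
pair_counts = {("A", "B"): 2, ("A", "C"): 1, ("B", "C"): 1}
links = build_links(pair_counts, threshold=2)
network = build_adjacency(links)
```

Users whose counts all fall below the threshold do not appear as nodes in this sketch; whether isolated users should still be drawn is left open here.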

Using such a network diagram, communities can be discovered by computing shortest-path betweenness, a measure used in network analysis. For example, as shown in FIG. 6(a), the shortest path A-C-B connecting user node A and user node B is detected by the shortest-path betweenness computation, and as shown in FIG. 6(b), community 1, to which user A belongs, and community 2, to which user B belongs, can be discovered. Each of these communities shares some common category, for example a community of employees who joined the company at the same time or a community of engineers developing similar products.
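The description does not say how shortest-path betweenness is computed. One common choice for unweighted graphs is Brandes' algorithm, sketched below as an assumption; it scores each node by the number of shortest paths between other node pairs that pass through it. The sample graph, with two triangles bridged by node C, is illustrative and is not the graph of FIG. 6.

```python
from collections import deque


def betweenness(adjacency):
    """Node betweenness for an undirected, unweighted graph (Brandes)."""
    scores = {v: 0.0 for v in adjacency}
    for source in adjacency:
        order = []                                  # nodes in BFS visiting order
        preds = {v: [] for v in adjacency}          # shortest-path predecessors
        sigma = {v: 0 for v in adjacency}           # number of shortest paths
        dist = {v: -1 for v in adjacency}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:                     # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:          # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):                   # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: score / 2 for v, score in scores.items()}


# Two tightly linked groups joined only through C (illustrative data).
edges = [("A", "A1"), ("A", "A2"), ("A1", "A2"), ("A", "C"),
         ("C", "B"), ("B", "B1"), ("B", "B2"), ("B1", "B2")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)
scores = betweenness(graph)
bridge = max(scores, key=scores.get)
```

In this sample, every shortest path between the two groups runs through C, so C receives the highest score. A node with a high score marks a boundary between groups, which corresponds to the role of node C on the path A-C-B in the description.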

(Processing Example 2) Community analysis by analyzing the users who access a common document within a specified time
Next, as community analysis processing example 2, community analysis based on analyzing the users who access a common document within a predetermined specified time is described. An example of the analysis executed by the access log analysis unit 102 in this processing example is described with reference to FIG. 7. The access log analysis unit 102 extracts, from the access logs recorded in the access log storage unit 101, multiple users who accessed the same document within the predetermined specified time, and determines that these users are in a community relationship.

That is, in the previous processing example, whether users are in a same-community relationship was decided solely by whether they accessed the same document, without considering time. In this processing example, a time element is added, and the decision is based on whether the same document was accessed within a specified time.

As shown in FIG. 7, assume that users U1 to U3 accessed the same document (document D1) stored in the electronic document filing device 70, and that access logs recording these accesses exist in the access log storage unit 101.

The access log analysis unit 102 performs analysis based on this log information. The difference t1 between the times at which user U1 and user U2 accessed document D1 is less than the threshold interval [Ia], a predetermined time interval, whereas the difference t2 between the times at which user U1 and user U3 accessed document D1 is equal to or greater than [Ia]. This time information is obtained from the log information (see FIG. 3) recorded in the access log storage unit 101.

In this case, the access log analysis unit 102 determines that user U1 and user U2 are likely to belong to the same community, and that user U1 and user U3 are unlikely to belong to the same community.

In this processing example as well, as in the previous one, the number of times that different users accessed the same document at a time interval shorter than the specified time (threshold interval Ia) is counted, and the community analysis of users is performed based on the count. For example, from the access logs recorded in the access log storage unit 101, the number of accesses by user [U1] and user [U2] to common documents made less than the specified time (threshold interval Ia) apart is counted as A(U1, U2). The larger this count A(U1, U2), the more likely users [U1] and [U2] are judged to belong to the same community.

In this way, the access log analysis unit 102 determines that a set of users with a high number of accesses to the same documents within the specified time are users belonging to the same community. This analysis sequence is described with reference to the flowchart shown in FIG. 8.

First, in step S201, the access log analysis unit 102 extracts the logs for the analysis period from the access logs (see FIG. 3) recorded in the access log storage unit 101. The operator sets an arbitrary period as the analysis period in advance.

Next, in step S202, the variable [K] indicating the log number of the access log is set to its initial value K = 1. In step S203, the Kth log is selected, and in step S204, the access time recorded in the Kth log is taken as [T1], the user ID as [U1], and the document ID as [D1].

Next, in step S205, the variable [N], which determines the log number [K+N] of the comparison log, is initialized to N = 1. In step S206, the (K+N)th log is acquired, and in step S207, the access time recorded in the (K+N)th log is taken as [T2], the user ID as [U2], and the document ID as [D2].

In step S208, the user IDs of the Kth and (K+N)th logs are compared; in step S209, their document IDs are compared; and in step S210, the difference [T2 - T1] between their access times is compared with the predetermined threshold interval [Ia]. That is, step S210 determines whether the difference between the access times of the two logs is less than the threshold interval [Ia].

If step S208 finds that the user IDs of the Kth and (K+N)th logs are identical, both entries were recorded for the same user, so the pair is not selected as data for analyzing communities between different users, and the process proceeds to step S212.

If the user IDs of the Kth and (K+N)th logs are not identical in step S208, the process proceeds to step S209, which determines whether the document IDs of the two logs are identical. If they are identical, the Kth and (K+N)th logs are determined to record accesses to the same document by different users.

In this case, the process further proceeds to step S210, which compares the difference [T2 - T1] between the access times of the Kth and (K+N)th logs with the predetermined threshold interval [Ia]. That is, it determines whether the following expression holds:
T2 - T1 < Ia
This processing determines whether the difference between the access times of the two logs is less than the threshold interval [Ia].

If the above expression holds, the process proceeds to step S211, which increments by one the data [A(U1, U2)] indicating the number of accesses by user [U1] and user [U2] to the same document within the specified time. That is, the following update is applied:
A(U1, U2) ← A(U1, U2) + 1
This case corresponds to the relationship between user U1 and user U2 shown in FIG. 7.

If, in step S210, the difference [T2 - T1] between the access times of the Kth and (K+N)th logs does not satisfy
T2 - T1 < Ia
that is, if the difference between the access times of the two logs is equal to or greater than the threshold interval [Ia], the users of the two logs did access the same document, but their accesses are separated by at least the predetermined threshold interval. The commonality between the users shown in these logs is therefore judged to be low, and the process proceeds to step S212 without executing the count-up processing of step S211. This case corresponds to the relationship between user U1 and user U3 shown in FIG. 7.

In this processing example, [A(U1, U2)] is the number of accesses by users U1 and U2 to the same documents within the specified time. The higher this value, the higher the probability that users U1 and U2 are judged to belong to the same community.

In step S212, the variable [N], which determines the log number [K+N] of the comparison log, is incremented by one. That is, the next log entry becomes the log to be compared with the Kth log. Step S213 determines whether the log number [K+N] of the comparison log exceeds the maximum log number, that is, whether that log exists among the logs acquired for analysis. If [K+N] does not exceed the maximum log number, the process returns to step S206 and the comparison of the Kth log with the (K+N)th log is repeated.

That is, the user IDs and document IDs of the two logs are compared, and the time difference [T2 - T1] between the logs is compared with the threshold interval [Ia]. If the two logs record accesses to the same document by different users within the threshold interval [Ia], the data [A(U1, U2)] indicating the number of accesses by user [U1] and user [U2] to the same document within the specified time is incremented by one.

If step S213 determines that the log number [K+N] of the comparison log exceeds the maximum log number, meaning that the comparison has reached the last log, the process proceeds to step S214, which increments the log number [K] of the comparison source by one. Step S215 then determines whether [K] exceeds the maximum log number, that is, whether it has passed the log number of the last log. If [K] does not exceed the maximum log number, the process returns to step S203 and the comparison of the Kth log, designated by the updated log number [K], with the (K+N)th log is repeated.

If step S215 determines that the log number [K] of the comparison source exceeds the maximum log number and no such log exists among the acquired logs, the comparison between all acquired logs is complete, and the access log analysis unit 102 ends the processing.
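The flow of steps S202 to S215 differs from processing example 1 only in the additional time check of step S210. A minimal sketch follows, assuming that the logs are sorted chronologically so that T2 - T1 is never negative; that sorting, the function name `count_accesses_within_interval`, and the sorted-tuple pair key are assumptions of this example. The sample data mirrors the situation of FIG. 7 with made-up times and a one-hour interval.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import NamedTuple


class LogEntry(NamedTuple):
    accessed_at: datetime
    user_id: str
    document_id: str


def count_accesses_within_interval(logs, interval):
    """Count, per user pair, same-document accesses less than `interval` apart."""
    ordered = sorted(logs, key=lambda entry: entry.accessed_at)  # assumption
    counts = Counter()
    for k, (t1, u1, d1) in enumerate(ordered):     # S202 to S204, S214, S215
        for t2, u2, d2 in ordered[k + 1:]:         # S205 to S207, S212, S213
            if u1 == u2:                           # S208: same user
                continue
            if d1 != d2:                           # S209: different documents
                continue
            if t2 - t1 < interval:                 # S210: T2 - T1 < Ia
                counts[tuple(sorted((u1, u2)))] += 1   # S211
    return counts


# Illustrative times only: U2 follows U1 by 20 minutes, U3 by 6 hours.
logs = [
    LogEntry(datetime(2005, 10, 3, 9, 0), "U1", "D1"),
    LogEntry(datetime(2005, 10, 3, 9, 20), "U2", "D1"),
    LogEntry(datetime(2005, 10, 3, 15, 0), "U3", "D1"),
]
windowed_counts = count_accesses_within_interval(logs, timedelta(hours=1))
```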

Through this log analysis, the access log analysis unit 102 can identify combinations of users with a high number of accesses to the same documents within the specified time and obtain such sets of users. These user sets can be presumed to be users belonging to the same community. The processing flow described with reference to FIG. 8 yields, for each pair of users, the count A(Un, Um) of accesses to the same documents within the specified time, and a network diagram is created from these counts.

The larger the count A(Un, Um) of accesses to common documents within the threshold interval [Ia], the stronger the tie between the users is judged to be. The network diagram is created by using A(Un, Um) as an index of the tie between users, that is, as an index for judging whether they belong to the same community. This method also produces a network diagram similar to the one described above with reference to FIG. 6.

Network analysis based on the count A(Un, Um) of accesses to common documents within the threshold interval [Ia] makes it possible to identify different communities such as those shown in FIG. 9, for example a community of section managers or a community of employees who joined the company at the same time. For example, analyzing the access logs for annual training materials can reveal communities of this kind, which are difficult to read directly from the organization chart.

In the processing example described above, as explained with reference to FIG. 7, the access log analysis unit 102 judges the relationship between users by comparing the difference between the times at which different users accessed the same document D1 with the threshold interval [Ia], and the threshold interval [Ia] was a predetermined fixed time. The threshold interval may instead be determined according to the access status of the document.

This processing example is described with reference to FIG. 10. FIG. 10(a) shows the processing when the threshold interval [Ia], a predetermined time interval, is applied as in the processing example described above. FIG. 10(b) shows a configuration in which the threshold interval is determined according to the access status of the document. In the figure, time runs from left to right, and each mark 301 indicates the timing of an access by a user. This time information is obtained from the log information (see FIG. 3) recorded in the access log storage unit 101.

This access history shows that accesses are concentrated between times T1 and T2. This period of high access density is used as the threshold interval [Ib] applied when judging the relationship between users. Specifically, a period whose access density is equal to or higher than a predetermined access density value is set as the threshold interval [Ib]. Setting the threshold interval [Ib] according to the access density in this way enables more flexible analysis.
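The description does not define how access density is measured. The sketch below is one possible reading, under stated assumptions: time is divided into fixed-width bins, a bin is considered dense when it holds at least `min_count` accesses, and the longest run of consecutive dense bins is taken as the period T1 to T2 whose length serves as [Ib]. The function name `dense_period`, the binning approach, and all parameter values are hypothetical.

```python
from datetime import datetime, timedelta


def dense_period(access_times, bin_width, min_count):
    """Return (start, end) of the longest run of dense bins, or None."""
    if not access_times:
        return None
    origin = min(access_times)
    per_bin = {}
    for t in access_times:
        index = (t - origin) // bin_width          # timedelta // timedelta -> int
        per_bin[index] = per_bin.get(index, 0) + 1
    best, run_start = None, None
    for index in range(max(per_bin) + 2):          # one extra bin closes a final run
        dense = per_bin.get(index, 0) >= min_count
        if dense and run_start is None:
            run_start = index
        elif not dense and run_start is not None:
            if best is None or index - run_start > best[1] - best[0]:
                best = (run_start, index)
            run_start = None
    if best is None:
        return None
    return origin + best[0] * bin_width, origin + best[1] * bin_width


# Illustrative access times for one document.
day = datetime(2005, 10, 3)
times = [day.replace(hour=h, minute=m) for h, m in
         [(9, 0), (13, 5), (13, 10), (13, 20), (13, 40), (14, 10), (14, 15), (18, 0)]]
period = dense_period(times, bin_width=timedelta(hours=1), min_count=2)
interval_ib = period[1] - period[0]                # threshold interval Ib
```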

(Processing Example 3) Community analysis by analyzing accesses to different documents by a common user
Next, as community analysis processing example 3, community analysis based on analyzing accesses to different documents by a common user is described. An example of the analysis executed by the access log analysis unit 102 in this processing example is described with reference to FIG. 11. The access log analysis unit 102 extracts, from the access logs recorded in the access log storage unit 101, the access logs of accesses to different documents by a common user, and determines that the documents accessed by the common user are related documents.

As shown in FIG. 11, when the access log storage unit 101 contains access logs indicating that the same user accessed different documents stored in the electronic document filing device 70, namely document Da and document Db, the access log analysis unit 102 determines from this log information that document Da and document Db are highly related documents.

具体的には、同一ユーザが異なる文書に対してアクセスを実行している回数をカウントし、カウント値に基づいてコミュニティ分析を行なう。例えば、アクセスログ記憶部１０１に記録されたアクセスログから、同一のユーザ［Ｕ１］による異なる文書［Ｄ１］，［Ｄ２］に対するアクセス回数をＡ（Ｄ１，Ｄ２）として、アクセス回数をカウントする。共通ユーザによる異なる文書に対するアクセス回数Ａ（Ｄ１，Ｄ２）が大きな値を示すほど、文書［Ｄ１］，［Ｄ２］は、関連性の高い文書である可能性が高いと判定する。この解析処理シーケンスについて、図１２に示すフローチャートを参照して説明する。 Specifically, the number of times the same user accesses different documents is counted, and community analysis is performed based on the count. For example, from the access logs recorded in the access log storage unit 101, the number of accesses to different documents [D1] and [D2] by the same user [U1] is counted as A(D1, D2). The larger the value of A(D1, D2), the more likely the documents [D1] and [D2] are judged to be highly related. This analysis processing sequence will be described with reference to the flowchart shown in FIG. 12.

まず、アクセスログ分析部１０２は、ステップＳ３０１において、アクセスログ記憶部１０１に記録されたアクセスログ（図３参照）から、分析対象期間のログを取り出す。なお、分析期間は、オペレータによって任意の期間を分析期間として予め設定する。 First, in step S301, the access log analysis unit 102 extracts the logs for the analysis target period from the access logs (see FIG. 3) recorded in the access log storage unit 101. The operator sets an arbitrary period in advance as the analysis period.

次に、ステップＳ３０２において、アクセスログのログ番号を示す変数［Ｋ］を初期値Ｋ＝１に設定する。ステップＳ３０３において、Ｋ番目のログを選択し、ステップＳ３０４において、取得したＫ番目のログに記録されたユーザＩＤを［Ｕ１］とし、ドキュメントＩＤを［Ｄ１］とする。 Next, in step S302, a variable [K] indicating the log number of the access log is set to an initial value K = 1. In step S303, the Kth log is selected, and in step S304, the user ID recorded in the acquired Kth log is [U1], and the document ID is [D1].

次に、ステップＳ３０５において、比較ログのログ番号［Ｋ＋Ｎ］を設定するための変数［Ｎ］の初期設定としてＮ＝１に設定し、ステップＳ３０６において、Ｋ＋Ｎ番目のログを取得し、ステップＳ３０７において、取得したＫ＋Ｎ番目のログに記録されたユーザＩＤを［Ｕ２］とし、ドキュメントＩＤを［Ｄ２］とする。 Next, in step S305, the variable [N], which is used to set the log number [K + N] of the comparison log, is initialized to N = 1. In step S306, the (K + N)-th log is acquired, and in step S307, the user ID recorded in the acquired (K + N)-th log is set as [U2] and the document ID as [D2].

ステップＳ３０８では、Ｋ番目とＫ＋Ｎ番目のログのドキュメントＩＤを比較し、ステップＳ３０９では、Ｋ番目とＫ＋Ｎ番目のログのユーザＩＤを比較する。ステップＳ３０８において、Ｋ番目とＫ＋Ｎ番目のログのドキュメントＩＤが同一であると判定された場合は、同一ドキュメントに対するアクセスログ記録であるので、関連文書判定処理対象としてのデータに選定されず、ステップＳ３１１に進む。 In step S308, the document IDs of the K-th and (K + N)-th logs are compared, and in step S309, the user IDs of the K-th and (K + N)-th logs are compared. If step S308 determines that the document IDs of the two logs are the same, the logs record accesses to the same document, so they are not selected as data for the related-document determination, and the process proceeds to step S311.

ステップＳ３０８において、Ｋ番目とＫ＋Ｎ番目のログのドキュメントＩＤが同一でない場合は、ステップＳ３０９に進み、Ｋ番目とＫ＋Ｎ番目のログのユーザＩＤが同一であるか否かを判定する。Ｋ番目とＫ＋Ｎ番目のログのユーザＩＤが同一である場合は、Ｋ番目とＫ＋Ｎ番目のログは、同一ユーザによる異なるドキュメントに対するアクセス情報を示すログであると判定される。すなわち、図１１を参照して説明した関係を提示していることになる。 If the document IDs of the Kth and K + Nth logs are not the same in step S308, the process proceeds to step S309 to determine whether the user IDs of the Kth and K + Nth logs are the same. When the user IDs of the Kth and K + Nth logs are the same, it is determined that the Kth and K + Nth logs are logs indicating access information for different documents by the same user. That is, the relationship described with reference to FIG. 11 is presented.

この場合は、ステップＳ３１０に進み、同一ユーザの異なる文書に対するアクセス回数を示すデータ［Ａ（Ｄ１，Ｄ２）］を１つ増加させる処理を行なう。すなわち、
Ａ（Ｄ１，Ｄ２）←Ａ（Ｄ１，Ｄ２）＋１
とする。 In this case, the process proceeds to step S310, where the data [A(D1, D2)], which indicates the number of accesses to different documents by the same user, is incremented by one. That is,
A(D1, D2) ← A(D1, D2) + 1

［Ａ（Ｄ１，Ｄ２）］は、文書Ｄ１，Ｄ２に対する共通ユーザによるアクセス数を示すデータであり、この数値が高いほど、文書Ｄ１，Ｄ２の関連性が高いと判定される。 [A (D1, D2)] is data indicating the number of accesses by the common user to the documents D1, D2, and it is determined that the higher the numerical value, the higher the relevance of the documents D1, D2.

ステップＳ３１１では、比較ログのログ番号［Ｋ＋Ｎ］を設定するための変数［Ｎ］の値を１つ増加させる変数更新処理を行なう。すなわち、Ｋ番目のログとの比較ログを次のログデータに設定する処理である。ステップＳ３１２では、比較ログのログ番号［Ｋ＋Ｎ］が最大ログ番号を超えたか否か、すなわち解析対象として取得したログに存在するか否かを判定する。比較ログのログ番号［Ｋ＋Ｎ］が最大ログ番号を超えていない場合は、ステップＳ３０６に戻り、Ｋ番目のログと、Ｋ＋Ｎ番目のログとの比較処理を繰り返す。 In step S311, variable update processing for increasing the value of the variable [N] by one for setting the log number [K + N] of the comparison log is performed. That is, it is a process of setting a comparison log with the Kth log as the next log data. In step S312, it is determined whether or not the log number [K + N] of the comparison log exceeds the maximum log number, that is, whether or not it exists in the log acquired as the analysis target. If the log number [K + N] of the comparison log does not exceed the maximum log number, the process returns to step S306, and the comparison process between the Kth log and the K + Nth log is repeated.

すなわち、ドキュメントＩＤとユーザＩＤの比較を実行し、同一ユーザによる異なるドキュメントに対するアクセスログの関係にあれば、同一ユーザによるドキュメント［Ｄ１］とドキュメント［Ｄ２］のアクセス回数を示すデータ［Ａ（Ｄ１，Ｄ２）］を１つ増加させる処理を行なう。 That is, the document IDs and user IDs are compared, and if the two logs record accesses to different documents by the same user, the data [A(D1, D2)], which indicates the number of accesses to document [D1] and document [D2] by the same user, is incremented by one.

ステップＳ３１２において、比較ログのログ番号［Ｋ＋Ｎ］が最大ログ番号を超えていると判定され、最終ログまでの比較が終了したと判定すると、ステップＳ３１３に進み、比較元のログ番号［Ｋ］を１つ増加させる処理を行い、ステップＳ３１４において、比較元のログのログ番号［Ｋ］が最大ログ番号を超えたか否かを判定する。すなわち、最終ログのログ番号を超えていないかをチェックする。ログ番号［Ｋ］が最大ログ番号を超えていない場合は、ステップＳ３０３に戻り、Ｋ番目のログと、Ｋ＋Ｎ番目のログとの比較処理を繰り返す。 In step S312, when it is determined that the log number [K + N] of the comparison log exceeds the maximum log number and it is determined that the comparison up to the final log is completed, the process proceeds to step S313, and the log number [K] of the comparison source is set. A process of incrementing by one is performed, and in step S314, it is determined whether or not the log number [K] of the comparison source log exceeds the maximum log number. That is, it is checked whether the log number of the last log has been exceeded. If the log number [K] does not exceed the maximum log number, the process returns to step S303, and the comparison process between the Kth log and the K + Nth log is repeated.

ステップＳ３１４において、比較元のログのログ番号［Ｋ］が最大ログ番号を超え、取得ログに存在しないと判定されると、すべての取得ログ間の比較処理が終了したことになり、アクセスログ分析部１０２の処理を終了する。 In step S314, if it is determined that the log number [K] of the comparison source log exceeds the maximum log number and does not exist in the acquisition log, the comparison processing between all acquisition logs is completed, and the access log analysis is completed. The process of the unit 102 ends.

このようなログ分析によって、アクセスログ分析部１０２は、同一ユーザによる異なる文書に対するアクセス回数が高い文書の組み合わせを解析し、このような文書の集合を取得することができる。これらの文書集合は、同一コミュニティに属するユーザによってアクセスされる可能性の高い文書集合であると推定することができる。図１２を参照して説明した処理フローによって、同一ユーザの異なる文書に対するアクセス回数Ａ（Ｄｎ，Ｄｍ）が算出され、このアクセス回数Ａ（Ｄｎ，Ｄｍ）に基づいて、ネットワーク図を作成する。 Through such log analysis, the access log analysis unit 102 can analyze a combination of documents having a high number of accesses to different documents by the same user, and obtain a set of such documents. These document sets can be presumed to be document sets that are likely to be accessed by users belonging to the same community. According to the processing flow described with reference to FIG. 12, the access count A (Dn, Dm) for different documents of the same user is calculated, and a network diagram is created based on the access count A (Dn, Dm).
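
The pairwise comparison of steps S301 to S314 might be sketched as follows. This is a minimal illustration rather than the implementation described in the specification: the `AccessLog` record, the function name, and the choice to key A(D1, D2) on a sorted document pair (so that (D1, D2) and (D2, D1) share one counter) are assumptions made here.

```python
from collections import defaultdict
from typing import NamedTuple

class AccessLog(NamedTuple):
    """Hypothetical log record: access time, user ID, document ID."""
    time: float
    user_id: str
    doc_id: str

def count_common_user_accesses(logs):
    """Count, for each pair of distinct documents, the log pairs made by one user."""
    counts = defaultdict(int)
    for k in range(len(logs)):                # S303: select the K-th log
        for j in range(k + 1, len(logs)):     # S306: select the (K+N)-th log
            first, second = logs[k], logs[j]
            if first.doc_id == second.doc_id:     # S308: same document, skip
                continue
            if first.user_id != second.user_id:   # S309: different users, skip
                continue
            # Assumption: A(D1, D2) is symmetric, so the pair is sorted.
            pair = tuple(sorted((first.doc_id, second.doc_id)))
            counts[pair] += 1                     # S310: A(D1, D2) <- A(D1, D2) + 1
    return dict(counts)
```

The nested loop compares every pair of logs once, so its cost grows quadratically with the number of logs in the analysis period; grouping the logs by user first would be one way to reduce the work, although the text does not discuss this.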

同一ユーザの異なる文書に対するアクセス回数Ａ（Ｄｎ，Ｄｍ）が大きいほど文書間の関連性が高いと判定して、これらの文書集合は、同一コミュニティに属するユーザによってアクセスされる可能性の高い文書集合であると推定し、このような共通の文書集合との結びつきを持つ異なるユーザ間の結びつきについても高い結びつきがあると推定し、この推定をユーザ間の結びつきを示す指標、すなわち同一コミュニティに属する判定指標として適用してネットワーク図を作成する。本手法によっても、例えば、図６を参照して説明したと同様のネットワーク図が設定され、ネットワーク解析を行なうことができる。 The larger the number of accesses A (Dn, Dm) for different documents of the same user, the higher the relevance between the documents, and these document sets are likely to be accessed by users belonging to the same community. It is estimated that there is also a high connection between different users who have a connection with such a common document set, and this estimation is an index indicating the connection between users, that is, a determination belonging to the same community Apply as an index to create a network diagram. Also with this method, for example, a network diagram similar to that described with reference to FIG. 6 is set, and network analysis can be performed.

（処理例４）規定時間内の同一ユーザによる異なる文書に対するアクセスの解析によるコミュニティ分析 (Processing Example 4) Community analysis by analyzing accesses to different documents by the same user within a specified time
次に、コミュニティ分析処理例４として、予め定めた規定時間内に同一ユーザによって異なる文書に対して行なわれたアクセスの解析によって行なわれるコミュニティ分析処理について説明する。アクセスログ分析部１０２の実行する本処理例における解析処理の一例について、図１３を参照して説明する。アクセスログ分析部１０２は、アクセスログ記憶部１０１に記録されたアクセスログから、同一のユーザが予め定めた規定時間内に異なる文書に対してアクセスを行なったことを示すログを抽出し、これら規定時間内に同一ユーザがアクセスした文書を関連性の高い文書であると判定する。 Next, as community analysis processing example 4, community analysis based on analyzing accesses made to different documents by the same user within a predetermined specified time will be described. An example of the analysis processing executed by the access log analysis unit 102 in this processing example will be described with reference to FIG. 13. The access log analysis unit 102 extracts, from the access logs recorded in the access log storage unit 101, the logs indicating that the same user accessed different documents within the predetermined specified time, and determines that the documents accessed by the same user within that time are highly related.

すなわち、先の処理例３では、時間を考慮することなく、同一ユーザによるアクセス文書を関連性のある文書として判定する関連性判定を実行したが、本処理例では、さらに時間要素を加え、同一ユーザによって、規定時間内に文書アクセスが発生した場合に関連性のある文書であると判定する。 That is, processing example 3 above judged documents accessed by the same user to be related without considering time. This processing example adds a time element: documents are judged to be related when the same user accesses them within the specified time.

図１３に示すように、ユーザＵ１が、電子文書ファイリング装置７０に格納された異なる文書、すなわち、文書Ｄ１〜Ｄ３に対してアクセスを行ったものとする。これらのアクセス情報を示すアクセスログがアクセスログ記憶部１０１に存在するものとする。 As shown in FIG. 13, it is assumed that the user U1 accesses a different document stored in the electronic document filing device 70, that is, the documents D1 to D3. Assume that an access log indicating these pieces of access information exists in the access log storage unit 101.

アクセスログ分析部１０２は、これらのログ情報に基づく解析を実行する。ユーザＵ１による文書Ｄ１と文書Ｄ２に対するアクセス時間の差ｔ１は、予め定めた時間間隔としての閾値インターバル［Ｉｂ］未満であり、文書Ｄ１と文書Ｄ３に対するアクセス時間の差ｔ２は、予め定めた時間間隔としての閾値インターバル［Ｉｂ］以上となっている。これらの時間情報は、アクセスログ記憶部１０１に記録されたログ情報（図３参照）に基づいて取得される。 The access log analysis unit 102 performs analysis based on this log information. The difference t1 between the times at which user U1 accessed document D1 and document D2 is less than the threshold interval [Ib], a predetermined time interval, whereas the difference t2 between the access times for document D1 and document D3 is equal to or greater than the threshold interval [Ib]. This time information is obtained from the log information (see FIG. 3) recorded in the access log storage unit 101.

この場合、アクセスログ分析部１０２は、文書Ｄ１と文書Ｄ２は関連性が高く、文書Ｄ１と文書Ｄ３の関連性は低いと判定する。 In this case, the access log analysis unit 102 determines that documents D1 and D2 are highly related and that documents D1 and D3 are only weakly related.

本処理例においては、各文書の組み合わせ（Ｄｎ，Ｄｍ）について、規定時間（閾値インターバルＩａ）未満の時間間隔で同一ユーザによってアクセスされた回数をカウントし、カウント値に基づいて、文書の関連性を判断する。例えば、アクセスログ記憶部１０１に記録されたアクセスログから、文書［Ｄ１］と文書［Ｄ２］とに対して、規定時間（閾値時間Ｔａ）未満の時間間隔で実行された同一ユーザからのアクセス回数をＡ（Ｄ１，Ｄ２）として、アクセス回数をカウントする。規定時間未満に同一ユーザによって実行された異なる文書に対するアクセス回数Ａ（Ｄ１，Ｄ２）が大きな値を示すほど、文書［Ｄ１］，［Ｄ２］は関連性が高いと判定する。この解析処理シーケンスについて、図１４に示すフローチャートを参照して説明する。 In this processing example, for each combination of documents (Dn, Dm), the number of times the same user accessed both documents at a time interval shorter than the specified time (threshold interval Ia) is counted, and the relevance of the documents is judged from the count. For example, from the access logs recorded in the access log storage unit 101, the number of accesses to document [D1] and document [D2] made by the same user at a time interval shorter than the specified time (threshold time Ta) is counted as A(D1, D2). The larger the value of A(D1, D2), the more strongly documents [D1] and [D2] are judged to be related. This analysis processing sequence will be described with reference to the flowchart shown in FIG. 14.

まず、アクセスログ分析部１０２は、ステップＳ４０１において、アクセスログ記憶部１０１に記録されたアクセスログ（図３参照）から、分析対象期間のログを取り出す。なお、分析期間は、オペレータによって任意の期間を分析期間として予め設定する。 First, in step S401, the access log analysis unit 102 extracts the logs for the analysis target period from the access logs (see FIG. 3) recorded in the access log storage unit 101. The operator sets an arbitrary period in advance as the analysis period.

次に、ステップＳ４０２において、アクセスログのログ番号を示す変数［Ｋ］を初期値Ｋ＝１に設定する。ステップＳ４０３において、Ｋ番目のログを選択し、ステップＳ４０４において、取得したＫ番目のログに記録されたアクセス時刻を［Ｔ１］、ユーザＩＤを［Ｕ１］とし、ドキュメントＩＤを［Ｄ１］とする。 Next, in step S402, a variable [K] indicating the log number of the access log is set to an initial value K = 1. In step S403, the Kth log is selected. In step S404, the access time recorded in the acquired Kth log is [T1], the user ID is [U1], and the document ID is [D1].

次に、ステップＳ４０５において、比較ログのログ番号［Ｋ＋Ｎ］を設定するための変数［Ｎ］の初期設定としてＮ＝１に設定し、ステップＳ４０６において、Ｋ＋Ｎ番目のログを取得し、ステップＳ４０７において、取得したＫ＋Ｎ番目のログに記録されたアクセス時刻を［Ｔ２］、ユーザＩＤを［Ｕ２］とし、ドキュメントＩＤを［Ｄ２］とする。 Next, in step S405, the variable [N], which is used to set the log number [K + N] of the comparison log, is initialized to N = 1. In step S406, the (K + N)-th log is acquired, and in step S407, the access time recorded in the acquired (K + N)-th log is set as [T2], the user ID as [U2], and the document ID as [D2].

ステップＳ４０８では、Ｋ番目とＫ＋Ｎ番目のログのドキュメントＩＤを比較し、ステップＳ４０９では、Ｋ番目とＫ＋Ｎ番目のログのユーザＩＤを比較し、ステップＳ４１０では、Ｋ番目とＫ＋Ｎ番目のログのアクセス時刻の差［Ｔ２−Ｔ１］と予め定めた閾値インターバル［Ｉｂ］とを比較する。すなわち、２つのログのアクセス時間の差が閾値インターバル［Ｉｂ］未満であるか否かを判定する。 In step S408, the document IDs of the Kth and K + Nth logs are compared. In step S409, the user IDs of the Kth and K + Nth logs are compared. In step S410, the access times of the Kth and K + Nth logs are compared. The difference [T2−T1] is compared with a predetermined threshold interval [Ib]. That is, it is determined whether or not the difference between the access times of the two logs is less than the threshold interval [Ib].

ステップＳ４０８において、Ｋ番目とＫ＋Ｎ番目のログのドキュメントＩＤが同一であると判定された場合は、同一文書に対するアクセスログ記録であるので、異なる文書間の関連性を判定する対象としてのデータに選定されず、ステップＳ４１２に進む。 In step S408, if it is determined that the document IDs of the Kth and K + Nth logs are the same, it is an access log record for the same document, so it is selected as data as a target for determining the relationship between different documents. Otherwise, the process proceeds to step S412.

ステップＳ４０８において、Ｋ番目とＫ＋Ｎ番目のログのドキュメントＩＤが同一でない場合は、ステップＳ４０９に進み、Ｋ番目とＫ＋Ｎ番目のログのユーザＩＤが同一であるか否かを判定する。Ｋ番目とＫ＋Ｎ番目のログのユーザＩＤが同一である場合は、Ｋ番目とＫ＋Ｎ番目のログは、同一ユーザによる異なるドキュメントに対するアクセス情報を示すログであると判定される。 If the document IDs of the Kth and K + Nth logs are not the same in step S408, the process proceeds to step S409 to determine whether the user IDs of the Kth and K + Nth logs are the same. When the user IDs of the Kth and K + Nth logs are the same, it is determined that the Kth and K + Nth logs are logs indicating access information for different documents by the same user.

この場合は、さらに、ステップＳ４１０に進み、Ｋ番目とＫ＋Ｎ番目のログのアクセス時刻の差［Ｔ２−Ｔ１］と予め定めた閾値インターバル［Ｉｂ］とを比較する。すなわち、
Ｔ２−Ｔ１＜Ｉｂ
上記式が成立するか否かを判定する。これは、２つのログのアクセス時間の差が閾値インターバル［Ｉｂ］未満であるか否かを判定する処理である。 In this case, the process further proceeds to step S410, where the difference [T2 − T1] between the access times of the K-th and (K + N)-th logs is compared with the predetermined threshold interval [Ib]. That is, the process determines whether the following expression holds:
T2 − T1 < Ib
This determines whether the difference between the access times of the two logs is less than the threshold interval [Ib].

上記式が成立する場合は、ステップＳ４１１に進み、同一ユーザの異なる文書に対するアクセス回数を示すデータ［Ａ（Ｄ１，Ｄ２）］を１つ増加させる処理を行なう。すなわち、
Ａ（Ｄ１，Ｄ２）←Ａ（Ｄ１，Ｄ２）＋１
とする。このケースは、図１３に示す文書Ｄ１と文書Ｄ２との関係に相当する。 If the above expression holds, the process proceeds to step S411, where the data [A(D1, D2)], which indicates the number of accesses to different documents by the same user, is incremented by one. That is,
A(D1, D2) ← A(D1, D2) + 1
This case corresponds to the relationship between document D1 and document D2 shown in FIG. 13.

ステップＳ４１０において、Ｋ番目とＫ＋Ｎ番目のログのアクセス時刻の差［Ｔ２−Ｔ１］が、
Ｔ２−Ｔ１＜Ｉｂ
を満たさない場合、すなわち、２つのログのアクセス時間の差が閾値インターバル［Ｉｂ］以上である場合は、２つのログの文書は、同一ユーザによるアクセスがなされているが、予め定めた閾値時間以上の差を持ったアクセスであり、このログに示される文書間の関連性は低いと判定し、ステップＳ４１１のカウントアップ処理を実行することなく、ステップＳ４１２に進む。このケースは、図１３に示す文書Ｄ１と文書Ｄ３との関係に相当する。 If, in step S410, the difference [T2 − T1] between the access times of the K-th and (K + N)-th logs does not satisfy
T2 − T1 < Ib
that is, if the difference between the access times of the two logs is equal to or greater than the threshold interval [Ib], the two documents were accessed by the same user but at times separated by at least the predetermined threshold. The relevance between the documents in these logs is therefore judged to be low, and the process proceeds to step S412 without executing the count-up of step S411. This case corresponds to the relationship between document D1 and document D3 shown in FIG. 13.

本処理例では、［Ａ（Ｄ１，Ｄ２）］は、文書Ｄ１，Ｄ２の規定時間内の同一ユーザによるアクセス数を示すデータであり、この数値が高いほど、文書Ｄ１，Ｄ２の関連性が高いと判定される。 In this processing example, [A (D1, D2)] is data indicating the number of accesses by the same user within the prescribed time of the documents D1, D2, and the higher this value, the higher the relevance of the documents D1, D2. It is determined.

ステップＳ４１２では、比較ログのログ番号［Ｋ＋Ｎ］を設定するための変数［Ｎ］の値を１つ増加させる変数更新処理を行なう。すなわち、Ｋ番目のログとの比較ログを次のログデータに設定する処理である。ステップＳ４１３では、比較ログのログ番号［Ｋ＋Ｎ］が最大ログ番号を超えたか否か、すなわち解析対象として取得したログに存在するか否かを判定する。比較ログのログ番号［Ｋ＋Ｎ］が最大ログ番号を超えていない場合は、ステップＳ４０６に戻り、Ｋ番目のログと、Ｋ＋Ｎ番目のログとの比較処理を繰り返す。 In step S412, variable update processing for increasing the value of the variable [N] by one for setting the log number [K + N] of the comparison log is performed. That is, it is a process of setting a comparison log with the Kth log as the next log data. In step S413, it is determined whether or not the log number [K + N] of the comparison log exceeds the maximum log number, that is, whether or not it exists in the log acquired as the analysis target. If the log number [K + N] of the comparison log does not exceed the maximum log number, the process returns to step S406, and the comparison process between the Kth log and the K + Nth log is repeated.

すなわち、ログ間のドキュメントＩＤとユーザＩＤの比較と、ログ間の時間差［Ｔ２−Ｔ１］と閾値インターバル［Ｉｂ］との比較を実行し、同一ユーザによる閾値インターバルＩｂ内の異なるドキュメントに対応するアクセスログの関係にあれば、文書Ｄ１，Ｄ２の規定時間内の同一ユーザによるアクセス数を示すデータ［Ａ（Ｄ１，Ｄ２）］を１つ増加させる処理を行なう。 That is, the comparison between the document ID and the user ID between the logs and the comparison between the time difference [T2-T1] between the logs and the threshold interval [Ib] are performed, and access corresponding to different documents within the threshold interval Ib by the same user. If there is a log relationship, a process of increasing the data [A (D1, D2)] indicating the number of accesses by the same user within the specified time of the documents D1, D2 by one is performed.

ステップＳ４１３において、比較ログのログ番号［Ｋ＋Ｎ］が最大ログ番号を超えていると判定され、最終ログまでの比較が終了したと判定すると、ステップＳ４１４に進み、比較元のログ番号［Ｋ］を１つ増加させる処理を行い、ステップＳ４１５において、比較元のログのログ番号［Ｋ］が最大ログ番号を超えたか否かを判定する。すなわち、最終ログのログ番号を超えていないかをチェックする。ログ番号［Ｋ］が最大ログ番号を超えていない場合は、ステップＳ４０３に戻り、更新したログ番号［Ｋ］によって指定されるＫ番目のログと、Ｋ＋Ｎ番目のログとの比較処理を繰り返す。 In step S413, if it is determined that the log number [K + N] of the comparison log exceeds the maximum log number and it is determined that the comparison up to the final log is completed, the process proceeds to step S414, and the log number [K] of the comparison source is set. A process of incrementing by one is performed, and in step S415, it is determined whether or not the log number [K] of the comparison source log exceeds the maximum log number. That is, it is checked whether the log number of the last log has been exceeded. If the log number [K] does not exceed the maximum log number, the process returns to step S403, and the comparison process between the Kth log specified by the updated log number [K] and the K + Nth log is repeated.

ステップＳ４１５において、比較元のログのログ番号［Ｋ］が最大ログ番号を超え、取得ログに存在しないと判定されると、すべての取得ログ間の比較処理が終了したことになり、アクセスログ分析部１０２の処理を終了する。 In step S415, if it is determined that the log number [K] of the comparison source log exceeds the maximum log number and does not exist in the acquisition log, the comparison processing between all acquisition logs is completed, and the access log analysis is completed. The process of the unit 102 ends.

このようなログ分析によって、アクセスログ分析部１０２は、同一ユーザによる規定時間内のアクセス回数が高い異なる文書の組み合わせを解析し、このような文書の集合を取得することができる。これらの文書集合は関連性が高いと判定して、これらの文書集合は、同一コミュニティに属するユーザによってアクセスされる可能性の高い文書集合であると推定し、このような共通の文書集合との結びつきを持つ異なるユーザ間の結びつきについても高い結びつきがあると推定し、この推定をユーザ間の結びつきを示す指標、すなわち同一コミュニティに属する判定指標として適用してネットワーク図を作成する。本手法によっても、例えば、図６を参照して説明したと同様のネットワーク図が設定され、ネットワーク解析を行なうことができる。 By such log analysis, the access log analysis unit 102 can analyze a combination of different documents having a high number of accesses within a specified time by the same user, and obtain such a set of documents. It is determined that these document sets are highly relevant, and it is assumed that these document sets are highly likely to be accessed by users belonging to the same community. A connection between different users having a connection is also estimated to have a high connection, and this estimation is applied as an index indicating the connection between users, that is, a determination index belonging to the same community, to create a network diagram. Also with this method, for example, a network diagram similar to that described with reference to FIG. 6 is set, and network analysis can be performed.
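
Steps S401 to S415 add one time check to the same pairwise comparison, which might be sketched as follows. The `AccessLog` record and the function name are hypothetical, and two further assumptions are made here: the logs are sorted by access time so that T2 − T1 is never negative, and A(D1, D2) is keyed on a sorted document pair.

```python
from collections import defaultdict
from typing import NamedTuple

class AccessLog(NamedTuple):
    """Hypothetical log record: access time, user ID, document ID."""
    time: float
    user_id: str
    doc_id: str

def count_timed_co_accesses(logs, ib):
    """Count same-user accesses to two different documents less than ib apart."""
    # Assumption: log numbers follow access time, so sort before comparing.
    ordered = sorted(logs, key=lambda log: log.time)
    counts = defaultdict(int)
    for k, first in enumerate(ordered):           # S403: K-th log
        for second in ordered[k + 1:]:            # S406: (K+N)-th log
            if first.doc_id == second.doc_id:     # S408: same document, skip
                continue
            if first.user_id != second.user_id:   # S409: different users, skip
                continue
            if second.time - first.time < ib:     # S410: T2 - T1 < Ib
                pair = tuple(sorted((first.doc_id, second.doc_id)))
                counts[pair] += 1                 # S411: count up A(D1, D2)
    return dict(counts)
```

With three accesses by one user at times 0, 5, and 100 and `ib = 10`, only the first two documents are counted as a pair, which mirrors the D1/D2 versus D1/D3 situation of FIG. 13.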

以上、複数のコミュニティ分析処理例として、ユーザによる文書アクセスログの解析によるコミュニティ分析処理例として、以下の４つの処理例、
（処理例１）共通文書に対するアクセスユーザの解析によるコミュニティ分析
（処理例２）規定時間内の共通文書に対するアクセスユーザの解析によるコミュニティ分析
（処理例３）共通のユーザによる異なる文書に対するアクセス解析によるコミュニティ分析
（処理例４）規定時間内の同一ユーザによる異なる文書に対するアクセスの解析によるコミュニティ分析
これらの処理例を説明した。 The four processing examples below have been described above as examples of community analysis based on analyzing users' document access logs:
(Processing Example 1) Community analysis by analyzing the users who access a common document
(Processing Example 2) Community analysis by analyzing the users who access a common document within a specified time
(Processing Example 3) Community analysis by analyzing accesses to different documents by a common user
(Processing Example 4) Community analysis by analyzing accesses to different documents by the same user within a specified time

例えば、処理例２や、処理例４では、規定の時間ごとに区切りを設定して、ユーザのコミュニティ関係を解析する構成としているが、例えば、このような時間区切りのコミュニティ解析を実行することで、特定のプロジェクトのメンバーの関わりなどを解析することが容易となる。 For example, in the processing example 2 and the processing example 4, a configuration is set in which a break is set for each specified time and the user's community relationship is analyzed. For example, by executing such a time break community analysis, It becomes easy to analyze the relations of members of a specific project.

図１５を参照してコミュニティ解析処理例について説明する。例えば、ある会社においてある製品の開発プロジェクトがあった場合、その製品に関するプロセスの流れとしては、企画、設計、試作、評価、量産等の各ステップが時間の進行に従って行われる。これらのプロジェクトにかかわるメンバーは、各処理ステップのいずれかにかかわることになる。企画、設計、試作、評価、量産の全ステップのいずれかに参加したメンバーのネットワーク図として、図１５（ａ）が得られるようなコミュニティがある場合、上述した処理例２や、処理例４を適用し、規定の時間ごとに区切りを設定して、ユーザのコミュニティ関係を解析することで、図１５（ｂ）〜（ｅ）のようなプロジェクトの段階ごとの時期に対応した個別のコミュニティネットワークを取得し、解析を行なうことが可能となる。 An example of community analysis processing will be described with reference to FIG. For example, when there is a development project for a product in a certain company, each process such as planning, design, prototyping, evaluation, mass production, etc. is performed according to the progress of time as a process flow regarding the product. Members involved in these projects will be involved in any of the processing steps. If there is a community that can obtain Fig. 15 (a) as a network diagram of members who participated in any of the planning, design, prototyping, evaluation, and mass production steps, the processing example 2 and the processing example 4 described above are used. By applying and analyzing the community relationship of users by setting a break at each specified time, an individual community network corresponding to the timing of each stage of the project as shown in FIGS. It can be acquired and analyzed.
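
The per-phase analysis described above might be sketched as follows. This is only an illustration: the tuple layout `(time, user_id, doc_id)`, the phase boundaries passed in `phases`, and both function names are assumptions introduced here, and the link rule (different users accessing the same document) follows processing example 1 without its time threshold.

```python
from collections import defaultdict

def user_links(logs):
    """Count pairs of different users who accessed the same document."""
    counts = defaultdict(int)
    for k, (_, user_a, doc_a) in enumerate(logs):
        for _, user_b, doc_b in logs[k + 1:]:
            if doc_a == doc_b and user_a != user_b:
                counts[tuple(sorted((user_a, user_b)))] += 1
    return dict(counts)

def links_by_phase(logs, phases):
    """Run the link analysis separately on each phase's time window.

    phases maps a phase name to a (start, end) window; end is exclusive.
    """
    result = {}
    for name, (start, end) in phases.items():
        window = [log for log in logs if start <= log[0] < end]
        result[name] = user_links(window)
    return result
```

Each entry of the result corresponds to one of the per-stage networks of FIG. 15(b) to (e), while running `user_links` over all logs at once corresponds to the overall network of FIG. 15(a).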

また、（処理例１）共通文書に対するアクセスユーザの解析によるコミュニティ分析を適用することで、ある専門分野、例えば建築関係の文書にアクセスしているユーザのネットワークを解析することが可能であり、また、（処理例３）共通のユーザによる異なる文書に対するアクセス解析によるコミュニティ分析を実行することで、異なる文書の関連性の解析が可能となり、このような関連性の高い文書集合との結びつきを持つユーザ間の結びつきをネットワーク解析によって行なうことが可能となる。 In addition, applying (Processing Example 1), community analysis by analyzing the users who access a common document, makes it possible to analyze the network of users who access documents in a particular specialized field, for example architecture. Executing (Processing Example 3), community analysis by analyzing accesses to different documents by a common user, makes it possible to analyze the relevance of different documents, and network analysis can then reveal the ties between users who are connected to such a highly related document set.

以上、特定の実施例を参照しながら、本発明について詳解してきた。しかしながら、本発明の要旨を逸脱しない範囲で当業者が該実施例の修正や代用を成し得ることは自明である。すなわち、例示という形態で本発明を開示してきたのであり、限定的に解釈されるべきではない。本発明の要旨を判断するためには、冒頭に記載した特許請求の範囲の欄を参酌すべきである。 The present invention has been described in detail above with reference to specific embodiments. However, it is obvious that those skilled in the art can make modifications and substitutions of the embodiments without departing from the gist of the present invention. In other words, the present invention has been disclosed in the form of exemplification, and should not be interpreted in a limited manner. In order to determine the gist of the present invention, the claims section described at the beginning should be considered.

なお、明細書中において説明した一連の処理はハードウェア、またはソフトウェア、あるいは両者の複合構成によって実行することが可能である。ソフトウェアによる処理を実行する場合は、処理シーケンスを記録したプログラムを、専用のハードウェアに組み込まれたコンピュータ内のメモリにインストールして実行させるか、あるいは、各種処理が実行可能な汎用コンピュータにプログラムをインストールして実行させることが可能である。 The series of processes described in the specification can be executed by hardware, software, or a combined configuration of both. When executing processing by software, the program recording the processing sequence is installed in a memory in a computer incorporated in dedicated hardware and executed, or the program is executed on a general-purpose computer capable of executing various processing. It can be installed and run.

例えば、プログラムは記録媒体としてのハードディスクやＲＯＭ（Read Only Memory)に予め記録しておくことができる。あるいは、プログラムはフレキシブルディスク、ＣＤ−ＲＯＭ(Compact Disc Read Only Memory)，ＭＯ(Magneto optical)ディスク，ＤＶＤ(Digital Versatile Disc)、磁気ディスク、半導体メモリなどのリムーバブル記録媒体に、一時的あるいは永続的に格納（記録）しておくことができる。このようなリムーバブル記録媒体は、いわゆるパッケージソフトウエアとして提供することができる。 For example, the program can be recorded in advance on a hard disk or ROM (Read Only Memory) as a recording medium. Alternatively, the program is temporarily or permanently stored on a removable recording medium such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto optical) disk, a DVD (Digital Versatile Disc), a magnetic disk, or a semiconductor memory. It can be stored (recorded). Such a removable recording medium can be provided as so-called package software.

なお、プログラムは、上述したようなリムーバブル記録媒体からコンピュータにインストールする他、ダウンロードサイトから、コンピュータに無線転送したり、ＬＡＮ(Local Area Network)、インターネットといったネットワークを介して、コンピュータに有線で転送し、コンピュータでは、そのようにして転送されてくるプログラムを受信し、内蔵するハードディスク等の記録媒体にインストールすることができる。 The program is installed on the computer from the removable recording medium as described above, or is wirelessly transferred from the download site to the computer, or is wired to the computer via a network such as a LAN (Local Area Network) or the Internet. The computer can receive the program transferred in this manner and install it on a recording medium such as a built-in hard disk.

なお、明細書に記載された各種の処理は、記載に従って時系列に実行されるのみならず、処理を実行する装置の処理能力あるいは必要に応じて並列的にあるいは個別に実行されてもよい。また、本明細書においてシステムとは、複数の装置の論理的集合構成であり、各構成の装置が同一筐体内にあるものには限らない。 Note that the various processes described in the specification are not only executed in time series according to the description, but may be executed in parallel or individually according to the processing capability of the apparatus that executes the processes or as necessary. Further, in this specification, the system is a logical set configuration of a plurality of devices, and the devices of each configuration are not limited to being in the same casing.

以上、説明したように、本発明の一実施例の構成によれば、電子文書記憶部に対するアクセス情報として、アクセスユーザ識別子であるユーザＩＤと、アクセス対象文書識別子である文書ＩＤとをアクセス日時情報に対応付けて記録し、このアクセスログに基づいて、同一文書に対するアクセスを実行した異なるユーザを、コミュニティ関係を有するユーザとして選択抽出する処理を実行する構成としたので、同一文書に対する興味を持つユーザを効率的に選択抽出することが可能となり、同一文書に対する興味を持つユーザの集合をコミュニティ関係のあるユーザとして判定し、これらのユーザ集合に基づいて効率的にネットワーク図の作成、コミュニティ解析を行なうことができる。 As described above, according to the configuration of the embodiment of the present invention, the access date and time information includes the user ID that is the access user identifier and the document ID that is the access target document identifier as the access information for the electronic document storage unit. The user is interested in the same document because the process of selecting and extracting different users who have accessed the same document as users having a community relationship based on the access log is recorded. Can be selected and extracted efficiently, and a group of users who are interested in the same document is determined as a community-related user, and a network diagram is efficiently created and a community analysis is performed based on these users. be able to.

本発明にかかるコミュニティ分析が適用されるネットワークシステムの構成を例示する図である。 A diagram illustrating the configuration of a network system to which the community analysis according to the present invention is applied.
本発明にかかるコミュニティ分析装置の構成を示す図である。 A diagram showing the configuration of the community analysis device according to the present invention.
アクセスログ記憶部の記録データの例を示す図である。 A diagram showing an example of the data recorded in the access log storage unit.
アクセスログ分析部の実行する解析処理の一例について説明する図である。 A diagram explaining an example of the analysis processing executed by the access log analysis unit.
アクセスログ分析部の実行する解析処理シーケンスについて説明するフローチャートを示す図である。 A flowchart explaining the analysis processing sequence executed by the access log analysis unit.
本発明にかかるコミュニティ分析装置において実行される解析情報を適用して生成されるネットワーク図の例を示す図である。 A diagram showing an example of a network diagram generated by applying the analysis information produced in the community analysis device according to the present invention.
アクセスログ分析部の実行する解析処理の一例について説明する図である。 A diagram explaining an example of the analysis processing executed by the access log analysis unit.
アクセスログ分析部の実行する解析処理シーケンスについて説明するフローチャートを示す図である。 A flowchart explaining the analysis processing sequence executed by the access log analysis unit.
本発明にかかるコミュニティ分析装置において実行される解析処理例について説明する図である。 A diagram explaining an example of the analysis processing executed in the community analysis device according to the present invention.
アクセスログ分析部の実行する解析処理において、適用する閾値インターバルの一例について説明する図である。 A diagram explaining an example of the threshold interval applied in the analysis processing executed by the access log analysis unit.
アクセスログ分析部の実行する解析処理の一例について説明する図である。 A diagram explaining an example of the analysis processing executed by the access log analysis unit.
アクセスログ分析部の実行する解析処理シーケンスについて説明するフローチャートを示す図である。 A flowchart explaining the analysis processing sequence executed by the access log analysis unit.
アクセスログ分析部の実行する解析処理の一例について説明する図である。 A diagram explaining an example of the analysis processing executed by the access log analysis unit.
アクセスログ分析部の実行する解析処理シーケンスについて説明するフローチャートを示す図である。 A flowchart explaining the analysis processing sequence executed by the access log analysis unit.
本発明にかかるコミュニティ分析装置において実行される解析処理例について説明する図である。 A diagram explaining an example of the analysis processing executed in the community analysis device according to the present invention.
グラフの構造を調べることによって行なわれる組織コミュニケーションの分析手法について説明する図である。 A diagram explaining a method of analyzing organizational communication by examining the structure of a graph.

Explanation of symbols

11 Node
12 Link
51 to 53 User terminal (client)
60 Network
70 Electronic document filing device
100 Community analysis apparatus
101 Access log storage unit
102 Access log analysis unit

Claims

1. A community analysis apparatus comprising:
an access log storage unit that records, as access information for an electronic document storage unit, a user ID that is an identifier of an accessing user and a document ID that is an identifier of an accessed document, in association with access date and time information; and
an access log analysis unit that executes analysis processing of the access log recorded in the access log storage unit,
wherein the access log analysis unit is configured to execute, based on the access log recorded in the access log storage unit, a process of selecting and extracting different users who have accessed the same document as users having a community relationship.
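
The extraction recited in the claim above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the record layout, the sample data, and the name `extract_community_pairs` are assumptions introduced only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical access log records: (access date and time, user ID, document ID).
access_log = [
    ("2005-10-03T09:00", "userA", "doc1"),
    ("2005-10-03T09:30", "userB", "doc1"),
    ("2005-10-04T10:00", "userC", "doc2"),
    ("2005-10-04T11:00", "userA", "doc2"),
    ("2005-10-05T08:00", "userB", "doc3"),
]

def extract_community_pairs(log):
    """Return the set of user pairs that accessed at least one common document."""
    users_by_doc = defaultdict(set)
    for _timestamp, user_id, doc_id in log:
        users_by_doc[doc_id].add(user_id)
    pairs = set()
    for users in users_by_doc.values():
        # Every pair of different users of one document is treated as related.
        for a, b in combinations(sorted(users), 2):
            pairs.add((a, b))
    return pairs

community_pairs = extract_community_pairs(access_log)
```

Grouping users by document first keeps the pairing step local to each document, so documents accessed by a single user contribute no pairs.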

2. The community analysis apparatus according to claim 1, wherein the access log analysis unit is configured to execute a process of generating a network diagram in which each user having the community relationship is a node and users having the community relationship are connected by links.
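
As a rough illustration of the data behind such a network diagram, the following Python sketch turns user pairs into nodes and undirected links. The input pairs and the name `build_network` are hypothetical, and drawing the diagram itself is left out.

```python
from collections import defaultdict

# Hypothetical user pairs already judged to have a community relationship.
community_pairs = [("userA", "userB"), ("userA", "userC")]

def build_network(pairs):
    """Build the node list, link list, and adjacency map of an undirected graph."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    nodes = sorted(adjacency)
    # Normalize each link so that (a, b) and (b, a) count as the same link.
    links = sorted({tuple(sorted(pair)) for pair in pairs})
    return nodes, links, dict(adjacency)

nodes, links, adjacency = build_network(community_pairs)
```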

3. The community analysis apparatus according to claim 1, wherein the access log analysis unit is configured to execute a process of selecting and extracting, as users having a community relationship, different users who have accessed the same document within a threshold interval that is a predetermined time interval.
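
One possible reading of the threshold interval is that two accesses to the same document must lie no further apart in time than a fixed interval. The Python sketch below follows that reading only as an assumption; the sample log and the name `pairs_within_interval` are hypothetical.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical access log records: (access date and time, user ID, document ID).
access_log = [
    (datetime(2005, 10, 3, 9, 0), "userA", "doc1"),
    (datetime(2005, 10, 3, 9, 30), "userB", "doc1"),
    (datetime(2005, 10, 7, 15, 0), "userC", "doc1"),
    (datetime(2005, 10, 4, 10, 0), "userC", "doc2"),
]

def pairs_within_interval(log, threshold):
    """Return pairs of different users whose accesses to the same document
    are at most `threshold` apart (assumed meaning of the threshold interval)."""
    pairs = set()
    for (t1, u1, d1), (t2, u2, d2) in combinations(log, 2):
        if d1 == d2 and u1 != u2 and abs(t1 - t2) <= threshold:
            pairs.add(tuple(sorted((u1, u2))))
    return pairs

close_pairs = pairs_within_interval(access_log, timedelta(hours=1))
```

With a one-hour threshold only userA and userB are paired; widening the threshold to a week would also pair each of them with userC.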

4. The community analysis apparatus according to claim 3, wherein the access log analysis unit is configured to execute a process of setting, as the threshold interval, a time interval having a high access density.
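
The claim does not state how a time interval with high access density is found. Purely as an assumed illustration, the Python sketch below slides a fixed-width window over the access times and reports the window containing the most accesses; the window width, the sample times, and the name `densest_window` are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical access times taken from an access log.
access_times = [
    datetime(2005, 10, 3, 9, 0),
    datetime(2005, 10, 3, 9, 20),
    datetime(2005, 10, 3, 9, 50),
    datetime(2005, 10, 5, 14, 0),
    datetime(2005, 10, 6, 8, 0),
]

def densest_window(times, width):
    """Return (start, count) for the window of length `width`, starting at an
    access time, that contains the largest number of accesses."""
    ordered = sorted(times)
    best_start, best_count = None, 0
    for i, start in enumerate(ordered):
        # Count accesses falling in the half-open window [start, start + width).
        count = sum(1 for t in ordered[i:] if t - start < width)
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count

window_start, window_count = densest_window(access_times, timedelta(hours=1))
```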

5. A community analysis apparatus comprising:
an access log storage unit that records, as access information for an electronic document storage unit, a user ID that is an identifier of an accessing user and a document ID that is an identifier of an accessed document, in association with access date and time information; and
an access log analysis unit that executes analysis processing of the access log recorded in the access log storage unit,
wherein the access log analysis unit is configured to execute, based on the access log recorded in the access log storage unit, a process of extracting different documents accessed by the same user as related documents.
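
The document-side extraction in the claim above mirrors the user-side one: documents are grouped by user instead of users by document. The Python sketch below is an illustration under assumed data, with `extract_related_documents` as a hypothetical name.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical access log records: (access date and time, user ID, document ID).
access_log = [
    ("2005-10-03T09:00", "userA", "doc1"),
    ("2005-10-03T09:40", "userA", "doc2"),
    ("2005-10-04T10:00", "userB", "doc1"),
    ("2005-10-04T10:15", "userB", "doc3"),
    ("2005-10-05T08:00", "userC", "doc4"),
]

def extract_related_documents(log):
    """Return the set of document pairs accessed by at least one common user."""
    docs_by_user = defaultdict(set)
    for _timestamp, user_id, doc_id in log:
        docs_by_user[user_id].add(doc_id)
    related = set()
    for docs in docs_by_user.values():
        # Every pair of different documents of one user is treated as related.
        for a, b in combinations(sorted(docs), 2):
            related.add((a, b))
    return related

related_documents = extract_related_documents(access_log)
```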

6. The community analysis apparatus according to claim 5, wherein the access log analysis unit is configured to execute a process of extracting, as related documents, different documents accessed by the same user within a threshold interval that is a predetermined time interval.

7. A community analysis method comprising:
an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID that is an identifier of an accessing user and a document ID that is an identifier of an accessed document, in association with access date and time information; and
an access log analysis step of executing, in an access log analysis unit, analysis processing of the access log recorded in the access log storage unit,
wherein the access log analysis step executes, based on the access log recorded in the access log storage unit, a process of selecting and extracting different users who have accessed the same document as users having a community relationship.

8. The community analysis method according to claim 7, further comprising a step of generating a network diagram in which each user having the community relationship is a node and users having the community relationship are connected by links.

9. The community analysis method according to claim 7, wherein the access log analysis step executes a process of selecting and extracting, as users having a community relationship, different users who have accessed the same document within a threshold interval that is a predetermined time interval.

10. The community analysis method according to claim 9, wherein the access log analysis step executes a process of setting, as the threshold interval, a time interval having a high access density.

11. A community analysis method comprising:
an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID that is an identifier of an accessing user and a document ID that is an identifier of an accessed document, in association with access date and time information; and
an access log analysis step of executing, in an access log analysis unit, analysis processing of the access log recorded in the access log storage unit,
wherein the access log analysis step executes, based on the access log recorded in the access log storage unit, a process of extracting different documents accessed by the same user as related documents.

12. The community analysis method according to claim 11, wherein the access log analysis step executes a process of extracting, as related documents, different documents accessed by the same user within a threshold interval that is a predetermined time interval.

13. A computer program that causes a computer to execute community analysis processing, the computer program causing the computer to execute:
an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID that is an identifier of an accessing user and a document ID that is an identifier of an accessed document, in association with access date and time information; and
an access log analysis step of executing, in an access log analysis unit, analysis processing of the access log recorded in the access log storage unit,
wherein the access log analysis step executes, based on the access log recorded in the access log storage unit, a process of selecting and extracting different users who have accessed the same document as users having a community relationship.

14. A computer program that causes a computer to execute community analysis processing, the computer program causing the computer to execute:
an access log storage step of recording in a storage unit, as access information for an electronic document storage unit, a user ID that is an identifier of an accessing user and a document ID that is an identifier of an accessed document, in association with access date and time information; and
an access log analysis step of executing, in an access log analysis unit, analysis processing of the access log recorded in the access log storage unit,
wherein the access log analysis step executes, based on the access log recorded in the access log storage unit, a process of extracting different documents accessed by the same user as related documents.