JP2010117757A

JP2010117757A - Performance monitoring system and performance monitoring method

Info

Publication number: JP2010117757A
Application number: JP2008288556A
Authority: JP
Inventors: Kimimasa Hirose; 公将廣瀬; Tatsutoshi Murata; 龍俊村田
Original assignee: Nomura Research Institute Ltd
Current assignee: Nomura Research Institute Ltd
Priority date: 2008-11-11
Filing date: 2008-11-11
Publication date: 2010-05-27

Abstract

Problem: To provide a performance monitoring system and a performance monitoring method that make it possible to grasp, in real time, the response time at each server and internal network on the processing path of a specific access.
Solution: The system comprises a server system 100, which provides a service by means of a plurality of servers, and a performance monitoring server 200. Each server has means for assigning a request ID to an access, means for handing the request ID over between the servers, and means for outputting an individual log 121 containing the request ID and the time at the start and at the end of processing. The performance monitoring server 200 has a collected log 210, in which the individual logs 121 are collected, and a log totaling unit 220 that identifies logs in the data of the collected log 210 by associating them through the request ID and, based on the processing start and end times at each server, calculates the response time at each server, in the network between the servers, and over the entire processing path.
Selected drawing: Figure 1

Description

The present invention relates to a performance monitoring system and a performance monitoring method for monitoring the performance of a server system that provides services to client terminals via a network through the cooperation of a plurality of servers. In particular, it relates to a technique that is effective when applied to a performance monitoring system and method that monitor performance based on the response time at each server and in the network between servers.

At present, a wide variety of services, including financial transactions, are provided to users over networks such as the Internet by systems in which, for example, a Web server serves as the front end and business servers, DB servers, external servers, and the like cooperate.

Such a system must keep providing its service while maintaining a prescribed response time and the like for an enormous number of service requests from users, and system performance monitoring is generally carried out for that purpose. Service levels, such as response time and other system performance indicators and the failure rate, are in some cases agreed by contract or similar arrangement between the service provider (a financial institution, for example) and the system operator.

Various means have been proposed for monitoring the performance of systems, including individual servers and internal networks, but in general they perform resource-based monitoring of, for example, the CPU usage rate and bandwidth usage of each server and internal network. As an example of monitoring from the direct service-level viewpoint of response time at each server or internal network, there is the system performance monitoring method disclosed in Japanese Patent Application Laid-Open No. 2007-26303 (Patent Document 1).

In the system performance monitoring method disclosed in Patent Document 1, observation information on operating state, such as the CPU usage rate, information on individual accesses, service type, and response time, is collected from each server, internal network switches, and the like. If the observation information needed to measure performance is insufficient, an access generation means generates predetermined accesses to produce the missing observation information; once the observation information is complete, it is analyzed to measure and evaluate performance.

Patent Document 1: JP 2007-26303 A

With a performance monitoring method such as that of Patent Document 1, resource-based performance monitoring using the CPU usage rate and the like, and response-based performance monitoring using response time, can be carried out for each server and for each service type (URL: Uniform Resource Locator).

However, the accesses subject to performance measurement are real user transactions in the service, and the processing performed on the servers varies with what the user specifies. That is, even for the same service type, the processing on each server and the processing path within the system (the servers accessed and the internal networks traversed) differ from access to access. Response time information obtained in this way can therefore fluctuate widely depending on when it is acquired, and may be unsuitable as a criterion for judging whether the service level is being met. Obtaining a valid value requires, for example, averaging over many accesses or over a long period, which increases the load and sacrifices real-time capability.

One conceivable approach is to treat a pseudo access generated by the access generation means of Patent Document 1 as a reference access and to measure performance by acquiring the response time for that access. With a performance monitoring method such as that of Patent Document 1, however, although the response time for that access as seen from the client terminal can be measured, the data for that access cannot be identified in the collected observation information, so the response time at each server for that access cannot be obtained.

It is therefore difficult to grasp in real time fine-grained information such as how the client-side response time for a specific access breaks down across the servers and internal networks on its processing path.

Accordingly, an object of the present invention is to provide a performance monitoring system and a performance monitoring method that make it possible to grasp, in real time, the response time at each server and internal network on the processing path of a specific access. The above and other objects and novel features of the present invention will become apparent from the description in this specification and the accompanying drawings.

Of the inventions disclosed in this application, a representative one is briefly outlined as follows.

A performance monitoring system according to a representative embodiment of the present invention comprises a server system that provides services to client terminals via a network through the cooperation of a plurality of servers, and a performance monitoring server that monitors the performance of the server system. Each server has: means for assigning to an access, when the server is the one that first received the access from the client terminal, a request ID that makes the access distinguishable from other accesses; means for handing the request ID over between the servers on the processing path of the access in the server system; and means for outputting, at the start and at the end of processing of the access, a log containing the request ID and the time as an individual log. The performance monitoring server has: a collected log in which the individual logs output by the servers are collected; and a log totaling unit that identifies, in the data of the collected log, the logs output during processing of the access at each server by associating them through the request ID and, based on the processing start time and end time at each server recorded in the identified logs, calculates the response time at each server, in the network between the servers, and over the entire processing path for the processing of the access in the server system.

The effects obtained by a representative one of the inventions disclosed in this application are briefly described as follows.

According to a representative embodiment of the present invention, the individual response times at each server and internal network on the processing path of a specific access can be grasped in real time, which makes it possible to realize a response-based performance monitoring system that can evaluate bottlenecks and the like on the processing path more accurately.

Furthermore, according to a representative embodiment of the present invention, an access that follows a predetermined processing path can be used as a reference access. By grasping the response time of that access along its processing path, it becomes possible to realize a performance monitoring system that evaluates the current service level of the system without bias and provides an indicator well suited to serving as a baseline in performance prediction and the like.

Embodiments of the present invention are described in detail below with reference to the drawings. In all the drawings used to describe the embodiments, the same parts are in principle given the same reference symbols, and repeated description of them is omitted.

FIG. 1 shows an overview of a configuration example of a performance monitoring system according to an embodiment of the present invention. The performance monitoring system consists of a server system 100, which provides services to a client terminal 400 via a network 500 such as the Internet through the cooperation of a plurality of servers such as a Web server, an application server, a DB server, and a gateway server to an external system, and a performance monitoring server 200, which monitors the performance of the server system 100. There may be a plurality of server systems 100, depending on the types of service provided.

A standard request transmission terminal 300 is also connected to the server system 100 as a terminal that transmits standard requests, which are described later. The standard request transmission terminal 300 is connected to the server system 100 via the internal network rather than via the external network 500, but it may instead be connected via the network 500 in the same way as the client terminal 400.

In the example of FIG. 1, the server system 100 has four servers, server A 110, server B 120, DB server C 130, and server D 140, each connected via a network, but the number of servers and how they are connected are not limited to this configuration. These servers cooperate, through processing by middleware and application programs, reading and writing data in the DB 132, and so on, to provide services as the server system 100 in response to requests from the client terminal 400. The server system 100 may provide not just one but several types of service, in which case the path of servers that perform the processing may differ for each service.

When a service is provided, server A 110, which is the first to receive the access from the client terminal 400, assigns to each access a unique request ID that makes the access distinguishable from other accesses. The request ID is handed over whenever processing requests and responses for that access are exchanged between servers.

When a server receives a processing request or response for the access and executes processing, it writes a log entry to the individual log 111 at the start and at the end of that processing. The request ID is also written to the individual log 111, so that the entries for the access can be identified in the individual logs 111 written by the servers. A database, a file, or the like can be used for the individual log 111.

The functions for assigning the request ID and writing to the individual log 111 can be implemented in each server by, for example, building them into a common module that runs in the base layer, a layer below the application program.

In this case, the base layer of each server has: means for assigning a request ID to an access when the server is the one that first received the access from the client terminal 400; means for handing the request ID over between the servers on the processing path of the access in the server system 100; and means for outputting, at the start and at the end of processing of the access, a log containing the request ID and the time as the individual log 111.
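The assign-or-inherit rule for the request ID can be sketched as follows. This is a minimal illustration, not the implementation described in the specification: the function name `resolve_request_id`, the module-level counter, and the "A001"-style format are assumptions made for the example.

```python
import itertools
from typing import Optional

# Hypothetical ID generator producing "A001", "A002", ... for illustration.
_sequence = itertools.count(1)


def resolve_request_id(incoming_id: Optional[str]) -> str:
    """Return the request ID to use at this server.

    A request that arrives without an ID is treated as the first contact
    from a client terminal, so a new ID is assigned. Otherwise the ID
    received from the previous server is taken over unchanged.
    """
    if incoming_id is None:
        return f"A{next(_sequence):03d}"
    return incoming_id


first_hop_id = resolve_request_id(None)           # first server assigns an ID
second_hop_id = resolve_request_id(first_hop_id)  # later server inherits it
```

Keeping the rule in one function means every server on the path can run the same code, and only the server that sees no incoming ID ever creates one.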

FIG. 2 shows an example of the configuration of a server and an outline of its processing, using server A 110 as the example. In server A 110, the OS/middleware 112 runs at the lowest software layer. Above it runs the base layer 113, which has the basic functions needed to operate as a server in the server system 100 together with the means described above. Above that runs the application program 114, which performs the business processing and the like specific to each server.

For example, when server A 110 receives a processing request from the client terminal 400, the base layer 113, to which the OS/middleware 112 passes the processing, assigns a request ID, writes a start log entry containing the request ID and the time to the individual log 111, and passes the processing to the application program 114. The application program 114 performs its prescribed processing, such as business processing and DB access, and then passes the processing back to the base layer 113.

The base layer 113 writes an end log entry containing the request ID and the time to the individual log 111 and transmits a processing request to another server (server B 120 in the example of FIG. 2). The request ID is transmitted along with the request and is thereby handed over. As a result, the application program 114 can perform its business processing without being involved in handling the request ID or in writing to the individual log 111.

The example of FIG. 2 shows the processing when a processing request is received at server A 110, the first server to receive the access from the client terminal 400; the processing is the same when a processing response is received, and at the other servers. In those cases, however, the request ID has already been assigned and handed over, so no request ID is assigned.
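The division of labour between the base layer and the application program can be sketched as a wrapper that logs before and after calling the application. This is an illustrative sketch only: `base_layer_handle`, its parameters, the dictionary layout of a log entry, and the injectable `clock` are all assumptions, and error handling is omitted.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


def base_layer_handle(
    server: str,
    kind: str,                                  # "Req" or "Rsp"
    payload: str,
    request_id: Optional[str],                  # None only at the first server
    application: Callable[[str], str],          # stands in for the application program
    individual_log: List[Dict[str, object]],    # stands in for the individual log
    new_id: Callable[[], str],                  # hypothetical ID generator
    clock: Callable[[], float] = time.time,
) -> Tuple[str, str]:
    """Log the start, run the application, log the end, return ID and result."""
    if request_id is None:
        request_id = new_id()  # assigned only when no ID has been handed over

    def write(io: str) -> None:
        individual_log.append({
            "request_id": request_id,
            "server": server,
            "type": kind,
            "io": io,              # "IN" at the start, "OUT" at the end
            "timestamp": clock(),
        })

    write("IN")
    result = application(payload)  # the application never sees the ID or the log
    write("OUT")
    # The caller forwards request_id together with the result to the next server.
    return request_id, result


log: List[Dict[str, object]] = []
ticks = iter([10.0, 10.5])
rid, out = base_layer_handle(
    "A1", "Req", "order", None, str.upper, log,
    new_id=lambda: "A001", clock=lambda: next(ticks),
)
```

Because the application callable receives only the payload, it stays unaware of the request ID and of the logging, which is the separation the text describes.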

In the configuration example of FIG. 1, the performance monitoring server 200 is a computer system that has, for example, a log totaling unit 220, an evaluation unit 230, and a user interface (I/F) unit 240. It also holds the following data: a collected log 210, log record information 250, response time threshold information 260, and service route information 270. The log totaling unit 220, the evaluation unit 230, and the user I/F unit 240 are implemented by one or more programs. A database, a file, or the like can be used for each of the collected log 210, the log record information 250, the response time threshold information 260, and the service route information 270.

The collected log 210 is the result of collecting and merging the individual logs 111 of the servers in the server system 100. Various means can be used to collect the logs. For example, the collected log 210 can be placed on storage that all servers can access, such as NAS (Network Attached Storage), and each server can asynchronously write what it outputs to its individual log 111 to the collected log 210 as well. In that case, data can be written to the collected log 210 at almost the same time as it is written to the individual log 111.

As described above, the request ID is output to the individual log 111. In the collected log 210, therefore, the entries output to the respective individual logs 111 can be associated with one another by request ID for each access from the client terminal 400, and the log entries for any given access can be identified along its entire processing path.

Because each server outputs log entries containing the time at the start and at the end of processing, the response time of the processing at each server can be calculated from these entries. In addition, for adjacent servers on the processing path, the response time in the network between them can be calculated from the end entry of the upstream server and the start entry of the downstream server.
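The two calculations can be sketched as a single pass over the time-ordered entries of one request ID: an IN entry followed by an OUT entry on the same server gives a server time, and an OUT entry followed by an IN entry on the next server gives a network time. The sketch assumes that server clocks are synchronized and that the path is strictly sequential; the entry layout, the millisecond timestamps, and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple

Entry = Dict[str, object]


def response_times(entries: List[Entry], request_id: str) -> Tuple[List[Tuple[str, int]], int]:
    """Return per-segment times and the total time for one request ID."""
    hops = sorted(
        (e for e in entries if e["request_id"] == request_id),
        key=lambda e: e["timestamp_ms"],
    )
    segments: List[Tuple[str, int]] = []
    for prev, cur in zip(hops, hops[1:]):
        elapsed = cur["timestamp_ms"] - prev["timestamp_ms"]
        if prev["io"] == "IN" and cur["io"] == "OUT" and prev["server"] == cur["server"]:
            segments.append((str(prev["server"]), elapsed))           # time inside a server
        elif prev["io"] == "OUT" and cur["io"] == "IN":
            segments.append((f'{prev["server"]}-{cur["server"]}', elapsed))  # network time
    total = hops[-1]["timestamp_ms"] - hops[0]["timestamp_ms"] if hops else 0
    return segments, total


# Hypothetical collected-log data: path A1 -> B1 -> A1 for "ZA01", plus an unrelated access.
collected = [
    {"request_id": "ZA01", "server": "A1", "io": "IN", "timestamp_ms": 0},
    {"request_id": "A001", "server": "A1", "io": "IN", "timestamp_ms": 3},
    {"request_id": "ZA01", "server": "A1", "io": "OUT", "timestamp_ms": 30},
    {"request_id": "ZA01", "server": "B1", "io": "IN", "timestamp_ms": 35},
    {"request_id": "ZA01", "server": "B1", "io": "OUT", "timestamp_ms": 95},
    {"request_id": "ZA01", "server": "A1", "io": "IN", "timestamp_ms": 101},
    {"request_id": "ZA01", "server": "A1", "io": "OUT", "timestamp_ms": 111},
]
segments, total = response_times(collected, "ZA01")
# segments == [("A1", 30), ("A1-B1", 5), ("B1", 60), ("B1-A1", 6), ("A1", 10)], total == 111
```

The network figures are differences between timestamps taken on different machines, so in practice they are only as accurate as the clock synchronization between the servers.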

The log totaling unit 220 periodically identifies, in the data of the collected log 210, the log entries output at each server during processing of a predetermined access (described later) by associating them through the request ID. Based on the processing start and end times at each server recorded in the identified entries, it calculates the response time at each server, in the network between the servers, and over the entire processing path for the processing of that access in the server system 100, and outputs the results to the log record information 250.

Based on the data in the log record information 250 and the contents of the service route information 270 and the response time threshold information 260, the evaluation unit 230 determines whether the response time at each server, in each network between servers, and over the entire processing path for the processing of the access in the server system 100 exceeds the corresponding predefined response time threshold, and evaluates whether the server system 100 meets the prescribed service level.

As described in detail later, the service route information 270 holds in advance the processing path for each service provided by the server system 100. The response time threshold information 260 holds the predefined response time thresholds for each server and each network between servers on a service's processing path.
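The threshold comparison performed by the evaluation unit can be sketched as follows. The data layouts are assumptions made for illustration: the route information is modelled as a list of segment names per service, the thresholds as a dictionary of millisecond limits with a hypothetical "TOTAL" key for the whole path, and all numbers are invented.

```python
from typing import Dict, List


def find_violations(
    service: str,
    measured_ms: Dict[str, int],         # measured response time per segment
    total_ms: int,                       # measured response time over the whole path
    route_info: Dict[str, List[str]],    # service name -> segments on its path
    thresholds_ms: Dict[str, int],       # segment (or "TOTAL") -> limit
) -> List[str]:
    """Return the segments whose measured time exceeds its threshold."""
    exceeded = [
        segment
        for segment in route_info[service]
        if measured_ms[segment] > thresholds_ms[segment]
    ]
    if total_ms > thresholds_ms["TOTAL"]:
        exceeded.append("TOTAL")
    return exceeded


# Hypothetical data for one service.
route_info = {"domestic stock order": ["A1", "A1-B1", "B1", "B1-A1"]}
thresholds = {"A1": 50, "A1-B1": 10, "B1": 40, "B1-A1": 10, "TOTAL": 150}
measured = {"A1": 30, "A1-B1": 5, "B1": 60, "B1-A1": 6}

violations = find_violations("domestic stock order", measured, 111, route_info, thresholds)
meets_service_level = not violations
```

In this invented data only server B1 exceeds its limit, so the service level would be judged as not met and B1 would be the segment to flag.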

The user I/F unit 240 is means for displaying a performance monitoring screen for the server system 100 on a display or the like. For example, it displays a monitoring screen that draws the processing path of each service, shows the response time for each server and each network between servers on the path, and displays a warning when a threshold is exceeded. Resource-based performance monitoring information acquired by other performance measurement means, such as the CPU usage rate and network bandwidth in use, may be displayed alongside it or on a switchable view. When the response time of a server or network exceeds its threshold, the unit may also have means for automatically sending e-mail to the people concerned, such as the system administrator, in addition to displaying the warning.

In this way, the performance monitoring server 200 of the present embodiment can monitor performance by acquiring, for each access from the client terminal 400, the response times at each server and each network between servers along the processing path and comparing them with the thresholds. However, accesses from the client terminal 400 follow various patterns, each with a different processing path. Consequently, when the response time of a given access is calculated along its processing path, the path may, for example, end at an intermediate server, so the result may be inadequate as a performance indicator for the server system 100 as a whole.

In the present embodiment, therefore, the standard request transmission terminal 300 periodically transmits a standard request to the server system 100. A standard request is a request for an actual service, such as a stock order, but it is a dummy transaction request: by prior agreement with the service provider, the real-world service (the actual buying or selling of shares, for example) is not carried out.

It is desirable that the parameters and the like of the standard request be set so that it follows the processing path with the greatest impact (the longest response time) in the server system 100. The standard request is always processed along the same processing path. It is used as the reference access: the log entries that each server outputs to its individual log 111 for the processing of the standard request are identified in the collected log 210, and the response times are calculated from them. This yields response time information that is well suited to serve as a baseline for performance evaluation and performance prediction. More than one type of standard request may be defined.

So that the log entries for a standard request can be identified in the collected log 210, server A 110, when assigning a request ID to each access, assigns to an access made by a standard request a special request ID that identifies the access as a standard request. Whether a given access is a standard request can be determined in various ways; for example, accesses from a specific user ID may be judged to be standard requests.

FIG. 3 shows the data structure of the individual log 111 with an example of concrete data. The individual log 111 has the following fields: a request ID field 301, a service name field 302, a server field 303, an application field 304, a type field 305, an I/O field 306, a time stamp field 307, and a log data field 308. It may have other fields as well. An entry is added to the individual log 111 at the start and at the end of processing whenever a server receives a processing request or response and processes it.

The request ID field 301 holds the request ID assigned to each access by server A 110. In the example of FIG. 3, "A001" and "A002" are request IDs assigned to ordinary accesses, and "ZA01" is a request ID assigned to an access made by a standard request. In the present embodiment, the request ID of a standard request begins with "Z", a letter not otherwise used, so that it can be identified as a standard request, but the scheme is not limited to this.
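The "Z" prefix convention, combined with the user-ID check mentioned as one possible way of recognizing a standard request, can be sketched as follows. The user ID "std-monitor", the function names, and the exact numbering format are hypothetical and chosen only to reproduce IDs of the form shown in the example.

```python
# Hypothetical set of user IDs reserved for standard (dummy) requests.
STANDARD_USER_IDS = {"std-monitor"}


def make_request_id(user_id: str, sequence: int) -> str:
    """Assign a request ID; standard requests get the reserved "Z" prefix."""
    if user_id in STANDARD_USER_IDS:
        return f"ZA{sequence:02d}"
    return f"A{sequence:03d}"


def is_standard_request(request_id: str) -> bool:
    """A request ID beginning with "Z" marks a standard request."""
    return request_id.startswith("Z")


standard_id = make_request_id("std-monitor", 1)   # "ZA01"
ordinary_id = make_request_id("alice", 1)         # "A001"
```

Encoding the distinction in the ID itself means the log totaling step can pick out standard requests from the collected log without consulting any other table.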

The service name field 302 holds the name of the service to which the access relates. A single server can sometimes process several services ("domestic stock order" and "domestic stock inquiry" in the example of FIG. 3), and the field is kept so that they can be told apart. The server field 303 and the application field 304 hold information identifying, respectively, the server and the application program 114 that output the log entry. The naming conventions for this information are not limited to those shown in the example of FIG. 3.

The type field 305 holds information identifying the processing direction: whether the entry was logged on receipt of a processing request for the access ("Req" in the example of FIG. 3) or on receipt of a processing response ("Rsp" in the example of FIG. 3). The I/O field 306 holds information identifying whether the entry was logged at the start of processing for the access ("IN" in the example of FIG. 3) or at the end ("OUT" in the example of FIG. 3). Together, these fields make it possible to determine at what point the entry was output.

The time stamp field 307 holds the time at which the log entry was output. The log data field 308 holds application-level log data, such as parameter data and processing results, output individually by the application program 114 on each server. The format of the value in the log data field 308 may therefore differ from server to server.
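The fields 301 to 308 map naturally onto a record type. The following is a minimal sketch, assuming Python attribute names, a `datetime` timestamp, and a free-form string for the log data; the application name "APP01" and the sample values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class IndividualLogEntry:
    request_id: str      # field 301, e.g. "A001" or "ZA01"
    service_name: str    # field 302
    server: str          # field 303
    application: str     # field 304
    kind: str            # field 305: "Req" or "Rsp"
    io: str              # field 306: "IN" (start) or "OUT" (end)
    timestamp: datetime  # field 307
    log_data: str = ""   # field 308: format may differ from server to server

    def __post_init__(self) -> None:
        # Reject values outside the two-valued fields described in the text.
        if self.kind not in ("Req", "Rsp"):
            raise ValueError(f"unknown type: {self.kind}")
        if self.io not in ("IN", "OUT"):
            raise ValueError(f"unknown I/O: {self.io}")


entry = IndividualLogEntry(
    request_id="ZA01",
    service_name="domestic stock order",
    server="A1",
    application="APP01",  # hypothetical application identifier
    kind="Req",
    io="IN",
    timestamp=datetime(2008, 11, 11, 9, 0, 0),
)
```

Making the record immutable reflects that a log entry, once written, is only read and never updated.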

Because the collection log 210 is the result of collecting and merging the individual logs 111 of the servers, its data configuration is the same as that of the individual log 111 shown in FIG. 3, and its description is omitted.

FIG. 4 is a diagram showing the data configuration of the log record information 250 and an example of specific data. The log record information 250 has a request ID field 401, a service name field 402, a server field 403, an application field 404, a type field 405, and a response time field 406. It may have other fields as well. For accesses related to standard requests, the log record information 250 holds the measured response time for each server and each inter-server network on the processing path.

The request ID field 401 and the service name field 402 are the same as the request ID field 301 and the service name field 302 shown in FIG. 3. However, the log record information 250 contains only the results that the log totaling unit 220 obtained by identifying and aggregating, from the collection log 210, the logs for accesses related to standard requests. The request ID field 401 therefore contains only values that indicate an access related to a standard request (values beginning with "Z" in the present embodiment).

The server field 403 indicates information identifying the element for which the response time is calculated, that is, a server ("A1" and "B1" in the example of FIG. 4) or an inter-server network ("A1-B1" and "B1-A1" in the example of FIG. 4) on the processing path of the access related to the standard request.

The application field 404 indicates information identifying the application program 114 for which the response time is calculated. When the value of the server field 403 indicates an inter-server network, no application program 114 exists, so the value of the application field 404 is set to, for example, "-". The type field 405 indicates the processing direction, that is, whether the response time of the target entry applies to the case where a processing request was received or to the case where a processing response was received. In the example of FIG. 4, the value "Req/Rsp" indicates the case where a server receives a processing request, performs processing by the application program 114, and then turns around and sends a processing response (for example, the DB server C130 and the server D140 in FIG. 1).

The response time field 406 indicates, in milliseconds, the response time of the target server or inter-server network, calculated for each processing-direction type from the data of the collection log 210 of FIG. 3.

FIG. 5 is a diagram showing the data configuration of the response time threshold information 260 and an example of specific data. The response time threshold information 260 has a service name field 501, a server field 502, an application field 503, a type field 504, and a threshold field 505. It may have other fields as well. For accesses related to standard requests, the response time threshold information 260 holds the response time threshold defined in advance for each server and each inter-server network on the processing path.

The service name field 501, server field 502, application field 503, and type field 504 are substantially the same as the service name field 402, server field 403, application field 404, and type field 405 shown in FIG. 4. However, in the example of FIG. 5, the value "Path" in the type field 504 indicates the entire processing path of the service rather than an inter-server network. The threshold field 505 indicates, in milliseconds, the response time threshold for the target server, inter-server network, or entire processing path.

FIG. 6 is a diagram showing the data configuration of the service route information 270 and an example of specific data. The service route information 270 has a service name field 601, a sequence number field 602, a server field 603, an application field 604, and a server name field 605. It may have other fields as well. The service route information 270 holds information on each server and each inter-server network on the processing path for accesses related to standard requests.

The service name field 601, server field 603, and application field 604 are the same as the service name field 402, server field 403, and application field 404 shown in FIG. 4. The sequence number field 602 indicates the position on the processing path at which the server or inter-server network of the target entry is processed. Following the servers and inter-server networks in this order identifies the processing path for the target service. The server name field 605 indicates the name of the target server.
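As a rough sketch of how the sequence number field 602 yields a processing path, the snippet below sorts the rows of one service by sequence number. The `RouteRow` type, the `processing_path` function, and the sample server names are hypothetical; only the field meanings come from FIG. 6.

```python
from typing import NamedTuple


class RouteRow(NamedTuple):
    service: str      # service name field 601
    seq: int          # sequence number field 602
    server: str       # server field 603 (server or inter-server network)
    app: str          # application field 604 ("-" for a network)
    server_name: str  # server name field 605


def processing_path(rows, service):
    """Return the servers and inter-server networks of a service in path order."""
    hops = sorted((r for r in rows if r.service == service), key=lambda r: r.seq)
    return [r.server for r in hops]


# Hypothetical rows, deliberately stored out of order.
rows = [
    RouteRow("domestic stock order", 3, "B1", "AP02", "order server"),
    RouteRow("domestic stock order", 1, "A1", "AP01", "web server"),
    RouteRow("domestic stock order", 2, "A1-B1", "-", ""),
    RouteRow("domestic stock inquiry", 1, "A1", "AP03", "web server"),
]
path = processing_path(rows, "domestic stock order")
```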

FIG. 7 is a flowchart showing an example of the outline of processing in the performance monitoring server 200 of the present embodiment. After the standard request transmission terminal 300 has periodically sent standard requests and they have been processed, the performance monitoring server 200 starts its processing and first collects and merges the data of the individual logs 111 of the servers into the collection log 210 (S701). As described above, this can be done, for example, by placing the collection log 210 on storage that all servers can access, such as a NAS, and having each server asynchronously write the content output to its individual log 111 to the collection log 210.

Next, for each type of standard request, the log totaling unit 220 identifies, from the data of the collection log 210, the logs output at each server during the processing of the standard request by associating them by request ID. Based on the processing start time and end time information at each server in the identified logs, it calculates the response time at each server, in the network between servers, and over the entire processing path for the processing of that standard request in the server system 100, and stores the results in the log record information 250 (S702). Standard requests can be sent and response times calculated at a frequency of, for example, once per hour.
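One way to picture step S702 is the sketch below. It rests on several assumptions that this passage does not spell out: server clocks are synchronized, a server's response time is the gap between its IN and OUT entries, an inter-server network's response time is the gap between one server's OUT entry and the next server's IN entry, and the whole-path time is the gap between the first and last entries. The names `Event` and `response_times` and the sample values are hypothetical; the "Z" prefix for standard requests follows the present embodiment.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    request_id: str
    server: str
    kind: str  # "Req" or "Rsp"
    io: str    # "IN" or "OUT"
    t_ms: int  # time stamp in milliseconds (clocks assumed synchronized)


def response_times(events, standard_prefix="Z"):
    """Group standard-request logs by request ID and derive per-hop times."""
    by_request = defaultdict(list)
    for ev in events:
        if ev.request_id.startswith(standard_prefix):
            by_request[ev.request_id].append(ev)

    results = {}
    for rid, evs in by_request.items():
        evs.sort(key=lambda e: e.t_ms)
        spans = []
        for prev, cur in zip(evs, evs[1:]):
            elapsed = cur.t_ms - prev.t_ms
            if prev.io == "IN" and cur.io == "OUT" and prev.server == cur.server:
                # Time spent inside one server; a turnaround server gets "Req/Rsp".
                kind = cur.kind if prev.kind == cur.kind else f"{prev.kind}/{cur.kind}"
                spans.append((cur.server, kind, elapsed))
            elif prev.io == "OUT" and cur.io == "IN" and prev.server != cur.server:
                # Time spent on the network between two servers.
                spans.append((f"{prev.server}-{cur.server}", cur.kind, elapsed))
        results[rid] = {"spans": spans, "path_ms": evs[-1].t_ms - evs[0].t_ms}
    return results


# Hypothetical logs for one standard request travelling A1 -> B1 -> A1,
# plus one ordinary request that is filtered out.
events = [
    Event("Z001", "A1", "Req", "IN", 0),
    Event("Z001", "A1", "Req", "OUT", 5),
    Event("Z001", "B1", "Req", "IN", 8),
    Event("Z001", "B1", "Rsp", "OUT", 20),
    Event("Z001", "A1", "Rsp", "IN", 22),
    Event("Z001", "A1", "Rsp", "OUT", 30),
    Event("U777", "A1", "Req", "IN", 1),
]
result = response_times(events)
```

Because the network time is derived purely from the servers' own log entries, this approach needs no logging support from the network equipment itself.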

Next, a loop over all standard request types is started (S703). Within this loop, a further loop is started over all servers and inter-server networks on the processing path for the access related to the target standard request (S704). The servers and inter-server networks on the processing path can be obtained, for example, from the service route information 270.

In the loop over servers and inter-server networks, the user I/F unit 240 displays the response time held in the log record information 250 for the target server or inter-server network at the corresponding location on the monitoring screen (S705). Next, the evaluation unit 230 determines whether the response time exceeds the threshold set in advance in the response time threshold information 260 (S706). If it does, the user I/F unit 240 displays a warning at the corresponding location on the monitoring screen, and the system administrator or the like is notified by e-mail or similar means (S707).

Steps S705 to S707 are repeated for all servers and inter-server networks on the processing path (S708), and steps S704 to S708 are repeated for all standard request types (S709), which completes one cycle of processing.
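The nested loops of steps S703 to S709 can be sketched as follows. This is a simplified illustration rather than the embodiment itself: the dictionaries stand in for the service route information 270, the log record information 250, and the response time threshold information 260, keyed here only by service, server, and type; `display` and `notify` are hypothetical callbacks for the user I/F unit 240 and the e-mail notification.

```python
def run_cycle(paths, measured, thresholds, display, notify):
    """Run one monitoring cycle and return the elements that raised a warning.

    paths:      {service: [(server_or_network, kind), ...]} in path order
    measured:   {(service, server_or_network, kind): response time in ms}
    thresholds: {(service, server_or_network, kind): threshold in ms}
    """
    alerts = []
    for service, elements in paths.items():          # S703: each request type
        for server, kind in elements:                # S704: each path element
            key = (service, server, kind)
            ms = measured.get(key)
            if ms is None:
                continue                             # nothing measured yet
            display(service, server, kind, ms)       # S705: show response time
            limit = thresholds.get(key)
            if limit is not None and ms > limit:     # S706: compare to threshold
                notify(service, server, kind, ms, limit)  # S707: warn and notify
                alerts.append(key)
    return alerts                                    # S708/S709: loops complete


# Hypothetical data for one service.
svc = "domestic stock order"
paths = {svc: [("A1", "Req"), ("A1-B1", "Req"), ("B1", "Req/Rsp")]}
measured = {(svc, "A1", "Req"): 5, (svc, "A1-B1", "Req"): 3, (svc, "B1", "Req/Rsp"): 12}
thresholds = {(svc, "A1", "Req"): 10, (svc, "B1", "Req/Rsp"): 10}

shown, sent = [], []
alerts = run_cycle(paths, measured, thresholds,
                   lambda *args: shown.append(args),
                   lambda *args: sent.append(args))
```

Passing the display and notification steps in as callbacks keeps the threshold check testable without a monitoring screen or a mail server.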

FIG. 8 is a diagram showing an example of the monitoring screen displayed by the user I/F unit 240. The monitoring screen shows, for example, the types of services, the processing path of each service, and the logical arrangement of the servers and inter-server networks on each path in a form that is easy to grasp visually, together with the response time at each server and in each inter-server network. Portions where the response time exceeds the threshold are flagged with a warning, for example by highlighting.

In addition, for each service, the target response time for the entire processing path and the measured response time at each measurement timing are displayed in a table. Here too, a service whose response time exceeds the threshold is flagged with a warning, for example by highlighting.

The monitoring screen of FIG. 8 is merely an example, and other display formats may be used. For example, the screen may be combined with results from other resource-based performance measurement means, such as CPU usage rate, so that a warning is displayed for a server whose resource usage exceeds a predetermined threshold, or so that selecting the server shows its detailed performance measurement results. The monitoring screen may also be switched from a service view, which displays the response time for each service, to a network view based on the network configuration, in which performance measurements such as bandwidth utilization and response time can be displayed and monitored for each line.

As described above, the performance monitoring server 200 of the present embodiment associates the logs output by the servers for an access related to a standard request on the basis of the request ID and, for the resulting series of processing, calculates response times from the logs output at the start and end of processing at each server. This makes it possible to grasp in real time the response time at each server and in each inter-server network on the processing path, so that bottlenecks and the like on the path can be evaluated more accurately. It also makes it possible to grasp the response time of an inter-server network made up of network devices, such as hubs, that have no log output function.

Furthermore, by using standard requests, which are always processed by the servers on a predetermined processing path, and grasping the response time over the processing path for accesses related to those requests, the current service level of the system can be evaluated without bias, and an index well suited as a baseline for performance evaluation, performance prediction, and the like can be provided.

The invention made by the present inventors has been described above specifically on the basis of the embodiment. Needless to say, however, the present invention is not limited to that embodiment and can be modified in various ways without departing from its gist.

The present invention can be used in a performance monitoring system and a performance monitoring method for monitoring the performance of a server system that provides services to client terminals via a network through cooperation of a plurality of servers.

FIG. 1 is a diagram showing an outline of a configuration example of a performance monitoring system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the configuration of a server and an outline of its processing in an embodiment of the present invention.
FIG. 3 is a diagram showing the data configuration of an individual log and an example of specific data in an embodiment of the present invention.
FIG. 4 is a diagram showing the data configuration of the log record information and an example of specific data in an embodiment of the present invention.
FIG. 5 is a diagram showing the data configuration of the response time threshold information and an example of specific data in an embodiment of the present invention.
FIG. 6 is a diagram showing the data configuration of the service route information and an example of specific data in an embodiment of the present invention.
FIG. 7 is a flowchart showing an example of the outline of processing in the performance monitoring system in an embodiment of the present invention.
FIG. 8 is a diagram showing an example of the monitoring screen displayed by the user I/F unit in an embodiment of the present invention.

Explanation of symbols

100 ... server system, 110 ... server A, 111 ... individual log, 112 ... OS/middleware, 113 ... base layer, 114 ... application program, 120 ... server B, 121 ... individual log, 130 ... DB server C, 131 ... individual log, 140 ... server D, 141 ... individual log, 200 ... performance monitoring server, 210 ... collection log, 220 ... log totaling unit, 230 ... evaluation unit, 240 ... user I/F unit, 250 ... log record information, 260 ... response time threshold information, 270 ... service route information, 300 ... standard request transmission terminal, 400 ... client terminal, 500 ... network,
301 ... request ID field, 302 ... service name field, 303 ... server field, 304 ... application field, 305 ... type field, 306 ... I/O field, 307 ... time stamp field, 308 ... log data field,
401 ... request ID field, 402 ... service name field, 403 ... server field, 404 ... application field, 405 ... type field, 406 ... response time field,
501 ... service name field, 502 ... server field, 503 ... application field, 504 ... type field, 505 ... threshold field,
601 ... service name field, 602 ... sequence number field, 603 ... server field, 604 ... application field, 605 ... server name field.

Claims

A performance monitoring system comprising: a server system that provides services to client terminals via a network through cooperation of a plurality of servers; and
a performance monitoring server that monitors the performance of the server system, wherein
each of the servers has: first means for assigning, when the server is the one that first received an access from the client terminal, a request ID to the access such that the access can be distinguished from other accesses;
second means for taking over the request ID between the servers on the processing path relating to the access in the server system; and
third means for outputting a log including the request ID and a time as an individual log at the start and at the end of the processing relating to the access, and
the performance monitoring server has: a collection log in which the individual logs output by the servers are collected; and
a log totaling unit that identifies, from the data of the collection log, the logs output at each server during the processing relating to the access by associating them by the request ID, and that calculates, based on the start time information and the end time information of the processing at each server in the identified logs, the response time at each server, in the network between the servers, and over the entire processing path in the processing in the server system relating to the access.

The performance monitoring system according to claim 1,
wherein the access is an access related to a standard request that is processed by each server on a predetermined processing path in the server system.

The performance monitoring system according to claim 2,
wherein, when assigning the request ID to an access related to the standard request, the server assigns a request ID from which the access can be identified as being related to the standard request.

The performance monitoring system according to any one of claims 1 to 3,
wherein the first means, the second means, and the third means in each server operate in a layer below the application program that runs on the server.

The performance monitoring system according to any one of claims 1 to 4,
wherein the performance monitoring server further has an evaluation unit that determines whether each of the response times calculated by the log totaling unit, at each server, in the network between the servers, and over the entire processing path in the processing in the server system relating to the access, exceeds a predefined response time threshold.

A performance monitoring method in a system comprising a server system that provides services to client terminals via a network through cooperation of a plurality of servers, and
a performance monitoring server that monitors the performance of the server system, the method comprising:
a step in which each of the servers, when it is the server that first received an access from the client terminal, assigns to the access a request ID such that the access can be distinguished from other accesses;
a step in which each of the servers outputs a log including the request ID and a time as an individual log at the start of the processing relating to the access, and passes the processing to an application program;
a step in which each of the servers outputs a log including the request ID and a time as the individual log at the end of the processing relating to the access, after the processing is handed back from the application program;
a step in which each of the servers takes over the request ID to another server on the processing path relating to the access in the server system;
a step in which the performance monitoring server collects the individual logs output by the servers into a collection log; and
a step in which the performance monitoring server identifies, from the data of the collection log, the logs output at each server during the processing relating to the access by associating them by the request ID, and calculates, based on the start time information and the end time information of the processing at each server in the identified logs, the response time at each server, in the network between the servers, and over the entire processing path in the processing in the server system relating to the access.