
JP2007265137A - Multi-task processing method and multi-task processing apparatus

Info

Publication number: JP2007265137A
Application number: JP2006090718A
Authority: JP
Inventors: Tomotake Koike; 友岳小池; Atsushi Ohori; 厚大堀
Original assignee: Oki Electric Industry Co Ltd; Oki Networks Co Ltd
Current assignee: Oki Electric Industry Co Ltd; Oki Networks Co Ltd
Priority date: 2006-03-29
Filing date: 2006-03-29
Publication date: 2007-10-11

Abstract

PROBLEM TO BE SOLVED: To provide a multitask processing method and a multitask processing apparatus that improve system reliability by shortening the service restart time when a failure occurs.
SOLUTION: In the multitask processing method, when a failure occurs, management process means restarts one or more service processes on a shared memory in parallel with processing in which dump processing means dumps, as a core file, a shared resource having the one or more service processes held in the shared memory. The method is characterized in that, when the failure occurs, the dump processing means backs up the memory area of the one or more service processes held on the shared memory to an area different from the memory area of the one or more service processes on the shared memory that are restarted by the management process means.
COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a multitask processing method and a multitask processing apparatus, and can be applied, for example, to mission-critical systems in which service must be maintained at all times.

For example, a service providing server (switching server) such as a typical call server accommodates a very large number of users, and provides its services by generating, in a heavily concurrent manner, the many events needed to serve each user.

A service providing server such as a call server also manages the services it provides simultaneously to this huge number of users by running a general-purpose multitasking OS (operating system) such as Linux (UNIX; both registered trademarks) or Windows (registered trademark). It therefore has a service process with a very large memory footprint and a management process that keeps the service process running.

This service process is retained in a memory area serving as shared memory even when the process terminates, and the management process can restart the process in order to maintain the service state.

Conventionally, in a service providing server running, for example, a Linux (UNIX; both registered trademarks) or Windows (registered trademark) OS, when a system failure or abnormal program behavior during process execution causes a fatal error from which processing cannot continue, the program is forcibly terminated. At that point, so that the failure can be analyzed afterwards, a process called a core dump can write the memory and register contents at the time of the failure to the hard disk as a file named "Core" (see Patent Document 1).

By loading the Core file written to the hard disk into a debugger, the location of the error, the values of variables at that time, and so on can be analyzed after the fact, and the cause of the error can be identified.

Patent Document 1: JP 2005-301570 A

In a service providing server (for example, a call server) running a general-purpose multitasking OS such as Linux (UNIX; both registered trademarks) or Windows (registered trademark), when a fatal error during service causes the service process to be forcibly terminated, a core dump is generated in order to write the Core file to the hard disk.

At this point, generating the core dump consumes an excessive share of system resources such as CPU time. As a result, even the process restart that the management process must perform is delayed, and the time until the service restart completes is greatly prolonged.

In addition, writing to a hard disk is generally slow, and dumping the region of a service process that holds a large amount of memory takes time to complete. If the management process restarts the service process immediately after the failure, the initialization of the restarted service process therefore runs in parallel with the core dump output. The restarted service process then modifies shared memory regions whose dump has not yet completed, so a core file reflecting the state at the time of the failure cannot be produced and failure analysis becomes impossible.
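The overwrite hazard described above can be illustrated with a toy model. This is only a sketch: the `bytearray` stands in for the shared memory region, the chunked copy stands in for the slow dump to disk, and the function name and parameters are hypothetical rather than taken from the patent.

```python
def interleaved_dump(region: bytearray, chunk: int, reinit_after_chunks: int) -> bytes:
    """Copy `region` chunk by chunk, as a slow dump would.

    After `reinit_after_chunks` chunks have been copied, the restarted
    service's initialization is simulated by zeroing the whole region,
    so the remaining chunks no longer hold the state at failure time.
    """
    dumped = bytearray()
    for i, start in enumerate(range(0, len(region), chunk)):
        if i == reinit_after_chunks:
            # Restarted service process reinitializes the same region mid-dump.
            region[:] = bytes(len(region))
        dumped += region[start:start + chunk]
    return bytes(dumped)


state_at_failure = bytearray(b"ABCDEFGH")
core = interleaved_dump(state_at_failure, chunk=2, reinit_after_chunks=2)
print(core)  # b'ABCD\x00\x00\x00\x00': only the first half reflects the failure state
```

The resulting dump mixes pre-failure and post-restart contents, which is why the dump and the restarted process must not share the same region.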

In a system that processes work concurrently across multiple processes and threads, a long outage becomes a performance bottleneck.

FIG. 2 is a sequence showing an example of restart processing in a clustering system that manages an active call server in a redundant system comprising an active system and a standby system.

In FIG. 2, when a fatal exception occurs in the service process 14 (S201), the kernel 12a delivers a forced-termination signal such as SIGSEGV to the service process 14 (S202), and on receiving it the service process 14 outputs a core file (S203, S204).

Because the management process 13 is the parent process that created the service process, system switchover from the active system to the standby system is triggered by receipt of the SIGCHLD signal from the service process 14 (S209).

However, the management process 13 receives the SIGCHLD signal (S206) only after output of the Core file has finished (S205), so the process restart is greatly delayed.

There is therefore a need for a multitask processing method and a multitask processing apparatus that, in mission-critical systems where service must be maintained at all times, improve system reliability by deferring the core dump processing triggered by a software failure and shortening the service restart time.

To solve this problem, the multitask processing method of the first aspect of the present invention is a multitask processing method in which, when a failure occurs, management process means restarts one or more service processes on a shared memory in parallel with processing in which dump processing means dumps, as a core file, a shared resource having the one or more service processes held in the shared memory, characterized in that, when the failure occurs, the dump processing means backs up the memory area of the one or more service processes held on the shared memory to an area different from the memory area of the one or more service processes on the shared memory that are restarted by the management process means.

The multitask processing apparatus of the second aspect of the present invention comprises dump processing means that, when a failure occurs, dumps as a core file a shared resource having one or more service processes held in a shared memory, and management process means that restarts the one or more service processes on the shared memory in parallel with the dump processing by the dump processing means, characterized in that, when the failure occurs, the dump processing means backs up the memory area of the one or more service processes held on the shared memory to an area different from the memory area of the one or more service processes on the shared memory that are restarted by the management process means.

According to the present invention, system reliability in a mission-critical system can be improved by deferring the core dump processing triggered by a software failure and shortening the service restart time.

(A) Embodiment
Embodiments of the multitask processing method and multitask processing apparatus of the present invention are described below with reference to the drawings.

The embodiment described below takes as its example a multi-call-processing task system provided in a call server (switching server) of an IP telephone system.

This embodiment is characterized in that, in a cluster-based system, output of the Core file can be temporarily held back by temporarily backing up the shared memory information when a fatal error occurs. This spreads the load and makes the restart time comparable to that of a normal process restart.

(A-1) Configuration of the Embodiment
FIG. 3 is a block diagram showing the configuration of the multitask processing apparatus 1 according to this embodiment.

The multitask processing apparatus according to this embodiment is applied, for example, to a call server that handles a large volume of VoIP signals (for example, 800,000 lines) from IP telephones and the like, and its hardware 11 is the same as that of a typical call server. That is, it includes a CPU, memory, a mass storage device 10 such as an internal HDD, a keyboard, a mouse, a display, a communication interface unit, and so on. The CPU is connected to the memory via a system device, and to the mass storage device 10, keyboard, mouse, display, communication interface unit, and so on via the system device and input/output devices.

The multitask processing apparatus 1 runs a general-purpose multitasking OS 12 such as Linux (UNIX; both registered trademarks) or Windows (registered trademark). The general-purpose multitasking OS has at least a kernel 12a and device drivers 12b for operating the system device and the input/output devices.

The multitask processing apparatus 1 also has a service process 14, which occupies a large memory area in order to implement the call processing service provided to users, and a management process 13 that keeps the service process running.

The service process 14 has a memory area serving as shared memory 15 so that, even after the process terminates, the service state can be carried over by restarting the process.

As a functional unit, the service process 14 has a dump management unit 14a which, on receiving a signal from the kernel 12a instructing forced termination after a fatal error, generates an alias referring to the shared memory currently used by the service process and temporarily backs up the information stored in the shared memory.

As functional units, the management process 13 has a service process stop processing unit 13a which, on receiving the SIGUSR2 signal from the service process 14, stops the service process and instructs switchover to the standby server, and a service process regeneration processing unit 13b which regenerates the service process after the switchover to the standby server.

(A-2) Operation of the Embodiment
Next, the multitask processing operation of the multitask processing apparatus 1 of this embodiment is described with reference to the drawings. In the following, the multitask processing apparatus 1 is assumed to be running on Linux (UNIX; both registered trademarks).

The following description assumes that the multitask processing apparatus 1 is operating as the active system and, when a failure occurs, switches over to a standby multitask processing apparatus (not shown). The standby multitask processing apparatus may have the same functions as the multitask processing apparatus 1.

FIG. 1 is a sequence diagram showing the multitask dump management processing in the active multitask processing apparatus 1 when a fatal error occurs.

When a fatal error occurs in the service process 14 as a result of an exception (S101), the kernel 12a delivers a forced-termination signal such as SIGSEGV to the service process 14 (S102).

When the kernel 12a delivers a forced-termination signal such as SIGSEGV to the service process 14, the dump management unit 14a of the service process 14 generates an alias referring to the shared memory 15 currently used by the service process (S103).

Once the alias for the shared memory 15 of the service process has been generated, the dump management unit 14a of the service process 14 sends the SIGUSR2 signal to the management process 13 (S104).

SIGUSR2 is a user-definable signal. In this embodiment it is defined to mean that the dump management unit 14a of the service process 14 has generated the alias referring to the shared memory 15 and finished backing up the shared memory information on the shared memory 15, and that the management process 13 should therefore perform the service process stop processing.

The dump management unit 14a of the service process 14 also sends the SIGSTOP signal to the service process itself, putting all of its processes into the stopped state (S105 and S106). This stops the processes of the service process 14 and thereby also halts generation of the core dump, so output of the Core file to the hard disk 10 can be deferred.

When the service process 14 sends the SIGUSR2 signal to the management process 13, the service process stop processing unit 13a of the management process 13 performs the service process stop processing (S107) and instructs the standby multitask processing apparatus to perform system switchover (S108). This stop processing includes, for example, releasing the virtual IP address and virtual MAC address set on the local system's interface and stopping data duplication with the standby multitask processing apparatus.

Once the management process 13 has issued the switchover instruction to the standby multitask processing apparatus, the service process regeneration processing unit 13b of the management process 13 sends SIGCONT, the resume signal, to the service process 14 (S109). The service process 14, which had been stopped, can then generate the core dump, and output of the Core file to the hard disk 10 resumes.
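The signal handshake in steps S104 to S109 can be sketched with real POSIX signals. This is a minimal sketch under stated assumptions, not the patented implementation: it assumes Python on a POSIX system such as Linux, the child starts at the point where the dump management unit would already have been invoked (no real SIGSEGV is raised), the message written to the pipe is a stand-in for the deferred core dump, and the `waitpid(..., WUNTRACED)` check is an addition that keeps the demonstration free of races.

```python
import os
import signal


def run_sequence():
    """Return the events the management process observes, in order."""
    events = []
    # Block SIGUSR2 so the manager can collect it synchronously with sigwait().
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR2})
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Service process (child).
        os.close(read_fd)
        os.kill(os.getppid(), signal.SIGUSR2)  # S104: notify the management process
        os.kill(os.getpid(), signal.SIGSTOP)   # S105/S106: suspend self, deferring the dump
        os.write(write_fd, b"core written")    # runs only after SIGCONT (stand-in for the dump)
        os._exit(0)
    # Management process (parent).
    os.close(write_fd)
    signal.sigwait({signal.SIGUSR2})
    events.append("SIGUSR2 received")
    _, status = os.waitpid(pid, os.WUNTRACED)  # wait until the child is really stopped
    if not os.WIFSTOPPED(status):
        raise RuntimeError("service process did not stop as expected")
    events.append("service stopped")
    events.append("switchover instructed")     # S107/S108 would happen here
    os.kill(pid, signal.SIGCONT)               # S109: let the deferred dump proceed
    events.append("SIGCONT sent")
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)                         # reap the child after the dump stand-in finishes
    events.append(data.decode())
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return events


if __name__ == "__main__":
    for event in run_sequence():
        print(event)
```

The point of the ordering is that the switchover is instructed while the service process is stopped, so the expensive dump cannot compete with it for CPU or disk.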

To restart the service process 14, the service process regeneration processing unit 13b of the management process 13 also creates a new memory area of the shared memory 15 (S111), and the service process is restarted through the service process regeneration processing (S112 and S113).

At this point, because the service process generated an alias for the old memory area of the shared memory 15 when the failure occurred, the area being core-dumped is a different memory area of the shared memory 15 from the one held by the restarted service process, so output of the Core file to the hard disk 10 is assured.
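The patent does not specify how the alias is implemented. One way to picture the effect is the sketch below, which assumes a file-backed region mapped with `mmap` and models "generating an alias" as renaming the backing file, so that the restarted service creates a fresh region under the original name. The file names, the `.failed` suffix, and the region size are all hypothetical.

```python
import mmap
import os
import tempfile

REGION_SIZE = 4096  # hypothetical size of the shared region


def create_region(path, payload):
    """Create a file-backed shared region and store `payload` at offset 0."""
    with open(path, "wb") as f:
        f.truncate(REGION_SIZE)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), REGION_SIZE) as m:
        m[:len(payload)] = payload
        m.flush()


def read_region(path, length):
    """Read `length` bytes from the start of a file-backed region."""
    with open(path, "rb") as f, mmap.mmap(
        f.fileno(), REGION_SIZE, access=mmap.ACCESS_READ
    ) as m:
        return bytes(m[:length])


def alias_region(path):
    """Give the failed region a new name so the original name is free again."""
    alias = path + ".failed"
    os.rename(path, alias)
    return alias


def demo():
    with tempfile.TemporaryDirectory() as d:
        region = os.path.join(d, "service_shm")
        create_region(region, b"state-at-failure")  # state held when the fault occurs
        alias = alias_region(region)                # S103: alias the old region
        create_region(region, b"fresh-state")       # S111: restarted service gets a new region
        # The dump can still read the old state through the alias.
        return read_region(alias, 16), read_region(region, 11)


print(demo())  # (b'state-at-failure', b'fresh-state')
```

Because the old and new regions are distinct objects, the restarted service's initialization cannot disturb the contents that the deferred dump still has to write out.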

When output of the Core file to the hard disk 10 completes (S114), the SIGCHLD signal is delivered from the service process 14 to the management process 13 (S115). However, the management process 13 can regenerate the new process without waiting to receive the SIGCHLD signal from the service process 14, which enables a quick and reliable switchover from the active system to the standby system immediately after the failure.

Thereafter, the multitask processing apparatus 1, which had been operating as the active system, returns to operation as the standby system.

(A-3) Effects of the Embodiment
As described above, according to this embodiment, when a fatal error occurs the service process 14 stops itself, which temporarily suspends output of the Core file to the hard disk 10, and system switchover from the active system to the standby system can be performed during the suspension. As a result, switchover can be performed quickly without depending on the delay caused by writing the Core file to the hard disk 10, so the service interruption time can be shortened.
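The effect on interruption time can be expressed as simple arithmetic. The sketch below is a rough model only: the function names are hypothetical, and the durations in the usage lines are made-up illustrative figures, not measurements from the patent.

```python
def conventional_outage(dump_s: float, switchover_s: float) -> float:
    """Conventional flow: switchover starts only after the dump ends and SIGCHLD arrives."""
    return dump_s + switchover_s


def deferred_dump_outage(alias_s: float, switchover_s: float) -> float:
    """Embodiment: switchover starts once the alias exists; the dump runs afterwards."""
    return alias_s + switchover_s


# Illustrative figures only: a 30 s dump, a 2 s switchover, 0.5 s to create the alias.
print(conventional_outage(30.0, 2.0))   # 32.0
print(deferred_dump_outage(0.5, 2.0))   # 2.5
```

In this model the dump duration drops out of the interruption time entirely, which is the dependency the embodiment is designed to remove.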

Also according to this embodiment, changing the alias referring to the shared memory 15 of the service process when a fatal error occurs keeps the Core file from being overwritten or corrupted as it could be conventionally, so the Core file is guaranteed to reflect the state at the time of the failure.

(B) Other Embodiments
The present invention can be applied widely to systems that run a general-purpose OS and have a clustering configuration in which multiple processes operate in parallel, and in particular to systems whose processes have a large memory footprint and that provide services requiring real-time performance.

FIG. 1 is a sequence diagram showing dump management processing in the multitask processing apparatus of the embodiment.
FIG. 2 is a sequence diagram showing dump management processing in a conventional multitask processing apparatus.
FIG. 3 is a block diagram showing the configuration of the multitask processing apparatus of the embodiment.

Explanation of symbols

1 ... multitask processing apparatus, 10 ... hard disk, 11 ... hardware, 12 ... general-purpose multitasking OS, 12a ... kernel, 12b ... device driver, 13 ... management process, 13a ... service process stop processing unit, 13b ... service process regeneration processing unit, 14 ... service process, 14a ... dump management unit, 15 ... shared memory.

Claims

1. A multitask processing method in which, when a failure occurs, management process means restarts one or more service processes on a shared memory in parallel with processing in which dump processing means dumps, as a core file, a shared resource having the one or more service processes held in the shared memory, wherein, when the failure occurs, the dump processing means backs up the memory area of the one or more service processes held on the shared memory to an area different from the memory area of the one or more service processes on the shared memory that are restarted by the management process means.

2. The multitask processing method according to claim 1, wherein the dump processing means commands a halt of process operation when the failure occurs and, during the period in which process operation is halted, commands the management process means to stop the one or more service processes.

3. A multitask processing apparatus comprising dump processing means that, when a failure occurs, dumps as a core file a shared resource having one or more service processes held in a shared memory, and management process means that restarts the one or more service processes on the shared memory in parallel with the dump processing by the dump processing means, wherein, when the failure occurs, the dump processing means backs up the memory area of the one or more service processes held on the shared memory to an area different from the memory area of the one or more service processes on the shared memory that are restarted by the management process means.