JPH0217823B2

Info

Publication number: JPH0217823B2
Application number: JP59071207A
Authority: JP
Inventors: Yukari Kyomoto; Kimio Yamanaka
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1984-04-09
Filing date: 1984-04-09
Publication date: 1990-04-23
Also published as: JPS60214068A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical field of the invention]

The present invention relates to a multiprocessor system in which a plurality of processors share a common area.

[Prior art]

FIG. 1 is a block diagram showing a conventional multiprocessor system disclosed in, for example, Japanese Patent Publication No. 58-53382. In the figure, 1A to 1N are a plurality of processors denoted P₀ to Pn, 2A to 2N are a plurality of common areas denoted CA₀ to CAn, and 3 is a common area management device. The figure shows a general configuration.

In a multiprocessor system, each item of output data held in a common area generally belongs to a specific processor. If, for example, another processor mistakenly writes to an area to which processor A is outputting, because of a software (S/W) error or a malfunction during a debugging operation, the result is a serious failure of the system. Various means of preventing such erroneous output to the common areas have therefore been considered.

The arrangement shown in FIG. 2 has been considered as one such means of preventing erroneous output. In the figure, 10A and 10B are processors CPUa and CPUb; 30A to 30D are common areas CA₀ to CA₃; 3 is a common area management device; 40A and 40B are output request signals REQa and REQb from processors CPUa 10A and CPUb 10B; 41 is a bus contention control unit; 42 is a comparator; 43 is a table implemented in RAM; 44 is a gate; and 45 is a common bus. FIG. 3 shows the format of the RAM table 43, in which 50 is an occupancy flag and 51 is a processor number.

In FIG. 2, when, for example, processor CPUa 10A wants to output data to common area CA₁ 30B, CPUa 10A sends output request signal REQa 40A and, in accordance with the common area access permission signal, places the address of CA₁ 30B and the output data on common bus 45. The common area management device 3 receives address i of CA₁ 30B and refers to the contents of address i in table 43. If occupancy flag 50 at address i is not set, CA₁ 30B is free, that is, not occupied by any processor. The device therefore sets occupancy flag 50 in the entry at address i, writes the processor number of CPUa 10A into processor number 51, and then opens gate 44 to permit CPUa 10A to output to CA₁ 30B. If occupancy flag 50 is already set, comparator 42 compares the processor identified by bus contention control unit 41 with the contents of processor number 51 in table 43. If they match, CA₁ 30B is an area managed by CPUa 10A, so output of the data to CA₁ 30B is permitted. If they do not match, the output is erroneous, so gate 44 is closed and output to CA₁ 30B is prohibited. Moreover, if table 43 is, for example, entirely cleared at initialization, there is no need to set in advance which processor corresponds to each common area: the common area management device 3 manages the common areas and at the same time automatically sets the processor corresponding to each common area.
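The write arbitration described above can be illustrated with a minimal software sketch. It is only a model of the behavior attributed to table 43, comparator 42 and gate 44, not the circuit of FIG. 2; the class and method names (TableEntry, CommonAreaManager, write) are hypothetical, and the output request signals and bus contention control are reduced to a processor number passed as an argument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableEntry:
    occupied: bool = False            # models occupancy flag 50
    processor: Optional[int] = None   # models processor number 51


class CommonAreaManager:
    """Hypothetical software model of common area management device 3."""

    def __init__(self, n_areas: int) -> None:
        # Clearing every entry at initialization means no owner is preset.
        self.table = [TableEntry() for _ in range(n_areas)]
        self.areas = [0] * n_areas

    def write(self, processor: int, address: int, value: int) -> bool:
        entry = self.table[address]
        if not entry.occupied:
            # Free area: the first writer is registered as its owner.
            entry.occupied = True
            entry.processor = processor
        elif entry.processor != processor:
            # Mismatch: treated as erroneous output, the "gate" stays closed.
            return False
        # Owner matches (or was just registered): the "gate" opens.
        self.areas[address] = value
        return True
```

In this sketch, the first processor to write to a free address becomes its owner, and later writes from any other processor to that address are refused while leaving the stored value unchanged.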

In conventional multiprocessor systems, however, when at least one processor failed, the failed processor was either left in the failed state or disconnected from the system. In systems that require real-time processing and in systems that perform online processing, it is desirable that failure handling be carried out by another healthy processor promptly after the failure occurs. Failure handling here includes abnormally terminating the job that the failed processor was processing or handing it over to a healthy processor; releasing the memory areas that the failed processor was using; and releasing the input/output devices that the failed processor was using.

As a conventional technique of this kind, Japanese Patent Publication No. 59-2943 discloses a multiprocessor system configured by coupling a plurality of processors, in which a failure control device accessible from each processor receives a failure signal when a failure occurs in at least one of the processors and instructs another healthy processor to perform failure handling.

However, for outputs made up of small information units, such as digital outputs to a process in plant control, applying the above function requires a large amount of extra hardware (H/W) and is inefficient.

For this reason, the following methods have been used as countermeasures for outputs to a process.

In the first method, shown in FIG. 4, when, for example, processor CPUa 10A goes down, CPUa 10A sends signal DWNa 21A to processor CPUb 10B to notify it of the failure. On receiving this signal, CPUb 10B starts its backup S/W and performs fail-safe processing, such as resetting outputs, on the common areas to which processor CPUa 10A outputs.

In the second method, shown in FIG. 5, the down signals DWNa 21A and DWNb 21B of processors CPUa 10A and CPUb 10B are passed directly to the process outputs by wired logic. An outline follows.

Items 3, 10A, 10B, 20A, 20B, 21A and 21B are the same as in FIG. 4, and 30A, 30B, 30C and 30D are common areas. Processors CPUa 10A and CPUb 10B each contain a processor status determination unit, JUDGa 20A and JUDGb 20B respectively, that determines the status of its own processor. While a processor is operating normally, it does not send its processor down signal DWNa 21A or DWNb 21B to the common areas it occupies among CA₀ 30A, CA₁ 30B, CA₂ 30C and CA₃ 30D. In FIG. 5, CA₀ 30A and CA₁ 30B are occupied by processor CPUa 10A, and CA₂ 30C and CA₃ 30D by processor CPUb 10B. While the processor down signals DWNa 21A and DWNb 21B are absent, output to the common areas CA₀ 30A, CA₁ 30B, CA₂ 30C and CA₃ 30D is possible. For example, output to CA₁ 30B is possible only while processor CPUa 10A is normal, and only from processor CPUa 10A. When processor CPUa 10A goes down, it sends processor down signal DWNa 21A, and CA₀ 30A and CA₁ 30B, which CPUa 10A occupies, are turned off, so that fail-safe processing is performed.

In the first conventional fail-safe method, each processor must be provided with the backup S/W needed when another processor goes down, and the common areas corresponding to each processor must be set in that backup S/W, which is a cumbersome procedure. To change the common areas, the backup S/W of every processor had to be modified, which made such changes very difficult.

The second method has the drawback that each processor must be provided with H/W that signals to its corresponding common areas that the processor is normal, and each common area must be provided with H/W that receives that signal and performs fail-safe processing. This also made it very difficult to change which common areas correspond to each processor.

[Summary of the invention]

The present invention was made to eliminate the drawbacks described above. Its object is to provide a multiprocessor system in which fail-safe processing is carried out easily, without requiring backup S/W or special H/W, by providing means that, when a processor goes down, detects the down processor through the table of the common area management device and causes fail-safe processing to be performed on the common areas corresponding to that processor.

[Embodiments of the invention]

An embodiment of the present invention is described below with reference to the drawings.

In FIG. 6, 30A to 30D, 10A, 10B, 40A, 40B, 3, and 41 to 45 are the same as in FIG. 2. 60 is a fail-safe processing device; 61A and 61B are processor down signals DWNa and DWNb; 62 is a switching signal; 63 is a fail-safe processing signal; and 64 is a selector that receives switching signal 62 and switches between fail-safe processing and common area output.

When processor CPUa 10A goes down, processor down signal DWNa 61A, indicating that CPUa 10A is down, is sent from CPUa 10A to fail-safe processing device 60. Fail-safe processing device 60 finds the addresses of the common areas occupied by CPUa 10A through table 43 of common area management device 3, sends switching signal 62 to switch selector 64, and sends fail-safe processing signal 63 through selector 64 to common bus 45. It thereby performs fail-safe processing on the common areas occupied by CPUa 10A, such as resetting their outputs or moving them to a required state, for example a state on the safe side for the system. In addition, the assignment of processors to common areas, and changes to that assignment, can be made easily.
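As an illustration only, the table lookup performed by fail-safe processing device 60 can be modeled as a scan of the ownership table for entries registered to the down processor. The function name fail_safe, the tuple layout of the table, and the choice of 0 as the safe output value are assumptions made for this sketch; the description does not say whether an entry is released after fail-safe processing, so the sketch leaves the table untouched.

```python
from typing import List, Optional, Tuple

# Each table entry is modeled as (occupancy flag, processor number).
Entry = Tuple[bool, Optional[int]]


def fail_safe(table: List[Entry], areas: List[int],
              down_processor: int, safe_value: int = 0) -> List[int]:
    """Reset every common area registered to the down processor.

    Returns the addresses that received fail-safe processing.
    """
    handled = []
    for address, (occupied, owner) in enumerate(table):
        if occupied and owner == down_processor:
            # Stands in for driving fail-safe signal 63 onto the bus.
            areas[address] = safe_value
            handled.append(address)
    return handled


# Example: CPUa (number 0) owns areas 0 and 1, CPUb (number 1) owns area 2.
table = [(True, 0), (True, 0), (True, 1), (False, None)]
areas = [11, 22, 33, 44]
handled = fail_safe(table, areas, down_processor=0)
```

Because the owner of each area is read from the table at the time of the failure, neither per-processor backup S/W nor per-area wiring has to encode the assignment, which is the point the embodiment makes.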

Although the above embodiment uses table 43 consisting of occupancy bit 50 and processor number 51, occupancy bit 50 may be omitted if a bit-pattern format is defined that indicates the free state, in which the area is not occupied by any processor.
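A minimal sketch of this variation follows, assuming a reserved value stands for the free state. The value 0xFF and the names FREE and try_claim are hypothetical; any bit pattern that cannot be a valid processor number would serve.

```python
FREE = 0xFF  # hypothetical reserved pattern meaning "not occupied"


def try_claim(table: list, address: int, processor: int) -> bool:
    """Return True if the processor may output to the given address."""
    if table[address] == FREE:
        # Free area: register the requesting processor as owner.
        table[address] = processor
        return True
    # Occupied: output is allowed only for the registered owner.
    return table[address] == processor


table = [FREE] * 4                 # all entries initialized to the free pattern
first = try_claim(table, 2, 0)     # processor 0 claims address 2
second = try_claim(table, 2, 1)    # processor 1 is refused
```

The trade-off is that one processor number value becomes unusable, in exchange for one fewer bit per table entry.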

Also, although in the above embodiment the addresses of the common areas and the addresses of table 43 correspond one to one, the correspondence need not be one to one if a device that maps the common areas onto table 43 is added.
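One possible mapping, shown purely as an assumption since the description does not specify one, is to let a block of consecutive common area addresses share a single table entry. BLOCK_SIZE and table_index are hypothetical names.

```python
BLOCK_SIZE = 16  # hypothetical number of common area addresses per table entry


def table_index(address: int, block_size: int = BLOCK_SIZE) -> int:
    """Map a common area address to the table entry covering its block."""
    return address // block_size


# Addresses 0..15 share entry 0, addresses 16..31 share entry 1, and so on.
indices = [table_index(a) for a in (0, 15, 16, 40)]
```

With such a mapping the table shrinks by a factor of the block size, at the cost of assigning ownership per block rather than per address.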

Also, in the above embodiment, fail-safe processing device 60 refers to table 43 when a processor goes down and performs fail-safe processing on the common areas corresponding to the down processor. If the multiprocessor system has a device that monitors the state of the system, however, the fail-safe processing function for processor failures may instead be provided in that device.

Also, the above embodiment describes the case in which the processor number of the output-requesting processor is registered automatically. Any means that can store the occupying processor number for each item of output data will do; for example, the processor numbers could be set manually.

[Effect of the invention]

As described above, according to the present invention, when a processor goes down, the down processor is detected through the table of the common area management device, and fail-safe processing is performed on the common areas corresponding to that processor. A multiprocessor system is thus obtained that can perform fail-safe processing reliably without requiring backup S/W or special H/W.

[Brief explanation of drawings]

FIGS. 1 and 2 are block diagrams each showing a conventional multiprocessor system; FIG. 3 is a diagram of the format of the table of the common area management device of FIG. 2; FIGS. 4 and 5 are block diagrams each showing another conventional multiprocessor system; and FIG. 6 is a block diagram showing a multiprocessor system according to an embodiment of the present invention.

In the figures, 3 is a common area management device; 10A and 10B are processors; 30A, 30B, 30C and 30D are common areas; 43 is a table; 60 is a fail-safe processing device; 61A and 61B are processor down signals; and 63 is a fail-safe processing signal. In the figures, the same reference numerals denote the same or corresponding parts.

Claims

[Claims]

1. A multiprocessor system comprising a plurality of processors, a common area management device shared by the plurality of processors, and one or more common areas accessible via the common area management device, characterized by comprising: means for registering the processor number of an output-requesting processor in a table, within the common area management device, that corresponds to the addresses of the common areas; means for comparing the registered processor number with the processor number of the processor granted output permission in the common area management device to determine whether they match; and means for, when a processor goes down, detecting the down processor through the table and causing fail-safe processing to be performed on the common area corresponding to the down processor.