KR100913799B1

KR100913799B1 - CPU Degeneration System and Degeneration Method Using a Service Processor

Info

Publication number: KR100913799B1
Application number: KR1020077020230A
Authority: KR
Inventors: 히데노리 히가시; 아키히로 야마자키
Original assignee: 후지쯔 가부시끼가이샤
Priority date: 2007-09-04
Filing date: 2007-09-04
Publication date: 2009-08-26
Anticipated expiration: 2025-03-18
Also published as: KR20070100842A

Abstract

An object of the present invention is to make it possible to degenerate, before the OS starts, a CPU that shows signs of unstable operation. To this end, immediately before the microprogram ends, the microprogram acquires from the service processor the error CPU information that the service processor has collected (information the microprogram cannot recognize directly). Based on the acquired error CPU information, the microprogram requests the service processor to degenerate the CPU showing signs of unstable operation, and the service processor executes the degeneration of that CPU before the OS starts.

Description

CPU SUPPRESSION SYSTEM AND CPU SUPPRESSION METHOD USING SERVICE PROCESSOR

The present invention relates to a CPU degeneration system and a degeneration method using a service processor, which detect a CPU showing signs of unstable operation before the system goes into operation and keep that CPU from being incorporated into the system.

FIG. 1 is a schematic diagram showing the configuration of a conventional CPU degeneration system using a service processor. The microprogram (Micro Program) 11 in FIG. 1 functions as a module that diagnoses the units making up the system, triggered by a system reset such as power-on or a reboot/reset. The microprogram 11 is executed by the CPU on the operational system side, that is, CPU (A) 10 in FIG. 1. Although CPU (A) 10 is drawn as a single CPU, the CPU on the operational system side is in practice made up of a plurality of CPUs.

By the time CPU control on the operational system side is passed to the microprogram 11 in FIG. 1, the service processor side has already checked the operation of the CPU, as indicated by arrow (1), so the CPU on the operational system side is able to perform at least basic operations using its own hardware resources. For a CPU on the operational system side that the operation check by the service processor CPU (B) 20 has judged NG (No Good), the service processor CPU (B) 20 executes CPU stop processing, so control is not passed to the microprogram 11.

As indicated by arrows (2), (3), and (4) in FIG. 1, the microprogram 11 performs initial setting and diagnosis of the system configuration units, namely CPU (A) 10, the memory (Memory) 12, and the I/O unit (I/O Unit) 13 of the operational system, and exchanges information with the service processor side.

FIG. 2 is a schematic diagram showing the processing sequence among the microprogram (Micro Program), the system CPU, and the service processor in the conventional CPU degeneration system using a service processor. The numbers in the sequence chart of FIG. 2 correspond to the numbers in the configuration diagram of FIG. 1, so the operation of the conventional system is described with reference to both figures. As shown in FIG. 2, the service processor side first powers on the system (A1). It then checks the operation of the system CPU 10 [(1) in FIG. 1] (A2). If the CPU 10 on the operational system side is able to perform at least basic operations using its own hardware resources, CPU control is passed to the microprogram 11 (A4). If the operation check judges the CPU NG (No Good), the service processor performs CPU stop processing and control is not passed to the microprogram 11 (A3).

The microprogram 11, having received control from the service processor 20, starts initial setting and diagnosis of the system configuration units, beginning with CPU diagnostic processing [(2) in FIG. 1] (A5). During initial setting and diagnosis of the system configuration units, the service processor 20 constantly monitors error occurrence on the system CPU 10, but reports the error status to the system-side CPU only immediately before the microprogram moves on to the next control (A6). From the time the microprogram 11 is executing the initial setting/diagnosis of the units making up the system [arrows (2), (3), (4) in FIG. 1] until the handover after OS startup, the service processor CPU (B) 20 recognizes the occurrence of CPU hardware errors [arrow (5) in FIG. 1]. When the number of errors exceeded a preset threshold, the CPU was regarded as "a CPU showing signs of unstable operation that may later cause a serious error" and was placed in a standby state by the OS. That is, conventionally the CPU was not physically detached; it was detached in software by leaving it with no processes assigned, and the microprogram 11 executed the degeneration at the next reset.

If the result of the diagnosis by the microprogram 11 is OK, program processing continues on the normal CPUs and the diagnostic processing ends (A8). If the CPU diagnostic processing by the microprogram shows an NG condition on a system-side CPU itself, the microprogram 11 itself degenerates that CPU (A7); even after a CPU has been degenerated, if program processing can continue on the remaining normal CPUs, processing continues and the diagnostic processing by the microprogram 11 ends (A8). In other words, a CPU showing signs of unstable operation was left with no processes assigned, but only after the OS (system) had already started, and such a CPU could be detached only at the next occasion on which the microprogram ran, such as a reboot. (See Patent Document 1.)

[Patent Document 1] Japanese Unexamined Patent Application Publication No. Hei 08-087341

As described above, the conventional CPU degeneration system using a service processor recognized a CPU showing signs of unstable operation only after OS startup and executed the CPU degeneration at the next system reset. It therefore had the problem that a CPU showing signs of unstable operation could not be degenerated before OS startup.

In the computer field, variation in quality arising from the CPU production process is unavoidable. Because of this variation, identifying CPUs that show signs of unstable operation, recognizing the level of the instability, and detaching such CPUs from the configuration units before the system goes into operation, so that the system runs stably, are important issues.

Recognizing and degenerating a CPU showing signs of unstable operation before operation, so that it is not incorporated into the system configuration units, is important for improving the robustness of the operational system and for reducing repair time and maintenance cost when a failure occurs after operation has begun. However, when these functions are realized in hardware, the rise in system cost and the growth in system size caused by the additional hardware have been obstacles to solving the problem.

To solve the above problems, an object of the present invention is to provide a CPU degeneration system and a degeneration method using a service processor in which the microprogram recognizes a CPU showing signs of unstable operation, decides which CPU should be degenerated, and keeps that CPU from being incorporated into the operational system before OS startup.

The present invention is characterized in that the microprogram acquires from the service processor, immediately before the microprogram ends, the error CPU information collected by the service processor (which the microprogram cannot recognize directly); the microprogram requests the service processor, based on the acquired error CPU information, to degenerate a CPU showing signs of unstable operation; and the service processor executes the degeneration of that CPU before OS startup.

According to the present invention, degenerating a CPU showing signs of unstable operation before OS startup improves the robustness of the operational system and reduces repair time and maintenance cost when a failure occurs after operation has begun.

FIG. 1 is a schematic diagram showing the configuration of a conventional CPU degeneration system using a service processor.

FIG. 2 is a schematic diagram showing the processing sequence among the microprogram (Micro Program), the system CPU, and the service processor in the conventional CPU degeneration system using a service processor.

FIG. 3 is a schematic diagram showing the configuration of the CPU degeneration system using a service processor according to the present invention.

FIG. 4 is a schematic diagram showing the processing sequence among the microprogram (Micro Program), the system CPU, and the service processor in the CPU degeneration system using a service processor according to the present invention.

FIG. 3 is a schematic diagram showing the configuration of the CPU degeneration system using a service processor according to the present invention. The microprogram (Micro Program) 11 in FIG. 3 functions as a module that diagnoses the units making up the system, triggered by a system reset such as power-on or a reboot/reset. The microprogram 11 is executed by the CPU on the operational system side, that is, CPU (A) 10 in FIG. 3. Although CPU (A) 10 is drawn as a single CPU, the CPU on the operational system side is in practice made up of a plurality of CPUs.

By the time CPU control on the operational system side is passed to the microprogram 11 in FIG. 3, the service processor side has already checked the operation of the CPU, as indicated by arrow (1), so the CPU on the operational system side is able to perform at least basic operations using its own hardware resources. For a CPU on the operational system side that the operation check by the service processor CPU (B) 20 has judged NG (No Good), the service processor CPU (B) 20 executes CPU stop processing, so control is not passed to the microprogram 11.

As indicated by arrows (2), (3), and (4) of FIG. 1, the microprogram 11 shown in FIG. 3 performs initial setting and diagnosis of the system configuration units, namely CPU (A) 10, the memory (Memory) 12, and the I/O unit (I/O Unit) 13 of the operational system, and exchanges information with the service processor side.

FIG. 4 is a schematic diagram showing the processing sequence among the microprogram (Micro Program), the system CPU, and the service processor in the CPU degeneration system using a service processor according to the present invention. The numbers in the sequence chart of FIG. 4 correspond to the numbers in the configuration diagram of FIG. 3, so the operation of the system is described with reference to both figures. As shown in FIG. 4, the service processor side first powers on the system (B1). It then checks the operation of the system CPU 10 [(1) in FIG. 3] (B2). If the CPU 10 on the operational system side is able to perform at least basic operations using its own hardware resources, CPU control is passed to the microprogram 11 (B4). If the operation check judges the CPU NG (No Good), CPU stop processing is performed and control is not passed to the microprogram 11 (B3).
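The hand-over in steps B2 to B4 can be pictured with a minimal Python sketch. The function name `run_basic_check` and the callback `is_operable` are hypothetical; the source does not describe how the service processor actually tests a CPU, so the callback merely stands in for that check.

```python
def run_basic_check(cpu_ids, is_operable):
    """Service-processor-side check (B2): split CPUs into passed and stopped.

    `is_operable` is a hypothetical callback standing in for the hardware
    check, which the source does not describe.
    """
    passed, stopped = [], []
    for cpu in cpu_ids:
        if is_operable(cpu):
            passed.append(cpu)   # B4: control is handed to the microprogram
        else:
            stopped.append(cpu)  # B3: CPU stop processing, no control handed over
    return passed, stopped


# Example: CPU 2 fails the basic check (an assumed outcome for illustration).
passed, stopped = run_basic_check([0, 1, 2, 3], lambda cpu: cpu != 2)
print(passed, stopped)  # [0, 1, 3] [2]
```

Only the CPUs in `passed` ever run the microprogram, which is why the later diagnosis can assume basic operability.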

The microprogram 11, having received control from the service processor 20, starts initial setting and diagnosis of the system configuration units, beginning with CPU diagnostic processing [(2) in FIG. 3] (B5). During the CPU diagnostic processing, the service processor 20 constantly monitors error occurrence on the system CPU 10, but reports the error status to the system-side CPU only when the microprogram moves on to the next control (B6). That is, the service processor CPU (B) 20 recognizes CPU hardware errors that occur while the microprogram 11 is executing the initial setting/diagnosis of the units making up the system [arrows (2), (3), (4) in FIG. 3] [arrow (5) in FIG. 3]. When the number of errors exceeds a preset threshold, the CPU is regarded as "a CPU showing signs of unstable operation that may later cause a serious error," and the service processor 20 is requested to degenerate it (B13). If the result of the diagnosis by the microprogram 11 is OK, program processing continues on the normal CPUs (B9). If the CPU diagnostic processing by the microprogram shows an NG (No Good) condition on a system-side CPU itself, the service processor 20 is requested to degenerate that CPU (B7). The service processor 20 executes the CPU degeneration requested by the microprogram 11 (B8), and if program processing can continue on the remaining normal CPUs, processing continues (B9).
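The threshold rule can be sketched in a few lines of Python. This is only an illustration: the value `ERROR_THRESHOLD = 3`, the function name, and the representation of error events as a list of CPU IDs are assumptions, since the source says only that the threshold is set in advance.

```python
from collections import Counter

# Hypothetical value; the source says only that the threshold is preset.
ERROR_THRESHOLD = 3


def cpus_showing_instability(error_events, threshold=ERROR_THRESHOLD):
    """Return the sorted IDs of CPUs whose error count exceeds the threshold.

    `error_events` is a list of CPU IDs, one entry per observed hardware error.
    """
    counts = Counter(error_events)
    return sorted(cpu for cpu, n in counts.items() if n > threshold)


# CPU 2 produced four errors, CPU 0 two, CPU 1 one.
print(cpus_showing_instability([0, 2, 2, 1, 2, 2, 0]))  # [2]
```

Note that the comparison is strict: a CPU whose count merely equals the threshold is not yet treated as a degeneration candidate, matching the wording "exceeds".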

While the microprogram is executing the initial setting/diagnosis of the units making up the system [(2), (3), (4) in FIG. 3], the service processor CPU (B) 20 constantly monitors for CPU hardware errors and recognizes their occurrence [(5) in FIG. 3] (B6). It accumulates the system-side error CPU information (described later) until immediately before the microprogram ends, and creates "Error Info" 21 ([Error Info] in FIG. 3) [(6) in FIG. 3] (B10). "Error Info" 21 thus constitutes the error CPU information accumulation unit.
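One way to picture the accumulation unit is as a per-CPU log kept on the service-processor side. The sketch below is an assumption-laden illustration: the class name `ErrorInfo`, its methods, and the use of short strings for error kinds are all hypothetical, as the source does not specify the layout of "Error Info" 21.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorInfo:
    """Hypothetical stand-in for the "Error Info" 21 accumulation unit."""
    records: dict = field(default_factory=dict)  # cpu_id -> list of error kinds

    def record(self, cpu_id, kind):
        """Called whenever the service processor observes an error (B6)."""
        self.records.setdefault(cpu_id, []).append(kind)

    def snapshot(self):
        """Copy handed to the microprogram on request, just before it ends."""
        return {cpu: list(kinds) for cpu, kinds in self.records.items()}


info = ErrorInfo()
info.record(1, "CE")
info.record(1, "CE")
info.record(3, "DAE")
print(info.snapshot())  # {1: ['CE', 'CE'], 3: ['DAE']}
```

Returning a copy from `snapshot` keeps the accumulated log on the service-processor side unchanged by whatever the microprogram does with the data.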

Immediately before it ends, the microprogram 11 requests the service processor 20 to report the CPUs in which errors have occurred [(7) in FIG. 3] (B11), and the service processor 20 reports those CPUs to the microprogram 11 based on "Error Info" 21 [(8) in FIG. 3] (B12).

The microprogram 11 examines the reported error information, decides which CPUs to ask the service processor 20 to degenerate, and issues the degeneration request for those CPUs to the service processor 20 [(9) in FIG. 3] (B13). Because CPU resources are needed to run the microprogram's initial setting/diagnostic processing, the present invention defers the degeneration of a CPU showing signs of unstable operation until immediately before the microprogram ends.

On receiving the degeneration request from the microprogram 11, the service processor 20 executes the degeneration of that CPU (B14).
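Steps B11 to B14 amount to a short request/response exchange, which the following Python sketch tries to make concrete. Everything here is illustrative: the class `ServiceProcessor`, its method names, the function `finish_microprogram`, and the count-based decision rule are assumptions rather than interfaces defined by the source.

```python
class ServiceProcessor:
    """Hypothetical model of the service-processor side of FIG. 3."""

    def __init__(self, cpu_ids):
        self.active = set(cpu_ids)   # CPUs still part of the configuration
        self.error_counts = {}       # plays the role of "Error Info" 21

    def observe_error(self, cpu_id):
        """Arrow (5): note a hardware error seen during diagnosis."""
        self.error_counts[cpu_id] = self.error_counts.get(cpu_id, 0) + 1

    def report_error_cpus(self):
        """Arrow (8), B12: answer the microprogram's request."""
        return dict(self.error_counts)

    def degenerate(self, cpu_id):
        """B14: remove the CPU from the configuration before OS startup."""
        self.active.discard(cpu_id)


def finish_microprogram(sp, threshold):
    """Run just before the microprogram ends (B11 to B13)."""
    info = sp.report_error_cpus()                              # B11 -> B12
    targets = sorted(c for c, n in info.items() if n > threshold)
    for cpu in targets:
        sp.degenerate(cpu)                                     # B13 -> B14
    return targets


sp = ServiceProcessor([0, 1, 2, 3])
for cpu in [1, 1, 1, 3]:
    sp.observe_error(cpu)
print(finish_microprogram(sp, threshold=2))  # [1]
print(sorted(sp.active))                     # [0, 2, 3]
```

The decision stays in the microprogram while the removal itself is carried out by the service processor, mirroring the division of roles in the text.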

If no CPU hardware error occurs while the microprogram is running (during initial setting/diagnosis of the system configuration units), the initial setting/diagnostic processing by the microprogram ends in the usual way.

Thus, according to the present invention, the error CPU information that the microprogram cannot recognize directly is acquired from the service processor side, which makes it possible to degenerate a CPU showing signs of unstable operation before OS startup. This improves the robustness of the operational system and reduces repair time and maintenance cost when a failure occurs after operation has begun.

The system-side error CPU information is as follows. It includes errors of a high error level that belong to instruction-synchronous errors (Synchronous Error): IAE (Instruction Access Error), seen for example as a memory UE (Uncorrectable Error) during instruction fetch; DAE (Data Access Error), seen for example as a UE during a Load/Store access to the cache; and I_UGE (Instruction Urgent Error), an error that prevents instruction execution but is neither IAE nor DAE, seen for example as a UE in a program-visible register such as the PC (Program Counter Register) or CCR (Condition Codes Register). It also includes errors of a low error level that belong to instruction-asynchronous errors (Asynchronous Error): RE (Restrainable Error), that is, errors that have no harmful effect on the currently running program, such as a CE (Correctable Error) corrected by hardware. In the present invention, regardless of whether the error level is high or low, a CPU degeneration request is issued to the service processor whenever a CPU is judged to be "a CPU showing signs of unstable operation." This improves the robustness of the operational system and reduces repair time and maintenance cost when a failure occurs after operation has begun.
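The grouping of error kinds, and the point that the error level does not gate the degeneration request, can be sketched as follows. The table `ERROR_KINDS`, the function name, and the use of a simple count against a threshold as the instability test are assumptions made for illustration; the labels follow the grouping given in the paragraph above.

```python
# Grouping as described in the text: kind -> (timing, level).
# The dictionary itself is an illustrative construct, not a defined format.
ERROR_KINDS = {
    "IAE": ("synchronous", "high"),
    "DAE": ("synchronous", "high"),
    "I_UGE": ("synchronous", "high"),
    "RE": ("asynchronous", "low"),
}


def is_degeneration_candidate(observed, threshold):
    """True when the error count exceeds the threshold, whatever the level.

    `observed` is the list of error kinds recorded for one CPU.
    """
    unknown = [k for k in observed if k not in ERROR_KINDS]
    if unknown:
        raise ValueError(f"unknown error kinds: {unknown}")
    return len(observed) > threshold


# Three low-level errors are enough; one high-level error is not.
print(is_degeneration_candidate(["RE", "RE", "RE"], threshold=2))  # True
print(is_degeneration_candidate(["IAE"], threshold=2))             # False
```

Under these assumptions a run of low-level correctable errors can mark a CPU as unstable just as readily as high-level ones, which is the behavior the paragraph describes.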

In a computer system where hardware functions for detecting a CPU showing signs of unstable operation, recognizing the level of the instability, and degenerating the CPU cannot be fully implemented for reasons such as cost, implementing these functions in the microprogram makes it possible to build an inexpensive, high-quality computer system that can detect such a CPU and recognize the level of its instability.

Claims

1. A CPU degeneration system using a service processor, comprising: a service processor that checks the operation of a CPU of an operational system and passes CPU control to a microprogram when the CPU is determined to be capable of basic operation using its own hardware resources; and a microprogram that is executed by the CPU of the operational system upon a system reset and diagnoses the system configuration units, wherein the microprogram acquires from the service processor the error CPU information collected until immediately before the end of the diagnosis by the microprogram, and requests the service processor, based on the error CPU information, to degenerate a CPU showing signs of unstable operation.

2. The CPU degeneration system using a service processor of claim 1, wherein the service processor includes an error CPU information accumulation unit that monitors the error occurrence state of the CPU of the operational system and accumulates that state until immediately before the end of the diagnostic processing by the microprogram.

3. The CPU degeneration system using a service processor of claim 2, wherein the service processor, on receiving a request from the microprogram to send the error CPU information, notifies the microprogram of the error CPU information accumulated in the error CPU information accumulation unit.

4. The CPU degeneration system using a service processor of claim 3, wherein the microprogram decides which CPU to degenerate based on the error CPU information notified by the service processor, and requests the service processor to degenerate that CPU.

5. A CPU degeneration method using a service processor, comprising:

the service processor recognizing the occurrence of CPU hardware errors while the microprogram is performing initial setting or diagnosis of the units making up the system, and creating system-side error CPU information;

the microprogram requesting, immediately before the microprogram ends, that the service processor notify it of the information on the CPU in which an error occurred, and the service processor, on receiving the request, notifying the microprogram of that information;

the microprogram judging the content of the notified error CPU information, deciding which CPU to have degenerated, and requesting the degeneration from the service processor; and

the service processor, on receiving the degeneration request from the microprogram, executing the degeneration of the CPU.

delete