CN114816539A - Equipment board card, electronic equipment and control method of equipment board card - Google Patents

Info

Publication number
CN114816539A
Authority
CN
China
Prior art keywords
controller
sensor
signal
management controller
hardware controller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110112993.1A
Other languages
Chinese (zh)
Inventor
王新兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110112993.1A
Publication of CN114816539A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/4401: Bootstrapping
    • G06F9/442: Shutdown

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)

Abstract

The embodiments of the present application disclose a device board, an electronic device, and a control method for the device board, and belong to the technical field of boards. The device board comprises a PSU, a protection circuit, a management controller, a hardware controller, and at least one sensor. The PSU is coupled to the protection circuit, the protection circuit is coupled to the management controller and to the hardware controller, and the management controller is coupled to the hardware controller. The sensor is configured to send an alarm signal to the hardware controller when it determines that the operating state of the device board is abnormal. The hardware controller is configured to send an interrupt signal to the management controller based on the alarm signal, and is further configured to send a closing signal to the protection circuit if the management controller has not sent a shutdown signal when a timeout expires. The embodiments of the present application can effectively reduce the probability of board burning on the device board.

Description

Equipment board card, electronic equipment and control method of equipment board card
Technical Field
The embodiment of the application relates to the technical field of board cards, in particular to an equipment board card, electronic equipment and a control method of the equipment board card.
Background
In industrial use, servers frequently suffer board burning, ranging from the burning of individual electronic components to the burning of the entire server. Whatever its extent, board burning seriously affects the server.
In the related art, a server board can burn for many reasons, such as a poor PCB (Printed Circuit Board) manufacturing process, damaged or abnormal electronic components, a working environment that exceeds component ratings, poor connector contact, components reaching the end of their service life, and poor heat dissipation. Board burning is also generally difficult to reproduce. Usually, the baseboard management controller sends a shutdown instruction to the programmable logic device, and the programmable logic device then sends a closing signal to the protection circuit to stop supplying power to the server board.
However, the baseboard management controller in the related art responds slowly and cannot be guaranteed to send the shutdown instruction to the programmable logic device in time, so board burning may still occur.
Disclosure of Invention
The embodiment of the application provides an equipment board card, electronic equipment and a control method of the equipment board card, and the probability of board burning of the equipment board card is effectively reduced. The technical scheme is as follows:
In one aspect, an embodiment of the present application provides a device board comprising a power supply unit (PSU), a protection circuit, a management controller, a hardware controller, and at least one sensor, wherein the PSU is coupled to the protection circuit, the protection circuit is coupled to the management controller and to the hardware controller, and the management controller is coupled to the hardware controller;
the sensor is configured to send an alarm signal to the hardware controller when it determines that the operating state of the device board is abnormal;
the hardware controller is configured to send an interrupt signal to the management controller based on the alarm signal, where the interrupt signal is used to trigger the management controller to send a shutdown signal to the hardware controller once the abnormal data has been successfully recorded;
the hardware controller is further configured to send a closing signal to the protection circuit if the management controller has not sent the shutdown signal when a timeout expires, where the closing signal is used to trigger the protection circuit to stop the PSU-based power supply to the device board.
In another aspect, an embodiment of the present application provides an electronic device that includes the device board described in the above aspect.
In yet another aspect, an embodiment of the present application provides a control method for a device board, where the device board comprises a power supply unit (PSU), a protection circuit, a management controller, a hardware controller, and at least one sensor, wherein the PSU is coupled to the protection circuit, the protection circuit is coupled to the management controller and to the hardware controller, and the management controller is coupled to the hardware controller;
the method comprises the following steps:
the sensor sends an alarm signal to the hardware controller when it determines that the operating state of the device board is abnormal;
the hardware controller sends an interrupt signal to the management controller based on the alarm signal, where the interrupt signal is used to trigger the management controller to send a shutdown signal to the hardware controller once the abnormal data has been successfully recorded;
and the hardware controller sends a closing signal to the protection circuit if the management controller has not sent the shutdown signal when a timeout expires, where the closing signal is used to trigger the protection circuit to stop the PSU-based power supply to the device board.
The technical scheme provided by the embodiment of the application can bring the following beneficial effects:
Compared with the related art, in which the hardware controller sends the closing signal to the protection circuit only after it has received the shutdown signal from the management controller, the embodiments of the present application can cut off the power supply to protect the device board before board burning occurs, effectively reducing the probability of board burning.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a device board provided by one embodiment of the present application;
Fig. 2 is a schematic diagram of a device board provided by another embodiment of the present application;
Fig. 3 is a schematic diagram of a current sensor provided by one embodiment of the present application;
Fig. 4 is a schematic diagram of a temperature sensor provided by one embodiment of the present application;
Fig. 5 is a schematic diagram of a hardware controller provided by one embodiment of the present application;
Fig. 6 is a schematic diagram of an electronic device provided by one embodiment of the present application;
Fig. 7 is a flowchart of a control method for a device board provided by one embodiment of the present application;
Fig. 8 is a flowchart of a control method for a server board provided by one embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, a schematic diagram of a device board card provided in an embodiment of the present application is shown. The device board 100 includes a PSU (Power Supply Unit) 110, a protection circuit 120, a management controller 130, a hardware controller 140, and at least one sensor 150; the PSU is coupled with the protection circuit, the protection circuit is respectively coupled with the management controller and the hardware controller, and the management controller is coupled with the hardware controller.
The PSU is a device for supplying power to a device board, and the PSU may provide a voltage of 12V (volts) or 48V, but may also provide other voltages in other possible implementations, which is not limited in this embodiment of the present application.
The protection circuit 120 is a circuit for protecting the device board 100, the protection circuit 120 plays a role of a master switch in the device board 100, and if the protection circuit 120 is in an off state, the device board 100 is in a power-off state.
The management controller 130 refers to a controller for providing a management function. The management controller 130 may locally and remotely manage the operating state of the electronic device, and the management controller 130 is a basic core function subsystem of the electronic device and may be responsible for core functions of hardware state management, operating system management, health state management, power consumption management, and the like of the electronic device. The management controller is a small-sized operating system independent of the electronic equipment system and is a chip integrated on the equipment board card.
The hardware controller 140 refers to a controller for providing a hardware control function, and for example, the hardware controller 140 may be used to control a chip (a chip other than the management controller 130) on the device board 100, and for example, the hardware controller 140 may be used to control the shutdown (or power down) of the chip on the device board 100.
The sensor 150 is a device for detecting an operating state of the device board, and for example, the sensor may be used for detecting an operating state of the device board 100, such as current or voltage or temperature. Illustratively, the at least one sensor 150 is respectively coupled with the hardware controller 140. In the embodiment of the present application, the sensor is configured to send an alarm signal to the hardware controller 140 when it is determined that the operation state of the device board 100 is abnormal. The alarm signal is used to indicate that the operation state of the device board 100 is abnormal.
For example, if the operating state includes a current state, the sensor may be a current sensor, and the current sensor is configured to send an alarm signal to the hardware controller when it is determined that the current of the device board card is abnormal; for another example, if the operating state includes a temperature state, the sensor may be a temperature sensor, and the temperature sensor is configured to send an alarm signal to the hardware controller when it is determined that the temperature of the device board is abnormal.
The hardware controller 140 is configured to send an interrupt signal to the management controller 130 based on the alarm signal, where the interrupt signal is used to trigger the management controller 130 to send a shutdown signal to the hardware controller 140 once the abnormal data has been successfully recorded. Specifically, the hardware controller 140 sends the interrupt signal to the management controller 130 upon receiving the alarm signal; upon receiving the interrupt signal, the management controller 130 accesses the sensor that detected the abnormality in order to record the abnormal data; and once the abnormal data has been successfully recorded, the management controller 130 sends the shutdown signal to the hardware controller 140. The shutdown signal is used to trigger the hardware controller 140 to send a closing signal to the protection circuit 120.
The hardware controller 140 is further configured to send the closing signal to the protection circuit 120 if the management controller 130 has not sent the shutdown signal when a timeout expires, where the closing signal is used to trigger the protection circuit 120 to stop the PSU 110-based power supply to the device board 100. Illustratively, the closing signal is used to trigger the protection circuit 120 to stop the PSU 110-based power supply to the chips (e.g., CPU, network chip, south bridge chip, north bridge chip, clock chip, etc.) on the device board 100.
In the related art, the management controller responds slowly and may even hang. If the hardware controller sends the closing signal to the protection circuit only after receiving the shutdown signal from the management controller, the device board may burn out in the meantime.
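The timeout behaviour described above can be sketched as simple time arithmetic. This is only an illustrative model, not the patented implementation: the function name, the millisecond units, and the example timeout values are assumptions, since the text does not specify a timeout length.

```python
from typing import Optional, Tuple


def closing_signal_time_ms(alarm_ms: float,
                           shutdown_delay_ms: Optional[float],
                           timeout_ms: float) -> Tuple[float, str]:
    """Return (time at which the closing signal is sent, reason).

    shutdown_delay_ms is how long the management controller needs, after the
    interrupt, to record the abnormal data and send the shutdown signal.
    None models a hung management controller that never responds.
    All values are hypothetical; the source gives no concrete timeout.
    """
    deadline_ms = alarm_ms + timeout_ms
    if shutdown_delay_ms is not None:
        shutdown_ms = alarm_ms + shutdown_delay_ms
        if shutdown_ms <= deadline_ms:
            # Normal path: the shutdown signal arrives before the timeout.
            return shutdown_ms, "shutdown signal received"
    # Fallback path: the hardware controller stops waiting and cuts power.
    return deadline_ms, "timeout"
```

With a 10 ms timeout, a management controller that answers after 3 ms leads to power-off at 3 ms, while a hung controller leads to power-off at the 10 ms deadline rather than never.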
Illustratively, the hardware controller may include a CPLD (Complex Programmable Logic Device). A CPLD is built from fully programmable AND/OR arrays and a macrocell library; the AND/OR arrays are reprogrammable and can implement numerous logic functions. Macrocells are functional blocks that perform combinational or sequential logic, and they provide additional flexibility such as outputting a true or complemented value and feeding back through different paths. CPLDs are suitable for implementing various arithmetic and combinational logic. A CPLD contains several PAL (Programmable Array Logic) blocks, and the interconnections between the PAL blocks are also programmable.
The device board is a board in an electronic device, the electronic device may include a terminal and a server, the terminal may include a base station, a user equipment or other devices, and the server may include a cloud server or a general server. In the case that the device board is a board in a server, the Management Controller may include a BMC (Baseboard Management Controller), and the BMC may locally and remotely manage the running state of the server, support a visual console interface, and easily perform hardware Management and troubleshooting on the server. The BMC is a basic core function subsystem of the server and is responsible for core functions of hardware state management, operating system management, health state management, power consumption management and the like of the server. The BMC is a small operating system independent of the server system and is a chip integrated on a device board card.
It should be noted that, the above description only takes the example that the hardware controller is the CPLD and the management controller is the BMC, and in other possible implementation manners, the hardware controller and the management controller in different electronic devices may be different, and the embodiment of the present application does not limit this.
In summary, in the technical solution provided by the embodiments of the present application, if the hardware controller has not received the shutdown signal from the management controller when the timeout expires, it sends the closing signal to the protection circuit directly instead of continuing to wait for the management controller to respond. Compared with the related art, in which the hardware controller sends the closing signal to the protection circuit only after it has received the shutdown signal from the management controller, the embodiments of the present application can cut off the power supply to protect the device board before board burning occurs, effectively reducing the probability of board burning.
Please refer to fig. 2, which illustrates a schematic diagram of a device board card according to another embodiment of the present application.
In the exemplary embodiment, as shown in FIG. 2, the sensor 150 includes a temperature sensor 151 and a current sensor 152. The temperature sensor 151 is a temperature detection device: it senses the temperature being measured and converts it, according to a defined rule, into an electrical signal or another required form of output that meets a given standard, so as to satisfy requirements for transmitting, processing, storing, displaying, recording, and controlling the information. The current sensor 152 is a current detection device that does the same for the current being measured.
The temperature sensor 151 is configured to detect whether the temperature of the device board 100 is abnormal, and the current sensor 152 is configured to detect whether the current of the device board 100 is abnormal. Illustratively, the device board includes m temperature sensors 151 and n current sensors 152, where m and n are positive integers. For example, at the same time point, (m + n) signals, which may or may not include the alarm signal, may be received by the hardware controller 140. In a possible implementation, the alarm signal may include a high level signal (e.g., level "1"), and the non-alarm signal may include a low level signal (e.g., level "0"), at which point the hardware controller 140 may determine that the alarm signal is received if the hardware controller 140 receives the high level signal; if the hardware controller receives a low level signal, the hardware controller may determine that a non-alarm signal is received.
In an exemplary embodiment, the management controller 130 is coupled to the at least one sensor 150, for example through an I2C (Inter-Integrated Circuit) bus. The I2C bus is a serial bus consisting of an SDA (Serial Data) line and an SCL (Serial Clock) line, and it can both send and receive data. Because the I2C interface is built into the components themselves, the bus takes up very little space, which reduces circuit board area and chip pin count and lowers interconnection cost. The management controller 130 is configured to send each of the at least one sensor 150 its own protection threshold, and each sensor is configured to send an alarm signal to the hardware controller 140 when it determines that the operating state of the device board 100 has reached the protection threshold. The protection threshold can be set according to the actual operating state of each power supply, with a certain margin reserved. For example, the protection threshold may be 1.2 to 1.5 times the actual operating value (measured while the device board is operating normally). If the operating state is a current and the actual operating value is 10 A, the protection threshold may be 12 A to 15 A; if the operating state is a temperature and the actual operating value is 50 ℃, the protection threshold may be 60 ℃ to 75 ℃. Of course, in other possible implementations the protection threshold may be determined in other ways, which is not limited in the embodiments of the present application.
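As a worked example of the margin rule above, the following minimal sketch computes the range of candidate protection thresholds from a nominal operating value and checks a reading against a chosen threshold. The function names are hypothetical; the 1.2 and 1.5 factors come from the text.

```python
from typing import Tuple


def protection_threshold_range(nominal: float,
                               low_factor: float = 1.2,
                               high_factor: float = 1.5) -> Tuple[float, float]:
    """Return the (lowest, highest) protection threshold for a nominal value."""
    return nominal * low_factor, nominal * high_factor


def reaches_threshold(reading: float, threshold: float) -> bool:
    """A sensor raises its alarm once the reading reaches the threshold."""
    return reading >= threshold


low_a, high_a = protection_threshold_range(10.0)   # nominal current of 10 A
low_c, high_c = protection_threshold_range(50.0)   # nominal temperature of 50 degrees C
```

For a 10 A nominal current this yields roughly 12 A to 15 A, and for a 50 ℃ nominal temperature roughly 60 ℃ to 75 ℃, matching the figures in the text.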
In the exemplary embodiment, the management controller 130 sets, via the I2C bus, an over-temperature protection point for each temperature sensor and an over-current protection point for each current sensor (the protection threshold of a temperature sensor may be called an over-temperature protection point, and that of a current sensor an over-current protection point). When a protection point is reached, the sensor is configured (masked) to trigger only an alarm (alert) signal to the hardware controller 140. Every sensor sends its own independent alarm signal to the hardware controller 140, so the hardware controller 140 can quickly locate the specific sensor after receiving an alarm signal.
Illustratively, FIG. 3 shows a schematic diagram of a current sensor provided by one embodiment of the present application. A precision resistor 310 is placed in series with the load circuit; the current sensor 300 measures the voltage across the precision resistor 310, calculates the current from it, and decides whether to send an alarm signal to the hardware controller based on that current and the protection threshold. For example, current sensors may be placed at locations on the device board where large currents or high power are likely. They may also be placed per chip area: for example, in the area corresponding to a fan, and/or the area corresponding to a backplane, and/or the area corresponding to a network card. A0 and A1 indicate the device address; different A0/A1 values correspond to different device addresses.
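The current measurement described above is an application of Ohm's law, I = V / R, across the series resistor. A minimal sketch follows; the function names and the 1 milliohm resistor value are assumptions for illustration and do not come from the source.

```python
def shunt_current_amps(v_sense_volts: float, r_shunt_ohms: float) -> float:
    """Current through the series (shunt) resistor, from Ohm's law I = V / R."""
    if r_shunt_ohms <= 0:
        raise ValueError("shunt resistance must be positive")
    return v_sense_volts / r_shunt_ohms


def current_alarm(v_sense_volts: float,
                  r_shunt_ohms: float,
                  threshold_amps: float) -> bool:
    """True when the measured current has reached the protection threshold."""
    return shunt_current_amps(v_sense_volts, r_shunt_ohms) >= threshold_amps
```

For instance, with an assumed 0.001 ohm resistor, a 10 mV drop corresponds to about 10 A (no alarm against a 12 A threshold), while a 16 mV drop corresponds to about 16 A and would raise the alarm.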
Illustratively, FIG. 4 shows a schematic diagram of a temperature sensor provided by one embodiment of the present application. The temperature sensor 400 is placed at a local hot spot of the PCB (Printed Circuit Board) to monitor it, and when the temperature reaches the protection threshold, the temperature sensor 400 triggers an alarm signal. For example, a temperature sensor may be placed in the area corresponding to the air outlet or the air inlet, and/or at a location on the device board where large currents are likely.
Illustratively, the current sensors and temperature sensors at the various locations of the device board can be monitored simultaneously, which guards well against the PCB burning from the high temperature caused by a local micro short circuit. The normal working temperature of the board is required to be below 60 ℃, whereas board burning requires at least 150 ℃, so there is ample margin and false triggering is unlikely. In addition, the current can be monitored first: if the current is too large, the temperature sensor at the corresponding location is used to confirm whether the temperature is also too high, and power-off protection is triggered only if it is, which further reduces the probability of false triggering. In this case, when the current sensor determines that the current of the device board has reached the protection threshold, it sends an alarm signal to the hardware controller. After receiving that alarm signal, the hardware controller waits for a period of time to see whether an alarm signal arrives from the temperature sensor at the location corresponding to the current sensor, and sends the interrupt signal to the management controller only if it does. The wait is needed because the temperature takes time to rise. The waiting time can be set by a technician, for example 2 ms (milliseconds), but it should not be too long: otherwise the device board may burn before the alarm signal from the temperature sensor has been received. The waiting time therefore needs to be set reasonably.
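The two-stage confirmation (current alarm first, then a temperature alarm within the waiting time) can be sketched as follows. This is a simplified model under stated assumptions: the function name is hypothetical, alarm times are given in milliseconds, and a temperature alarm that was already asserted before the current alarm is treated as confirming it, which the source does not address.

```python
from typing import Optional


def should_send_interrupt(current_alarm_ms: float,
                          temp_alarm_ms: Optional[float],
                          wait_ms: float = 2.0) -> bool:
    """Decide whether the hardware controller raises the interrupt.

    current_alarm_ms: time the current sensor's alarm was received.
    temp_alarm_ms: time the matching temperature sensor's alarm was received,
        or None if it never arrived.
    wait_ms: confirmation window (2 ms is the example value in the text).
    """
    if temp_alarm_ms is None:
        return False  # over-current was never confirmed by over-temperature
    # Confirmed only if the temperature alarm is present by the deadline.
    return temp_alarm_ms <= current_alarm_ms + wait_ms
```

A temperature alarm arriving 1.5 ms after the current alarm confirms the fault, whereas one arriving 5 ms later (or never) does not, so no interrupt is raised for that event.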
For example, the number of sensor integrated circuits is not limited. If independent current sensors and independent temperature sensors are used, each sensor can be placed as close as possible to the power supply or heat source it monitors, which reduces the risk of noise coupling caused by overly long sensor lines.
The current sensor provided by the embodiment of the application can be used for monitoring the real-time power consumption of each part of the whole electronic equipment at the same time, can realize the current and power consumption monitoring of each path of 12V or 48V power supply, and can know the power consumption distribution of the whole electronic equipment more conveniently. In addition, the real-time temperature of each part of the electronic equipment can be monitored by utilizing the temperature sensor, and a refined fan speed regulation strategy can be realized, so that the total power consumption of the electronic equipment is reduced.
In the illustrative embodiment, the protection circuit 120 includes an electronic switch 121 and an electronic switch controller 122. The electronic switch 121 is coupled to the electronic switch controller 122 and to the PSU 110, and the electronic switch controller 122 is coupled to the hardware controller 140. The electronic switch controller 122 is configured to send a close enable signal to the electronic switch 121 upon receiving the closing signal from the hardware controller 140; the electronic switch 121 is configured to stop the PSU 110-based power supply to the chips on the device board 100 upon receiving the close enable signal from the electronic switch controller 122.
Illustratively, the electronic switch controller 122 may include an EFUSE (Electrical Fuse) controller, and the electronic switch 121 may include a MOSFET (Metal-Oxide-Semiconductor Field-Effect Transistor).
In the exemplary embodiment, the device board 100 also includes a VR (Voltage Regulator) 160. The VR160 is hardware used to convert the electrical energy provided by the PSU110 to a usable voltage for each chip on the device board 100. The VR160 is coupled to a chip on the device board 100, for example, different chips may require different operating voltages, so that different chips may correspond to different VRs and different operating voltages correspond to different VRs (i.e., the operating voltages converted by different VRs may be different), for example, assuming that there are several operating voltages as follows: 1.8V, 1.2V, 3.3V, 0.9V, VR1 corresponding to 1.8V, VR2 corresponding to 1.2V, VR3 corresponding to 3.3V, VR4 corresponding to 0.9V may be present in the device board.
In a possible implementation, when the electronic switch 121 includes a MOSFET, the drain (D) of the MOSFET is connected to the PSU 110, the source (S) is connected to the VR 160, and the gate (G) is connected to the electronic switch controller 122.
In the illustrative embodiment, as shown in FIG. 5, the hardware controller 140 includes an exception recording register 141 for recording alarm signals. The exception recording register 141 is coupled to the management controller 130 and to the at least one sensor 150. The management controller 130 is configured to, upon receiving the interrupt signal, access the exception recording register 141 to determine a target sensor among the at least one sensor 150, where the target sensor is the sensor that determined that the operating state of the device board is abnormal, and then to access the target sensor and record and save the sensor data in it.
In the exemplary embodiment, if the management controller 130 has not successfully recorded the abnormal data, the management controller 130 is configured to, after it restarts, access the exception recording register 141 again to determine the target sensor, then access the target sensor again and record and save the sensor data in it. In practice, the hardware controller 140 may send the closing signal to the protection circuit before the management controller 130 has finished recording the abnormal data, that is, recording the abnormal data takes longer than the time the hardware controller 140 waits for the management controller 130 to respond. In that case, once the management controller 130 has restarted, it accesses the exception recording register 141 again to determine the target sensor, then accesses the target sensor again and records and saves its sensor data; the abnormal data includes the sensor data in the target sensor. In a possible implementation, after successfully recording the abnormal data, the management controller may send a deletion signal to the exception recording register, where the deletion signal is used to delete the alarm signal recorded in the exception recording register on this occasion.
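The management controller's side of this flow (read the register, identify the target sensors, log their data, then clear the recorded alarms) might look like the sketch below. It assumes, purely for illustration, that the register is a bit field with one bit per sensor and that `read_sensor` stands in for a real bus read; neither detail is specified by the source.

```python
from typing import Callable, Dict, List


def record_abnormal_data(register_bits: int,
                         sensor_names: List[str],
                         read_sensor: Callable[[str], float],
                         log: Dict[str, float]) -> int:
    """Log data from every sensor whose alarm bit is set.

    Returns the register value with the handled bits cleared, which models
    the deletion signal sent after the data has been recorded successfully.
    Bit i of register_bits corresponds to sensor_names[i] (an assumption).
    """
    remaining = register_bits
    for index, name in enumerate(sensor_names):
        mask = 1 << index
        if register_bits & mask:           # this sensor raised the alarm
            log[name] = read_sensor(name)  # hypothetical stand-in for a bus read
            remaining &= ~mask             # clear only what was recorded
    return remaining
```

Because the same function works from whatever bits are still recorded, it can be called again after a restart of the management controller to recover data for an alarm that was not logged before power was cut.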
In an exemplary embodiment, as shown in FIG. 5, hardware controller 140 includes an OR gate 142, each input of OR gate 142 being coupled to a corresponding one of at least one sensor 150, and an output of OR gate 142 being coupled to management controller 130; wherein the or gate 140 is configured to send an interrupt signal to the management controller 130 in case of receiving an alarm signal from any of the at least one sensor. Illustratively, OR gate 140 sends an interrupt signal to management controller 130 upon receiving an alarm signal from any one or more of the at least one sensor. The or gate 142 refers to a circuit for performing an or operation, and exemplarily, the or gate 142 refers to a circuit for performing an or operation on a signal from the sensor. When the alarm signal is received, the alarm signal includes a high level signal, for example, and after the or gate circuit 142 performs an or operation on the alarm signal, the interrupt signal is also a high level signal.
In the illustrative embodiment, the number of storage areas in the abnormality recording register 141 matches the number of inputs of the OR gate circuit 142 and the number of sensors. That is, the abnormality recording register 141 is used to record a signal from each sensor, which may be an alarm signal or a non-alarm signal. In a possible implementation, the abnormality recording register 141 is used only for recording alarm signals: when the hardware controller receives an alarm signal, it writes the alarm signal into the storage area of the abnormality recording register that corresponds to that alarm signal. The abnormality recording register 141 includes a storage area corresponding to each sensor and can store the signal transmitted from each sensor. Illustratively, continuing the above example and assuming that the number of sensors is (m + n), the abnormality recording register 141 may include (m + n) storage areas, and the OR gate circuit 142 may include (m + n) input terminals.
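As a rough software model of this arrangement, the sketch below treats each alarm line as a boolean, computes the interrupt as the OR of all inputs, and latches each asserted alarm into the storage area for the same sensor. The function names and the sensor counts are illustrative assumptions only.

```python
def or_gate(alarm_inputs):
    """Return the interrupt level: high if any sensor alarm input is high."""
    return any(alarm_inputs)


def latch_alarms(storage_areas, alarm_inputs):
    """Record each asserted alarm in the storage area for the same sensor."""
    return [stored or alarm for stored, alarm in zip(storage_areas, alarm_inputs)]


M_TEMPERATURE, N_CURRENT = 2, 3                 # hypothetical sensor counts (m, n)
register = [False] * (M_TEMPERATURE + N_CURRENT)
alarms = [False, False, False, True, False]     # only sensor 3 raises an alarm

interrupt = or_gate(alarms)
register = latch_alarms(register, alarms)
```

The interrupt only says that some sensor raised an alarm; the per-sensor storage areas are what let the management controller find out which one.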
In an exemplary embodiment, the hardware controller 140 is further configured to perform an anti-shake process on the alarm signal to obtain an alarm signal after the anti-shake process, where the anti-shake process is used to determine whether the alarm signal is generated due to false triggering; and sending an interrupt signal to the management controller 130 when the alarm signal after the anti-shake processing is used for indicating that the operating state of the equipment board card is abnormal.
Illustratively, the hardware controller 140 may delay the alarm signal for a period of time (e.g., 1 ms). If the alarm signal is still a high-level signal after the delay, the hardware controller 140 may determine that the alarm signal was not generated by false triggering and is actually valid.
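The delay-and-recheck idea can be sketched in software as follows. This is only an illustration of the check, not the hardware implementation; `read_alarm_line` is a hypothetical callback that samples the alarm line, and the `sleep` parameter exists so the delay can be replaced in tests.

```python
import time


def alarm_is_valid(read_alarm_line, delay_s=0.001, sleep=time.sleep):
    """Return True only if the alarm line is high now and still high after the delay."""
    if not read_alarm_line():
        return False
    sleep(delay_s)  # wait out a possible glitch (1 ms in the example above)
    return bool(read_alarm_line())


# A glitch: high on the first sample, low on the second.
glitch = iter([True, False])
# A real alarm: high on both samples.
steady = iter([True, True])

glitch_result = alarm_is_valid(lambda: next(glitch), sleep=lambda _: None)
steady_result = alarm_is_valid(lambda: next(steady), sleep=lambda _: None)
```

Only the steady alarm passes the check, so a short spike on the alarm line does not start the protection process.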
In the illustrative embodiment, the hardware controller 140 includes a timer 143. The hardware controller 140 is further configured to start the timer 143 in case of receiving the alarm signal from the sensor 150; acquiring the measurement duration of the timer 143; in the case where the measured time period of the timer 143 reaches the preset time period, a shutdown signal is transmitted to the protection circuit 120.
Illustratively, the timer 143 is coupled to the management controller 130, and the management controller 130 may send a heartbeat signal to the timer 143, and if the management controller 130 is in an abnormal state, the timer 143 does not receive the heartbeat signal from the management controller 130.
All alarm signals sent to the hardware controller 140 are OR-ed together; when any of them is asserted, an interrupt is sent to the management controller 130, and the hardware controller 140 starts the timer 143 (which may also be called a watchdog timer) and records the alarm in the exception log register 141. Upon receiving the interrupt signal, the management controller 130 may query the exception log register 141 through the I2C interface to determine which temperature or current sensor detected the abnormality, and then read the more detailed status from that sensor and record and save the log. Finally, the management controller 130 instructs the hardware controller 140 through the I2C interface to power down each chip on the device board card according to the normal power-off sequence and then to turn off the protection circuit, thereby preventing the board from burning. If the timer overflows during this process, the hardware controller 140 may directly power down the chips according to the power-off sequence, regardless of whether the management controller 130 has recorded the log or sent a power-off signal.
Because a hardware timer is provided in the hardware controller, the server board card can be guaranteed to be powered off within a preset time when it is in an over-current or over-temperature state, which reduces the probability that the server board card cannot be protected when the software of the management controller hangs.
In a possible implementation, the hardware controller 140 is further configured to: in the case of receiving a shutdown signal from the management controller 130, turning off the chip on the device board 100 based on the power-off timing; in the event that the chip on the device board 100 is successfully shut down, a shut down signal is sent to the protection circuit 120. For example, when the hardware controller 140 receives the shutdown signal from the management controller 130, the hardware controller 140 powers down the chips on the device board 100 according to the power-down sequence of the chips, and after the power-down of the chips is successfully completed, the hardware controller 140 sends a shutdown signal to the protection circuit 120.
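The ordering described here (power down the chips first, then signal the protection circuit) can be sketched as below. The chip names and the callbacks `power_down_chip` and `send_close_signal` are hypothetical. The early return when a chip fails to power down is an assumption of this sketch; the application only describes the successful case.

```python
def orderly_power_off(power_down_order, power_down_chip, send_close_signal):
    """Power down chips in sequence; signal the protection circuit only on success."""
    for chip in power_down_order:
        if not power_down_chip(chip):
            return False  # assumption: stop here and leave the decision to the caller
    send_close_signal()
    return True


events = []
ok = orderly_power_off(
    ["cpu", "memory", "standby"],  # hypothetical power-down order
    power_down_chip=lambda chip: events.append(f"off:{chip}") or True,
    send_close_signal=lambda: events.append("close"),
)
```

In this run the recorded events show every chip being turned off before the closing signal is issued.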
In the illustrative embodiment, management controller 130 is configured to send a power-up signal to hardware controller 140; the hardware controller 140 is further configured to trigger a power-on sequence based on the power-on signal.
Illustratively, after the management controller 130 starts, it sends a power-on signal to the hardware controller 140 through an I2C bus or a GPIO (General Purpose Input/Output) interface, and the hardware controller 140 triggers the power-on sequence. The management controller 130 also sets, through the I2C bus, the over-temperature protection point of each temperature sensor and the over-current protection point of each current sensor, and masks other types of errors, so that only an over-current or over-temperature condition triggers an alarm signal to the hardware controller 140. (The alarm signal sent by a current sensor may be referred to as an over-current signal, and the alarm signal sent by a temperature sensor may be referred to as an over-temperature signal.) When the hardware controller 140 receives an alarm signal, it first performs the anti-shake processing to ensure that the protection process is not triggered by mistake. If the alarm signal is confirmed to be valid, the hardware controller 140 sends an interrupt signal to the management controller 130 and at the same time starts the timer 143 (for example, with a 50 ms timeout). If the management controller 130 has hung or is busy with other services, so that it never responds to the interrupt from the hardware controller 140 and never sends the shutdown signal, the hardware controller 140 may shut down directly according to the power-down sequence after the timer 143 times out. Meanwhile, if the management controller 130 does receive the interrupt, it may determine the specific faulty sensor from the hardware controller 140 through the I2C bus, read the corresponding register information from that sensor, record the log, and finally instruct the hardware controller 140 through the I2C bus to turn off the electronic switch.
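The effect of setting a protection point and masking other error types can be illustrated with a small model. Everything here is an assumption made for illustration: the `Fault` flags, the `UNDER_VOLTAGE` example of an "other" error type, the `Sensor` class, and the 95.0 threshold are not taken from the application.

```python
from dataclasses import dataclass
from enum import Flag


class Fault(Flag):
    NONE = 0
    OVER_TEMPERATURE = 1
    OVER_CURRENT = 2
    UNDER_VOLTAGE = 4  # hypothetical example of an "other" error type


@dataclass
class Sensor:
    kind: Fault            # the fault this sensor reports when its threshold is reached
    threshold: float       # protection point set by the management controller
    enabled_faults: Fault  # faults allowed to raise the alarm line; the rest are masked

    def alarm(self, reading, other_faults=Fault.NONE):
        active = other_faults
        if reading >= self.threshold:
            active |= self.kind
        # Only unmasked faults reach the hardware controller.
        return bool(active & self.enabled_faults)


temp = Sensor(Fault.OVER_TEMPERATURE, threshold=95.0,
              enabled_faults=Fault.OVER_TEMPERATURE)

over_temp_alarm = temp.alarm(100.0)                                # threshold reached
masked_alarm = temp.alarm(60.0, other_faults=Fault.UNDER_VOLTAGE)  # masked error type
```

With the mask in place, an unrelated fault on the same sensor does not assert the alarm line, so the protection process is started only by over-temperature or over-current.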
For example, the sensor and hardware controller may be referred to as a detection circuit.
Referring to fig. 6, a schematic diagram of an electronic device according to an embodiment of the present application is shown. The electronic device 600 includes the device card 100 according to the above embodiment. The electronic device may exemplarily include a terminal and a server, the terminal may include a base station, a user equipment, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, or other devices, the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), and a big data and artificial intelligence platform, but is not limited thereto. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
Referring to fig. 7, a flowchart of a method for controlling a device board provided in an embodiment of the present application is shown, where the method may be applied to the device board described in the above embodiment, and the device board includes a PSU, a protection circuit, a management controller, a hardware controller, and at least one sensor. The PSU is coupled with the protection circuit, the protection circuit is respectively coupled with the management controller and the hardware controller, and the management controller is coupled with the hardware controller. The method comprises the following steps:
In step 701, the sensor sends an alarm signal to the hardware controller when determining that the running state of the equipment board card is abnormal.
In step 702, the hardware controller sends an interrupt signal to the management controller based on the alarm signal.
In the embodiment of the present application, the interrupt signal is used to trigger the management controller to send a shutdown signal to the hardware controller when the abnormal data is successfully recorded.
In step 703, if the management controller has not sent a shutdown signal when the timeout expires, the hardware controller sends a shutdown signal to the protection circuit; this shutdown signal is used for triggering the protection circuit to stop the PSU-based power supply to the equipment board card.
It should be noted that the method provided by the foregoing embodiment and the structural embodiment belong to the same concept, and specific implementation processes thereof are detailed in the structural embodiment and will not be described herein again. For details which are not disclosed in the method embodiments of the present application, reference is made to the structural embodiments of the present application.
To sum up, in the technical solution provided by the embodiments of this application, if the shutdown signal from the management controller has not been received when the timeout expires, the hardware controller sends the shutdown signal directly to the protection circuit instead of continuing to wait for the management controller to respond. Compared with the related art, in which the hardware controller sends the shutdown signal to the protection circuit only after it has received the shutdown signal from the management controller, the embodiments of this application can cut the power to protect the equipment board card before board burning occurs, effectively reducing the probability that the equipment board card burns.
For example, take the electronic device to be a server, the equipment board card to be a server board card in the server, the hardware controller to be a CPLD, and the management controller to be a BMC. Referring to fig. 8, which shows a flowchart of a control method for a server board card provided in an embodiment of the present application, the method may include the following steps:
In step 801, the BMC starts and sends a power-on signal to the CPLD.
In step 802, the CPLD triggers a power-on sequence based on the power-on signal.
In step 803, the BMC sets the over-current protection point of the current sensor and the over-temperature protection point of the temperature sensor, and masks other types of errors.
It should be noted that step 803 may be executed before step 801, may be executed after step 801, or may be executed simultaneously with step 801.
In step 804, the CPLD receives an alarm signal from the sensor.
The sensor determines that the running state of the server board card is abnormal and sends an alarm signal to the CPLD; that is, the CPLD detects that a power supply is over-current or that a temperature sensor is over-temperature.
In step 805, the CPLD performs anti-shake processing on the alarm signal to obtain an alarm signal after the anti-shake processing.
The CPLD performs anti-shake processing on the alarm signal to determine whether an over-current or over-temperature condition has really occurred.
In step 806, when the alarm signal after the anti-shake processing indicates that the operating state of the server board card is abnormal, the CPLD sends an interrupt signal to the BMC and starts a timer at the same time.
That is, once the CPLD confirms that an over-temperature or over-current signal has been triggered, it starts the timer and sends an interrupt signal to the BMC.
In step 807, the BMC records the exception data.
Illustratively, the BMC records the exception data and a log that informs the user of the subsequent processing flow.
In step 808, the BMC sends a shutdown signal to the CPLD when the exception data is successfully recorded.
In step 809, the CPLD determines whether a shutdown signal is received. If no shutdown signal is received, execution proceeds to step 810; if a shutdown signal is received, execution proceeds to step 811.
In step 810, the CPLD determines whether the timer has overflowed; if not, step 810 is executed again; if so, execution proceeds to step 811.
In step 811, the CPLD sends a shutdown signal to the protection circuit.
Illustratively, the CPLD turns off the system according to the power-down sequence.
In step 812, the protection circuit is turned off.
Illustratively, the system power is off.
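The decision made in steps 809 to 811 can be sketched as a simple loop over 1 ms ticks. This is a simulation under stated assumptions, not the CPLD logic itself: the function name, the tick granularity, and the 50 ms default timeout are illustrative, and the sketch re-checks the shutdown signal on every pass, which is one possible reading of the flowchart.

```python
def run_protection_flow(bmc_shutdown_at_ms, timeout_ms=50):
    """Simulate steps 809-811 in 1 ms ticks after the timer starts in step 806.

    bmc_shutdown_at_ms: time at which the BMC sends its shutdown signal,
    or None if the BMC has hung and never responds.
    Returns (reason, elapsed_ms) at the moment the CPLD signals the protection circuit.
    """
    elapsed_ms = 0
    while True:
        # Step 809: has the BMC sent the shutdown signal?
        if bmc_shutdown_at_ms is not None and elapsed_ms >= bmc_shutdown_at_ms:
            return "shutdown_signal", elapsed_ms
        # Step 810: has the timer overflowed?
        if elapsed_ms >= timeout_ms:
            return "timer_overflow", elapsed_ms
        elapsed_ms += 1


responsive = run_protection_flow(bmc_shutdown_at_ms=20)  # BMC logs and responds in time
hung = run_protection_flow(bmc_shutdown_at_ms=None)      # BMC never responds
slow = run_protection_flow(bmc_shutdown_at_ms=80)        # BMC responds too late
```

In all three cases the flow reaches step 811 no later than the timeout, which is the property the timer is meant to provide.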
It should be understood that reference to "a plurality" herein means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. In addition, the step numbers described herein only exemplarily show one possible execution sequence among the steps, and in some other embodiments, the steps may also be executed out of the numbering sequence, for example, two steps with different numbers are executed simultaneously, or two steps with different numbers are executed in a reverse order to the order shown in the figure, which is not limited by the embodiment of the present application.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (13)

1. An equipment board card is characterized by comprising a power supply unit PSU, a protection circuit, a management controller, a hardware controller and at least one sensor; wherein the PSU is coupled to the protection circuit, the protection circuit is coupled to the management controller and the hardware controller, respectively, and the management controller is coupled to the hardware controller;
the sensor is used for sending an alarm signal to the hardware controller under the condition that the running state of the equipment board card is determined to be abnormal;
the hardware controller is used for sending an interrupt signal to the management controller based on the alarm signal, wherein the interrupt signal is used for triggering the management controller to send a shutdown signal to the hardware controller under the condition that abnormal data is successfully recorded;
the hardware controller is further configured to send a closing signal to the protection circuit when the management controller does not send the shutdown signal after timeout, where the closing signal is used to trigger the protection circuit to stop supplying power to the device board based on the PSU.
2. The device board of claim 1, wherein the hardware controller includes an exception logging register for logging the alarm signal;
the exception logging register is respectively coupled with the management controller and the at least one sensor;
the management controller is used for accessing the abnormality recording register and determining a target sensor in the at least one sensor under the condition of receiving the interrupt signal, wherein the target sensor is a sensor for determining that the running state of the equipment board card is abnormal; and accessing the target sensor, and recording and saving sensor data in the target sensor.
3. The equipment board of claim 2, wherein in the event that the management controller fails to record exception data, after the management controller is restarted, the management controller is configured to access the exception record register again to determine the target sensor; and accessing the target sensor again, and recording and storing the sensor data in the target sensor.
4. The device board of claim 1, wherein the hardware controller comprises an or gate circuit, each input of the or gate circuit being coupled to a corresponding sensor of the at least one sensor, an output of the or gate circuit being coupled to the management controller;
wherein the OR gate is configured to send the interrupt signal to the management controller upon receiving an alarm signal from any of the at least one sensor.
5. The equipment board of claim 1, wherein the management controller is configured to send a protection threshold for each of the at least one sensor to a corresponding sensor, and wherein the sensor is configured to send the alarm signal to the hardware controller upon determining that the operational status of the equipment board has reached the protection threshold.
6. The equipment board of claim 1, wherein the hardware controller comprises a timer;
the hardware controller is also used for starting the timer under the condition of receiving an alarm signal from the sensor; acquiring the measuring time length of the timer; and sending the closing signal to the protection circuit under the condition that the measuring time length of the timer reaches a preset time length.
7. The device board of claim 1, wherein the hardware controller is further configured to:
performing anti-shaking processing on the alarm signal to obtain an alarm signal after the anti-shaking processing, wherein the anti-shaking processing is used for confirming whether the alarm signal is generated due to false triggering;
and sending an interrupt signal to the management controller under the condition that the alarm signal after the anti-shake processing is used for indicating that the running state of the equipment board card is abnormal.
8. The equipment board of claim 1, wherein the protection circuit includes an electronic switch and an electronic switch controller;
the electronic switch is coupled with the electronic switch controller;
the electronic switch is coupled with the PSU;
the electronic switch controller is coupled with the hardware controller;
wherein the electronic switch controller is configured to send a close enable signal to the electronic switch upon receiving a close signal from the hardware controller;
the electronic switch is used for stopping supplying power to the equipment board card based on the PSU under the condition of receiving a closing enabling signal from the electronic switch controller.
9. The device board of claim 1, wherein the hardware controller is further configured to:
under the condition that a shutdown signal from the management controller is received, closing a chip on the equipment board card based on a power-off time sequence;
and sending the closing signal to the protection circuit under the condition of successfully closing the chip on the equipment board card.
10. The equipment board of claim 1,
the management controller is used for sending a power-on signal to the hardware controller;
the hardware controller is further configured to trigger a power-on sequence based on the power-on signal.
11. The equipment board of any of claims 1-10, wherein the sensors include temperature sensors and current sensors;
the temperature sensor is used for detecting whether the temperature of the equipment board card is abnormal or not, and the current sensor is used for detecting whether the current of the equipment board card is abnormal or not.
12. An electronic device, characterized in that it comprises a device card according to any of claims 1 to 11.
13. The control method of the equipment board card is characterized in that the equipment board card comprises a power supply unit PSU, a protection circuit, a management controller, a hardware controller and at least one sensor; wherein the PSU is coupled to the protection circuit, the protection circuit is coupled to the management controller and the hardware controller, respectively, and the management controller is coupled to the hardware controller;
the method comprises the following steps:
the sensor sends an alarm signal to the hardware controller under the condition that the running state of the equipment board card is determined to be abnormal;
the hardware controller sends an interrupt signal to the management controller based on the alarm signal, wherein the interrupt signal is used for triggering the management controller to send a shutdown signal to the hardware controller under the condition that abnormal data is successfully recorded;
and the hardware controller sends a closing signal to the protection circuit when the management controller does not send the shutdown signal after timeout, wherein the closing signal is used for triggering the protection circuit to stop supplying power to the equipment board card based on the PSU.
CN202110112993.1A 2021-01-27 2021-01-27 Equipment board card, electronic equipment and control method of equipment board card Pending CN114816539A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110112993.1A CN114816539A (en) 2021-01-27 2021-01-27 Equipment board card, electronic equipment and control method of equipment board card

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110112993.1A CN114816539A (en) 2021-01-27 2021-01-27 Equipment board card, electronic equipment and control method of equipment board card

Publications (1)

Publication Number Publication Date
CN114816539A true CN114816539A (en) 2022-07-29

Family

ID=82523735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110112993.1A Pending CN114816539A (en) 2021-01-27 2021-01-27 Equipment board card, electronic equipment and control method of equipment board card

Country Status (1)

Country Link
CN (1) CN114816539A (en)

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40070996; Country of ref document: HK)
SE01 Entry into force of request for substantive examination