
CN102346707B - Server system and operation method thereof - Google Patents


Info

Publication number
CN102346707B
CN102346707B CN201010243788.0A CN201010243788A CN102346707B CN 102346707 B CN102346707 B CN 102346707B CN 201010243788 A CN201010243788 A CN 201010243788A CN 102346707 B CN102346707 B CN 102346707B
Authority
CN
China
Prior art keywords
hardware
node management
abstraction layer
management unit
hardware abstraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010243788.0A
Other languages
Chinese (zh)
Other versions
CN102346707A (en)
Inventor
赖德贤
陈谕正
龚景富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quanta Computer Inc
Original Assignee
Quanta Computer Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quanta Computer Inc
Priority to CN201010243788.0A
Publication of CN102346707A
Application granted
Publication of CN102346707B
Legal status: Active
Anticipated expiration

Landscapes

  • Computer And Data Communications (AREA)

Abstract

A server system and an operation method thereof. The operation method comprises: (A) under the control of a hardware abstraction layer, a plurality of node management units share a hardware resource; (B) when one of the node management units wants to use the hardware resource, that node management unit sends a command or data to the hardware abstraction layer, and the hardware abstraction layer uses the hardware resource on its behalf; and (C) if an external command is received, the hardware abstraction layer identifies which transmission port of the hardware resource received it and forwards it to a corresponding node management unit for execution; after the external command has been executed, the corresponding node management unit returns a message to the hardware abstraction layer, which returns the message through that transmission port to an external system manager.

Description

Server system and operation method thereof

Technical Field

The invention relates to a server system and an operation method thereof.

Background Art

Blade servers have long been widely used in a variety of applications. Generally, a large number of blade servers are housed together in a chassis system, which makes them more convenient to operate. A blade server clusters together the core computing circuits of all the computer service systems in a computer service workstation. System administrators are responsible for maintaining and controlling each computer service system and the network configuration inside the workstation, and can thereby maintain and control the multiple computer service systems that are clustered together.

At present, servers manage nodes mainly in accordance with the IPMI (Intelligent Platform Management Interface) specification, using a BMC (Baseboard Management Controller) to perform node monitoring, logging, error recovery, and similar functions. A node here means a computing unit with independent computing capability, comprising at least a CPU (central processing unit) and memory. In products currently on the market, a single BMC can manage only a single node and cannot manage multiple nodes at the same time. In addition, in the known art, a chassis system contains a hardware CMM (Chassis Management Module) that manages the entire chassis system.

With the development of cloud technology, demand for data centers keeps increasing, and a key development goal is to fit more nodes into limited machine-room space in order to raise computing capacity.

This application proposes a server system and an operation method thereof that effectively reduce the number of BMC chips. This frees board space inside the server so that more nodes can be placed to raise computing capacity, and it also lowers server cost.

Summary of the Invention

The invention relates to a server system and an operation method thereof in which a hardware abstraction layer enables multiple node management units of a BMC (each of which is software that manages one node) to share the hardware resources of the BMC.

According to one embodiment of the invention, a server system is provided, comprising: at least one system board, the system board comprising a baseboard management controller and a plurality of nodes, the baseboard management controller comprising a plurality of node management units, a hardware abstraction layer, and a hardware resource, wherein the node management units respectively manage the nodes and, under the control of the hardware abstraction layer, share the hardware resource; a connection port for connecting to an external system manager; and an internal channel connected to the system board and the connection port.

According to another embodiment of the invention, an operation method of a server system is provided. The server system comprises at least one system board, the system board comprising a baseboard management controller and a plurality of nodes, the baseboard management controller comprising a plurality of node management units, a hardware abstraction layer, and a hardware resource, wherein the node management units respectively manage the nodes. The method comprises: (A) under the control of the hardware abstraction layer, the node management units share the hardware resource; (B) when one of the node management units wants to use the hardware resource, that node management unit sends a command or data to the hardware abstraction layer, and the hardware abstraction layer accordingly uses the hardware resource on behalf of that node management unit; and (C) if an external command is received, the hardware abstraction layer identifies which transmission port of the hardware resource received the external command and forwards it to a corresponding node management unit for execution; after the external command has been executed, the corresponding node management unit returns a message to the hardware abstraction layer, which returns the message through that transmission port to an external system manager.

To provide a better understanding of the above and other aspects of the invention, preferred embodiments are described in detail below with reference to the accompanying drawings:

Brief Description of the Drawings

FIG. 1 shows a schematic diagram of a chassis system according to an embodiment of the invention.

FIG. 2 shows a schematic diagram of a BMC according to an embodiment of the invention.

FIG. 3 shows a schematic diagram of multiple NMUs sharing the hardware part of the BMC through the HAL.

FIGS. 4A~4C show schematic diagrams of forwarding commands and messages through the HAL according to an embodiment of the invention.

[Description of main reference numerals]

100: chassis system
101: connection port
102: local area network
103: I2C bus
110~130: system boards
111, 121, 131: BMC
112-1~112-Y, 122-1~122-Y, 132-1~132-Y: nodes
211: HAL
212-1~212-Y: node management units
221: GPIO pins
222: storage unit
223: serial port
224: sensing unit
225: system interface
226: LAN interface
227: I2C interface
410: system manager
421~466: steps

Detailed Description

In embodiments of the invention, a single BMC can manage multiple nodes. A HAL (Hardware Abstraction Layer) extends the BMC from single-node management to multi-node management while remaining fully compatible with the IPMI specification. This effectively reduces the number of BMC chips in the chassis system, which lowers cost, saves space, and lowers the ambient temperature inside the chassis system.

FIG. 1 shows a schematic diagram of a chassis system according to an embodiment of the invention. As shown in FIG. 1, the chassis system 100 comprises at least: a connection port 101, a LAN (Local Area Network) 102, an I2C (Inter-Integrated Circuit) bus 103, and a plurality of system boards. Although FIG. 1 takes a chassis system 100 with three system boards 110~130 as an example, embodiments of the invention are not limited to this. System board 110 comprises BMC 111 and nodes 112-1~112-Y; system board 120 comprises BMC 121 and nodes 122-1~122-Y; system board 130 comprises BMC 131 and nodes 132-1~132-Y. Here, Y is a positive integer.

Commands, signals, and the like issued by the system manager can be delivered to the corresponding system board through connection port 101. Likewise, messages issued by a system board can be returned to the system manager through connection port 101.

As shown in FIG. 1, LAN 102 and I2C bus 103 provide communication paths between the BMCs of the system boards. In other embodiments of the invention, the BMC may optionally also provide CMM functionality.

FIG. 2 shows a schematic diagram of a BMC according to an embodiment of the invention. As shown in FIG. 2, the BMC comprises a hardware part and a software part. The software part comprises HAL 211 and node management units (NMUs) 212-1~212-Y. The hardware part comprises GPIO (General Purpose Input/Output) pins 221, a storage unit 222, a serial port 223, a sensing unit 224, a system interface (SI) 225, a LAN interface 226, and an I2C interface 227.

For each node, the BMC reads the sensing unit 224 to monitor the node's physical parameters (such as CPU temperature, memory temperature, and voltage). For example, the BMC may have three CPU temperature sensors that respectively sense the internal CPU temperatures of the three nodes it manages. The BMC also controls system power-on and power-off through the GPIO pins 221. In addition, the system manager can send IPMI commands to the BMC through an interface such as the LAN interface 226 or the system interface 225 to request that the BMC execute them.

An NMU is management software that implements the IPMI specification. Taking BMC 111 as an example, NMU 1~NMU 3 can respectively manage nodes 112-1~112-3. Because a single BMC manages multiple nodes in embodiments of the invention, the NMUs must share the hardware part of the BMC, and the hardware abstraction layer (HAL) 211 addresses this need. HAL 211 creates a set of logical (virtual) hardware devices for each NMU and maps them to the physical hardware devices.
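The mapping from per-NMU logical devices to shared physical devices can be sketched as a small lookup table. This is a minimal illustration only, not the patented implementation: the class names, the method names, and the pin and sensor numbering are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalHardware:
    """Stand-in for the BMC hardware part (GPIO pins and sensors)."""
    gpio: dict = field(default_factory=dict)     # physical pin number -> level (0 or 1)
    sensors: dict = field(default_factory=dict)  # physical sensor index -> reading


class DeviceHal:
    """Hypothetical HAL that maps each NMU's logical devices to physical ones."""

    def __init__(self, hw):
        self.hw = hw
        self.table = {}  # (nmu_id, logical device name) -> physical identifier

    def create_logical_devices(self, nmu_id, power_pin, cpu_temp_sensor):
        # Each NMU gets its own "power" and "cpu_temp" logical devices.
        self.table[(nmu_id, "power")] = power_pin
        self.table[(nmu_id, "cpu_temp")] = cpu_temp_sensor

    def set_power(self, nmu_id, on):
        # The NMU names only its logical device; the HAL picks the physical pin.
        self.hw.gpio[self.table[(nmu_id, "power")]] = 1 if on else 0

    def read_cpu_temp(self, nmu_id):
        return self.hw.sensors[self.table[(nmu_id, "cpu_temp")]]


hw = PhysicalHardware(sensors={0: 41.0, 1: 55.5, 2: 38.0})
hal = DeviceHal(hw)
for nmu_id in (1, 2, 3):
    # Assumed wiring: NMU n uses GPIO pin 10 + n and temperature sensor n - 1.
    hal.create_logical_devices(nmu_id, power_pin=10 + nmu_id, cpu_temp_sensor=nmu_id - 1)

hal.set_power(2, True)
```

With this layout, no NMU ever holds a physical pin or sensor number, so the wiring could change without touching NMU code.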

FIG. 3 shows a schematic diagram of multiple NMUs sharing the hardware part of the BMC through the HAL. As shown in FIG. 3, when an NMU wants to access an SDR (Sensor Data Record), it does not need to know the address at which its node's SDR is actually stored in storage unit 222. To read SDR data, the NMU only has to tell HAL 211 which SDR record of its corresponding node it wants (for example CPU temperature, memory temperature, or applied voltage), and HAL 211 returns that record of the corresponding node to the NMU. SDR 1~SDR 3 denote the SDR data of nodes 1~3 and correspond to NMU 1~NMU 3, respectively.

Similarly, when an NMU wants to store SDR data, it does not need to know the address at which its node's SDR is actually stored in storage unit 222. The NMU only has to pass the SDR data to HAL 211, and HAL 211 stores it in storage unit 222. In other words, HAL 211 performs the mapping that ties the data an NMU wants to store or retrieve to a location in storage unit 222.
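The address mapping for SDR reads and writes can be sketched as follows. The patent does not specify how storage unit 222 is laid out, so the fixed-size per-node region, the slot arithmetic, and all names here are assumptions chosen only to make the idea concrete.

```python
class StorageUnit:
    """Stand-in for storage unit 222: a flat array of record slots."""

    def __init__(self, size):
        self.slots = [None] * size


class StorageHal:
    """Hypothetical HAL that hides physical SDR addresses from the NMUs."""

    RECORDS_PER_NODE = 4  # assumed fixed-size SDR region per node

    def __init__(self, storage):
        self.storage = storage

    def _address(self, nmu_id, record_index):
        # Translate (NMU, logical record index) into a physical slot number.
        if not 0 <= record_index < self.RECORDS_PER_NODE:
            raise IndexError("record index outside this node's SDR region")
        return (nmu_id - 1) * self.RECORDS_PER_NODE + record_index

    def write_sdr(self, nmu_id, record_index, value):
        self.storage.slots[self._address(nmu_id, record_index)] = value

    def read_sdr(self, nmu_id, record_index):
        return self.storage.slots[self._address(nmu_id, record_index)]


storage = StorageUnit(size=12)  # room for three nodes
hal = StorageHal(storage)
hal.write_sdr(1, 0, "cpu_temp=41")
hal.write_sdr(3, 0, "cpu_temp=38")
```

Both NMUs use record index 0, yet the records land in different physical slots, which is the isolation the description attributes to the HAL.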

The SEL is the System Event Log, which stores a node's system events (such as system abnormalities). When NMU 1~NMU 3 want to access SEL 1~SEL 3, HAL 211 likewise handles the reads and writes to storage unit 222, as described above. The FRU is the Field Replaceable Unit, which records system information such as the number and product name of the system board. When NMU 1~NMU 3 want to access FRU 1~FRU 3, HAL 211 again handles access to storage unit 222 in the same way. Moreover, the data mapping that HAL 211 can handle is not limited to the SDR, SEL, and FRU. For the other functions covered by the IPMI specification, such as Serial Over LAN (SOL), Platform Event Filtering (PEF), sensor monitoring, and chassis control, an NMU can also rely on the HAL to perform the mapping or forwarding.

FIGS. 4A~4C show schematic diagrams of forwarding commands and messages through the HAL according to an embodiment of the invention. As shown in FIG. 4A, communication between the system manager 410 and HAL 211 is bidirectional, and communication between HAL 211 and the NMUs is also bidirectional.

FIG. 4B shows a schematic diagram of the system manager 410 sending an IPMI command to the BMC through HAL 211. As shown in FIG. 4B, the system manager 410 sends an IPMI command to HAL 211. HAL 211 then determines whether the IPMI command arrived via the system interface (SI) (step 421) or via the LAN interface (step 422). If the command arrived via the SI, HAL 211 next determines whether it came from the first transmission port SI 1 (corresponding to node 1), the second transmission port SI 2 (corresponding to node 2), or the third transmission port SI 3 (corresponding to node 3) of the system interface, as shown in steps 431~433. That is, in this embodiment the system interface of the BMC has multiple SI transmission ports, three of which connect the BMC to the system manager 410. If the command arrived via the LAN interface, HAL 211 next determines whether it came from the first transmission port LAN 1 (corresponding to node 1), the second transmission port LAN 2 (corresponding to node 2), or the third transmission port LAN 3 (corresponding to node 3) of the LAN interface, as shown in steps 434~436. That is, in this embodiment the LAN interface of the BMC has multiple LAN transmission ports, three of which connect the BMC to the system manager 410. After the determinations of steps 431~436, HAL 211 knows which of NMU 1~NMU 3 the IPMI command from the system manager 410 is intended for, and it forwards the command to that destination NMU.
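The two-stage decision of FIG. 4B (first the interface, then the port) amounts to a table lookup. The sketch below assumes three ports per interface, as in the embodiment; the function name and the string labels for the interfaces are illustrative choices, not taken from the patent.

```python
# Assumed mapping from (interface, port number) to the NMU that owns the node.
PORT_TO_NMU = {
    ("SI", 1): 1, ("SI", 2): 2, ("SI", 3): 3,
    ("LAN", 1): 1, ("LAN", 2): 2, ("LAN", 3): 3,
}


def route_command(interface, port):
    """Return the number of the NMU that should execute a received command."""
    # Steps 421/422: which interface did the command arrive on?
    if interface not in ("SI", "LAN"):
        raise ValueError(f"unsupported interface: {interface!r}")
    # Steps 431~436: which transmission port of that interface received it?
    try:
        return PORT_TO_NMU[(interface, port)]
    except KeyError:
        raise ValueError(f"no NMU mapped to {interface} port {port}") from None


destination = route_command("SI", 2)  # command received on SI 2 goes to NMU 2
```

Because the destination is derived from the receiving port alone, the command itself does not need to carry a node identifier.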

FIG. 4C shows a schematic diagram of the BMC returning a message to the system manager 410 through HAL 211. After an NMU receives an IPMI command from the system manager 410, it performs the corresponding operation and then returns a response message to the system manager 410 through HAL 211. As shown in FIG. 4C, the NMU sends the response message to HAL 211. HAL 211 then determines whether the response message was received via the system interface (SI) (step 441) or via the LAN interface (step 442). By analyzing the received response message, HAL 211 can determine which NMU issued it (steps 451~453 and steps 454~456). That is, in this embodiment the system interface of the BMC has multiple SI transmission ports, three of which connect the system manager 410 to the BMC, and the LAN interface of the BMC has multiple LAN transmission ports, three of which connect the system manager 410 to the BMC. HAL 211 determines whether the NMU sent the response message via the system interface and then determines which NMU sent it (steps 451~453), so that HAL 211 can return the response message to the system manager 410 through the original receiving interface, for example the SI (steps 461~463). Similarly, HAL 211 determines whether the NMU sent the response message via the LAN interface and then determines which NMU sent it (steps 454~456), so that it can return the response message to the system manager 410 through the original receiving interface, the LAN interface (steps 464~466).

In other words, in embodiments of the invention, when the system manager 410 sends an IPMI command to the BMC through the LAN interface or the system interface, HAL 211 identifies which transmission port received the command and forwards the command to the corresponding NMU for execution. When the NMU has finished executing the command, it returns a message to HAL 211, and HAL 211 returns this response message to the system manager 410 through the original transmission port. Embodiments of the invention are not limited to forwarding IPMI commands only via the LAN interface or the system interface; HAL 211 can also forward IPMI commands via any other interface supported by the IPMI specification.
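The full round trip (receive on a port, execute in the matching NMU, reply on the same port) can be sketched like this. It is an illustrative model only: the classes, the `get_power_status` command name, and the response text are hypothetical and do not reflect real IPMI message formats.

```python
class Nmu:
    """Hypothetical node management unit for one node."""

    def __init__(self, node_id):
        self.node_id = node_id

    def execute(self, command):
        # Placeholder for real IPMI command handling.
        return f"node {self.node_id}: {command} done"


class RoutingHal:
    """Hypothetical HAL that forwards commands and returns responses."""

    def __init__(self, nmus_by_port):
        self.nmus_by_port = nmus_by_port  # (interface, port) -> Nmu
        self.sent = []                    # (port, message) pairs sent back out

    def on_command(self, port, command):
        nmu = self.nmus_by_port[port]     # identify the NMU from the receiving port
        response = nmu.execute(command)   # the NMU executes the command
        self.sent.append((port, response))  # reply leaves via the original port
        return response


nmus = {n: Nmu(n) for n in (1, 2, 3)}
ports = {(iface, n): nmus[n] for iface in ("SI", "LAN") for n in (1, 2, 3)}
hal = RoutingHal(ports)
reply = hal.on_command(("LAN", 2), "get_power_status")
```

Remembering the receiving port for the duration of the call is enough here; a real BMC handling concurrent requests would need to track the port per outstanding command.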

In summary, embodiments of the invention have at least the following advantages: (1) they reduce the number of BMC chips required in high-density servers (such as blade servers), which lowers cost; and (2) they use space efficiently, increasing the number of nodes and the computing capacity of the server, and they lower the temperature of the system because fewer BMC chips are present.

Although the invention has been disclosed above by way of preferred embodiments, these are not intended to limit the invention. Those skilled in the art may make various changes and refinements without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the appended claims.

Claims (8)

1. A server system, comprising:
at least one system board, the system board comprising a baseboard management controller and a plurality of nodes, the baseboard management controller comprising a plurality of node management units, a hardware abstraction layer, and a hardware resource, wherein the node management units respectively manage the nodes and, under the control of the hardware abstraction layer, share the hardware resource;
a connection port for connecting to an external system manager; and
an internal channel connected to the system board and the connection port,
wherein the system board further comprises a plurality of transmission ports that connect the baseboard management controller to the external system manager;
if an external command is delivered to the baseboard management controller through the hardware resource, the hardware abstraction layer identifies which transmission port received the external command, so as to forward the external command to a corresponding node management unit for execution; and
after the corresponding node management unit executes the external command, the corresponding node management unit returns a message to the hardware abstraction layer, so that the message is returned through that transmission port to the external system manager.

2. The server system of claim 1, wherein the hardware abstraction layer creates a logical hardware device for each node management unit, the logical hardware device being mapped to the hardware resource.

3. The server system of claim 2, wherein, when one of the node management units wants to use the hardware resource, that node management unit sends a command to the hardware abstraction layer, and the hardware abstraction layer accesses the hardware resource according to the command and returns a result to that node management unit.

4. The server system of claim 2, wherein, when one of the node management units wants to use the hardware resource, that node management unit sends data to the hardware abstraction layer, and the hardware abstraction layer accesses the hardware resource according to the data.

5. An operation method of a server system, the server system comprising at least one system board, the system board comprising a baseboard management controller and a plurality of nodes, the baseboard management controller comprising a plurality of node management units, a hardware abstraction layer, and a hardware resource, the node management units respectively managing the nodes, the operation method comprising:
(A) under the control of the hardware abstraction layer, the node management units share the hardware resource;
(B) when one of the node management units wants to use the hardware resource, that node management unit sends a command or data to the hardware abstraction layer, and the hardware abstraction layer accordingly uses the hardware resource on behalf of that node management unit; and
(C) if an external command is received, the hardware abstraction layer identifies which transmission port of the hardware resource received the external command and forwards it to a corresponding node management unit for execution, and after the external command has been executed, the corresponding node management unit returns a message to the hardware abstraction layer, which returns the message through that transmission port to an external system manager.

6. The operation method of claim 5, wherein step (A) comprises:
the hardware abstraction layer creating a logical hardware device for each node management unit, the logical hardware device being mapped to the hardware resource.

7. The operation method of claim 6, wherein step (B) comprises:
when one of the node management units wants to use the hardware resource, that node management unit sending the command to the hardware abstraction layer, and the hardware abstraction layer accessing the hardware resource according to the command and returning a result to that node management unit.

8. The operation method of claim 6, wherein step (B) comprises:
when one of the node management units wants to use the hardware resource, that node management unit sending the data to the hardware abstraction layer, and the hardware abstraction layer accessing the hardware resource according to the data.
CN201010243788.0A 2010-07-30 2010-07-30 Server system and operation method thereof Active CN102346707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010243788.0A CN102346707B (en) 2010-07-30 2010-07-30 Server system and operation method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010243788.0A CN102346707B (en) 2010-07-30 2010-07-30 Server system and operation method thereof

Publications (2)

Publication Number Publication Date
CN102346707A (en) 2012-02-08
CN102346707B (en) 2014-12-17

Family

ID=45545402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010243788.0A Active CN102346707B (en) 2010-07-30 2010-07-30 Server system and operation method thereof

Country Status (1)

Country Link
CN (1) CN102346707B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9529583B2 (en) * 2013-01-15 2016-12-27 Intel Corporation Single microcontroller based management of multiple compute nodes
TWI614613B (en) * 2014-09-11 2018-02-11 廣達電腦股份有限公司 Server system and associated control method
CN105988908B (en) * 2015-02-04 2018-11-06 昆达电脑科技(昆山)有限公司 The global data processing system of single BMC multiservers
US10587935B2 (en) * 2015-06-05 2020-03-10 Quanta Computer Inc. System and method for automatically determining server rack weight
CN105099776A (en) * 2015-07-21 2015-11-25 曙光云计算技术有限公司 Cloud server management system
US10116750B2 (en) * 2016-04-01 2018-10-30 Intel Corporation Mechanism for highly available rack management in rack scale environment
CN108337307B (en) * 2018-01-31 2021-06-29 郑州云海信息技术有限公司 A kind of multi-channel server and communication method between nodes thereof
CN109271330A (en) * 2018-08-16 2019-01-25 华东计算技术研究所(中国电子科技集团公司第三十二研究所) General BMC system based on integrated information system
CN113970961A (en) * 2021-10-25 2022-01-25 西安超越申泰信息科技有限公司 Method and server for BIOS to control heat dissipation through BMC
CN118012807A (en) * 2024-02-01 2024-05-10 超聚变数字技术有限公司 Calling method, system, chip and server of hardware equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030130969A1 (en) * 2002-01-10 2003-07-10 Intel Corporation Star intelligent platform management bus topology
CN1983987A (en) * 2006-05-12 2007-06-20 华为技术有限公司 Monitor of rear card board in intelligent-platform management interface system
US20070233833A1 (en) * 2006-03-29 2007-10-04 Inventec Corporation Data transmission system for electronic devices with server units
CN101056205A (en) * 2007-04-04 2007-10-17 杭州华为三康技术有限公司 A management method, system and device based on ATCA architecture-based server

Also Published As

Publication number Publication date
CN102346707A (en) 2012-02-08

Similar Documents

Publication Publication Date Title
TWI423039B (en) Server system and operation method thereof
CN102346707B (en) Server system and operation method thereof
US8880687B1 (en) Detecting and managing idle virtual storage servers
US20200278880A1 (en) Method, apparatus, and system for accessing storage device
US9864517B2 (en) Actively responding to data storage traffic
US8095701B2 (en) Computer system and I/O bridge
EP2705433B1 (en) Method and system for dynamically creating and servicing master-slave pairs within and across switch fabrics of a portable computing device
US7921185B2 (en) System and method for managing switch and information handling system SAS protocol communication
US10346156B2 (en) Single microcontroller based management of multiple compute nodes
US20110145452A1 (en) Methods and apparatus for distribution of raid storage management over a sas domain
US11403141B2 (en) Harvesting unused resources in a distributed computing system
US20160080210A1 (en) High density serial over lan managment system
TW200925878A (en) System and method for management of an IOV adapter through a virtual intermediary in an IOV management partition
CN107835089B (en) Method and device for managing resources
US20180081558A1 (en) Asynchronous Discovery of Initiators and Targets in a Storage Fabric
WO2022141250A1 (en) Data transmission method and related apparatus
WO2014206078A1 (en) Memory access method, device and system
JP2017537404A (en) Memory access method, switch, and multiprocessor system
JP2023533445A (en) Memory Allocation and Memory Write Redirection in Cloud Computing Systems Based on Memory Module Temperature
JP6760579B2 (en) Network line card (LC) integration into host operating system (OS)
WO2025086691A1 (en) Rdma network configuration method, and server
WO2019223444A1 (en) Data storage system
US10209923B2 (en) Coalescing configuration engine, coalescing configuration tool and file system for storage system
US11294847B1 (en) Fibre channel host onboarding system
CN104461951A (en) Physical and virtual multipath I/O dynamic management method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant