Disclosure of Invention
In this embodiment, a cluster server is provided to solve the problem of low storage resource utilization of cluster servers in the related art.
The embodiment provides a cluster server, which comprises a switch and at least three servers, wherein each server is connected with the switch;
the server comprises a storage device, wherein the storage device comprises a hard disk controller and a disk array, and each hard disk controller is connected with the disk array of at least one other server through a disk connector; and
the at least three servers comprise a main server, and the main server is used for controlling each server to acquire or release control rights over the disk array of the current server and/or the disk array of at least one other server.
In some embodiments, each hard disk controller is connected to a disk array of two other servers through a disk connector, and the storage devices of each server are connected in a ring topology.
In some embodiments, each server comprises a central processor connected with the switch. The central processor of the main server monitors the online state of the other servers and, when another server drops offline, notifies the servers adjacent to the dropped server to acquire control of the dropped server's disk array.
In some embodiments, the other servers establish heartbeat connections with the main server, and the main server monitors their online state through heartbeat information.
In some embodiments, the server further includes a baseboard management controller, the baseboard management controller is connected with the switch, the baseboard management controller is further connected with a hard disk controller of the current server, and the baseboard management controller is used for acquiring or releasing control rights to a disk array of the current server and/or a disk array of at least one other server.
In some of these embodiments, the baseboard management controller monitors the running state of each hardware component in the current server and sends the running state to the central processor of the current server; the central processor of each of the other servers releases control of its disk array when the running state is abnormal and notifies the main server of the abnormality; and the central processor of the main server, after receiving the notification of the abnormal running state, notifies the adjacent servers to acquire control of the disk array of the server whose running state is abnormal.
In some embodiments, a server is further configured to perform self-check and repair on its own hardware after transferring control of its disk array to another server, and to re-acquire control of its disk array after the self-check and repair succeeds.
In some embodiments, the disk array of each server is powered by an independent power supply, and the self-check and repair is performed by restarting the current server.
In some of these embodiments, the disk connector is a serial attached small computer system interface connector.
In some of these embodiments, the storage device of each server is physically disposed inside that server.
Compared with the related art, the cluster server provided in the embodiment comprises a switch and at least three servers, each connected with the switch. Each server comprises a storage device, the storage device comprises a hard disk controller and a disk array, and each hard disk controller is connected with the disk array of at least one other server through a disk connector. The at least three servers comprise a main server that controls each server to acquire or release control of the disk array of the current server and/or the disk array of at least one other server. This solves the problem of low storage resource utilization of cluster servers in the related art and improves the storage resource utilization of the cluster server.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below, so that the other features, objects, and advantages of the application will be more thoroughly understood.
Detailed Description
The present application will be described and illustrated with reference to the accompanying drawings and examples for a clearer understanding of the objects, technical solutions and advantages of the present application.
Unless defined otherwise, technical or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terms "a," "an," "the," "these" and similar terms in this application are not intended to be limiting in number, but may be singular or plural. The terms "comprises," "comprising," "includes," "including," "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules (units) is not limited to those steps or modules (units), but may include other steps or modules (units) not listed or inherent to such process, method, article, or apparatus. The terms "connected," "coupled," and the like in this disclosure are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as used herein means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that A exists alone, A and B exist simultaneously, or B exists alone. Typically, the character "/" indicates that the associated objects are in an "or" relationship. The terms "first," "second," "third," and the like, as referred to in this disclosure, merely distinguish similar objects and do not represent a particular ordering of objects.
The present embodiment provides a cluster server that includes three or more servers. Fig. 1 is a schematic diagram of a server (which may also be referred to as a host) of the present embodiment; as shown in Fig. 1, each server includes a computing section 10 and a storage section 20. The computing section 10 typically includes a central processor 110 (CPU, also referred to as a master controller), and the storage section typically consists of a storage device 210.
Storage device 210 includes disk array 211. It should be noted that, in this embodiment, the disk array 211 may include only one disk drive, or may be a disk group formed by combining a plurality of disk drives. The disk drives making up the disk array are not limited to HDD or SSD drives; in some embodiments they may be a combination of HDD and SSD drives. In addition, the disk array 211 may be a high-capacity volume formed by serially concatenating all disk drives with JBOD (Just a Bunch Of Disks) technology, or may be a disk array built by the server using RAID (redundant array of independent disks) technology to improve fault tolerance.
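As a rough, illustrative aside (not part of the claimed embodiment), the usable capacity of such a disk group differs between JBOD concatenation and common RAID levels; a minimal Python sketch, with the function name and mode labels chosen for illustration:

```python
def usable_capacity(drive_sizes_gb, mode):
    """Rough usable-capacity estimate for a disk group.

    JBOD concatenates drives, so all capacity is usable; RAID 5
    sacrifices one drive's worth of space for parity and is limited
    by the smallest drive; RAID 1 mirrors a pair of drives.
    """
    if mode == "jbod":
        return sum(drive_sizes_gb)
    if mode == "raid5":                        # needs >= 3 drives
        return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)
    if mode == "raid1":                        # mirrored pair
        return min(drive_sizes_gb)
    raise ValueError(f"unknown mode: {mode}")

drives = [4000, 4000, 4000, 4000]              # four 4 TB drives
print(usable_capacity(drives, "jbod"))         # 16000
print(usable_capacity(drives, "raid5"))        # 12000
```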
The interface device between the computing section 10 and the disk array 211 is referred to as the hard disk controller 212, also known as a disk drive adapter. At the software level, the hard disk controller 212 interprets commands issued by the computing section 10, sends control signals to the disk drives, detects drive status, and writes and reads data to and from the disks in accordance with a prescribed disk data format. At the hardware level, the hard disk controller 212 provides one or more physical interfaces for interfacing with disk arrays 211. Through these physical interfaces, the hard disk controller 212 may connect to one or more disk arrays 211 and gain or release control of the physically connected disk arrays 211.
Each disk array 211 may likewise include one or more physical interfaces for interfacing with hard disk controllers 212. For example, a disk array 211 based on SAS (serial attached small computer system interface) technology may be connected to the hard disk controllers 212 of a plurality of servers, so that the same disk array 211 is shared by those servers.
The computing section 10 and the storage section 20 of each server may be physically co-located, such as within the same server chassis. The computing section 10 and the storage section 20 may be provided on the same main circuit board or may be provided separately; for example, the storage section 20 may be provided on a server backplane while the computing section 10 is provided on the main circuit board.
In addition to the storage section 20 and the computing section 10, the server typically has two core firmware components: the BIOS (basic input output system) (not shown) and the BMC (baseboard management controller) (not shown). In the computer system, the BIOS operates at a lower, more basic level than the server's operating system; it is mainly responsible for detecting, accessing and initializing the low-level hardware resources and handing them off to the operating system, ensuring smooth and safe operation of the system as a whole. The BMC runs a small operating system independent of the server's operating system and is usually integrated on the motherboard or plugged into it through PCIe or another interface. The BMC is typically exposed through a standard RJ45 network port and has an independent IP address and firmware system. The server may be managed through BMC instructions for unattended operations such as remote management, monitoring, installation and restarting.
Fig. 2 is a schematic structural diagram of the cluster server according to the present embodiment. In Fig. 2, five servers are taken as an example; in other embodiments, the number of servers may be any number not less than three, generally set according to the computing and storage resource requirements of the cluster server, and is not limited in this embodiment.
The cluster server shown in Fig. 2 includes a switch 40 and five servers. Each server is connected to the switch 40. The hard disk controller 212 of each server is connected to the disk array 211 of the current server and the disk array 211 of at least one other server by a disk connector (e.g., an SAS connector). Here, "other servers" refers to the servers in the cluster other than the current server.
Of these five servers, one server becomes the master server (server A in Fig. 2) by self-election or user configuration, and the remaining servers are referred to as slave servers relative to the master server. The master server controls each server to acquire or release control of the disk array of the current server and/or the disk array of at least one other server.
However, the master and slave servers in this embodiment are not necessarily servers with a master-slave relationship in actual business processing; the designation merely indicates that the master server is dominant in implementing this embodiment. In actual business processing, the master server may have the same rank as the slave servers, or a lower or higher rank than some slave server.
According to the cluster server provided by this embodiment, one of the servers serves as the master server, the hard disk controller of each server is connected through disk connectors to the disk array of the current server and the disk array of at least one other server, and the master server controls the servers to acquire or release control of the disk array of the current server and/or the disk array of at least one other server. Thus, when one server fails, the master server can direct another server to take over the disk array 211 of the failed server, improving the utilization of the disk arrays 211. Compared with the prior art, which shares the disk array 211 through an expensive SAS switch, this embodiment needs no additional SAS switch; it directly reuses the switch 40 that the cluster already uses for service processing, greatly reducing cost.
The computing section 10 of each server includes a central processor 110, and the central processor 110 of each server is connected to the switch 40. The central processor 110 of each slave server may report the status of the current server to the master server through the switch 40, periodically or aperiodically. Once the status reported by a slave server indicates that it cannot complete service processing normally, the master server notifies that slave server to release control of its disk array and notifies other slave servers to acquire control of the failed slave server's disk array. In some embodiments, the slave server may also actively release control of its own disk array in the event of a failure.
However, when a slave server drops offline or powers down abnormally, it cannot report its state to the master server. Therefore, in some embodiments, the master server actively monitors the state of the slave servers, for example by monitoring their online state. When a slave server drops offline, the central processor 110 of the master server notifies the servers adjacent to the dropped slave server to acquire control of the dropped server's disk array 211.
A heartbeat connection may be used to detect the online status of the slave servers: each slave server establishes a heartbeat connection with the master server and periodically sends heartbeat (keep-alive) information over it. If the master server does not receive heartbeat information from a slave server within a set time interval, that slave server is considered to have dropped offline.
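The heartbeat-timeout logic described above can be sketched in Python. This is a minimal illustration; the class name, method names, and timeout value are assumptions, not taken from the embodiment:

```python
import time

class HeartbeatMonitor:
    """Tracks the last heartbeat time of each slave server and
    reports which servers have exceeded the timeout (dropped)."""

    def __init__(self, timeout=15.0):
        self.timeout = timeout       # set time interval, in seconds
        self.last_seen = {}          # server id -> last heartbeat time

    def record(self, server_id, now=None):
        """Called whenever heartbeat information arrives from a slave."""
        self.last_seen[server_id] = time.monotonic() if now is None else now

    def dropped(self, now=None):
        """Return the servers whose last heartbeat is older than the timeout."""
        now = time.monotonic() if now is None else now
        return [s for s, t in self.last_seen.items() if now - t > self.timeout]
```

In use, the master server would call `record` on every heartbeat and poll `dropped` to decide which disk arrays need to be handed over.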
In each server, both the central processor 110 and the BMC may be used to control the hard disk controller 212 to gain or release control of the disk array 211. The BMC is connected with the switch 40 through an RJ45 network port, and is also connected with the hard disk controller 212 of the current server. The BMC is configured to control the hard disk controller 212 to acquire or release control of the disk array 211 of the current server and/or the disk array 211 of at least one other server. In some cases, if the slave server's operating system crashes or a CPU failure results in an inability to control the hard disk controller 212 to release control of the current server's disk array 211, the master server may control the slave server's BMC through the switch 40 to release control of the current server's disk array 211. Similarly, the master server may also control the BMCs of other slave servers to obtain control of a certain disk array 211 through the switch 40.
Since the BMC is a small operating system independent of the server operating system, the BMC can still operate normally even if the operating system of the slave server crashes due to hardware failure or software failure, and it is ensured that the control right of the disk array 211 of the cluster server can be handed over normally.
The BMC exists independently as a third party within the server. It can monitor hardware information of the whole server, such as system temperature, supply voltage and fan speed, and can monitor the working state of the system network module, user interaction modules (such as USB and display modules) or other modules. Once some module of the server exhibits an abnormality that affects the server's normal service capability, the BMC determines that the server cannot complete its storage function; the BMC then transmits the abnormality information to the central processor 110 of the current server, or directly to the central processor of the master server through the switch 40. The monitoring of the current server's running state by the central processor 110 may thus be implemented via the BMC. For example, the BMC monitors the running state of each hardware component in the current server and sends it to the central processor 110 of the current server; when the running state is abnormal, the central processor 110 of the affected server releases control of its disk array 211 and notifies the master server of the abnormality; after receiving the notification, the central processor 110 of the master server notifies the adjacent servers to acquire control of the disk array 211 of the abnormal server.
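The slave-side reaction to a BMC hardware report described above can be sketched as follows. All function, callback, and hardware-module names here are hypothetical stand-ins for illustration, not identifiers from the embodiment:

```python
def handle_bmc_report(server_id, hw_states, release_local_control, notify_master):
    """Sketch of a slave server's reaction to a BMC hardware report.

    hw_states maps hardware module name -> 'ok' or 'abnormal'. Any
    abnormal module causes the server to release control of its disk
    array and notify the master server of the abnormality.
    """
    abnormal = [name for name, state in hw_states.items() if state != "ok"]
    if abnormal:
        release_local_control()          # hand over the local disk array
        notify_master(server_id, abnormal)
    return abnormal
```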
To avoid the cost increase of interconnecting all disk arrays 211 in the cluster with an SAS switch, each hard disk controller 212 in this embodiment connects to the disk array 211 of the current server and the disk array 211 of at least one other server via a disk connector (SAS connector). With such connections, the storage devices of the servers may form a linear topology such as that shown in Fig. 3. Under a linear topology, when a server at either end of the topology fails, its storage device can only be taken over by its single adjacent server; if that adjacent server already carries a heavy computing load, taking over the storage device may overload it and cause it to fail as well, reducing the stability of the cluster. Moreover, if two consecutive servers at one end of the topology fail, the storage device of the outermost server cannot be taken over by any server. The utilization of the storage devices therefore still has room for improvement.
To this end, in some of these embodiments, each hard disk controller 212 connects to the disk array 211 of the current server and the disk arrays of two other servers through disk connectors (SAS connectors), so that the storage devices of the servers form a ring topology such as that shown in Fig. 4. With this connection scheme, when any one server fails, either of its two adjacent servers can take over its storage device; even if two consecutive adjacent servers fail, one server can still take over the disk arrays of both failed servers; only when three consecutive adjacent servers fail can one server's storage device become impossible for any server to take over. The ring topology therefore improves both the stability of the cluster and the utilization of the storage devices.
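The adjacency property of the ring topology can be illustrated with a short Python sketch (server labels follow the A-E example of Fig. 4; the function name is illustrative):

```python
def ring_neighbors(servers, failed):
    """In a ring topology, each server's storage connects to its two
    neighbors, so a failed server's disk array can be taken over by
    either adjacent server that is still healthy."""
    i = servers.index(failed)
    n = len(servers)
    return [servers[(i - 1) % n], servers[(i + 1) % n]]

ring = ["A", "B", "C", "D", "E"]
print(ring_neighbors(ring, "B"))   # ['A', 'C']
print(ring_neighbors(ring, "A"))   # ['E', 'B']  (the ring wraps around)
```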
The working procedure of the cluster server of this embodiment is described below.
Example 1
In this example, the hard disk controller 212 of each server is controlled by the master server to acquire or release control of the disk array of the current server and/or other servers.
Referring to the topology of Fig. 4, and taking server A as the master server and the other servers as slave servers, the working process of the cluster server provided in this example includes the following steps:
Step 1, the master server and the slave servers monitor the running state of each hardware component in their respective servers.
Step 2, when the running state of server B becomes abnormal, server B controls its hard disk controller 212 to release control of its disk array 211.
Step 3, server B sends a disk array control right handover instruction to server A.
The disk array control right handover instruction sent by server B to server A carries the identification information of server B, or the identification information of server B's disk array.
Step 4, when server A receives, through the switch 40, the disk array control right handover instruction sent by server B, it queries a preconfigured mapping table according to the identification information carried in the instruction and determines that the servers adjacent to server B are server A and server C.
Step 5, since server A confirms that it is itself adjacent to server B, it may directly control its hard disk controller 212 to acquire control of server B's disk array 211.
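Steps 4 and 5 above, as performed on the master server, can be sketched as follows. The data shapes, callback names, and the choice of the first neighbor as forwarding target are assumptions for illustration only:

```python
def handle_handover(instruction, adjacency, self_id, acquire_control, forward_to):
    """Master-side sketch of the handover decision.

    `instruction` carries the failed server's identification;
    `adjacency` is the preconfigured mapping table from each server
    to its adjacent servers. If the master itself is adjacent to the
    failed server, it takes over directly; otherwise it forwards a
    disk array control right acquisition instruction to a neighbor.
    """
    failed = instruction["server_id"]
    neighbors = adjacency[failed]
    if self_id in neighbors:
        acquire_control(failed)        # take over the failed server's array
        return self_id
    target = neighbors[0]              # pick an adjacent server to take over
    forward_to(target, failed)
    return target

# Excerpt of the Fig. 4 ring: server B's neighbors are A and C.
adjacency = {"B": ["A", "C"]}
```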
Example 2
In this example, the hard disk controller 212 of each server is controlled by the master server to acquire or release control of the disk array of the current server and/or other servers.
Referring to the topology of Fig. 4, and taking server A as the master server and the other servers as slave servers, the working process of the cluster server provided in this example includes the following steps:
Step 1, the master server and the slave servers monitor the running state of each hardware component in their respective servers.
Step 2, when the running state of server B becomes abnormal, server B controls its hard disk controller 212 to release control of its disk array 211.
Step 3, server B sends a disk array control right handover instruction to server A.
The disk array control right handover instruction sent by server B to server A carries the identification information of server B, or the identification information of server B's disk array.
Step 4, when server A receives, through the switch 40, the disk array control right handover instruction sent by server B, it queries a preconfigured mapping table according to the identification information carried in the instruction and determines that the servers adjacent to server B are server A and server C.
Step 5, server A finds that it is itself adjacent to server B, but its own running state indicates that its workload is already heavy; server A therefore sends a disk array control right acquisition instruction to server C, carrying the identification information of server B or of the failed server's disk array 211.
Step 6, after receiving the disk array control right acquisition instruction, server C acquires control of server B's disk array 211 according to the identification information carried in the instruction.
The adjacent servers may be one or more servers. For example, in a ring topology each server has two adjacent servers, and in other embodiments, where the disk arrays 211 are based on SAS technology, the two adjacent servers may jointly take over control of the disk array 211 of the same failed server.
The master server may maintain a mapping table between the physical interfaces of each hard disk controller 212 and the disk arrays 211, recording the identification information of the disk array 211 connected to each physical interface or of the server to which that disk array 211 belongs; it may also maintain topology information of the cluster to determine the adjacent servers of each server. After a server receives a disk array control right acquisition instruction, it determines, from the identification information carried in the instruction, the physical interface connected to the disk array 211 to be taken over, and controls its hard disk controller 212 to acquire control of the failed server's disk array 211 through that physical interface.
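The mapping table between physical interfaces and disk arrays might be sketched as follows; the class name, port labels, and method names are illustrative assumptions:

```python
class PortMap:
    """Sketch of the mapping table: hard-disk-controller physical
    interface (port) -> identification of the server whose disk
    array is attached to it. Used to decide which physical interface
    a take-over must go through."""

    def __init__(self):
        self.port_to_owner = {}                 # port id -> server id

    def register(self, port, server_id):
        self.port_to_owner[port] = server_id

    def port_for_server(self, server_id):
        """Find the local port reaching the given server's disk array."""
        for port, owner in self.port_to_owner.items():
            if owner == server_id:
                return port
        raise KeyError(f"no port connected to {server_id}'s disk array")

pm = PortMap()
pm.register("sas0", "B")    # port sas0 reaches server B's disk array
pm.register("sas1", "E")
print(pm.port_for_server("B"))   # sas0
```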
Example 3
In this example, the hard disk controller 212 of each server is controlled by the master server to acquire or release control of the disk array of the current server and/or other servers, but the slave servers need not actively report their running states.
Referring to the topology of Fig. 4, and taking server A as the master server and the other servers as slave servers, the working process of the cluster server provided in this example includes the following steps:
Step 1, each slave server establishes a heartbeat connection with server A (the master server).
Step 2, server A monitors the heartbeat information of each server in the cluster and determines from the heartbeat information that server B has dropped offline.
Step 3, server A queries a preconfigured mapping table according to the identification information of server B and determines that the servers adjacent to server B are server A and server C.
Step 4, server A finds that it is itself adjacent to server B, but its own running state indicates that its workload is already heavy; server A therefore sends a disk array control right acquisition instruction to server C, carrying the identification information of server B or of server B's disk array 211.
Step 5, after receiving the disk array control right acquisition instruction, server C acquires control of server B's disk array 211 according to the identification information carried in the instruction.
Step 6, after server B has, actively or passively, transferred control of its disk array 211 to server C due to the failure or disconnection, server B performs self-check and repair on its hardware.
Step 7, if server B's self-check and repair succeeds, server B re-acquires control of its disk array 211.
When server B re-acquires control of its disk array 211, it may send a disk array control right acquisition request to server A. After receiving server B's request, server A sends a disk array control right release instruction to server C, which currently holds control of server B's disk array; after receiving notice that server C has successfully released the control right, server A sends a confirmation message to server B. Upon receiving the confirmation message, server B re-acquires control of its disk array. In this way, self-check and self-repair of the failed server is achieved.
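The re-acquisition handshake just described can be summarized as an ordered message sequence. This is a sketch only; the message names are illustrative, not taken from the embodiment:

```python
def reacquire_flow(requester, master, holder):
    """Ordered message sequence for a repaired server re-acquiring
    its own disk array: the requester asks the master, the master
    tells the current holder to release, waits for confirmation,
    then confirms back to the requester, which finally re-acquires
    control of its disk array."""
    return [
        (requester, master, "acquire_request"),      # B asks A
        (master, holder, "release_instruction"),     # A tells C to release
        (holder, master, "release_done"),            # C confirms the release
        (master, requester, "acquire_confirmed"),    # A confirms to B
        (requester, requester, "acquire_control"),   # B takes back its array
    ]

for src, dst, msg in reacquire_flow("B", "A", "C"):
    print(f"{src} -> {dst}: {msg}")
```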
The disk array 211 of each server is powered by an independent power supply, so a server can perform self-check and repair by restarting itself while its disk array 211 remains continuously powered and can still be taken over and used by other servers.
The cluster server may further comprise a control node connected to the switch 40 for configuring each server, for example configuring each server's control program, identification information, or stored mapping table. In addition, the BMC of each server can be controlled by the control node to provide remote unattended functions such as remote restarting.
In summary, a conventional cluster usually cuts off service to an abnormal node and can no longer access its storage. The present embodiment completes the cluster service at the hardware level and effectively reuses the storage section of the abnormal device, multiplexing and accessing its contents. In this embodiment, the disk arrays of the servers are interconnected by disk connectors, so that the storage sections of the servers form a whole over which control rights can be handed over, and one of the servers participates in cluster control as the master server. Once an abnormality occurs, the handover of disk array control rights can be determined rapidly, greatly improving the stability and safety of the cluster scheme.
It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to be limiting. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure in accordance with the embodiments provided herein.
It is to be understood that the drawings are merely illustrative of some embodiments of the present application and that it is possible for those skilled in the art to adapt the present application to other similar situations without the need for inventive work. In addition, it should be appreciated that while the development effort might be complex and lengthy, it would nevertheless be a routine undertaking of design, fabrication, or manufacture for those of ordinary skill having the benefit of this disclosure, and thus should not be construed as a departure from the disclosure.
The term "embodiment" in this disclosure means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive. It will be clear or implicitly understood by those of ordinary skill in the art that the embodiments described in the present application can be combined with other embodiments without conflict.
The above examples represent only a few embodiments of the present application; although described in detail, they are not to be construed as limiting the scope of the patent claims. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection should be determined by the appended claims.