CN108600008B - Server management method, server management device and distributed system - Google Patents
- Publication number
- CN108600008B (application CN201810374590.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0803—Configuration setting
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention provides a server management method, a server management device and a distributed system, and relates to the technical field of big data. The server management method is applied to a master server of a distributed system that also has a plurality of slave servers communicatively connected to the master server. The method comprises the following steps: distributing a series of tasks to any slave server based on the registration request information sent by that slave server, so that the slave server executes the series of tasks; detecting whether logout request information sent by the slave server is received, wherein the logout request information is generated when an abnormal condition exists on a disk, a central processing unit or a memory of the slave server, or on a network or a process of the slave server; and if the logout request information is received, acquiring and storing the data generated by the slave server executing the series of tasks. This method addresses the problem of low task execution reliability in the prior art.
Description
Technical Field
The invention relates to the technical field of big data, in particular to a server management method, a server management device and a distributed system.
Background
With the continuous development of information technology, the amount of data keeps growing. How to obtain valuable data from a large amount of data is a key problem for big data technology. In the prior art, a distributed system is generally formed from a plurality of servers to process massive data, so that valuable data can be obtained quickly and efficiently.
The inventor has found through research that, in existing distributed systems, an abnormal condition occurring while a data processing task is being executed can terminate the processing. Data is then easily lost, and this data loss in turn makes task execution unreliable.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a server management method, a server management apparatus and a distributed system, so as to solve the problem of low reliability of task execution in the prior art.
In order to achieve the above purpose, the embodiment of the invention adopts the following technical scheme:
a server management method applied to a master server of a distributed system, the distributed system further having a plurality of slave servers communicatively connected to the master server, the server management method comprising:
distributing a series of tasks to any slave server based on the registration request information sent by the slave server so as to enable the slave server to execute the series of tasks;
detecting whether logout request information sent by the slave server is received, wherein the logout request information is generated when an abnormal condition exists on a disk, a central processing unit or a memory of the slave server or generated when an abnormal condition exists on a network or a process of the slave server;
and if the logout request information is received, acquiring and storing data generated by the execution of the series of tasks by the slave server.
In a preferred option of the embodiment of the present invention, in the server management method, after the step of distributing a series of tasks to any slave server based on the registration request information sent by the slave server so that the slave server executes the series of tasks is performed, the method further includes:
determining task attributes of the series of tasks according to the series of tasks;
and configuring a corresponding heartbeat mechanism for the slave server executing the series of tasks according to the task attributes.
In a preferred option of the embodiment of the present invention, in the server management method, after the step of acquiring and storing the data generated by the execution of the series of tasks by the slave server if the logout request information is received is performed, the method further includes:
detecting whether registration request information retransmitted by the slave server is received;
and if the retransmitted registration request information is received, transmitting the stored data generated based on the execution of the series of tasks to the slave server so that the slave server executes the series of tasks.
In a preferred alternative of the embodiment of the present invention, in the server management method, the step of sending the stored data generated by executing the series of tasks to the slave server so that the slave server executes the series of tasks when receiving the retransmitted registration request information includes:
if the retransmitted registration request information is received, judging, according to the registration request information, whether the slave server requests allocation of an unhook task, wherein the unhook task is the series of tasks that the slave server did not finish executing because of the abnormal condition;
and if the slave server requests allocation of the unhook task, sending the stored data generated based on executing the series of tasks to the slave server so as to enable the slave server to execute the series of tasks.
In a preferred option of the embodiment of the present invention, in the server management method, after the step of acquiring and storing the data generated by the execution of the series of tasks by the slave server if the logout request information is received is performed, the method further includes:
assigning the series of tasks to other slave servers;
and sending the stored data generated based on the execution of the series of tasks to the other slave servers so as to enable the other slave servers to execute the series of tasks.
An embodiment of the present invention further provides a server management apparatus, which is applied to a master server of a distributed system, where the distributed system further includes a plurality of slave servers in communication connection with the master server, and the server management apparatus includes:
a task allocation module for allocating a series of tasks to any one of the slave servers based on the registration request information transmitted from the slave server so that the slave server executes the series of tasks;
a first detection module for detecting whether logout request information sent by the slave server is received, wherein the logout request information is generated when an abnormal condition exists on a disk, a central processing unit or a memory of the slave server or generated when an abnormal condition exists on a network or a process of the slave server;
and the data acquisition module is used for acquiring and storing the data generated by the execution of the series of tasks by the slave server when the logout request information is received.
In a preferred option of the embodiment of the present invention, the server management apparatus further includes:
the task attribute determining module is used for determining the task attributes of the series of tasks according to the series of tasks;
and the heartbeat mechanism configuration module is used for configuring a corresponding heartbeat mechanism for the slave server executing the series of tasks according to the task attributes.
In a preferred option of the embodiment of the present invention, the server management apparatus further includes:
the second detection module is used for detecting whether registration request information retransmitted by the slave server is received;
and the data sending module is used for sending the stored data generated based on the execution of the series of tasks to the slave server when the retransmitted registration request information is received, so that the slave server executes the series of tasks.
In a preferred option of the embodiment of the present invention, in the server management apparatus, the data transmission module includes:
the task judgment submodule is used for judging whether the slave server requests to distribute the unhook task according to the registration request information when the retransmitted registration request information is received, wherein the unhook task is a series of tasks which are not executed and completed by the slave server due to the abnormal condition;
and the data sending submodule is used for sending the stored data generated based on the execution of the series of tasks to the slave server when the slave server requests to distribute the unhook task, so as to enable the slave server to execute the series of tasks.
The embodiment of the invention also provides a distributed system which comprises a master server and a plurality of slave servers in communication connection with the master server;
the master server is used for distributing a series of tasks to any slave server based on the registration request information sent by the slave server;
the slave server is used for executing the series of tasks, generating logout request information and sending the logout request information to the master server when a disk, a central processing unit or a memory of the slave server has an abnormal condition or a network or a process of the slave server has an abnormal condition, and stopping the execution of the series of tasks;
the master server is also used for detecting whether logout request information sent by the slave server is received or not, and acquiring and storing data generated by the slave server executing the series of tasks when the logout request information is received.
According to the server management method, the server management device and the distributed system provided by the invention, the master server distributes tasks to a slave server based on the registration request information of that slave server, and stores the slave server's data based on its logout request information. The slave servers are thus managed simply and conveniently, and data integrity is preserved so that tasks can be conveniently executed again, which improves the reliability of task execution and addresses the problem of low task execution reliability in the prior art.
Furthermore, configuring the heartbeat mechanism of the slave server based on the type of the distributed task avoids wasting resources while still ensuring the effectiveness of the connection between the master server and the slave server.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
Fig. 1 is a connection block diagram of a distributed system according to an embodiment of the present invention.
Fig. 2 is a block diagram of a master server according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating a server management method according to an embodiment of the present invention.
Fig. 4 is another flowchart illustrating a server management method according to an embodiment of the present invention.
Fig. 5 is a schematic flowchart of step S150 in fig. 4.
Fig. 6 is another flowchart illustrating a server management method according to an embodiment of the present invention.
Fig. 7 is a block diagram of a server management apparatus according to an embodiment of the present invention.
Fig. 8 is another block diagram of a server management apparatus according to an embodiment of the present invention.
Fig. 9 is a block diagram of a data sending module according to an embodiment of the present invention.
Fig. 10 is another block diagram of a server management apparatus according to an embodiment of the present invention.
Reference numerals: 10-distributed system; 20-master server; 22-memory; 24-processor; 30-slave server; 100-server management device; 110-task allocation module; 120-first detection module; 130-data acquisition module; 140-second detection module; 150-data sending module; 151-task judgment submodule; 153-data sending submodule; 160-task attribute determination module; 170-heartbeat mechanism configuration module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. In the description of the present invention, the terms "first", "second", and the like are used only to distinguish the items described, and are not to be construed as indicating or implying relative importance.
As shown in fig. 1, an embodiment of the present invention provides a distributed system 10 including a master server 20 and a plurality of slave servers 30. Each of the slave servers 30 is communicatively connected to the master server 20, and performs a data processing task based on control of the master server 20.
Further, in the present embodiment, the master server 20 is configured to distribute a series of tasks to any one of the slave servers 30 based on the registration request information transmitted from that slave server 30. The slave server 30 is configured to execute the series of tasks and, when an abnormal condition exists in a disk, a central processing unit, or a memory of the slave server 30, or in a network or a process of the slave server 30, to generate logout request information, send it to the master server 20, and stop executing the series of tasks. The master server 20 is further configured to detect whether logout request information sent by the slave server 30 is received, and when the logout request information is received, obtain and store data generated by the slave server 30 executing the series of tasks.
Optionally, the types of the master server 20 and each of the slave servers 30 are not limited, and may be set according to actual application requirements, and for example, the types may include, but are not limited to, devices with processing functions, such as a web server, a data server, a computer, a Mobile Internet Device (MID), and the like. The types of the master server 20 and the slave servers 30 may be the same or different, as long as data interaction can be performed efficiently.
In this embodiment, each slave server 30 is of the same type as the master server 20 and may include the same components. The master server 20 is described below as an example with reference to fig. 2. The master server 20 may include, among other things, a memory 22 and a processor 24.
The memory 22 and the processor 24 are electrically connected, directly or indirectly, to enable data transfer or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 22 may also store a server management apparatus 100. The server management apparatus 100 includes at least one software function module that can be stored in the memory 22 in the form of software or firmware. The processor 24 is configured to execute executable computer programs stored in the memory 22, such as the software function modules and computer programs included in the server management apparatus 100, so as to implement the server management method.
The memory 22 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 22 is used for storing programs, and the processor 24 executes the programs after receiving execution instructions.
The processor 24 may be an integrated circuit chip having signal processing capabilities. The processor 24 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 24 may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor.
It is understood that the structure shown in fig. 2 is only an illustration, and the master server 20 may include more or fewer components than those shown in fig. 2, or have a different configuration from that shown in fig. 2; for example, it may further include a communication unit for data interaction with each of the slave servers 30. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
With reference to fig. 3, an embodiment of the present invention further provides a server management method applicable to the master server 20 of the distributed system 10. The method steps defined by the flow of the method may be implemented by the processor 24. The specific flow shown in fig. 3 will be described in detail below.
Step S110, distributing a series of tasks to any slave server 30 based on the registration request information sent by that slave server 30, so that the slave server 30 executes the series of tasks.
In this embodiment, each slave server 30 may take on tasks actively, that is, each slave server 30 may request tasks based on its own needs. For example, a slave server 30 that is operating normally and not currently performing a task may send registration request information to the master server 20 in order to register with it, so that the master server 20 can acquire information (e.g., status information) about the slave server 30. The registration request information may further include requirement information requesting execution of a task, so that the master server 20 can distribute a series of tasks to the slave server 30 based on the requirement information.
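The registration flow of step S110 can be illustrated with a minimal Python sketch. The message layout (`RegistrationRequest` with `slave_id`, `status` and `wants_tasks`), the `Master` class and its queue of pending series are assumptions made for illustration; the source only states that the request may carry status information and a requirement to execute tasks.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RegistrationRequest:
    # Hypothetical message layout, not specified by the source.
    slave_id: str
    status: str = "normal"
    wants_tasks: bool = True


@dataclass
class Master:
    pending: deque = field(default_factory=deque)   # series of tasks waiting for a slave
    registered: dict = field(default_factory=dict)  # slave_id -> last reported status
    assigned: dict = field(default_factory=dict)    # slave_id -> series of tasks

    def handle_registration(self, req: RegistrationRequest):
        """Register the slave and, if it asks for work, hand it one series of tasks."""
        self.registered[req.slave_id] = req.status
        if not req.wants_tasks or not self.pending:
            return None
        series = self.pending.popleft()
        self.assigned[req.slave_id] = series
        return series


master = Master(pending=deque([["parse", "aggregate"], ["index"]]))
first = master.handle_registration(RegistrationRequest("slave-1"))
```

In this sketch a slave that registers without asking for work is recorded but receives nothing, which mirrors the distinction the text draws between registering and requesting tasks.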
Step S120, detecting whether logout request information sent by the slave server 30 is received.
In this embodiment, the logout request information is generated when an abnormal condition exists on the disk, the central processing unit, or the memory of the slave server 30 or when an abnormal condition exists on the network or the process of the slave server 30. That is, when the above-described abnormal situations occur in the slave server 30, the logout request information may be transmitted to the master server 20 so that the master server 20 can acquire the current situation of the slave server 30. When the abnormal situation occurs, the slave server 30 may enter a suspend state to stop the execution of the series of tasks.
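On the slave side, the behaviour described above might look like the following Python sketch. The `HealthSnapshot` flags, the `Slave` class and the dictionary used as the logout request are hypothetical; how each resource is actually monitored is not described in the source and is left out here.

```python
from dataclasses import dataclass, asdict


@dataclass
class HealthSnapshot:
    # One flag per resource named in the text; measurement is out of scope (assumption).
    disk_ok: bool = True
    cpu_ok: bool = True
    memory_ok: bool = True
    network_ok: bool = True
    process_ok: bool = True


class Slave:
    def __init__(self, slave_id):
        self.slave_id = slave_id
        self.state = "running"

    def check(self, health):
        """Return a logout request if any resource is abnormal, else None."""
        # Strip the "_ok" suffix to get the resource name, e.g. "disk_ok" -> "disk".
        abnormal = [name[:-3] for name, ok in asdict(health).items() if not ok]
        if not abnormal:
            return None
        self.state = "suspended"  # stop executing the series of tasks
        return {"type": "logout", "slave_id": self.slave_id, "abnormal": abnormal}


slave = Slave("slave-1")
healthy = slave.check(HealthSnapshot())
request = slave.check(HealthSnapshot(disk_ok=False, network_ok=False))
```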
Step S130, if the logout request information is received, acquiring and storing the data generated by the slave server 30 executing the series of tasks.
In the present embodiment, upon receiving the logout request information, it can be determined that an abnormal situation exists in the corresponding slave server 30 and that it is difficult for the slave server 30 to continue executing the series of tasks assigned to it. In order to ensure that the series of tasks can be effectively completed, the data generated by the slave server 30 through executing the series of tasks can be acquired and stored.
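A minimal sketch of step S130 on the master side is given below in Python. The source does not say how the master acquires the data, so the `fetch_data` callable is a hypothetical stand-in for that transfer, and the `assigned` and `saved` dictionaries are illustrative bookkeeping.

```python
class Master:
    def __init__(self):
        self.assigned = {}  # slave_id -> series_id currently being executed
        self.saved = {}     # series_id -> data produced before the abnormal condition

    def handle_logout(self, slave_id, fetch_data):
        """On a logout request, store the data the slave produced for its series."""
        series_id = self.assigned.pop(slave_id, None)
        if series_id is None:
            return None  # the slave had no series assigned; nothing to save
        self.saved[series_id] = {"origin": slave_id, "data": fetch_data(slave_id)}
        return series_id


master = Master()
master.assigned["slave-1"] = "series-A"
stored = master.handle_logout("slave-1", lambda slave_id: {"rows_done": 120})
```

Recording the originating slave alongside the data is a design choice of this sketch; it makes it possible later to tell whether a re-registering slave is the one that interrupted the series.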
Alternatively, the processing manner for the saved data and the corresponding series of tasks is not limited, and may be set according to the actual application requirements, for example, the processing for the series of tasks may be suspended to allocate the series of tasks to the slave server 30 after the abnormal condition of the corresponding slave server 30 is eliminated, or the series of tasks may be directly allocated to another slave server 30.
In one example, after performing step S130, the server management method may further include the steps of: assigning the series of tasks to another slave server 30; and transmitting the stored data generated based on the execution of the series of tasks to the other slave server 30 to cause the other slave server 30 to execute the series of tasks. In this way, the series of tasks can be processed quickly, and an interruption in the execution of the series of tasks does not hold up the progress of task processing.
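The reassignment example can be sketched in Python as follows. The function name, the list of idle slaves and the message fields are assumptions; the source does not specify how the other slave server is chosen, so this sketch simply takes the first idle slave that is not the one reporting the abnormal condition.

```python
def reassign_series(series_id, saved, idle_slaves, failed_slave):
    """Pick another idle slave and build the message carrying the stored data."""
    for slave_id in idle_slaves:
        if slave_id != failed_slave:
            return {
                "to": slave_id,
                "series_id": series_id,
                "resume_data": saved[series_id],  # data stored in step S130
            }
    return None  # no other slave is available right now


saved = {"series-A": {"rows_done": 120}}
message = reassign_series("series-A", saved, ["slave-1", "slave-2"], failed_slave="slave-1")
```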
In another example, in conjunction with fig. 4, after performing step S130, the server management method may further include step S140 and step S150.
Step S140, detecting whether registration request information retransmitted by the slave server 30 is received.
In the present embodiment, the slave server 30 may resend the registration request information to the master server 20 after the abnormal condition is eliminated to request the master server 20 to perform task allocation, so as to achieve reasonable utilization of the resources of the slave server 30.
Step S150, upon receiving the retransmitted registration request information, sending the stored data generated by executing the series of tasks to the slave server 30, so that the slave server 30 executes the series of tasks.
Alternatively, when receiving the registration request information retransmitted from the slave server 30, the master server 20 may reallocate the series of tasks that have stopped being executed due to the abnormal condition to the slave server 30 based on the request from the slave server 30, or may actively reallocate the series of tasks that have stopped being executed due to the abnormal condition to the slave server 30. In this embodiment, in conjunction with fig. 5, step S150 may include step S151 and step S153.
Step S151, if the retransmitted registration request information is received, determining, according to the registration request information, whether the slave server 30 requests allocation of an unhook task.
In the present embodiment, the unhook task is the series of tasks that the slave server 30 stopped executing because of the abnormal situation. The registration request information retransmitted by the slave server 30 may request the master server 20 to assign the unhook task, or may request the master server 20 to assign a new series of tasks.
Step S153, when the slave server 30 requests allocation of the unhook task, sending the stored data generated by executing the series of tasks to the slave server 30, so that the slave server 30 executes the series of tasks.
In this embodiment, executing step S153 ensures that the same series of tasks is executed by the same slave server 30. This avoids the situation in which the same series of tasks is executed by different slave servers 30 and part of the tasks is therefore executed repeatedly, and thus avoids the resource waste that such repeated execution would cause.
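Steps S151 and S153 amount to a branch on what the re-registering slave asks for. The Python sketch below illustrates that branch; the request field `wants`, its values, and the `suspended` and `saved` dictionaries are hypothetical names, not taken from the source.

```python
def handle_reregistration(request, suspended, saved):
    """Decide whether to resume the unhook task or assign a new series."""
    slave_id = request["slave_id"]
    series_id = suspended.get(slave_id)  # series this slave stopped executing, if any
    if request.get("wants") == "unhook" and series_id is not None:
        del suspended[slave_id]
        # Send the stored data back so the same slave continues the same series.
        return {"action": "resume", "series_id": series_id,
                "data": saved.pop(series_id, None)}
    return {"action": "assign_new"}


suspended = {"slave-1": "series-A"}
saved = {"series-A": {"rows_done": 120}}
reply = handle_reregistration({"slave_id": "slave-1", "wants": "unhook"}, suspended, saved)
other = handle_reregistration({"slave_id": "slave-2", "wants": "new"}, suspended, saved)
```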
Further, to ensure the validity of the connection between the master server 20 and the slave server 30, in this embodiment, the master server 20 may further configure a heartbeat mechanism for the slave server 30, and with reference to fig. 6, after performing step S110, the server management method may further include step S160 and step S170.
Step S160, determining the task attributes of the series of tasks according to the series of tasks.
Step S170, configuring a corresponding heartbeat mechanism for the slave server 30 executing the series of tasks according to the task attributes.
In this embodiment, configuring the heartbeat mechanism of the slave server 30 according to the attributes of the tasks allocated to it avoids wasting resources while still ensuring the effectiveness of the connection between the master server 20 and the slave server 30. For example, for a more important series of tasks, the slave server 30 performing the series of tasks may be configured with a heartbeat mechanism with a shorter interval. For a general series of tasks, the slave server 30 performing the series of tasks may be configured with a heartbeat mechanism with a longer interval. Specifically, for a more important series of tasks, the master server 20 and the slave server 30 performing the series of tasks may communicate once every 10 ms. For a general series of tasks, they may communicate once every 20 ms.
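The attribute-based heartbeat configuration of steps S160 and S170 can be sketched in Python as a simple lookup. The 10 ms and 20 ms values come from the example in the text; the attribute labels `"important"` and `"general"`, the fallback to the longer interval, and the `missed_allowed` tolerance are assumptions added for illustration.

```python
# Intervals from the example in the text; the attribute labels are hypothetical.
HEARTBEAT_INTERVAL_MS = {"important": 10, "general": 20}


def heartbeat_interval_ms(task_attribute):
    """Map a task attribute to a heartbeat interval, defaulting to the longer one."""
    return HEARTBEAT_INTERVAL_MS.get(task_attribute, 20)


def is_connection_alive(last_beat_ms, now_ms, interval_ms, missed_allowed=3):
    """Treat the link as alive while at most `missed_allowed` intervals have elapsed.

    The tolerance of three intervals is an assumption, not part of the source.
    """
    return now_ms - last_beat_ms <= interval_ms * missed_allowed


interval = heartbeat_interval_ms("important")
alive = is_connection_alive(last_beat_ms=0, now_ms=30, interval_ms=interval)
```

A shorter interval detects a lost connection sooner at the cost of more messages, which is the trade-off the text describes between connection effectiveness and resource use.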
With reference to fig. 7, the embodiment of the present invention further provides a server management apparatus 100 applicable to the master server 20 of the distributed system 10. The server management apparatus 100 may include a task allocation module 110, a first detecting module 120, and a data obtaining module 130.
The task allocation module 110 is configured to allocate a series of tasks to any slave server 30 based on the registration request information sent by that slave server 30, so that the slave server 30 executes the series of tasks. In this embodiment, the task allocation module 110 may be configured to execute step S110 shown in fig. 3, and the foregoing description of step S110 may be referred to for the detailed description of the task allocation module 110.
The first detecting module 120 is configured to detect whether logout request information sent by the slave server 30 is received, where the logout request information is generated when an abnormal condition exists on a disk, a central processing unit, or a memory of the slave server 30 or generated when an abnormal condition exists on a network or a process of the slave server 30. In this embodiment, the first detecting module 120 may be configured to perform step S120 shown in fig. 3, and the detailed description about the first detecting module 120 may refer to the foregoing description about step S120.
The data obtaining module 130 is configured to, when receiving the logout request message, obtain and store data generated by the slave server 30 executing the series of tasks. In this embodiment, the data obtaining module 130 may be configured to perform step S130 shown in fig. 3, and the detailed description about the data obtaining module 130 may refer to the foregoing description about step S130.
Further, in this embodiment, with reference to fig. 8, the server management apparatus 100 may further include a second detecting module 140 and a data sending module 150.
The second detecting module 140 is configured to detect whether registration request information retransmitted by the slave server 30 is received. In this embodiment, the second detecting module 140 may be configured to perform step S140 shown in fig. 4, and the detailed description about the second detecting module 140 may refer to the foregoing description about step S140.
The data sending module 150 is configured to, upon receiving the retransmitted registration request information, send stored data generated based on the execution of the series of tasks to the slave server 30, so that the slave server 30 executes the series of tasks. In this embodiment, the data sending module 150 may be configured to execute step S150 shown in fig. 4, and the foregoing description of step S150 may be referred to for specific description of the data sending module 150.
In this embodiment, referring to fig. 9, the data sending module 150 may include a task determining sub-module 151 and a data sending sub-module 153.
The task determining sub-module 151 is configured to determine, when receiving the retransmitted registration request information, whether the slave server 30 requests allocation of an unhook task according to the registration request information, where the unhook task is the series of tasks that the slave server 30 did not finish executing because of an abnormal condition. In this embodiment, the task determining sub-module 151 may be configured to perform step S151 shown in fig. 5, and the detailed description of the task determining sub-module 151 may refer to the description of step S151.
The data sending sub-module 153 is configured to send the stored data generated based on the execution of the series of tasks to the slave server 30 when the slave server 30 requests allocation of the unhook task, so that the slave server 30 executes the series of tasks. In this embodiment, the data sending sub-module 153 may be configured to execute step S153 shown in fig. 5, and the foregoing description of step S153 may be referred to for the detailed description of the data sending sub-module 153.
Further, in this embodiment, in combination with fig. 10, the server management apparatus 100 may further include a task attribute determining module 160 and a heartbeat mechanism configuring module 170.
The task attribute determining module 160 is configured to determine task attributes of the series of tasks according to the series of tasks. In this embodiment, the task attribute determining module 160 may be configured to execute step S160 shown in fig. 6, and the detailed description about the task attribute determining module 160 may refer to the foregoing description about step S160.
The heartbeat mechanism configuring module 170 is configured to configure a corresponding heartbeat mechanism for the slave server 30 executing the series of tasks according to the task attribute. In this embodiment, the heartbeat mechanism configuration module 170 may be configured to execute step S170 shown in fig. 6, and the detailed description about the heartbeat mechanism configuration module 170 may refer to the foregoing description about step S170.
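As a non-limiting illustration, steps S160 and S170 may be sketched as follows. The attribute names ("short", "long"), the `estimated_seconds` field, the 600-second threshold and the interval values are all assumptions made for this sketch and are not taken from the embodiment, which only requires that the heartbeat mechanism correspond to the task attribute.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HeartbeatConfig:
    interval_seconds: float
    max_missed: int


# Hypothetical attribute-to-heartbeat table; the values are illustrative only.
HEARTBEAT_BY_ATTRIBUTE = {
    "short": HeartbeatConfig(interval_seconds=5.0, max_missed=3),
    "long": HeartbeatConfig(interval_seconds=60.0, max_missed=3),
}


def determine_task_attribute(tasks: List[dict]) -> str:
    """Step S160 (sketch): classify a series of tasks by estimated duration."""
    total = sum(task["estimated_seconds"] for task in tasks)
    return "long" if total >= 600 else "short"


def configure_heartbeat(tasks: List[dict]) -> HeartbeatConfig:
    """Step S170 (sketch): pick the heartbeat mechanism for the attribute."""
    return HEARTBEAT_BY_ATTRIBUTE[determine_task_attribute(tasks)]
```

Under these assumptions, a slave server running a long series of tasks reports less frequently than one running a short series, which is one possible way to limit heartbeat traffic while still keeping the connection checked.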
In summary, in the server management method, the server management apparatus 100 and the distributed system 10 provided by the present invention, the master server 20 allocates tasks to the slave servers 30 or stores their data based on the registration request information or the logout request information sent by the slave servers 30. This makes the slave servers 30 easier to manage and preserves the integrity of the data so that the tasks can be executed again, which improves the reliability of task execution and alleviates the problem of low task-execution reliability in the prior art. In addition, configuring the heartbeat mechanism of the slave server 30 based on the type of the assigned task avoids wasting resources while keeping the connection between the master server 20 and the slave server 30 effective.
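As a non-limiting illustration of the logout path summarized above, the following sketch shows a slave server generating logout request information when one of the monitored resources is abnormal, and the master server storing the data produced so far. The boolean health flags, the class names and the in-memory dictionary are assumptions made for this sketch only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Resources named in the method; representing their state as booleans is an
# assumption of this sketch.
MONITORED = ("disk", "cpu", "memory", "network", "process")


@dataclass
class LogoutRequest:
    slave_id: str
    abnormal: List[str]


def build_logout_request(slave_id: str,
                         health: Dict[str, bool]) -> Optional[LogoutRequest]:
    """Return a logout request if any monitored resource is unhealthy."""
    # Resources missing from the report are treated as healthy.
    abnormal = [name for name in MONITORED if not health.get(name, True)]
    return LogoutRequest(slave_id, abnormal) if abnormal else None


@dataclass
class MasterStore:
    stored_data: Dict[str, dict] = field(default_factory=dict)

    def on_logout(self, request: LogoutRequest, task_data: dict) -> None:
        """Store a copy of the slave's task data so the tasks can be resumed."""
        self.stored_data[request.slave_id] = dict(task_data)
```

The stored copy is what the master server would later send back when the same slave server re-registers, or forward to another slave server to which the series of tasks is reassigned.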
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (6)
1. A server management method applied to a master server of a distributed system, wherein the distributed system further has a plurality of slave servers communicatively connected to the master server, the server management method comprising:
distributing a series of tasks to any slave server based on the registration request information sent by the slave server so as to enable the slave server to execute the series of tasks;
detecting whether logout request information sent by the slave server is received, wherein the logout request information is generated when an abnormal condition exists on a disk, a central processing unit or a memory of the slave server or generated when an abnormal condition exists on a network or a process of the slave server;
if the logout request information is received, acquiring and storing data generated by the slave server executing the series of tasks;
detecting whether the registration request information retransmitted by the slave server is received;
if the re-sent registration request information is received, sending the stored data generated based on the execution of the series of tasks to the slave server so that the slave server executes the series of tasks;
the step of transmitting the stored data generated based on the execution of the series of tasks to the slave server if the registration request information retransmitted is received, so that the slave server executes the series of tasks includes:
if the retransmitted registration request information is received, judging whether the slave server requests to distribute an unhook task according to the registration request information, wherein the unhook task is a series of tasks which are not executed and completed by the slave server due to the abnormal condition;
and if the slave server requests to distribute the unhook task, sending the stored data generated based on executing the series of tasks to the slave server so as to enable the slave server to execute the series of tasks.
2. The server management method according to claim 1, wherein after the step of assigning a series of tasks to the slave server based on the registration request information transmitted from any one of the slave servers to cause the slave server to execute the series of tasks is performed, the method further comprises:
determining task attributes of the series of tasks according to the series of tasks;
and configuring a corresponding heartbeat mechanism for the slave server executing the series of tasks according to the task attributes.
3. The server management method according to claim 1 or 2, wherein after performing the step of acquiring and storing data generated by the slave server executing the series of tasks if the logout request information is received, the method further comprises:
assigning the series of tasks to other slave servers;
and sending the stored data generated based on the execution of the series of tasks to the other slave servers so as to enable the other slave servers to execute the series of tasks.
4. A server management apparatus applied to a master server of a distributed system, wherein the distributed system further has a plurality of slave servers communicatively connected to the master server, the server management apparatus comprising:
a task allocation module for allocating a series of tasks to any one of the slave servers based on the registration request information transmitted from the slave server so that the slave server executes the series of tasks;
a first detection module for detecting whether logout request information sent by the slave server is received, wherein the logout request information is generated when an abnormal condition exists on a disk, a central processing unit or a memory of the slave server or generated when an abnormal condition exists on a network or a process of the slave server;
the data acquisition module is used for acquiring and storing data generated by the slave server executing the series of tasks when the logout request information is received;
the second detection module is used for detecting whether the registration request information retransmitted by the slave server is received or not;
a data transmission module for transmitting the stored data generated based on the execution of the series of tasks to the slave server to enable the slave server to execute the series of tasks when the retransmitted registration request information is received;
the data transmission module comprises:
the task judgment submodule is used for judging whether the slave server requests to distribute the unhook task according to the registration request information when the retransmitted registration request information is received, wherein the unhook task is a series of tasks which are not executed and completed by the slave server due to the abnormal condition;
and the data sending submodule is used for sending the stored data generated based on the execution of the series of tasks to the slave server when the slave server requests to distribute the unhooked tasks so as to enable the slave server to execute the series of tasks.
5. The server management apparatus according to claim 4, further comprising:
the task attribute determining module is used for determining the task attributes of the series of tasks according to the series of tasks;
and the heartbeat mechanism configuration module is used for configuring a corresponding heartbeat mechanism for the slave server executing the series of tasks according to the task attributes.
6. A distributed system comprising a master server and a plurality of slave servers communicatively coupled to said master server;
the master server is used for distributing a series of tasks to any slave server based on the registration request information sent by the slave server;
the slave server is used for executing the series of tasks, generating logout request information and sending the logout request information to the master server when a disk, a central processing unit or a memory of the slave server has an abnormal condition or a network or a process of the slave server has an abnormal condition, and stopping the execution of the series of tasks;
the master server is also used for detecting whether logout request information sent by the slave server is received or not, and acquiring and storing data generated by the slave server executing the series of tasks when the logout request information is received;
the master server is also used for detecting whether registration request information retransmitted by the slave server is received;
if the master server receives the retransmitted registration request information, sending the stored data generated based on the execution of the series of tasks to the slave server so that the slave server executes the series of tasks;
the slave server is also used for requesting the distribution of an unhook task to the master server, wherein the unhook task is a series of tasks which are stopped to be executed by the slave server due to the occurrence of an abnormal condition;
if the master server receives the retransmitted registration request information, the step of transmitting the stored data generated based on the execution of the series of tasks to the slave server so that the slave server executes the series of tasks includes:
if the master server receives the retransmitted registration request information, judging whether the slave server requests to distribute the unhook task according to the registration request information;
and if the slave server requests to distribute the unhook tasks, the master server sends the stored data generated based on the execution of the series of tasks to the slave server so that the slave server executes the series of tasks.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810374590.2A CN108600008B (en) | 2018-04-24 | 2018-04-24 | Server management method, server management device and distributed system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108600008A CN108600008A (en) | 2018-09-28 |
CN108600008B true CN108600008B (en) | 2021-12-17 |
Family
ID=63614529
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810374590.2A Active CN108600008B (en) | 2018-04-24 | 2018-04-24 | Server management method, server management device and distributed system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108600008B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111858744B (en) * | 2019-04-29 | 2024-07-09 | 北京嘀嘀无限科技发展有限公司 | Database data synchronization method, server and system |
CN113032188B (en) * | 2019-12-24 | 2023-11-03 | 腾讯科技(深圳)有限公司 | Method, device, server and storage medium for determining main server |
CN113254199A (en) * | 2021-05-07 | 2021-08-13 | 埃森智能科技(深圳)有限公司 | Method, system and equipment for simultaneously processing multiple tasks |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2010039526A (en) * | 2008-07-31 | 2010-02-18 | Toshiba Corp | Computer program and master computer |
CN102279730A (en) * | 2010-06-10 | 2011-12-14 | 阿里巴巴集团控股有限公司 | Parallel data processing method, device and system |
CN102385536A (en) * | 2010-08-27 | 2012-03-21 | 中兴通讯股份有限公司 | Method and system for realization of parallel computing |
CN103092712A (en) * | 2011-11-04 | 2013-05-08 | 阿里巴巴集团控股有限公司 | Method and device for recovering interrupt tasks |
CN104050029A (en) * | 2014-05-30 | 2014-09-17 | 北京先进数通信息技术股份公司 | Task scheduling system |
CN104580338A (en) * | 2013-10-29 | 2015-04-29 | 华为技术有限公司 | Service processing method, system and equipment |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104503845B (en) * | 2015-01-14 | 2017-07-14 | 北京邮电大学 | A kind of task distribution method and system |
CN107341051A (en) * | 2016-05-03 | 2017-11-10 | 北京京东尚科信息技术有限公司 | Cluster task coordination approach, system and device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20210319. Address after: 4 / F, block B, building 5, No. 200, Tianfu 5th Street, high tech Zone, Chengdu, Sichuan 610000. Applicant after: Zhiyun Technology Co.,Ltd. Address before: No. 1501, 15th floor, building 12, 219 Tianhua 2nd Road, hi tech Zone, Chengdu, Sichuan 610000. Applicant before: CHENGDU ZHIYUN SCIENCE & TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | |