CN103905250B - A method for optimally managing cluster state - Google Patents

Info

Publication number
CN103905250B
Authority
CN
China
Prior art keywords
cluster
status
state
resource
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410106962.5A
Other languages
Chinese (zh)
Other versions
CN103905250A (en
Inventor
孟宪伟
周博
王倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201410106962.5A
Publication of CN103905250A
Application granted
Publication of CN103905250B
Active (current)
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a method for optimizing the management of cluster state. The method applies to state management in large-scale high-availability clusters, covers cluster status, group status, and resource status, and targets environments with demanding bandwidth and response-time requirements. Only the resource status update logic is retained: the separate group status update and cluster status update functions are removed, and the setting of group status and cluster status is folded entirely into the resource status update logic. The method also removes the requirement for a cluster IP, which simplifies the cluster's processing logic.

Description

A Method for Optimally Managing Cluster State

Technical field

The invention relates to the field of computing, in particular to high-availability cluster management, and specifically to a method for optimizing the management of cluster state.

Background

In high-availability cluster management, state management is critical because it is both the trigger for and the final outcome of every cluster activity. Whether a cluster can remain highly available depends largely on the correctness and timeliness of its state management. In normal operation, starting or stopping the cluster, a group, or a resource triggers many resource, group, and cluster status updates. Anyone who works with high-availability clusters regularly will have noticed that these status updates can consume most of the available bandwidth and can even delay normal cluster activity. When a fault occurs, status updates likewise slow down the cluster's handling of the fault.

Reducing the number of cluster status updates is therefore especially important for high-availability cluster management. In addition, existing cluster managers usually set up a dedicated cluster IP to mark the master node. This gets in the way of some aspects of state management and also inconveniences users: most deployments run on an internal network, where IP addresses are a scarce resource, so removing the cluster IP requirement saves an address.

Summary of the invention

The invention uses an optimized cluster resource management method that improves cluster management efficiency, reduces bandwidth consumption, and clarifies the state management logic. The method comprises the following aspects:

(1) Cluster state structure

The cluster state structure is the same as in existing high-availability cluster state management: there is one cluster status value and two status lists, a group status list and a resource status list.
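As an illustration of the structure just described, the following is a minimal Python sketch. The patent specifies only one cluster status value and two status lists; the `Status` values, the `ClusterState` class name, and the `resource_group` mapping are assumptions introduced here for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Status(Enum):
    # Hypothetical status values; the patent does not enumerate them.
    STOPPED = "stopped"
    RUNNING = "running"
    FAILED = "failed"


@dataclass
class ClusterState:
    """One cluster status value plus a group status list and a resource status list."""

    cluster_status: Status = Status.STOPPED
    group_status: Dict[str, Status] = field(default_factory=dict)     # group name -> status
    resource_status: Dict[str, Status] = field(default_factory=dict)  # resource name -> status
    # Assumed helper (not in the patent): which group each resource belongs to.
    resource_group: Dict[str, str] = field(default_factory=dict)

    def add_resource(self, resource: str, group: str) -> None:
        """Register a resource in a group; both start out as STOPPED."""
        self.resource_group[resource] = group
        self.resource_status[resource] = Status.STOPPED
        self.group_status.setdefault(group, Status.STOPPED)
```

Keeping the resource-to-group mapping next to the two lists is a design choice made for this sketch, so that group and cluster status can later be derived from resource status alone.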

(2) Status update logic

To simplify status updates, only the resource status update logic is retained. The group status update and cluster status update functions are removed, and the setting of group status and cluster status is contained entirely within the resource status update logic. Although this adds logic to the processing of each individual resource status update, it reduces the total number of status update commands, so status updating consumes fewer resources overall.

a) Single resource status update

When a resource is started or stopped, or a single resource reports a fault, the node concerned immediately sends a status update command for that resource. On receiving it, the master node updates its local resource status list and then updates the status of the resource's group in the group status list and the cluster status.

b) Group status update

After a group start or stop operation, the executing node reports the outcome for each resource to the master node, and the master node updates the resource status list and the cluster status in the order of the returned results.
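The two cases above can be sketched with a single update entry point. This is a minimal illustration, not the patented implementation: the roll-up rule in `aggregate` (failed if any member failed, running only if all members are running, otherwise stopped) is an assumption, since the patent does not say how group and cluster status are computed, and the class and method names are hypothetical.

```python
from enum import Enum


class Status(Enum):
    STOPPED = "stopped"
    RUNNING = "running"
    FAILED = "failed"


def aggregate(statuses):
    """Assumed roll-up rule (not taken from the patent)."""
    statuses = list(statuses)
    if any(s is Status.FAILED for s in statuses):
        return Status.FAILED
    if statuses and all(s is Status.RUNNING for s in statuses):
        return Status.RUNNING
    return Status.STOPPED


class MasterState:
    def __init__(self, resource_group):
        # resource_group maps resource name -> group name (hypothetical layout).
        self.resource_group = dict(resource_group)
        self.resource_status = {r: Status.STOPPED for r in self.resource_group}
        self.group_status = {g: Status.STOPPED for g in set(self.resource_group.values())}
        self.cluster_status = Status.STOPPED

    def update_resource(self, resource, status):
        """The only update entry point: group and cluster status are set inside it."""
        self.resource_status[resource] = status
        group = self.resource_group[resource]
        members = [r for r, g in self.resource_group.items() if g == group]
        self.group_status[group] = aggregate(self.resource_status[r] for r in members)
        self.cluster_status = aggregate(self.group_status.values())

    def apply_group_result(self, results):
        """Case b): apply the per-resource results of a group start/stop in order."""
        for resource, status in results:
            self.update_resource(resource, status)
```

There is deliberately no `update_group` or `update_cluster` method: a group operation is handled as a sequence of resource updates, which is the simplification the text describes.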

(3) Status synchronization

To keep the cluster highly available, all nodes must share the status of every resource and group. After the master node has processed a resource status update, it therefore synchronizes the update to all other nodes in the cluster. Only resource status is synchronized: each node receives the resource status update, updates its local resource status in the same way as the master node, and updates the group status and cluster status in its internal logic.
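A small in-process simulation can make the synchronization step concrete. It is a sketch under assumptions: the message layout, the `Node` and `Master` classes, and the `roll_up` rule are all hypothetical, and real nodes would exchange messages over the network rather than through direct method calls.

```python
def roll_up(statuses):
    """Assumed aggregation rule (not from the patent)."""
    statuses = list(statuses)
    if "failed" in statuses:
        return "failed"
    if statuses and all(s == "running" for s in statuses):
        return "running"
    return "stopped"


class Node:
    """Every node runs the same resource-driven update logic."""

    def __init__(self, name, resource_group):
        self.name = name
        self.resource_group = dict(resource_group)  # resource -> group
        self.resource_status = {r: "stopped" for r in self.resource_group}
        self.group_status = {g: "stopped" for g in set(self.resource_group.values())}
        self.cluster_status = "stopped"

    def apply(self, resource, status):
        # Update the resource, then derive group and cluster status locally.
        self.resource_status[resource] = status
        group = self.resource_group[resource]
        members = [self.resource_status[r]
                   for r, g in self.resource_group.items() if g == group]
        self.group_status[group] = roll_up(members)
        self.cluster_status = roll_up(self.group_status.values())


class Master(Node):
    def __init__(self, name, resource_group, peers):
        super().__init__(name, resource_group)
        self.peers = peers
        self.sent = []  # log of synchronization messages, kept for illustration

    def handle_update(self, resource, status):
        # Process locally first, then synchronize resource status only.
        self.apply(resource, status)
        for peer in self.peers:
            message = {"type": "resource_status", "resource": resource, "status": status}
            self.sent.append((peer.name, message))
            peer.apply(message["resource"], message["status"])
```

Because each peer derives group and cluster status itself, no group or cluster status messages appear in `sent`; the only traffic is one resource status message per peer per update.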

(4) Status acquisition

External clients obtain cluster state through the console, which connects to the master node. The master node answers each request directly from its local cluster status lists, according to what the request asks for.
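A query handler on the master node might look like the following sketch. The `scope` parameter and the dictionary layout are hypothetical; the text only states that the master node replies from its local status lists according to the request.

```python
def get_status(local_state, scope, name=None):
    """Answer a console query from the master's local state, with no extra messaging."""
    if scope == "cluster":
        return local_state["cluster"]
    if scope == "group":
        return local_state["groups"][name]
    if scope == "resource":
        return local_state["resources"][name]
    raise ValueError(f"unknown scope: {scope!r}")


# Example local state as the master node might hold it (illustrative values).
local_state = {
    "cluster": "running",
    "groups": {"web": "running"},
    "resources": {"ip": "running", "httpd": "running"},
}
print(get_status(local_state, "cluster"))
print(get_status(local_state, "group", "web"))
```

The point of the sketch is that a read never triggers a status update or a round trip to other nodes.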

(5) Removing the cluster IP

The cluster IP exists to mark the master node. However, the cluster IP remains present even when all cluster resources are stopped, which conflicts with this method's derivation of cluster status from resource status, and it usually occupies one more scarce internal-network IP address. Using the optimized state management method described here therefore requires removing the cluster IP setting. Only the cluster IP is removed; the master node role remains. The nodes know which node is the master, and once the master node has been decided the console must be notified so that external clients can connect to the master node to obtain cluster information.
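The following sketch shows one way the console could learn the master node's address without a cluster IP. The patent does not describe how the master is decided, so `elect_master` uses a placeholder rule (lowest node name), and `Console`, `notify_master`, and the addresses are hypothetical.

```python
class Console:
    """Holds the announced master address in place of a fixed cluster IP."""

    def __init__(self):
        self.master_address = None

    def notify_master(self, address):
        self.master_address = address

    def connect_target(self):
        if self.master_address is None:
            raise RuntimeError("no master has been announced yet")
        return self.master_address


def elect_master(alive_nodes):
    """Placeholder election rule (assumption): the lowest node name wins."""
    return min(alive_nodes)


def on_membership_change(alive_nodes, console):
    """Decide the master among alive nodes, then tell the console where it is."""
    master = elect_master(alive_nodes)
    console.notify_master(alive_nodes[master])
    return master
```

After a failover, calling `on_membership_change` again with the surviving nodes re-points the console at the new master node, which is the role the cluster IP previously played.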

Compared with the prior art, the invention has the following beneficial effects:

It provides a method for optimizing the management of cluster state that saves bandwidth and preserves the cluster's communication efficiency, simplifies the processing logic and so lowers the probability of program errors, and removes the cluster IP setting, saving IP resources. The optimized state management logic and the cluster-IP-free management approach give high-availability cluster management software a simpler path to cluster management, with higher management efficiency and lower bandwidth usage.

Description of drawings

Figure 1 is a flowchart of status updating and acquisition according to the invention.

Detailed description

The embodiment implements the method through the same five aspects set out above: (1) cluster state structure, (2) status update logic, (3) status synchronization, (4) status acquisition, and (5) removing the cluster IP.

Claims (1)

1. A method for optimizing the management of cluster state, characterized in that the method mainly involves two parts: first, all cluster actions trigger only resource status changes; second, the cluster IP setting on the master node is removed. The method mainly consists of the following:

1) Three kinds of status still exist, but only resource status triggers are retained; group status and cluster status processing both take place inside the resource status processing logic.

2) All nodes in the cluster must share the status of every resource and group. After the master node has processed a resource status update, it therefore synchronizes the update to all other nodes in the cluster. Only resource status is synchronized: each node receives the resource status update, updates its local resource status in the same way, and updates the group status and cluster status in its internal logic.

3) External clients obtain cluster state through the console, which connects to the master node; the master node answers each request directly from its local cluster status lists, according to what the request asks for.

4) The cluster IP is removed and only the master node marker is retained; the console must be separately notified of this marker.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410106962.5A CN103905250B (en) 2014-03-21 2014-03-21 A kind of method of optimum management cluster state

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410106962.5A CN103905250B (en) 2014-03-21 2014-03-21 A kind of method of optimum management cluster state

Publications (2)

Publication Number Publication Date
CN103905250A CN103905250A (en) 2014-07-02
CN103905250B true CN103905250B (en) 2018-02-23

Family

ID=50996407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410106962.5A Active CN103905250B (en) 2014-03-21 2014-03-21 A kind of method of optimum management cluster state

Country Status (1)

Country Link
CN (1) CN103905250B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101115242A (en) * 2007-08-30 2008-01-30 中兴通讯股份有限公司 Method and device for preventing group updating message congestion of digital cluster system
CN102571960A (en) * 2012-01-12 2012-07-11 浪潮(北京)电子信息产业有限公司 Method and device for monitoring high-availability cluster state
CN102855157A (en) * 2012-07-19 2013-01-02 浪潮电子信息产业股份有限公司 Method for comprehensively scheduling load of servers
CN103279386A (en) * 2013-06-09 2013-09-04 浪潮电子信息产业股份有限公司 Method for achieving high availability of computer operation scheduling system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6944785B2 (en) * 2001-07-23 2005-09-13 Network Appliance, Inc. High-availability cluster virtual server system
US7228351B2 (en) * 2002-12-31 2007-06-05 International Business Machines Corporation Method and apparatus for managing resource contention in a multisystem cluster

Also Published As

Publication number Publication date
CN103905250A (en) 2014-07-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180813

Address after: 250101 S06 tower, 1036, Chao Lu Road, hi tech Zone, Ji'nan, Shandong.

Patentee after: Shandong Langchao Yuntou Information Technology Co., Ltd.

Address before: 250014 1036 Shun Ya Road, hi tech Zone, Ji'nan, Shandong.

Patentee before: Langchao Electronic Information Industry Co., Ltd.

CP03 Change of name, title or address

Address after: 250100 No. 1036 Tidal Road, Jinan High-tech Zone, Shandong Province, S01 Building, Tidal Science Park

Patentee after: Inspur cloud Information Technology Co., Ltd

Address before: 250101 S06 tower, 1036, Chao Lu Road, hi tech Zone, Ji'nan, Shandong.

Patentee before: SHANDONG LANGCHAO YUNTOU INFORMATION TECHNOLOGY Co.,Ltd.