CN102722146B - Distributed system control structure with failure protection function, and failure protection method - Google Patents
- Publication number
- CN102722146B (application CN 201210162638)
- Authority
- CN
- China
- Prior art keywords
- node
- distributed system
- communication
- layer
- failure
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Alarm Systems (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention provides a distributed system control structure with failure protection, and a failure protection method. Failure protection is added to the existing connections of the distributed system: starting from the second layer, adjacent nodes in the same layer are connected to each other. During communication or management between an upper and a lower layer, the upper-layer node broadcasts each control command to all lower-layer nodes connected to it and uses the messages they return to detect whether communication or management has failed. If a failure is found, an adjacent lower-layer node takes over control in order to restore the failed communication or management. The invention is particularly suitable for applications that demand high safety and reliability, such as fire alarm systems and mine safety systems.
Description
Technical field
The invention relates to the field of distributed system control, and also to a failure protection method for distributed system control.
Background art
Distributed system control is very widely used, so its reliability and safety are particularly important. At present, the reliability of distributed system control derives mainly from the fault tolerance inherent in its structure: if global communication or management is interrupted, the local stations can keep working, but the interrupted communication or management cannot be restored. An effective protection method is therefore needed so that the communication or control of a distributed control system is protected after a failure, that is, so that communication and control remain effective.
The existing distributed system control structure is shown in Figure 1. In this structure the reliability of communication or management depends on the structure of the distributed control system itself, and once communication or management is interrupted it cannot be restored. A method is therefore needed to solve these problems.
Summary of the invention
One object of the invention is to provide a distributed system control structure with failure protection that offers high reliability and safety. Another object is to provide a failure protection method for distributed system control.
These objects are achieved as follows:
The distributed system control structure with failure protection is as follows: adjacent nodes in the second layer of the distributed system are connected in sequence; among the nodes of the third layer, those controlled by the same node of the layer above are connected in sequence; the other layers of the distributed system are connected in the same way as the third layer.
The failure protection method for distributed system control comprises:
Adjacent nodes in the second layer of the distributed system are connected in sequence; among the nodes of the third layer, those controlled by the same node of the layer above are connected in sequence; the other layers of the distributed system are connected in the same way as the third layer;
When an upper-layer node controls one particular lower-layer node, it sends the message simultaneously to all of the lower-layer nodes it controls; the addressed node carries out the control task and then returns a confirmation message, and every node that is not addressed must also return a confirmation message;
When a lower-layer node is found not to have returned a confirmation message, communication or management is deemed to have failed and that node is judged to be a failed node. A node adjacent to the failed node then takes control of it and forwards the communication or management signal sent by the upper-layer node to the failed node, restoring the failed communication or management. Once the upper-layer node again receives a confirmation message from the failed node, the system switches back to the original communication connection.
The invention adds failure protection to the existing connections of the distributed system: starting from the second layer, adjacent nodes in the same layer are connected to each other. During communication or management between an upper and a lower layer, the upper-layer node broadcasts each control command to all lower-layer nodes connected to it and uses the messages they return to detect whether communication or management has failed. If a failure is found, an adjacent lower-layer node takes over control in order to restore the failed communication or management.
By protecting the nodes of the distributed system against failure, the invention increases the reliability and safety of distributed system control. When distributed system control fails, communication or control can be restored quickly, which makes the invention particularly suitable for applications that demand high safety and reliability, such as fire alarm systems and mine safety systems.
Description of the drawings
Figure 1 is a schematic diagram of the structure of an existing distributed system.
Figure 2 is a schematic diagram of the distributed system failure protection of the invention.
Detailed description
The invention is further described below with reference to the drawings:
With reference to Figure 2, the distributed system control structure with failure protection has three layers: the first layer has one master controller, the second layer has n slaves, and the third layer has m control units under each slave. In the second layer, slave 1 through slave n are connected in sequence. In the third layer, the m control units under each slave are connected in sequence; control units under different slaves are not connected.
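The three-layer structure described above can be sketched as two sets of links: the original control links of the tree and the added same-layer protection links. This is only an illustrative sketch; the function name `build_topology` and the node labels (`master`, `slave1`, `unit1.1`, and so on) are assumptions and do not come from the patent.

```python
def build_topology(n, m):
    """Return (tree_links, protection_links) for 1 master, n slaves, m units per slave.

    tree_links are the original upper-to-lower control links.
    protection_links are the added links between adjacent nodes of the same layer.
    All names here are hypothetical labels chosen for illustration.
    """
    master = "master"
    slaves = [f"slave{j}" for j in range(1, n + 1)]

    # Original links: the master controls every slave.
    tree_links = [(master, s) for s in slaves]
    # Layer 2 protection: slave 1 to slave n connected in sequence.
    protection_links = list(zip(slaves, slaves[1:]))

    for j, slave in enumerate(slaves, start=1):
        units = [f"unit{j}.{k}" for k in range(1, m + 1)]
        # Original links: each slave controls its own m control units.
        tree_links += [(slave, u) for u in units]
        # Layer 3 protection: only units under the same slave are chained.
        protection_links += list(zip(units, units[1:]))

    return tree_links, protection_links


if __name__ == "__main__":
    tree, protection = build_topology(n=3, m=2)
    print(len(tree), "control links,", len(protection), "protection links")
```

Note that the last unit of one slave and the first unit of the next slave are never paired, which matches the statement that control units under different slaves are not connected.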
Detecting whether communication or management has failed. When an upper-layer node sends a control command to one lower-layer node, it broadcasts the command to all lower-layer nodes it controls. On receiving the command, a lower-layer node executes it and returns a confirmation message if the command is addressed to it; if the command is not addressed to it, it still returns a confirmation message. The upper-layer node uses the returned confirmations to judge whether communication is working: if no confirmation is received from a node, communication with that node is considered to have failed; otherwise it is considered valid.
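The detection step can be illustrated with a small simulation in which a missing confirmation marks a node as failed. This is a sketch under stated assumptions: the function `broadcast_command`, the `unreachable` set used to simulate lost links, and the confirmation values `"executed"` and `"received"` are all hypothetical, and the patent does not specify timeouts or message formats.

```python
def broadcast_command(target, children, unreachable):
    """Simulate an upper-layer node broadcasting a command to all its children.

    target      -- the child the command is actually addressed to
    children    -- all lower-layer nodes controlled by the upper-layer node
    unreachable -- children whose link is down (simulation input, an assumption)

    Returns (confirmations, failed): the confirmation received from each
    reachable child, and the list of children that returned nothing.
    """
    confirmations = {}
    failed = []
    for child in children:
        if child in unreachable:
            # No confirmation comes back, so the upper node judges this node failed.
            failed.append(child)
            continue
        # The addressed node executes the command; every other node only
        # acknowledges receipt. Both return a confirmation message.
        confirmations[child] = "executed" if child == target else "received"
    return confirmations, failed


if __name__ == "__main__":
    acks, failed = broadcast_command(
        target="slave1",
        children=["slave1", "slave2", "slave3"],
        unreachable={"slave2"},
    )
    print("confirmations:", acks)
    print("failed nodes:", failed)
```

Because every node answers every broadcast, the upper-layer node learns about a broken link to a node even when the command was addressed to a different node.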
Restoring failed communication or management. If the upper-layer node finds that a confirmation message has not been returned, it considers communication to have failed and enables failure protection. Because nodes in the same layer are connected in sequence, a protection channel must be selected after a failure. The selection rule is: if node i has failed, determine whether node i is the last node of the current layer. If it is the last node, node i-1 sends the control information to node i to recover from the failure; if it is not the last node, node i+1 sends the control information to node i. When the upper-layer node finds that the failed node has returned to normal, the original channel is re-enabled and the adjacent lower-layer node stops protecting the previously failed node.
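The channel selection rule can be written as a short function. This is a minimal sketch, assuming 1-based node indices within one chain of connected nodes; the name `select_protector` and the handling of a chain with a single node (returning `None`) are assumptions, since the patent does not cover that case or the case where the chosen neighbour has itself failed.

```python
def select_protector(i, count):
    """Return the 1-based index of the neighbour that relays to failed node i.

    i     -- index of the failed node within its chain (1..count)
    count -- number of nodes in that chain

    Rule from the text: the last node is protected by node i-1,
    every other node is protected by node i+1.
    """
    if not 1 <= i <= count:
        raise ValueError("node index out of range")
    if count < 2:
        # Assumption: a single node has no neighbour, so no protection channel exists.
        return None
    return i - 1 if i == count else i + 1


if __name__ == "__main__":
    for failed_node in (1, 2, 3):
        print("node", failed_node, "is protected by node", select_protector(failed_node, 3))
```

Once the upper-layer node again receives confirmations from node i directly, the relay through the selected neighbour is dropped and the original channel is used.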
Claims (1)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 201210162638 CN102722146B (en) | 2012-05-24 | 2012-05-24 | Distributed system control structure with failure protection function, and failure protection method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 201210162638 CN102722146B (en) | 2012-05-24 | 2012-05-24 | Distributed system control structure with failure protection function, and failure protection method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN102722146A (en) | 2012-10-10 |
| CN102722146B (en) | 2013-12-18 |
Family
ID=46947947
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 201210162638 CN102722146B (en), Expired - Fee Related | 2012-05-24 | 2012-05-24 | Distributed system control structure with failure protection function, and failure protection method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN102722146B (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1581813A (en) * | 2003-08-01 | 2005-02-16 | 光桥科技(中国)有限公司 | Method for conducting data transmission using logic loop network in ethernet |
| CN1741489A (en) * | 2005-09-01 | 2006-03-01 | 西安交通大学 | High usable self-healing Logic box fault detecting and tolerating method for constituting multi-machine system |
| CN1889496A (en) * | 2006-07-19 | 2007-01-03 | 山东富臣发展有限公司 | Layer control tree-shape network based on CAN bus for supporting plug and use |
| WO2008058933A1 (en) * | 2006-11-13 | 2008-05-22 | Siemens Aktiengesellschaft | Method for establishing bidirectional data transmission paths in a wireless meshed communication network |
| CN101378327A (en) * | 2007-08-29 | 2009-03-04 | 中国移动通信集团公司 | Communication network system and method for processing communication network business |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| SE524863C2 (en) * | 2001-04-23 | 2004-10-12 | Transmode Systems Ab | Optical coarse wavelength division multiplexing system has multiple logical optical rings that form multiplexed ring structure, such that each ring links several nodes of ring structure |
Also Published As
| Publication number | Publication date |
|---|---|
| CN102722146A (en) | 2012-10-10 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| UA105043C2 (en) | Communication network and method for safety-related communication in tunnel and mining structures | |
| CN103856357B (en) | A kind of stacking system fault handling method and stacking system | |
| CN204836244U (en) | Use two data center of living systems | |
| WO2009101531A8 (en) | System and method for network recovery from multiple link failures | |
| CN102708028B (en) | Trusted redundant fault-tolerant computer system | |
| JP5719744B2 (en) | Multi-system controller | |
| CN106233260A (en) | Redundant system controls device and system switching method thereof | |
| CN107659948B (en) | Method and device for controlling access of AP (access point) | |
| WO2015117389A1 (en) | Backup protection method and device for carrier grade nat (cgn) | |
| CN104317679A (en) | Communication fault-tolerant method based on thread redundancy for SCADA (Supervisory Control and Data Acquisition) system | |
| CN102722146B (en) | Distributed system control structure with failure protection function, and failure protection method | |
| CN117032113A (en) | DCS controller and trusted working method and system of main and standby controllers thereof | |
| CN104714439A (en) | Safety relay box system | |
| US20150169427A1 (en) | Fault-Tolerant Failsafe Computer System Using COTS Components | |
| CN105022666A (en) | Method, device and system for controlling MapReduce task scheduling | |
| WO2016010521A1 (en) | Partial redundancy for i/o modules or channels in distributed control systems | |
| CN106027313A (en) | Disaster tolerance system and method of network link based on VPN (Virtual Private Network) | |
| CN102819252B (en) | Method for realizing multi-redundancy of process control station in distributed control system | |
| JP2010231257A (en) | High availability system and method for handling failure of high availability system | |
| JP2017228159A (en) | Control device and control method of control device | |
| JP2015002546A (en) | Remote i/o unit of duplex supervisory control system, and maintenance method thereof | |
| CN117193195A (en) | Trusted controller redundancy method, system, equipment and media for DCS system | |
| CN104683153A (en) | Cluster-based router host and spare MPU control method and system thereof | |
| CN103051407A (en) | Clock protection method and system and related ordinary clock (OC) equipment | |
| CN209821633U (en) | CCR-FARs Structure of Oilfield Control System |
Legal Events
| Code | Title | Description |
|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| C14 | Grant of patent or utility model | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2013-12-18; termination date: 2019-05-24 |
