CN115941441A - System link automatic monitoring operation and maintenance method, system, equipment and medium - Google Patents

System link automatic monitoring operation and maintenance method, system, equipment and medium

Info

Publication number
CN115941441A
CN115941441A CN202211419040.0A CN202211419040A CN115941441A CN 115941441 A CN115941441 A CN 115941441A CN 202211419040 A CN202211419040 A CN 202211419040A CN 115941441 A CN115941441 A CN 115941441A
Authority
CN
China
Prior art keywords
maintenance
link
work order
monitoring
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211419040.0A
Other languages
Chinese (zh)
Inventor
董琳
唐云
李志刚
白李东方
陈磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongneng Integrated Smart Energy Technology Co Ltd
Original Assignee
Zhongneng Integrated Smart Energy Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongneng Integrated Smart Energy Technology Co Ltd
Priority to CN202211419040.0A
Publication of CN115941441A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a method, system, device, and storage medium for automated monitoring and operation and maintenance (O&M) of system links. The method comprises: monitoring the various devices on a link with a link monitoring device to obtain monitoring data; evaluating the monitoring data against the early-warning rules preset for each type of link device, and forming an alarm event when the monitoring data reaches an alarm threshold configured in those rules; and passing the alarm event to an O&M service management system, which generates an O&M work order when the alarm event is formed. The priority of the work order is determined by analysis against preset O&M business rules, and the work order is matched to the corresponding O&M personnel. By accurately locating the link fault point and connecting alarm information to the O&M service management system so that work orders are generated automatically, the method schedules alarm work orders intelligently, removes the dependence on staff who coordinate and assign work orders, ensures that O&M services are delivered promptly and effectively, and improves O&M quality.

Description

System link automatic monitoring operation and maintenance method, system, equipment and medium

Technical Field

The present application relates to the field of information technology, and in particular to a method, system, device, and medium for automated monitoring and operation and maintenance (O&M) of system links.

Background

A security situation awareness station-side system is a comprehensive network security monitoring system deployed at the stations of energy groups. It consists of data acquisition devices, a plant-level analysis platform, and other general-purpose and special-purpose equipment. More than 3,000 stations are currently connected to the security situation awareness system. Station-side link O&M covers a large number of widely distributed links and many types of equipment, and the deployment schemes of the different station types also differ considerably, so link O&M work is complex and varied.

At present, when a station-side system link becomes abnormal (for example, a device goes offline or latency is high), O&M personnel filter the hosting status of the plant-level platform through the O&M module of the center-side platform, which cannot pinpoint the fault. They must then contact the station specialist to check the operating status of each device on the station-side link one by one, so handling takes a long time. The assignment and processing priority of O&M work orders are adjusted and dispatched manually after a coordinator's judgment, so complex scenarios cannot be matched automatically and overall O&M quality cannot be guaranteed.

Summary of the Invention

In view of this, the present application provides a system link automated monitoring O&M method, system, device, and medium that automate the whole O&M process, from link monitoring and O&M work order creation through to O&M service delivery.

In a first aspect, a system link automated monitoring O&M method is provided. The method includes: monitoring the various devices on a link with a link monitoring device to obtain monitoring data; evaluating the obtained monitoring data against the early-warning rules preset for each type of link device, and forming an alarm event when the monitoring data reaches an alarm threshold configured in the early-warning rules; and passing the alarm event to an O&M service management system, which generates an O&M work order when the alarm event is formed. The priority of the O&M work order is determined by analysis against preset O&M business rules, and the work order is matched to the corresponding O&M personnel.

Optionally, monitoring the various devices on the link with the link monitoring device to obtain monitoring data includes: obtaining collector monitoring data; querying the collector IP list and obtaining the heartbeat data and traffic data of each collector in the list; analyzing the obtained heartbeat data and traffic data; and storing the heartbeat data and traffic data of any collector found to be abnormal in an exception list, where they wait to be matched against the heartbeat data and traffic data obtained in the next cycle.

Optionally, monitoring the various devices on the link with the link monitoring device to obtain monitoring data further includes: obtaining link device monitoring data; querying the link device IP list and obtaining and analyzing the device data of each link device in the list; and storing the device data of any link device found to be abnormal in the exception list, where it waits to be matched against the device data obtained in the next cycle.

Optionally, forming an alarm event when the monitoring data reaches an alarm threshold configured in the early-warning rules includes: comparing, based on the early-warning rules, the monitoring data stored in the exception list with the alarm thresholds of those rules; and forming an alarm event when any data metric of the monitoring data reaches its configured alarm threshold, where the alarm data recorded in the alarm event includes the exact faulty device, the fault information, and the fault time.

Optionally, the O&M business rules include: an O&M service priority rule, under which the importance of O&M work orders is ranked from high to low according to the key-support flag, key-support time period, faulty device, fault information, station maintenance type, and service level shown on each work order; and a work order assignment rule, under which work orders are matched through a combined analysis of the work order's O&M service priority, the station information and the corresponding station O&M lead, the skill level of the O&M engineers, and the O&M shift schedule.

Optionally, determining the O&M service priority of the work order against the preset O&M business rules and matching it to the corresponding O&M personnel includes: analyzing the content of each O&M work order under the O&M service priority rule to determine its priority, so that higher-priority work orders are matched first; and, for each work order selected in this way, determining under the work order assignment rule the station to which the device to be serviced belongs, and the station O&M lead and O&M personnel for that station.

Optionally, the system link automated monitoring O&M method further includes: after an O&M work order has been processed, feeding the O&M result back to the link monitoring device through the O&M service management system; determining, based on the early-warning rules preset in the link monitoring device, whether the alarm event is still triggered, and generating a feedback record that is returned to the O&M service management system; and judging in the O&M service management system, based on the feedback record, whether the O&M work is complete. If the feedback record shows no alarm event, the O&M work is complete and the work order is closed; otherwise, the work order is routed back to the O&M processing stage for further handling.
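The closed-loop check described in the preceding paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `WorkOrder` class, its field names, and the `alarm_still_active` callback (standing in for the link monitoring device re-evaluating its early-warning rules) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkOrder:
    """Hypothetical O&M work order; the field names are illustrative only."""
    order_id: str
    device_ip: str
    status: str = "processing"  # "processing" or "closed"
    feedback: list = field(default_factory=list)


def review_work_order(order, alarm_still_active):
    """Re-check the device after handling, then close or re-route the order.

    `alarm_still_active(ip)` is an assumed callback that returns True when
    the early-warning rules still trigger an alarm for that device.
    """
    still_alarming = alarm_still_active(order.device_ip)
    # The feedback record returned to the O&M service management system.
    order.feedback.append({"device_ip": order.device_ip, "alarm": still_alarming})
    # No alarm: the work is done. Otherwise the order goes back to processing.
    order.status = "processing" if still_alarming else "closed"
    return order.status
```

Keeping the re-check behind a callback means the same routine works whether the monitoring device is polled directly or queried through a service interface.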

In a second aspect, a system link automated monitoring O&M system is provided. The system includes: an acquisition module, configured to monitor the various devices on a link with a link monitoring device and obtain monitoring data; an alarm module, configured to evaluate the obtained monitoring data against the early-warning rules preset for each type of link device and to form an alarm event when the monitoring data reaches an alarm threshold configured in those rules; an interface module, configured to pass the alarm event to an O&M service management system, which generates an O&M work order when the alarm event is formed; a matching module, configured to determine the O&M service priority of the work order by analysis against preset O&M business rules and to match the work order to the corresponding O&M personnel; and a processing module, through which the O&M personnel carry out the O&M service according to the priority determined for the work order and the corresponding alarm event, and record the O&M work in the work order.

In a third aspect, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method described above when executing the computer program.

In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, and the computer program implements the steps of the method described above when executed by a processor.

According to the technical content provided by the embodiments of the present application, information on the various monitored link devices is obtained; the obtained monitoring data is analyzed against the monitoring and early-warning rules for each device type; and an alarm event is formed once a data metric reaches its configured alarm threshold. The alarm event is passed to the O&M service management system, which generates an O&M work order automatically. The priority of the work order is determined by a combined analysis against the O&M business rules, and the work order is matched intelligently to the most suitable O&M engineer. O&M engineers carry out the O&M service in work order priority order using the alarm data, and the processing results are analyzed and fed back automatically. The method can accurately locate the link fault point; because alarm information is connected to the O&M service management system and work orders are generated automatically, alarm work orders are scheduled intelligently, the dependence on staff who coordinate and assign work orders is removed, O&M services are delivered promptly and effectively, and O&M quality improves. The processing result of each work order is fed back to the link monitoring device, which re-checks how the work order was handled. This closes the O&M loop, greatly improves the effectiveness of the O&M service, and automates the whole process from link monitoring and work order creation through to O&M service delivery.

Brief Description of the Drawings

Fig. 1 is a diagram of the application environment of the system link automated monitoring O&M method in one embodiment;

Fig. 2 is a schematic flowchart of the system link automated monitoring O&M method in one embodiment;

Fig. 3 is a block diagram of the link monitoring device in one embodiment;

Fig. 4 is a schematic flowchart of obtaining collector monitoring data in one embodiment;

Fig. 5 is a schematic flowchart of obtaining link device monitoring data in one embodiment;

Fig. 6 is a schematic flowchart of the process that follows handling of an O&M work order in one embodiment;

Fig. 7 is a schematic structural diagram of a computer device in one embodiment.

Detailed Description

The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present application and are not intended to limit it.

For ease of understanding, the system to which the present application applies is described first. The system link automated monitoring O&M method provided by the present application can be applied in the system architecture shown in Fig. 1. The system includes a user-space file server 103 and a terminal device 101, which communicate with each other over a network. The user-space file server 103 may be a file server based on the NFSv3/v4 protocol running in a Linux environment. NFS (Network File System) is a network abstraction on top of a file system that allows a remote client running on the terminal device 101 to access files over the network in much the same way as a local file system. The terminal device 101 may be, but is not limited to, a personal computer, notebook computer, smartphone, or tablet, and the user-space file server 103 may be implemented as a standalone server or as a cluster of multiple servers.

Fig. 2 is a schematic flowchart of a system link automated monitoring O&M method provided by an embodiment of the present application. The method may be executed by the user-space file server in the system shown in Fig. 1. As shown in Fig. 2, the method may include the following steps:

S201: Monitor the various devices on the link with the link monitoring device to obtain monitoring data;

S202: Evaluate the obtained monitoring data against the early-warning rules preset for each type of link device, and form an alarm event when the monitoring data reaches an alarm threshold configured in those rules;

S203: Pass the alarm event to the O&M service management system, which generates an O&M work order when the alarm event is formed;

S204: Determine the O&M service priority of the work order by analysis against the preset O&M business rules, and match the work order to the corresponding O&M personnel;

S205: The O&M personnel carry out the O&M service according to the priority determined for the work order and the corresponding alarm event, and record the O&M work in the work order.

In this embodiment, it should be noted that, referring to Fig. 3, the link monitoring device includes a processor, a memory, a display system, an input device, and a communication interface. The memory stores the operating system, executable programs, and monitoring data records; the processor executes the monitoring programs; and the display system shows program execution status and monitoring records. The display system includes, but is not limited to, an external large screen, a projector, or a liquid crystal display. The input device includes, but is not limited to, buttons or a touchpad on the housing of the link monitoring device, or an external keyboard or mouse. The communication interface carries all types of data and may be a wired network module, a Wi-Fi module, a carrier communication module, an RS-232/RS-485 serial module, or the like.

It should be understood that the specific embodiments described above serve only to illustrate or explain the principles of the present application and do not limit it. Any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the present application therefore falls within its scope of protection. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or within equivalents of that scope and those boundaries.

As can be seen, the embodiments of the present application automate the monitoring of the various link devices, accurately locate the link fault point, and pass alarm events to the O&M service management system so that O&M work orders are generated automatically. This schedules alarm work orders intelligently, removes the dependence on staff who coordinate and assign work orders, ensures that O&M services are delivered promptly and effectively, and improves O&M quality.

Referring to Fig. 2 and Fig. 4, in some embodiments, monitoring the various devices on the link with the link monitoring device to obtain monitoring data in S201 specifically includes:

S2011: Obtain collector monitoring data;

S2012: Query the collector IP list and obtain the heartbeat data and traffic data of each collector in the list;

S2013: Analyze the obtained heartbeat data and traffic data of the collectors;

S2014: Store the heartbeat data and traffic data of any collector found to be abnormal in the exception list, where they wait to be matched against the heartbeat data and traffic data obtained in the next cycle.

In this embodiment, it should be noted that the collector monitoring data is obtained through scheduled Elasticsearch queries. Elasticsearch is a distributed, multi-tenant full-text search engine with a RESTful web interface that makes large volumes of data easy to search, analyze, and explore. Each scheduled query matches the IP addresses of the collectors in each region that appear in the data, matches the heartbeat data (heartbeat packets) and traffic data sent by each collector, and caches them on the server. Collectors found to be abnormal are cached in the exception list. The heartbeat data shows whether a collector is working normally: the check is whether log data exists within the monitoring period. If log data exists, the heartbeat is normal and the collector is online and working; if there is no data, the collector is abnormal and is stored in the exception list. The traffic data is used mainly to verify whether the data obtained by the collector is safe. If traffic is too low during busy periods or exceeds the limit during idle periods, that is, if traffic shows abnormal peaks or troughs, the collector is working abnormally; it is stored in the exception list and waits to be matched against the heartbeat data and traffic data obtained in the next cycle.
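The heartbeat and traffic checks, together with the two-cycle matching against the exception list, can be sketched as follows. This is an illustrative sketch only. It assumes that each polling cycle has already been reduced (for example, from the scheduled Elasticsearch query results) to a dictionary keyed by collector IP; the field names `log_count`, `traffic`, and `busy`, and the thresholds `busy_min` and `idle_max`, are hypothetical and do not come from the application.

```python
def analyze_collector(sample, busy_min, idle_max):
    """Return the anomaly reasons for one collector sample (empty if healthy)."""
    reasons = []
    # Heartbeat check: no log data in the monitoring period means no heartbeat.
    if sample["log_count"] == 0:
        reasons.append("no heartbeat")
    # Traffic check: too little when busy, or too much when idle, is abnormal.
    if sample["busy"] and sample["traffic"] < busy_min:
        reasons.append("traffic too low while busy")
    if not sample["busy"] and sample["traffic"] > idle_max:
        reasons.append("traffic too high while idle")
    return reasons


def update_exception_list(exception_list, samples, busy_min=1000, idle_max=100):
    """Run one polling cycle over `samples` ({collector_ip: sample}).

    Returns (new_exception_list, confirmed). `confirmed` holds collectors
    that were abnormal in both the previous cycle and this one.
    """
    new_list, confirmed = {}, {}
    for ip, sample in samples.items():
        reasons = analyze_collector(sample, busy_min, idle_max)
        if reasons:
            new_list[ip] = reasons
            if ip in exception_list:
                confirmed[ip] = reasons
    return new_list, confirmed
```

Requiring two consecutive abnormal cycles before a collector is treated as confirmed filters out one-off gaps, at the cost of one polling period of extra detection delay.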

Referring to Fig. 2 and Fig. 5, in some embodiments, monitoring the various devices on the link with the link monitoring device to obtain monitoring data in S201 further includes:

S2015: Obtain link device monitoring data;

S2016: Query the link device IP list, and obtain and analyze the device data of each link device in the list;

S2017: Store the device data of any link device found to be abnormal in the exception list, where it waits to be matched against the device data obtained in the next cycle.

In this embodiment, it should be noted that the core devices on the system link are checked for network connectivity at a preset interval. The core devices include, but are not limited to, the plant-level platform IP, the IP of the VPDN IoT card, and the encryption device IP. This task runs on the main service node and can use multiple processes and multiple threads. Connectivity of the core devices is determined by querying ICMP messages. ICMP (Internet Control Message Protocol) is a sub-protocol of the TCP/IP protocol suite used to pass control messages between IP hosts and routers. Control messages concern the network itself: whether the network is reachable, whether a host is reachable, whether a route is available, and so on. ICMP uses the basic support of IP as if it were a higher-level protocol, but it is in fact an integral part of IP and must be implemented by every IP module. Querying ICMP messages records whether each device is reachable, together with any packet loss rate and delay observed.

In addition, all device IPs to be checked are looked up in the dictionary database, and the analysis results for the system link devices are obtained and cached on the server. Devices found to be abnormal are cached in the exception list and wait to be matched against the link device data obtained in the next cycle.
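A minimal sketch of the periodic connectivity check follows. It deliberately leaves the ICMP probe itself out: `probe` is an assumed, caller-supplied function that sends one echo request to an IP and returns the round-trip time in milliseconds, or `None` if no reply arrives. The function names and the result fields are hypothetical, and the sketch uses a thread pool only, whereas the embodiment mentions combining processes and threads.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def summarize_probes(rtts_ms):
    """Summarize one device's probe results; None marks a lost reply."""
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    return {
        "reachable": bool(replies),
        "loss_rate": (sent - len(replies)) / sent if sent else 1.0,
        "avg_delay_ms": mean(replies) if replies else None,
    }


def check_devices(device_ips, probe, count=4, workers=8):
    """Probe every device concurrently and return {ip: summary}.

    `probe(ip)` is an assumed callable returning an RTT in ms, or None
    when the echo request gets no reply.
    """
    def check(ip):
        return ip, summarize_probes([probe(ip) for _ in range(count)])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(check, device_ips))
```

Passing the probe in as a parameter keeps the loss-rate and delay arithmetic testable without network access or raw-socket privileges; a real deployment would supply a probe backed by an ICMP library or the system `ping` command.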

Referring to Fig. 2, in some embodiments, forming an alarm event in S202 when the monitoring data reaches an alarm threshold configured in the early-warning rules specifically includes: comparing, based on the early-warning rules, the monitoring data stored in the exception list with the alarm thresholds of those rules; and forming an alarm event when any data metric of the monitoring data reaches its configured alarm threshold, where the alarm data recorded in the alarm event includes the exact faulty device, the fault information, and the fault time.

In this embodiment, it should be noted that each type of monitored device can be given either a general early-warning rule or a custom early-warning rule, depending on the application scenario; a custom rule can also be configured for an individual device by its IP. With a general early-warning rule, an automatic analysis program supplies a preset alarm threshold whenever monitoring is set up for a link. With a custom early-warning rule, the link detection thresholds are configured manually.

Monitoring data that is in the exception list in both analyses is compared with the alarm threshold set by the corresponding early warning rule. When any data indicator of the monitoring data reaches the configured alarm threshold, an alarm event is formed, and the alarm event record includes the exact faulty device, the fault information, and the fault time. The alarm event makes it possible to pinpoint the faulty device.
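The two-round confirmation described above can be sketched as follows, under stated assumptions: the exception lists are modelled as sets of device IPs, each indicator is a number for which a larger value is worse, and "reaches the threshold" is read as greater than or equal. `AlarmEvent` and `alarms_for_round` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Set


@dataclass
class AlarmEvent:
    device_ip: str        # the exact faulty device
    fault_info: str       # which indicators reached their thresholds
    fault_time: datetime  # when the alarm event was formed


def alarms_for_round(
    previous_exceptions: Set[str],
    current_exceptions: Set[str],
    metrics: Dict[str, Dict[str, float]],
    thresholds: Dict[str, float],
    now: datetime,
) -> List[AlarmEvent]:
    """Form alarm events for devices that were abnormal in both analyses."""
    events = []
    # Only devices present in both exception lists are compared to thresholds.
    for ip in sorted(previous_exceptions & current_exceptions):
        breached = sorted(
            name
            for name, value in metrics.get(ip, {}).items()
            if name in thresholds and value >= thresholds[name]
        )
        if breached:  # any single indicator reaching its threshold is enough
            events.append(AlarmEvent(ip, "threshold reached: " + ", ".join(breached), now))
    return events
```

Requiring two consecutive abnormal rounds before comparing against thresholds filters out one-off glitches, at the cost of delaying an alarm by one monitoring period.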

Referring to FIG. 2, in some embodiments, in S204 the operation and maintenance business rules include: an operation and maintenance service priority rule, under which the importance of operation and maintenance work orders is ranked from high to low according to the key-assurance flag, key-assurance time period, faulty device, fault information, station maintenance type, and service level shown on the work order; and a work order allocation rule, under which matching is performed by comprehensive analysis of the work order's operation and maintenance service priority, the station information, the corresponding station operation and maintenance lead, the skill levels of the operation and maintenance engineers, and the operation and maintenance shift schedule.

In this embodiment, it should be noted that, by combining the operation and maintenance service priority rule with the work order allocation rule, work orders can be matched quickly and accurately, in order of importance, to suitable operation and maintenance personnel.
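One possible reading of the operation and maintenance service priority rule is a lexicographic sort over the listed fields, sketched below. How each field is turned into a rank is an assumption (lower numbers mean more urgent here), and `WorkOrder`, `priority_key`, and `prioritize` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WorkOrder:
    order_id: str
    key_assurance: bool        # key-assurance flag on the work order
    in_assurance_period: bool  # alarm falls inside the key-assurance time period
    device_rank: int           # assumed rank of the faulty device, lower is more urgent
    fault_rank: int            # assumed rank of the fault information
    maintenance_rank: int      # assumed rank of the station maintenance type
    service_level: int         # assumed numbering, lower is a higher service level


def priority_key(order: WorkOrder) -> Tuple[bool, bool, int, int, int, int]:
    """Sort key following the field order given in the priority rule."""
    return (
        not order.key_assurance,        # flagged work orders sort first
        not order.in_assurance_period,
        order.device_rank,
        order.fault_rank,
        order.maintenance_rank,
        order.service_level,
    )


def prioritize(orders: List[WorkOrder]) -> List[WorkOrder]:
    """Return the work orders from highest to lowest priority."""
    return sorted(orders, key=priority_key)
```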

In some embodiments, S204 (the operation and maintenance work order is analyzed according to the preset operation and maintenance business rules to determine the operation and maintenance service priority, and is matched to the corresponding operation and maintenance personnel) includes: determining the priority of the work order from its content according to the operation and maintenance service priority rule, and matching higher-priority work orders first; and, for each prioritized work order, determining according to the work order allocation rule the station of the device to be serviced, the station's operation and maintenance lead, and the skill levels and shift schedules of the operation and maintenance engineers, so as to select suitable operation and maintenance personnel.

In this embodiment, it should be noted that the operation and maintenance work order is analyzed according to the preset operation and maintenance business rules to determine the operation and maintenance service priority, and is matched to the corresponding operation and maintenance personnel. Specifically, the attribute information of each station is first recorded and maintained so that it can be matched against the station information of the alarm event and the operation and maintenance service priority information. The matched content includes, but is not limited to, the station name, the devices deployed at the station, the IP of each device, the province, the maintenance type, the key-assurance flag, the service level, and the operation and maintenance lead. Information about the operation and maintenance engineers is then recorded and analyzed so that each work order can be matched to a suitable engineer according to the work order allocation rule. During matching, the configured skill attributes of the engineers are also used to refine the match; these attributes include, but are not limited to, qualifications, areas of expertise, skill level, and shift schedule. After a suitable engineer has been matched, a work order reminder can also be sent to the assigned engineer by email, SMS, WeChat notification, or similar means.
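A simplified sketch of the work order allocation rule follows. It assumes that each engineer record lists the stations the engineer covers, a numeric skill level, and whether the engineer is on the current shift, and that the most skilled eligible engineer is chosen. These modelling choices, and the names `Engineer` and `match_engineer`, are illustrative assumptions rather than details from the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Engineer:
    name: str
    stations: Set[str]   # stations this engineer is responsible for
    skill_level: int     # higher means more skilled (assumed scale)
    on_shift: bool       # taken from the shift schedule


def match_engineer(
    station: str, required_skill: int, engineers: List[Engineer]
) -> Optional[Engineer]:
    """Return the most skilled on-shift engineer covering the station, if any."""
    candidates = [
        e
        for e in engineers
        if e.on_shift and station in e.stations and e.skill_level >= required_skill
    ]
    if not candidates:
        return None  # no suitable engineer; the work order stays unassigned
    # Highest skill first; ties are broken by name so the result is deterministic.
    return min(candidates, key=lambda e: (-e.skill_level, e.name))
```

A reminder by email, SMS, or WeChat would be sent after a match is found; that step is omitted here because it depends on external services.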

In addition, once a work order has been generated, beyond allocating it according to the operation and maintenance business rules, the system also supports a series of operations on the work order, such as viewing, claiming, processing, reassigning, and closing, ultimately enabling fast and accurate repair of device faults.

Referring to FIG. 6, in some embodiments, the system link automatic monitoring operation and maintenance method further includes S206:

S2061: based on the processed operation and maintenance work order, the operation and maintenance service management system feeds the operation and maintenance result back to the link monitoring device;

S2062: based on the early warning rules preset in the link monitoring device, determine whether an alarm event is triggered, and generate a feedback record that is returned to the operation and maintenance service management system;

S2063: the operation and maintenance service management system judges, based on the feedback record, whether the operation and maintenance work is complete;

S2064: if the feedback record shows no alarm event, the operation and maintenance work is complete and the work order is closed;

S2065: otherwise, the work order is routed back to the operation and maintenance processing stage for further handling.

In this embodiment, it should be noted that after the operation and maintenance personnel submit the processed work order, the operation and maintenance service management system feeds the completion record back to the link monitoring device. On receiving the completion signal, the automatic link monitoring triggers the link monitoring device of that station, which applies the corresponding early warning rules to determine whether an alarm event is triggered again and returns the monitoring result to the operation and maintenance service management system. The operation and maintenance service management system then judges from the feedback record whether processing is complete. If the feedback record shows no alarm event, the work order is closed and this round of operation and maintenance is complete; if the feedback record shows an alarm, the work order is routed back to the operation and maintenance processing stage for further handling, and a reminder to process the work order again is sent to the assigned engineer by email, SMS, WeChat notification, or similar means. Because the work order result is fed back to the link monitoring device, which re-verifies how the work order was handled, the operation and maintenance loop is closed, the effectiveness of the operation and maintenance service is greatly improved, and the whole process, from link monitoring through work order creation to operation and maintenance service, is automated.
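The closed loop of steps S2061 to S2065 could be sketched as a small state transition, shown below. The re-check of the station and the reminder are passed in as callables because the application does not define their interfaces; `OrderState`, `SubmittedOrder`, and `close_loop` are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class OrderState(Enum):
    SUBMITTED = "submitted"    # engineer has submitted the processed work order
    PROCESSING = "processing"  # routed back for further handling
    CLOSED = "closed"


@dataclass
class SubmittedOrder:
    order_id: str
    station: str
    state: OrderState = OrderState.SUBMITTED
    feedback: List[str] = field(default_factory=list)  # feedback records


def close_loop(
    order: SubmittedOrder,
    recheck: Callable[[str], bool],           # True if the station still alarms
    notify: Callable[[SubmittedOrder], None],  # reminder to the assigned engineer
) -> OrderState:
    """Re-verify a submitted work order and close it or route it back."""
    still_alarming = recheck(order.station)                         # S2061, S2062
    order.feedback.append("alarm" if still_alarming else "no alarm")
    if still_alarming:                                              # S2065
        order.state = OrderState.PROCESSING
        notify(order)
    else:                                                           # S2064
        order.state = OrderState.CLOSED
    return order.state
```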

The steps above mainly obtain information on the various monitored devices of the link, analyze the acquired monitoring data against the monitoring and early warning rules for each device type, and form an alarm event once a data indicator reaches the configured alarm threshold. The alarm event is passed to the operation and maintenance service management system, which automatically generates a work order. The work order's operation and maintenance service priority is determined by comprehensive analysis under the operation and maintenance business rules, and the work order is intelligently matched to the most suitable engineer. The engineer carries out the operation and maintenance service in work order priority order using the alarm data, and the processing result is automatically analyzed and fed back. The method can accurately locate link fault points; by connecting alarm information to the operation and maintenance service management system to generate work orders automatically, it achieves intelligent scheduling of alarm work orders, removes the dependence on staff who coordinate and allocate work orders, ensures that operation and maintenance services are delivered promptly and effectively, and improves operation and maintenance quality.

An embodiment of the present application further provides a system link automatic monitoring operation and maintenance system, which may include:

an acquisition module, configured to monitor the various devices of the link by means of the link monitoring device and to acquire monitoring data;

an alarm module, configured to form an alarm event, based on the acquired monitoring data and according to the early warning rules preset for the various devices of the link, when the monitoring data reaches the alarm threshold configured by the early warning rule;

a docking module, configured to interface with the operation and maintenance service management system based on the alarm event, the operation and maintenance service management system generating an operation and maintenance work order when the alarm event is formed;

a matching module, configured to analyze the work order according to the preset operation and maintenance business rules to determine the operation and maintenance service priority, and to match the work order to the corresponding operation and maintenance personnel;

a processing module, configured for the operation and maintenance personnel to carry out the operation and maintenance service according to the operation and maintenance service priority determined for the work order and the corresponding alarm event, and to record the operation and maintenance content in the work order.

According to embodiments of the present application, the present application further provides a computer device and a computer-readable storage medium.

FIG. 7 is a block diagram of a computer device according to an embodiment of the present application. The computer device is intended to represent various forms of digital computers or mobile devices. Digital computers may include desktop computers, laptop computers, workstations, personal digital assistants, servers, mainframe computers, and other suitable computers. Mobile devices may include tablets, smartphones, wearable devices, and the like.

As shown in FIG. 7, the device 600 includes a computing unit 601, a ROM 602, a RAM 603, a bus 604, and an input/output (I/O) interface 605. The computing unit 601, the ROM 602, and the RAM 603 are connected to one another through the bus 604, and the I/O interface 605 is also connected to the bus 604.

The computing unit 601 can perform the various processes of the method embodiments of the present application according to computer instructions stored in the read-only memory (ROM) 602 or computer instructions loaded from the storage unit 608 into the random access memory (RAM) 603. The computing unit 601 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. The computing unit 601 may include, but is not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and so on. In some embodiments, the methods provided in the embodiments of the present application may be implemented as a computer software program that is tangibly embodied in a computer-readable storage medium, such as the storage unit 608.

The RAM 603 can also store the various programs and data required for the operation of the device 600. Part or all of the computer program can be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609.

The input unit 606, the output unit 607, the storage unit 608, and the communication unit 609 of the device 600 may be connected to the I/O interface 605. The input unit 606 may be, for example, a keyboard, a mouse, a touch screen, or a microphone; the output unit 607 may be, for example, a display, a speaker, or an indicator light. The device 600 can exchange information and data with other devices through the communication unit 609.

It should be noted that the device may also include other components necessary for normal operation. It may also include only the components necessary to implement the solution of the present application, rather than all the components shown in the figure.

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.

Computer instructions for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer instructions may be provided to the computing unit 601 so that, when executed by the computing unit 601 (for example, a processor), they cause the steps of the method embodiments of the present application to be performed.

The computer-readable storage medium provided in the present application may be a tangible medium that contains or stores computer instructions for performing the steps of the method embodiments of the present application. Computer-readable storage media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, and other forms of storage media.

The specific embodiments above do not limit the scope of protection of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims (10)

1. A system link automatic monitoring operation and maintenance method, characterized by comprising the following steps:
monitoring various devices of the link by means of a link monitoring device to obtain monitoring data;
forming an alarm event, based on the acquired monitoring data and according to an early warning rule preset for the various devices of the link, when the monitoring data reaches an alarm threshold configured by the early warning rule;
interfacing with an operation and maintenance service management system based on the alarm event, wherein the operation and maintenance service management system generates an operation and maintenance work order when the alarm event is formed;
analyzing the operation and maintenance work order according to preset operation and maintenance business rules to determine an operation and maintenance service priority, and matching the operation and maintenance work order to corresponding operation and maintenance personnel;
and carrying out, by the operation and maintenance personnel, the operation and maintenance service according to the operation and maintenance service priority determined for the operation and maintenance work order and the corresponding alarm event, and recording the operation and maintenance content in the operation and maintenance work order.
2. The system link automatic monitoring operation and maintenance method according to claim 1, wherein monitoring the various devices of the link by means of the link monitoring device to obtain monitoring data comprises:
acquiring monitoring data of a collector;
querying a collector IP list to obtain heartbeat data and traffic data of the corresponding collectors in the collector IP list;
analyzing the acquired heartbeat data and traffic data of the collectors;
and storing the heartbeat data and traffic data of any collector found to be abnormal in an exception list, to wait for matching with the heartbeat data and traffic data of the collector obtained in the next round.
3. The system link automatic monitoring operation and maintenance method according to claim 1, wherein monitoring the various devices of the link by means of the link monitoring device to obtain monitoring data further comprises:
acquiring link device monitoring data;
querying a link device IP list, and acquiring and analyzing device data of the corresponding link devices in the link device IP list;
and storing the device data of any link device found to be abnormal in an exception list, to wait for matching with the device data of the link device acquired in the next round.
4. The system link automatic monitoring operation and maintenance method according to claim 2 or 3, wherein forming an alarm event, based on the acquired monitoring data and according to the early warning rule preset for the various devices of the link, when the monitoring data reaches the alarm threshold configured by the early warning rule comprises:
comparing, based on the early warning rule, the monitoring data stored in the exception list with the alarm threshold of the early warning rule;
and forming an alarm event when any data indicator of the monitoring data reaches the configured alarm threshold, wherein the alarm data recorded by the alarm event comprises the exact faulty device, the fault information, and the fault time.
5. The system link automatic monitoring operation and maintenance method according to claim 1, wherein the operation and maintenance business rules comprise:
an operation and maintenance service priority rule, under which the importance of operation and maintenance work orders is ranked from high to low according to the key-assurance flag, the key-assurance time period, the faulty device, the fault information, the station maintenance type, and the service level shown on the operation and maintenance work order;
and a work order allocation rule, under which matching is performed by comprehensive analysis of the operation and maintenance service priority of the operation and maintenance work order, the station information, the corresponding station operation and maintenance lead, the skill levels of the operation and maintenance engineers, and the operation and maintenance shift schedule.
6. The system link automatic monitoring operation and maintenance method according to claim 5, wherein analyzing the operation and maintenance work order according to the preset operation and maintenance business rules to determine the operation and maintenance service priority, and matching the operation and maintenance work order to the corresponding operation and maintenance personnel, comprises:
determining the priority of the operation and maintenance work order from its content according to the operation and maintenance service priority rule, and matching higher-priority operation and maintenance work orders first;
and, for each prioritized operation and maintenance work order, determining according to the work order allocation rule the station corresponding to the device to be serviced on the operation and maintenance work order, and the station operation and maintenance lead and the operation and maintenance personnel corresponding to that station.
7. The system link automatic monitoring operation and maintenance method according to any one of claims 1 to 6, wherein the method further comprises:
feeding back, by the operation and maintenance service management system and based on the processed operation and maintenance work order, an operation and maintenance result to the link monitoring device;
determining, based on the early warning rule preset in the link monitoring device, whether the alarm event is triggered, generating a feedback record, and returning the feedback record to the operation and maintenance service management system;
judging, by the operation and maintenance service management system and based on the feedback record, whether the operation and maintenance work is complete;
if the feedback record shows no alarm event, the operation and maintenance work is complete and the operation and maintenance work order is closed;
otherwise, the operation and maintenance work order is routed back to the operation and maintenance processing stage for further handling.
8. A system link automatic monitoring operation and maintenance system, the system comprising:
an acquisition module, configured to monitor various devices of the link by means of a link monitoring device and to acquire monitoring data;
an alarm module, configured to form an alarm event, based on the acquired monitoring data and according to an early warning rule preset for the various devices of the link, when the monitoring data reaches an alarm threshold configured by the early warning rule;
a docking module, configured to interface with an operation and maintenance service management system based on the alarm event, wherein the operation and maintenance service management system generates an operation and maintenance work order when the alarm event is formed;
a matching module, configured to analyze the operation and maintenance work order according to preset operation and maintenance business rules to determine an operation and maintenance service priority, and to match the operation and maintenance work order to corresponding operation and maintenance personnel;
and a processing module, configured for the operation and maintenance personnel to carry out the operation and maintenance service according to the operation and maintenance service priority determined for the operation and maintenance work order and the corresponding alarm event, and to record the operation and maintenance content in the operation and maintenance work order.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 7 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202211419040.0A 2022-11-14 2022-11-14 System link automatic monitoring operation and maintenance method, system, equipment and medium Pending CN115941441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211419040.0A CN115941441A (en) 2022-11-14 2022-11-14 System link automatic monitoring operation and maintenance method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211419040.0A CN115941441A (en) 2022-11-14 2022-11-14 System link automatic monitoring operation and maintenance method, system, equipment and medium

Publications (1)

Publication Number Publication Date
CN115941441A true CN115941441A (en) 2023-04-07

Family

ID=86549855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211419040.0A Pending CN115941441A (en) 2022-11-14 2022-11-14 System link automatic monitoring operation and maintenance method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN115941441A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116703167A (en) * 2023-08-08 2023-09-05 深圳市明心数智科技有限公司 Alarm monitoring processing method, device, equipment and storage medium for cultivation equipment
CN119005918A (en) * 2024-10-22 2024-11-22 富鸿资本(湖南)融资租赁有限公司 Work order flow processing method based on operation and maintenance of power station

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116703167A (en) * 2023-08-08 2023-09-05 深圳市明心数智科技有限公司 Alarm monitoring processing method, device, equipment and storage medium for cultivation equipment
CN116703167B (en) * 2023-08-08 2024-01-26 深圳市明心数智科技有限公司 Alarm monitoring processing method, device, equipment and storage medium for cultivation equipment
CN119005918A (en) * 2024-10-22 2024-11-22 富鸿资本(湖南)融资租赁有限公司 Work order flow processing method based on operation and maintenance of power station
CN119005918B (en) * 2024-10-22 2025-02-07 富鸿资本(湖南)融资租赁有限公司 Work order process processing method based on power station operation and maintenance

Similar Documents

Publication Publication Date Title
WO2020259421A1 (en) Method and apparatus for monitoring service system
US10069900B2 (en) Systems and methods for adaptive thresholding using maximum concentration intervals
CN114500250B (en) System linkage comprehensive operation and maintenance system and method in cloud mode
US20190095266A1 (en) Detection of Misbehaving Components for Large Scale Distributed Systems
CN109831478A (en) Rule-based and model distributed processing intelligent decision system and method in real time
CN115941441A (en) System link automatic monitoring operation and maintenance method, system, equipment and medium
US20210366268A1 (en) Automatic tuning of incident noise
US9600795B2 (en) Measuring process model performance and enforcing process performance policy
WO2023207689A1 (en) Change risk assessment method and apparatus, and storage medium
CN107220121A (en) Sandbox environment method of testing and its system under a kind of NUMA architecture
CN113656239A (en) Monitoring method and device for middleware and computer program product
CN108039971A (en) A kind of alarm method and device
CN115632926A (en) Alarm information processing method, device, equipment, storage medium and product
CN109684321A (en) Data quality management method, device, electronic equipment, storage medium
CN114418342A (en) A business data processing method, device and readable storage medium
CN105139296A (en) Power grid business data full life cycle quality management system
CN110851316B (en) Abnormality early warning method, abnormality early warning device, abnormality early warning system, electronic equipment and storage medium
CN117194092A (en) Root cause locating method, root cause locating device, computer equipment and storage medium
CN108154343B (en) Emergency processing method and system for enterprise-level information system
CN107682173B (en) Automatic fault positioning method and system based on transaction model
CN115098326A (en) System anomaly detection method and device, storage medium and electronic equipment
CN116166427A (en) Automatic capacity expansion and contraction method, device, equipment and storage medium
CN111932706B (en) Informationized inspection method and device, storage medium and electronic equipment
CN110493071A (en) Message system resources balance device, method and apparatus
CN118331843B (en) Hierarchical data automatic test method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination