CN117937752A - All-in-one DCIM system for ultra-large data management - Google Patents
All-in-one DCIM system for ultra-large data management
- Publication number
- CN117937752A (application number CN202410085814.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- module
- faults
- conversion
- cleaning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J13/00—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
- H02J13/00006—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J13/00—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
- H02J13/00006—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment
- H02J13/00028—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by information or instructions transport means between the monitoring, controlling or managing units and monitored, controlled or operated power network element or electrical equipment involving the use of Internet protocols
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0631—Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/04—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
- H04L63/0428—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
- H04L63/0435—Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply symmetric encryption, i.e. same key used for encryption and decryption
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
- H04L69/166—IP fragmentation; TCP segmentation
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Power Engineering (AREA)
- Quality & Reliability (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Remote Monitoring And Control Of Power-Distribution Networks (AREA)
Abstract
The invention discloses an integrated DCIM system for ultra-large data management and belongs to the technical field of DCIM systems. Existing distributed DCIM architectures cannot manage future ultra-large data centers: the system stutters, and data refresh times fail to meet end-customer requirements. To address this, the invention combines a data access system, a data processing system, and a management fusion platform. Three management servers run simultaneously, with load balancing implemented by system operation control software, and the dynamic environment monitoring system, the power monitoring system, and the BA (building automation) control system are connected and merged into a single management platform. This raises the processing capacity of the server cluster, spares operation and maintenance personnel from working across platforms, and greatly improves operation and maintenance efficiency.
Description
Technical Field
The present invention relates to the technical field of DCIM systems, and in particular to an integrated DCIM system for ultra-large data management.
Background
A DCIM system helps administrators optimize the operating efficiency of a data center, reduce energy consumption, and carry out predictive maintenance and capacity planning. It also lets administrators monitor equipment performance and status in real time and provides alarm and notification functions so that faults and problems can be handled promptly. Overall, a DCIM system improves the reliability, availability, and security of a data center, raises operation and maintenance efficiency, and reduces cost.
In recent years, breakthroughs in large-model training, computing chips, GPUs, NPUs, 800G optical modules, and high-density batteries, together with the rapid development of cooling technology, have raised the power of a single cabinet from 8 kW to 60 kW or higher, with mass production and project deployment already achieved. As the upstream and downstream supply chain for liquid cooling matures, single-cabinet power will keep climbing and cabinets of higher output will be installed within the same building, which will place severe demands and pressure on the building's DCIM system.
Existing distributed DCIM architectures cannot manage future ultra-large data centers:
1. The system stutters, and data refresh times cannot meet end-customer requirements;
2. The subsystems are fragmented, which makes faults difficult to locate.
Summary of the Invention
The purpose of the present invention is to provide an integrated DCIM system for ultra-large data management, so as to solve the problem described in the background above, namely that existing distributed DCIM architectures cannot manage future ultra-large data centers.
To achieve this purpose, the present invention provides the following technical solution: an integrated DCIM system for ultra-large data management, comprising:
a data access system, configured to:
serve as the data interface for every acquisition module of the DCIM system and as the source of equipment status feedback, receiving equipment information and operating status from the dynamic environment monitoring system, the power monitoring system, and the BA system;
a data processing system, configured to:
compute, analyze, and store the information collected at the terminal end. It provides configurable linkage control, which automates the linked control of related systems or equipment, and complex event analysis, which automatically finds the root-cause alarm so that operation and maintenance staff can locate the root cause, resolve the problem, and recover from the fault quickly. The data processing system comprises 3 management servers and 3 storage servers: the management servers analyze and process the collected data and report and compute equipment faults, while the storage servers store the collected data in full. Information received by the data access system is uploaded to the data processing system through a switch;
a management fusion platform, configured to:
connect the dynamic environment monitoring system, the power monitoring system, and the BA automatic control system and merge them into a single management platform, which raises the processing capacity of the server cluster.
Further, the data access system comprises a data acquisition module, a data transmission module, a data cleaning and conversion module, a data protection module, a data flow control module, and a data sending module, wherein:
the data acquisition module is configured to:
collect data from each data source, including the substation and battery room area, the IT room area, and the chiller plant area, as well as external data sources such as cloud services and external monitoring systems. Through data acquisition the system obtains real-time data, which is the basis for subsequent processing and analysis;
the data transmission module is configured to:
transmit the collected data quickly, stably, and securely, using the TCP/IP protocol;
the data cleaning and conversion module is configured to:
clean and convert data during transmission so that data quality and format stay consistent, including deduplication, denoising, and format conversion, which makes the data more standardized and easier to manage;
the data protection module is configured to:
keep the data secure by means of encryption and identity authentication, preventing tampering, leakage, or unauthorized access during transmission;
the data flow control module is configured to:
control and manage data traffic so that transmission stays balanced and stable, assigning priority and bandwidth according to the needs and resources of the data center so that critical data is delivered in time;
the data sending module is configured to:
send the collected and collated data to the data processing system.
Further, the data cleaning and conversion module comprises a data deduplication module, a data denoising module, and a data format conversion module;
the data deduplication module is configured to:
deduplicate the collected data using a hash-based deduplication algorithm;
the data denoising module is configured to:
denoise the data using filtering techniques, improving its accuracy and reliability;
the data format conversion module is configured to:
convert the collected data, according to the needs of the data center, into the system's internal data format, which simplifies subsequent processing and analysis.
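The deduplication and denoising steps above can be sketched as follows. This is only an illustration under stated assumptions: the patent names neither a hash function nor a filter, so SHA-256 over a canonical JSON form and a sliding median filter are stand-ins, and the record fields (`point`, `ts`, `value`) are hypothetical.

```python
import hashlib
import json
from statistics import median


def record_key(record: dict) -> str:
    """Return a SHA-256 digest of the record's canonical JSON form.

    Sorting keys makes the digest independent of field order.
    SHA-256 is an assumption; the source only says "hash algorithm".
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first occurrence of each distinct record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def median_filter(values, window=3):
    """Replace each sample by the median of a centered window.

    The window shrinks at both ends of the series. A median filter is
    one possible "filtering technique"; the source does not name one.
    """
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]
```

Hashing a canonical form rather than the raw payload means two readings that differ only in field order count as duplicates; whether that is desirable depends on the upstream protocol.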
Further, the data cleaning and conversion module also comprises:
a data normalization module, configured to:
normalize the data by converting it to uniform units and magnitudes, which simplifies comparison and analysis;
a data verification module, configured to:
verify the data with a cyclic redundancy check (CRC) and, if verification fails, retransmit the data or raise an alarm;
a data completion module, configured to:
interpolate or complete missing values from historical data, filling gaps so that the data stays complete and continuous.
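A minimal sketch of the three steps above, assuming CRC-32 (via Python's `zlib.crc32`) as the cyclic redundancy check and linear interpolation as the completion method; the source specifies neither. The unit table and the frame layout (payload followed by a 4-byte big-endian checksum) are hypothetical.

```python
import zlib

# Hypothetical unit table: factors that convert a power reading to kilowatts.
TO_KILOWATTS = {"W": 0.001, "kW": 1.0, "MW": 1000.0}


def normalize_power(value: float, unit: str) -> float:
    """Convert a power reading to kilowatts."""
    return value * TO_KILOWATTS[unit]


def attach_crc(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload (assumed layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify_crc(frame: bytes) -> bool:
    """Return True if the trailing CRC-32 matches the payload."""
    if len(frame) < 4:
        return False
    payload, received = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received


def fill_gaps(samples):
    """Linearly interpolate None entries that lie between known samples.

    Leading and trailing gaps are left as None because there is no
    neighbor on one side to interpolate from.
    """
    filled = list(samples)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled
```

A caller would retransmit or raise an alarm when `verify_crc` returns False, matching the two failure responses the text describes.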
Further, the data protection module comprises:
a data encryption module, configured to:
encrypt sensitive data with a symmetric encryption algorithm during cleaning and conversion, keeping the data secure in transit and in storage;
a data desensitization module, configured to:
apply data masking to data involving personal privacy or sensitive information during cleaning and conversion, shielding or replacing the sensitive fields to protect user privacy and data security;
a data permission control module, configured to:
enforce strict permission control during cleaning and conversion, restricting users' rights to access and operate on data so that only authorized users can perform cleaning and conversion operations, which keeps the data secure and controllable.
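The masking and permission steps above might look like the sketch below. Everything named here is an assumption for illustration: the role names, the `contact` field, and the rule of hiding the middle four digits of an 11-digit number. Symmetric encryption is deliberately left out, because the Python standard library ships no symmetric cipher and the source does not name an algorithm.

```python
import re

# Hypothetical set of roles allowed to run cleaning and conversion.
AUTHORIZED_ROLES = {"dcim-admin", "data-engineer"}

# Matches a standalone 11-digit number, capturing the first 3 and last 4 digits.
_PHONE = re.compile(r"\b(\d{3})\d{4}(\d{4})\b")


def mask_phone(text: str) -> str:
    """Replace the middle four digits of 11-digit numbers with asterisks."""
    return _PHONE.sub(r"\1****\2", text)


def clean_record(record: dict, role: str) -> dict:
    """Return a masked copy of the record, refusing unauthorized callers."""
    if role not in AUTHORIZED_ROLES:
        raise PermissionError(f"role {role!r} may not clean or convert data")
    cleaned = dict(record)
    cleaned["contact"] = mask_phone(cleaned.get("contact", ""))
    return cleaned
```

Checking the role before touching the record keeps unauthorized callers from observing even partially processed data.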
Further, the data protection module also comprises:
a data audit and monitoring module, configured to:
establish an audit and monitoring mechanism for data cleaning and conversion, recording each cleaning and conversion operation for later audit and tracing, and monitoring abnormal operations and data changes during cleaning and conversion so that data security risks are found and handled promptly;
a data backup and recovery module, configured to:
back up data regularly and promptly during cleaning and conversion, so that data can be recovered and persists, preventing loss or corruption and safeguarding data security and availability;
a data anomaly detection module, configured to:
establish an anomaly detection and alarm mechanism that monitors the cleaning and conversion process and triggers an alarm as soon as an anomaly is found.
Further, the data flow control module comprises:
a traffic monitoring and analysis module, configured to:
monitor and analyze data traffic in real time with the Snort traffic analysis tool, so that abnormal traffic or attacks are detected promptly and the data is protected;
a traffic priority module, configured to:
send data with larger traffic volume first, ensuring that important data arrives in time;
a traffic blocking module, configured to:
set abnormal-traffic detection rules based on the monitoring and analysis results and configure an alarm mechanism. When abnormal traffic is detected, the alarm is triggered and data cleaning and conversion, together with data audit and monitoring, are halted to prevent data leakage.
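The priority and blocking behavior above can be illustrated with a small queue. This sketch rests on assumptions the source does not state: priority is an explicit integer (lower is sent first) rather than something derived from traffic volume, and "abnormal traffic" is reduced to a single rate threshold instead of Snort rules. The class and parameter names are hypothetical.

```python
import heapq
import itertools


class PrioritySender:
    """Send queued messages in priority order unless traffic is blocked."""

    def __init__(self, max_rate: float):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order
        self.max_rate = max_rate         # hypothetical messages-per-second limit
        self.blocked = False

    def submit(self, priority: int, message: str) -> None:
        """Queue a message; lower priority numbers are sent first."""
        heapq.heappush(self._heap, (priority, next(self._order), message))

    def observe_rate(self, messages_per_second: float) -> bool:
        """Block sending once the observed rate exceeds the limit."""
        if messages_per_second > self.max_rate:
            self.blocked = True
        return self.blocked

    def drain(self):
        """Return all queued messages in send order, or nothing if blocked."""
        if self.blocked:
            return []
        sent = []
        while self._heap:
            sent.append(heapq.heappop(self._heap)[2])
        return sent
```

Once blocked, the sender stays blocked until an operator resets it; automatic recovery would need a policy the source does not describe.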
Further, the data processing system comprises:
a data storage module, which stores the collated information sent to it using both online and offline storage, with online storage provided by the storage servers;
a unit linkage module, which runs the 3 management servers simultaneously, with load balancing implemented by the system operation control software;
a positioning alarm module, configured to:
locate and raise alarms for monitored equipment and automatically find the root-cause alarm, so that operation and maintenance staff can locate the root cause, resolve the problem, and recover from the fault quickly;
a report generation module, configured to:
let operation and maintenance personnel design reports that meet their individual needs.
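The source does not say how the system operation control software balances load across the 3 management servers. One common policy, shown here purely as an assumed example, is least-loaded assignment: each task goes to the server with the smallest outstanding load. The server names are hypothetical.

```python
class LeastLoadedBalancer:
    """Assign each task to the server with the smallest outstanding load."""

    def __init__(self, servers):
        self.load = {name: 0 for name in servers}

    def assign(self, cost: int = 1) -> str:
        """Pick the least-loaded server (ties broken by name) and charge it."""
        server = min(self.load, key=lambda name: (self.load[name], name))
        self.load[server] += cost
        return server

    def release(self, server: str, cost: int = 1) -> None:
        """Credit a server when a task finishes; load never goes below zero."""
        self.load[server] = max(0, self.load[server] - cost)
```

Adding a fourth name to the constructor is all it takes to extend the pool in this sketch, which mirrors the claim that capacity grows by adding servers to the cluster.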
Further, the positioning alarm module comprises:
an identifier setting module, configured to assign each device a unique identifier so that every device is unambiguously identified, and to establish a coordinate system in which every device has its own coordinate point;
a fault direction determination module. In the DCIM system there are two classes of fault: those whose source is clear and those whose source is unclear. The module sets up real-time monitoring of equipment status faults, using sensors and data collectors to record equipment status information. Both classes include power faults and environmental faults; power faults are denoted D and environmental faults H;
a precise fault marking module, configured to mark a fault as a precise fault (denoted by the symbol X) when its source is clear. Monitoring data is analyzed to determine whether a specific fault pattern has occurred, and the occurrence of a specific fault pattern means the fault source is clear;
a fault location module, configured, for faults that remain once those with a clear source have been excluded, to determine the faulty device by analyzing the monitoring data, locate its coordinate point, and identify the specific device and fault direction;
a faulty equipment display module, configured to show the faulty device and fault direction on a floor plan with coordinate axes in the DCIM user interface, so that users can see exactly where the faulty device is and what state it is in;
an alarm notification module, configured so that, once the faulty device and fault direction have been determined, the DCIM system promptly raises an alarm and notifies the responsible personnel, who can then take corrective action quickly.
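The classification scheme above (direction D or H, with X marking a clear fault source) can be sketched as a small data model. The pattern table, the voltage and temperature limits, and the combined label format such as "XD" are all assumptions made for illustration; the source defines only the symbols D, H, and X.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Device:
    device_id: str  # unique identifier
    x: float        # coordinates on the floor plan
    y: float


@dataclass(frozen=True)
class FaultReport:
    device_id: str
    position: Tuple[float, float]
    direction: str  # "D" = power fault, "H" = environmental fault
    precise: bool   # True when the fault source is clear (symbol X)

    @property
    def label(self) -> str:
        """Combined label such as "XD" or "H" (format is an assumption)."""
        return ("X" if self.precise else "") + self.direction


# Hypothetical table of recognized fault patterns and their directions.
KNOWN_PATTERNS = {"breaker_trip": "D", "over_temperature": "H"}


def classify(device: Device, reading: dict) -> Optional[FaultReport]:
    """Return a fault report for the reading, or None if nothing is wrong."""
    position = (device.x, device.y)
    pattern = reading.get("pattern")
    if pattern in KNOWN_PATTERNS:
        # A recognized pattern means the fault source is clear.
        return FaultReport(device.device_id, position, KNOWN_PATTERNS[pattern], True)
    voltage = reading.get("voltage")
    if voltage is not None and not 200 <= voltage <= 240:  # assumed limits
        return FaultReport(device.device_id, position, "D", False)
    temperature = reading.get("temperature")
    if temperature is not None and temperature > 35:  # assumed limit
        return FaultReport(device.device_id, position, "H", False)
    return None
```

The returned position is what a display layer would plot on the floor plan, and a non-None report is the trigger for the alarm notification step.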
Further, the report generation module comprises:
an equipment status collection module, configured to collect and record the equipment status data monitored by the DCIM system, including power and environmental data;
a data analysis and collation module, configured to analyze and collate the collected equipment status data and extract data on faults, anomalies, and other important information;
an information fusion module, configured to generate equipment status reports as needed, including a list of faulty equipment, equipment status trends, and fault frequency statistics, and to record the coordinate point of each faulty device. In the DCIM user interface, fault counts are shown on a floor plan with coordinate axes using three levels: red means the device has had 5 or more faults, yellow means 2 to 4 faults, and blue means 0 or 1 fault.
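The three color levels map directly onto a threshold function. The thresholds below come from the text; the fault log format (a flat list of device identifiers, one entry per fault) and the function names are assumptions.

```python
from collections import Counter


def fault_level(count: int) -> str:
    """Map a fault count to the display color: red >= 5, yellow 2-4, blue 0-1."""
    if count < 0:
        raise ValueError("fault count cannot be negative")
    if count >= 5:
        return "red"
    if count >= 2:
        return "yellow"
    return "blue"


def levels_by_device(fault_log, devices):
    """Return the display color for each device.

    fault_log is assumed to hold one device identifier per recorded fault;
    devices with no entries count as zero faults.
    """
    counts = Counter(fault_log)
    return {device: fault_level(counts[device]) for device in devices}
```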
Compared with the prior art, the present invention has the following beneficial effects:
1. In the prior art, distributed DCIM systems stutter and their data refresh times cannot meet end-customer requirements. In the present invention, the data transmission module transmits the collected data quickly, stably, and securely over the TCP/IP protocol, which is faster and more efficient than an RS485 interface and helps shorten the response time of the whole system. The data storage module stores the collated information using both online and offline storage, with online storage provided by the storage servers. The unit linkage module runs the 3 management servers simultaneously, with load balancing implemented by the system operation control software. The positioning alarm module locates and raises alarms for monitored equipment and automatically finds the root-cause alarm, so that the root cause can be located, the problem resolved, and the fault recovered quickly. The dynamic environment monitoring system, the power monitoring system, and the BA automatic control system are connected and merged into a single management platform, which raises the processing capacity of the server cluster, spares operation and maintenance personnel from working across platforms, and greatly improves their efficiency. For a project with 2 million measurement points, the traditional approach requires 10 servers for processing and computation, whereas the integrated platform requires only 3 management servers. Interface refresh time under the traditional architecture exceeds 10 seconds, whereas the integrated approach achieves a refresh time of 6 seconds for a system with 2 million measurement points. To expand the management system, it is enough to add servers to the cluster and commission them, with no adjustment to the individual subsystems. Compared with the traditional architecture, the integrated approach can reduce investment by 15%-20% and offers considerable advantages in system application.
2. In the prior art, distributed DCIM systems do not handle data securely enough. The present invention uses the data cleaning and conversion module, the data protection module, and the data flow control module. The data deduplication module applies a hash-based deduplication algorithm to the collected data so that no duplicates remain. The data denoising module applies filtering techniques to improve the accuracy and reliability of the data. The data format conversion module converts the collected data, according to the needs of the data center, into the system's internal data format, which simplifies subsequent processing and analysis. The data normalization module then converts the data to uniform units and magnitudes for easier comparison and analysis. The data verification module checks the data with a cyclic redundancy check and, if verification fails, retransmits the data or raises an alarm. Finally, the data completion module interpolates or completes missing values from historical data, filling gaps so that the data stays complete and continuous. As a result, data is cleaned and converted during transmission, its quality and format stay consistent, and it becomes more standardized and easier to manage.
In addition, during cleaning and conversion, sensitive data is encrypted with a symmetric encryption algorithm to keep it secure in transit and in storage. Data involving personal privacy or sensitive information is masked, with the sensitive fields shielded or replaced, to protect user privacy and data security. Strict permission control restricts users' rights to access and operate on data, so that only authorized users can perform cleaning and conversion operations, which keeps the data secure and controllable. An audit and monitoring mechanism records each cleaning and conversion operation for later audit and tracing and monitors abnormal operations and data changes, so that data security risks are found and handled promptly. Abnormal-traffic detection rules are set based on the traffic monitoring and analysis results, and an alarm mechanism is configured: when abnormal traffic is detected, the alarm is triggered and data cleaning and conversion, together with data audit and monitoring, are halted to prevent data leakage. Together these measures improve the accuracy and security of the equipment information and operating status received from the dynamic environment monitoring system, the power monitoring system, and the BA system.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a topology diagram of the integrated DCIM system for ultra-large data management according to the present invention;
FIG. 2 is an overall program block diagram of the integrated DCIM system for ultra-large data management according to the present invention;
FIG. 3 is a program block diagram of the data cleaning and conversion module, the data assurance module, and the data flow control module in the integrated DCIM system for ultra-large data management according to the present invention;
FIG. 4 is a program block diagram of the positioning alarm module and the report generation module in the integrated DCIM system for ultra-large data management according to the present invention.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in those embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Referring to FIGS. 1 to 3, the present invention provides the following technical solution:
An integrated DCIM system for ultra-large data management, comprising:
a data access system, configured to:
serve as the data interface for every acquisition module of the DCIM system and as the source of equipment status feedback, receiving equipment information and operating status from the dynamic environment monitoring system, the power monitoring system, and the BA system;
a data processing system, configured to:
compute, analyze, and store the information collected at the terminal devices; provide configurable linkage control for automated linkage control of related systems or equipment; and provide complex event analysis that automatically finds the root-cause alarm, so that operations and maintenance staff can quickly locate the root cause, resolve the problem, and recover from the fault. The data processing system comprises 3 management servers and 3 storage servers: the management servers analyze and process the collected data and report and compute equipment faults, while the storage servers store all collected data in full. Information received by the data access system is uploaded to the data processing system through a switch;
a management fusion platform, configured to:
connect the dynamic environment monitoring system, the power monitoring system, and the BA automatic control system and merge them into a single management platform, which improves the processing capacity of the server cluster.
The data access system comprises a data acquisition module, a data transmission module, a data cleaning and conversion module, a data assurance module, a data flow control module, and a data sending module, wherein:
the data acquisition module is configured to:
collect data from each data source, including the substation and battery room area, the IT room area, and the chiller plant area, as well as external data sources such as cloud services and external monitoring systems. Through data acquisition the system obtains real-time data, which is the basis for subsequent processing and analysis;
the data transmission module is configured to:
transmit the collected data, ensuring fast, stable, and secure delivery; TCP/IP is used for transmission;
the data cleaning and conversion module is configured to:
clean and convert data during transmission, ensuring data quality and format consistency; it deduplicates, denoises, and converts the format of the data so that the data is more standardized and easier to manage;
the data assurance module is configured to:
ensure data security, protecting the data by means of encryption and identity authentication so that it cannot be tampered with, leaked, or accessed without authorization during transmission;
the data flow control module is configured to:
control and manage data traffic to ensure balanced and stable transmission, allocating priority and bandwidth according to the needs and resources of the data center so that critical data is transmitted in time;
the data sending module is configured to:
send the collected and organized data to the data processing system.
The data cleaning and conversion module comprises a data deduplication module, a data denoising module, and a data format conversion module;
the data deduplication module is configured to:
deduplicate the collected data using a hash-based deduplication algorithm;
the data denoising module is configured to:
denoise the data using filtering, improving the accuracy and reliability of the data;
the data format conversion module is configured to:
convert the collected data, according to the needs of the data center, into the system's internal data format, which simplifies subsequent processing and analysis.
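The deduplication and denoising steps described above could be sketched roughly as follows. The patent names neither the hash function nor the filter, so SHA-256 and a sliding median filter are assumptions chosen for illustration, as are the function names and the sample readings.

```python
import hashlib
import json
from statistics import median


def record_fingerprint(record: dict) -> str:
    """Return a SHA-256 digest of the record's canonical JSON form."""
    # sort_keys makes the digest independent of key order (assumed design).
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records: list) -> list:
    """Keep the first occurrence of each distinct record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        digest = record_fingerprint(record)
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


def median_filter(values: list, window: int = 3) -> list:
    """Replace each sample with the median of a centered window.

    The window shrinks at both ends of the series.
    """
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical readings from one measurement point.
readings = [
    {"point": "ups-01.voltage", "ts": 1, "value": 220.1},
    {"point": "ups-01.voltage", "ts": 1, "value": 220.1},  # exact duplicate
    {"point": "ups-01.voltage", "ts": 2, "value": 220.3},
]
unique = deduplicate(readings)

# The 999.0 spike is suppressed by the median of its neighbours.
smoothed = median_filter([220.1, 220.3, 999.0, 220.2, 220.4])
```

Hashing a canonical serialization treats two records as duplicates only when every field matches; a real deployment might instead hash a subset of fields such as point identifier and timestamp.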
The data cleaning and conversion module further comprises:
a data normalization module, configured to:
normalize the data to unified units and magnitudes so that values can be compared and analyzed;
a data verification module, configured to:
check the data with a cyclic redundancy check (CRC); if the check fails, the data is retransmitted or an alarm is raised;
a data completion module, configured to:
interpolate or fill in missing values from historical data, ensuring the completeness and continuity of the data.
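A minimal sketch of the normalization, CRC verification, and gap-filling steps is given below. The unit table, the frame layout (payload followed by a 4-byte big-endian CRC-32), and the use of linear interpolation are all assumptions; the patent only states that a cyclic redundancy check and interpolation from historical data are used.

```python
import zlib
from typing import List, Optional

# Hypothetical conversion factors to a base unit (W or V).
TO_BASE_UNIT = {"kW": 1000.0, "W": 1.0, "kV": 1000.0, "V": 1.0}


def normalize(value: float, unit: str) -> float:
    """Convert a reading to its base unit."""
    return value * TO_BASE_UNIT[unit]


def frame_with_crc(payload: bytes) -> bytes:
    """Append a big-endian CRC-32 of the payload (assumed frame layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify_frame(frame: bytes) -> Optional[bytes]:
    """Return the payload if the CRC matches, else None.

    A None result is where the caller would retransmit or raise an alarm.
    """
    payload, received = frame[:-4], int.from_bytes(frame[-4:], "big")
    return payload if zlib.crc32(payload) == received else None


def fill_gaps(samples: List[Optional[float]]) -> List[Optional[float]]:
    """Linearly interpolate None values that lie between two known samples.

    Leading and trailing gaps are left as None.
    """
    filled = list(samples)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


frame = frame_with_crc(b"ups-01:220.1")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit
```

Here `verify_frame(frame)` returns the payload, while `verify_frame(corrupted)` returns `None`, which is the branch that would trigger retransmission or an alarm.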
The data assurance module comprises:
a data encryption module, configured to:
encrypt sensitive data with a symmetric encryption algorithm during data cleaning and conversion, keeping the data secure in transmission and storage;
a data desensitization module, configured to:
apply data desensitization to data involving personal privacy or sensitive information during cleaning and conversion, masking or replacing the sensitive information to protect user privacy and data security;
a data permission control module, configured to:
enforce a strict data permission control mechanism during data cleaning and conversion that restricts users' access to and operations on the data, so that only authorized users can perform cleaning and conversion, keeping the data secure and controllable.
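The desensitization and permission-control behaviour described above might look like the sketch below. The masking rule (keep the first three and last two characters), the `"clean"` permission name, and the `User` type are hypothetical. Symmetric encryption is omitted here because it would normally rely on a vetted third-party cryptography library rather than hand-written code.

```python
from dataclasses import dataclass, field


def mask(value: str, keep_head: int = 3, keep_tail: int = 2) -> str:
    """Replace the middle of a string with asterisks, keeping both ends."""
    if len(value) <= keep_head + keep_tail:
        return "*" * len(value)  # too short to reveal anything safely
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


@dataclass
class User:
    name: str
    permissions: set = field(default_factory=set)


def clean_record(user: User, record: dict, sensitive_keys: set) -> dict:
    """Mask sensitive fields, but only for users allowed to run cleaning."""
    if "clean" not in user.permissions:  # hypothetical permission name
        raise PermissionError(f"{user.name} may not run data cleaning")
    return {
        key: mask(str(value)) if key in sensitive_keys else value
        for key, value in record.items()
    }


operator = User("operator", {"clean"})
guest = User("guest")
record = {"device": "ups-01", "contact_phone": "13812345678"}
cleaned = clean_record(operator, record, {"contact_phone"})
```

Calling `clean_record(guest, ...)` raises `PermissionError`, which mirrors the rule that only authorized users may perform cleaning and conversion.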
The data assurance module further comprises:
a data auditing and monitoring module, configured to:
establish an audit and monitoring mechanism for data cleaning and conversion that records cleaning and conversion operations for later auditing and tracing, and monitors abnormal operations and data changes during cleaning and conversion so that data security risks are found and handled promptly;
a data backup and recovery module, configured to:
back up data regularly and promptly during data cleaning and conversion, so that data remains recoverable and durable and is protected against loss or corruption, safeguarding its security and availability;
a data anomaly detection module, configured to:
establish an anomaly detection and alarm mechanism that monitors for abnormal conditions during data cleaning and conversion and triggers the alarm promptly once an anomaly is found.
The data flow control module comprises:
a traffic monitoring and analysis module, configured to:
monitor and analyze data traffic in real time with the Snort traffic analysis tool, so that abnormal traffic or attacks are detected promptly and the data is protected;
a traffic priority module, configured to:
send data with larger traffic volume first, ensuring that important data arrives in time;
a traffic blocking module, configured to:
set abnormal-traffic detection rules based on the traffic monitoring and analysis results and configure an alarm mechanism; when abnormal traffic is detected, the alarm is triggered and data cleaning and conversion, as well as data auditing and monitoring, are halted to prevent data leakage.
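The priority and blocking behaviour described above can be illustrated with a small in-memory controller. This is a sketch only: in the patent the detection comes from Snort-based traffic analysis, whereas here a simple per-batch size threshold stands in for the detection rule, and the class and method names are hypothetical.

```python
import heapq
from itertools import count


class TrafficController:
    """Sends larger batches first and halts processing on abnormal traffic."""

    def __init__(self, max_bytes_per_batch: int):
        # Stand-in detection rule; a real rule set would come from the
        # traffic monitoring and analysis module.
        self.max_bytes_per_batch = max_bytes_per_batch
        self.halted = False
        self.alarms = []
        self._queue = []
        self._order = count()  # tie-breaker keeps FIFO order for equal sizes

    def submit(self, name: str, size: int) -> None:
        if size > self.max_bytes_per_batch:
            self.alarms.append(f"abnormal traffic: {name} ({size} bytes)")
            self.halted = True  # cleaning, auditing and monitoring stop
            return
        # heapq is a min-heap, so the size is negated to pop the largest first.
        heapq.heappush(self._queue, (-size, next(self._order), name))

    def next_batch(self):
        """Return the name of the next batch to send, or None if halted/empty."""
        if self.halted or not self._queue:
            return None
        return heapq.heappop(self._queue)[2]


controller = TrafficController(max_bytes_per_batch=1000)
controller.submit("temperature", 10)
controller.submit("power", 500)
first = controller.next_batch()        # the larger batch goes first
controller.submit("suspicious", 5000)  # exceeds the rule: alarm and halt
after_halt = controller.next_batch()
```

Once `halted` is set, nothing further is released until an operator clears the condition; how that reset happens is outside what the text describes.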
Specifically, the data deduplication module applies a hash-based deduplication algorithm to the collected data so that the resulting data contains no duplicates. The data denoising module applies filtering to remove noise, improving the accuracy and reliability of the data. The data format conversion module converts the collected data, according to the needs of the data center, into the system's internal data format, which simplifies subsequent processing and analysis. The data normalization module then normalizes the data to unified units and magnitudes so that values can be compared and analyzed. The data verification module checks the data with a cyclic redundancy check (CRC); if the check fails, the data is retransmitted or an alarm is raised. Finally, the data completion module interpolates or fills in missing values from historical data, ensuring the completeness and continuity of the data. As a result, data can be cleaned and converted during transmission, its quality and format consistency are ensured, and deduplication, denoising, and format conversion make the data more standardized and easier to manage.
Secondly, during data cleaning and conversion, a symmetric encryption algorithm encrypts sensitive data to keep it secure in transmission and storage. Data involving personal privacy or sensitive information is desensitized (masked or replaced) during cleaning and conversion to protect user privacy and data security. A strict data permission control mechanism restricts users' access to and operations on the data, so that only authorized users can perform cleaning and conversion, keeping the data secure and controllable. An audit and monitoring mechanism records cleaning and conversion operations for later auditing and tracing, and monitors abnormal operations and data changes so that data security risks are found and handled promptly. In addition, abnormal-traffic detection rules are set based on the traffic monitoring and analysis results, and an alarm mechanism is configured: when abnormal traffic is detected, the alarm is triggered and data cleaning and conversion, as well as data auditing and monitoring, are halted to prevent data leakage. Together these measures improve the accuracy and security of the equipment information and operating status received from the dynamic environment monitoring system, the power monitoring system, and the BA system.
Referring to FIG. 1 and FIG. 4, the present invention provides the following technical solution:
The data processing system comprises:
a data storage module, which stores the collected and forwarded information using both online and offline storage, the online storage being provided by the storage servers;
a unit linkage module, which runs the 3 management servers simultaneously and balances load across them through the system operation control software;
a positioning alarm module, configured to:
locate and raise alarms for the monitored equipment and automatically find the root-cause alarm, so that operations and maintenance staff can quickly locate the root cause, resolve the problem, and recover from the fault;
a report generation module, configured to:
let operations and maintenance personnel design reports that meet their individual needs.
The positioning alarm module comprises:
an identifier setting module, which assigns each device a unique identifier so that every device is unambiguously identified, and establishes a coordinate system in which every device has its own coordinate point;
a fault direction determination module: in the DCIM system there are two classes of fault, those whose source is clear and those whose source is unclear. Equipment status faults are monitored in real time, with sensors and data collectors recording equipment status information. Both classes include power faults and environmental faults; power faults are denoted D and environmental faults are denoted H;
a precise fault marking module, which marks a fault as a precise fault (denoted by the symbol X) when its source is clear. Monitoring data is analyzed to determine whether a specific fault pattern has occurred; the occurrence of a specific fault pattern means the fault source is clear;
a fault location module, which, for faults not covered by the clear-source case, analyzes the monitoring data to determine the faulty device, locates that device by its coordinate point, and determines the specific device and direction in which the problem occurred;
a faulty equipment display module, which shows the faulty device and direction on a floor plan with coordinate axes in the DCIM system's user interface, so that users can see exactly where the faulty device is and what its status is;
an alarm notification module: once the faulty device and direction have been determined, the DCIM system promptly issues an alarm and notifies the responsible personnel so that they can take corrective action quickly.
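One way to picture how the identifier, direction, and precise-fault marking fit together is the sketch below. The pattern table, the type names, and the `"?"` marker for unclear-source faults are hypothetical; only the labels D (power), H (environmental), and X (precise) come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical fault patterns mapped to a fault direction.
# "D" = power fault, "H" = environmental fault.
KNOWN_PATTERNS = {"breaker_trip": "D", "overtemperature": "H"}


@dataclass(frozen=True)
class Device:
    device_id: str                  # unique identifier
    position: Tuple[float, float]   # coordinate point on the floor plan


@dataclass(frozen=True)
class FaultReport:
    device_id: str
    position: Tuple[float, float]
    direction: str   # "D" or "H"
    precise: bool    # True when a known pattern matched (marked "X")

    def label(self) -> str:
        mark = "X" if self.precise else "?"
        return f"{mark}{self.direction} {self.device_id} @ {self.position}"


def locate_fault(
    registry: Dict[str, Device],
    device_id: str,
    direction: str,
    pattern: Optional[str] = None,
) -> FaultReport:
    """Build a fault report; a recognized pattern makes it a precise fault."""
    device = registry[device_id]
    if pattern in KNOWN_PATTERNS:
        return FaultReport(
            device.device_id, device.position, KNOWN_PATTERNS[pattern], True
        )
    # Unclear source: fall back to the direction inferred from monitoring data.
    return FaultReport(device.device_id, device.position, direction, False)


registry = {"ups-01": Device("ups-01", (12.5, 4.0))}
precise = locate_fault(registry, "ups-01", "D", pattern="breaker_trip")
unclear = locate_fault(registry, "ups-01", "H")
```

The resulting report carries everything the display and notification modules need: the device identifier, its coordinate point, and the fault direction.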
The report generation module comprises:
an equipment status collection module, which collects and records the equipment status data monitored in the DCIM system, including power and environmental data;
a data analysis and collation module, which analyzes and organizes the collected equipment status data and extracts data about faults, anomalies, or other important information;
an information fusion module, which generates equipment status reports as needed, including a list of faulty devices, equipment status trends, and fault frequency statistics, and records the coordinate point of each faulty device. In addition, the DCIM system's user interface displays the number of faults on a floor plan with coordinate axes using three levels, red, yellow, and blue: red means the device has had 5 or more faults, yellow means 2 to 4 faults, and blue means 0 or 1 fault.
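The three-level color rule above translates directly into a small function. The thresholds come from the text; the function names and the sample event list are assumptions for illustration.

```python
from collections import Counter


def fault_level(fault_count: int) -> str:
    """Map a device's fault count to the display color."""
    if fault_count < 0:
        raise ValueError("fault count cannot be negative")
    if fault_count >= 5:
        return "red"      # 5 or more faults
    if fault_count >= 2:
        return "yellow"   # 2 to 4 faults
    return "blue"         # 0 or 1 fault


def fault_map(fault_events: list, all_devices: list) -> dict:
    """Count fault events per device and assign each device a color.

    Devices with no recorded events count as 0 faults.
    """
    counts = Counter(fault_events)
    return {device: fault_level(counts[device]) for device in all_devices}


# Hypothetical fault log: one entry per fault, keyed by device identifier.
events = ["ups-01"] * 5 + ["crac-02"] * 3 + ["pdu-07"]
levels = fault_map(events, ["ups-01", "crac-02", "pdu-07", "pdu-08"])
```

With this sample log, `levels` maps `ups-01` to red, `crac-02` to yellow, and both `pdu-07` and `pdu-08` to blue.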
Specifically, the data transmission module transmits the collected data, ensuring fast, stable, and secure delivery. Transmission uses TCP/IP, which is faster and more efficient than an RS485 interface and helps improve the response time of the whole system. The data storage module stores the collected and forwarded information using online and offline storage, the online storage being provided by the storage servers. The unit linkage module runs the 3 management servers simultaneously, with load balancing by the system operation control software. The positioning alarm module locates and raises alarms for the monitored equipment and automatically finds the root-cause alarm, so that operations and maintenance staff can quickly locate the root cause, resolve the problem, and recover from the fault.
When the system is running, the 3 management servers run simultaneously and the system operation control software balances the load, giving each server a load rate of 33.33%. If 1 server fails and leaves the cluster, the other 2 servers take over the running tasks, and the load rate of the 2 active servers is 66.67%; running on 2 servers does not affect the computation or response time of the integrated platform, which continues to run at high speed. If 2 servers fail and leave service at the same time, the 1 remaining active server carries the entire load; the system remains usable and data collection, analysis, and alarm functions are unaffected, but response time increases accordingly.
When one of the servers fails, fault monitoring of that server takes priority. Monitoring is divided into two parts: monitoring of the server itself, and monitoring of the instruments the server carries. The instruments are denoted W1, W2, W3, ..., Wn. Each instrument is monitored in two directions, dynamic environment monitoring D and power monitoring H, so that for a single instrument W1 = D1 + H1, where D1 is at least one type in D and H1 is at least one type in H. Instruments with fewer D1 + H1 items are monitored first, and so on until all have been monitored. When monitoring W1, D1 is monitored first and then H1.
Connecting the dynamic environment monitoring system, the power monitoring system, and the BA automatic control system and merging them into a single management platform not only improves the processing capacity of the server cluster but also spares operations and maintenance staff from working across platforms, greatly improving their efficiency. For a project with 2 million measurement points, the traditional approach requires 10 servers for processing, whereas the integrated platform needs only 3 management servers. Interface refresh time under the traditional system architecture is over 10 seconds, whereas the integrated technology achieves a 6-second refresh time for a system with 2 million measurement points. To expand the management system, servers are simply added to the cluster and commissioned, with no adjustment to the individual subsystems. Compared with the traditional system architecture, the integrated technology reduces investment by 15%-20% and offers significant advantages in system application.
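The monitoring order described for a failed server (instruments with fewer D1 + H1 items first, and D before H within each instrument) can be sketched as a sort followed by a flattening step. The instrument contents below are made up for illustration, and the labels follow this paragraph's usage, with D for dynamic environment monitoring and H for power monitoring.

```python
# Hypothetical instruments carried by the failed server. Each lists the
# monitored item types in direction D and direction H.
instruments = {
    "W1": {"D": ["temperature", "humidity"], "H": ["voltage"]},
    "W2": {"D": ["temperature"], "H": ["current"]},
    "W3": {"D": ["temperature", "humidity", "leak"], "H": ["voltage", "current"]},
}


def monitoring_plan(instruments: dict) -> list:
    """Return (instrument, direction, item) tuples in monitoring order.

    Instruments with fewer D + H items come first; within an instrument,
    all D items are checked before any H item.
    """
    ordered = sorted(
        instruments.items(),
        key=lambda entry: len(entry[1]["D"]) + len(entry[1]["H"]),
    )
    plan = []
    for name, directions in ordered:
        for direction in ("D", "H"):
            for item in directions[direction]:
                plan.append((name, direction, item))
    return plan


plan = monitoring_plan(instruments)
order = []
for name, _, _ in plan:
    if name not in order:
        order.append(name)  # instrument order without repeats
```

For this sample data the instruments are visited in the order W2 (2 items), W1 (3 items), W3 (5 items). Python's `sorted` is stable, so instruments with equal item counts keep their original relative order.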
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any equivalent replacement or modification made by a person skilled in the art, within the technical scope disclosed herein and in accordance with the technical solution and inventive concept of the present invention, shall fall within the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410085814.3A CN117937752A (en) | 2024-01-22 | 2024-01-22 | All-in-one DCIM system for ultra-large data management |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410085814.3A CN117937752A (en) | 2024-01-22 | 2024-01-22 | All-in-one DCIM system for ultra-large data management |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117937752A (en) | 2024-04-26 |
Family
ID=90750203
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410085814.3A Pending CN117937752A (en) | 2024-01-22 | 2024-01-22 | All-in-one DCIM system for ultra-large data management |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117937752A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118673396A (en) * | 2024-08-26 | 2024-09-20 | 合肥工业大学 | Big data platform operation and maintenance management system based on artificial intelligence |
CN119109931A (en) * | 2024-11-06 | 2024-12-10 | 河北思极科技有限公司 | Data storage and monitoring method and system for hydropower fusion terminal |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104407602A (en) * | 2014-10-29 | 2015-03-11 | 中国神华能源股份有限公司 | Method for determining electric system fault of electric locomotive |
CN110659180A (en) * | 2019-09-05 | 2020-01-07 | 国家计算机网络与信息安全管理中心 | Data center infrastructure management system based on cluster technology |
CN116566803A (en) * | 2023-06-15 | 2023-08-08 | 华章数据技术有限公司 | Line switching system and method based on flow monitoring |
CN116599776A (en) * | 2023-07-18 | 2023-08-15 | 深圳友讯达科技股份有限公司 | Smart electric meter management method, device, equipment and storage medium based on Internet of things |
CN116991678A (en) * | 2023-09-25 | 2023-11-03 | 华章数据技术有限公司 | Intelligent operation and maintenance system of data center |
CN117395695A (en) * | 2023-10-24 | 2024-01-12 | 北京红山信息科技研究院有限公司 | On-line diagnosis processing system for wireless network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117937752A (en) | All-in-one DCIM system for ultra-large data management | |
CN108270716A (en) | A kind of audit of information security method based on cloud computing | |
CN109450094A (en) | A kind of substation relay protection method for inspecting and system | |
CN107644304A (en) | A method for holographic analysis and presentation of secondary network equipment data in smart substations | |
CN110912755A (en) | System and method for network card fault monitoring and automatic recovery in cloud environment | |
CN107947367A (en) | One kind protection equipment on-line monitoring and intelligent diagnosis system | |
CN116257021A (en) | Intelligent network security situation monitoring and early warning platform for industrial control system | |
CN117596119A (en) | A device data collection and monitoring method and system based on SNMP protocol | |
CN118247939A (en) | Communication pipeline safety early warning method and system | |
CN112257069A (en) | Server security event auditing method based on flow data analysis | |
Kummerow et al. | Cyber-physical data stream assessment incorporating Digital Twins in future power systems | |
CN117729576A (en) | Alarm monitoring methods, devices, equipment and storage media | |
CN104503405B (en) | Monitoring method, device and system based on SCADA system | |
CN109639529B (en) | Diagnostic method for abnormal remote control command of intelligent substation | |
CN108156177A (en) | Information Network security postures based on big data perceive method for early warning | |
CN113824592B (en) | Quantum network management system | |
CN110532312A (en) | A kind of industry interconnection cloud platform system based on big data | |
CN111525689B (en) | Accurate two location distribution terminal monitoring management system | |
CN118889666A (en) | Distribution network line connection switch multi-source data management method and system | |
CN118473902A (en) | Method for monitoring communication content based on Internet of things | |
CN107896002A (en) | 10kV feeder loads monitor active alarm system | |
CN118055128A (en) | Data acquisition system based on cloud intelligent maintenance center | |
CN117851195A (en) | Computer host operation risk monitoring management and control system based on data analysis | |
CN110209903A (en) | A kind of industry interconnection cloud platform system based on big data | |
CN116684303A (en) | Digital twinning-based data center operation and maintenance method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||