CN102497292A - Method and system for monitoring computer cluster
- Publication number: CN102497292A
- Application number: CN201110391562XA
- Authority: CN (China)
- Prior art keywords: monitoring, node, monitored, module, monitored node
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Debugging And Monitoring (AREA)
Abstract
An embodiment of the present invention provides a method for monitoring a computer cluster, comprising the following steps: a monitored node collects operation information and sends its current load state and the monitored content information to both a parameter adjustment module and a main monitoring module; the main monitoring module receives the node's current load state and the monitored content information and stores them in a database; the parameter adjustment module analyzes the node's current load state and, when the load state reaches a preset threshold, adjusts the monitoring policy of the monitored node and notifies the monitored node of the updated policy. By tailoring the monitoring content to the system's load state, the method limits the resources consumed by the monitoring program when the system is running under high load, while still allowing cluster monitoring status and alarm information to be obtained quickly and conveniently.
Description
Technical Field
The present invention relates to the field of computer communication, and in particular to a method and system for monitoring a computer cluster.
Background
A computer cluster, or cluster for short, is a computer system in which a group of loosely integrated computers, connected through software and/or hardware, cooperate closely to carry out computing work. In a sense, the group can be regarded as a single computer. The individual computers in a cluster are usually called nodes and are usually connected by a local area network, although other kinds of connection are possible. Clusters are typically used to improve on the computing speed and/or reliability of a single computer. In general, a cluster offers a much better price-performance ratio than a single computer such as a workstation or supercomputer.
Cluster applications matter a great deal for today's growing computing demands: they can reduce computation time and make full use of server hardware resources. System administrators need to know the cluster's current running state and resource usage at all times, so the cluster must be monitored in real time.
Several mature web-based cluster monitoring products already exist, but they share two main problems: first, the monitoring content is fixed and cannot be customized; second, the timeliness and completeness of monitoring conflict with computing performance.
It is therefore necessary to propose an effective technical solution to the problems of existing web-based computer cluster monitoring.
Summary of the Invention
The present invention aims to solve at least one of the technical defects described above, in particular by adjusting the monitoring policy of the monitored nodes to optimize the monitoring performance of the system.
An embodiment of the present invention provides a method for monitoring a computer cluster, comprising the following steps:
A monitored node collects operation information and sends its current load state and the monitored content information to both a parameter adjustment module and a main monitoring module;
The main monitoring module receives the node's current load state and the monitored content information, and stores the node's load state and the monitored content information in a database;
The parameter adjustment module analyzes the node's current load state and, when the load state reaches a preset threshold, adjusts the monitoring policy of the monitored node and notifies the monitored node of the updated policy.
The solution proposed by the present invention tailors the monitoring content to the system's load state, limits the resources consumed by the monitoring program when the system is running under high load, and allows cluster monitoring status and alarm information to be obtained quickly and conveniently. In addition, the solution requires only minor changes to existing systems, does not affect system compatibility, and is simple and efficient to implement.
Additional aspects and advantages of the invention are set out in part in the description that follows; they will in part become apparent from that description, or may be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments, taken together with the accompanying drawings, in which:
FIG. 1 is a flowchart of a method for monitoring a computer cluster according to an embodiment of the present invention;
FIG. 2 is a structural diagram of a system for monitoring a computer cluster according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.
To achieve the purpose of the present invention, an embodiment provides a method for monitoring a computer cluster, comprising the following steps:
A monitored node collects operation information and sends its current load state and the monitored content information to both a parameter adjustment module and a main monitoring module;
The main monitoring module receives the node's current load state and the monitored content information, and stores the node's load state and the monitored content information in a database;
The parameter adjustment module analyzes the node's current load state and, when the load state reaches a preset threshold, adjusts the monitoring policy of the monitored node and notifies the monitored node of the updated policy.
FIG. 1 is a flowchart of the method for monitoring a computer cluster according to an embodiment of the present invention. The method comprises the following steps:
S110: A monitored node collects operation information and sends its current load state and the monitored content information to both the parameter adjustment module and the main monitoring module.
In step S110, each monitored node in the cluster first runs information collection, then sends its current load state and the monitored content to the parameter adjustment module and to the main monitoring module on the cluster management node.
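Step S110 can be pictured with a minimal sketch. The names here (`NodeReport`, `collect_and_send`, the sampler callables, and the `sinks` list) are illustrative assumptions rather than anything the patent specifies; the only point is that one report is built per collection run and the same payload is delivered to both recipients.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass
class NodeReport:
    """One collection run on a monitored node (hypothetical layout)."""
    node_id: str
    timestamp: float
    load: Dict[str, float]     # current load state of the node
    content: Dict[str, float]  # monitored content selected for this node


def collect_and_send(
    node_id: str,
    sample_load: Callable[[], Dict[str, float]],
    sample_content: Callable[[], Dict[str, float]],
    sinks: List[Callable[[str], None]],
) -> NodeReport:
    """Sample once, serialize once, and hand the payload to every sink.

    In the patent's terms the sinks would be the parameter adjustment
    module and the main monitoring module; here they are plain callables.
    """
    report = NodeReport(node_id, time.time(), sample_load(), sample_content())
    payload = json.dumps(asdict(report))
    for send in sinks:
        send(payload)
    return report
```

In a real deployment each sink would wrap a network send; plain callables keep the sketch runnable without any transport.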
S120: The main monitoring module receives the node's current load state and the monitored content information, and stores the node's load state and the monitored content information in the database.
The main monitoring module analyzes and processes the information it receives, and stores the node's state and the monitored content in the database. The main monitoring module may also provide a web service so that the monitoring results can be viewed through a web page.
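A minimal sketch of the storage side of step S120 follows, assuming SQLite as the database and a single hypothetical table `node_reports`; the patent names neither a database product nor a schema.

```python
import json
import sqlite3
from typing import Dict, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the report store and create the (assumed) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node_reports ("
        "node_id TEXT NOT NULL, ts REAL NOT NULL, "
        "load_json TEXT NOT NULL, content_json TEXT NOT NULL)"
    )
    return conn


def store_report(conn: sqlite3.Connection, node_id: str, ts: float,
                 load: Dict[str, float], content: Dict[str, float]) -> None:
    """Persist one node report; load and content are stored as JSON text."""
    conn.execute(
        "INSERT INTO node_reports VALUES (?, ?, ?, ?)",
        (node_id, ts, json.dumps(load), json.dumps(content)),
    )
    conn.commit()


def latest_report(conn: sqlite3.Connection, node_id: str) -> Optional[dict]:
    """Return the most recent report for a node, e.g. for a status page."""
    row = conn.execute(
        "SELECT ts, load_json, content_json FROM node_reports "
        "WHERE node_id = ? ORDER BY ts DESC LIMIT 1",
        (node_id,),
    ).fetchone()
    if row is None:
        return None
    return {"ts": row[0], "load": json.loads(row[1]),
            "content": json.loads(row[2])}
```

A web view of the monitoring results could be built on `latest_report`; the sketch stops at the query so that it stays independent of any web framework.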
S130: The parameter adjustment module analyzes the node's current load state and, when the load state reaches a preset threshold, adjusts the monitoring policy and notifies the monitored node of the updated policy.
The parameter adjustment module analyzes the current load. The load is computed as a combined measure of memory, CPU, average run-queue length, I/O, and network traffic volume. If the preset threshold is reached, the monitoring policy is adjusted. Adjusting the monitoring policy of the monitored node includes:
Determining the monitoring policy from changes in network response time, CPU usage, or memory usage.
For example, when the overall load of the monitored node rises:
If network response time increases, the interval between runs of the node's information collection module is lengthened; if CPU usage rises, the run priority of the node's information collection module is lowered; if memory usage rises, the lightweight monitoring engine is run on the monitored node.
For example, when the overall load of the monitored node is unchanged or falls:
If network response time decreases, the interval between runs of the node's information collection module is shortened until it returns to its default value; if CPU usage falls, the run priority of the node's information collection module is raised until it returns to its default value; if memory usage rises, the monitored node switches back to the default monitoring engine.
In all other cases, the existing parameters may be left unchanged.
In addition, if the overall load stays above the threshold for an extended period, an alarm device is contacted to raise an alarm, or the monitored node is restarted remotely.
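The adjustment rules above can be sketched as a small pure function. Everything concrete in it is an assumption made for illustration: the `Policy` and `Sample` types, the default and maximum values, doubling or halving the interval, and using a Unix-style nice value as the "priority" (a higher nice value means a lower priority). Two interpretive choices are also assumptions: tightening happens only when the load is rising and at or above the threshold, while relaxing happens whenever the load is not rising; and the engine reverts to the default when memory usage falls, by symmetry with the other two rules, although the text states that rule as memory usage rising.

```python
from dataclasses import dataclass

# Illustrative defaults and limits; none of these values come from the patent.
DEFAULT_INTERVAL_S = 10.0
MAX_INTERVAL_S = 120.0
DEFAULT_NICE = 0
MAX_NICE = 19


@dataclass(frozen=True)
class Policy:
    interval_s: float = DEFAULT_INTERVAL_S  # time between collection runs
    nice: int = DEFAULT_NICE                # higher nice = lower priority
    engine: str = "default"                 # "default" or "lightweight"


@dataclass(frozen=True)
class Sample:
    load: float        # overall load score of the node
    net_rtt_ms: float  # network response time
    cpu_pct: float     # CPU usage
    mem_pct: float     # memory usage


def adjust_policy(policy: Policy, prev: Sample, cur: Sample,
                  threshold: float) -> Policy:
    """Return the policy to apply after comparing two consecutive samples."""
    interval, nice, engine = policy.interval_s, policy.nice, policy.engine
    if cur.load > prev.load:
        if cur.load < threshold:
            return policy  # rising but still below the threshold: no change
        if cur.net_rtt_ms > prev.net_rtt_ms:
            interval = min(interval * 2, MAX_INTERVAL_S)
        if cur.cpu_pct > prev.cpu_pct:
            nice = min(nice + 5, MAX_NICE)
        if cur.mem_pct > prev.mem_pct:
            engine = "lightweight"
    else:
        # Load unchanged or falling: move each setting back toward its default.
        if cur.net_rtt_ms < prev.net_rtt_ms:
            interval = max(interval / 2, DEFAULT_INTERVAL_S)
        if cur.cpu_pct < prev.cpu_pct:
            nice = max(nice - 5, DEFAULT_NICE)
        if cur.mem_pct < prev.mem_pct:  # assumed mirror of the tightening rule
            engine = "default"
    return Policy(interval, nice, engine)
```

Keeping the function pure (old policy in, new policy out) makes the rules easy to test in isolation; notifying the monitored node of the result is a separate step.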
The method proposed above enables web-based cluster monitoring with customizable monitoring content, limits the resources consumed by the monitoring program when the system is running under high load, and allows cluster monitoring status and alarm information to be obtained quickly and conveniently.
To achieve the above purpose, as shown in FIG. 2, an embodiment of the present invention further provides a system for monitoring a computer cluster, comprising an information collection module 200, a main monitoring module 100, and a parameter adjustment module 300.
The information collection module 200 collects operation information on the monitored node and sends the node's current load state and the monitored content information to both the parameter adjustment module 300 and the main monitoring module 100.
The main monitoring module 100 receives the node's current load state and the monitored content information, and stores the node's load state and the monitored content information in the database.
The main monitoring module 100 provides a web service through which the monitored content information can be viewed on a web page.
The parameter adjustment module 300 analyzes the node's current load state and, when the load state reaches a preset threshold, adjusts the monitoring policy of the monitored node and notifies the information collection module 200 of the updated policy.
The analysis performed by the parameter adjustment module 300 on the node's current load state includes:
Analyzing one or more of the following parameters of the monitored node:
Memory usage, CPU running state, run-queue length, disk I/O, process groups, and network transfer rate.
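One simple way to fold parameters like those listed above into a single load figure is a weighted average of normalized metrics. The patent does not give a formula, so the weights, the metric names, and the normalization to the range 0 to 1 below are all assumptions; process groups are left out because they are not a single numeric reading.

```python
from typing import Dict

# Hypothetical weights; the source does not specify how metrics are combined.
WEIGHTS: Dict[str, float] = {
    "mem": 0.25,      # memory usage
    "cpu": 0.30,      # CPU usage
    "runq": 0.15,     # run-queue length, pre-normalized by the caller
    "disk_io": 0.15,  # disk I/O utilization
    "net": 0.15,      # network transfer rate relative to link capacity
}


def overall_load(metrics: Dict[str, float],
                 weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of metrics, each clamped to the range [0, 1].

    Metrics missing from the input are treated as 0.
    """
    total = sum(weights.values())
    score = 0.0
    for name, weight in weights.items():
        value = min(max(metrics.get(name, 0.0), 0.0), 1.0)
        score += weight * value
    return score / total
```

The resulting score is what a preset threshold would be compared against; choosing the weights is a tuning decision for the particular cluster.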
The adjustment of the monitoring policy by the parameter adjustment module 300 includes:
Determining the monitoring policy from changes in network response time, CPU usage, or memory usage.
For example, when the overall load of the monitored node rises:
If network response time increases, the interval between runs of the node's information collection module is lengthened; if CPU usage rises, the run priority of the node's information collection module is lowered; if memory usage rises, the lightweight monitoring engine is run on the monitored node.
For example, when the overall load of the monitored node is unchanged or falls:
If network response time decreases, the interval between runs of the node's information collection module is shortened until it returns to its default value; if CPU usage falls, the run priority of the node's information collection module is raised until it returns to its default value; if memory usage rises, the monitored node switches back to the default monitoring engine.
In all other cases, the existing parameters may be left unchanged.
In addition, if the overall load stays above the threshold for an extended period, an alarm device is contacted to raise an alarm, or the monitored node is restarted remotely.
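The sustained-overload condition in the last rule can be sketched as a small state machine. The class name `OverloadWatch` and the `hold_s` parameter are assumptions; the source says only that the load exceeds the threshold "for an extended period" and does not quantify it.

```python
from typing import Optional


class OverloadWatch:
    """Reports True once load has stayed above a threshold for hold_s seconds."""

    def __init__(self, threshold: float, hold_s: float) -> None:
        self.threshold = threshold
        self.hold_s = hold_s
        self._since: Optional[float] = None  # start of the current overload

    def update(self, load: float, now: float) -> bool:
        """Feed one load reading taken at time `now` (seconds)."""
        if load <= self.threshold:
            self._since = None  # overload ended, reset the timer
            return False
        if self._since is None:
            self._since = now
        return now - self._since >= self.hold_s
```

A caller would trigger the alarm, or the remote restart, when `update` returns `True`; which of the two actions to take is left to the cluster's own fault-handling plan.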
It should be understood that FIG. 2 groups the units or modules proposed by the present invention into a single block only for ease of description. The units or modules may equally be implemented as separate modules in a concrete computer network system. For example, the information collection module 200 and the parameter adjustment module 300 may be placed on the monitored node and the main monitoring module 100 on a monitoring host, and so on.
For example, the overall structure of the system is as follows:
The information collection program in the information collection module 200 runs on the monitored nodes and monitors the cluster to collect the running state of the cluster nodes and the information to be monitored. Each node communicates directly with the main monitoring module 100. The information collection module holds several policies, and the monitoring content can be customized through the extension interface provided by the main monitoring module 100. The main monitoring program in the main monitoring module 100 runs on the monitoring host, gathers the data from every information collection program, and saves it in the database. The parameter adjustment program in the parameter adjustment module 300 adjusts each node's monitoring policy according to that node's operating load. The alarm device, following the cluster system's preset fault-handling plan, sends e-mail and/or SMS alerts or remotely restarts the monitored node.
For example, the information collection program consists of a main module, a communication module, and several functional modules. The main module receives instructions from the parameter adjustment program and configures the functional modules. The functional modules comprise a cluster status and load monitoring module, a lightweight monitoring engine, and a default monitoring engine; the default monitoring engine lets users customize the monitored objects by configuring user scripts.
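The structure of the information collection program described above might look like the following sketch. The class names, the `register` method standing in for user-script configuration, and the instruction format (a dict with `engine` and `interval_s` keys) are all hypothetical; the communication module is omitted.

```python
from typing import Callable, Dict

Probe = Callable[[], float]  # one measurement, e.g. a wrapped user script


class LightweightEngine:
    """Collects only a fixed, small set of core probes."""

    def __init__(self, probes: Dict[str, Probe]) -> None:
        self._probes = dict(probes)

    def collect(self) -> Dict[str, float]:
        return {name: probe() for name, probe in self._probes.items()}


class DefaultEngine(LightweightEngine):
    """Core probes plus user-registered probes (custom monitored objects)."""

    def register(self, name: str, probe: Probe) -> None:
        self._probes[name] = probe


class Collector:
    """Main module: applies instructions and runs the active engine."""

    def __init__(self, default: DefaultEngine,
                 lightweight: LightweightEngine) -> None:
        self._engines = {"default": default, "lightweight": lightweight}
        self.active = "default"
        self.interval_s = 10.0  # illustrative default interval

    def apply_instruction(self, instruction: dict) -> None:
        """Apply an instruction from the parameter adjustment program."""
        engine = instruction.get("engine", self.active)
        if engine not in self._engines:
            raise ValueError(f"unknown engine: {engine}")
        self.active = engine
        self.interval_s = float(instruction.get("interval_s", self.interval_s))

    def collect_once(self) -> Dict[str, float]:
        return self._engines[self.active].collect()
```

Switching engines changes only which probes run, so a node under memory pressure can drop its custom probes without restarting the collector.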
For example, the parameter adjustment program includes a policy selector that switches the priority, time interval, and monitoring engine according to the load state.
The device proposed above enables web-based cluster monitoring with customizable monitoring content, limits the resources consumed by the monitoring program when the system is running under high load, and allows cluster monitoring status and alarm information to be obtained quickly and conveniently.
Those of ordinary skill in the art will understand that all or some of the steps of the method embodiments above can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination of them.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, may each exist as a separate physical unit, or may be integrated two or more to a module. Such an integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
The above describes only some embodiments of the present invention. It should be noted that those of ordinary skill in the art may make further improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110391562XA CN102497292A (en) | 2011-11-30 | 2011-11-30 | Method and system for monitoring computer cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110391562XA CN102497292A (en) | 2011-11-30 | 2011-11-30 | Method and system for monitoring computer cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102497292A (en) | 2012-06-13 |
Family
ID=46189080
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201110391562XA Pending CN102497292A (en) | 2011-11-30 | 2011-11-30 | Method and system for monitoring computer cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102497292A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030101253A1 (en) * | 2001-11-29 | 2003-05-29 | Takayuki Saito | Method and system for distributing data in a network |
US7564776B2 (en) * | 2004-01-30 | 2009-07-21 | Alcatel-Lucent Usa Inc. | Method for controlling the transport capacity for data transmission via a network, and network |
CN101207550A (en) * | 2007-03-16 | 2008-06-25 | 中国科学技术大学 | Load balancing system and method for realizing load balancing of multiple services |
CN101499935A (en) * | 2008-01-30 | 2009-08-05 | 中兴通讯股份有限公司 | Alarm processing method for WiMAX base station |
CN101442561A (en) * | 2008-12-12 | 2009-05-27 | 南京邮电大学 | Method for monitoring grid based on vector machine support |
CN101505302A (en) * | 2009-02-26 | 2009-08-12 | 中国联合网络通信集团有限公司 | Dynamic regulating method and system for security policy |
CN101667034A (en) * | 2009-09-21 | 2010-03-10 | 北京航空航天大学 | Scalable monitoring system supporting hybrid clusters |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103116538B (en) * | 2013-01-25 | 2016-11-30 | 浪潮电子信息产业股份有限公司 | A kind of design for computing power self-regulating system |
CN103116538A (en) * | 2013-01-25 | 2013-05-22 | 浪潮电子信息产业股份有限公司 | Design for computer performance self-adjusting system |
CN103105923B (en) * | 2013-03-07 | 2015-05-27 | 鄂尔多斯市云泰互联科技有限公司 | Energy-efficient scheduling method and system for information technology (IT) business of cloud computing center |
CN103105923A (en) * | 2013-03-07 | 2013-05-15 | 鄂尔多斯市云泰互联科技有限公司 | Energy-efficient scheduling method and system for information technology (IT) business of cloud computing center |
CN104104536A (en) * | 2013-04-15 | 2014-10-15 | 北京中嘉时代科技有限公司 | Strategy-based self-adjusting concurrent polling monitoring method and device |
CN104104536B (en) * | 2013-04-15 | 2018-08-17 | 北京中嘉时代科技有限公司 | A kind of concurrent poll monitoring method of self-regulation and device based on strategy |
CN103268224A (en) * | 2013-05-08 | 2013-08-28 | 中国科学院微电子研究所 | Software running platform based on web access mode |
CN103533058A (en) * | 2013-10-17 | 2014-01-22 | 南京大学镇江高新技术研究院 | HDFS (Hadoop distributed file system)/Hadoop storage cluster-oriented resource monitoring system and HDFS/Hadoop storage cluster-oriented resource monitoring method |
CN103533058B (en) * | 2013-10-17 | 2017-02-08 | 南京大学镇江高新技术研究院 | HDFS (Hadoop distributed file system)/Hadoop storage cluster-oriented resource monitoring system and HDFS/Hadoop storage cluster-oriented resource monitoring method |
CN104898509A (en) * | 2015-04-30 | 2015-09-09 | 杭州谱谐特科技有限公司 | Industrial control computer monitoring method and system based on secure short message |
CN104898509B (en) * | 2015-04-30 | 2018-04-27 | 杭州谱谐特科技有限公司 | A kind of industrial personal computer monitoring method and system based on secure short message |
CN104954178B (en) * | 2015-05-29 | 2019-02-15 | 北京奇虎科技有限公司 | Method and device for optimizing system alarm |
CN104954178A (en) * | 2015-05-29 | 2015-09-30 | 北京奇虎科技有限公司 | Method and device for optimizing system alarm |
CN104834584A (en) * | 2015-06-04 | 2015-08-12 | 深圳市中博科创信息技术有限公司 | Method and system for monitoring host computer hardware loads |
CN104834584B (en) * | 2015-06-04 | 2017-07-11 | 深圳市中博科创信息技术有限公司 | A kind of method and system for monitoring host hardware load |
CN105024880A (en) * | 2015-07-17 | 2015-11-04 | 哈尔滨工程大学 | A Resilient Monitoring Method for Mission-Critical Computer Clusters |
CN110222923A (en) * | 2015-09-11 | 2019-09-10 | 福建师范大学 | Dynamically configurable big data analysis system |
CN105515838A (en) * | 2015-11-26 | 2016-04-20 | 青岛海信传媒网络技术有限公司 | Service configuration method and HA (High Available) cluster system |
CN106802853A (en) * | 2017-02-17 | 2017-06-06 | 郑州云海信息技术有限公司 | A kind of system of selection and device based on many monitor modes |
CN106802853B (en) * | 2017-02-17 | 2020-08-21 | 苏州浪潮智能科技有限公司 | Selection method and device based on multiple monitoring modes |
CN108449396A (en) * | 2018-03-07 | 2018-08-24 | 精硕科技(北京)股份有限公司 | Distributed Hadoop cluster management methods, main control end and controlled end |
CN109614302A (en) * | 2018-11-28 | 2019-04-12 | 华为技术服务有限公司 | Service rate adjustment method and device, and related equipment |
CN111405246A (en) * | 2020-03-12 | 2020-07-10 | 高宽友 | Smart city monitoring method and device and management terminal |
CN111405246B (en) * | 2020-03-12 | 2021-04-06 | 厦门宇昊软件有限公司 | Smart city monitoring method and device and management terminal |
CN114218042A (en) * | 2021-12-14 | 2022-03-22 | 中国电信股份有限公司 | Information processing method, device and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20120613 |