CN114328122A - IO full life cycle time delay monitoring method and related device - Google Patents
- Publication number
- CN114328122A (application number CN202111676322.4A)
- Authority
- CN
- China
- Prior art keywords
- monitoring
- delay
- life cycle
- monitoring point
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Computer Hardware Design (AREA)
- Debugging And Monitoring (AREA)
Description
Technical Field
The present application relates to the field of computer technology, and in particular to an IO full life cycle delay monitoring method, an IO full life cycle delay monitoring apparatus, a server, and a computer-readable storage medium.
Background
With the continuous development of information technology, network storage systems are updated ever faster. During development and testing, or during on-site deployment, the IO performance of a network storage system may fall short of expectations because of factors such as system design or the hardware environment. Investigating performance factors directly in the software and hardware environment is too inefficient to locate bottlenecks quickly, so the performance of the network storage system needs to be tested.
In the related art, performance bottlenecks can be located quickly, and optimization guided, by monitoring and collecting IO (Input/Output) delay data at each stage of the full life cycle. However, existing IO monitoring techniques focus only on the IO performance cost inside a single system and cannot gather overall performance data across the full IO life cycle. This reduces the accuracy of test data for the network storage system, so the actual delay of the storage system cannot be reflected accurately.
Therefore, how to improve the accuracy of testing a network storage system is a key concern for those skilled in the art.
Summary of the Invention
The purpose of the present application is to provide an IO full life cycle delay monitoring method, an IO full life cycle delay monitoring apparatus, a server, and a computer-readable storage medium that improve the accuracy and precision of delay monitoring.
To solve the above technical problem, the present application provides an IO full life cycle delay monitoring method, comprising:
setting monitoring points on an initiator and on a target respectively;
when a monitoring point is reached while the initiator is accessing the target, obtaining current IO feature information based on that monitoring point, and obtaining a timestamp from a high-precision timing module;
performing delay statistics on all the timestamps and all the IO feature information to obtain delay monitoring data.
Optionally, setting monitoring points on the initiator and on the target respectively comprises:
analyzing the IO access process to obtain multiple stage points;
setting monitoring points at the corresponding stage points in the initiator and the target.
Optionally, before setting the monitoring points, the method further comprises:
deploying a monitoring point collection plug-in and a high-precision timing module on the initiator and on the target respectively.
Optionally, obtaining current IO feature information based on the monitoring point and obtaining a timestamp from the high-precision timing module, when the monitoring point is reached while the initiator is accessing the target, comprises:
when the monitoring point is reached while the initiator is accessing the target, accessing the high-precision timing module through the monitoring point collection plug-in and obtaining the timestamp;
obtaining the current IO feature information through the monitoring point collection plug-in.
Optionally, the IO feature information includes port pair information, an IO unique identifier, and current processing stage information.
Optionally, performing delay statistics on all the timestamps and all the IO feature information to obtain delay monitoring data comprises:
performing delay statistics on all the timestamps based on the same IO unique identifier and the same current processing stage information to obtain the delay monitoring data.
Optionally, the method further comprises:
determining, based on the delay monitoring data, the stage information whose delay exceeds a threshold;
sending a maintenance request according to the stage information.
The present application further provides an IO full life cycle delay monitoring apparatus, comprising:
a monitoring point setting module, configured to set monitoring points on the initiator and on the target respectively;
an execution information acquisition module, configured to, when a monitoring point is reached while the initiator is accessing the target, obtain current IO feature information based on that monitoring point and obtain a timestamp from the high-precision timing module;
a delay statistics module, configured to perform delay statistics on all the timestamps and all the IO feature information to obtain delay monitoring data.
The present application further provides a server, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the IO full life cycle delay monitoring method described above when executing the computer program.
The present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the IO full life cycle delay monitoring method described above.
The IO full life cycle delay monitoring method provided by the present application comprises: setting monitoring points on an initiator and on a target respectively; when a monitoring point is reached while the initiator is accessing the target, obtaining current IO feature information based on that monitoring point and obtaining a timestamp from a high-precision timing module; and performing delay statistics on all the timestamps and all the IO feature information to obtain delay monitoring data.
Monitoring points are set on the initiator and on the target. When execution reaches a monitoring point, the current IO feature information is obtained based on that point and a timestamp is obtained from the high-precision timing module, and delay statistics are then computed. Information is thus captured at every monitoring point on the devices at both ends throughout the access, so delay is monitored across multiple ends rather than only as the IO performance of a single system, which improves the accuracy and precision of delay monitoring.
The present application further provides an IO full life cycle delay monitoring apparatus, a server, and a computer-readable storage medium, which have the same beneficial effects; these are not repeated here.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below are merely embodiments of the present application; those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of an IO full life cycle delay monitoring method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of the system structure for an IO full life cycle delay monitoring method provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a high-precision timing module provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an IO full life cycle delay monitoring apparatus provided by an embodiment of the present application.
Detailed Description
The core of the present application is to provide an IO full life cycle delay monitoring method, an IO full life cycle delay monitoring apparatus, a server, and a computer-readable storage medium that improve the accuracy and precision of delay monitoring.
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present application.
In the related art, performance bottlenecks can be located quickly, and optimization guided, by monitoring and collecting IO delay data at each stage of the full life cycle. However, existing IO monitoring techniques focus only on the IO performance cost inside a single system and cannot gather overall performance data across the full IO life cycle. This reduces the accuracy of test data for the network storage system, so the actual delay of the storage system cannot be reflected accurately. Here, the IO full life cycle runs from the moment the server initiates an IO until the server has finished receiving the IO data returned by the storage system over the network.
The present application therefore provides an IO full life cycle delay monitoring method. Monitoring points are set on the initiator and on the target. When execution reaches a monitoring point, the current IO feature information is obtained based on that point and a timestamp is obtained from the high-precision timing module, and delay statistics are then computed. Information is thus captured at every monitoring point on the devices at both ends throughout the access, so delay is monitored across multiple ends rather than only as the IO performance of a single system, which improves the accuracy and precision of delay monitoring.
An IO full life cycle delay monitoring method provided by the present application is described below through an embodiment.
Please refer to FIG. 1, which is a flowchart of an IO full life cycle delay monitoring method provided by an embodiment of the present application.
In this embodiment, the method may include:
S101, setting monitoring points on the initiator and on the target respectively;
This step sets monitoring points in both the initiator device and the target device involved in IO processing. That is, corresponding monitoring points are set at both ends of the IO processing path instead of attending only to IO performance inside a single system. With these monitoring points, IO performance can be monitored on the initiator and on the target separately, covering every endpoint involved in IO processing.
A monitoring point may be a hook function placed in the processing flow: when the processing flow activates the hook function, information is collected through it. A monitoring point may also be a monitoring program that observes the processing flow executing on the device and collects the corresponding data when the flow reaches a stage that matches a configured monitoring point.
Further, to improve the accuracy with which monitoring points are set, this step may include:
Step 1, analyzing the IO access process to obtain multiple stage points;
Step 2, setting monitoring points at the corresponding stage points in the initiator and the target.
This optional solution explains how the monitoring points are set. The IO access process is analyzed to determine which stages it passes through, and the starting point of each stage is that stage's stage point. Because different stages of the IO access process execute on different devices, monitoring points are set at the stage points that belong to the initiator and, likewise, at the stage points that belong to the target.
Further, before the monitoring points are set, this embodiment also includes:
deploying a monitoring point collection plug-in and a high-precision timing module on the initiator and on the target respectively.
This optional solution improves the accuracy of information acquisition and avoids inaccurate time readings. A monitoring point collection plug-in and a high-precision timing module are deployed on the initiator and on the target respectively. The monitoring point collection plug-in collects the IO feature information at each monitoring point, and the high-precision timing module supplies the timestamps.
S102, when a monitoring point is reached while the initiator is accessing the target, obtaining current IO feature information based on that monitoring point, and obtaining a timestamp from the high-precision timing module;
Building on S101, this step collects data during the access itself.
When the access process reaches a monitoring point, that monitoring point is triggered; the current IO feature information is then obtained based on it, and a timestamp is obtained from the high-precision timing module.
Further, this step may include:
Step 1, when a monitoring point is reached while the initiator is accessing the target, accessing the high-precision timing module through the monitoring point collection plug-in and obtaining the timestamp;
Step 2, obtaining the current IO feature information through the monitoring point collection plug-in.
This optional solution explains how the information is obtained: the monitoring point collection plug-in both reads the timestamp from the high-precision timing module and collects the current IO feature information. The high-precision timing module relies on the time signal and the PPS (Pulse Per Second) signal of a satellite system, and can therefore provide a unified, tightly synchronized time calibration source for devices that are far apart.
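The two collection steps can be sketched in Python as follows. This is a sketch under stated assumptions, not the plug-in described in the application: `CollectionPlugin`, its `collect` method, and the record layout are hypothetical, and `time.monotonic_ns` is used only as a local stand-in for the high-precision timing module (a local monotonic clock is not synchronized across hosts).

```python
import time
from typing import Callable, Dict, List, Tuple


class CollectionPlugin:
    """Hypothetical monitoring point collection plug-in."""

    def __init__(self, read_time_ns: Callable[[], int]) -> None:
        # read_time_ns stands in for access to the high-precision timing module.
        self._read_time_ns = read_time_ns
        self.records: List[Dict[str, object]] = []

    def collect(self, port_pair: Tuple[str, str], exchange_id: int, tag: str) -> None:
        # Step 1: obtain the timestamp from the time source.
        timestamp_ns = self._read_time_ns()
        # Step 2: record the current IO feature information with that timestamp.
        self.records.append(
            {
                "port_pair": port_pair,
                "exchange_id": exchange_id,
                "tag": tag,
                "time_ns": timestamp_ns,
            }
        )


# Stand-in clock only; a real deployment would read the timing module instead.
plugin = CollectionPlugin(time.monotonic_ns)
plugin.collect(("initiator_port", "target_port"), exchange_id=7, tag="initiator_submit")
plugin.collect(("initiator_port", "target_port"), exchange_id=7, tag="target_receive")
```

Passing the time source in as a callable keeps the collection logic independent of how the timestamp is actually produced.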
S103, performing delay statistics on all the timestamps and all the IO feature information to obtain delay monitoring data.
Building on S102, this step computes delay statistics from the collected timestamps and IO feature information.
Once the timestamps and IO feature information have been collected, the timestamps give the time elapsed between successive monitoring points, and therefore the time spent in each stage of the IO access process. This provides delay monitoring of the entire IO flow.
The IO feature information includes port pair information, an IO unique identifier, and current processing stage information.
Accordingly, this step may include:
performing delay statistics on all the timestamps based on the same IO unique identifier and the same current processing stage information to obtain the delay monitoring data.
In this optional solution, the IO unique identifier and the current processing stage information are used to pick out, from the collected data, the time information belonging to a single IO access, and the delay of each stage is then determined from that time information.
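The grouping described above can be illustrated with a short Python sketch. The record layout, the stage tags, and the timestamp values are made up for illustration, and the sketch assumes that sorting one IO's records by timestamp reproduces the order of its stages.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (IO unique identifier, stage tag, timestamp in ns); values are illustrative.
records: List[Tuple[int, str, int]] = [
    (1, "initiator_submit", 1_000),
    (2, "initiator_submit", 1_100),
    (1, "target_receive", 1_400),
    (1, "target_complete", 2_000),
    (2, "target_receive", 1_900),
    (1, "initiator_complete", 2_300),
]


def stage_delays(recs: List[Tuple[int, str, int]]) -> Dict[int, List[Tuple[str, int]]]:
    """Group records by IO identifier and diff the timestamps of adjacent stages."""
    by_io: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    for exchange_id, tag, ts in recs:
        by_io[exchange_id].append((ts, tag))

    result: Dict[int, List[Tuple[str, int]]] = {}
    for exchange_id, events in by_io.items():
        events.sort()  # order one IO's records by timestamp
        result[exchange_id] = [
            (f"{prev_tag}->{tag}", ts - prev_ts)
            for (prev_ts, prev_tag), (ts, tag) in zip(events, events[1:])
        ]
    return result


delays = stage_delays(records)
# delays[1] holds three stage delays for IO 1: 400 ns, 600 ns, and 300 ns.
```

Records from different IOs may be interleaved in the collected data; grouping by the identifier first is what keeps their timestamps from being mixed.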
In addition, this embodiment may further include:
Step 1, determining, based on the delay monitoring data, the stage information whose delay exceeds a threshold;
Step 2, sending a maintenance request according to the stage information.
This optional solution explains how a maintenance request is issued: the stages whose delay exceeds the threshold are identified from the delay monitoring data, and a maintenance request is sent for those stages.
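A minimal Python sketch of the threshold check follows. The function names, the request format, the stage names, and the 500 ns threshold are all assumptions made for illustration; the application does not define the form of a maintenance request or how it is delivered.

```python
from typing import Dict, List


def find_slow_stages(stage_delay_ns: Dict[str, int], threshold_ns: int) -> List[str]:
    """Return the stages whose delay is strictly greater than the threshold."""
    return [stage for stage, delay in stage_delay_ns.items() if delay > threshold_ns]


def build_maintenance_request(stages: List[str]) -> Dict[str, object]:
    """Hypothetical request payload; sending it is outside this sketch."""
    return {"type": "maintenance", "stages": stages}


# Illustrative per-stage delays for one IO, in nanoseconds.
stage_delay_ns = {
    "initiator_submit->target_receive": 400,
    "target_receive->target_complete": 600,
    "target_complete->initiator_complete": 300,
}

slow = find_slow_stages(stage_delay_ns, threshold_ns=500)
request = build_maintenance_request(slow)
```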
In summary, this embodiment sets monitoring points on the initiator and on the target. When execution reaches a monitoring point, the current IO feature information is obtained based on that point and a timestamp is obtained from the high-precision timing module, and delay statistics are then computed. Information is thus captured at every monitoring point on the devices at both ends throughout the access, so delay is monitored across multiple ends rather than only as the IO performance of a single system, which improves the accuracy and precision of delay monitoring.
An IO full life cycle delay monitoring method provided by the present application is further described below through a specific embodiment.
Please refer to FIG. 2, which is a schematic diagram of the system structure for an IO full life cycle delay monitoring method provided by an embodiment of the present application.
In this embodiment, the system corresponding to the method may include an initiator, a target, and so on.
The initiator generally refers to a server or other service host that accesses the network storage system. The target refers to the storage system that provides services such as block and file storage IO. The storage network refers to a storage network such as an FC network or Ethernet. The IO monitoring point collector is an IO data collection plug-in deployed on the initiator and on the target. It is responsible for accessing the high-precision timing module to obtain precise synchronized time, and it uses a real-time kernel driver to reduce software access delay.
The feature data recorded when IO monitoring collects information at key nodes on the IO path include: PORT_PAIR (in an FC (Fibre Channel) network, the port pair formed by the remote port and the local port, which uniquely identifies an IO link in the system; on Ethernet, an IP (Internet Protocol) address and port uniquely identify an IO link); Exchange ID (in an FC network, the unique ID that identifies an IO; on Ethernet, the packet SequenceID is used instead); Time (the timestamp, read as the current time of the timing module); and Tag (which identifies the processing stage within the IO full life cycle).
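The four recorded fields can be pictured as one record type. The following Python sketch is illustrative only: the class name `IoRecord`, the field types, and the sample values (port names, addresses, IDs) are assumptions, not details given in the application.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IoRecord:
    """One sample taken at a monitoring point (hypothetical layout)."""

    port_pair: Tuple[str, str]  # (remote, local): FC port pair, or IP:port endpoints on Ethernet
    exchange_id: int            # FC Exchange ID, or the packet SequenceID on Ethernet
    time_ns: int                # timestamp read from the timing module
    tag: str                    # processing stage within the IO full life cycle


# Made-up sample values for an FC link and an Ethernet link.
fc_record = IoRecord(("remote_fc_port", "local_fc_port"), 0x1A2B, 1_000, "initiator_submit")
eth_record = IoRecord(("10.0.0.2:3260", "10.0.0.1:51000"), 42, 1_400, "target_receive")
```

Making the record immutable (`frozen=True`) is a design choice of this sketch: a sample taken at a monitoring point should not change after it is collected.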
The high-precision timing module provides precisely synchronized timestamps for IO monitoring and collection in different systems, ensuring that IO timestamps are consistent across systems. It uses GPS (Global Positioning System) or the BeiDou satellite system to achieve high-precision clock synchronization, with an accuracy within 100 ns.
Please refer to FIG. 3, which is a schematic structural diagram of a high-precision timing module provided by an embodiment of the present application.
The high-precision timing module works as shown in FIG. 3. Relying on the time signal and the PPS signal of a satellite system, it can provide a unified, tightly synchronized time calibration source for devices that are far apart. To preserve acquisition accuracy, the module reads its internal high-speed counter (assume a count period of F1) before message processing and again before the time information is sent, recording the counter values T1 and T2. After the time information is encoded, it is sent to the host together with T1 and T2. When the host driver receives the interrupt request issued by the timing module, it first reads the high-speed timer inside the CPU (central processing unit), giving C1, and then decodes the time information to obtain the time T3. When a time acquisition request reads the time information, the CPU's internal high-speed timer (assume a count period of F2) is read again, giving C2. Using T1, T2, C1, and C2 removes as much as possible of the delay introduced by software processing and so preserves time accuracy. The actual time is T = T3 + |T2 - T1| * F1 + |C2 - C1| * F2.
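The correction formula can be checked with a small worked example in Python. The function name and all input values are made up for illustration; the sketch assumes T3 is expressed in nanoseconds and that F1 and F2 are the count periods in nanoseconds per tick.

```python
def corrected_time_ns(
    t3_ns: int, t1: int, t2: int, f1_ns: int, c1: int, c2: int, f2_ns: int
) -> int:
    """T = T3 + |T2 - T1| * F1 + |C2 - C1| * F2, with every term in nanoseconds."""
    module_delay_ns = abs(t2 - t1) * f1_ns  # time spent inside the timing module
    host_delay_ns = abs(c2 - c1) * f2_ns    # time spent in the host driver
    return t3_ns + module_delay_ns + host_delay_ns


# Illustrative values: 50 module ticks of 10 ns and 600 CPU ticks of 2 ns.
t = corrected_time_ns(
    t3_ns=1_000_000_000, t1=100, t2=150, f1_ns=10, c1=2_000, c2=2_600, f2_ns=2
)
# t == 1_000_000_000 + 500 + 1_200 == 1_000_001_700
```

With these numbers the decoded time T3 is moved forward by 1.7 µs, which is the sum of the two processing delays measured by the counters.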
Based on the above description, the steps in this embodiment may include:
Step 1, deploying the IO monitoring point collection plug-in and the high-precision timing module on the initiator and on the target respectively;
Step 2, analyzing the IO life cycle to determine the monitoring stage points; once the IO monitoring nodes are determined, calls to the IO monitoring point collector are embedded in the processing flow of the relevant nodes on the target and on the initiator;
Step 3, the initiator initiates IO access to the target storage system;
Step 4, when the IO access passes a monitoring stage point embedded in Step 2, the registered IO monitoring point collector is called; it collects the timestamp provided by the timing module and the IO feature information, including PORT_PAIR, Exchange ID, and the Tag attribute of the current stage point;
Step 5, after the entire IO life cycle ends, the collected data are gathered in one place for performance statistics;
Step 6, the collected IO feature data are analyzed to obtain the delay statistics of each stage in a complete IO processing flow: records with the same Exchange ID belong to the same IO, and the timestamp difference between adjacent Tags is the IO processing delay of that stage;
Step 7, the stages with relatively large delays are compared and analyzed to confirm whether the software processing and hardware configuration are reasonable.
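Steps 5 to 7 can be sketched as a small aggregation in Python. The stage names and delay values are made up, and ranking stages by mean delay is one possible way to surface the stages with large delays; the application does not prescribe a particular statistic.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# (Exchange ID, stage, delay in ns) gathered from many IOs; values are illustrative.
samples: List[Tuple[int, str, int]] = [
    (1, "initiator->network", 400),
    (1, "target_processing", 600),
    (1, "network->initiator", 300),
    (2, "initiator->network", 800),
    (2, "target_processing", 700),
    (2, "network->initiator", 200),
]


def rank_stages(data: List[Tuple[int, str, int]]) -> List[Tuple[str, float, int]]:
    """Return (stage, mean delay, max delay) sorted by mean delay, largest first."""
    per_stage: Dict[str, List[int]] = defaultdict(list)
    for _exchange_id, stage, delay_ns in data:
        per_stage[stage].append(delay_ns)
    stats = [(stage, mean(values), max(values)) for stage, values in per_stage.items()]
    return sorted(stats, key=lambda item: item[1], reverse=True)


ranking = rank_stages(samples)
# The first entry is the stage to inspect first when checking software and hardware.
```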
It can be seen that this embodiment sets monitoring points on both the initiator and the target. When execution reaches a monitoring point, the current IO feature information is obtained at that point and a timestamp is obtained from the high-precision timing module; delay statistics are then computed. Information is thus captured at every monitoring point on the devices at both ends throughout the access, so delay is monitored across multiple ends rather than only as the IO performance of a single system, which improves the accuracy and precision of delay monitoring.
The IO full life cycle delay monitoring device provided by the embodiments of the present application is introduced below. The device described below and the IO full life cycle delay monitoring method described above may be read with reference to each other.
Please refer to FIG. 4, which is a schematic structural diagram of an IO full life cycle delay monitoring device provided by an embodiment of the present application.
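In this embodiment, the device may include:
a monitoring point setting module 100, configured to set monitoring points on the initiator and on the target;
an execution information acquisition module 200, configured to, when execution reaches a monitoring point while the initiator is accessing the target, obtain the current IO feature information based on that monitoring point and obtain a timestamp from the high-precision timing module;
a delay data statistics module 300, configured to perform delay statistics on all timestamps and all IO feature information to obtain delay monitoring data.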
Optionally, the monitoring point setting module 100 is specifically configured to analyze the IO access process to obtain multiple stage points, and to set monitoring points at the corresponding stage points on the initiator and on the target.
Optionally, the device may further include:
a plug-in deployment module, configured to deploy the monitoring point collection plug-in and the high-precision timing module on the initiator and on the target.
Optionally, the execution information acquisition module 200 is specifically configured to, when execution reaches a monitoring point while the initiator is accessing the target, access the high-precision timing module through the monitoring point collection plug-in to obtain a timestamp, and obtain the current IO feature information through the monitoring point collection plug-in.
Optionally, the delay data statistics module 300 is specifically configured to perform delay statistics on all timestamps based on the same unique IO identifier and the same current processing stage information, to obtain the delay monitoring data.
Optionally, the device may further include:
a maintenance request sending module, configured to determine, based on the delay monitoring data, the stage information whose delay exceeds a threshold, and to send a maintenance request according to that stage information.
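The threshold check performed by the maintenance request sending module might look like the following Python sketch. The function names, the dictionary layout of the request, and the use of nanoseconds are hypothetical; the text specifies only that stages whose delay is greater than a threshold are identified and that a maintenance request is sent for them.

```python
def stages_over_threshold(delay_by_stage, threshold_ns):
    """Return the stage names whose delay is strictly greater than the threshold,
    ordered from the largest delay to the smallest.

    delay_by_stage maps a stage name to its delay in nanoseconds (assumed unit).
    """
    slow = [stage for stage, delay in delay_by_stage.items() if delay > threshold_ns]
    return sorted(slow, key=lambda stage: delay_by_stage[stage], reverse=True)


def build_maintenance_requests(delay_by_stage, threshold_ns):
    """Build one hypothetical maintenance request per slow stage.

    How the request is actually delivered (alert, ticket, log entry) is not
    described in the text, so this sketch only constructs the payloads.
    """
    return [
        {"stage": stage, "delay_ns": delay_by_stage[stage], "threshold_ns": threshold_ns}
        for stage in stages_over_threshold(delay_by_stage, threshold_ns)
    ]
```

Returning the stages sorted by delay puts the worst offender first, which fits the comparison of high-delay stages described in step 7 of the method.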
An embodiment of the present application further provides a server, including:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the IO full life cycle delay monitoring method described in the above embodiments when executing the computer program.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the IO full life cycle delay monitoring method described in the above embodiments are implemented.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the parts that are the same or similar across embodiments can be understood by cross-reference. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The IO full life cycle delay monitoring method, IO full life cycle delay monitoring device, server and computer-readable storage medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. It should be noted that those of ordinary skill in the art can make improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims of the present application.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111676322.4A CN114328122A (en) | 2021-12-31 | 2021-12-31 | IO full life cycle time delay monitoring method and related device |
PCT/CN2022/102681 WO2023123956A1 (en) | 2021-12-31 | 2022-06-30 | Io full-lifecycle latency monitoring method and related apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111676322.4A CN114328122A (en) | 2021-12-31 | 2021-12-31 | IO full life cycle time delay monitoring method and related device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114328122A true CN114328122A (en) | 2022-04-12 |
Family
ID=81022063
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111676322.4A Pending CN114328122A (en) | 2021-12-31 | 2021-12-31 | IO full life cycle time delay monitoring method and related device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114328122A (en) |
WO (1) | WO2023123956A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115756732A (en) * | 2023-01-09 | 2023-03-07 | 苏州浪潮智能科技有限公司 | IO request monitoring method and device, storage medium and electronic equipment |
WO2023123956A1 (en) * | 2021-12-31 | 2023-07-06 | 郑州云海信息技术有限公司 | Io full-lifecycle latency monitoring method and related apparatus |
WO2024169539A1 (en) * | 2023-02-16 | 2024-08-22 | 中兴通讯股份有限公司 | Statistical method and apparatus for io delay, and storage medium and electronic apparatus |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2736194A1 (en) * | 2012-11-27 | 2014-05-28 | ADVA Optical Networking SE | Latency monitoring point |
US20160315860A1 (en) * | 2015-04-27 | 2016-10-27 | Pollere Inc. | Estimation Of Network Path Segment Delays |
CN109407984A (en) * | 2018-10-11 | 2019-03-01 | 郑州云海信息技术有限公司 | A kind of performance of storage system monitoring method, device and equipment |
CN109684410A (en) * | 2018-12-24 | 2019-04-26 | 浙江大华技术股份有限公司 | A kind of system, method and the storage medium of determining master-slave database synchronization delayed time |
CN111741295A (en) * | 2020-08-14 | 2020-10-02 | 北京全路通信信号研究设计院集团有限公司 | Monitoring system and method for continuously monitoring end-to-end QoS index of video network |
CN112000543A (en) * | 2020-07-29 | 2020-11-27 | 北京浪潮数据技术有限公司 | Method, device and equipment for detecting time delay performance of storage system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7454402B2 (en) * | 2004-11-29 | 2008-11-18 | International Business Machines Corporation | Method for replication tracing |
US8751757B1 (en) * | 2011-12-30 | 2014-06-10 | Emc Corporation | Acquisition and kernel memory storage of I/O metrics |
CN114328122A (en) * | 2021-12-31 | 2022-04-12 | 郑州云海信息技术有限公司 | IO full life cycle time delay monitoring method and related device |
-
2021
- 2021-12-31 CN CN202111676322.4A patent/CN114328122A/en active Pending
-
2022
- 2022-06-30 WO PCT/CN2022/102681 patent/WO2023123956A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
吴一亮等: "在线无损视频时延测量装置的设计实现", 《中国优秀硕士学位论文全文数据库(电子期刊)》, vol. 2021, no. 11, 30 November 2021 (2021-11-30) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023123956A1 (en) * | 2021-12-31 | 2023-07-06 | 郑州云海信息技术有限公司 | Io full-lifecycle latency monitoring method and related apparatus |
CN115756732A (en) * | 2023-01-09 | 2023-03-07 | 苏州浪潮智能科技有限公司 | IO request monitoring method and device, storage medium and electronic equipment |
CN115756732B (en) * | 2023-01-09 | 2023-04-07 | 苏州浪潮智能科技有限公司 | IO request monitoring method and device, storage medium and electronic equipment |
WO2024169539A1 (en) * | 2023-02-16 | 2024-08-22 | 中兴通讯股份有限公司 | Statistical method and apparatus for io delay, and storage medium and electronic apparatus |
Also Published As
Publication number | Publication date |
---|---|
WO2023123956A1 (en) | 2023-07-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||