CN115834484A - Method, device and equipment for detecting transient congestion on chip and storage medium
- Publication number
- CN115834484A (application CN202211558203.3A)
- Authority
- CN
- China
- Prior art keywords
- transient congestion
- transient
- capture
- preset
- port queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Description
Technical Field
The present invention relates to the technical field of network communication, and in particular to a method, apparatus, device, and storage medium for detecting transient congestion on a chip.
Background
Transient congestion (microburst) refers to the bursty transmission of a large number of network data packets on a chip port within a very short time span. Over this extremely short duration, the instantaneous rate may far exceed the port's average rate. Transient congestion can have a significant negative impact on network and application performance, worsening latency, jitter, and packet loss in the network.
At present, the monitoring methods commonly used in networks operate on long time scales, which makes capturing and analyzing transient congestion very difficult. Moreover, reporting frequent transient congestion events places a heavy demand on on-chip resources and CPU access bandwidth.
Summary of the Invention
Embodiments of the present application provide a method, apparatus, device, and storage medium for detecting transient congestion on a chip. A brief summary is given below to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview, nor is it intended to identify key or critical elements or to delineate the scope of protection of these embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the detailed description that follows.
In a first aspect, an embodiment of the present application provides a method for detecting transient congestion on a chip, including:
acquiring the resource occupancy rate of a chip port queue;
judging, according to the resource occupancy rate, whether the port queue satisfies a preset transient congestion capture condition, and, when the transient congestion capture condition is satisfied, triggering a preset capture engine to enter a transient congestion capture state and capture transient congestion event information of the port queue;
storing the captured transient congestion event information.
In an optional embodiment, judging whether the port queue satisfies the preset transient congestion capture condition according to the resource occupancy rate includes:
judging whether the resource occupancy rate of the port queue is greater than or equal to a preset upper threshold;
judging whether the enable switch of the port queue is on;
when the resource occupancy rate of the port queue is greater than or equal to the upper threshold and the enable switch is on, determining that the port queue satisfies the preset transient congestion capture condition.
In an optional embodiment, before judging whether the resource occupancy rate of the port queue is greater than or equal to the preset upper threshold, the method further includes:
dynamically adjusting the upper threshold according to the port queue rate and the current resource occupancy rate of the port queue.
In an optional embodiment, after triggering the preset capture engine to enter the transient congestion capture state, the method further includes:
judging whether the current resource occupancy rate of the port queue is less than a preset lower threshold;
when the resource occupancy rate of the port queue is less than the preset lower threshold, triggering the preset capture engine to exit the transient congestion capture state;
judging whether the transient congestion capture time of the current port queue is greater than a preset capture time threshold;
when the transient congestion capture time of the port queue is greater than the preset capture time threshold, triggering the preset capture engine to exit the transient congestion capture state.
In an optional embodiment, capturing the transient congestion event information of the port queue includes:
capturing the capture start time, the number of the captured port queue, the capture end time, the transient congestion peak value, and the total number of times the port queue has entered the capture state.
In an optional embodiment, storing the captured transient congestion event information includes:
storing the transient congestion event information in a cache module of the capture engine;
merging the transient congestion events stored in the cache module that fall within a preset merge time window, to obtain merged transient congestion event information;
storing the merged transient congestion event information in a storage module and/or a telemetry packetizer.
In an optional embodiment, merging the transient congestion events stored in the cache module that fall within the preset merge time window, to obtain the merged transient congestion event information, includes:
taking the maximum resource occupancy rate of all transient congestion events within the preset merge time window as the maximum resource occupancy rate of the merged transient congestion event;
taking the sum of the occurrence counts of all transient congestion events within the preset merge time window as the occurrence count of the merged transient congestion event;
taking the earliest capture start time of all transient congestion events within the preset merge time window as the capture start time of the merged transient congestion event;
taking the latest capture end time of all transient congestion events within the preset merge time window as the capture end time of the merged transient congestion event.
In a second aspect, an embodiment of the present application provides an apparatus for detecting transient congestion on a chip, including:
an acquisition module, configured to acquire the resource occupancy rate of a chip port queue;
a capture module, configured to judge, according to the resource occupancy rate, whether the port queue satisfies a preset transient congestion capture condition, and, when the transient congestion capture condition is satisfied, to trigger a preset capture engine to enter a transient congestion capture state and capture transient congestion event information of the port queue;
a storage module, configured to store the captured transient congestion event information.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory storing program instructions, the processor being configured to perform, when executing the program instructions, the on-chip transient congestion detection method provided by the above embodiments.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium storing computer-readable instructions that are executed by a processor to implement the on-chip transient congestion detection method provided by the above embodiments.
The technical solutions provided by the embodiments of the present application may have the following beneficial effects:
The transient congestion detection method provided by the embodiments of the present application judges, according to the resource occupancy of a port, whether the transient congestion capture condition is satisfied. When the capture condition is satisfied, the preset capture engine is controlled to enter the transient congestion capture state, and transient congestion event information on the chip is captured and stored. Transient congestion events can thus be captured and recorded within their extremely short duration, which facilitates subsequent analysis and reduces the impact of transient congestion traffic on network and application performance, and the capture granularity can be as fine as the port queue level.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present invention.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic flowchart of a method for detecting transient congestion on a chip according to an exemplary embodiment;
Fig. 2 is a schematic diagram of a method for detecting transient congestion on a chip according to an exemplary embodiment;
Fig. 3 is a schematic diagram of a method for detecting transient congestion on a chip according to an exemplary embodiment;
Fig. 4 is a schematic diagram of transient congestion traffic detection conditions according to an exemplary embodiment;
Fig. 5 is a schematic structural diagram of an apparatus for detecting transient congestion on a chip according to an exemplary embodiment;
Fig. 6 is a schematic structural diagram of an electronic device according to an exemplary embodiment;
Fig. 7 is a schematic diagram of a computer storage medium according to an exemplary embodiment.
Detailed Description
The following description and drawings illustrate specific embodiments of the invention sufficiently to enable those skilled in the art to practice them.
It should be clear that the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention, without creative effort, fall within the scope of protection of the present invention.
Where the following description refers to the accompanying drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of systems and methods consistent with some aspects of the invention as detailed in the appended claims.
The on-chip transient congestion detection method provided by the embodiments of the present application is described in detail below with reference to the accompanying drawings. Referring to Fig. 1, the method includes the following steps.
S101: acquire the resource occupancy rate of a chip port queue.
In one embodiment, transient congestion traffic on a switch chip is detected, and the object of detection is the resource occupancy of the ports or port queues in the chip. Those skilled in the art may, according to actual needs, detect transient congestion events at the port level or, at a finer granularity, detect the resource occupancy of each queue belonging to a port.
In one possible implementation, capture sources are divided into event capture and background-scan capture. Event capture refers to the on-chip resource changes caused when data packets enter or leave a port or port queue; therefore, whenever packets enter or leave, the port or port queue resource occupancy rate at that moment is sampled. Background scanning supplements event capture by scanning on-chip resources when no event capture has occurred.
S102: judge, according to the resource occupancy rate, whether the port queue satisfies the preset transient congestion capture condition, and, when the transient congestion capture condition is satisfied, trigger the preset capture engine to enter the transient congestion capture state and capture the transient congestion event information of the port queue.
In an exemplary embodiment, multiple capture engines are configured on the switch chip in advance, and each capture engine can support multiple ports and queues. The number of capture engines can be flexibly increased or decreased according to the number of ingress and egress planes in the chip and the actual number of ports or queues that need to be monitored. For example, one capture engine may be configured per port or, at a finer granularity, one capture engine per queue; those skilled in the art may set this according to actual needs.
Further, whether to enter transient congestion capture is judged according to the acquired resource occupancy of the port queue. This includes: judging whether the resource occupancy rate of the port queue is greater than or equal to the preset upper threshold, and judging whether the enable switch of the port queue is on; when the resource occupancy rate of the port queue is greater than or equal to the upper threshold and the enable switch is on, it is determined that the port queue satisfies the preset transient congestion capture condition.
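The two-part entry condition described above can be sketched in a few lines. This is a minimal illustration rather than the chip's actual logic; the function name `meets_capture_condition` and its parameter names are assumptions introduced here.

```python
def meets_capture_condition(occupancy: float,
                            upper_threshold: float,
                            enabled: bool) -> bool:
    """Return True when a port queue should enter the capture state.

    Both conditions from the text must hold: the occupancy is greater
    than or equal to the upper threshold, and the queue's enable switch
    is on. All names here are hypothetical.
    """
    return enabled and occupancy >= upper_threshold


# Occupancy exactly at the threshold still triggers capture (>=).
print(meets_capture_condition(0.80, 0.80, True))
# A disabled queue never triggers capture, however full it is.
print(meets_capture_condition(0.95, 0.80, False))
```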
In the solution provided by the embodiments of the present application, the upper threshold can be dynamically adjusted according to the port queue rate and the current resource occupancy rate of the port queue. A dynamic threshold is built into the detection logic, and the upper threshold is adjusted according to the port queue rate and the current state of the port queue. When actual resource occupancy is high, the dynamic threshold is raised appropriately, which reduces the number of captures required and saves storage resources and chip power. For port queues of different rates, configuring different port queue rate coefficients reduces the number of static threshold configuration items while still ensuring that port queues of different rates use different upper thresholds.
In one possible implementation, the upper threshold can be determined according to the following formula:
upper threshold = static threshold + port queue rate coefficient * (actual resource occupancy - static threshold);
where the port queue rate coefficient and the static threshold are both software configuration items that those skilled in the art may configure according to actual conditions; the embodiments of the present application do not specifically limit them.
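As a worked illustration of the formula, the sketch below evaluates it for one set of made-up values. The function name, the units, and the numbers are assumptions; the text does not state which occupancy sample feeds the "actual resource occupancy" term, so it is simply passed in as an argument.

```python
def upper_threshold(static_threshold: float,
                    rate_coefficient: float,
                    actual_occupancy: float) -> float:
    """Dynamic upper threshold, following the formula in the text:

    upper = static + coefficient * (actual - static)
    """
    return static_threshold + rate_coefficient * (actual_occupancy - static_threshold)


# Illustrative values only (units assumed to be buffer cells):
# static threshold 1000, rate coefficient 0.5, observed occupancy 1400.
print(upper_threshold(1000, 0.5, 1400))   # 1000 + 0.5 * 400 = 1200.0
# A coefficient of 0 reduces the rule to the plain static threshold.
print(upper_threshold(1000, 0.0, 1400))   # 1000.0
```

A larger coefficient moves the threshold further toward the observed occupancy, which is how higher occupancy leads to fewer captures.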
Further, after it is determined that the port queue satisfies the preset transient congestion capture condition, the preset capture engine is triggered to enter the transient congestion capture state and capture the transient congestion event information of the port.
In an optional embodiment, the capture engine captures the capture start time, the number of the captured port or port queue, the capture end time, the transient congestion peak value, and the total number of times the port or port queue has entered the capture state. A capture counter is provided in the capture state to count the total number of times the port or port queue has entered the capture state.
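The captured fields listed above map naturally onto a small record type. The following is a sketch under assumptions: the class name `CaptureRecord`, the field names, and the integer timestamps are all hypothetical, and the text does not specify field widths or time units.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaptureRecord:
    """One transient congestion event, holding the fields named in the text."""
    queue_id: int                    # number of the captured port or port queue
    start_time: int                  # capture start time (units assumed)
    end_time: Optional[int] = None   # capture end time, set when capture exits
    peak_occupancy: int = 0          # transient congestion peak value
    capture_count: int = 0           # total times this queue entered capture

    def observe(self, occupancy: int) -> None:
        """Track the highest occupancy seen while in the capture state."""
        self.peak_occupancy = max(self.peak_occupancy, occupancy)


record = CaptureRecord(queue_id=3, start_time=100, capture_count=1)
for sample in (70, 90, 80):
    record.observe(sample)
record.end_time = 130
print(record)
```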
After the capture state is entered, the method further includes judging whether the transient congestion event has ended. In an exemplary implementation, a lower threshold is set, and it is judged whether the current port resource occupancy rate is less than the preset lower threshold; when the port resource occupancy rate is less than the preset lower threshold, the preset capture engine is triggered to exit the transient congestion capture state. In addition, there is a software-configurable maximum capture duration: when the capture state has lasted longer than this preset capture time threshold, the capture engine exits the capture state. This ensures that when one port is congested for a long time, other ports can still be captured by the same engine. The specific value of the preset capture time threshold can be set according to actual conditions.
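The two exit conditions can be expressed as a single predicate. This is a minimal sketch; the function and parameter names are assumptions, and time is treated as a plain number in unspecified units.

```python
def should_exit_capture(occupancy: float,
                        lower_threshold: float,
                        elapsed: float,
                        max_capture_time: float) -> bool:
    """Return True when the capture engine should leave the capture state.

    Either condition from the text is sufficient: the occupancy has
    dropped below the lower threshold, or the capture has lasted longer
    than the configured capture time threshold.
    """
    below_lower = occupancy < lower_threshold
    timed_out = elapsed > max_capture_time
    return below_lower or timed_out


# Congestion has subsided: exit.
print(should_exit_capture(30, 40, elapsed=5, max_capture_time=100))
# Still congested but held too long: exit so other ports can be captured.
print(should_exit_capture(90, 40, elapsed=101, max_capture_time=100))
# Still congested and within the time limit: keep capturing.
print(should_exit_capture(90, 40, elapsed=5, max_capture_time=100))
```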
The lower threshold may be a fixed value, for example set according to the static threshold, or a dynamic value that automatically follows the upper threshold according to the port rate and actual resource occupancy. For example, the lower threshold can be determined by the formula: lower threshold = upper threshold - fluctuation coefficient, where the fluctuation coefficient can be set according to actual conditions. Using distinct upper and lower thresholds effectively reduces spurious captures when resource occupancy jitters back and forth around a single threshold.
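The benefit of separate upper and lower thresholds can be seen by counting how many captures a jittering occupancy trace produces. The sketch below is illustrative only: `count_captures`, the sample trace, and the threshold values are assumptions, and the lower threshold is derived as upper threshold minus fluctuation coefficient, as in the text.

```python
from typing import Sequence


def count_captures(samples: Sequence[int], upper: int, lower: int) -> int:
    """Count how many times capture is entered over an occupancy trace.

    Capture is entered when occupancy >= upper and exited when
    occupancy < lower.
    """
    capturing = False
    entries = 0
    for occupancy in samples:
        if not capturing and occupancy >= upper:
            capturing = True
            entries += 1
        elif capturing and occupancy < lower:
            capturing = False
    return entries


# Occupancy jittering around 50 (made-up values).
trace = [10, 52, 48, 53, 47, 51, 20]
upper = 50
fluctuation = 10

single = count_captures(trace, upper, upper)                 # one shared threshold
hysteresis = count_captures(trace, upper, upper - fluctuation)
print(single, hysteresis)
```

With a single threshold the trace registers three separate captures; with the lower threshold set 10 below the upper one, the same trace registers one.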
Through this step, transient congestion on the switch chip can be detected, and information such as the start time, end time, burst peak value, and number of the port queue that entered the capture state can be captured for each transient congestion event.
Each transient congestion capture engine can expose the following status information to the CPU for real-time access: whether transient congestion monitoring is currently in progress; the port number and queue number of the transient congestion currently being monitored; the maximum resource occupancy of the port currently being monitored for transient congestion; and the total number of transient congestion detections that have occurred.
S103: store the captured transient congestion event information.
In one possible implementation, a cache module is provided in each transient congestion capture engine to hold captured data temporarily before the transient congestion event information is written to the shared memory. The cache also has a built-in merging mechanism: within a software-configurable time window, capture events from the same port, or from the same port queue, are merged, which reduces the storage space needed and the bandwidth consumed by software access.
Specifically, the transient congestion event information is stored in the cache module of the capture engine, and the transient congestion events stored in the cache module that fall within the preset merge time window are merged to obtain merged transient congestion event information.
In one embodiment, the transient congestion events of the same port or the same port queue within the preset merge time window are obtained and merged. The maximum resource occupancy rate of all transient congestion events within the window is taken as the maximum resource occupancy rate of the merged event; the sum of their occurrence counts is taken as the occurrence count of the merged event; the earliest capture start time is taken as the capture start time of the merged event; and the latest capture end time is taken as the capture end time of the merged event.
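The four merge rules above amount to a max, a sum, a min, and a max over the events in one window. The following sketch assumes the events passed in already belong to the same port or port queue and the same merge window; the `Event` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    start: int      # capture start time
    end: int        # capture end time
    peak: int       # maximum resource occupancy during the event
    count: int = 1  # number of congestion occurrences the event represents


def merge_events(events: Iterable[Event]) -> Event:
    """Merge the events of one port queue that fall in one merge window."""
    events = list(events)
    if not events:
        raise ValueError("nothing to merge")
    return Event(
        start=min(e.start for e in events),   # earliest start time
        end=max(e.end for e in events),       # latest end time
        peak=max(e.peak for e in events),     # maximum occupancy
        count=sum(e.count for e in events),   # summed occurrence count
    )


window = [Event(100, 110, 70), Event(120, 135, 90), Event(140, 150, 60, count=2)]
merged = merge_events(window)
print(merged)
```

Three cached records collapse into one, so only a single entry needs to be written to the shared store for that window.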
Fig. 4 shows the acquired port resource occupancy: capture is triggered when the resource occupancy rate reaches the upper threshold and ends when it falls below the lower threshold. A merge time window is also set, and the transient congestion events within that time period are merged.
Further, in the transient congestion detection solution provided by the present application, in order to improve storage efficiency and save chip resources, a shared storage space of fixed depth is allocated according to actual needs to store the record data generated by all capture engines. This shared storage space requires only a single-port static random access memory (SRAM). The capture engine can write the merged transient congestion event information to the shared storage module, from which the CPU can read transient congestion traffic detection data at any time; compared with real-time reporting, this saves a large amount of bandwidth on the CPU access path.
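A software model of the fixed-depth shared store might look like the sketch below. Two behaviours are assumptions that the text does not specify: that the oldest record is overwritten when the store is full, and that a CPU read drains the store. The class and method names are likewise hypothetical.

```python
from collections import deque
from typing import Any, List, Tuple


class SharedRecordStore:
    """Fixed-depth store shared by all capture engines (software model)."""

    def __init__(self, depth: int) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        # This overwrite policy is an assumption, not stated in the text.
        self._records: deque = deque(maxlen=depth)

    def push(self, engine_id: int, record: Any) -> None:
        """Called by a capture engine to store a merged record."""
        self._records.append((engine_id, record))

    def drain(self) -> List[Tuple[int, Any]]:
        """Called by the CPU to read and clear all stored records."""
        items = list(self._records)
        self._records.clear()
        return items


store = SharedRecordStore(depth=2)
store.push(0, "record-a")
store.push(1, "record-b")
store.push(0, "record-c")   # store is full, so "record-a" is dropped
contents = store.drain()
print(contents)
```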
In an optional embodiment, the merged transient congestion event information can also be written to a telemetry packetizer, which places the transient congestion information into the LNS (local name space) field of an IFA (Inband Flow Analyzer) packet. This extends the design with the ability to send the local switch's transient congestion information out as network telemetry packets, so that the transient congestion capture information can be obtained without going through the CPU.
To facilitate understanding of the on-chip transient congestion detection method provided by the embodiments of the present application, it is described below with reference to Fig. 2.
As shown in Fig. 2, the transient congestion detection method provided by the embodiments of the present application includes the following steps: background scanning, event-change scanning, detection and judgment, state capture, caching, and storage.
First, the changes in on-chip resources when data packets enter or leave a port, and the on-chip resource occupancy when no packets are entering or leaving, are acquired. Detection and judgment are then performed to decide whether to enter transient congestion capture. This includes: judging whether the resource occupancy rate of the port is greater than or equal to the preset upper threshold, and judging whether the enable switch of the port is on; when the resource occupancy rate of the port is greater than or equal to the upper threshold and the enable switch is on, it is determined that the port satisfies the preset transient congestion capture condition. Once the capture condition is satisfied, the capture engine is triggered to perform state capture and to record information such as the start time, end time, burst peak value, and number of the port queue that entered the capture state for the transient congestion event.
Further, the transient congestion event information is stored in the cache module of the capture engine, and the transient congestion events stored in the cache module that fall within the preset merge time window are merged to obtain merged transient congestion event information. The capture engine can write the merged transient congestion event information to the shared storage module for the CPU to read transient congestion traffic detection data. The merged transient congestion event information can also be written to the telemetry packetizer, which extends the design with the ability to send the local switch's transient congestion information out as network telemetry packets.
To facilitate understanding of the on-chip transient congestion detection method provided by the embodiments of the present application, it is described below with reference to Fig. 3.
As shown in Fig. 3, the transient congestion detection method provided by the embodiments of the present application includes the following. First, the changes in on-chip resources when data packets enter or leave a port, and the on-chip resource occupancy when no packets are entering or leaving, are acquired. Detection and judgment are then performed to decide whether to enter transient congestion capture. This includes: judging whether the resource occupancy rate of the port is greater than or equal to the preset upper threshold, and judging whether the enable switch of the port is on; when the resource occupancy rate of the port is greater than or equal to the upper threshold and the enable switch is on, it is determined that the port satisfies the preset transient congestion capture condition.
Further, it is judged whether the capture duration has timed out and whether the resource occupancy rate is below the lower threshold; if the capture duration has timed out or the resource occupancy rate is below the lower threshold, the capture state is exited and the capture information is saved.
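The overall flow of Fig. 3, as described in the text, can be modelled as a two-state loop over timestamped occupancy samples. This is a simplified software sketch, not the hardware design: `run_capture`, the dictionary keys, and the sample values are assumptions, the enable switch is taken to be on, and a single queue is modelled.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def run_capture(samples: Sequence[Tuple[int, int]],
                upper: int,
                lower: int,
                max_capture_time: int) -> List[Dict[str, int]]:
    """Process (time, occupancy) samples and return the saved capture records."""
    records: List[Dict[str, int]] = []
    current: Optional[Dict[str, int]] = None   # None means not capturing

    for t, occupancy in samples:
        if current is None:
            # Idle: enter capture once occupancy reaches the upper threshold.
            if occupancy >= upper:
                current = {"start": t, "end": t, "peak": occupancy}
        else:
            # Capturing: track the peak, then test both exit conditions.
            current["peak"] = max(current["peak"], occupancy)
            timed_out = (t - current["start"]) > max_capture_time
            if occupancy < lower or timed_out:
                current["end"] = t
                records.append(current)   # save the capture information
                current = None
    return records


samples = [(0, 10), (1, 60), (2, 80), (3, 30),
           (4, 70), (5, 75), (6, 72), (7, 71), (8, 20)]
result = run_capture(samples, upper=50, lower=40, max_capture_time=2)
print(result)
```

In this trace the first capture ends because occupancy falls below the lower threshold at time 3, and the second ends on timeout at time 7 while occupancy is still high.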
根据本申请实施例提供的瞬息拥塞检测方法,根据端口的资源占用情况判断是否满足瞬息拥塞捕获条件,当满足捕获条件时,控制预设的捕获引擎进行瞬息拥塞捕获状态,对芯片上的瞬息拥塞事件信息进行捕获和存储。能够在瞬息拥塞极短的持续时间中,对瞬息拥塞事件进行捕获和记录,便于后续分析,降低瞬息拥塞流量对网络和应用性能的影响。According to the transient congestion detection method provided by the embodiment of the present application, it is judged whether the transient congestion capture condition is satisfied according to the resource occupation of the port. Event information is captured and stored. Capable of capturing and recording transient congestion events in a very short duration of transient congestion, which facilitates subsequent analysis and reduces the impact of transient congestion traffic on network and application performance.
Moreover, the transient congestion detection scheme for switch chips proposed in this application provides highly flexible and configurable capture and recording based on burst events. While maintaining fine detection granularity, it can adaptively adjust the capture window and frequency, saving the chip resources and CPU access bandwidth consumed by the detection process. The scheme applies to egress ports and queues as well as ingress ports and queues, and does not depend on the chip's internal main data path.
The embodiments of this application also provide an on-chip transient congestion detection apparatus for performing the on-chip transient congestion detection method of the above embodiments. As shown in FIG. 5, the apparatus includes:
an acquisition module 501, configured to obtain the resource occupancy of a chip port queue;
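a capture module 502, configured to determine, according to the resource occupancy, whether the port queue satisfies a preset transient congestion capture condition and, when the condition is satisfied, to trigger a preset capture engine to enter the transient congestion capture state and capture the transient congestion event information of the port queue;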
a storage module 503, configured to store the captured transient congestion event information.
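The division into three modules can be sketched in software as three small classes wired together by one function. This is an illustrative assumption only: the class names, the `read_used` counter hook, the event record layout, and `detect_once` are all hypothetical, and the capture step is reduced to the upper-threshold check for brevity.

```python
from typing import Callable, Dict, List, Optional


class AcquisitionModule:
    """Counterpart of module 501: obtains a port queue's resource occupancy."""

    def __init__(self, read_used: Callable[[int], int], capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._read_used = read_used  # hypothetical hook that reads a counter
        self._capacity = capacity

    def occupancy(self, queue_id: int) -> float:
        return self._read_used(queue_id) / self._capacity


class CaptureModule:
    """Counterpart of module 502: checks the condition and captures an event."""

    def __init__(self, upper_threshold: float, enabled: bool = True) -> None:
        self.upper_threshold = upper_threshold
        self.enabled = enabled

    def capture(self, queue_id: int, occ: float) -> Optional[Dict[str, object]]:
        if self.enabled and occ >= self.upper_threshold:
            return {"queue_id": queue_id, "occupancy": occ}
        return None


class StorageModule:
    """Counterpart of module 503: stores captured event information."""

    def __init__(self) -> None:
        self.events: List[Dict[str, object]] = []

    def store(self, event: Dict[str, object]) -> None:
        self.events.append(event)


def detect_once(acq: AcquisitionModule, cap: CaptureModule,
                store: StorageModule, queue_id: int) -> bool:
    """Run acquire -> capture -> store once; return True if an event was stored."""
    event = cap.capture(queue_id, acq.occupancy(queue_id))
    if event is None:
        return False
    store.store(event)
    return True
```

Keeping acquisition, capture, and storage behind separate interfaces means any one of them could be regrouped or replaced without changing the others.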
It should be noted that, when the on-chip transient congestion detection apparatus provided by the above embodiments performs the on-chip transient congestion detection method, the division into the functional modules described above is only an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the on-chip transient congestion detection apparatus and the on-chip transient congestion detection method provided by the above embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
The embodiments of this application also provide an electronic device corresponding to the on-chip transient congestion detection method provided by the foregoing embodiments, for performing that method.
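Refer to FIG. 6, which shows a schematic diagram of an electronic device provided by some embodiments of this application. As shown in FIG. 6, the electronic device includes a processor 600, a memory 601, a bus 602, and a communication interface 603; the processor 600, the communication interface 603, and the memory 601 are connected through the bus 602. The memory 601 stores a computer program executable on the processor 600, and when the processor 600 runs the computer program, it performs the on-chip transient congestion detection method provided by any of the foregoing embodiments of this application.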
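The memory 601 may include high-speed random access memory (RAM) and may also include non-volatile memory, for example at least one disk memory. The communication connection between this system network element and at least one other network element is implemented through at least one communication interface 603 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, and the like.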
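The bus 602 may be an ISA bus, a PCI bus, an EISA bus, or the like. Buses can be classified as address buses, data buses, control buses, and so on. The memory 601 is used to store a program, and the processor 600 executes the program after receiving an execution instruction. The on-chip transient congestion detection method disclosed in any of the foregoing embodiments of this application may be applied to the processor 600 or implemented by the processor 600.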
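The processor 600 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 600 or by instructions in the form of software. The processor 600 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 601; the processor 600 reads the information in the memory 601 and completes the steps of the above method in combination with its hardware.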
The electronic device provided by the embodiments of this application is based on the same inventive concept as the on-chip transient congestion detection method provided by the embodiments of this application, and has the same beneficial effects as the method it adopts, runs, or implements.
The embodiments of this application also provide a computer-readable storage medium corresponding to the on-chip transient congestion detection method provided by the foregoing embodiments. Refer to FIG. 7, in which the computer-readable storage medium shown is an optical disc 700 storing a computer program (that is, a program product). When run by a processor, the computer program executes the on-chip transient congestion detection method provided by any of the foregoing embodiments.
It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not enumerated one by one here.
The computer-readable storage medium provided by the above embodiments of this application is based on the same inventive concept as the on-chip transient congestion detection method provided by the embodiments of this application, and has the same beneficial effects as the method adopted, run, or implemented by the application program stored on it.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination of these technical features that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the patent scope of the present invention. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211558203.3A CN115834484A (en) | 2022-12-06 | 2022-12-06 | Method, device and equipment for detecting transient congestion on chip and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211558203.3A CN115834484A (en) | 2022-12-06 | 2022-12-06 | Method, device and equipment for detecting transient congestion on chip and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115834484A (en) | 2023-03-21 |
Family
ID=85545293
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211558203.3A Pending CN115834484A (en) | 2022-12-06 | 2022-12-06 | Method, device and equipment for detecting transient congestion on chip and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115834484A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2025001121A1 (en) * | 2023-06-30 | 2025-01-02 | 中兴通讯股份有限公司 | Packet management method and apparatus, storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6473398B1 (en) * | 1996-03-15 | 2002-10-29 | Alcatel Canada Inc. | Congestion management in managed packet-switched networks |
CN101971629A (en) * | 2008-03-12 | 2011-02-09 | 艾利森电话股份有限公司 | Apparatus and method for adapting a target rate of a video signal |
US20150281091A1 (en) * | 2012-10-15 | 2015-10-01 | Nec Corporation | Control apparatus, node, communication system, communication method, and program |
CN110278157A (en) * | 2018-03-14 | 2019-09-24 | 华为技术有限公司 | Jamming control method and the network equipment |
-
2022
- 2022-12-06 CN CN202211558203.3A patent/CN115834484A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7881221B2 (en) | Hardware implementation of network testing and performance monitoring in a network device | |
US9282022B2 (en) | Forensics for network switching diagnosis | |
US7787442B2 (en) | Communication statistic information collection apparatus | |
US12199870B2 (en) | Congestion processing method, apparatus, network device and storage medium | |
CN112260899B (en) | Network monitoring method and device based on MMU (memory management unit) | |
US20140201354A1 (en) | Network traffic debugger | |
CN115834484A (en) | Method, device and equipment for detecting transient congestion on chip and storage medium | |
CN105978821A (en) | Method and device for avoiding network congestion | |
CN103345447B (en) | EMS memory management process and system | |
CN106850457B (en) | Cache sharing method and device | |
EP2579507B1 (en) | Method and system for counting data packets | |
WO2022174444A1 (en) | Data stream transmission method and apparatus, and network device | |
CN112152876B (en) | Method and device for acquiring packet loss information | |
CN117240796B (en) | Network card speed limiting method, system, equipment and storage medium | |
CN1601975A (en) | Packet switching equipment traffic monitoring query method and line card collector | |
CN115378873B (en) | Flow control method and system for improving Ethernet data transmission efficiency | |
US9306854B2 (en) | Method and apparatus for diagnosing interface oversubscription and microbursts | |
CN115361191A (en) | A firewall traffic detection method, system, device and medium based on sflow | |
US8645593B2 (en) | Signal processor, transmission apparatus, and method for processing signal | |
WO2018058625A1 (en) | Method and device for detecting message backpressure | |
CN114189480A (en) | Flow sampling method and device, electronic equipment and medium | |
JP7652251B2 (en) | Packet capture device and packet capture method | |
US10466934B2 (en) | Methods and systems for time-based binning of network traffic | |
CN115834427B (en) | High-speed network flow passive measurement method and device | |
US11451998B1 (en) | Systems and methods for communication system resource contention monitoring |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||