
HK1214695B - Compilation of finite automata based on memory hierarchy - Google Patents


Info

Publication number: HK1214695B
Authority: HK (Hong Kong)
Prior art keywords: nodes, pattern, per, nfa, memory
Application number: HK16102603.7A
Other languages: Chinese (zh)
Other versions: HK1214695A1 (en)
Inventors: R. Goyal, S. L. Billa
Original assignee: Marvell Asia Pte. Ltd. (马维尔亚洲私人有限公司)
Priority claimed from US 14/252,293 (US 10002326 B2)


Description

Compilation of Finite Automata Based on Memory Hierarchy

Technical Field

Embodiments of the present invention relate to compilation of finite automata based on a memory hierarchy.

Background

The Open Systems Interconnection (OSI) reference model defines seven network protocol layers (L1-L7) for communicating over a transmission medium. The upper layers (L4-L7) represent end-to-end communications and the lower layers (L1-L3) represent local communications.

Networked application-aware systems need to process, filter, and switch a range of L3 to L7 network protocol layers, for example, L7 network protocol layers such as Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), and L4 network protocol layers such as Transmission Control Protocol (TCP). In addition to processing the network protocol layers, networked application-aware systems need to simultaneously secure these protocols at wire speed with access- and content-based security through the L4-L7 network protocol layers, including firewall, virtual private network (VPN), Secure Sockets Layer (SSL), intrusion detection system (IDS), Internet Protocol Security (IPSec), anti-virus (AV), and anti-spam functions.

Network processors are available for high-throughput L2 and L3 network protocol processing, that is, performing packet processing to forward packets at wire speed. Typically, a general-purpose processor is used to process L4-L7 network protocols, which require more intelligent processing. Although a general-purpose processor can perform such compute-intensive tasks, it lacks sufficient performance to process the data so that it can be forwarded at wire speed.

Intrusion detection system (IDS) applications may inspect the contents of individual packets flowing through a network and may identify suspicious patterns that may indicate an attempt to break into or compromise a system. One example of a suspicious pattern may be a particular text string in a packet followed 100 characters later by another particular text string. Such content-aware networking may require inspection of the contents of packets at wire speed. The content may be analyzed to determine whether a security breach or intrusion exists.

A large number of patterns and rules in the form of regular expressions (also referred to herein as regular expression patterns) may be applied to ensure that all security breaches or intrusions are detected. A regular expression is a compact method for describing a pattern in a string of characters. The simplest pattern matched by a regular expression is a single character or string of characters, for example, /c/ or /cat/. A regular expression may also include operators and meta-characters that have special meanings. Through the use of meta-characters, a regular expression may be used for more complicated searches, such as "abc.*xyz", that is, find the string "abc" followed, after an unlimited number of intervening characters, by the string "xyz". Another example is the regular expression "abc..abc.*xyz", that is, find the string "abc" followed by any two characters, then the string "abc", and then, after an unlimited number of characters, the string "xyz". Content searches are typically performed using a search method, such as a deterministic finite automaton (DFA) or a non-deterministic finite automaton (NFA), to process the regular expressions.
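These example patterns behave as described in any conventional regular-expression engine; a brief illustration, with Python's `re` module standing in for the pattern-matching hardware discussed later:

```python
import re

# "abc.*xyz": "abc" followed, after any number of intervening
# characters, by "xyz".
assert re.search(r"abc.*xyz", "abc-------xyz")
assert not re.search(r"abc.*xyz", "abxyz")

# "abc..abc.*xyz": "abc", any two characters, "abc", then "xyz"
# after any number of characters.
assert re.search(r"abc..abc.*xyz", "abc12abc-->xyz")

# The IDS example above: a particular text string followed
# 100 characters later by another particular text string.
payload = "attack" + "A" * 100 + "signature"
assert re.search(r"attack.{100}signature", payload)
```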

Overview

Embodiments of the present invention provide a method, apparatus, computer program product, and corresponding system for compilation and run-time processing of finite automata.

According to one embodiment, in at least one processor operatively coupled to a network in a security appliance, a method may comprise generating at least one per-pattern non-deterministic finite automaton (NFA). The at least one processor may be operatively coupled to a plurality of memories mapped to a plurality of hierarchical levels in a memory hierarchy. Each per-pattern NFA may be generated for a single regular expression pattern and may include a corresponding set of nodes. The method may distribute nodes of the corresponding set of nodes of each per-pattern NFA for storage in the plurality of memories based on the mapped hierarchical levels and per-pattern NFA storage allocation settings configured for the hierarchical levels.

The method may include statically storing the nodes of the corresponding set of nodes of each per-pattern NFA generated in the plurality of memories based on the distribution.

Generating the at least one per-pattern NFA may include specifying a transition from a given node to a next node via a next node address associated with the given node, and configuring the next node address to identify both the next node and a given memory of the plurality of memories to which the next node is distributed for storage.
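One plausible realization of such a next node address packs a memory identifier into the high bits of the address, so a single field tells the walker both which memory to read and which node within it; the field widths below are illustrative assumptions, not values from this disclosure:

```python
MEM_ID_BITS = 2    # assumed width: enough to name up to four memories
NODE_ID_BITS = 22  # assumed width: node index within the chosen memory

def make_next_node_address(mem_id: int, node_id: int) -> int:
    """Pack the target memory and node index into one transition address."""
    assert mem_id < (1 << MEM_ID_BITS) and node_id < (1 << NODE_ID_BITS)
    return (mem_id << NODE_ID_BITS) | node_id

def decode_next_node_address(addr: int) -> tuple[int, int]:
    """Recover (memory, node) so a walker knows where to fetch the next node."""
    return addr >> NODE_ID_BITS, addr & ((1 << NODE_ID_BITS) - 1)

addr = make_next_node_address(mem_id=1, node_id=0x2A)
assert decode_next_node_address(addr) == (1, 0x2A)
```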

The method may include configuring each per-pattern NFA storage allocation setting for a corresponding hierarchical level of the hierarchical levels to represent a target number of unique nodes, of the corresponding set of nodes of each per-pattern NFA generated, to distribute for storage in a given memory of the plurality of memories mapped to the corresponding hierarchical level.

Unique nodes of the corresponding set of nodes may be arranged in a consecutive manner within a given per-pattern NFA, of the at least one per-pattern NFA, that includes the corresponding set of nodes.

The method may include configuring each per-pattern NFA storage allocation setting for the corresponding hierarchical level in a manner that enables the given memory to provide sufficient storage capacity for storing the target number of unique nodes represented, in the event a per-pattern NFA is generated for each regular expression pattern in a set of regular expression patterns.

The method may include denoting the target number of unique nodes via an absolute value. The absolute value may be a common value for each corresponding set of nodes, such that each corresponding set of nodes has a same value for the target number of unique nodes for storage in the given memory mapped to the corresponding hierarchical level.

The method may include denoting the target number of unique nodes via a percentage value for applying to a corresponding total number of nodes of each corresponding set of nodes, such that each corresponding set of nodes has an individual value for the target number of unique nodes for storage in the given memory mapped to the corresponding hierarchical level.
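The difference between the absolute and percentage forms of the allocation setting can be sketched as follows; the settings and node counts here are made-up examples, not values from this disclosure:

```python
def target_nodes_absolute(setting: int, total_nodes: int) -> int:
    # Absolute setting: the same cap applies to every per-pattern NFA.
    return min(setting, total_nodes)

def target_nodes_percentage(setting_pct: float, total_nodes: int) -> int:
    # Percentage setting: the cap scales with each NFA's own node count.
    return int(total_nodes * setting_pct / 100)

# Two per-pattern NFAs of different sizes.
small_nfa, large_nfa = 10, 1000

# Absolute: each NFA places at most the same number of nodes at this level.
assert target_nodes_absolute(16, small_nfa) == 10
assert target_nodes_absolute(16, large_nfa) == 16

# Percentage: each NFA gets an individual value for the same level.
assert target_nodes_percentage(10.0, small_nfa) == 1
assert target_nodes_percentage(10.0, large_nfa) == 100
```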

Each memory of the plurality of memories may be consecutively mapped to a corresponding hierarchical level of the hierarchical levels in descending order based on a performance ranking of the plurality of memories. A highest performance ranked memory of the plurality of memories may be mapped to a highest ranked hierarchical level of the hierarchical levels.

The per-pattern NFA storage allocation settings may include a first per-pattern NFA storage allocation setting and a second per-pattern NFA storage allocation setting. The hierarchical levels may include a highest ranked hierarchical level and a next-highest ranked hierarchical level. The first per-pattern NFA storage allocation setting may be configured for the highest ranked hierarchical level, and the second per-pattern NFA storage allocation setting may be configured for the next-highest ranked hierarchical level.

Distributing nodes of the corresponding set of nodes of each per-pattern NFA generated may include distributing in a consecutive manner. The distribution may include a first distribution of nodes of the corresponding set of nodes for storage in a first memory of the plurality of memories, the first memory being mapped to a highest ranked hierarchical level of the hierarchical levels. The distribution may include at least one second distribution of nodes of the corresponding set of nodes, based on at least one undistributed node remaining in the corresponding set of nodes. Each second distribution may be for storage in a given memory of the plurality of memories, the given memory being mapped to a given hierarchical level that is consecutively lower than the highest ranked hierarchical level for each successive distribution of nodes of the corresponding set of nodes.

A given distribution of the at least one second distribution may include all remaining undistributed nodes of the corresponding set of nodes if the given hierarchical level is a lowest ranked hierarchical level of the hierarchical levels.

The method may include maximizing a number of nodes in the first distribution, the maximized number being limited by a corresponding per-pattern NFA storage allocation setting, of the per-pattern NFA storage allocation settings, configured for the highest ranked hierarchical level.

The method may include maximizing a number of nodes in each second distribution, the maximized number being limited, for each distribution, by a corresponding per-pattern NFA storage allocation setting, of the per-pattern NFA storage allocation settings, configured for the given hierarchical level.
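Taken together, the first and second distributions amount to a greedy fill from the highest-ranked hierarchical level downward, with the lowest-ranked level absorbing whatever remains; a simplified sketch under assumed per-level settings (the caps below are illustrative, not from this disclosure):

```python
def distribute_nodes(num_nodes: int, level_caps: list[int]) -> list[int]:
    """Greedily place a per-pattern NFA's nodes across hierarchy levels.

    level_caps holds a per-pattern allocation setting for each level,
    ordered highest-ranked (fastest) first. The lowest-ranked level
    absorbs all remaining nodes, so its cap entry is ignored.
    """
    placed, remaining = [], num_nodes
    for i, cap in enumerate(level_caps):
        lowest = i == len(level_caps) - 1
        take = remaining if lowest else min(cap, remaining)
        placed.append(take)
        remaining -= take
    return placed

# A 100-node NFA with caps of 8 and 32 for the two fastest levels:
# the first 8 nodes go to the fastest memory, the next 32 to the
# next-fastest, and the remaining 60 to the lowest-ranked memory.
assert distribute_nodes(100, [8, 32, 0]) == [8, 32, 60]

# A small NFA fits entirely in the fastest memory.
assert distribute_nodes(5, [8, 32, 0]) == [5, 0, 0]
```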

The method may include statically storing the nodes of the corresponding set of nodes of each per-pattern NFA, enabling the nodes of each corresponding set of nodes to be walked with segments of payloads in an input stream received from the network, to determine a match of at least one regular expression pattern in the input stream.

The walking may be based on individual nodes of the corresponding set of nodes matching the segments of the payloads.

The method may further include determining the per-pattern NFA storage allocation settings based on a total number of regular expression patterns in a set of regular expression patterns and a desired performance metric associated with the walking.

The method may include generating a unified deterministic finite automaton (DFA) based on at least one subpattern selected from each pattern in a set of regular expression patterns, and storing the unified DFA in a given memory of the plurality of memories.

The plurality of memories may include a first memory, a second memory, and a third memory. The first and second memories may be co-located on a chip with the at least one processor, whereas the third memory may be an external memory located off-chip and mapped to a lowest ranked hierarchical level of the hierarchical levels.

Another example embodiment disclosed herein includes an apparatus corresponding to operations consistent with the method embodiments disclosed herein.

Further, yet another example embodiment may include a non-transitory computer-readable medium having stored thereon a sequence of instructions which, when loaded and executed by a processor, causes the processor to perform the methods disclosed herein.

Brief Description of the Drawings

The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings, in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the invention.

FIG. 1 is a block diagram of an embodiment of a security appliance in which embodiments disclosed herein may be implemented.

FIGs. 2A-2G are example NFA and DFA graphs and a table illustrating the concept of graph explosion.

FIG. 3A is another block diagram of an embodiment of a security appliance in which embodiments disclosed herein may be implemented.

FIG. 3B is a flow diagram of an example embodiment of a method that may be implemented in at least one processor operatively coupled to at least one memory in a security appliance operatively coupled to a network.

FIG. 4 is a block diagram of an example embodiment of an environment for a hyper non-deterministic automaton (HNA) co-processor.

FIG. 5A is a block diagram of an example embodiment of a per-pattern non-deterministic finite automaton (NFA) graph that may be used by a walker to match a regular expression pattern in an input stream.

FIG. 5B is a table of an example embodiment of processing cycles for walking the NFA graph of FIG. 5A with a payload.

FIG. 6 is a block diagram of an example embodiment of an environment for the walker.

FIG. 7 is a block diagram of an example embodiment of an environment for the compiler.

FIG. 8 is a block diagram of an example embodiment of a node distribution for multiple per-pattern NFAs.

FIG. 9 is a flow diagram of an example embodiment of a method that may be performed in at least one processor operatively coupled to a plurality of memories mapped to a plurality of hierarchical levels in a memory hierarchy in a security appliance operatively coupled to a network.

FIG. 10 is a block diagram of an example embodiment of another node distribution for multiple per-pattern NFAs.

FIG. 11 is a flow diagram of an example embodiment of a method for distributing nodes of at least one per-pattern NFA.

FIG. 12 is a block diagram of an example internal structure of a computer optionally included within embodiments disclosed herein.

Detailed Description

Before describing example embodiments of the present invention in detail, an example security application in which the embodiments may be implemented, and typical processing using deterministic finite automata (DFA) and non-deterministic finite automata (NFA), are described immediately below to help the reader understand the inventive features disclosed herein.

FIG. 1 is a block diagram of an embodiment of a security appliance 102 in which embodiments disclosed herein may be implemented. The security appliance 102 may include a network services processor 100. The security appliance 102 may be a standalone system that switches packets received on one network interface 103a to another network interface 103b, and may perform a plurality of security functions on received packets before forwarding them. For example, the security appliance 102 may be used to perform security processing on packets 101a that may be received on a wide area network (WAN) 105a, or any other suitable network, before forwarding the processed packets 101b to a local area network (LAN) 105b, or any other suitable network.

The network services processor 100 may be configured to process Open Systems Interconnection (OSI) network L2-L7 layer protocols encapsulated in received packets. As is well known to those skilled in the art, the OSI reference model defines seven network protocol layers (L1-L7). The physical layer (L1) represents the actual interface, electrical and physical, that connects a device to a transmission medium. The data link layer (L2) performs data framing. The network layer (L3) formats the data into packets. The transport layer (L4) handles end-to-end transport. The session layer (L5) manages communications between devices, for example, whether communication is half-duplex or full-duplex. The presentation layer (L6) manages data formatting and presentation, for example, syntax, control codes, special graphics, and character sets. The application layer (L7) permits communications between users, for example, file transfer and electronic mail.

The network services processor 100 may schedule and queue work (packet processing operations) for upper-level network protocols, for example L4-L7, and enable processing of upper-level network protocols in received packets to be performed so as to forward packets at wire speed. By processing the protocols to forward the packets at wire speed, the network services processor 100 does not slow down the network data transfer rate. The network services processor 100 may receive packets from the network interfaces 103a or 103b, which may be physical hardware interfaces, and may perform L2-L7 network protocol processing on the received packets. The network services processor 100 may subsequently forward processed packets 101b through the network interfaces 103a or 103b to another hop in the network, to a final destination, or through another bus (not shown) for further processing by a host processor (not shown). The network protocol processing may include processing of network security protocols such as firewall, application firewall, virtual private network (VPN) including IP Security (IPSec) and/or Secure Sockets Layer (SSL), intrusion detection system (IDS), anti-virus (AV), or any other suitable network protocol.

The network services processor 100 may deliver high application performance using a plurality of processors (i.e., cores). Each of the cores (not shown) may be dedicated to performing data plane operations, control plane operations, or a combination thereof. A data plane operation may include packet operations for forwarding packets. A control plane operation may include processing of portions of complex higher-level protocols, such as IPSec, Transmission Control Protocol (TCP), SSL, or any other suitable higher-level protocol. A data plane operation may include processing of other portions of these complex higher-level protocols.

The network services processor 100 may also include application-specific co-processors that may offload the cores so that the network services processor 100 achieves high throughput. For example, the network services processor 100 may include an acceleration unit 106 that may include a hyper non-deterministic automaton (HNA) co-processor 108 for hardware acceleration of NFA processing, and a hyper finite automaton (HFA) co-processor 110 for hardware acceleration of DFA processing. The HNA co-processor 108 and the HFA co-processor 110 may each be configured to offload the general-purpose cores (not shown) of the network services processor 100 from the heavy burden of performing compute- and memory-intensive pattern matching methods.

The network services processor 100 may perform pattern searching, regular expression processing, content validation, transformation, and security accelerated packet processing. The regular expression processing and pattern searching may be used to perform string matching for AV and IDS applications, and other applications that may require string matching. A memory controller (not shown) in the network services processor 100 may control access to a memory 104 that is operatively coupled to the network services processor 100. The memory 104 may be internal (i.e., on-chip) or external (i.e., off-chip), or a combination thereof, and may be configured to store received data packets, such as the packets 101a, for processing by the network services processor 100. The memory 104 may be configured to store compiled rules data used for lookup and pattern matching in DFA and NFA graph expression searches. The compiled rules data may be stored as a binary image 112 that may include compiled rules data for both DFA and NFA, or multiple binary images separating DFA compiled rules data from NFA compiled rules data.

Typical content-aware application processing may use either a DFA or an NFA to recognize patterns in the content of received packets. DFA and NFA are both finite state machines, that is, each is a model of computation that includes a set of states, a start state, an input alphabet (set of all possible symbols), and a transition function. Computation begins in the start state and changes to new states depending on the transition function.

The pattern is commonly expressed using a regular expression that may include atomic elements, for example, normal text characters such as A-Z and 0-9, and meta-characters, such as *, ^, and |. The atomic elements of a regular expression are the symbols (single characters) to be matched. Atomic elements may be combined with meta-characters that allow concatenation, alternation (|), and the Kleene star (*). The meta-character for concatenation may be used to create multiple-character matching patterns from single characters (or substrings), whereas the meta-character for alternation (|) may be used to create a regular expression that can match any of two or more substrings. The meta-character Kleene star (*) allows a pattern to match any number of times, including no occurrences of the preceding character or string of characters.

Combining different operators and single characters allows complex subpatterns of expressions to be constructed. For example, a subpattern such as (th(is|at)*) may match multiple character strings, such as: th, this, that, thisis, thisat, thatis, or thatat. Another example of a complex subpattern of an expression may be one that incorporates a character class construct [...] that allows listing of a list of characters for which to search. For example, gr[ea]y looks for both grey and gray. Other examples of complex subpatterns are those that may use a dash to indicate a range of characters, for example, [A-Z], and the meta-character "." that matches any one character. An element of the pattern may be an atomic element or a combination of one or more atomic elements with one or more meta-characters.
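The subpatterns above can be checked against sample strings with any conventional regex engine; Python is used here purely for illustration:

```python
import re

# (th(is|at)*): "th" followed by zero or more repetitions of "is" or "at".
for s in ["th", "this", "that", "thisis", "thisat", "thatat"]:
    assert re.fullmatch(r"th(is|at)*", s)

# Character class: gr[ea]y looks for both spellings.
assert re.search(r"gr[ea]y", "a grey cat")
assert re.search(r"gr[ea]y", "a gray cat")

# Range and wildcard: [A-Z] matches one uppercase letter,
# and "." matches any one character.
assert re.fullmatch(r"[A-Z].", "Qx")
```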

The input to the DFA or NFA state machine typically includes segments, such as a string of (8-bit) bytes, that is, the alphabet may be a single byte (one character or symbol) from the input stream (i.e., received packets). Each segment (e.g., byte) in the input stream may result in a transition from one state to another state. The states and the transition functions of the DFA or NFA state machine may be represented by a graph of nodes. Each node in the graph may represent a state, and arcs (also referred to herein as transitions or transition arcs) in the graph may represent state transitions. A current state of the state machine may be represented by a node identifier that selects a particular node in the graph.
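As a toy illustration of such a graph of nodes, the sketch below walks a payload through a hand-built graph for the string "cat". It is deliberately simplified: one transition arc per symbol, falling back to the start node on a mismatch, with no handling of overlapping match attempts that a real automaton would encode:

```python
# Toy graph for the pattern "cat": each node maps an input symbol
# (one payload segment) to a next-node identifier; node 3 marks a match.
graph = {
    0: {"c": 1},
    1: {"a": 2},
    2: {"t": 3},
    3: {},
}

def walk(graph, start, accept, payload):
    """Advance node by node; report payload offsets where a match ends."""
    node, matches = start, []
    for offset, segment in enumerate(payload):
        # The transition arc for this segment selects the next node;
        # on a miss, this toy walker simply falls back to the start node.
        node = graph[node].get(segment, start)
        if node == accept:
            matches.append(offset)
    return matches

assert walk(graph, 0, 3, "a cat sat") == [4]  # "cat" ends at offset 4
```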

Using a DFA to process a regular expression and to find one or more patterns described by the regular expression in an input stream of characters may be characterized as having deterministic run-time performance. A next state of the DFA may be determined from an input character (or symbol) and the current state of the DFA, because there is only one state transition per DFA state. As such, the run-time performance of the DFA is said to be deterministic, and its behavior may be completely predicted from the input. However, the tradeoff for determinism is a graph in which the number of nodes (or graph size) may grow exponentially with the size of the pattern.

In contrast, the number of nodes (or graph size) of an NFA graph may be characterized as growing linearly with the size of the pattern. However, using an NFA to process a regular expression and to find one or more patterns described by the regular expression in the input stream of characters may be characterized as having non-deterministic run-time performance. For example, given an input character (or symbol) and a current state of the NFA, it is possible that there is more than one next state of the NFA to which to transition. As such, the next state of the NFA cannot be uniquely determined from the input and the current state of the NFA. Thus, the run-time performance of the NFA is said to be non-deterministic, as its behavior cannot be completely predicted from the input.

FIGs. 2A-2G illustrate the concept of DFA graph explosion. FIGs. 2A, 2B, and 2C show NFA graphs for the patterns ".*a[^\n]", ".*a[^\n][^\n]", and ".*a[^\n][^\n][^\n]", respectively, while FIGs. 2D, 2E, and 2F show DFA graphs for the same patterns, respectively. As shown in FIGs. 2A-2F, and summarized by the table of FIG. 2G, an NFA graph may grow linearly for some patterns, whereas a DFA graph for the same patterns may grow exponentially, resulting in graph explosion. As shown, for a given pattern or patterns, the number of DFA states may be larger than the number of NFA states, typically on the order of several hundred or several thousand more states. This is an example of "graph explosion," which is a hallmark characteristic of DFA.
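The linear-versus-exponential growth can be reproduced with a textbook subset construction. The sketch below uses the simplified pattern family ".*a.{n}" over a two-symbol alphabet as a stand-in for the patterns of FIGs. 2A-2F; the NFA has n + 2 states, while the equivalent DFA must remember which of the last n + 1 symbols were 'a', giving 2^(n+1) states:

```python
def nfa_next(state, sym, n):
    # NFA for ".*a.{n}": state 0 self-loops on any symbol and also
    # moves to state 1 on 'a'; states 1..n advance on any symbol;
    # state n+1 is accepting. Total NFA states: n + 2 (linear in n).
    nxt = set()
    if state == 0:
        nxt.add(0)
        if sym == "a":
            nxt.add(1)
    elif state <= n:
        nxt.add(state + 1)
    return nxt

def dfa_states(n, alphabet=("a", "b")):
    # Subset construction: each DFA state is a set of NFA states;
    # count how many distinct sets are reachable from the start.
    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        s = todo.pop()
        for sym in alphabet:
            t = frozenset(q for p in s for q in nfa_next(p, sym, n))
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return len(seen)

for n in (1, 2, 3):
    print(f"n={n}: NFA states={n + 2}, DFA states={dfa_states(n)}")
```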

According to embodiments disclosed herein, content searching may be performed using a DFA, an NFA, or a combination thereof. According to one embodiment, a runtime processor, a co-processor, or a combination thereof may be implemented in hardware and may be configured to implement a compiler or a walker.

The compiler may compile a pattern or an input list of patterns (also referred to as signatures or rules) into a DFA, an NFA, or a combination thereof. The DFA and the NFA may be binary data structures, such as DFA and NFA graphs and tables.

The walker may perform runtime processing, that is, actions for identifying the existence of a pattern in an input stream, or for matching the pattern to content in the input stream. The content may be a payload portion of an Internet Protocol (IP) datagram, or any other suitable payload in the input stream. Runtime processing of a DFA or NFA graph may be referred to as walking the DFA or NFA graph with the payload in order to determine a pattern match. A processor configured to generate a DFA, an NFA, or a combination thereof may be referred to herein as a compiler. A processor configured to implement runtime processing of a payload using the generated DFA, NFA, or combination thereof may be referred to herein as a walker. According to embodiments disclosed herein, the network services processor 100 may be configured to implement a compiler and a walker in the security device 102.

FIG. 3A is a block diagram of another embodiment of the security device 102 of FIG. 1 in which embodiments disclosed herein may be implemented. As disclosed with reference to FIG. 1, the security device 102 may be operatively coupled to one or more networks and may include the memory 104 and the network services processor 100, which may include the acceleration unit 106. Referring to FIG. 3A, the network services processor 100 may be configured to implement a compiler 306 that generates the binary image 112, and a walker 320 that uses the binary image 112. For example, the compiler 306 may generate the binary image 112 including compiled rules data used by the walker 320 for performing pattern matching on the received packets 101a (shown in FIG. 1). According to embodiments disclosed herein, the compiler 306 may generate the binary image 112 by determining compiled rules data for a DFA, an NFA, or a combination thereof, based on at least one heuristic as described further below. The compiler 306 may determine rules data advantageously suited for the DFA and rules data advantageously suited for the NFA.

According to embodiments described herein, the compiler 306 may generate the binary image 112 by processing a rule set 310, which may include a set of one or more regular expression patterns 304 and optional qualifiers 308. From the rule set 310, the compiler 306 may generate a unified DFA 312 using subpatterns selected from all of the one or more regular expression patterns, and at least one NFA 314 for at least one pattern in the set of one or more regular expression patterns 304, for use by the walker 320 during runtime processing, as well as metadata (not shown) including mapping information for transitioning the walker 320 between states (not shown) of the unified DFA 312 and states of the at least one NFA 314.

The unified DFA 312 and the at least one NFA 314 may each be represented, data-structure-wise, as a graph or in any other suitable form, and the mappings in the metadata may be represented, data-structure-wise, as one or more tables or in any other suitable form. According to embodiments disclosed herein, if the subpattern selected from a pattern is the entire pattern, then no NFA is generated for that pattern. According to embodiments disclosed herein, each NFA generated may be for a particular pattern in the set, whereas a unified DFA may be generated based on all of the subpatterns from all of the patterns in the set.

The walker 320 walks the unified DFA 312 and the at least one NFA 314 with a payload by transitioning states of the unified DFA 312 and the at least one NFA based on consuming (i.e., processing) segments (such as bytes) of the payload from the received packets 101a. As such, the walker 320 walks the payload through the unified DFA 312 and the at least one NFA 314, which may be a per-pattern NFA generated for a single regular expression pattern.

The rule set 310 may include the set of one or more regular expression patterns 304 and may be in a Perl Compatible Regular Expression (PCRE) form or any other suitable form. PCRE has become a de facto standard for regular expression syntax in security and networking applications. As more applications requiring deep packet inspection have emerged, and as more threats have become prevalent on the Internet, the corresponding signatures/patterns for identifying viruses/attacks or applications have also become more complex. For example, signature databases have evolved from having simple string patterns to regular expression (regex) patterns with wildcards, ranges, character classes, and advanced PCRE signatures.

As shown in FIG. 3A, the optional qualifiers 308 may each be associated with a pattern of the set of regular expression patterns 304. For example, an optional qualifier 322 may be associated with a pattern 316. The optional qualifiers 308 may each be one or more qualifiers designating a desired custom, advanced PCRE signature option, or other option suitable for processing the pattern associated with the qualifiers. For example, the qualifier 322 may indicate whether a start offset option (i.e., the location in the payload of a first matching character of a pattern matched in the payload) of the advanced PCRE signature options is desired for the pattern 316.

According to embodiments disclosed herein, the compiler 306 may generate the unified DFA 312 using subpatterns 302 selected from all of the patterns in the set of one or more regular expression patterns 304. The compiler 306 may select a subpattern 302 from each pattern in the set of one or more regular expression patterns 304 based on at least one heuristic, as described further below. The compiler 306 may also generate the at least one NFA 314 for at least one pattern 316 in the set, a portion (not shown) of the at least one pattern 316 being used for generating the at least one NFA 314. At least one walk direction for runtime processing (i.e., walking) of the at least one NFA 314 may be determined based on whether the length of a selected subpattern 318 is fixed or variable, and on the location of the selected subpattern 318 within the at least one pattern 316. The compiler 306 may store the unified DFA 312 and the at least one NFA 314 in the at least one memory 104.

The compiler may determine whether the length of a selected potential subpattern is fixed or variable. For example, a subpattern such as "cdef" may be determined to have a fixed length of four because "cdef" is a string, whereas a complex subpattern including operators may be determined to have a variable length. For example, a complex pattern such as "a.*cd[^\n]{0,10}.*y" may have "cd[^\n]{0,10}" as the selected subpattern, and that subpattern may have a variable length of two to twelve.
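A minimal sketch of such a length computation (a hypothetical helper, not the compiler disclosed herein, handling only literals, character classes, and counted repeats like "{0,10}") is:

```python
import re

# One element (a character class, an escaped character, or a literal)
# optionally followed by a counted repeat "{m,n}".
ELEMENT = re.compile(r"(\[[^\]]*\]|\\.|.)(?:\{(\d+),(\d+)\})?")

def length_range(subpattern):
    """Return (min_len, max_len) of a subpattern; the length is fixed
    if and only if min_len == max_len."""
    lo = hi = 0
    for _elem, m, n in ELEMENT.findall(subpattern):
        lo += int(m) if m else 1
        hi += int(n) if n else 1
    return lo, hi

print(length_range(r"cdef"))           # (4, 4)  -> fixed length
print(length_range(r"cd[^\n]{0,10}"))  # (2, 12) -> variable length
```

A fixed-length subpattern lets the walker compute, from the subpattern's end offset, exactly where the remainder of the pattern begins; a variable-length subpattern does not.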

According to embodiments disclosed herein, subpattern selection may be based on at least one heuristic. A subpattern is a set of one or more consecutive elements from a pattern, where each element of the pattern may be represented by a node in a DFA or NFA graph for purposes of matching bytes or characters from the payload. As described above, an element may be a single text character represented by a node, or a character class represented by a node. The compiler 306 may determine which subpatterns of a pattern are better suited for an NFA based on whether a subpattern is likely to cause excessive DFA graph explosion, as described above with reference to FIGS. 2A-2G. For example, generating a DFA from a subpattern including consecutive text characters would not result in DFA graph explosion, whereas a complex subpattern, as described above, may include operators as well as characters and may, thus, cause DFA graph explosion. For example, a subpattern including a wildcard or a larger character class repeated multiple times (e.g., [^\n]* or [^\n]{16}) may result in too many states in a DFA and, thus, may be more advantageously suited for an NFA. As such, the compiler 306 may be referred to herein as a "smart compiler."

As disclosed above, selecting a subpattern from each pattern in the set of one or more regular expressions 304 may be based on at least one heuristic. According to one embodiment, the at least one heuristic may include maximizing the number of unique subpatterns selected and the length of each subpattern selected. For example, a pattern such as "ab.*cdef.*mn" may have multiple potential subpatterns, such as "ab.", "cdef", and ".*mn". The compiler may select "cdef" as the subpattern for the pattern, as it is the longest subpattern of "ab.*cdef.*mn" that is unlikely to cause DFA graph explosion. However, if the subpattern "cdef" has already been selected for another pattern, the compiler may select an alternative subpattern of the pattern "ab.*cdef.*mn". Alternatively, the compiler may replace the subpattern "cdef" with another subpattern for the other pattern, enabling the subpattern "cdef" to be selected for the pattern "ab.*cdef.*mn".

As such, the compiler 306 may select subpatterns for the patterns 304 based on a context of the possible subpatterns of each of the patterns 304, enabling the number of unique subpatterns selected and the length of each subpattern selected to be maximized. The compiler 306 may then generate the unified DFA 312 from the subpatterns 302 selected, minimizing a number of false positives (i.e., no match or only a partial match) in pattern matching of the at least one NFA 314 by increasing the probability of a pattern match in the at least one NFA 314.
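A greedy sketch of this context-aware selection (hypothetical code; the candidate subpatterns per pattern are assumed to have been extracted already) is:

```python
def select_subpatterns(candidates_per_pattern):
    """For each pattern, pick the longest candidate subpattern that no
    other pattern has claimed yet; fall back to the longest candidate
    if none is unique (see the non-unique fallback described below)."""
    chosen = {}
    taken = set()
    for pattern, candidates in candidates_per_pattern.items():
        for cand in sorted(candidates, key=len, reverse=True):
            if cand not in taken:
                chosen[pattern] = cand
                taken.add(cand)
                break
        else:                      # no unique candidate remains
            chosen[pattern] = max(candidates, key=len)
    return chosen

rules = {
    "ab.*cdef.*mn": ["ab", "cdef", "mn"],
    "xy.*cdef.*qr": ["xy", "cdef", "qr"],  # "cdef" taken -> picks "xy"
}
print(select_subpatterns(rules))
```

A production compiler would additionally weigh graph-explosion risk and make multiple passes, but the sketch shows how uniqueness and length interact across the whole rule set.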

By maximizing subpattern length, false positives in NFA processing may be avoided. A false positive in NFA processing may result in non-deterministic runtime processing and may, therefore, reduce runtime performance. Further, by maximizing the number of unique subpatterns selected, the compiler 306 enables a 1:1 transition from the unified DFA to the at least one NFA 314 generated for the pattern whose subpattern was matched in the unified DFA.

For example, if a selected subpattern were shared by multiple patterns, then a walker of the unified DFA would need to transition to multiple of the at least one NFA, because each of the at least one NFA is a per-pattern NFA, and the subpattern match from the unified DFA would represent a partial match for each of the multiple patterns. As such, maximizing the number of unique subpatterns reduces 1:N DFA-to-NFA transitions, thereby reducing runtime processing by the walker 320.

To enable the number of unique subpatterns to be maximized, the compiler 306 may compute a hash value 326 of the selected subpattern 318 and store the computed hash value 326 in association with an identifier (not shown) of the pattern 316 from which the subpattern 318 was selected. For example, the compiler 306 may compute a hash value of the subpattern selected for each pattern in the set 304. The computed hash values 324 may be stored in the at least one memory 104 as a table, or in any other suitable manner. The hash method used may be any suitable hash method. The compiler may compare a computed hash value to a list of hash values of subpatterns selected for the other patterns in the set, in order to determine whether or not the selected subpattern is unique.
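A sketch of the uniqueness check (hypothetical code; SHA-256 merely stands in for "any suitable hash method") is:

```python
import hashlib

def subpattern_hash(subpattern):
    """Stable hash of a candidate subpattern."""
    return hashlib.sha256(subpattern.encode()).hexdigest()

# hash value -> identifier of the pattern that claimed the subpattern
seen = {}

def try_claim(pattern_id, subpattern):
    """Record the subpattern unless another pattern already claimed it.
    Returns the claiming pattern's identifier, so on a collision the
    caller can decide which of the two patterns should switch to an
    alternative subpattern."""
    h = subpattern_hash(subpattern)
    return seen.setdefault(h, pattern_id)

print(try_claim(1, "cdef"))  # 1 -> unique, claimed by pattern 1
print(try_claim(2, "cdef"))  # 1 -> collision: pattern 1 already owns it
```

On a collision, the compiler can compare the lengths of the replacement candidates of both patterns and keep "cdef" for whichever pattern would otherwise be left with the shorter alternative.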

If the computed hash value is included in the list, the compiler may determine whether to (i) replace the subpattern with another subpattern from the pattern, or (ii) replace the subpattern selected for another pattern in the set with an alternative subpattern selected from that other pattern. The other pattern in the set may be identified based on its association with the computed hash value in the list. The determination of whether to replace per (i) or per (ii) may be based on comparing lengths of the subpatterns being considered for replacement, so as to maximize the lengths of the unique subpatterns being selected, as described above. Replacing a selected subpattern may include selecting a next-longest or next-highest-priority subpattern identified for a given pattern. For example, potential subpatterns may be prioritized based on a likelihood of causing DFA graph explosion, or on an expected magnitude of DFA graph explosion.

According to embodiments disclosed herein, the at least one heuristic may include identifying subpatterns of each pattern and disregarding a given subpattern of the identified subpatterns of each pattern if the given subpattern has a length less than a minimum threshold. For example, to reduce false positives in the at least one NFA, the compiler may disregard subpatterns with lengths less than the minimum threshold, as such subpatterns may result in a higher probability of false positives in the at least one NFA.

The at least one heuristic may include accessing a knowledge base (not shown) of subpatterns associated with historical frequency-of-use indicators, and disregarding a given subpattern of the identified subpatterns of each pattern if a frequency-of-use indicator for the given subpattern in the knowledge base accessed is greater than or equal to a frequency-of-use threshold. For example, application- or protocol-specific subpatterns may have a high frequency of use, such as for Hypertext Transfer Protocol (HTTP) payloads, "carriage return and line feed," or clear traffic, such as a number of consecutive zeroes from a binary file, or any other frequently used subpattern.

The at least one heuristic may include identifying subpatterns of each pattern and, for each pattern, maximizing a number of consecutive text characters in the subpattern selected by selecting a given subpattern of the identified subpatterns based on the given subpattern having a largest number of consecutive text characters of the identified subpatterns, and based on the given subpattern being unique among all of the subpatterns selected for the set of one or more regular expressions. As disclosed above, maximizing the length of the subpattern selected enables a higher probability of a match in the at least one NFA.

The at least one heuristic may include prioritizing given subpatterns of each pattern based on a subpattern type of each of the given subpatterns and on lengths of the given subpatterns. The subpattern type may be text only, alternation, single-character repetition, or multi-character repetition, and a priority order for the subpattern types, from highest to lowest, may be text only, alternation, single-character repetition, and multi-character repetition. As such, a subpattern that is a text string with a length of at least a minimum length threshold may be prioritized over a complex subpattern of variable length.

The compiler 306 may prioritize a subpattern of longer length over another subpattern of shorter length. The compiler 306 may select a unique subpattern as the selected subpattern based on the prioritizing. As described above, the unique subpattern selected may have a length of at least the minimum length threshold.

If none of the given subpatterns is both unique and of a length of at least the minimum length threshold, the compiler 306 may select a non-unique subpattern as the selected subpattern based on the prioritizing. As such, the compiler 306 may select, from a pattern, a subpattern that is a duplicate of a subpattern selected from another pattern, rather than select a subpattern with a length less than the minimum threshold. To facilitate subpattern selection, the compiler 306 may make multiple passes over the patterns and may sort possible subpatterns by length. As such, the compiler's subpattern selection for a given pattern of the set of one or more regular expressions 304 may be performed within a context of the subpattern selections for the other patterns of the set of one or more regular expressions 304.
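The prioritization described above can be sketched as a sort key (hypothetical code; the type labels are illustrative, and candidates are assumed to be pre-classified):

```python
# Priority order from the text: text only > alternation >
# single-character repetition > multi-character repetition;
# ties are broken by preferring the longer subpattern.
TYPE_RANK = {"text": 0, "alternation": 1,
             "single_char_repeat": 2, "multi_char_repeat": 3}

def prioritize(candidates):
    """candidates: (subpattern, subpattern_type) pairs; best first."""
    return sorted(candidates, key=lambda c: (TYPE_RANK[c[1]], -len(c[0])))

cands = [("a{5}", "single_char_repeat"),
         ("cdef", "text"),
         ("(ab|cd)", "alternation"),
         ("verylongtext", "text")]
print(prioritize(cands)[0])  # ('verylongtext', 'text')
```

The compiler would then walk this ordering, taking the first candidate that is unique (or the first candidate overall if no unique one meets the minimum length).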

As described above, the qualifier 322 may indicate that reporting a start offset is desired. However, the start offset may not be easily discernible. For example, given a payload such as "axycamb," finding the start offset of a match in the payload for a pattern such as "a.*b" or "a.*d" may be difficult, because either "axycamb" or "amb" may be the matching substring. As such, offsets of both instances of "a" in the payload may need to be tracked as potential start offsets. According to embodiments disclosed herein, potential start offsets need not be tracked, as the start offset is not determined until a match of the entire pattern has been determined to be found in a payload. The match of the entire pattern may be determined by utilizing match results from the unified DFA, the at least one NFA, or a combination thereof.

According to embodiments disclosed herein, if a payload in the received packets 101a includes content matching the selected subpattern 318 of the pattern 316, the walker may transition to walk the at least one NFA for the pattern 316. The walker 320 may report the match of the selected subpattern 318 and an offset that identifies the location, in the received packets, of the last character of the matching subpattern as an end offset of the subpattern in the payload. The subpattern match may be a partial match for the pattern if the subpattern is a subset of the pattern. As such, the walker 320 may continue searching the payload for the remainder of the pattern by walking the at least one NFA for the pattern, in order to determine a final match of the pattern. It should be understood that the pattern may traverse one or more payloads of the received packets 101a.
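For a literal subpattern, the reported end offset is simply the position of the subpattern's last matching character in the payload. A minimal sketch (hypothetical helper, literal-only matching):

```python
def report_subpattern_match(payload, subpattern):
    """If the literal subpattern occurs in the payload, return the
    offset of its LAST character (the end offset handed to the NFA
    stage); return None when there is no match."""
    start = payload.find(subpattern)
    if start < 0:
        return None
    return start + len(subpattern) - 1

# "cdef" inside "xxcdefyy": starts at offset 2, so it ends at offset 5.
print(report_subpattern_match("xxcdefyy", "cdef"))  # 5
```

The NFA stage would then resume matching the remainder of the pattern at (or, for a reverse walk, before) this end offset.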

FIG. 3B is a flow diagram (350) of an example embodiment of a method that may be implemented in at least one processor operatively coupled to at least one memory in a security device operatively coupled to a network. The method may begin (352) and select a subpattern from each pattern in a set of one or more regular expression patterns based on at least one heuristic (354). The method may generate a unified deterministic finite automaton (DFA) using the subpatterns selected from all of the patterns in the set (356). The method may generate at least one non-deterministic finite automaton (NFA) for at least one pattern in the set, a portion of the at least one pattern being used for generating the at least one NFA, and at least one walk direction for runtime processing of the at least one NFA being determined based on whether a length of the subpattern selected is fixed or variable, and on a location of the subpattern selected within the at least one pattern (358). The method may store the unified DFA and the at least one NFA in the at least one memory (360). The method thereafter ends (362) in the example embodiment.

As described above, the compiler 306 may generate the unified DFA 312 and the at least one NFA 314 to enable the walker 320 to search for matches of the one or more regular expression patterns 304 in the received packets 101a. The compiler 306 may select a subpattern from each pattern in the set of one or more regular expression patterns 304 based on at least one heuristic. The unified DFA 312 may be generated using the subpatterns 302 selected from all of the patterns in the set 304. The compiler 306 may generate the at least one NFA 314 for at least one pattern 316 in the set 304. As such, the compiler 306 may be configured to compile the rule set 310 into the binary image 112, identifying portions of the rule set 310 that may be best suited for DFA or NFA processing. Thus, the binary image 112 may include at least two sections, with a first section for DFA processing and a second section for NFA processing, such as the unified DFA 312 and the at least one NFA 314. As disclosed above, the binary image 112 may include compiled rules data for both the DFA and the NFA, or may be multiple binary images separating the DFA compiled rules data from the NFA compiled rules data. For example, the NFA compiled rules data may be separate from the DFA compiled rules data and stored in a graph memory operatively coupled to the HNA 108. The memory 104 may be a graph memory, which may be multiple memories, such as the graph memory 456 disclosed below with regard to FIG. 4.

FIG. 4 is a block diagram 450 of an example embodiment of an environment of the HNA 108 of FIG. 1. According to embodiments disclosed herein, the HFA 110 may be configured to implement functionality of the walker 320 with respect to DFA processing, and the HNA 108 may be configured to implement functionality of the walker 320 with respect to NFA processing.

According to embodiments disclosed herein, the HNA 108 may be configured to read at least one instruction 453 from an instruction queue 454. The instruction queue 454 may be configured to store the at least one instruction 453, which may be sent by a host (not shown) for processing by the HNA 108. The at least one instruction 453 may include at least one job, such as S1 459a, S2 459b, or S3 459c. Each of the at least one job may be determined based on partial match results, identified by the HFA co-processor 110 of FIG. 1, for a given subpattern of the subpatterns 302 of FIG. 3A being matched in an input stream.

A given job of the at least one job may indicate a given NFA of the at least one NFA 314, at least one given node of the given NFA, at least one given offset within a given payload, and at least one walk direction, each corresponding to a node of the at least one given node. Each of the at least one job may include results from processing by the HFA, enabling the HNA to advance a match, corresponding to the given subpattern, in the given NFA for a given pattern of the at least one pattern 304. As such, each job represents partial match results determined by the HFA co-processor 110 in order to advance the match of the given pattern by the HNA co-processor 108.

The HNA 108 may process the at least one instruction 453 by reading at least one pointer (not shown), or other suitable instruction information, stored therein. The at least one pointer may include an input buffer pointer (not shown) to an input buffer 458. The at least one instruction 453 may further include a payload pointer (not shown) to a payload 462, a result buffer pointer (not shown) to a match result buffer 466, a save buffer pointer (not shown) to a save buffer 464, and a run stack pointer (not shown) to a run stack 460.

The input buffer 458, the run stack 460, and the save buffer 464 may be referred to herein as an input stack, a run stack, and a save stack, respectively, although the input buffer 458, the run stack 460, and the save buffer 464 may or may not exhibit the last-in-first-out (LIFO) properties of a stack. The input buffer 458, the run stack 460, and the save buffer 464 may be located within a same physical buffer or within different physical buffers. If located within the same physical buffer, entries of the input buffer 458, the run stack 460, and the save buffer 464 may be differentiated based on a field setting of the entries, or in any other suitable manner. The input buffer 458 and the run stack 460 may be located in a same physical buffer (which may be on-chip), whereas the save buffer 464 may be located in another physical buffer (which may be off-chip).

The at least one job of the at least one instruction 453, such as S1 459a, S2 459b, or S3 459c, may be stored in the input stack 458 for processing by the HNA 108. Each of the at least one job of the at least one instruction may belong to a same given payload, such as the payload 462, that was processed by the HFA 110.

The HNA 108 may be configured to load (i.e., fetch or retrieve) at least one job, such as the job S1 459a, S2 459b, or S3 459c, from the input buffer 458 based on the input buffer pointer. The HNA 108 may push (i.e., store) the at least one job to the run stack 460. The HNA 108 may pop (i.e., read, fetch, load, etc.) a given job, such as the entry S1 459a, S2 459b, or S3 459c, from the run stack and process the given job. Each of the at least one job (e.g., S1 459a, S2 459b, or S3 459c) may include a payload offset (not shown) to a segment (not shown) of the payload 462, and a pointer to a graph 457 that may be a given finite automaton of at least one finite automaton, such as the at least one NFA 314 of FIG. 3A.

The HNA 108 may load (i.e., fetch) graph 457 from graph memory 456, which may be included in the binary image 112 of FIG. 1 and FIG. 3A, and begin processing graph 457 with the payload segment located at the corresponding payload offset of payload 462. Graph memory 456 may be a plurality of memories. Graph memory 456 may be operatively coupled to the HFA 110 and the HNA 108. The HNA 108 may process graph 457 by walking nodes of graph 457 with payload segments. A partially matching path of graph 457 may include at least two nodes of graph 457 that match consecutive segments of the payload to a given pattern used to generate graph 457. A partially matching path may be referred to herein as a thread or an active thread.
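The notion of a partially matching path (thread) can be illustrated with a small software sketch. The list-of-characters graph and the `walk_path` helper below are hypothetical structures chosen for illustration, not the hardware node format:

```python
# Minimal sketch: a graph is a list of character-matching nodes; a "thread" is
# the path of nodes walked so far while consecutive payload segments keep matching.
def walk_path(graph, payload, offset=0):
    """Walk payload segments through graph nodes; return the partially
    matched path (the active thread) as a list of node indices."""
    thread = []
    node = 0
    while node < len(graph) and offset < len(payload):
        if payload[offset] != graph[node]:   # mismatch ends the walk
            break
        thread.append(node)                  # this node extends the partial match
        node += 1
        offset += 1                          # consume one segment per matched node
    return thread

# Pattern "ab" as a two-node graph; payload "abx" partially matches both nodes.
print(walk_path(["a", "b"], "abx"))  # [0, 1]
```

A thread is thus identified by how far into the graph, and how far into the payload, the match has advanced.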

The HNA 108 may process graph 457 with payload segments from payload 462, pushing entries onto and popping entries from run stack 460 to save and restore its position in graph 457. For example, the HNA 108 may need to save its position in the graph if a walked node presents multiple options for a next node to be walked. For example, the HNA 108 may walk a node that presents multiple processing path options, such as a fork represented in the graph. According to embodiments disclosed herein, nodes of a DFA or NFA are associated with a node type. Nodes associated with a split type may present multiple processing path options. The split type is further disclosed below with reference to FIG. 5A.

According to embodiments disclosed herein, the HNA 108 may be configured to select a given path of the multiple processing paths and push an entry onto run stack 460 that may enable the HNA 108 to return and advance along an unselected path of the multiple processing paths based on determining a mismatch (i.e., negative) result at a walked node along the selected path. As such, pushing the entry onto run stack 460 may save a position in graph 457 that represents unexplored context. The unexplored context may indicate a given node of graph 457 and a corresponding payload offset, to enable the HNA 108 to return to the given node and walk the given node with a given segment of payload 462, as the given segment may be located at the corresponding payload offset within payload 462. As such, run stack 460 may be used to enable the HNA 108 to remember, and later walk, an unexplored path of graph 457. Pushing or storing an entry that indicates a given node and a corresponding offset within a given payload may be referred to herein as storing unexplored context, a thread, or an inactive thread. Popping, fetching, or loading an entry that indicates the given node and the corresponding offset within the given payload, in order to walk the given node with a segment located at the corresponding offset within the given payload, may be referred to herein as activating a thread. Discarding an entry that indicates the given node and the corresponding offset within the given payload may be referred to herein as flushing an entry or retiring a thread.
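The push/pop discipline described above can be sketched as follows. The `at_fork` and `on_mismatch` helpers are hypothetical names chosen for illustration, not the hardware interface:

```python
# When a walked node offers two paths, the unexplored one is saved as
# (node, payload_offset) so the walk can resume there after a mismatch.
run_stack = []

def at_fork(selected, unexplored, offset):
    run_stack.append((unexplored, offset))   # store unexplored context (inactive thread)
    return selected, offset                  # continue along the chosen path

def on_mismatch():
    if run_stack:
        return run_stack.pop()               # activate the saved thread
    return None                              # no saved context: the walk ends

node, off = at_fork("N2", "N3", 1)   # choose N2, remember N3 at offset 1
print(node, off)                     # N2 1
print(on_mismatch())                 # ('N3', 1)
```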

Run stack 460 may enable the HNA 108 to save its position in graph 457 in an event an end of payload 462 is reached while walking segments of payload 462 with graph 457. For example, the HNA 108 may determine that payload 462, or a portion of payload 462, partially matches a given pattern and that a current payload offset of payload 462 is an end offset of payload 462. As such, the HNA 108 may determine that only a partial match of the given pattern was found and that the entirety of payload 462 has been processed. The HNA 108 may then save contents of run stack 460 to save buffer 464 in order to continue the walk with a next payload that corresponds to a same flow as the processed payload 462. Save buffer 464 may be configured to store at least one run stack entry of run stack 460, mirroring a run-time state of run stack 460, in an event the entirety of payload 462 has been processed.
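The save/restore across payload boundaries can be sketched with a flow-keyed store. The dictionary-based `save_buffer` below is an assumed illustration, not the hardware save buffer:

```python
# When the payload is exhausted mid-match, the run stack is saved per flow and
# restored for the next payload of the same flow, so the walk continues across
# packet boundaries.
save_buffer = {}

def save_run_stack(flow_id, run_stack):
    save_buffer[flow_id] = list(run_stack)   # mirror the run state
    run_stack.clear()

def restore_run_stack(flow_id):
    return save_buffer.pop(flow_id, [])

stack = [("N4", 3)]                  # partial match pending when payload 1 ended
save_run_stack("flow-a", stack)
print(stack)                         # []
print(restore_run_stack("flow-a"))   # [('N4', 3)]
```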

Based on finding a final (i.e., complete or entire) match of the pattern, the HNA 108 may pop and discard entries in run stack 460 that are associated with the current job, for example, the job loaded from the input buffer, such as S1 459a, and save a match result (not shown) to match result buffer 466. Alternatively, the HNA 108 may continue to process entries of run stack 460 that are associated with the current job, as all possible matching paths may be of interest.

The match result may include a node address associated with a node at which the final match of the pattern was determined. A node at which the final match of a pattern is determined may be referred to herein as a marked node. The node address or another identifier of a final match location in graph 457, an identifier of the matching pattern, a length of the matching pattern, any other suitable match result, or a combination thereof, may be included in the match result.

Based on processing all run stack entries associated with the current job, the HNA 108 may load, from the run stack, a next job that was previously loaded from input buffer 458 (e.g., S2 459b), as the HNA 108 may be configured to process the jobs of instructions 453 sequentially. The HNA 108 may then fetch a next graph (not shown) from graph memory 456, walk the next graph with one or more payload segments from payload 462, and continue to process additional jobs until run stack 460 is empty.
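The sequential job loop can be sketched as below. The `drain` helper and the `process_job` callback are hypothetical illustrations of the order of processing, not the HNA interface:

```python
# Jobs loaded from the input buffer are pushed onto the run stack and popped one
# at a time; walking a job may push further entries (unexplored contexts), all of
# which are drained before the next job is taken up, until the stack is empty.
def drain(input_buffer, process_job):
    run_stack = list(reversed(input_buffer))   # push S1..S3 so S1 pops first
    results = []
    while run_stack:                           # continue until run stack is empty
        job = run_stack.pop()
        results.extend(process_job(job, run_stack))
    return results

jobs = ["S1", "S2", "S3"]
print(drain(jobs, lambda job, stack: [f"done {job}"]))  # jobs processed in order
```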

Based on finding a mismatch of payload 462 while walking graph 457 with payload 462, the HNA 108 may pop an entry associated with the current job (e.g., S1 459a) from run stack 460 and walk a next node with a next segment of payload 462 based on contents of the popped entry. If run stack 460 does not include an entry associated with the current job, the HNA 108 may terminate the current job and may load, from run stack 460, a next job that was previously loaded from input buffer 458 (e.g., S2 459b). As such, the HNA 108 may be configured to walk a next graph based on the loaded next job and to continue processing additional jobs until run stack 460 is empty.

FIG. 5A is a block diagram 500 of an example embodiment of a per-pattern NFA graph 504 that may be used by the walker 320 to match a regular expression pattern 502 in an input stream (not shown). As disclosed above, the HNA 108 may be configured to implement functionality of the walker 320 with respect to NFA processing.

In the example embodiment, the input stream may include a packet (not shown) with a payload 542. The regular expression pattern 502 is a pattern "h[^\n]*ab" that specifies the character "h" followed by an unlimited number of consecutive characters not matching a newline character (i.e., [^\n]*). The unlimited number may be zero or more. The pattern 502 further includes the characters "a" and "b" consecutively following the unlimited number of characters not matching the newline character. In the example embodiment, the payload 542 includes segments 522a-d (i.e., h, x, a, and b) with respective offsets 520a-d (i.e., 0, 1, 2, and 3) within the payload 542.
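The match semantics of pattern 502 against payload 542 can be checked with an ordinary software regex engine, used here only as an analogue of the NFA walk, not as the patented hardware:

```python
import re

# "h[^\n]*ab" matches the whole payload "hxab": the segments at offsets 0
# through 3 (h, x, a, b) form the final match.
pattern = re.compile(r"h[^\n]*ab")
m = pattern.search("hxab")
print(m.span())    # (0, 4): offsets 0-3 inclusive
print(m.group())   # 'hxab'
```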

It should be understood that the regular expression pattern 502, the NFA graph 504, the payload 542, the segments 522a-d, and the offsets 520a-d represent examples for illustrative purposes, and that the systems, methods, and corresponding apparatus disclosed herein may be applicable to any suitable regular expression pattern, NFA graph, payload, segment, and offset. Further, it should be understood that the NFA graph 504 may be a sub-section of a larger NFA graph (not shown). In addition, the payload 542 may be a portion of a larger payload (not shown), and the portion may be at the beginning, the end, or any location of the larger payload, resulting in offsets that differ from those in the example embodiment.

In the example embodiment, the NFA graph 504 is a per-pattern NFA graph configured to match the regular expression pattern 502 to the input stream. For example, the NFA graph 504 may be a graph including a plurality of nodes generated by the compiler 306, such as nodes N0 506, N1 508, N2 510, N3 512, N4 514, and N5 515. The node N0 506 may represent a starting node of the pattern 502, and the node N5 515 may represent a marked node of the pattern 502. The marked node N5 515 may be associated with an indicator (not shown) that reflects a final (i.e., complete or entire) match of the pattern 502 to the input stream. As such, the walker 320 may determine that the pattern 502 is matching in the input stream based on traversing the marked node N5 515 and detecting the indicator. The indicator may be a flag or field setting of metadata (not shown) associated with the marked node, or any other suitable indicator.

According to embodiments disclosed herein, the walker 320 may walk the segments 522a-d of the payload 542 one segment at a time through the NFA graph 504 to match the regular expression pattern 502 to the input stream. A given segment of the segments 516 used for walking a given node may be determined based on its corresponding offset of the offsets 518 being a current offset within the payload 542. According to embodiments disclosed herein, the walker 320 may update the current offset by incrementing or decrementing the current offset. For example, the walker 320 may walk the NFA graph 504 in a forward or a reverse direction and, thus, may walk segments from the payload 542 in a forward 543 or a reverse 546 direction by incrementing or decrementing the current offset, respectively.

The nodes N0 506, N2 510, N3 512, and N4 514 may be configured to match a respective element to a given segment of the payload 542, whereas the nodes N1 508 and N5 515 may be nodes of node types that do not indicate matching functionality and, thus, do not process from the payload 542. In the example embodiment, the node N1 508 is a split node that presents multiple transition path options to the walker 320. For example, walking the split node N1 508 presents the epsilon paths 530a and 530b. According to embodiments disclosed herein, the walker 320 may select a given path of the multiple paths 530a and 530b based on an implicit setting mutually understood with the compiler 306. For example, the compiler 306 may generate the NFA graph 504 based on an implicit understanding that the walker 320 follows a deterministic path, such as an implicit understanding that the walker 320 selects the upper epsilon path 530a based on walking the split node 508. According to embodiments disclosed herein, the upper epsilon path 530a may be selected as the upper epsilon path 530a represents a lazy path. A lazy path may be a path that represents a shortest possible match of elements.
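The lazy-path behavior described above corresponds, in ordinary regex notation, to a non-greedy quantifier. The Python snippet below is only a software analogue of the selection, not the hardware walk:

```python
import re

# The lazy form "*?" finds the shortest possible match, mirroring the walker
# preferring the upper epsilon path first; the greedy form finds the longest.
payload = "habab"
print(re.search(r"h[^\n]*?ab", payload).group())  # 'hab'   (shortest match)
print(re.search(r"h[^\n]*ab",  payload).group())  # 'habab' (greedy: longest)
```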

According to embodiments disclosed herein, the split node 508 may be associated with split node metadata (not shown) to present the multiple path options. For example, in the example embodiment, the split node metadata may indicate multiple next nodes, such as the nodes N2 510 and N3 512, either directly or indirectly. If the multiple next nodes are indicated directly, the metadata may include absolute addresses of, or pointers to, the next nodes N2 510 and N3 512. If the multiple next nodes are indicated indirectly, the metadata may include indices or offsets that may be used to resolve absolute addresses of, or pointers to, the next nodes N2 510 and N3 512. Alternatively, other suitable forms of next node addresses for directly or indirectly indicating the multiple next nodes may be used.

The implicit understanding may include configuring the walker 320 to select a given next node of the multiple next nodes based on node metadata included at a particular entry location within the split node metadata. The compiler 306 may be configured to generate the split node metadata including an indication of the given next node at the designated entry location. As such, the compiler 306 may generate the NFA graph 504 with the implicit understanding that a given path, such as the upper epsilon path 530a, will be selected by the walker 320 at the split node N1 508.

FIG. 5B is a table 538 of an example embodiment of processing cycles for walking the NFA graph of FIG. 5A with the payload 542. It should be understood that a processing cycle may include one or more clock cycles.

As shown in the table 538, the processing cycles 540a-h may include walking a current node 530 with a segment from the payload 542 at a current offset 532 to determine a match result 534, and determining a walk action 536 based on the match result 534. In the example embodiment, the node N0 506 may have a character node type. For example, the node N0 506 may be a character node configured to match the character "h" in the input stream. In the example embodiment, the walker 320 may walk the starting node N0 506 with the segment 522a (i.e., "h") at the current offset 520a in the processing cycle 540a.

As the segment 522a matches the character "h" at the node N0 506, the walker 320 may determine that the match result 534 is a positive match result. As specified by the compiler 306 via metadata (not shown) associated with the starting node N0 506, the walker 320 may walk in a forward direction and fetch the next node indicated by the metadata associated with the node N0 506, and may increment the current offset from 520a (i.e., "0") to 520b (i.e., "1"). The next node indicated by the node N0 506 is the split node N1 508 in the example embodiment. As such, the walker 320 takes the action 536 for the processing cycle 540a that includes updating the current offset to "1" in the payload 542 and transitioning to the split node N1 508. Transitioning may include fetching (also referred to herein as loading) the split node N1 508.

As the split node N1 508 presents multiple transition path options, such as the epsilon paths 530a and 530b, the action 536 for the processing cycle 540b may include selecting the upper epsilon path 530a and fetching the node N2 510 independent of the payload 542, without consuming (i.e., processing) from the payload 542. As no matching function is performed by the split node N1 508, the current offset/segment 532 is unchanged and, thus, no payload is consumed (i.e., processed) for the processing cycle 540b.

As the split node N1 508 presents multiple path options, the action 536 may include storing unexplored context, such as by storing an indirect or direct identifier of the node N3 512 and the current offset 520b (i.e., "1"). The selected transition path may be referred to herein as a current or active thread, and each untraversed transition path that is stored may be referred to herein as a stored thread. Each thread may be identified by a respective node identifier and offset in the payload. As such, the unexplored context may identify an unexplored thread (i.e., path).

Storing the unexplored context may enable the walker 320 to remember to return to the node N3 512 in order to walk the node N3 512 with the segment located at the offset 520b (i.e., "1") in the payload 542, in an event a negative match result occurs along the selected partially matching path, for example, if the negative match result is determined at the node N2 510 or at a node along a path extending from the node N2 510. According to embodiments disclosed herein, the unexplored context may be marked with a Discard Unexplored Processing (DUP) indicator that indicates whether the walker 320 discards or processes the unexplored context in an event a final match of the pattern 502 is identified along the selected transition path.

For example, based on reaching the marked node N5 515 that indicates a final (i.e., entire or complete) match of the pattern 502 in the input stream, the walker 320 may employ the DUP indicator to determine whether to process the unexplored context, by walking the node N3 512 with the segment "x" at the offset 520b in an effort to determine another path of the NFA graph 504 matching the pattern 502, or whether to discard the unexplored context. Marking the unexplored context with the DUP indicator may include marking the unexplored context in any suitable manner, such as by setting a bit or field associated with the unexplored context to true, to indicate desired processing of the stack entry, or to false, to indicate desired discarding of the stack entry.

Whether or not a stored thread is traversed may be determined by the compiler 306. For example, the compiler 306 may control whether a setting is configured, per metadata of each node, to set the DUP indicator. Alternatively, the compiler 306 may configure a global setting included in global metadata associated with the finite automaton, specifying that all stored threads are to be traversed, enabling all possible matches to be identified.

In the example embodiment, selection of the epsilon transition path 530a may result in a match failure being detected at the node N2 510 or at a subsequent node of the current thread, such as N4 514. As such, the stored thread for the epsilon transition path 530b may be traversed if the match failure is detected. Alternatively, if specified by the compiler 306, the epsilon transition path 530b may be traversed regardless of whether or not traversing the epsilon transition path 530b results in a match failure being detected.

Storing the untraversed transition path may include pushing an entry onto a stack, such as the run stack 460 of FIG. 4, by storing an identifier of the next node N3 512 in association with an indication of the current offset 520b in the entry. The identifier of the next node N3 512 may be a value, a pointer, or any other suitable indicator of the next node. The value of the offset may be a numeric value, a pointer, or any other suitable value that identifies a location of a segment 516 in the payload 542.

According to the example embodiment, based on selecting the upper path (i.e., the epsilon transition path 530a), the walker 320 may fetch the node N2 510 and try to match the segment 522b (i.e., "x") at the current offset 520b (i.e., "1") to the element "a" of the node N2 510 in the processing cycle 540c. As "x" does not match the element "a" at the node N2 510, the action 536 for the processing cycle 540c may include popping an entry from the run stack 460. The popped entry 544b may be a most recently pushed entry, such as the stored entry 544a indicating the node N3 512 and the offset 520b (i.e., "1") in the example embodiment.

The walker 320 may transition to and walk the node N3 512 with the segment "x" located at the offset 520b in the payload 542. As such, the processing cycle 540d shows that the match result 534 is positive for the processing cycle 540d. The action 536 for the processing cycle 540d may include updating the current offset to the offset 520c and transitioning back to the split node N1 508, which may be the next node indicated by the node N3 512.

As all arc transitions from the split node 508 are epsilon transitions, the walker 320 may again select a path of the multiple path options and does not consume (i.e., process) a segment from the payload 542, as the current offset is not updated for the processing cycle 540e. In the example embodiment, the walker 320 again selects the epsilon transition path 530a. As such, the walker 320 again stores a thread by pushing the node N3 512 and the current offset, now 520c (i.e., "2"), onto the run stack 460. As shown for the processing cycle 540f, the walker 320 fetches the node N2 510 and matches the segment 522c (i.e., "a") at the offset 520c (i.e., "2") to the element "a" of the node N2 510. As "a" matches at the node N2 510, the walker 320 updates the current offset to 520d (i.e., "3") and transitions to the node N4 514, which is specified by metadata (not shown) of the node N2 510 as configured by the compiler 306. For example, the metadata of the node N2 510 may specify the transition 511 from a given node, such as the node N2 510, to a next node, such as the node N4 514, via a next node address (not shown) associated with the given node N2 510. According to embodiments disclosed herein, the next node address may be configured to identify both the next node and a given memory, of the plurality of memories 456, to which the compiler 306 distributed the next node for storing.

As such, for the processing cycle 540g, the walker 320 may fetch the next node N4 514 and the next segment 522d (i.e., "b") at the offset 520d. As "b" matches at the node N4 514, the walker 320 may transition to the next node N5 515. The node N5 515 is a marked node associated with an indicator signifying a final (i.e., complete or entire) match of the regular expression pattern 502 in the input stream. As such, for the processing cycle 540h, the walker 320 may discontinue walking along the current path and report the final match by storing an entry in the match result buffer 466. The walker 320 may then check the run stack 460 for stored threads and either discard the stored threads or activate them, as indicated by their respective DUP indicators. As such, the walker 320 pops the entry identifying the node N3 512 and the offset 520c (i.e., "2") and determines, based on the DUP indicator associated with the popped entry, whether to activate the stored thread by walking the node N3 512 with the segment 522c at the offset 520c, or whether to discard the stored thread.
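The whole walk through the example graph can be reproduced with a runnable sketch. The node encoding below is hypothetical, not the hardware format, and the sketch processes all stored threads after a match, corresponding to the alternative disclosed above in which all possible matching paths are of interest:

```python
# Graph 504 for "h[^\n]*ab": split nodes push the lower epsilon path as
# unexplored context (node, offset) and follow the upper path first (lazy);
# character nodes consume one payload segment; the marked node reports a match.
NFA = {
    "N0": ("char", ("h", "N1")),
    "N1": ("split", ("N2", "N3")),   # upper epsilon path listed first
    "N2": ("char", ("a", "N4")),
    "N3": ("not_newline", "N1"),     # [^\n], loops back to the split
    "N4": ("char", ("b", "N5")),
    "N5": ("marked", None),
}

def walk_nfa(nfa, payload):
    run_stack, matches = [("N0", 0)], []
    while run_stack:
        node, off = run_stack.pop()          # activate a thread
        while node is not None:
            kind, data = nfa[node]
            if kind == "marked":
                matches.append(off)          # final match; offset past last segment
                node = None
            elif kind == "split":
                upper, lower = data
                run_stack.append((lower, off))   # store unexplored context
                node = upper                     # no segment consumed
            elif kind == "char":
                ch, nxt = data
                if off < len(payload) and payload[off] == ch:
                    node, off = nxt, off + 1
                else:
                    node = None              # mismatch: pop next stored thread
            else:                            # not_newline
                if off < len(payload) and payload[off] != "\n":
                    node, off = data, off + 1
                else:
                    node = None
    return matches

print(walk_nfa(NFA, "hxab"))   # [4]: segments at offsets 0-3 matched "h[^\n]*ab"
```

The first pop and mismatch at N2 with "x", the resumed thread at N3, and the final match after "b" follow the same order as processing cycles 540a-h.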

The embodiments disclosed herein may enable optimized match performance due to the combined DFA and NFA type processing disclosed above. For example, the embodiments disclosed above may reduce a number of false positives in NFA processing, as the NFA processing may be based on partial matches identified via DFA processing. In addition, as the embodiments disclosed herein include per-rule (i.e., per-pattern) NFAs that may be identified via the DFA processing, the embodiments disclosed herein further optimize match performance.

As disclosed above, the DFA 312 is a unified DFA, and the at least one NFA 314 is each a per-pattern NFA. Running payloads through the unified DFA 312 by the HFA 110 may be considered a first parsing block that marks starting points of patterns (intermediate matches) and provides those starting points to the at least one NFA 314, which may continue the walk from the marks to determine final matches. For example, based on partial match results determined by processing segments of payloads of an input stream through the unified DFA 312, the walker 320 may determine that a given number of rules (i.e., patterns) of the rule set 310 need further processing, and the HFA 110 may produce pattern match results that may be converted into the given number of NFA walks, as the at least one NFA 314 is each a per-pattern NFA.

FIG. 6 is a block diagram 600 of an example embodiment of an environment 600 for the walker 320. An input stream of packets 101a may be received 602 and may include packets 616a-f that may be from different flows, such as a first flow 614a and a second flow 614b. For example, the packets P1 616a, P4 616d, and P6 616f may be packets of the first flow 614a, whereas the packets P2 616b, P3 616c, and P5 616e may belong to the second flow 614b. The processing cores 603 may be general purpose processor cores of the security appliance 102, as disclosed above with reference to FIG. 1, that may be configured to perform higher level protocol processing of the packets 101a and may be configured to offload the pattern matching methods to the HFA 110 and the HNA 108.

The packets 101a may be forwarded 604 to the HFA 110, and the walker 320 may walk segments of the packets 101a through the unified DFA, such as the unified DFA 312 of FIG. 3A, to determine partial matches of the regular expression patterns 304 in the input stream. The walker 320 may be configured to forward 606 results of the partial matches, which may identify offsets of segments of the packets 101a and nodes of the per-pattern NFAs, such as the at least one NFA 314, at which to advance the partial matches by the HNA 108. The HNA 108 may walk the at least one NFA 314 based on the partial match results of the DFA processing of the HFA 110, as the partial match results may be forwarded 608 to the HNA 108 with corresponding packets of the packets 101a.

HNA 108可以允许确定部分匹配结果618c、618b、和618a形成对该输入流中的这些正则表达式图样304中的一个给定的正则表达式图样的一个最终(即,完整的)匹配。例如,通过间接地经由处理内核603或直接地从HFA 110,从HFA 110转发606这些HFA部分匹配结果到HNA 108,每个通过HFA 110部分匹配的数据包可以使得HNA 108能够使该部分匹配前进,因为行走器320可以用来自HFA 110的“提示”或起始信息使数据包101a的区段走过该至少一个NFA 314。The HNA 108 can determine that the partial match results 618 c, 618 b, and 618 a form a final (i.e., complete) match for a given regular expression pattern 304 in the input stream. For example, by forwarding 606 the HFA partial match results from the HFA 110 to the HNA 108, either indirectly via the processing core 603 or directly from the HFA 110, each partially matched packet passing through the HFA 110 can enable the HNA 108 to advance the partial match because the walker 320 can use the “hint” or start information from the HFA 110 to walk segments of the packet 101 a through the at least one NFA 314.

例如,如以上参考图4披露的,输入堆栈458可以包括至少一个工作,如该至少一个指令453中的S1 459a、S2 459b、或S3 459c,用于通过HNA 108进行处理。该至少一个指令的该至少一个工作可以各自属于一个相同的给定的有效载荷,如有效载荷462,该有效载荷由HFA 110处理。基于由HFA 110"预筛选"的数据包的此类"提示"或起始信息可以包括NFA开始节点以及相应的有效载荷区段的偏移,用于以一个每图样NFA行走,如以上所披露的。于是,行走器320可以确定数据包101a的最终匹配结果610,该最终匹配结果610可以从HNA 108转发到处理内核603,并且数据包101a可以在适当时作为网络中的数据包101b然后被转发612。For example, as disclosed above with reference to FIG. 4, the input stack 458 may include at least one job, such as S1 459a, S2 459b, or S3 459c of the at least one instruction 453, for processing by the HNA 108. The at least one job of the at least one instruction may each belong to a same given payload, such as the payload 462, that was processed by the HFA 110. Such "hints" or starting information, based on packets "pre-screened" by the HFA 110, may include an NFA start node and a corresponding payload segment offset for walking a per-pattern NFA, as disclosed above. The walker 320 may then determine final match results 610 for the packets 101a; the results 610 may be forwarded from the HNA 108 to the processing core 603, and the packets 101a may then be forwarded 612 into the network as packets 101b, as appropriate.
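As an illustrative sketch of the "hint" or starting information described above, a work entry on the input stack can be modeled as a small record pairing an NFA start node with a payload segment offset. All type and field names below (MatchHint, nfa_id, and so on) are assumptions for illustration and do not appear in the disclosure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MatchHint:
    """One unit of work handed from the HFA (DFA stage) to the HNA (NFA
    stage). Field names are illustrative; the text only requires that a
    hint identify an NFA start node and a payload segment offset."""
    nfa_id: int          # which per-pattern NFA to walk
    start_node: int      # node at which the NFA walk resumes
    payload_offset: int  # offset of the partially matched payload segment

# An input stack of work entries (e.g., S1, S2, S3) for one payload:
input_stack = [
    MatchHint(nfa_id=0, start_node=3, payload_offset=17),
    MatchHint(nfa_id=1, start_node=0, payload_offset=4),
]

# The HNA pops an entry and resumes the walk at (start_node, payload_offset):
work = input_stack.pop()
```

Each popped entry lets the NFA stage continue exactly where the DFA stage left off, rather than re-walking the payload from its start.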

除了可以减少NFA处理的误报数目的通过HFA 110进行的此类数据包预筛选之外,在此披露的实施例可以通过将每个每图样NFA 的节点基于节点局域性分布到在一个存储器层次中的多个存储器来进一步优化匹配性能。由于每个NFA可以是一个每图样NFA,在此披露的实施例可以基于如下理解有利地将每个每图样NFA的节点分布到在一个层次中的多个存储器:规则(即,图样)越长,从该规则的结束部分生成的节点需要被访问(即,行走或遍历)的可能性就越小。通过将该每图样NFA的每个的更早的节点存储在相对更快的(即,更高性能) 的存储器中,在此披露的实施例可以进一步优化匹配性能。应当理解的是,因为此类节点分布可以是基于层级到存储器映射,所以节点可以有利地基于所映射的层级来分布,从而能够采用优化匹配性能的任何合适的分布。In addition to such packet pre-screening by HFA 110, which can reduce the number of false positives processed by NFAs, embodiments disclosed herein can further optimize matching performance by distributing the nodes of each per-pattern NFA across multiple memories within a memory hierarchy based on node locality. Since each NFA can be a per-pattern NFA, embodiments disclosed herein can advantageously distribute the nodes of each per-pattern NFA across multiple memories within a hierarchy based on the understanding that the longer a rule (i.e., pattern), the less likely a node generated from the end of the rule will need to be visited (i.e., walked or traversed). By storing each earlier node of the per-pattern NFA in a relatively faster (i.e., higher performance) memory, embodiments disclosed herein can further optimize matching performance. It should be appreciated that because such node distribution can be based on a hierarchy-to-memory mapping, the nodes can advantageously be distributed based on the mapped hierarchy, thereby enabling any suitable distribution that optimizes matching performance.

如以上所披露的,该至少一个NFA 314(如图5A的每图样NFA 504)可以被存储在至少一个存储器中,如图1的存储器104或图4的图形存储器456。根据在此披露的实施例,行走器320的匹配性能可以基于智能编译器306有利地跨可以包括在一个存储器层次中的多个图形存储器的该至少一个存储器456来分布该每图样NFA 504的节点来进行优化。As disclosed above, the at least one NFA 314 (such as the per-pattern NFA 504 of FIG. 5A) can be stored in at least one memory, such as the memory 104 of FIG. 1 or the graph memory 456 of FIG. 4. According to embodiments disclosed herein, the matching performance of the walker 320 can be optimized by the intelligent compiler 306 advantageously distributing the nodes of the per-pattern NFA 504 across the at least one memory 456, which may include multiple graph memories in a memory hierarchy.

例如,行走器320的匹配性能可以是基于将连续的节点(如图5A的每图样NFA 504的区段509的节点N0 506、N1 508、N2 510和N3 512)存储在一个更快性能的存储器中来进行优化,该存储器相对于可以在存储器层次中映射到更低层级的另一个存储连续节点N4 514和N5 515的存储器映射到一个更高层级。由于NFA 504是从单个图样(如图样502)生成的一个每图样NFA,NFA 504与从其他图样生成的其他NFA分离并且因此在此披露的实施例可以是基于对该每图样NFA的节点的所识别的局域性并不存在于一个统一的NFA的节点中。For example, the matching performance of the walker 320 can be optimized by storing consecutive nodes (such as nodes N0 506, N1 508, N2 510, and N3 512 of section 509 of the per-pattern NFA 504 of FIG. 5A) in a faster-performing memory that is mapped to a higher level in the memory hierarchy relative to another memory, mapped to a lower level, that stores consecutive nodes N4 514 and N5 515. Because the NFA 504 is a per-pattern NFA generated from a single pattern (such as the pattern 502), the NFA 504 is separate from other NFAs generated from other patterns, and thus embodiments disclosed herein can exploit an identified locality in the nodes of a per-pattern NFA that does not exist in the nodes of a unified NFA.

在此披露的实施例可以是基于以下理解,即,一个每图样NFA图形(如每图样NFA图形504)的更早的节点(如节点N0 506、N1 508、N2 510和N3 512)可以具有比节点N4 514和N5 515更高的被遍历的可能性,因为节点N4 514和N5 515是朝向该规则(即,图样)502的结束处定位的并因此要求该有效载荷中的更多得到匹配以便进行行走(即,遍历)。于是,一个每图样NFA(如NFA 504)或其他任何合适的每图样NFA图形的更早的节点可以被认为是与仅在出现图样完整匹配的事件中被访问的"低触摸"节点相比由于误报可以被更频繁访问的"高触摸"节点。Embodiments disclosed herein may be based on the understanding that earlier nodes (such as nodes N0 506, N1 508, N2 510, and N3 512) of a per-pattern NFA graph (such as the per-pattern NFA graph 504) may have a higher probability of being traversed than nodes N4 514 and N5 515, because nodes N4 514 and N5 515 are located toward the end of the rule (i.e., pattern) 502 and therefore require more of the payload to be matched in order to be walked (i.e., traversed). Thus, earlier nodes of a per-pattern NFA (such as the NFA 504), or of any other suitable per-pattern NFA graph, may be considered "high touch" nodes that may be accessed more frequently due to false positives, in contrast to "low touch" nodes that are accessed only in the event of a complete pattern match.

根据在此披露的实施例,编译器306可以基于对每个每图样 NFA中的哪些节点被认为是“高触摸”节点而哪些节点被认为是“低触摸”节点的理解将每个每图样NFA的节点分布到在一个层次中的存储器。这种理解可以被用于通过将这些节点分布到在一个存储器层次中的存储器来“预高速缓存”(即,静态存储)每个每图样NFA的节点,从而允许改进的匹配性能。例如,“高触摸”节点可以基于以下理解被分布到更快的存储器:“高触摸”节点由于其在该每图样NFA中的局域性将被更频繁地访问(即,行走或遍历)。According to embodiments disclosed herein, compiler 306 can distribute the nodes of each per-pattern NFA to memory in a hierarchy based on an understanding of which nodes in each per-pattern NFA are considered "high touch" nodes and which nodes are considered "low touch" nodes. This understanding can be used to "pre-cache" (i.e., statically store) the nodes of each per-pattern NFA by distributing these nodes to memory in a memory hierarchy, thereby allowing improved matching performance. For example, "high touch" nodes can be distributed to faster memory based on the understanding that "high touch" nodes will be accessed (i.e., walked or traversed) more frequently due to their locality in the per-pattern NFA.
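The "pre-caching" described above can be sketched as a static, position-based assignment of nodes to memories. The two-tier split and the budget of four "high touch" nodes below are illustrative assumptions, loosely mirroring nodes N0-N3 versus N4-N5 of the per-pattern NFA 504.

```python
def assign_tiers(num_nodes, high_touch_budget):
    """Statically 'pre-cache' a per-pattern NFA by node position: nodes
    generated from the earlier part of the pattern (low index) are treated
    as 'high touch' and placed in the faster tier; the remaining nodes are
    'low touch' and placed in the slower tier."""
    return ["fast" if i < high_touch_budget else "slow"
            for i in range(num_nodes)]

# Six nodes with an assumed budget of four: N0-N3 go to the faster memory
# and N4-N5 to the slower one.
tiers = assign_tiers(num_nodes=6, high_touch_budget=4)
```

Because the assignment is made at compile time from node position alone, no runtime cache-replacement logic is needed during the walk.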

一般而言,基于一组正则表达式图样生成的一个统一NFA的正则表达式访问图样可以是随机的,因为这种访问图样可以是基于具体的有效载荷。因此,正则表达式访问图样的历史无法用于预测进一步的正则表达式访问图样。例如,高速缓存一个统一的NFA的最近遍历的节点不能对行走器提供性能收益,因为在该统一的NFA内访问的下一节点可能不是所高速缓存的节点。In general, the regular expression access pattern of a unified NFA generated from a set of regular expression patterns can be random, as such access patterns can be based on specific payloads. Therefore, the history of regular expression access patterns cannot be used to predict future regular expression access patterns. For example, caching the most recently traversed nodes of a unified NFA does not provide performance benefits to the walker, as the next node visited within the unified NFA may not be the cached node.

图7是编译器306的一个环境700的实施例的框图。如以上所披露的,编译器306可以被称为智能编译器,该智能编译器可以被配置成用于通过标识规则集310的可能最好适合DFA或NFA处理的部分来将规则集310编译成二进制图像112。由此,二进制图像112可以包括至少两个区段,其中,一个第一区段用于DFA处理并且一个第二区段用于NFA处理,如统一的DFA 312和该至少一个NFA 314,如以上关于图3A披露的。根据在此披露的实施例,HNA108可以操作地耦合到可以包括图形存储器456的多个存储器,如以上关于图4披露的。根据在此披露的实施例,编译器306可以被配置成用于确定该统一的DFA 312和该至少一个NFA314的节点在图形存储器456中的放置。FIG7 is a block diagram of an embodiment of an environment 700 for compiler 306. As disclosed above, compiler 306 can be referred to as an intelligent compiler, which can be configured to compile rule set 310 into binary image 112 by identifying portions of rule set 310 that are likely best suited for DFA or NFA processing. Thus, binary image 112 can include at least two sections, a first section for DFA processing and a second section for NFA processing, such as unified DFA 312 and at least one NFA 314, as disclosed above with respect to FIG3A. According to embodiments disclosed herein, HNA 108 can be operatively coupled to multiple memories, which can include graph memory 456, as disclosed above with respect to FIG4. According to embodiments disclosed herein, compiler 306 can be configured to determine placement of nodes of unified DFA 312 and at least one NFA 314 in graph memory 456.

根据在此披露的实施例,该统一的DFA 312可以被静态地存储在图形存储器456的一个给定存储器中,而至少一个NFA 314可以具有跨图形存储器456分布并静态地存储的节点,因为编译器306可以将分布特定NFA节点用于存储在特定存储器中以便优化行走器匹配性能作为目标。根据在此披露的实施例,图形存储器456可以是在一个可以包括多个层级708a-c的存储器层次743中。该多个层级708a-c可以被映射到可以包括存储器756a-c的该多个图形存储器456。According to embodiments disclosed herein, the unified DFA 312 can be statically stored in a given memory of the graph memory 456, while at least one NFA 314 can have nodes distributed across the graph memory 456 and statically stored, as the compiler 306 can target distributing specific NFA nodes for storage in specific memories to optimize walker matching performance. According to embodiments disclosed herein, the graph memory 456 can be in a memory hierarchy 743 that can include a plurality of levels 708a-c. The plurality of levels 708a-c can be mapped to the plurality of graph memories 456 that can include memories 756a-c.

编译器306可以将层级708a-c以任何合适的方式进行映射,并且层级708a-c可以用降序712排序,使得层级708a可以是排序最高的层级而层级708c可以是排序最低的层级。图形存储器756a-c可以包括一个随机存取存储器(RAM),该存储器可以为与一个网络服务处理器100上的片上搜索存储器(OSM)共同定位的最高性能存储器。图形存储器756a-c可以包括一个可以是在外部并操作性耦合到网络服务处理器100的系统存储器。The compiler 306 may map the levels 708a-c in any suitable manner, and the levels 708a-c may be ordered in a descending order 712 such that level 708a may be the highest-ranked level and level 708c may be the lowest-ranked level. The graph memories 756a-c may include a random access memory (RAM), which may be the highest-performance memory, co-located with an on-chip search memory (OSM) on the network services processor 100. The graph memories 756a-c may also include a system memory that may be external and operatively coupled to the network services processor 100.

基于根据存储器性能(即,读和写访问时间)的映射,该RAM存储器可以被映射到排序最高的层级708a,该OSM可以被映射到排序次高的层级708b,而系统存储器可以被映射到排序最低的层级708c。然而,应当理解的是,在该多个层级708a-c与图形存储器756a-c之间的映射可以是以任何合适的方式形成的。例如,该映射可以是基于对可以生成一个与规则集310(这些节点从其中被分布到存储器756a-c)相关联的应用的理解,因此最高性能的存储器可能不被映射到排序最高的层级。另外,应当理解的是,所示出的存储器层次743中的层级数目和图形存储器756a-c的数目是为了说明的目的并且可以是任何合适的层级和存储器数目。Based on a mapping according to memory performance (i.e., read and write access times), the RAM may be mapped to the highest-ranked level 708a, the OSM may be mapped to the next-highest-ranked level 708b, and the system memory may be mapped to the lowest-ranked level 708c. However, it should be understood that the mapping between the plurality of levels 708a-c and the graph memories 756a-c may be formed in any suitable manner. For example, the mapping may be based on an understanding of an application associated with the rule set 310 from which the nodes are distributed to the memories 756a-c; thus, the highest-performance memory might not be mapped to the highest-ranked level. Furthermore, it should be understood that the number of levels in the memory hierarchy 743 and the number of graph memories 756a-c shown are for illustrative purposes and may be any suitable number of levels and memories.

如以上所披露的,一个每图样NFA的节点的局域性可以由智能编译器306利用,这是通过将从一个给定图样的多个更早的部分生成的NFA节点存储在更快的存储器中。另外,由于给定图样的部分匹配是通过HFA 110的DFA处理来确定的,给定图样的匹配的概率已经更高,所以组合此类实施例以优化匹配性能。As disclosed above, the locality of the nodes of a per-pattern NFA can be exploited by the intelligent compiler 306 by storing NFA nodes generated from earlier portions of a given pattern in faster memory. Additionally, because a partial match of the given pattern has already been determined by the DFA processing of the HFA 110, the probability of a match of the given pattern is already higher; combining such embodiments therefore optimizes matching performance.

例如,如以上所披露的,DFA处理可以用于减少由NFA处理找到的误报数目。由于每个NFA可以是每图样NFA,每个每图样NFA的节点可以是有利地基于该多个存储器到存储器层次743的层级的映射而跨该多个存储器分布的。例如,从相对更短长度的图样生成的更小的NFA可以使所有节点分布到一个第一层级并且存储在一个被映射到该第一层级的第一存储器中,而从相对更长长度的图样生成的更大的NFA可以使一个第一节点部分分布到该第一层级并且剩余的部分分布在剩余的层级中。该第一层级可以是一个映射到最高性能存储器的排序最高的层级。For example, as disclosed above, DFA processing can be used to reduce the number of false positives found by NFA processing. Because each NFA can be a per-pattern NFA, the nodes of each per-pattern NFA can be advantageously distributed across the plurality of memories based on the mapping of the plurality of memories to the levels of the memory hierarchy 743. For example, a smaller NFA generated from a pattern of relatively shorter length can have all of its nodes distributed to a first level and stored in a first memory mapped to the first level, while a larger NFA generated from a pattern of relatively longer length can have a first portion of its nodes distributed to the first level and the remaining portion distributed among the remaining levels. The first level can be a highest-ranked level mapped to the highest-performance memory.

于是,这些每图样NFA的更早的节点可以被存储在该最高性能的存储器中。因为由于误报更早的节点可以具有更高的被遍历的可能性,在此披露的实施例可以使得大多数误报能够经由访问映射到在存储器层次743中的更高层级的存储器来处理。根据在此披露的实施例,可以通过使得对映射到排序最高的层级(如在存储器层次743中的层级 708a)的存储器756a的访问数目能够比对可以被映射到排序最低的层级708c的存储器756c的访问数目相对更高来优化匹配性能。Thus, the earlier nodes of these per-graph NFAs can be stored in the highest performance memory. Because earlier nodes can have a higher probability of being traversed due to false positives, embodiments disclosed herein can enable most false positives to be handled by accessing memories mapped to higher levels in the memory hierarchy 743. According to embodiments disclosed herein, matching performance can be optimized by enabling a relatively higher number of accesses to the memory 756a mapped to the highest-ranked level (e.g., level 708a in the memory hierarchy 743) than to the memory 756c mapped to the lowest-ranked level 708c.

存储器756a可以是允许例如每秒1300百万次事务处理的最高性能的存储器,而存储器756b可以具有允许每秒150百万次事务处理的较差的性能,并且存储器756c可以是允许每秒12百万次事务处理的最差性能的存储器。另外,根据在此披露的实施例,映射到排序更高的层级的此类更高性能的存储器的量可以是在大小上相对小于映射到排序最低的层级708c的更低性能的存储器,如存储器756c(相比之下可以是相对大的存储器)。例如,存储器756c可以是外部的且提供受物理上所附接的存储器的量限制的相对大量的存储容量的一种系统存储器。Memory 756a may be the highest performance memory allowing, for example, 1300 million transactions per second, while memory 756b may have a lower performance allowing 150 million transactions per second, and memory 756c may be the lowest performance memory allowing 12 million transactions per second. Additionally, according to embodiments disclosed herein, the amount of such higher performance memory mapped to the higher-ordered tiers may be relatively smaller in size than the lower performance memory, such as memory 756c (which may be a relatively large memory by comparison), mapped to the lowest-ordered tier 708c. For example, memory 756c may be a type of system memory that is external and provides a relatively large amount of storage capacity limited by the amount of physically attached memory.
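Using the example transaction rates above, a rough cost model shows why steering most node accesses to the faster memories matters. The access mix (95%/4%/1%) is an assumed workload reflecting most walks ending among the earlier nodes; it is not taken from the disclosure.

```python
# Example transaction rates from above: 1300M, 150M, and 12M transactions
# per second for memories 756a, 756b, and 756c respectively.
tps = {"756a": 1300e6, "756b": 150e6, "756c": 12e6}

# Assumed fraction of node accesses served by each memory:
access_share = {"756a": 0.95, "756b": 0.04, "756c": 0.01}

# Average time per node access, in nanoseconds:
avg_ns = sum(share / tps[m] for m, share in access_share.items()) * 1e9

# Versus sending every access to the slowest memory:
worst_ns = 1e9 / tps["756c"]
speedup = worst_ns / avg_ns
```

Under these assumptions the blended access time stays under 2 ns even though the slowest memory alone would cost over 80 ns per access, which is the payoff of placing "high touch" nodes in the highest-ranked level.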

根据在此披露的实施例,每图样NFA存储分配设定710a-c可以被配置成用于层级708a-c。每图样NFA存储分配设定710a-c可以表示用于从每个每图样NFA分布到这些层级708a-c中的一个对应的层级以存储在一个映射到该对应的层级的给定存储器的该目标数量的唯一节点。编译器306可以被配置成用于确定每图样NFA分布设定710a-c,其方式为使得映射到层级708a-c的存储器756a-c在对于规则集310中的该一个或多个图样中的每个生成一个每图样NFA的事件中能够提供足够的存储容量。According to embodiments disclosed herein, per-pattern NFA storage allocation settings 710a-c can be configured for levels 708a-c. Per-pattern NFA storage allocation settings 710a-c can indicate a target number of unique nodes to be distributed from each per-pattern NFA to a corresponding one of the levels 708a-c to store in a given memory mapped to the corresponding level. Compiler 306 can be configured to determine per-pattern NFA distribution settings 710a-c in such a manner that the memory 756a-c mapped to levels 708a-c provides sufficient storage capacity in the event that a per-pattern NFA is generated for each of the one or more patterns in rule set 310.
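One way the compiler could verify that the memories provide sufficient capacity, as described above, is to sum each per-pattern NFA's contribution to a level, capped at that level's target. The sizes, target, and capacity below are illustrative assumptions, not values from the disclosure.

```python
def capacity_needed(nfa_sizes, target_per_nfa):
    """Worst-case node count a level's memory must hold when every pattern
    in the rule set compiles to its own per-pattern NFA: each NFA
    contributes at most target_per_nfa nodes to this level (fewer if the
    NFA itself is smaller than the target)."""
    return sum(min(size, target_per_nfa) for size in nfa_sizes)

# Assumed rule set: three patterns yielding per-pattern NFAs of 6, 2, and
# 4 nodes, with a target of 4 nodes per NFA for the highest-ranked level:
needed = capacity_needed([6, 2, 4], target_per_nfa=4)  # 4 + 2 + 4
fits = needed <= 12  # 12 is an assumed node capacity for that memory
```

If the check fails for a level, the compiler could lower that level's allocation setting until the sum fits the mapped memory.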

每图样NFA存储分配设定710a-c可以表示每个每图样NFA的该组对应的节点中用于在一个对应的层级分布以存储到映射到该对应的层级的一个给定存储器的该目标数量的唯一节点。例如,基于被配置成用于层级708a的每图样NFA存储分配设定710a,编译器306可以分布每图样NFA 714a的该组对应的节点702a的一个第一部分704a和每图样NFA 714b的该组对应的节点702b的一个第二部分704b用于存储在映射到层级708a的存储器756a中。Per-pattern NFA storage allocation settings 710a-c can indicate the target number of unique nodes in the corresponding set of nodes of each per-pattern NFA to be distributed at a corresponding level for storage in a given memory mapped to the corresponding level. For example, based on per-pattern NFA storage allocation setting 710a configured for level 708a, compiler 306 can distribute a first portion 704a of the corresponding set of nodes 702a of per-pattern NFA 714a and a second portion 704b of the corresponding set of nodes 702b of per-pattern NFA 714b for storage in memory 756a mapped to level 708a.

基于被配置成用于层级708b的每图样NFA存储分配设定710b,编译器306可以分布每图样NFA 714a的该组对应的节点702a的一个第三部分706a和每图样NFA 714b的该组对应的节点702b的一个第四部分706b用于存储在映射到层级708b的存储器756b中。这种分布是目标分布,因为一组给定的对应的节点的节点数目可能达不到该目标数目:可能之前生成的节点少于目标数目,或者对应组中剩余的未分布节点少于目标数目。Based on the per-pattern NFA storage allocation setting 710b configured for the level 708b, the compiler 306 may distribute a third portion 706a of the set of corresponding nodes 702a of the per-pattern NFA 714a and a fourth portion 706b of the set of corresponding nodes 702b of the per-pattern NFA 714b for storage in the memory 756b mapped to the level 708b. This distribution is a target distribution because the number of nodes in a given set of corresponding nodes may not reach the target number: fewer than the target number of nodes may have been generated, or fewer than the target number may remain undistributed in the corresponding set.

在该示例实施例中,每图样NFA存储分配设定710c可以被配置成用于存储器层次743的排序最低的层级708c并且可以以表示无限数目的方式来指定。在该示例实施例中映射到排序最低的层级708c的存储器756c可以是具有相对大存储量的一种系统存储器。于是,编译器306可以将节点分布到系统存储器,包括将为这些每图样NFA 714a-b中每个所生成的每组对应的节点的任何剩余的未分布节点进行分布以存储在系统存储器756c中。In this example embodiment, the per-pattern NFA storage allocation setting 710c can be configured for the lowest-ranked level 708c of the memory hierarchy 743 and can be specified in a manner representing an unlimited number of nodes. The memory 756c mapped to the lowest-ranked level 708c in this example embodiment can be a system memory with a relatively large storage capacity. The compiler 306 can thus distribute nodes to the system memory, including distributing, for storage in the system memory 756c, any remaining undistributed nodes of each set of corresponding nodes generated for each of the per-pattern NFAs 714a-b.

应当理解的是,每个存储器所映射的层级可以固有地由该编译器理解,并且于是可以免除特定的层级708a-c。例如,编译器306可以配置这些每图样NFA存储分配设定710a-c并且基于对存储器756a-c中的每个在存储器层次743中的层级映射的固有理解将这些设定直接映射到存储器756a-c。还应当理解的是,在图7中所示的每图样NFA、这些每图样NFA的节点、和分布的数目是用于说明的目的并且可以是任何合适的每图样NFA、节点或分布数目。It should be understood that the level to which each memory is mapped may be inherently understood by the compiler, and thus the specific levels 708a-c may be dispensed with. For example, the compiler 306 may configure the per-pattern NFA storage allocation settings 710a-c and map these settings directly to the memories 756a-c based on an inherent understanding of the level mapping of each of the memories 756a-c in the memory hierarchy 743. It should also be understood that the number of per-pattern NFAs, of nodes of the per-pattern NFAs, and of distributions shown in FIG. 7 is for illustrative purposes and may be any suitable number of per-pattern NFAs, nodes, or distributions.

图8是对于多个每图样NFA的节点分布的示例实施例的框图800。在该示例实施例中,对于一个或多个图样804中的一个图样816a 生成一个第一NFA 814a,对于该一个或多个图样804中的一个第二图样816b生成一个第二NFA 814b,并且对于该一个或多个图样804中的一个第三图样816c生成一个第三NFA 814c。8 is a block diagram 800 of an example embodiment of a node distribution for multiple per-pattern NFAs. In this example embodiment, a first NFA 814a is generated for a pattern 816a in one or more patterns 804, a second NFA 814b is generated for a second pattern 816b in the one or more patterns 804, and a third NFA 814c is generated for a third pattern 816c in the one or more patterns 804.

第一每图样NFA 814a的一个第一节点部分804a被分布到映射到在存储器层次812中的一个第一存储器856a的层级808a,并且一个第二节点部分806a被分布到映射到一个第二存储器856b的第二层级 808b。在该示例实施例中,层级808a是排序最高的层级,而层级808b 是排序最低的层级。第二每图样NFA 814b的一个第三节点部分804b 被分布到映射到在存储器层次812中的第一存储器856a的层级808a,并且一个第四节点部分806b被分布到映射到第二存储器856b的第二层级808b。第三每图样NFA 814c的一个第五节点部分804c被分布到映射到在存储器层次812中的第一存储器856a的层级808a,并且一个第六节点部分806c被分布到映射到第二存储器856b的第二层级808b。A first node portion 804a of the first per-pattern NFA 814a is distributed to level 808a, which is mapped to a first memory 856a in the memory hierarchy 812, and a second node portion 806a is distributed to a second level 808b, which is mapped to a second memory 856b. In this example embodiment, level 808a is the highest-ordered level, and level 808b is the lowest-ordered level. A third node portion 804b of the second per-pattern NFA 814b is distributed to level 808a, which is mapped to the first memory 856a in the memory hierarchy 812, and a fourth node portion 806b is distributed to the second level 808b, which is mapped to the second memory 856b. A fifth node portion 804c of the third per-pattern NFA 814c is distributed to the level 808a mapped to the first memory 856a in the memory hierarchy 812, and a sixth node portion 806c is distributed to the second level 808b mapped to the second memory 856b.

如在图8中所示,分布用于存储在映射到层级808a的存储器856a中的第二NFA 814b的第三节点部分804b可以是分别小于第一NFA 814a和第三NFA 814c的第一节点部分804a和第五节点部分804c。如果每图样NFA 814b的节点数目小于由用于层级808a的一个每图样NFA存储分配设定(未展示)所表示的唯一目标节点数目,于是例如可以是这种情况。另外,由于层级808b是在存储器层次812中的排序最低的层级,用于层级808b的下一每图样NFA存储分配设定(未展示)可以是非常大的,从而使得所有未分布的节点能够在已经进行了到高于层级808b的每个层级的分布之后被分布以存储在被映射到层级808b的存储器856b中。于是,在该示例实施例中,第二节点部分806a可以包括比第六节点部分806c更多的节点,因为图样816a可以是比图样816c更长的规则。另外,第四节点部分806b可以为空,因为图样816b可以是相对较短的,其中,对于每图样NFA 814b生成的少数节点造成每图样NFA 814b的所有节点被分布到层级808a以存储在存储器856a中。As shown in FIG. 8, the third node portion 804b of the second NFA 814b, distributed for storage in the memory 856a mapped to the level 808a, can be smaller than the first node portion 804a of the first NFA 814a and the fifth node portion 804c of the third NFA 814c, respectively. This can be the case, for example, if the number of nodes of the per-pattern NFA 814b is smaller than the target number of unique nodes represented by a per-pattern NFA storage allocation setting (not shown) for the level 808a. Furthermore, because the level 808b is the lowest-ranked level in the memory hierarchy 812, the next per-pattern NFA storage allocation setting (not shown), for the level 808b, can be very large, enabling all undistributed nodes to be distributed for storage in the memory 856b mapped to the level 808b after the distributions to each level above the level 808b have been made. Thus, in this example embodiment, the second node portion 806a can include more nodes than the sixth node portion 806c because the pattern 816a can be a longer rule than the pattern 816c. Additionally, the fourth node portion 806b can be empty because the pattern 816b can be relatively short, such that the few nodes generated for the per-pattern NFA 814b cause all of the nodes of the per-pattern NFA 814b to be distributed to the level 808a for storage in the memory 856a.

编译器306可以作为生成每个每图样NFA的一部分来分布每个每图样NFA的节点。如以上所披露的,在NFA中从一个第一节点到一个第二节点的转换可以经由通过一个下一节点地址标识该第二节点的第一节点元数据来指定。根据在此披露的实施例,该下一节点地址可以被编译器306配置为包括一个部分,该部分指示该多个存储器中的已经将该第二节点分布到其中用于存储的一个给定存储器。The compiler 306 can distribute the nodes of each per-pattern NFA as part of generating each per-pattern NFA. As disclosed above, a transition from a first node to a second node in an NFA can be specified via first-node metadata that identifies the second node via a next-node address. According to embodiments disclosed herein, the next-node address can be configured by the compiler 306 to include a portion that indicates a given memory in the plurality of memories to which the second node has been distributed for storage.
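A minimal sketch of such a next-node address follows, packing a memory selector into the upper bits of the address. The field widths and bit positions are assumptions for illustration; the disclosure only requires that the address include a portion identifying the target memory.

```python
MEMORY_SELECT_BITS = 2    # assumed width of the memory-select field
MEMORY_SELECT_SHIFT = 30  # assumed position: top bits of a 32-bit address

def encode_next_node(memory_id, node_offset):
    """Pack the target memory and the node's offset within that memory
    into one next-node address, as the compiler could emit in node
    metadata when specifying a transition to a second node."""
    assert 0 <= memory_id < (1 << MEMORY_SELECT_BITS)
    assert 0 <= node_offset < (1 << MEMORY_SELECT_SHIFT)
    return (memory_id << MEMORY_SELECT_SHIFT) | node_offset

def decode_next_node(address):
    """Recover (memory_id, node_offset) from a packed next-node address."""
    memory_id = address >> MEMORY_SELECT_SHIFT
    node_offset = address & ((1 << MEMORY_SELECT_SHIFT) - 1)
    return memory_id, node_offset

# A transition whose target node was distributed to memory 2 at offset 0x1A4:
addr = encode_next_node(memory_id=2, node_offset=0x1A4)
```

With the selector embedded in the address itself, the walker can issue the read to the correct memory directly, with no separate lookup of where the node was placed.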

图9是一种方法900的示例实施例的一个流程图,该方法可以在至少一个处理器中执行,该至少一个处理器与多个存储器操作地耦合,该多个存储器映射到操作地耦合到网络的一个安全装置中的存储器层次中的多个层级。该方法可以开始(902)并且生成至少一个每图样非确定型有限自动机(NFA)(904)。每个每图样NFA可以对于单个正则表达式图样而生成并且可以包括一组对应的节点。该方法可以分布每个每图样NFA的该组对应的节点的节点以基于所映射的层级和为这些层级配置的每图样NFA存储分配设定来将其存储在该多个存储器中 (906)并且在该示例实施例中该方法然后结束(908)。FIG9 is a flow chart of an example embodiment of a method 900 that can be executed in at least one processor operatively coupled to a plurality of memories mapped to a plurality of levels in a memory hierarchy in a security device operatively coupled to a network. The method can begin (902) and generate at least one per-pattern nondeterministic finite automaton (NFA) (904). Each per-pattern NFA can be generated for a single regular expression pattern and can include a set of corresponding nodes. The method can distribute the nodes of the set of corresponding nodes of each per-pattern NFA to be stored in the plurality of memories based on the mapped levels and the per-pattern NFA storage allocation settings configured for the levels (906) and then end in the example embodiment (908).

图10是对于多个每图样NFA的节点的另一个节点分布的示例实施例的框图1000。在该示例实施例中,节点分布1004和1006被示出为用于存储在一个第一存储器1056a和一个第二存储器1056b中。每个每图样NFA 1014a-c的分布1004可以是基于每图样NFA存储分配设定 1010a和1010b,这些设定被配置成用于分别用于层级1008a和1008b。在该示例实施例中,层级1008a和1008b分别被映射到第一存储器1056a 和第二存储器1056b。FIG10 is a block diagram 1000 of another example embodiment of another node distribution for multiple per-pattern NFAs. In this example embodiment, node distributions 1004 and 1006 are shown for storage in a first memory 1056a and a second memory 1056b. Distribution 1004 for each per-pattern NFA 1014a-c can be based on per-pattern NFA storage allocation settings 1010a and 1010b, which are configured for use with levels 1008a and 1008b, respectively. In this example embodiment, levels 1008a and 1008b are mapped to first memory 1056a and second memory 1056b, respectively.

图11是一种用于分布至少一个每图样NFA的节点的方法的示例实施例的一个流程图1100。根据在此披露的实施例,分布所生成的每个每图样NFA的该组对应的节点的这些节点可以包括以一种连续方式分布该组对应的节点中的这些节点,该分布包括该组对应的节点的这些节点的一个第一分布,用于存储在该多个存储器中的一个第一存储器中。该第一存储器可以被映射到这些层级中排序最高的层级。该分布可以包括在上一次分布之后该组对应的节点中的这些节点的至少一个第二分布,基于在该组对应的节点中剩余的至少一个未分布的节点。每个第二分布可以是用于在该多个存储器中的一个给定存储器中进行存储。该给定存储器可以被映射到这些层级中的一个给定的层级,对于每一次分布,该给定的层级连续地低于该排序最高的层级。FIG. 11 is a flowchart 1100 of an example embodiment of a method for distributing the nodes of at least one per-pattern NFA. According to embodiments disclosed herein, distributing the nodes of the set of corresponding nodes of each per-pattern NFA generated may include distributing the nodes of the set of corresponding nodes in a successive manner, the distributing including a first distribution of nodes of the set of corresponding nodes for storage in a first memory of the plurality of memories. The first memory may be mapped to a highest-ranked level of the levels. The distributing may include at least one second distribution of nodes of the set of corresponding nodes, each following a previous distribution and based on at least one undistributed node remaining in the set of corresponding nodes. Each second distribution may be for storage in a given memory of the plurality of memories. The given memory may be mapped to a given level of the levels, the given level being, for each successive distribution, successively lower than the highest-ranked level.

该方法可以开始(1102)并且将一个给定的层级设定到一个存储器层次中的排序最高的层级(1104)。该方法可以将一个给定的每图样NFA设定到从一组一个或多个正则表达式图样生成的至少一个NFA 中的一个第一每图样NFA(1106)。该方法可以检查该给定的每图样 NFA的未分布节点数目(1108)。如果该给定的每图样NFA的未分布节点数目是零,则该方法可以检查该给定的每图样NFA是否是从该组一个或多个正则表达式图样生成的最后一个NFA(1116)。The method may begin (1102) and set a given level to the highest-ordered level in a memory hierarchy (1104). The method may set a given per-pattern NFA to a first per-pattern NFA in at least one NFA generated from a set of one or more regular expression patterns (1106). The method may check the number of undistributed nodes of the given per-pattern NFA (1108). If the number of undistributed nodes of the given per-pattern NFA is zero, the method may check whether the given per-pattern NFA is the last NFA generated from the set of one or more regular expression patterns (1116).

如果该给定的每图样NFA是所生成的最后一个每图样NFA,则该方法可以检查该给定的层级是否是排序最低的层级(1120),并且如果该给定的层级是排序最低的层级,则在该示例实施例中,该方法然后结束(1126)。然而,如果检查该给定的层级是否是排序最低的层级 (1120)结果为否,则该方法可以将该给定的层级设定到下一连续更低的层级(1124)并且再次将该给定的每图样NFA设定到由该组一个或多个正则表达式图样生成的至少一个NFA中的该第一每图样NFA (1106)并且继续检查该给定的每图样NFA的未分布节点数目(1108)。如果该给定的每图样NFA的未分布节点数目为零,则该方法可以如以上所披露的那样继续进行。If the given per-pattern NFA is the last per-pattern NFA generated, the method may check whether the given level is the lowest-ordered level (1120), and if the given level is the lowest-ordered level, then in this example embodiment, the method ends (1126). However, if the result of checking whether the given level is the lowest-ordered level (1120) is no, the method may set the given level to the next successively lower level (1124) and again set the given per-pattern NFA to the first per-pattern NFA in the at least one NFA generated by the set of one or more regular expression patterns (1106) and continue checking the number of undistributed nodes of the given per-pattern NFA (1108). If the number of undistributed nodes of the given per-pattern NFA is zero, then the method may continue as disclosed above.

如果检查该给定的每图样NFA的未分布节点数目(1108)的结果是非零,则该方法可以检查该给定的层级是否是排序最低的层级 (1110)。如果结果为是,则该方法可以将未分布节点数目分布到映射到该给定层级的一个给定存储器(1114)并且该方法可以检查该给定的每图样NFA是否是从该组一个或多个正则表达式图样生成的最后一个 NFA(1116)。如果是,则该方法可以如以上所披露的那样继续进行。如果否,则该方法可以将该给定的每图样NFA设定到所生成的下一每图样NFA(1118),并且该方法可以迭代以再次检查已经更新到所生成的该下一每图样NFA的该给定的每图样NFA的未分布节点数目 (1108)。If the result of checking the number of undistributed nodes for the given per-pattern NFA (1108) is non-zero, the method can check whether the given level is the lowest-ordered level (1110). If so, the method can distribute the number of undistributed nodes to a given memory mapped to the given level (1114) and the method can check whether the given per-pattern NFA is the last NFA generated from the set of one or more regular expression patterns (1116). If so, the method can continue as disclosed above. If not, the method can set the given per-pattern NFA to the next per-pattern NFA generated (1118), and the method can iterate to again check the number of undistributed nodes for the given per-pattern NFA that has been updated to the next per-pattern NFA generated (1108).

如果检查该给定的层级是否是排序最低的层级(1110)的结果是否,则该方法可以检查该给定的每图样NFA的未分布节点数目是否超过由对于该给定的层级(1112)配置的一个每图样NFA存储分配设定表示的节点数目。如果是,则该方法可以将由对于该给定的层级配置的该每图样NFA存储分配设定表示的节点数目分布以存储在映射到该给定层级的该给定存储器中(1122)并且检查该给定的每图样NFA是否是从该组一个或多个正则表达式图样生成的最后一个NFA(1116)。如果是,则该方法可以如以上所披露的那样继续进行。If the result of checking whether the given level is the lowest-ordered level (1110) is no, the method can check whether the number of undistributed nodes of the given per-pattern NFA exceeds the number of nodes represented by a per-pattern NFA storage allocation setting configured for the given level (1112). If so, the method can distribute the number of nodes represented by the per-pattern NFA storage allocation setting configured for the given level for storage in the given memory mapped to the given level (1122) and check whether the given per-pattern NFA is the last NFA generated from the set of one or more regular expression patterns (1116). If so, the method can continue as disclosed above.

如果检查该给定的每图样NFA是否是最后一个每图样NFA (1116)的结果为否,则该方法可以将该给定的每图样NFA设定到所生成的下一每图样NFA(1118),并且该方法可以迭代以再次检查已经更新到所生成的该下一每图样NFA的该给定的每图样NFA的未分布节点数目(1108)。If the result of checking whether the given per-pattern NFA is the last per-pattern NFA (1116) is no, the method may set the given per-pattern NFA to the next per-pattern NFA generated (1118), and the method may iterate to again check the number of undistributed nodes of the given per-pattern NFA that has been updated to the next per-pattern NFA generated (1108).

然而,如果检查该给定的每图样NFA的未分布节点数目是否超过由对于该给定的层级配置的一个每图样NFA存储分配设定表示的节点数目(1112)的结果为否,则该方法可以将该未分布节点数目分布到映射到该给定的层级的该给定存储器(1114)并且如以上所披露的那样继续进行。However, if the result of checking whether the number of undistributed nodes of the given per-pattern NFA exceeds the number of nodes represented by a per-pattern NFA storage allocation setting configured for the given level (1112) is no, the method can distribute the number of undistributed nodes to the given memory mapped to the given level (1114) and continue as disclosed above.
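The flow of FIG. 11 can be summarized in a short sketch that walks the levels from highest-ranked to lowest-ranked and, for each level, distributes up to the configured target number of nodes from each per-pattern NFA, with the lowest-ranked level absorbing every remaining node. The concrete sizes and settings below are illustrative assumptions.

```python
import math

def distribute(nfa_sizes, settings):
    """Level-by-level node distribution following the flow of FIG. 11.

    settings[k] is the per-pattern NFA storage allocation setting for
    level k (highest-ranked level first); the lowest-ranked level uses
    math.inf so that it absorbs every node still undistributed. Returns,
    per NFA, how many nodes were placed at each level."""
    remaining = list(nfa_sizes)
    placed = [[0] * len(settings) for _ in nfa_sizes]
    for level, target in enumerate(settings):    # 1104/1124: walk the levels
        for i, left in enumerate(remaining):     # 1106/1118: walk the NFAs
            if left == 0:                        # 1108: nothing left to place
                continue
            # 1110/1112: lowest level, or undistributed count within target
            take = left if math.isinf(target) or left <= target else target
            placed[i][level] = take              # 1114/1122: distribute
            remaining[i] -= take
    return placed

# Two levels: up to 4 nodes per NFA at the highest-ranked level,
# unlimited at the lowest-ranked level (sizes and settings are assumed):
placed = distribute(nfa_sizes=[6, 2, 5], settings=[4, math.inf])
```

Note the iteration order: all per-pattern NFAs receive their share of a given level before the method moves to the next, successively lower level, matching steps 1116, 1118, and 1124 of the flowchart.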

According to embodiments disclosed herein, a per-pattern NFA storage allocation setting may denote the target number of unique nodes via an absolute value. The absolute value may be a common value for each set of corresponding nodes, such that each set of corresponding nodes has a same value for the target number of unique nodes to be stored in the given memory mapped to the corresponding level. For example, as shown in FIG. 10, each of the per-pattern NFAs 1014a-c has a selected first portion 1004 that denotes a same number of nodes from each of the per-pattern NFAs 1014a-c to be distributed to the memory 1056a mapped to the level 1008a for which the per-pattern NFA storage allocation setting 1010a is configured.

Alternatively, the target number of unique nodes may be denoted via a percentage value that is applied to the total number of nodes of each set of corresponding nodes, such that each set of corresponding nodes may have an individual value for the target number of unique nodes to be stored in the given memory mapped to the corresponding level. For example, if a number such as 25% is configured for the per-pattern NFA storage allocation setting 1010a that is configured for the level 1008a, the first portion 1004 would include 25% of the nodes of each of the per-pattern NFAs 1014a-c. Because the total number of nodes of each per-pattern NFA 1014a-c may differ, the number of nodes distributed from each of the per-pattern NFAs 1014a-c may differ.
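The two ways of expressing a setting's target can be illustrated with a small sketch (the helper name and the tuple encoding of a setting are hypothetical; only node counts are modeled):

```python
def target_nodes(total_nodes, setting):
    """Resolve a per-pattern NFA storage allocation setting to a node count.

    setting: either ("absolute", n) -- the same target n for every
    per-pattern NFA -- or ("percent", p) -- p percent of each NFA's own
    total, so the target varies with the NFA's size.
    """
    kind, value = setting
    if kind == "absolute":
        # An NFA cannot contribute more nodes than it has.
        return min(value, total_nodes)
    if kind == "percent":
        return (total_nodes * value) // 100
    raise ValueError(f"unknown setting kind: {kind}")

# An absolute setting yields a common value for every per-pattern NFA:
assert [target_nodes(n, ("absolute", 8)) for n in (40, 20)] == [8, 8]
# A 25% setting yields a separate value per NFA, as in the example above:
assert [target_nodes(n, ("percent", 25)) for n in (40, 20)] == [10, 5]
```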

The per-pattern NFA storage allocation settings may include a first per-pattern NFA storage allocation setting and a second per-pattern NFA storage allocation setting. The levels may include a highest-ordered level and a next-highest-ordered level. The first per-pattern NFA storage allocation setting may be configured for the highest-ordered level, and the second per-pattern NFA storage allocation setting may be configured for the next-highest-ordered level. The first per-pattern NFA storage allocation setting may be less than the second per-pattern NFA storage allocation setting. For example, the number of nodes from each per-pattern NFA denoted for distribution to a highest-performance memory may be less than the number of nodes denoted for a lowest-performance memory, such as a system memory, for which an unlimited number may be denoted.

Embodiments disclosed herein may maximize the number of nodes in a given distribution, and the maximized number may be limited by the corresponding per-pattern NFA storage allocation setting configured for a given level. For example, the number of nodes denoted by a per-pattern NFA storage allocation setting may be ten. As such, each per-pattern NFA that includes ten or more undistributed nodes would have ten nodes distributed, whereas each per-pattern NFA that includes fewer than ten undistributed nodes would have its corresponding number of undistributed nodes distributed.

FIG. 12 is a block diagram of an example of the internal structure of a computer 1200 in which various embodiments disclosed herein may be implemented. The computer 1200 contains a system bus 1202, where a bus is a set of hardware lines used for data transfer among the components of a computer or processing system. The system bus 1202 is essentially a shared conduit that connects different elements of a computer system (e.g., processor, disk storage, memory, input/output ports, network ports, etc.) and enables the transfer of information between the elements. Coupled to the system bus 1202 is an I/O device interface 1204 for connecting various input and output devices (e.g., keyboard, mouse, displays, printers, speakers, etc.) to the computer 1200. A network interface 1206 allows the computer 1200 to connect to various other devices attached to a network. Memory 1208 provides volatile storage for computer software instructions 1210 and data 1212 that may be used to implement embodiments disclosed herein. Disk storage 1214 provides non-volatile storage for computer software instructions 1210 and data 1212 that may be used to implement embodiments disclosed herein. A central processor unit 1218 is also coupled to the system bus 1202 and provides for the execution of computer instructions.

Further example embodiments disclosed herein may be configured using a computer program product; for example, controls may be programmed in software for implementing example embodiments disclosed herein. Further example embodiments may include a non-transitory computer-readable medium containing instructions that may be executed by a processor and that, when loaded and executed, cause the processor to complete methods described herein. It should be understood that elements of the block and flow diagrams described herein may be implemented in software, hardware, firmware, or other similar implementation determined in the future. In addition, the elements of the block and flow diagrams described herein may be combined or divided in any manner in software, hardware, or firmware.

It should be understood that the term "herein" is transferable to an application or patent incorporating the teachings presented herein, such that the subject matter, definitions, or data carry forward into the application or patent making the incorporation.

If implemented in software, the software may be written in any language that can support the example embodiments disclosed herein. The software may be stored in any form of computer-readable medium, such as random access memory (RAM), read-only memory (ROM), compact disc read-only memory (CD-ROM), and so forth. In operation, a general purpose or application-specific processor loads and executes the software in a manner well understood in the art. It should be understood further that the block and flow diagrams may include more or fewer elements, be arranged or oriented differently, or be represented differently. It should be understood that implementation may dictate the block, flow, and/or network diagrams and the number of block and flow diagrams illustrating the execution of embodiments disclosed herein.

While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims (37)

1. A security device operatively coupled to a network, the security device comprising:
a plurality of memories mapped to a plurality of hierarchical levels of a memory hierarchy; and
at least one processor operatively coupled to the plurality of memories, the at least one processor configured to:
generate at least one per-pattern non-deterministic finite automaton (NFA), each per-pattern NFA generated for a single regular expression pattern and including a set of corresponding nodes; and
distribute nodes of the set of corresponding nodes of each per-pattern NFA generated for storage in the plurality of memories based on the hierarchical levels mapped and per-pattern NFA storage allocation settings configured for the hierarchical levels.

2. The security device of claim 1, wherein the at least one processor is further configured to statically store the nodes of the set of corresponding nodes of each per-pattern NFA generated in the plurality of memories based on the distributing.

3. The security device of claim 1, wherein generating the at least one per-pattern NFA includes designating a transition from a given node to a next node via a next node address associated with the given node, the next node address configured to identify the next node and a given memory of the plurality of memories to which the next node is distributed for storage.

4. The security device of claim 1, wherein each per-pattern NFA storage allocation setting is configured for a corresponding hierarchical level of the plurality of hierarchical levels to denote a target number of unique nodes of the set of corresponding nodes of each per-pattern NFA generated to distribute for storage in a given memory of the plurality of memories mapped to the corresponding hierarchical level.

5. The security device of claim 4, wherein the unique nodes of the set of corresponding nodes are arranged in a consecutive manner within a given per-pattern NFA of the at least one per-pattern NFA that includes the set of corresponding nodes.

6. The security device of claim 4, wherein each per-pattern NFA storage allocation setting is configured for the corresponding hierarchical level in a manner enabling the given memory to provide sufficient storage capacity for storing the target number of unique nodes denoted from each per-pattern NFA generated, in an event a per-pattern NFA is generated for each regular expression pattern in a set of regular expression patterns.

7. The security device of claim 4, wherein the target number of unique nodes is denoted via an absolute value that is a common value for each set of corresponding nodes, enabling each set of corresponding nodes to have a same value for the target number of unique nodes for storing in the given memory mapped to the corresponding hierarchical level.

8. The security device of claim 4, wherein the target number of unique nodes is denoted via a percentage value for applying to a total number of corresponding nodes of each set of corresponding nodes, enabling each set of corresponding nodes to have a separate value for the target number of unique nodes for storing in the given memory mapped to the corresponding hierarchical level.

9. The security device of claim 1, wherein each memory of the plurality of memories is mapped consecutively, in descending order based on a performance ranking of the plurality of memories, to a corresponding hierarchical level of the plurality of hierarchical levels, and a highest performance ranked memory of the plurality of memories is mapped to a highest-ordered hierarchical level of the plurality of hierarchical levels.

10. The security device of claim 1, wherein the per-pattern NFA storage allocation settings include a first per-pattern NFA storage allocation setting and a second per-pattern NFA storage allocation setting, the plurality of hierarchical levels includes a highest-ordered hierarchical level and a next-highest-ordered hierarchical level, the first per-pattern NFA storage allocation setting configured for the highest-ordered hierarchical level, and the second per-pattern NFA storage allocation setting configured for the next-highest-ordered hierarchical level.

11. The security device of claim 1, wherein distributing the nodes of the set of corresponding nodes of each per-pattern NFA generated includes distributing the nodes of the set of corresponding nodes in a consecutive manner, the distributing including:
a first distribution of nodes of the set of corresponding nodes for storage in a first memory of the plurality of memories, the first memory mapped to a highest-ordered hierarchical level of the plurality of hierarchical levels; and
at least one second distribution of nodes of the set of corresponding nodes based on at least one undistributed node remaining in the set of corresponding nodes after a previous distribution, each at least one second distribution for storage in a given memory of the plurality of memories, the given memory mapped to a given hierarchical level of the plurality of hierarchical levels, the given hierarchical level consecutively lower than the highest-ordered hierarchical level for each distribution of nodes of the set of corresponding nodes.

12. The security device of claim 11, wherein, if the given hierarchical level is a lowest-ordered hierarchical level of the plurality of hierarchical levels, a given distribution of the at least one second distribution includes all remaining undistributed nodes of the set of corresponding nodes.

13. The security device of claim 11, wherein the at least one processor is further configured to maximize a number of nodes in the first distribution, the maximized number limited by a corresponding per-pattern NFA storage allocation setting of the per-pattern NFA storage allocation settings configured for the highest-ordered hierarchical level.

14. The security device of claim 11, wherein the at least one processor is further configured to maximize a number of nodes in each at least one second distribution, the maximized number limited for each distribution by a corresponding per-pattern NFA storage allocation setting of the per-pattern NFA storage allocation settings configured for the given hierarchical level.

15. The security device of claim 1, wherein the nodes of the set of corresponding nodes of each per-pattern NFA are statically stored to enable the nodes of each set of corresponding nodes to be walked with segments of payloads of an input stream received from the network to determine a match of at least one regular expression pattern in the input stream.

16. The security device of claim 15, wherein the walking proceeds based on individual nodes of the set of corresponding nodes matching the segments of the payloads, and the at least one processor is further configured to determine the per-pattern NFA storage allocation settings based on a total number of regular expression patterns in a set of regular expression patterns and a desired performance metric associated with the walking.

17. The security device of claim 1, wherein the at least one processor is further configured to:
generate a unified deterministic finite automaton (DFA) based on at least one subpattern selected from each pattern in a set of regular expression patterns; and
store the unified DFA in a given memory of the plurality of memories.

18. The security device of claim 1, wherein the plurality of memories includes a first memory, a second memory, and a third memory, the first and second memories co-located with the at least one processor on a chip, and the third memory is an external memory located off-chip and mapped to a lowest-ordered hierarchical level of the plurality of hierarchical levels.

19. A method of implementing per-pattern non-deterministic finite automata, the method comprising:
in at least one processor operatively coupled to a plurality of memories mapped to a plurality of hierarchical levels of a memory hierarchy in a security device operatively coupled to a network:
generating at least one per-pattern non-deterministic finite automaton (NFA), each per-pattern NFA generated for a single regular expression pattern and including a set of corresponding nodes; and
distributing nodes of the set of corresponding nodes of each per-pattern NFA for storage in the plurality of memories based on the hierarchical levels mapped and per-pattern NFA storage allocation settings configured for the hierarchical levels.

20. The method of claim 19, further comprising statically storing the nodes of the set of corresponding nodes of each per-pattern NFA generated in the plurality of memories based on the distributing.

21. The method of claim 19, wherein generating the at least one per-pattern NFA includes designating a transition from a given node to a next node via a next node address associated with the given node, and configuring the next node address to identify the next node and a given memory of the plurality of memories to which the next node is distributed for storage.

22. The method of claim 19, further comprising configuring each per-pattern NFA storage allocation setting for a corresponding hierarchical level of the plurality of hierarchical levels to denote a target number of unique nodes of the set of corresponding nodes of each per-pattern NFA generated to distribute for storage in a given memory of the plurality of memories mapped to the corresponding hierarchical level.

23. The method of claim 22, wherein the unique nodes of the set of corresponding nodes are arranged in a consecutive manner within a given per-pattern NFA of the at least one per-pattern NFA that includes the set of corresponding nodes.

24. The method of claim 22, further comprising configuring each per-pattern NFA storage allocation setting for the corresponding hierarchical level in a manner enabling the given memory to provide sufficient storage capacity for storing the target number of unique nodes denoted from each per-pattern NFA generated, in an event a per-pattern NFA is generated for each regular expression pattern in a set of regular expression patterns.

25. The method of claim 22, wherein the target number of unique nodes is denoted via an absolute value that is a common value for each set of corresponding nodes, enabling each set of corresponding nodes to have a same value for the target number of unique nodes for storing in the given memory mapped to the corresponding hierarchical level.

26. The method of claim 22, further comprising denoting the target number of unique nodes via a percentage value for applying to a total number of corresponding nodes of each set of corresponding nodes, enabling each set of corresponding nodes to have a separate value for the target number of unique nodes for storing in the given memory mapped to the corresponding hierarchical level.

27. The method of claim 19, wherein each memory of the plurality of memories is mapped consecutively, in descending order based on a performance ranking of the plurality of memories, to a corresponding hierarchical level of the plurality of hierarchical levels, and a highest performance ranked memory of the plurality of memories is mapped to a highest-ordered hierarchical level of the plurality of hierarchical levels.

28. The method of claim 19, wherein the per-pattern NFA storage allocation settings include a first per-pattern NFA storage allocation setting and a second per-pattern NFA storage allocation setting, the plurality of hierarchical levels includes a highest-ordered hierarchical level and a next-highest-ordered hierarchical level, the first per-pattern NFA storage allocation setting configured for the highest-ordered hierarchical level, and the second per-pattern NFA storage allocation setting configured for the next-highest-ordered hierarchical level.

29. The method of claim 19, wherein distributing the nodes of the set of corresponding nodes of each per-pattern NFA generated includes distributing the nodes of the set of corresponding nodes in a consecutive manner, the distributing including:
a first distribution of nodes of the set of corresponding nodes for storage in a first memory of the plurality of memories, the first memory mapped to a highest-ordered hierarchical level of the plurality of hierarchical levels; and
at least one second distribution of nodes of the set of corresponding nodes based on at least one undistributed node remaining in the set of corresponding nodes after a previous distribution, each at least one second distribution for storage in a given memory of the plurality of memories, the given memory mapped to a given hierarchical level of the plurality of hierarchical levels, the given hierarchical level consecutively lower than the highest-ordered hierarchical level for each distribution of nodes of the set of corresponding nodes.

30. The method of claim 29, wherein, if the given hierarchical level is a lowest-ordered hierarchical level of the plurality of hierarchical levels, a given distribution of the at least one second distribution includes all remaining undistributed nodes of the set of corresponding nodes.

31. The method of claim 29, further comprising maximizing a number of nodes in the first distribution, the maximized number limited by a corresponding per-pattern NFA storage allocation setting of the per-pattern NFA storage allocation settings configured for the highest-ordered hierarchical level.

32. The method of claim 29, further comprising maximizing a number of nodes in each at least one second distribution, the maximized number limited for each distribution by a corresponding per-pattern NFA storage allocation setting of the per-pattern NFA storage allocation settings configured for the given hierarchical level.

33. The method of claim 19, further comprising statically storing the nodes of the set of corresponding nodes of each per-pattern NFA to enable the nodes of each set of corresponding nodes to be walked with segments of payloads of an input stream received from the network to determine a match of at least one regular expression pattern in the input stream.

34. The method of claim 33, wherein the walking proceeds based on individual nodes of the set of corresponding nodes matching the segments of the payloads, and the method further comprises determining the per-pattern NFA storage allocation settings based on a total number of regular expression patterns in a set of regular expression patterns and a desired performance metric associated with the walking.

35. The method of claim 19, further comprising:
generating a unified deterministic finite automaton (DFA) based on at least one subpattern selected from each pattern in a set of regular expression patterns; and
storing the unified DFA in a given memory of the plurality of memories.

36. The method of claim 19, wherein the plurality of memories includes a first memory, a second memory, and a third memory, the first and second memories co-located with the at least one processor on a chip, and the third memory is an external memory located off-chip and mapped to a lowest-ordered hierarchical level of the plurality of hierarchical levels.

37. A non-transitory computer-readable medium having encoded thereon a sequence of instructions which, when loaded and executed by a processor operatively coupled to a plurality of memories mapped to a plurality of hierarchical levels of a memory hierarchy, causes the processor to:
generate at least one per-pattern non-deterministic finite automaton (NFA), each per-pattern NFA generated for a single regular expression pattern and including a set of corresponding nodes; and
distribute nodes of the set of corresponding nodes of each per-pattern NFA for storage in the plurality of memories based on the hierarchical levels mapped and per-pattern NFA storage allocation settings configured for the hierarchical levels.
HK16102603.7A 2014-04-14 2016-03-07 Compilation of finite automata based on memory hierarchy HK1214695B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/252,293 US10002326B2 (en) 2014-04-14 2014-04-14 Compilation of finite automata based on memory hierarchy
US14/252,293 2014-04-14

Publications (2)

Publication Number Publication Date
HK1214695A1 HK1214695A1 (en) 2016-07-29
HK1214695B true HK1214695B (en) 2019-09-13

