US20220124185A1 - Terabit-scale network packet processing via flow-level parallelization - Google Patents
- Publication number
- US20220124185A1 (application US 17/566,633)
- Authority
- US
- United States
- Prior art keywords
- data packet
- packets
- partition
- flow
- flow key
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/22—Parsing or analysis of headers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2408—Traffic characterised by specific attributes, e.g. priority or QoS for supporting different services, e.g. a differentiated services [DiffServ] type of service
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/9063—Intermediate storage in different physical parts of a node or terminal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/9063—Intermediate storage in different physical parts of a node or terminal
- H04L49/9068—Intermediate storage in different physical parts of a node or terminal in the network interface card
Definitions
- the present disclosure relates generally to data mining, and relates more particularly to devices, non-transitory computer-readable media, and methods for organizing packet flows for downstream processing stages.
- Data mining has become a valuable tool for helping network service providers to analyze and understand their customers' service-related needs. For instance, information can be extracted from a data set (e.g., a set of packets exchanged between network endpoints) and transformed into a structure that can be analyzed for the occurrence of patterns, relationships, and other statistics that indicate how the customers are using the network.
- a method includes extracting, by a network interface card of an application server, a first flow key from a first data packet, inputting, by the network interface card, the first flow key into a hash function to obtain a first output value, selecting, by the network interface card, a first partition in a memory of the application server to which to store the first data packet, wherein the first partition is selected based on the first output value, and storing, by the network interface card, the first data packet to the first partition.
- a device in another example, includes a processor and a computer-readable medium storing instructions which, when executed by the processor, cause the processor to perform operations.
- the operations include extracting a first flow key from a first data packet, inputting the first flow key into a hash function to obtain a first output value, selecting a first partition in a memory to which to store the first data packet, wherein the first partition is selected based on the first output value, and storing the first data packet to the first partition.
- an apparatus in another example, includes a first network interface card and a second network interface card.
- the first network interface card is configured to identify, by applying a first hash function to a first flow key extracted from a first data packet, a first flow of packets of a plurality of flows of packets to which the first data packet belongs.
- the second network interface card is configured to identify, by applying the first hash function to a second flow key extracted from a second data packet, a second flow of packets of the plurality of flows of packets to which the second data packet belongs.
- the apparatus also includes a memory, wherein a first partition of the memory is assigned to the first flow of packets and a second partition of the memory is assigned to the second flow of packets.
- the apparatus also includes a plurality of processors configured to execute a plurality of threads including a first thread and a second thread, wherein the first thread is programmed to retrieve data packets from the first partition and the second thread is programmed to retrieve data packets from the second partition.
- FIG. 1 illustrates an example network related to the present disclosure
- FIG. 2 is a block diagram illustrating one example of the memory of FIG. 1 in more detail
- FIG. 3 illustrates a flowchart of an example method for organizing terabit-scale packet volumes into flows for downstream processing stages
- FIG. 4 depicts a high-level block diagram of a computing device specifically programmed to perform the functions described herein.
- the present disclosure organizes terabit-scale packet volumes into flows for downstream processing stages.
- data mining has become a valuable tool for helping network service providers to analyze and understand their customers' service-related needs.
- Network traffic can be analyzed for patterns, relationships, and other statistics that indicate how the customers are using the network.
- as traffic volumes increase (e.g., to the terabit scale) and real-time analysis applications are moved to the cloud, these applications must adapt to the highly distributed environment and the increasing volume of traffic.
- Parallelization (e.g., processing of multiple data items simultaneously, or in parallel) is one way to cope with such volumes. However, parallelization at the packet level is infeasible. For instance, the number of incoming packets could vastly overwhelm the number of threads available to process the packets.
- Examples of the present disclosure provide a way of organizing terabit-rate packet volumes into flows for downstream processing stages that may be performed in parallel. Although parallelization at the packet level has been shown to be infeasible at terabit rates, by efficiently organizing the packets into packet flows, examples of the present disclosure are able to achieve terabit-rate parallelization at the flow-level.
- packet traffic traversing the network is replicated, and the replicated or “mirrored” versions of the original packets (hereinafter referred to simply as “packets”) are subsequently organized into flows, which are in turn uniquely assigned to respective processing threads of a host computing system (e.g., an application server).
- the header of a packet is scanned by an intelligent (i.e., programmable) network interface card (NIC) for a flow key, which is input into a hash function.
- the result of the hash function operating on the flow key is a value that corresponds to a thread identifier, where the thread identified by the thread identifier is assigned to process the flow of packets to which the packet belongs.
- the packet is then stored by the NIC in a partition in memory that is accessible by the corresponding thread.
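The pipeline described above (extract flow key, hash it, use the output as a thread/partition identifier) can be sketched as follows. The disclosure does not name a specific hash function, so SHA-256 truncated and reduced modulo an assumed partition count stands in for it here; the field values are likewise illustrative:

```python
import hashlib

# Assumed value for illustration; the disclosure makes this configurable.
NUM_PARTITIONS = 8


def flow_key(src_ip, dst_ip, src_port, dst_port, tos):
    """Join the 5-tuple fields into a single flow key string."""
    return f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{tos}"


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a flow key to a partition (and thread) identifier.

    The hash is deterministic, so every packet of a flow maps to the
    same partition no matter which NIC computes it.
    """
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


key = flow_key("10.0.0.1", "10.0.0.2", 51000, 443, 0)
```

Because the mapping depends only on the flow key, no coordination between NICs is needed to keep a flow pinned to one partition.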
- FIG. 1 illustrates an example network 100 , related to the present disclosure.
- the network 100 may be any type of communications network, such as for example, a traditional circuit switched network (CS) (e.g., a public switched telephone network (PSTN)) or an Internet Protocol (IP) network (e.g., an IP Multimedia Subsystem (IMS) network, an asynchronous transfer mode (ATM) network, a wireless network, a cellular network (e.g., 2G, 3G and the like), a long term evolution (LTE) network, and the like) related to the current disclosure.
- IP network is broadly defined as a network that uses Internet Protocol to exchange data packets.
- Additional exemplary IP networks include Voice over IP (VoIP) networks, Service over IP (SoIP) networks, and the like.
- the network 100 may comprise a core network 102 .
- core network 102 may combine core network components of a cellular network with components of a triple play service network; where triple play services include telephone services, Internet services, and television services to subscribers.
- core network 102 may functionally comprise a fixed mobile convergence (FMC) network, e.g., an IP Multimedia Subsystem (IMS) network.
- core network 102 may functionally comprise a telephony network, e.g., an Internet Protocol/Multi-Protocol Label Switching (IP/MPLS) backbone network utilizing Session Initiation Protocol (SIP) for circuit-switched and Voice over Internet Protocol (VoIP) telephony services.
- Core network 102 may also further comprise an Internet Service Provider (ISP) network.
- the core network 102 may include a traffic analysis point (TAP) 104 , a multiplexer 106 , and an application server 126 .
- various additional elements of core network 102 are omitted from FIG. 1 , including switches, routers, firewalls, web servers, and the like.
- the core network 102 may be in communication with one or more wireless access networks 120 and 122 .
- Either or both of the access networks 120 and 122 may include a radio access network implementing such technologies as: global system for mobile communication (GSM), e.g., a base station subsystem (BSS), or IS-95, a universal mobile telecommunications system (UMTS) network employing wideband code division multiple access (WCDMA), or a CDMA2000 network, among others.
- either or both of the access networks 120 and 122 may comprise an access network in accordance with any “second generation” (2G), “third generation” (3G), “fourth generation” (4G), Long Term Evolution (LTE), or any other yet to be developed future wireless/cellular network technology including “fifth generation” (5G) and further generations.
- the operator of core network 102 may provide a data service to subscribers via access networks 120 and 122 .
- the access networks 120 and 122 may all be different types of access networks, may all be the same type of access network, or some access networks may be the same type of access network and others may be different types of access networks.
- the core network 102 and the access networks 120 and 122 may be operated by different service providers, the same service provider or a combination thereof.
- the access network 120 may be in communication with one or more user endpoint devices (also referred to as “endpoint devices” or “UE”) 108 and 110 , while the access network 122 may be in communication with one or more user endpoint devices 112 and 114 .
- Access networks 120 and 122 may transmit and receive communications between respective UEs 108 , 110 , 112 , and 114 and core network 102 relating to communications with web servers, TAP 104 , and/or other servers via the Internet and/or other networks, and so forth.
- the user endpoint devices 108 , 110 , 112 , and 114 may be any type of subscriber/customer endpoint device configured for wireless communication such as a laptop computer, a Wi-Fi device, a Personal Digital Assistant (PDA), a mobile phone, a smartphone, an email device, a computing tablet, a messaging device, a wearable “smart” device (e.g., a smart watch or fitness tracker), a portable media device (e.g., an MP3 player), a gaming console, a portable gaming device, a set top box, a smart television, and the like.
- any one or more of the user endpoint devices 108 , 110 , 112 , and 114 may have both cellular and non-cellular access capabilities and may further have wired communication and networking capabilities (e.g., such as a desktop computer). It should be noted that although only four user endpoint devices are illustrated in FIG. 1 , any number of user endpoint devices may be deployed.
- the TAP 104 is configured to mirror or replicate data packets traversing the core network 102 and to send the replicated data packets (hereinafter referred to as “packets” or “data packets”) to the multiplexer 106 .
- the TAP 104 is an optical TAP that mirrors the data packets in a manner that is transparent to the UEs 108 , 110 , 112 , and 114 (i.e., without noticeably disrupting the network activity).
- the multiplexer 106 executes a load balancing algorithm in order to distribute the data packets among n intelligent network interface cards 116 1 - 116 n (hereinafter collectively referred to as “NICs 116 ”) of the application server 126 .
- the data packets may be distributed to the NICs 116 in a round robin fashion, a weighted round robin fashion, a random fashion, or according to any other load balancing algorithm.
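As a sketch of the simplest of these policies, a round-robin distributor cycles packet assignments across the NIC indices; the NIC count and packet labels here are illustrative, not from the disclosure:

```python
from itertools import cycle


def round_robin_distribute(packets, num_nics):
    """Assign each packet to a NIC index, cycling 0, 1, ..., num_nics - 1."""
    nic_indices = cycle(range(num_nics))
    return [(packet, next(nic_indices)) for packet in packets]


assignments = round_robin_distribute(["p0", "p1", "p2", "p3", "p4"], num_nics=3)
```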
- Each of the NICs 116 scans the header of each data packet that it receives and extracts a flow key. Data packets belonging to the same flow of packets will contain the same flow key. For instance, all data packets belonging to a first flow of packets will contain a first flow key, while all data packets belonging to a second flow of packets will contain a second flow key that is different from the first flow key.
- the flow key is a 5-tuple defining the Transmission Control Protocol/Internet Protocol (TCP/IP) connection via which the data packet travels.
- the 5-tuple includes: the source IP address, the destination IP address, the source port number (e.g., Transmission Control Protocol/User Datagram Protocol or TCP/UDP port number), the destination port number (e.g., TCP/UDP port number), and the type of service (ToS).
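A minimal software sketch of extracting this 5-tuple from a raw IPv4 packet follows. It is illustrative only: a real intelligent NIC would do this parsing in hardware or FPGA logic and would also handle IPv6, tunneled traffic, and non-TCP/UDP protocols, none of which are covered here. The sample packet bytes are constructed for the example:

```python
import socket
import struct


def extract_flow_key(packet: bytes):
    """Parse the 5-tuple flow key (source IP, destination IP, source port,
    destination port, ToS) from a raw IPv4 packet carrying TCP or UDP."""
    ihl = (packet[0] & 0x0F) * 4              # IPv4 header length in bytes
    tos = packet[1]                           # type of service field
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    # Both TCP and UDP begin with 16-bit source and destination ports.
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    return (src_ip, dst_ip, src_port, dst_port, tos)


# A minimal 20-byte IPv4 header (IHL=5, ToS=0, protocol=6/TCP) followed by
# the first four bytes of a TCP header (source and destination ports).
ip_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 24, 0, 0, 64, 6, 0,
    socket.inet_aton("192.168.1.10"), socket.inet_aton("10.0.0.5"),
)
tcp_ports = struct.pack("!HH", 51000, 443)
key = extract_flow_key(ip_header + tcp_ports)
```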
- the NIC 116 then inputs the flow key into a hash function.
- each NIC 116 may comprise a processor (e.g., a central processing unit) or a field programmable gate array (FPGA) to run the hash function.
- each of the NICs 116 uses the same hash function to ensure uniform assignment of packet flows to processing threads.
- the hash function may be deterministic, such that the assignment of a packet to a packet flow, and of a packet flow to a processing thread, is predictable (e.g., not random).
- the output value of the hash function comprises a thread identifier that corresponds to a specific processing thread executing on one of the processors 124 of the application server 126 .
- the output value of the hash function will be the same for all data packets belonging to the same flow of packets, regardless of which NICs 116 receive the data packets.
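One way to guarantee this NIC-independent behavior is for every NIC to apply the same stable, unsalted hash. A sketch, with CRC-32 standing in for the unspecified hash function and an assumed thread count:

```python
import zlib

# Assumed thread count for illustration.
NUM_THREADS = 16


class Nic:
    """Every NIC applies the same stable hash, so a given flow key yields
    the same thread identifier regardless of which NIC receives the packet."""

    def __init__(self, num_threads: int = NUM_THREADS):
        self.num_threads = num_threads

    def thread_id_for(self, flow_key: str) -> int:
        # zlib.crc32 is stable across processes and machines, unlike
        # Python's built-in hash(), which is salted per interpreter run.
        return zlib.crc32(flow_key.encode()) % self.num_threads


nic_a, nic_b = Nic(), Nic()
key = "10.0.0.1|10.0.0.2|51000|443|0"
```

A salted or per-device hash would break the scheme: two NICs receiving packets of the same flow would route them to different threads.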
- the NICs 116 may tag the data packets with the output value of the hash function before storing the data packets in the memory 118 of the application server 126 .
- FIG. 2 is a block diagram illustrating one example of the memory 118 of FIG. 1 in more detail.
- the memory 118 is divided into a plurality of partitions 200 1 - 200 m (hereinafter collectively referred to as “partitions 200 ”).
- the partitions 200 may occupy contiguous blocks of the memory 118 .
- the number of and the sizes of the partitions 200 are configurable, and may be reconfigured on-the-fly to accommodate packet flows of varying sizes and changing network conditions. For instance, the number of partitions may be increased when service times decrease and/or when the number of threads executing on the processors 124 increases.
- the maximum number of partitions 200 may be empirically determined, and in one embodiment the maximum number of partitions 200 does not exceed a value that would cause an imbalance (i.e., a disproportionate share of packet flows being assigned to one partition 200 , where “disproportionate” may be defined as some configurable percentage of packet flows beyond the mean or median number of packet flows assigned to all of the partitions 200 ) across the partitions 200 .
- packet distribution across the partitions 200 is uniformly random, but may exhibit periods of intense imbalance.
- the packet flow-to-partition assignment is flow modulo x, where x is the number of partitions 200 .
- each of the partitions 200 is assigned to one flow of packets.
- the NICs 116 select the appropriate partitions 200 to which to store the data packets based on the output values of the hash function. In other words, the output value of the hash function for a particular data packet will determine the partition 200 to which the data packet should be stored.
- Data packets stored in the partitions 200 may be queued up in a work queue 202 from which threads executing on the processors 124 of the application server 126 retrieve the data packets for processing. Queuing of the data packets may be based on a round robin service model, a pseudo-random service model, or any other service model.
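The partition-per-thread arrangement can be sketched with one queue per partition and one worker thread per queue. The partition count and the modulo placement rule below are illustrative stand-ins for the hash-based selection described above:

```python
import queue
import threading

NUM_PARTITIONS = 4  # assumed value for illustration
partitions = [queue.Queue() for _ in range(NUM_PARTITIONS)]
processed = [[] for _ in range(NUM_PARTITIONS)]


def worker(idx: int) -> None:
    """Drain one partition only; since no two threads share a partition,
    no locking of the packet data itself is required."""
    while True:
        packet = partitions[idx].get()
        if packet is None:  # sentinel: no more packets for this partition
            break
        processed[idx].append(packet)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_PARTITIONS)]
for t in threads:
    t.start()

# A NIC would enqueue each packet to the partition chosen by the hash
# output; a simple modulo over a packet counter stands in for that here.
for packet_id in range(10):
    partitions[packet_id % NUM_PARTITIONS].put(f"pkt-{packet_id}")
for q in partitions:
    q.put(None)  # signal end of input to each worker
for t in threads:
    t.join()
```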
- the application server 126 further comprises a plurality of processors 124 .
- Each of the processors 124 further supports a plurality of threads, where each thread of the plurality of threads is assigned to process data packets from a unique flow of packets.
- each thread is further assigned to one of the partitions 200 in the memory 118 of the application server 126 .
- a first thread may retrieve data packets from a first partition
- a second thread may retrieve data packets from a second partition.
- the processors 124 may support parallel processing of a plurality of packet flows, where the individual packets of the packet flows are traversing the network 100 at terabit rates.
- one or more of the processors 124 may also host a set of instructions for running the hash function into which the flow keys are input (e.g., as an alternative to the NICs 116 running the hash function).
- the output value of the hash function will dictate to which partition 200 in memory 118 the data packet is stored.
- the partition 200 will dictate which thread executing on the processors 124 accesses the data packet for further processing. Because the flow key does not change for the life of the flow of packets, and because the same hash function is used by all of the NICs 116 , the assignment of a flow of packets to a processing thread persists, without the need for blocking or synchronization. Data packets can thus be efficiently organized into flows of packets, and flows of packets can be uniquely assigned to processing threads.
- examples of the present disclosure are thus able to achieve efficient parallelization in a network where packet volumes approach terabit rates.
- although the appropriate partition 200 in memory 118 and the appropriate thread in the processors 124 for a given data packet will be dictated by the same information (i.e., the output value of the hash function), the correspondence between the number of partitions 200 and the number of threads is not necessarily one-to-one. In general, the greater the ratio of partitions 200 to threads, the less likely it will be that two or more threads will collide on (i.e., attempt to concurrently access) the same partition 200 . When parallelization is achieved at partition-level granularity as disclosed, collisions are more likely to occur during periods of cross-partition imbalance. An increase in sustained imbalance periods (i.e., durations of time during which imbalances are present) may also cause an increase in the number of partitions 200 .
- increasing the number of partitions 200 in the memory 118 may minimize thread collisions. Collisions can be further minimized by ensuring that a partition 200 is not assigned to a new thread until the currently assigned thread has finished operating on its flow of packets. In one example, this is enforced by imposing a “drain period” before increasing the number of partitions from a first number to a second number and redistributing the flows of packets to the second number of partitions. During the drain period, the threads complete processing on the data packets that they have already retrieved from the first number of partitions. Once the last thread finishes processing its data packets, the drain period ends, the second number of partitions is instantiated, and the flows of packets are redistributed to the second number of partitions.
- Redistribution of the flows of data packets may result in a flow of packets being processed by a new thread; however, by imposing the drain period, the chances of the new thread processing the flow of packets at the same time as the old thread are minimized.
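The drain-period repartitioning might be sketched as follows, with `drain` standing in for whatever synchronization waits for in-flight packets to finish, and with the "flow modulo x" rule as the assignment function. All names and counts here are illustrative assumptions:

```python
def repartition(flow_hashes, old_count, new_count, drain):
    """Grow the partition set from old_count to new_count partitions.

    `drain` is a placeholder callable that blocks until every thread has
    finished processing the packets it already retrieved from the old
    partitions; only after it returns are flows redistributed.
    """
    old_assignment = {h: h % old_count for h in flow_hashes}
    drain()  # drain period: old threads finish before redistribution
    new_assignment = {h: h % new_count for h in flow_hashes}
    return old_assignment, new_assignment


# Eight flows, growing from 2 to 4 partitions; a no-op drain for the demo.
old, new = repartition(range(8), old_count=2, new_count=4, drain=lambda: None)
```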
- Flow-level locking may be imposed to minimize the duration of the drain period. In this case, the flow-level locking takes advantage of the dynamics inherently present in very large networks, where the probability of consecutive data packets belonging to the same flow of data packets at a single observation point (e.g., the TAP 104 ) is very small.
- any one or more of the TAP 104 , multiplexer 106 , application server 126 , or NICs 116 may comprise or be configured as a general purpose computer as illustrated in FIG. 4 and discussed below.
- the terms “configure” and “reconfigure” may refer to programming or loading a computing device with computer-readable/computer-executable instructions, code, and/or programs, e.g., in a memory, which when executed by a processor of the computing device, may cause the computing device to perform various functions.
- Such terms may also encompass providing variables, data values, tables, objects, or other data structures or the like which may cause a computer device executing computer-readable instructions, code, and/or programs to function differently depending upon the values of the variables or other data structures that are provided.
- the network 100 has been simplified.
- the network 100 may include other network elements (not shown) such as border elements, routers, switches, policy servers, security devices, a content distribution network (CDN) and the like.
- the network 100 may also be expanded by including additional endpoint devices, access networks, network elements, application servers, etc. without altering the scope of the present disclosure.
- FIG. 3 illustrates a flowchart of an example method 300 for organizing terabit-scale packet volumes into flows for downstream processing stages.
- the method 300 may be performed by an intelligent NIC, e.g., one of the NICs 116 illustrated in FIG. 1 .
- the method 300 may be performed by another device.
- any references in the discussion of the method 300 to the NICs 116 of FIG. 1 (or any other elements of FIG. 1 ) are not intended to limit the means by which the method 300 may be performed.
- the method 300 begins in step 302 .
- the NIC 116 receives a data packet from the multiplexer 106 .
- the data packet is a replica of a data packet that was exchanged between two endpoints in the network 100 (e.g., between two of the UEs 108 , 110 , 112 , and 114 ).
- the data packet may have been directed to the NIC 116 in accordance with any load balancing algorithm.
- the NIC 116 extracts a flow key from the data packet.
- the flow key is extracted from the data packet's header and comprises a 5-tuple of source IP address, destination IP address, source port number, destination port number, and ToS.
- in step 308 , the NIC 116 inputs the flow key into a hash function.
- the hash function produces an output value based on the input flow key.
- in step 310 , the NIC selects a partition 200 in memory 118 to which to store the data packet, based on the output value of the hash function.
- the output value of the hash function comprises a thread identifier that dictates both: (1) the corresponding thread executing on the processors 124 that will process the flow of packets to which the data packet belongs; and (2) the partition 200 in memory 118 to which to store the data packets of the flow of packets for retrieval by the thread.
- in step 312 , the NIC stores the data packet to the partition 200 in memory 118 that was selected in step 310 .
- the method 300 ends in step 314 .
- one or more steps of the method 300 may include a storing, displaying, and/or outputting step as required for a particular application.
- any data, records, fields, and/or intermediate results discussed in the method can be stored, displayed and/or outputted to another device as required for a particular application.
- operations, steps, or blocks in FIG. 3 that recite a determining operation or involve a decision do not necessarily require that both branches of the determining operation be practiced. In other words, one of the branches of the determining operation can be deemed as an optional step.
- operations, steps or blocks of the above described method(s) can be combined, separated, and/or performed in a different order from that described above, without departing from the examples of the present disclosure.
- FIG. 4 depicts a high-level block diagram of a computing device specifically programmed to perform the functions described herein.
- any one or more components or devices illustrated in FIG. 1 or described in connection with the method 300 may be implemented as the system 400 .
- any one of the NICs 116 of FIG. 1 (such as might be used to perform the method 300 ) could be implemented as illustrated in FIG. 4 .
- the application server 126 as a whole could be implemented as illustrated in FIG. 4 .
- the system 400 comprises a hardware processor element 402 , a memory 404 , a module 405 for organizing terabit-scale packet volumes into flows, and various input/output (I/O) devices 406 .
- the hardware processor 402 may comprise, for example, a microprocessor, a central processing unit (CPU), or the like.
- the memory 404 may comprise, for example, random access memory (RAM), read only memory (ROM), a disk drive, an optical drive, a magnetic drive, and/or a Universal Serial Bus (USB) drive.
- the module 405 for organizing terabit-scale packet volumes into flows may include circuitry and/or logic for performing special purpose functions relating to data mining, including a code component 408 for executing the hash function described above (where each NIC that is configured as illustrated in FIG. 4 includes the same code component 408 executing the same hash function).
- the input/output devices 406 may include, for example, storage devices (including but not limited to, a tape drive, a floppy drive, a hard disk drive or a compact disk drive), a receiver, a transmitter, a fiber optic communications line, an output port, or a user input device (such as a keyboard, a keypad, a mouse, and the like).
- the general-purpose computer may employ a plurality of processor elements.
- the general-purpose computer of this Figure is intended to represent each of those multiple general-purpose computers.
- one or more hardware processors can be utilized in supporting a virtualized or shared computing environment.
- the virtualized computing environment may support one or more virtual machines representing computers, servers, or other computing devices. In such virtualized environments, hardware components such as hardware processors and computer-readable storage devices may be virtualized or logically represented.
- the present disclosure can be implemented in software and/or in a combination of software and hardware, e.g., using application specific integrated circuits (ASIC), a programmable logic array (PLA), including a field-programmable gate array (FPGA), or a state machine deployed on a hardware device, a general purpose computer or any other hardware equivalents, e.g., computer readable instructions pertaining to the method(s) discussed above can be used to configure a hardware processor to perform the steps, functions and/or operations of the above disclosed method(s).
- instructions and data for the present module or process 405 for organizing terabit-scale packet volumes into flows can be loaded into memory 404 and executed by hardware processor element 402 to implement the steps, functions or operations as discussed above in connection with the example method 300 .
- when a hardware processor executes instructions to perform “operations,” this could include the hardware processor performing the operations directly and/or facilitating, directing, or cooperating with another hardware device or component (e.g., a co-processor and the like) to perform the operations.
- the processor executing the computer readable or software instructions relating to the above described method(s) can be perceived as a programmed processor or a specialized processor.
- the present module 405 for organizing terabit-scale packet volumes into flows (including associated data structures) of the present disclosure can be stored on a tangible or physical (broadly non-transitory) computer-readable storage device or medium, e.g., volatile memory, non-volatile memory, ROM memory, RAM memory, magnetic or optical drive, device or diskette and the like.
- the computer-readable storage device may comprise any physical devices that provide the ability to store information such as data and/or instructions to be accessed by a processor or a computing device such as a computer or an application server.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
Abstract
Description
- This application is a continuation of U.S. patent application Ser. No. 16/896,161, filed Jun. 8, 2020, (currently allowed) which is a continuation of U.S. patent application Ser. No. 15/598,673, filed on May 18, 2017, now U.S. Pat. No. 10,681,189, both of which are herein incorporated by reference in their entirety.
- The present disclosure relates generally to data mining, and relates more particularly to devices, non-transitory computer-readable media, and methods for organizing packet flows for downstream processing stages.
- Data mining has become a valuable tool for helping network service providers to analyze and understand their customers' service-related needs. For instance, information can be extracted from a data set (e.g., a set of packets exchanged between network endpoints) and transformed into a structure that can be analyzed for the occurrence of patterns, relationships, and other statistics that indicate how the customers are using the network.
- In one example, the present disclosure describes a device, computer-readable medium, and method for organizing terabit-scale packet volumes into flows for downstream processing stages. For instance, in one example, a method includes extracting, by a network interface card of an application server, a first flow key from a first data packet, inputting, by the network interface card, the first flow key into a hash function to obtain a first output value, selecting, by the network interface card, a first partition in a memory of the application server to which to store the first data packet, wherein the first partition is selected based on the first output value, and storing, by the network interface card, the first data packet to the first partition.
- In another example, a device includes a processor and a computer-readable medium storing instructions which, when executed by the processor, cause the processor to perform operations. The operations include extracting a first flow key from a first data packet, inputting the first flow key into a hash function to obtain a first output value, selecting a first partition in a memory to which to store the first data packet, wherein the first partition is selected based on the first output value, and storing the first data packet to the first partition.
- In another example, an apparatus includes a first network interface card and a second network interface card. The first network interface card is configured to identify, by applying a first hash function to a first flow key extracted from a first data packet, a first flow of packets of a plurality of flows of packets to which the first data packet belongs. The second network interface card is configured to identify, by applying the first hash function to a second flow key extracted from a second data packet, a second flow of packets of the plurality of flows of packets to which the second data packet belongs. The apparatus also includes a memory, wherein a first partition of the memory is assigned to the first flow of packets and a second partition of the memory is assigned to the second flow of packets. The apparatus also includes a plurality of processors configured to execute a plurality of threads including a first thread and a second thread, wherein the first thread is programmed to retrieve data packets from the first partition and the second thread is programmed to retrieve data packets from the second partition.
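A minimal sketch of this partition-per-thread arrangement, using Python queues as stand-ins for the memory partitions (the payload strings, partition count, and hash values below are illustrative, not part of the disclosed apparatus):

```python
import queue
import threading

NUM_PARTITIONS = 4  # illustrative; the real partition count is configurable
# One queue per memory partition; each worker thread drains exactly one
# partition, so all packets of a flow are handled by one thread, in order.
partitions = [queue.Queue() for _ in range(NUM_PARTITIONS)]
results = [[] for _ in range(NUM_PARTITIONS)]

def worker(pid: int):
    while True:
        pkt = partitions[pid].get()
        if pkt is None:                  # sentinel: no more packets
            return
        results[pid].append(pkt)         # stand-in for downstream processing

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_PARTITIONS)]
for t in threads:
    t.start()

# Store packets to partitions by hash output value, as the NICs would:
for flow_hash, payload in [(0, "a1"), (1, "b1"), (0, "a2"), (5, "b2")]:
    partitions[flow_hash % NUM_PARTITIONS].put(payload)
for q in partitions:
    q.put(None)
for t in threads:
    t.join()

print(results[0], results[1])  # → ['a1', 'a2'] ['b1', 'b2']
```

Because the two packets with flow hash 0 land in the same queue, the first thread sees them in arrival order, while the second thread independently processes the other flow.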
- The teachings of the present disclosure can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:
- FIG. 1 illustrates an example network related to the present disclosure;
- FIG. 2 is a block diagram illustrating one example of the memory of FIG. 1 in more detail;
- FIG. 3 illustrates a flowchart of an example method for organizing terabit-scale packet volumes into flows for downstream processing stages; and
- FIG. 4 depicts a high-level block diagram of a computing device specifically programmed to perform the functions described herein.
- To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.
- In one example, the present disclosure organizes terabit-scale packet volumes into flows for downstream processing stages. As discussed above, data mining has become a valuable tool for helping network service providers to analyze and understand their customers' service-related needs. Network traffic can be analyzed for patterns, relationships, and other statistics that indicate how the customers are using the network. However, as traffic volumes increase (e.g., to the terabit scale), and real-time analysis applications are moved to the cloud, these applications must adapt to the highly distributed environment and the increasing volume of traffic. Parallelization (e.g., processing of multiple data items simultaneously, or in parallel) can greatly speed the processing of large volumes of data. However, when working with terabit-rate packet volumes, parallelization at the packet level is infeasible. For instance, the number of incoming packets could vastly overwhelm the number of threads available to process the packets.
- Examples of the present disclosure provide a way of organizing terabit-rate packet volumes into flows for downstream processing stages that may be performed in parallel. Although parallelization at the packet level has been shown to be infeasible at terabit rates, by efficiently organizing the packets into packet flows, examples of the present disclosure are able to achieve terabit-rate parallelization at the flow-level. In one example, packet traffic traversing the network is replicated, and the replicated or “mirrored” versions of the original packets (hereinafter referred to simply as “packets”) are subsequently organized into flows, which are in turn uniquely assigned to respective processing threads of a host computing system (e.g., an application server). In some examples, the header of a packet is scanned by an intelligent (i.e., programmable) network interface card (NIC) for a flow key, which is input into a hash function. The result of the hash function operating on the flow key is a value that corresponds to a thread identifier, where the thread identified by the thread identifier is assigned to process the flow of packets to which the packet belongs. The packet is then stored by the NIC in a partition in memory that is accessible by the corresponding thread.
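The flow-key-to-thread mapping described above can be sketched as follows. The disclosure does not specify a particular hash function, so SHA-256 and the dict-based header are illustrative stand-ins, chosen only to show that a deterministic hash gives every packet of a flow the same thread identifier on any NIC:

```python
import hashlib

NUM_PARTITIONS = 8  # illustrative: one partition (and thread) per output value

def flow_key(header: dict) -> tuple:
    # The 5-tuple flow key described above; the dict "header" is an
    # illustrative stand-in for real packet parsing on the NIC.
    return (header["src_ip"], header["dst_ip"],
            header["src_port"], header["dst_port"], header["tos"])

def thread_id(key: tuple) -> int:
    # Any deterministic hash yields a stable flow-to-thread assignment;
    # SHA-256 here is only a stand-in for the NIC's hash function.
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2",
       "src_port": 51000, "dst_port": 443, "tos": 0}
# Every packet of the same flow maps to the same thread, on any NIC:
assert thread_id(flow_key(pkt)) == thread_id(flow_key(dict(pkt)))
```

Since the mapping depends only on the flow key, two NICs running the same function never disagree about which thread owns a flow, which is what removes the need for per-packet synchronization.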
- To better understand the present disclosure, FIG. 1 illustrates an example network 100 related to the present disclosure. The network 100 may be any type of communications network, such as, for example, a traditional circuit switched (CS) network (e.g., a public switched telephone network (PSTN)) or an Internet Protocol (IP) network (e.g., an IP Multimedia Subsystem (IMS) network, an asynchronous transfer mode (ATM) network, a wireless network, a cellular network (e.g., 2G, 3G, and the like), a long term evolution (LTE) network, and the like). It should be noted that an IP network is broadly defined as a network that uses Internet Protocol to exchange data packets. Additional exemplary IP networks include Voice over IP (VoIP) networks, Service over IP (SoIP) networks, and the like.
- In one embodiment, the network 100 may comprise a core network 102. In one example, core network 102 may combine core network components of a cellular network with components of a triple play service network, where triple play services include telephone services, Internet services, and television services to subscribers. For example, core network 102 may functionally comprise a fixed mobile convergence (FMC) network, e.g., an IP Multimedia Subsystem (IMS) network. In addition, core network 102 may functionally comprise a telephony network, e.g., an Internet Protocol/Multi-Protocol Label Switching (IP/MPLS) backbone network utilizing Session Initiation Protocol (SIP) for circuit-switched and Voice over Internet Protocol (VoIP) telephony services. Core network 102 may also further comprise an Internet Service Provider (ISP) network. In one embodiment, the core network 102 may include a traffic analysis point (TAP) 104, a multiplexer 106, and an application server 126. Although only a single TAP 104, a single multiplexer 106, and a single application server 126 are illustrated, it should be noted that any number of TAPs, multiplexers, and application servers may be deployed. Furthermore, for ease of illustration, various additional elements of core network 102 are omitted from FIG. 1, including switches, routers, firewalls, web servers, and the like.
- The core network 102 may be in communication with one or more wireless access networks 120 and 122. Either or both of the access networks 120 and 122 may include a radio access network implementing such technologies as: global system for mobile communication (GSM), e.g., a base station subsystem (BSS), or IS-95, a universal mobile telecommunications system (UMTS) network employing wideband code division multiple access (WCDMA), or a CDMA2000 network, among others. In other words, either or both of the access networks 120 and 122 may comprise an access network in accordance with any “second generation” (2G), “third generation” (3G), “fourth generation” (4G), Long Term Evolution (LTE), or any other yet to be developed future wireless/cellular network technology, including “fifth generation” (5G) and further generations. The operator of the core network 102 may provide a data service to subscribers via the access networks 120 and 122. In one embodiment, the access networks 120 and 122 may all be different types of access networks, may all be the same type of access network, or some access networks may be the same type of access network and others may be different types of access networks. The core network 102 and the access networks 120 and 122 may be operated by different service providers, the same service provider, or a combination thereof.
- In one example, the access network 120 may be in communication with one or more user endpoint devices (also referred to as “endpoint devices” or “UE”) 108 and 110, while the access network 122 may be in communication with one or more user endpoint devices 112 and 114. Access networks 120 and 122 may transmit and receive communications between respective UEs 108, 110, 112, and 114 and core network 102 relating to communications with web servers, TAP 104, and/or other servers via the Internet and/or other networks, and so forth.
- In one embodiment, the UEs 108, 110, 112, and 114 may be any type of subscriber/customer endpoint device configured for wireless communication, such as a laptop computer, a Wi-Fi device, a Personal Digital Assistant (PDA), a mobile phone, a smartphone, an email device, a computing tablet, a messaging device, a wearable “smart” device (e.g., a smart watch or fitness tracker), a portable media device (e.g., an MP3 player), a gaming console, a portable gaming device, a set top box, a smart television, and the like. In one example, any one or more of the user endpoint devices 108, 110, 112, and 114 may have both cellular and non-cellular access capabilities and may further have wired communication and networking capabilities (e.g., such as a desktop computer). It should be noted that although only four user endpoint devices are illustrated in FIG. 1, any number of user endpoint devices may be deployed.
- In one embodiment, the TAP 104 is configured to mirror or replicate data packets traversing the core network 102 and to send the replicated data packets (hereinafter referred to as “packets” or “data packets”) to the multiplexer 106. In one example, the TAP 104 is an optical TAP that mirrors the data packets in a manner that is transparent to the UEs 108, 110, 112, and 114 (i.e., without noticeably disrupting the network activity).
- The multiplexer 106 executes a load balancing algorithm in order to distribute the data packets among n intelligent network interface cards 116 1-116 n (hereinafter collectively referred to as “NICs 116”) of the application server 126. For instance, the data packets may be distributed to the NICs 116 in a round robin fashion, a weighted round robin fashion, a random fashion, or according to any other load balancing algorithm.
- Each of the NICs 116 scans the header of each data packet that it receives and extracts a flow key. Data packets belonging to the same flow of packets will contain the same flow key. For instance, all data packets belonging to a first flow of packets will contain a first flow key, while all data packets belonging to a second flow of packets will contain a second flow key that is different from the first flow key. In one embodiment, the flow key is a 5-tuple defining the Transmission Control Protocol/Internet Protocol (TCP/IP) connection via which the data packet travels. In one example, the 5-tuple includes: the source IP address, the destination IP address, the source port number (e.g., Transmission Control Protocol/User Datagram Protocol or TCP/UDP port number), the destination port number (e.g., TCP/UDP port number), and the type of service (ToS). The NIC 116 then inputs the flow key into a hash function. In one example, each NIC 116 may comprise a processor (e.g., a central processing unit) or a field programmable gate array (FPGA) to run the hash function.
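Extraction of such a 5-tuple can be sketched as below. This is a simplified illustration that assumes an IPv4 header with no options followed directly by a TCP/UDP header; a real NIC parser would also handle VLAN tags, IP options, IPv6, and so on:

```python
import struct

def extract_flow_key(packet: bytes) -> tuple:
    """Return the 5-tuple flow key described above, assuming an IPv4
    header with no options followed directly by a TCP/UDP header."""
    tos = packet[1]                                   # type-of-service byte
    src_ip = ".".join(str(b) for b in packet[12:16])
    dst_ip = ".".join(str(b) for b in packet[16:20])
    # Source and destination ports are the first two 16-bit fields of the
    # TCP/UDP header, which starts after the 20-byte IPv4 header.
    src_port, dst_port = struct.unpack("!HH", packet[20:24])
    return (src_ip, dst_ip, src_port, dst_port, tos)

# A fabricated 24-byte IPv4/TCP header prefix for illustration:
hdr = struct.pack("!BBHHHBBH4s4sHH",
                  0x45, 0x10, 40, 0, 0, 64, 6, 0,     # version/IHL ... checksum
                  bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
                  443, 51000)                          # src/dst ports
print(extract_flow_key(hdr))  # → ('10.0.0.1', '10.0.0.2', 443, 51000, 16)
```

All packets of one TCP connection carry the same five header fields, so this function returns an identical key for every packet of the flow, which is the property the hash-based assignment relies on.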
- In one example, each of the NICs 116 uses the same hash function to ensure uniform assignment of packet flows to processing threads. The hash function may be deterministic, such that the assignment of a packet to a packet flow, and of a packet flow to a processing thread, is predictable (e.g., not random). For instance, as discussed in greater detail below, the output value of the hash function comprises a thread identifier that corresponds to a specific processing thread executing on one of the processors 124 of the application server 126. Moreover, because data packets belonging to the same flow of packets share the same flow key, and because the same hash function is used by all NICs 116, the output value of the hash function will be the same for all data packets belonging to the same flow of packets, regardless of which NICs 116 receive the data packets. The NICs 116 may tag the data packets with the output value of the hash function before storing the data packets in the memory 118 of the application server 126.
- As discussed above, the application server 126 further comprises a memory 118. FIG. 2 is a block diagram illustrating one example of the memory 118 of FIG. 1 in more detail. As illustrated, the memory 118 is divided into a plurality of partitions 200 1-200 m (hereinafter collectively referred to as “partitions 200”). The partitions 200 may occupy contiguous blocks of the memory 118. The number and sizes of the partitions 200 are configurable, and may be reconfigured on-the-fly to accommodate packet flows of varying sizes and changing network conditions. For instance, the number of partitions may be increased when service times decrease and/or when the number of threads executing on the processors 124 increases. The maximum number of partitions 200 may be empirically determined, and in one embodiment the maximum number of partitions 200 does not exceed a value that would cause an imbalance (i.e., a disproportionate share of packet flows being assigned to one partition 200, where “disproportionate” may be defined as some configurable percentage of packet flows beyond the mean or median number of packet flows assigned to all of the partitions 200) across the partitions 200. In one example, packet distribution across the partitions 200 is uniformly random, but may exhibit periods of intense imbalance. In one example, where the number of partitions 200 is x, the packet flow-to-partition assignment is flow modulo x.
- In one example, each of the partitions 200 is assigned to one flow of packets. Thus, the NICs 116 select the appropriate partitions 200 to which to store the data packets based on the output values of the hash function. In other words, the output value of the hash function for a particular data packet will determine the partition 200 to which the data packet should be stored. Data packets stored in the partitions 200 may be queued up in a work queue 202 from which threads executing on the processors 124 of the application server 126 retrieve the data packets for processing. Queuing of the data packets may be based on a round robin service model, a pseudo-random service model, or any other service model.
- As discussed above, the application server 126 further comprises a plurality of processors 124. Each of the processors 124 further supports a plurality of threads, where each thread of the plurality of threads is assigned to process data packets from a unique flow of packets. As discussed above, each thread is further assigned to one of the partitions 200 in the memory 118 of the application server 126. For instance, a first thread may retrieve data packets from a first partition, while a second thread may retrieve data packets from a second partition. As such, the processors 124 may support parallel processing of a plurality of packet flows, where the individual packets of the packet flows are traversing the network 100 at terabit rates. In one example, one or more of the processors 124 may also host a set of instructions for running the hash function into which the flow keys are input (e.g., as an alternative to the NICs 116 running the hash function).
- Thus, when a NIC 116 inputs a flow key from a data packet into the hash function, the output value of the hash function will dictate to which partition 200 in memory 118 the data packet is stored. The partition 200, in turn, will dictate which thread executing on the processors 124 accesses the data packet for further processing. Because the flow key does not change for the life of the flow of packets, and because the same hash function is used by all of the NICs 116, the assignment of a flow of packets to a processing thread persists, without the need for blocking or synchronization. Data packets can thus be efficiently organized into flows of packets, and flows of packets can be uniquely assigned to processing threads. By leveraging the natural organization of data packets in an IP network (i.e., the packet flows) along with the hash function (which minimizes per-packet synchronization costs), examples of the present disclosure are thus able to achieve efficient parallelization in a network where packet volumes approach terabit rates.
- It should be noted that although the appropriate partition 200 in memory 118 and the appropriate thread in the processors 124 for a given data packet will be dictated by the same information (i.e., the output value of the hash function), the correspondence between the number of partitions 200 and the number of threads is not necessarily one-to-one. In general, the greater the ratio of partitions 200 to threads, the less likely it will be that two or more threads will collide on (i.e., attempt to concurrently access) the same partition 200. When parallelization is achieved at partition-level granularity as disclosed, collisions are more likely to occur during periods of cross-partition imbalance. An increase in sustained imbalance periods (i.e., durations of time during which imbalances are present) may also cause an increase in the number of partitions 200.
- In one example, increasing the number of partitions 200 in the memory 118 may minimize thread collisions. Collisions can be further minimized by ensuring that a partition 200 is not assigned to a new thread until the currently assigned thread has finished operating on its flow of packets. In one example, this is enforced by imposing a “drain period” before increasing the number of partitions from a first number to a second number and redistributing the flows of packets to the second number of partitions. During the drain period, the threads complete processing on the data packets that they have already retrieved from the first number of partitions. Once the last thread finishes processing its data packets, the drain period ends, the second number of partitions is instantiated, and the flows of packets are redistributed to the second number of partitions. Redistribution of the flows of data packets may result in a flow of packets being processed by a new thread; however, by imposing the drain period, the chances of the new thread processing the flow of packets at the same time as the old thread are minimized. Flow-level locking may be imposed to minimize the duration of the drain period. In this case, the flow-level locking takes advantage of a dynamic inherently present in very large networks, where the probability of consecutive data packets belonging to the same flow of data packets at a single observation point (e.g., the TAP 104) is very small.
- Any one or more of the TAP 104, multiplexer 106, application server 126, or NICs 116 may comprise or be configured as a general purpose computer as illustrated in FIG. 4 and discussed below. It should also be noted that, as used herein, the terms “configure” and “reconfigure” may refer to programming or loading a computing device with computer-readable/computer-executable instructions, code, and/or programs, e.g., in a memory, which when executed by a processor of the computing device, may cause the computing device to perform various functions. Such terms may also encompass providing variables, data values, tables, objects, or other data structures or the like which may cause a computing device executing computer-readable instructions, code, and/or programs to function differently depending upon the values of the variables or other data structures that are provided.
- Those skilled in the art will realize that the network 100 has been simplified. For example, the network 100 may include other network elements (not shown) such as border elements, routers, switches, policy servers, security devices, a content distribution network (CDN), and the like. The network 100 may also be expanded by including additional endpoint devices, access networks, network elements, application servers, etc., without altering the scope of the present disclosure.
- To further aid in understanding the present disclosure, FIG. 3 illustrates a flowchart of an example method 300 for organizing terabit-scale packet volumes into flows for downstream processing stages. In one example, the method 300 may be performed by an intelligent NIC, e.g., one of the NICs 116 illustrated in FIG. 1. However, in other examples, the method 300 may be performed by another device. As such, any references in the discussion of the method 300 to the NICs 116 of FIG. 1 (or any other elements of FIG. 1) are not intended to limit the means by which the method 300 may be performed.
- The method 300 begins in step 302. In step 304, the NIC 116 receives a data packet from the multiplexer 106. In one example, the data packet is a replica of a data packet that was exchanged between two endpoints in the network 100 (e.g., between two of the UEs 108, 110, 112, and 114). As discussed above, the data packet may have been directed to the NIC 116 in accordance with any load balancing algorithm.
- In step 306, the NIC 116 extracts a flow key from the data packet. In one example, the flow key is extracted from the data packet's header and comprises a 5-tuple of source IP address, destination IP address, source port number, destination port number, and ToS.
- In step 308, the NIC 116 inputs the flow key into a hash function. The hash function produces an output value based on the input flow key.
- In step 310, the NIC selects a partition 200 in memory 118 to which to store the data packet, based on the output value of the hash function. As discussed above, in one example, the output value of the hash function comprises a thread identifier that dictates both: (1) the corresponding thread executing on the processors 124 that will process the flow of packets to which the data packet belongs; and (2) the partition 200 in memory 118 to which to store the data packets of the flow of packets for retrieval by the thread.
- In step 312, the NIC stores the data packet to the partition 200 in memory 118 that was selected in step 310. The method 300 ends in step 314.
- Although not expressly specified above, one or more steps of the method 300 may include a storing, displaying, and/or outputting step as required for a particular application. In other words, any data, records, fields, and/or intermediate results discussed in the method can be stored, displayed, and/or outputted to another device as required for a particular application. Furthermore, operations, steps, or blocks in FIG. 3 that recite a determining operation or involve a decision do not necessarily require that both branches of the determining operation be practiced. In other words, one of the branches of the determining operation can be deemed an optional step. Furthermore, operations, steps, or blocks of the above described method(s) can be combined, separated, and/or performed in a different order from that described above, without departing from the examples of the present disclosure.
- FIG. 4 depicts a high-level block diagram of a computing device specifically programmed to perform the functions described herein. For example, any one or more components or devices illustrated in FIG. 1 or described in connection with the method 300 may be implemented as the system 400. For instance, any one of the NICs 116 of FIG. 1 (such as might be used to perform the method 300) could be implemented as illustrated in FIG. 4. Alternatively, the application server 126 as a whole could be implemented as illustrated in FIG. 4.
- As depicted in FIG. 4, the system 400 comprises a hardware processor element 402, a memory 404, a module 405 for organizing terabit-scale packet volumes into flows, and various input/output (I/O) devices 406.
- The hardware processor 402 may comprise, for example, a microprocessor, a central processing unit (CPU), or the like. The memory 404 may comprise, for example, random access memory (RAM), read only memory (ROM), a disk drive, an optical drive, a magnetic drive, and/or a Universal Serial Bus (USB) drive. The module 405 for organizing terabit-scale packet volumes into flows may include circuitry and/or logic for performing special purpose functions relating to data mining, including a code component 408 for executing the hash function described above (where each NIC that is configured as illustrated in FIG. 4 includes the same code component 408 executing the same hash function). The input/output devices 406 may include, for example, storage devices (including but not limited to, a tape drive, a floppy drive, a hard disk drive or a compact disk drive), a receiver, a transmitter, a fiber optic communications line, an output port, or a user input device (such as a keyboard, a keypad, a mouse, and the like).
- Although only one processor element is shown, it should be noted that the general-purpose computer may employ a plurality of processor elements. Furthermore, although only one general-purpose computer is shown in the Figure, if the method(s) as discussed above is implemented in a distributed or parallel manner for a particular illustrative example, i.e., the steps of the above method(s) or the entire method(s) are implemented across multiple or parallel general-purpose computers, then the general-purpose computer of this Figure is intended to represent each of those multiple general-purpose computers. Furthermore, one or more hardware processors can be utilized in supporting a virtualized or shared computing environment. The virtualized computing environment may support one or more virtual machines representing computers, servers, or other computing devices.
In such virtual machines, hardware components such as hardware processors and computer-readable storage devices may be virtualized or logically represented.
- It should be noted that the present disclosure can be implemented in software and/or in a combination of software and hardware, e.g., using application specific integrated circuits (ASIC), a programmable logic array (PLA), including a field-programmable gate array (FPGA), or a state machine deployed on a hardware device, a general purpose computer, or any other hardware equivalents, e.g., computer readable instructions pertaining to the method(s) discussed above can be used to configure a hardware processor to perform the steps, functions and/or operations of the above disclosed method(s). In one example, instructions and data for the present module or process 405 for organizing terabit-scale packet volumes into flows (e.g., a software program comprising computer-executable instructions) can be loaded into memory 404 and executed by hardware processor element 402 to implement the steps, functions, or operations as discussed above in connection with the example method 300. Furthermore, when a hardware processor executes instructions to perform “operations,” this could include the hardware processor performing the operations directly and/or facilitating, directing, or cooperating with another hardware device or component (e.g., a co-processor and the like) to perform the operations.
- The processor executing the computer readable or software instructions relating to the above described method(s) can be perceived as a programmed processor or a specialized processor. As such, the present module 405 for organizing terabit-scale packet volumes into flows (including associated data structures) of the present disclosure can be stored on a tangible or physical (broadly non-transitory) computer-readable storage device or medium, e.g., volatile memory, non-volatile memory, ROM memory, RAM memory, magnetic or optical drive, device or diskette, and the like. More specifically, the computer-readable storage device may comprise any physical device that provides the ability to store information, such as data and/or instructions, to be accessed by a processor or a computing device such as a computer or an application server.
- While various examples have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of a preferred example should not be limited by any of the above-described examples, but should be defined only in accordance with the following claims and their equivalents.
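The “drain period” described earlier — no new partition layout is instantiated until every thread has finished the packets it already retrieved — can be sketched with a condition variable. The class and method names below are hypothetical, chosen only to illustrate the waiting step:

```python
import threading
import time

class PartitionTable:
    """Hypothetical sketch: grow the partition count only after a drain
    period in which all in-flight packet processing completes."""
    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        self.in_flight = 0          # packets retrieved but not yet processed
        self.cond = threading.Condition()

    def begin_work(self):
        with self.cond:
            self.in_flight += 1

    def end_work(self):
        with self.cond:
            self.in_flight -= 1
            self.cond.notify_all()

    def repartition(self, new_count: int):
        with self.cond:
            # Drain period: block until every thread has finished the
            # packets it already retrieved from the old partitions.
            self.cond.wait_for(lambda: self.in_flight == 0)
            self.num_partitions = new_count

table = PartitionTable(4)
table.begin_work()                  # a worker thread is mid-processing
threading.Thread(
    target=lambda: (time.sleep(0.05), table.end_work())).start()
table.repartition(8)                # blocks until the drain completes
print(table.num_partitions)  # → 8
```

The repartitioning call cannot complete while any packet is in flight, so a flow is never handled by its old and new thread at the same time, which is the guarantee the drain period provides.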
Claims (19)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/566,633 US20220124185A1 (en) | 2017-05-18 | 2021-12-30 | Terabit-scale network packet processing via flow-level parallelization |
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US15/598,673 US10681189B2 (en) | 2017-05-18 | 2017-05-18 | Terabit-scale network packet processing via flow-level parallelization |
| US16/896,161 US11240354B2 (en) | 2017-05-18 | 2020-06-08 | Terabit-scale network packet processing via flow-level parallelization |
| US17/566,633 US20220124185A1 (en) | 2017-05-18 | 2021-12-30 | Terabit-scale network packet processing via flow-level parallelization |
Related Parent Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/896,161 Continuation US11240354B2 (en) | 2017-05-18 | 2020-06-08 | Terabit-scale network packet processing via flow-level parallelization |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20220124185A1 true US20220124185A1 (en) | 2022-04-21 |
Family
ID=64271656
Family Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/598,673 Expired - Fee Related US10681189B2 (en) | 2017-05-18 | 2017-05-18 | Terabit-scale network packet processing via flow-level parallelization |
| US16/896,161 Active US11240354B2 (en) | 2017-05-18 | 2020-06-08 | Terabit-scale network packet processing via flow-level parallelization |
| US17/566,633 Abandoned US20220124185A1 (en) | 2017-05-18 | 2021-12-30 | Terabit-scale network packet processing via flow-level parallelization |
Family Applications Before (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/598,673 Expired - Fee Related US10681189B2 (en) | 2017-05-18 | 2017-05-18 | Terabit-scale network packet processing via flow-level parallelization |
| US16/896,161 Active US11240354B2 (en) | 2017-05-18 | 2020-06-08 | Terabit-scale network packet processing via flow-level parallelization |
Country Status (1)
| Country | Link |
|---|---|
| US (3) | US10681189B2 (en) |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10686872B2 (en) * | 2017-12-19 | 2020-06-16 | Xilinx, Inc. | Network interface device |
| US10652162B2 (en) * | 2018-06-30 | 2020-05-12 | Intel Corporation | Scalable packet processing |
| US10798609B2 (en) * | 2018-10-16 | 2020-10-06 | Oracle International Corporation | Methods, systems, and computer readable media for lock-free communications processing at a network node |
| CN111817979A (en) * | 2020-06-23 | 2020-10-23 | 成都深思科技有限公司 | Multi-dimensional flow association data packet processing method based on sniffing mode |
| US20240078185A1 (en) * | 2022-09-07 | 2024-03-07 | Mellanox Technologies, Ltd. | Using parallel processor(s) to process packets in real-time |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020161919A1 (en) * | 1997-10-14 | 2002-10-31 | Boucher Laurence B. | Fast-path processing for receiving data on TCP connection offload devices |
| US6904043B1 (en) * | 1999-05-21 | 2005-06-07 | Advanced Micro Devices, Inc. | Apparatus and methods for storing and processing header information in a network switch |
| US6990102B1 (en) * | 2001-05-10 | 2006-01-24 | Advanced Micro Devices, Inc. | Parallel lookup tables for locating information in a packet switched network |
| US7274706B1 (en) * | 2001-04-24 | 2007-09-25 | Syrus Ziai | Methods and systems for processing network data |
| US20070253430A1 (en) * | 2002-04-23 | 2007-11-01 | Minami John S | Gigabit Ethernet Adapter |
| US7865624B1 (en) * | 2005-04-04 | 2011-01-04 | Oracle America, Inc. | Lookup mechanism based on link layer semantics |
| US20140029617A1 (en) * | 2012-07-27 | 2014-01-30 | Ren Wang | Packet processing approach to improve performance and energy efficiency for software routers |
| US10243857B1 (en) * | 2016-09-09 | 2019-03-26 | Marvell Israel (M.I.S.L) Ltd. | Method and apparatus for multipath group updates |
| US10956346B1 (en) * | 2017-01-13 | 2021-03-23 | Lightbits Labs Ltd. | Storage system having an in-line hardware accelerator |
Family Cites Families (34)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6338078B1 (en) | 1998-12-17 | 2002-01-08 | International Business Machines Corporation | System and method for sequencing packets for multiprocessor parallelization in a computer network system |
| US6483804B1 (en) * | 1999-03-01 | 2002-11-19 | Sun Microsystems, Inc. | Method and apparatus for dynamic packet batching with a high performance network interface |
| US6631422B1 (en) | 1999-08-26 | 2003-10-07 | International Business Machines Corporation | Network adapter utilizing a hashing function for distributing packets to multiple processors for parallel processing |
| US6870849B1 (en) | 2000-07-06 | 2005-03-22 | Ross W. Callon | Apparatus and method for efficient hashing in networks |
| US6754662B1 (en) * | 2000-08-01 | 2004-06-22 | Nortel Networks Limited | Method and apparatus for fast and consistent packet classification via efficient hash-caching |
| US7206861B1 (en) | 2002-07-29 | 2007-04-17 | Juniper Networks, Inc. | Network traffic distribution across parallel paths |
| US7483430B1 (en) | 2003-02-28 | 2009-01-27 | Cisco Technology, Inc. | Hierarchical hash method for performing forward route lookup |
| CA2577891A1 (en) | 2004-08-24 | 2006-03-02 | Washington University | Methods and systems for content detection in a reconfigurable hardware |
| US7443878B2 (en) | 2005-04-04 | 2008-10-28 | Sun Microsystems, Inc. | System for scaling by parallelizing network workload |
| ATE453149T1 (en) | 2005-05-04 | 2010-01-15 | Telecom Italia Spa | METHOD AND SYSTEM FOR PROCESSING PACKET FLOWS AND COMPUTER PROGRAM PRODUCT THEREFOR |
| US20080101233A1 (en) | 2006-10-25 | 2008-05-01 | The Governors Of The University Of Alberta | Method and apparatus for load balancing internet traffic |
| US7813342B2 (en) * | 2007-03-26 | 2010-10-12 | Gadelrab Serag | Method and apparatus for writing network packets into computer memory |
| US8131841B2 (en) | 2007-07-27 | 2012-03-06 | Hewlett-Packard Development Company, L.P. | Method and apparatus for detecting predefined signatures in packet payload |
| US7836195B2 (en) * | 2008-02-27 | 2010-11-16 | Intel Corporation | Preserving packet order when migrating network flows between cores |
| US8259585B1 (en) | 2009-04-17 | 2012-09-04 | Juniper Networks, Inc. | Dynamic link load balancing |
| US8990431B2 (en) | 2009-05-05 | 2015-03-24 | Citrix Systems, Inc. | Systems and methods for identifying a processor from a plurality of processors to provide symmetrical request and response processing |
| US8788570B2 (en) | 2009-06-22 | 2014-07-22 | Citrix Systems, Inc. | Systems and methods for retaining source IP in a load balancing multi-core environment |
| US8018961B2 (en) | 2009-06-22 | 2011-09-13 | Citrix Systems, Inc. | Systems and methods for receive and transmission queue processing in a multi-core architecture |
| US8503456B2 (en) | 2009-07-14 | 2013-08-06 | Broadcom Corporation | Flow based path selection randomization |
| US20130343377A1 (en) | 2012-06-21 | 2013-12-26 | Jonathan Stroud | Hash-based packet distribution in a computer system |
| US9047417B2 (en) * | 2012-10-29 | 2015-06-02 | Intel Corporation | NUMA aware network interface |
| US9172756B2 (en) | 2013-03-12 | 2015-10-27 | Cisco Technology, Inc. | Optimizing application performance in a network environment |
| US20140282551A1 (en) * | 2013-03-13 | 2014-09-18 | Emulex Design & Manufacturing Corporation | Network virtualization via i/o interface |
| US9860332B2 (en) * | 2013-05-08 | 2018-01-02 | Samsung Electronics Co., Ltd. | Caching architecture for packet-form in-memory object caching |
| US9838291B2 (en) * | 2013-08-02 | 2017-12-05 | Cellos Software Ltd | Multicore processing of bidirectional traffic flows |
| US20150078375A1 (en) * | 2013-09-13 | 2015-03-19 | Broadcom Corporation | Mutable Hash for Network Hash Polarization |
| US9397946B1 (en) * | 2013-11-05 | 2016-07-19 | Cisco Technology, Inc. | Forwarding to clusters of service nodes |
| US10230824B2 (en) * | 2014-11-17 | 2019-03-12 | Keysight Technologies Singapore (Holdings) Pte. Lte. | Packet classification using memory pointer information |
| US9794263B2 (en) | 2014-12-27 | 2017-10-17 | Intel Corporation | Technologies for access control |
| US9853903B1 (en) * | 2015-04-23 | 2017-12-26 | Cisco Technology, Inc. | Simultaneous redirecting and load balancing |
| US9996498B2 (en) * | 2015-09-08 | 2018-06-12 | Mellanox Technologies, Ltd. | Network memory |
| US9948559B2 (en) * | 2015-10-31 | 2018-04-17 | Nicira, Inc. | Software receive side scaling for overlay flow re-dispatching |
| US10911579B1 (en) * | 2016-03-01 | 2021-02-02 | Amazon Technologies, Inc. | Generating programmatically defined fields of metadata for network packets |
| US20170318082A1 (en) * | 2016-04-29 | 2017-11-02 | Qualcomm Incorporated | Method and system for providing efficient receive network traffic distribution that balances the load in multi-core processor systems |
- 2017-05-18 US US15/598,673 patent/US10681189B2/en not_active Expired - Fee Related
- 2020-06-08 US US16/896,161 patent/US11240354B2/en active Active
- 2021-12-30 US US17/566,633 patent/US20220124185A1/en not_active Abandoned
Patent Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020161919A1 (en) * | 1997-10-14 | 2002-10-31 | Boucher Laurence B. | Fast-path processing for receiving data on TCP connection offload devices |
| US6904043B1 (en) * | 1999-05-21 | 2005-06-07 | Advanced Micro Devices, Inc. | Apparatus and methods for storing and processing header information in a network switch |
| US7274706B1 (en) * | 2001-04-24 | 2007-09-25 | Syrus Ziai | Methods and systems for processing network data |
| US6990102B1 (en) * | 2001-05-10 | 2006-01-24 | Advanced Micro Devices, Inc. | Parallel lookup tables for locating information in a packet switched network |
| US20070253430A1 (en) * | 2002-04-23 | 2007-11-01 | Minami John S | Gigabit Ethernet Adapter |
| US7865624B1 (en) * | 2005-04-04 | 2011-01-04 | Oracle America, Inc. | Lookup mechanism based on link layer semantics |
| US20140029617A1 (en) * | 2012-07-27 | 2014-01-30 | Ren Wang | Packet processing approach to improve performance and energy efficiency for software routers |
| US10243857B1 (en) * | 2016-09-09 | 2019-03-26 | Marvell Israel (M.I.S.L) Ltd. | Method and apparatus for multipath group updates |
| US10956346B1 (en) * | 2017-01-13 | 2021-03-23 | Lightbits Labs Ltd. | Storage system having an in-line hardware accelerator |
Also Published As
| Publication number | Publication date |
|---|---|
| US20200304609A1 (en) | 2020-09-24 |
| US11240354B2 (en) | 2022-02-01 |
| US20180336071A1 (en) | 2018-11-22 |
| US10681189B2 (en) | 2020-06-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11240354B2 (en) | | Terabit-scale network packet processing via flow-level parallelization |
| US11411935B2 (en) | | Extracting data from encrypted packet flows |
| US11005815B2 (en) | | Priority allocation for distributed service rules |
| US11570108B2 (en) | | Distribution of network traffic to software defined network based probes |
| EP3111603B1 (en) | | Method and network device for handling packets in a network by means of forwarding tables |
| US10348650B2 (en) | | Augmentation of pattern matching with divergence histograms |
| US20200162422A1 (en) | | Separating cgn forwarding and control |
| CN112583734A (en) | | Burst flow control method and device, electronic equipment and storage medium |
| US11726829B2 (en) | | Adaptive, performance-oriented, and compression-assisted encryption scheme |
| Nallusamy et al. | | Decision Tree-Based Entries Reduction scheme using multi-match attributes to prevent flow table overflow in SDN environment |
| US8467298B2 (en) | | Applying a table-lookup approach to load spreading in forwarding data in a network |
| US20250358643A1 (en) | | Optimization of long term evolution/fifth generation service through conformity-based recommendations |
| US12519809B2 (en) | | Scalable assessment and prioritization of network asset cryptography for quantum risk mitigation |
| US20210258232A1 (en) | | Varying data flow aggregation period relative to data value |
| Bienkowski et al. | | Online aggregation of the forwarding information base: accounting for locality and churn |
| EP2930883A1 (en) | | Method for the implementation of network functions virtualization of a telecommunications network providing communication services to subscribers, telecommunications network, program and computer program product |
| US20160142342A1 (en) | | Apparatus and method for fast search table update in a network switch |
| Sharmin et al. | | Quality of experience-aware resource allocation for video content distribution to telco-cloud users |
| CN108377254B (en) | | Method and device for consistent flow assignment in load balancing |
| Ding et al. | | Sprinklers: A randomized variable-size striping approach to reordering-free load-balanced switching |
| US12363197B1 (en) | | Managing network services utilizing service groups |
| Leivadeas et al. | | Considerations for a successful network service chain deployment |
| Farias et al. | | VNF-Cache: An In-Network Key-Value Store Cache Based on Network Function Virtualization |
| CN114826823A (en) | | Virtual network segmentation method, device and system |
| Zhang et al. | | Horizontal partition for scalable control in software-defined data center networks |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | AS | Assignment | Owner name: AT&T INTELLECTUAL PROPERTY I, L.P., GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZAIFMAN, ARTHUR L.;MOCENIGO, JOHN M.;SIGNING DATES FROM 20170517 TO 20171027;REEL/FRAME:058829/0930 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |