CN115473861B - High-performance processing system and method based on communication and calculation separation and storage medium - Google Patents
- Publication number
- CN115473861B CN115473861B CN202210991611.1A CN202210991611A CN115473861B CN 115473861 B CN115473861 B CN 115473861B CN 202210991611 A CN202210991611 A CN 202210991611A CN 115473861 B CN115473861 B CN 115473861B
- Authority
- CN
- China
- Prior art keywords
- communication
- node
- tcp
- application
- application layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/10—Packet switching elements characterised by the switching fabric construction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/15—Interconnection of switching modules
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/30—Definitions, standards or architectural aspects of layered protocol stacks
- H04L69/32—Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
- H04L69/322—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
- H04L69/329—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses a high-performance processing system and method based on communication and calculation separation, and a storage medium. The system comprises a TOE node, a PCIe switching node and a CPU node. The TOE node is used for the system's external TCP or IP communication and for terminating TCP or IP data; the PCIe switching node is used for transferring the application layer data to a user buffer of the software via DMA; the CPU node is used for running application-based functional software to complete the application calculation functions of the system. The application improves the processing performance of the whole system, reduces the load on the CPU's processing capacity, and offers a practical engineering scheme for building high-performance processing systems.
Description
Technical Field
The application relates to the field of data processing, in particular to a high-performance processing system and method based on communication and calculation separation and a storage medium.
Background
Current common data processing systems are built as CPU + Ethernet switch + interface chips. The interface chip is responsible for the system's external communication and forms a unified external gateway for the system. The CPU node terminates the Ethernet interface, receives and transmits data through a software TCP/IP protocol stack, and performs application calculation on the corresponding data according to the application scenario. The Ethernet switch is responsible for interconnecting the nodes of the whole system. To achieve reliable end-to-end data transfer, the communication protocols are typically TCP/IP based.
This model is built on mature Ethernet switching and TCP/IP technology and can meet certain application requirements. However, as communication technology develops, communication bandwidth grows rapidly and data loads increase greatly, so the model's disadvantages become more and more obvious and it can no longer meet current high-performance processing requirements. Specifically:
(1) The processing of packet data by the kernel's software TCP/IP protocol stack, in particular the per-layer checksum calculations, the repeated interactions of the TCP protocol, error control, and retransmission on timeout, places heavy protocol processing pressure on the CPU. Under high-throughput, high-concurrency application scenarios in particular, protocol stack processing can consume a large share of the CPU's processing capacity. Moreover, because software processing delay is nondeterministic, the processing delay is further exacerbated as the load increases.
(2) Under the operating system kernel's processing model, the whole system suffers from a high interrupt rate, repeated copying of data between memory regions, frequent application context switches and similar problems, which further degrade the CPU's processing performance.
(3) Because the nodes generally communicate with one another over TCP/IP, every node must send and receive TCP/IP traffic, which increases the processing redundancy of the whole system.
Accordingly, the above-mentioned technical problems of the related art are to be solved.
Disclosure of Invention
The present application is directed to solving one of the technical problems in the related art. Therefore, the embodiments of the application provide a high-performance processing system, a high-performance processing method and a storage medium based on communication and calculation separation, which can improve the processing performance of the system and reduce the load on the CPU's processing capacity.
According to an aspect of an embodiment of the present application, there is provided a high performance processing system based on separation of communication and computation, the system including: TOE node, PCIe switching node, CPU node;
the TOE node is used for the system's external TCP or IP communication and for terminating TCP or IP data;
the PCIe switching node is used for transferring the application layer data to a user buffer of the software via DMA;
the CPU node is used for running application-based functional software to complete the application calculation functions of the system.
In one embodiment, the TOE node being configured for TCP or IP communication external to the system includes:
the TOE node parses the application layer data and sends it to a PCIe bus for internal communication;
the TOE node receives data sent over the PCIe bus, carries it over the TCP/IP protocol, and sends it to an external network.
In one embodiment, the TOE node transfers the application layer data to the corresponding memory space over a PCIe bus.
In one embodiment, the PCIe switching node transferring the application layer data to a user buffer of the software via DMA includes:
the PCIe switching node conveys the application layer data through the PCIe bus to the user buffer of the corresponding software.
In one embodiment, the processing tasks of the system include a calculation task, completed by the CPU node and the functional software, and a communication task, completed by hardware.
According to an aspect of an embodiment of the present application, there is provided a high performance processing method based on separation of communication and computation, the method including:
carrying out TCP or IP communication externally and terminating TCP or IP data;
transferring the application layer data to a user buffer of the software via DMA;
and running application-based functional software to complete the application calculation functions of the system.
In one embodiment, the carrying out of TCP or IP communication externally includes:
parsing application layer data and sending it to a PCIe bus for internal communication;
and receiving data sent over the PCIe bus, carrying it over the TCP/IP protocol, and sending it to an external network.
In one embodiment, the method further comprises:
and the application layer data are transported to the corresponding memory space through the PCIe bus.
In one embodiment, transferring the application layer data to the user buffer of the software via DMA includes:
carrying the application layer data over the PCIe bus to the user buffer of the corresponding software.
According to an aspect of an embodiment of the present application, there is provided a storage medium storing a processor-executable program which, when executed by a processor, implements the high-performance processing method based on communication and calculation separation according to any one of the method embodiments described above.
The high-performance processing system and method based on communication and calculation separation and the storage medium provided by the embodiments of the application have the following beneficial effects: the system comprises a TOE node, a PCIe switching node and a CPU node; the TOE node is used for the system's external TCP or IP communication and for terminating TCP or IP data; the PCIe switching node is used for transferring the application layer data to a user buffer of the software via DMA; the CPU node is used for running application-based functional software to complete the application calculation functions of the system. The application improves the processing performance of the whole system, reduces the load on the CPU's processing capacity, and offers a practical engineering scheme for building high-performance processing systems.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a conventional data processing system architecture;
FIG. 2 is a schematic diagram of a high performance processing system architecture based on communication and computing separation according to an embodiment of the present application;
fig. 3 is a flowchart of a high performance processing method based on separation of communication and computation according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some embodiments of the present application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The technical terms appearing in the present application are explained as follows:
DMA: DMA (Direct Memory Access) allows hardware devices of different speeds to communicate without imposing a massive interrupt load on the CPU. Without DMA, the CPU must copy each piece of data from the source into its registers and then write it back out to the new location, and it is unavailable for other tasks during that time.
TOE: TOE (TCP Offload Engine) is a TCP acceleration technique in which a network interface controller (NIC) takes over the work of the TCP/IP stack from the CPU and performs it in hardware. TOE functions are common on high-speed Ethernet interfaces such as Gigabit Ethernet (GbE) or 10 Gigabit Ethernet (10GbE), where the work of processing TCP/IP packet headers becomes heavy; doing this work in hardware eases the burden on the processor.
PCIe: PCIe is a high-speed serial computer expansion bus standard. Compared with its predecessors, PCIe offers higher maximum system bus throughput, a lower I/O pin count and smaller physical size, better performance scaling for bus devices, a more detailed error detection and reporting mechanism (Advanced Error Reporting, AER), and native hot-plug functionality. PCIe also provides hardware support for I/O virtualization.
As shown in FIG. 1, a currently common data processing system is built as CPU + Ethernet switch + interface chips. The interface chip is responsible for the system's external communication and forms a unified external gateway for the system. The CPU node terminates the Ethernet interface, receives and transmits data through a software TCP/IP protocol stack, and performs application calculation on the corresponding data according to the application scenario. The Ethernet switch is responsible for interconnecting the nodes of the whole system. To achieve reliable end-to-end data transfer, the communication protocols are typically TCP/IP based.
This model is built on mature Ethernet switching and TCP/IP technology and can meet certain application requirements. However, as communication technology develops, communication bandwidth grows rapidly and data loads increase greatly, so the model's disadvantages become more and more obvious and it can no longer meet current high-performance processing requirements. Specifically:
(1) The processing of packet data by the kernel's software TCP/IP protocol stack, in particular the per-layer checksum calculations, the repeated interactions of the TCP protocol, error control, and retransmission on timeout, places heavy protocol processing pressure on the CPU. Under high-throughput, high-concurrency application scenarios in particular, protocol stack processing can consume a large share of the CPU's processing capacity. Moreover, because software processing delay is nondeterministic, the processing delay is further exacerbated as the load increases.
(2) Under the operating system kernel's processing model, the whole system suffers from a high interrupt rate, repeated copying of data between memory regions, frequent application context switches and similar problems, which further degrade the CPU's processing performance.
(3) Because the nodes generally communicate with one another over TCP/IP, every node must send and receive TCP/IP traffic, which increases the processing redundancy of the whole system.
Based on the above facts and analysis, the conventional processing model cannot meet the requirements of high-performance, high-concurrency, low-delay scenarios. To meet these practical processing challenges and build a high-performance processing platform, the application provides a design and implementation scheme for a high-performance processing system based on communication and calculation separation. The calculation problem, the intra-system communication problem and the extra-system communication problem are decoupled and treated separately. The application is characterized in that: the calculation problem is solved by the CPU and the functional software it runs; the communication problem is completely separated from the CPU and solved uniformly by hardware; communication outside the system is processed and terminated by a hardware TCP/IP protocol stack (i.e., a TOE hardware engine); the communication content inside the system is application layer data, carried and switched at high performance over a PCIe bus; and the application layer data is delivered directly into the software's user buffer via DMA.
Specifically, as shown in FIG. 2, the high-performance processing system based on communication and calculation separation proposed by the present application includes: a TOE node, a PCIe switching node and a CPU node. The TOE node is used for the system's external TCP or IP communication and for terminating TCP or IP data; the PCIe switching node is used for transferring the application layer data to a user buffer of the software via DMA; the CPU node is used for running application-based functional software to complete the application calculation functions of the system.
Specifically, the TOE node in this embodiment is used for the system's external TCP or IP communication, which includes: the TOE node parses the application layer data and sends it to a PCIe bus for internal communication; the TOE node receives data sent over the PCIe bus, carries it over the TCP/IP protocol, and sends it to an external network. The TOE node of this embodiment is a fully functional TCP/IP hardware protocol stack. Externally, it is responsible for the whole system's TCP/IP communication; internally, complete TCP/IP data is terminated at this node: on the one hand, the application layer data is parsed out and sent to the PCIe bus for internal communication; on the other hand, data from the PCIe bus is received, carried over the TCP/IP protocol, and sent to the external network. The TOE node also has a DMA function and can transfer application layer data to the corresponding memory space over the PCIe bus.
Specifically, the PCIe switching node of this embodiment transfers the application layer data to the user buffer of the software via DMA, which includes: the PCIe switching node conveys the application layer data through the PCIe bus to the user buffer of the corresponding software. The PCIe switching node uses the PCIe bus to carry data communication within the system efficiently and supports direct access to the memory space of the CPU node.
Accordingly, the processing tasks of the system provided in this embodiment include a calculation task, completed by the CPU node and the functional software, and a communication task, completed by hardware. The CPU node is responsible for running various application-based functional software to complete the application calculation functions of the system. Because the communication task is decoupled into hardware, and the data source model seen by the functional software is memory-based rather than IO-based, a given piece of functional software receives and transmits only the data associated with it, so it can achieve higher throughput performance. On the other hand, in order to "pass through" data to the functional software, a driver built on an efficient kernel-bypass model is needed. In terms of the software model, development can follow a model similar to DPDK.
The application provides a design and implementation scheme for a high-performance processing system based on communication and calculation separation. Following the idea of separating communication from calculation, communication is divided into communication outside the system and communication inside the system, and the calculation problem, the intra-system communication problem and the extra-system communication problem are decoupled and treated separately. This provides a practical technical scheme for the engineering construction of high-performance processing systems. A system built according to the application can meet the practical requirements of high concurrency, high throughput and low delay.
Fig. 3 is a flowchart of a high performance processing method based on communication and computation separation according to an embodiment of the present application, and as shown in fig. 3, the present application provides a high performance processing method based on communication and computation separation, including:
S301, carry out TCP or IP communication externally and terminate TCP or IP data.
S302, transfer the application layer data to a user buffer of the software via DMA.
S303, run application-based functional software to complete the application calculation functions of the system.
Optionally, the carrying out of TCP or IP communication externally includes:
parsing application layer data and sending it to a PCIe bus for internal communication; and receiving data sent over the PCIe bus, carrying it over the TCP/IP protocol, and sending it to an external network.
Optionally, the method of the present embodiment further includes: and the application layer data are transported to the corresponding memory space through the PCIe bus.
It should be noted that transferring the application layer data to the user buffer of the software via DMA includes: carrying the application layer data over the PCIe bus to the user buffer of the corresponding software.
Further, the present embodiment also provides a storage medium storing a processor-executable program which, when executed by a processor, implements the communication-and-computation-separation-based high-performance processing method as described in the previous embodiment.
The content of the method embodiment applies to the storage medium embodiment; the functions specifically implemented by the storage medium embodiment are the same as those of the method embodiment, and so are the beneficial effects achieved.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium may even be paper or other suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.
In the foregoing description of this specification, reference to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the application, the scope of which is defined by the claims and their equivalents.
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (5)
1. A high-performance processing system based on separation of communication and computation, the system comprising: a TOE node, a PCIe switching node, and a CPU node;
wherein the TOE node is configured to perform the system's external TCP or IP communication, terminate the TCP or IP protocol, and transfer application-layer data to a corresponding memory space over a PCIe bus;
the PCIe switching node is configured to transfer the application-layer data to a user buffer of software by DMA;
the CPU node is configured to run application-based functional software to perform the application computation functions of the system;
the processing tasks of the system comprise computation tasks and communication tasks, wherein the computation tasks are performed by the CPU node and the functional software, and the communication tasks are performed by hardware;
and wherein the TOE node performing the system's external TCP or IP communication comprises:
the TOE node parsing out the application-layer data and sending it to the PCIe bus for internal communication; and
the TOE node receiving data from the PCIe bus, encapsulating it with the TCP/IP protocol, and sending it to an external network.
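The two directions of claim 1 — strip TCP/IP headers on receive so only application-layer payload crosses the PCIe bus, and re-encapsulate on transmit — can be illustrated with a minimal software sketch. The fixed 20-byte header lengths and placeholder header bytes below are illustrative assumptions, not the patent's actual hardware interface:

```python
# Hypothetical minimal IPv4 + TCP headers (no options), payload follows.
IP_HDR_LEN = 20
TCP_HDR_LEN = 20

def toe_receive(frame: bytes) -> bytes:
    """TOE termination on receive: strip the TCP/IP headers so only the
    application-layer payload is forwarded over the PCIe bus (claim 1)."""
    return frame[IP_HDR_LEN + TCP_HDR_LEN:]

def toe_transmit(payload: bytes) -> bytes:
    """TOE on transmit: take payload from the PCIe bus and re-encapsulate
    it with (placeholder) TCP/IP headers for the external network."""
    ip_hdr = b"\x45" + b"\x00" * (IP_HDR_LEN - 1)   # version/IHL byte only
    tcp_hdr = b"\x00" * TCP_HDR_LEN                  # all-zero stand-in
    return ip_hdr + tcp_hdr + payload

if __name__ == "__main__":
    inbound = b"\x45" + b"\x00" * (IP_HDR_LEN - 1) + b"\x00" * TCP_HDR_LEN + b"app-data"
    assert toe_receive(inbound) == b"app-data"              # CPU never sees headers
    assert toe_receive(toe_transmit(b"reply")) == b"reply"  # round trip
    print("TOE termination round-trip ok")
```

The point of the sketch is the division of labor: header processing happens entirely before (or after) the PCIe boundary, so the CPU node only ever handles application-layer bytes.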
2. The high-performance processing system based on separation of communication and computation according to claim 1, wherein the PCIe switching node transferring the application-layer data to the user buffer of the software by DMA comprises:
the PCIe switching node carrying the application-layer data over the PCIe bus to the user buffer of the corresponding software.
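The DMA delivery of claim 2 can be modeled in software: the switching node writes payload directly into a preallocated per-application buffer, and the application reads it through a zero-copy view with no intermediate kernel-space copy. The buffer layout and the `UserBuffer` class are illustrative assumptions, not the patent's design:

```python
class UserBuffer:
    """Model of a pinned user buffer that the PCIe switching node's
    DMA engine writes into directly (claim 2)."""

    def __init__(self, size: int):
        self.mem = bytearray(size)   # stands in for pinned user memory
        self.wr = 0                  # write offset advanced by "DMA"

    def dma_write(self, payload: bytes) -> memoryview:
        """Place payload at the current offset, as the DMA engine would,
        and return a zero-copy view for the application to consume."""
        end = self.wr + len(payload)
        if end > len(self.mem):
            raise BufferError("user buffer full")
        self.mem[self.wr:end] = payload
        view = memoryview(self.mem)[self.wr:end]
        self.wr = end
        return view

if __name__ == "__main__":
    buf = UserBuffer(4096)
    v1 = buf.dma_write(b"first message")
    v2 = buf.dma_write(b"second")
    assert bytes(v1) == b"first message" and bytes(v2) == b"second"
    print("DMA placement ok")
```

The `memoryview` return models the claimed benefit: the application reads the payload in place, so the communication path never costs the CPU node a copy.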
3. A high-performance processing method based on separation of communication and computation, applied to the high-performance processing system according to any one of claims 1 to 2, characterized in that the method comprises:
performing TCP or IP communication with the TOE node, transferring application-layer data to a corresponding memory space over the PCIe bus, and terminating the TCP or IP protocol;
transferring, by the PCIe switching node, the application-layer data to the user buffer of the software by DMA;
running, by the CPU node, the application-based functional software to perform the application computation functions of the system;
wherein the processing tasks of the high-performance processing system comprise computation tasks and communication tasks, the computation tasks being performed by the functional software and the communication tasks being performed by hardware;
and wherein the performing TCP or IP communication with the TOE node comprises:
parsing out, by the TOE node, the application-layer data and sending it to the PCIe bus for internal communication; and
receiving, by the TOE node, data from the PCIe bus, encapsulating it with the TCP/IP protocol, and sending it to an external network.
4. The high-performance processing method based on separation of communication and computation according to claim 3, wherein transferring the application-layer data to the user buffer of the software by DMA comprises:
carrying the application-layer data over the PCIe bus to the user buffer of the corresponding software.
5. A storage medium storing a processor-executable program which, when executed by a processor, implements the high-performance processing method based on separation of communication and computation according to any one of claims 3 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210991611.1A CN115473861B (en) | 2022-08-18 | 2022-08-18 | High-performance processing system and method based on communication and calculation separation and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115473861A (en) | 2022-12-13
CN115473861B (en) | 2023-11-03
Family
ID=84365900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210991611.1A Active CN115473861B (en) | 2022-08-18 | 2022-08-18 | High-performance processing system and method based on communication and calculation separation and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115473861B (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1667601A (en) * | 2004-03-11 | 2005-09-14 | 国际商业机器公司 | Apparatus and method for sharing a network I/O adapter between logical partitions |
CN1819584A (en) * | 2004-11-12 | 2006-08-16 | 微软公司 | Method and apparatus for secure internet protocol (ipsec) offloading with integrated host protocol stack management |
WO2017046582A1 (en) * | 2015-09-16 | 2017-03-23 | Nanospeed Technologies Limited | Tcp/ip offload system |
WO2018018611A1 (en) * | 2016-07-29 | 2018-02-01 | 华为技术有限公司 | Task processing method and network card |
CN109491934A (en) * | 2018-09-28 | 2019-03-19 | 方信息科技(上海)有限公司 | A kind of storage management system control method of integrated computing function |
CN109714302A (en) * | 2017-10-25 | 2019-05-03 | 阿里巴巴集团控股有限公司 | The discharging method of algorithm, device and system |
CN110109852A (en) * | 2019-04-03 | 2019-08-09 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | System and method for realizing TCP _ IP protocol by hardware |
CN111031011A (en) * | 2019-11-26 | 2020-04-17 | 中科驭数(北京)科技有限公司 | Interaction method and device of TCP/IP accelerator |
CN111163121A (en) * | 2019-11-19 | 2020-05-15 | 核芯互联科技(青岛)有限公司 | Ultra-low-delay high-performance network protocol stack processing method and system |
CN112953967A (en) * | 2021-03-30 | 2021-06-11 | 扬州万方电子技术有限责任公司 | Network protocol unloading device and data transmission system |
CN113225307A (en) * | 2021-03-18 | 2021-08-06 | 西安电子科技大学 | Optimization method, system and terminal for pre-reading descriptors in offload engine network card |
CN113312283A (en) * | 2021-05-28 | 2021-08-27 | 北京航空航天大学 | Heterogeneous image learning system based on FPGA acceleration |
CN113347017A (en) * | 2021-04-09 | 2021-09-03 | 中科创达软件股份有限公司 | Network communication method and device, network node equipment and hybrid network |
CN114238187A (en) * | 2022-02-24 | 2022-03-25 | 苏州浪潮智能科技有限公司 | FPGA-based full-stack network card task processing system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11537541B2 (en) * | 2018-09-28 | 2022-12-27 | Xilinx, Inc. | Network interface device and host processing device |
Non-Patent Citations (4)
Title |
---|
Decentralized Attribute-Based Encryption and Data Sharing Scheme in Cloud Storage; Xiehua Li, Yanlong Wang, Ming Xu, Yaping Cui; China Communications, Issue 02 (full text) *
User-Level Device Drivers: Achieved Performance; Ben Leslie, Peter Chubb, Nicholas FitzRoy-Dale, Stefan Götz, Charles Gray, Luke Macpherson, Daniel Potts, Kevin Elphinstone, Gernot Heiser; Journal of Computer Science and Technology, Issue 05 (full text) *
A Zero-Copy Data Transfer Method for TCP/IP Offload; Wang Xiaofeng, Shi Xiangquan, Su Jinshu; Computer Engineering and Science, Issue 02 (full text) *
TCP Data Reception Offload Based on Multi-Core NPU; Li Jie, Chen Shuhui; Computer Engineering and Science, Issue 07 (full text) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP4401381A2 (en) | Receiver-based precision congestion control | |
US8477806B2 (en) | Method and system for transmission control packet (TCP) segmentation offload | |
US20220311711A1 (en) | Congestion control based on network telemetry | |
WO2023005748A1 (en) | Data processing method and apparatus | |
CN109960671B (en) | Data transmission system, method and computer equipment | |
WO2022139930A1 (en) | Resource consumption control | |
US20220166698A1 (en) | Network resource monitoring | |
DE102022126611A1 (en) | SERVICE MESH OFFSET TO NETWORK DEVICES | |
US12034604B2 (en) | MQTT protocol simulation method and simulation device | |
US20220321491A1 (en) | Microservice data path and control path processing | |
CN113746749A (en) | Network connection device | |
DE112016002909T5 (en) | Flexible interconnect architecture | |
WO2023075930A1 (en) | Network interface device-based computations | |
CN106873915A (en) | A kind of data transmission method and device based on RDMA registers memory blocks | |
US8161126B2 (en) | System and method for RDMA QP state split between RNIC and host software | |
CN116074131B (en) | Data processing method, intelligent network card and electronic equipment | |
CN113347017A (en) | Network communication method and device, network node equipment and hybrid network | |
CN115686836A (en) | Unloading card provided with accelerator | |
CN115473861B (en) | High-performance processing system and method based on communication and calculation separation and storage medium | |
CN115202573A (en) | Data storage system and method | |
CN116048424B (en) | IO data processing method, device, equipment and medium | |
Gilfeather et al. | Modeling protocol offload for message-oriented communication | |
CN113572575B (en) | Self-adaptive data transmission method and system | |
CN113271336B (en) | DPDK-based robot middleware DDS data transmission method, electronic device and computer-readable storage medium | |
CN114625220B (en) | Server and data processing method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||