CN102073547B - Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving - Google Patents

Info

Publication number
CN102073547B
CN102073547B · CN201010611827A
Authority
CN
China
Prior art keywords
cpu
thread
memory
buffer
message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010611827
Other languages
Chinese (zh)
Other versions
CN102073547A (en)
Inventor
云晓春
杜跃进
王丽宏
汪立东
陈训逊
包秀国
杜翠兰
王勇
刘朝辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Dawning Information Industry Beijing Co Ltd
Original Assignee
National Computer Network and Information Security Management Center
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center, Dawning Information Industry Beijing Co Ltd filed Critical National Computer Network and Information Security Management Center
Priority to CN 201010611827 priority Critical patent/CN102073547B/en
Publication of CN102073547A publication Critical patent/CN102073547A/en
Application granted granted Critical
Publication of CN102073547B publication Critical patent/CN102073547B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a performance optimization method for parallel packet receiving with multiple buffers on a multipath (multi-CPU) server. The driver software is responsible for allocating the buffers used to receive packets: it requests one packet buffer for each thread in the kernel, and because the request is made in the kernel, the CPU to which the memory is attached can be specified by a parameter to match the thread number; that is, thread 0 is given local memory on CPU 0, thread 1 is given local memory on CPU 1, and so on. When each thread calls the packet-receiving application program interface (API) for the first time, the interface library software binds the thread to the CPU corresponding to its thread number. This reduces the cost of CPUs accessing remote memory and the cost of scheduling threads across multiple CPUs, and improves multi-threaded packet-receiving efficiency.

Description

A performance optimization method for parallel packet receiving with multiple buffers on a multipath server
Technical field
The present invention relates to the field of network data processing, and in particular to a performance optimization method for parallel packet receiving with multiple buffers on a multipath server.
Background technology
A network data processing system running on a multipath server (a server whose mainboard carries multiple CPUs) generally needs to run in multi-threaded mode, starting as many parallel processing threads as there are CPUs, with each thread handling part of the traffic, so that every CPU is utilized.
In the commonly used technical scheme, each thread requests its own packet buffer; the network card distributes incoming packets into the per-thread buffers according to some flow-distribution algorithm, and each thread processes the packets in its own buffer in a loop.
In addition, on a typical multi-CPU server, the memory controller of each CPU is directly attached to certain memory modules, which are that CPU's local memory; memory on the mainboard that is not directly attached to a given CPU is its remote memory. On such a mainboard, each CPU accesses its local memory far more efficiently than remote memory. In the general technical scheme, each thread's packet buffer is allocated at random from all of memory, so there is no guarantee that local memory is used to improve efficiency.
Furthermore, in a typical operating system, the application software's threads are scheduled at random across the CPUs. Such scheduling causes thread data migration and context-switch overhead, which degrades server system performance.
Summary of the invention
The object of the present invention is to provide a performance optimization method for parallel packet receiving with multiple buffers on a multipath server, improving the performance of multi-threaded, multi-buffer applications on such servers.
The method comprises a kernel driver and an application interface library, and is implemented as follows:
A. When the driver is loaded, according to a predefined number of threads, it requests local memory from the CPU corresponding to each thread, to serve as that thread's packet buffer;
B. When a thread of the application software calls the API in the interface library for the first time, it first maps the buffer with its own number from kernel space into the application's user space, and then binds itself to the CPU with the same number;
C. Each thread receives packets from its own buffer in a loop. Throughout this process the thread is bound to its CPU and cannot migrate between CPUs, and the packet buffer is local memory of that CPU, so no remote-memory access cost is incurred.
In a preferred technical scheme of the present invention, the thread numbers and CPU numbers in step A are in a fixed correspondence: the thread number modulo the CPU count gives the CPU number for that thread. Thus, even when the number of threads exceeds the number of CPUs, the threads and packet buffers assigned to a CPU still keep a fixed correspondence with that CPU's number.
The present invention effectively avoids the overhead of CPUs accessing remote memory and of threads being scheduled across multiple CPUs, and improves the efficiency of multi-threaded packet receiving.
Description of drawings
Fig. 1 shows the logical structure of the system of the present invention.
Specific embodiments
The present invention is based on a multi-CPU server and comprises a kernel driver and an application interface library; the system tightly couples each CPU, buffer, and thread.
The driver software is responsible for allocating the buffers used to receive packets. It must request one packet buffer for each thread in the kernel; because the request is made in the kernel, the CPU to which the memory is attached can be specified by a parameter to be the thread number. That is, thread 0 is given local memory on CPU 0, thread 1 is given local memory on CPU 1, and so on.
When each thread calls the packet-receiving API for the first time, the interface library software binds the thread to the CPU corresponding to its thread number.
The implementation method and process of the invention are as follows:
(1) The driver requests local memory from each thread's corresponding CPU as the packet buffer.
When the driver is loaded, according to the predefined number of threads, it requests local memory from the CPU corresponding to each thread as that thread's packet buffer. Thread numbers and CPU numbers are in a fixed correspondence: the thread number modulo the CPU count gives the CPU number for that thread. In this way, even when the number of threads exceeds the number of CPUs, the threads and packet buffers assigned to a CPU still keep a fixed correspondence with that CPU's number.
(2) The interface library maps the packet buffer and binds the thread to the corresponding CPU.
When a thread of the application software calls the API in the interface library for the first time, it first maps the buffer with its own number from kernel space into the application's user space, and then binds itself to the CPU with the same number.
(3) The threads receive packets in parallel.
When each thread receives packets, they are written into the thread's own buffer. Throughout this process the thread is bound to its CPU and cannot migrate between CPUs, and the packet buffer is local memory of that CPU, so no remote-memory access cost is incurred.

Claims (1)

1. A performance optimization method for parallel packet receiving with multiple buffers on a multipath server, characterized in that said server comprises a kernel driver and application interface library system, and said method comprises the steps of:
A. When the driver is loaded, according to a predefined number of threads, requesting local memory from the CPU corresponding to each thread, to serve as that thread's packet buffer;
B. When a thread of the application software calls the API in the interface library for the first time, first mapping the buffer with its own number from kernel space into the user space of said application software, and then binding itself to the CPU with the same number;
C. Each thread receiving packets from its own buffer in a loop, wherein throughout this process the thread is bound to its CPU and cannot migrate between CPUs, and the packet buffer is local memory of that CPU, so no remote-memory access cost is incurred;
wherein in said step A the thread numbers and CPU numbers are in a fixed correspondence: the thread number modulo the CPU count gives the CPU number for that thread, so that when the number of threads exceeds the number of CPUs, the threads and packet buffers assigned to a CPU still keep a fixed correspondence with that CPU's number.
CN 201010611827 2010-12-17 2010-12-17 Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving Active CN102073547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010611827 CN102073547B (en) 2010-12-17 2010-12-17 Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving

Publications (2)

Publication Number Publication Date
CN102073547A CN102073547A (en) 2011-05-25
CN102073547B true CN102073547B (en) 2013-08-28

Family

ID=44032093

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010611827 Active CN102073547B (en) 2010-12-17 2010-12-17 Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving

Country Status (1)

Country Link
CN (1) CN102073547B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102769575A (en) * 2012-08-08 2012-11-07 南京中兴特种软件有限责任公司 A traffic load balancing method for smart network card
CN104881326B (en) * 2015-05-26 2018-04-13 上海帝联信息科技股份有限公司 Journal file processing method and processing device
CN105938438B (en) * 2015-11-24 2022-07-01 杭州迪普科技股份有限公司 Data message processing method and device
CN105912306B (en) * 2016-04-12 2018-05-18 电子科技大学 A kind of method of the data processing of high concurrent Platform Server
CN107168800A (en) * 2017-05-16 2017-09-15 郑州云海信息技术有限公司 A kind of memory allocation method and device
CN108536535A (en) * 2018-01-24 2018-09-14 北京奇艺世纪科技有限公司 A kind of dns server and its thread control method and device
CN111708631B (en) * 2020-05-06 2023-06-30 深圳震有科技股份有限公司 Data processing method based on multipath server, intelligent terminal and storage medium
CN111654551B (en) * 2020-06-17 2023-01-31 广东瀚阳轨道信息科技有限公司 Transmission control method and system for stress dispersion locking data of railway jointless track

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1517872A (en) * 2003-01-16 2004-08-04 国际商业机器公司 Method and device for dynamic allocation of computer resource
CN1664804A (en) * 2004-03-04 2005-09-07 国际商业机器公司 Mechanism for reducing remote memory accesses to shared data in a multi-nodal computer system
CN101477472A (en) * 2009-01-08 2009-07-08 上海交通大学 Multi-core multi-threading construction method for hot path in dynamic binary translator
WO2010004474A2 (en) * 2008-07-10 2010-01-14 Rocketic Technologies Ltd Efficient parallel computation of dependency problems
CN101634953A (en) * 2008-07-22 2010-01-27 国际商业机器公司 Method and device for calculating search space, and method and system for self-adaptive thread scheduling
CN102045199A (en) * 2010-12-17 2011-05-04 天津曙光计算机产业有限公司 Performance optimization method for multi-server multi-buffer zone parallel packet sending

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7159216B2 (en) * 2001-11-07 2007-01-02 International Business Machines Corporation Method and apparatus for dispatching tasks in a non-uniform memory access (NUMA) computer system

Also Published As

Publication number Publication date
CN102073547A (en) 2011-05-25

Similar Documents

Publication Publication Date Title
CN102073547B (en) Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving
US12147849B2 (en) Work stealing in heterogeneous computing systems
CN104750543B (en) Thread creation method, service request processing method and relevant device
CN103412786B (en) High performance server architecture system and data processing method thereof
CN106293950B (en) A kind of resource optimization management method towards group system
US8413158B2 (en) Processor thread load balancing manager
CN102662740B (en) Asymmetric multi-core system and realization method thereof
CN111752971B (en) Method, device, equipment and storage medium for processing data stream based on task parallel
US20100037234A1 (en) Data processing system and method of task scheduling
CN102045199A (en) Performance optimization method for multi-server multi-buffer zone parallel packet sending
CN103617088A (en) Method, device and processor of device for distributing core resources in different types of threads of processor
CN111176806A (en) Service processing method, device and computer readable storage medium
GB2573316A (en) Data processing systems
CN118502903A (en) Thread bundle scheduling method based on general graphics processor and storage medium
CN110087324A (en) Resource allocation methods, device, access network equipment and storage medium
CN102855173A (en) Method and device for testing software performance
CN109002286A (en) Data asynchronous processing method and device based on synchronous programming
CN102520916B (en) Method for eliminating texture retardation and register management in MVP (multi thread virtual pipeline) processor
CN102508696A (en) Asymmetrical resource scheduling method and device
WO2019153681A1 (en) Smart instruction scheduler
CN102955685B (en) Multi-core DSP and system thereof and scheduler
CN109885261B (en) A method to improve the performance of storage system
JP5630798B1 (en) Processor and method
CN110764710A (en) Data access method and storage system of low-delay and high-IOPS
CN103714511A (en) GPU-based branch processing method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant