CN102073547B - Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving - Google Patents

Info

Publication number
CN102073547B
CN102073547B · CN201010611827A
Authority
CN
China
Prior art keywords
cpu
thread
memory
buffer
message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010611827
Other languages
Chinese (zh)
Other versions
CN102073547A (en)
Inventor
云晓春
杜跃进
王丽宏
汪立东
陈训逊
包秀国
杜翠兰
王勇
刘朝辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Dawning Information Industry Beijing Co Ltd
Original Assignee
National Computer Network and Information Security Management Center
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center, Dawning Information Industry Beijing Co Ltd filed Critical National Computer Network and Information Security Management Center
Priority to CN 201010611827 priority Critical patent/CN102073547B/en
Publication of CN102073547A publication Critical patent/CN102073547A/en
Application granted granted Critical
Publication of CN102073547B publication Critical patent/CN102073547B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a performance optimization method for parallel packet receiving with multiple buffers on a multipath (multi-CPU) server. The driver software is responsible for allocating the buffers used to receive packets: it requests one packet buffer for each thread in the kernel, and because the request is made in the kernel, the CPU to which the memory is attached can be specified by a parameter to match the thread number; that is, thread 0 is given local memory on CPU 0, thread 1 is given local memory on CPU 1, and so on. When each thread calls the packet-receiving application program interface (API) for the first time, the interface library software binds the thread to the CPU corresponding to its thread number. This reduces the cost of CPUs accessing remote memory and the cost of scheduling threads across multiple CPUs, and improves multi-threaded packet-receiving efficiency.

Description

A performance optimization method for parallel packet receiving with multiple buffers on a multipath server
Technical field
The present invention relates to the field of network data processing, and in particular to a performance optimization method for parallel packet receiving with multiple buffers on a multipath server.
Background technology
A network data processing system running on a multipath server (a server whose mainboard carries multiple CPUs) generally needs to run in multi-threaded mode, starting as many parallel processing threads as there are CPUs, with each thread handling part of the traffic, so that every CPU is utilized.
In the commonly used technical scheme, each thread requests its own packet buffer; the network card distributes incoming packets into the per-thread buffers according to some flow-distribution algorithm, and each thread processes the packets in its own buffer in a loop.
In addition, on a typical multi-CPU server, the memory controller of each CPU is directly attached to certain memory modules, which are that CPU's local memory; memory on the mainboard that is not directly attached to a given CPU is its remote memory. On such a mainboard, each CPU accesses its local memory far more efficiently than remote memory. In the general technical scheme, each thread's packet buffer is allocated at random from all of memory, so there is no guarantee that local memory is used to improve efficiency.
Furthermore, in a typical operating system, the application software's threads are scheduled at random across the CPUs. Such scheduling causes thread data migration and context-switch overhead, which degrades server system performance.
Summary of the invention
The object of the present invention is to provide a performance optimization method for parallel packet receiving with multiple buffers on a multipath server, improving the performance of multi-threaded, multi-buffer applications on such servers.
The method comprises a kernel driver and an application interface library, and is implemented as follows:
A. When the driver is loaded, according to a predefined number of threads, it requests local memory from the CPU corresponding to each thread, to serve as that thread's packet buffer;
B. When a thread of the application software calls the API in the interface library for the first time, it first maps the buffer with its own number from kernel space into the application's user space, and then binds itself to the CPU with the same number;
C. Each thread receives packets from its own buffer in a loop. Throughout this process the thread is bound to its CPU and cannot migrate between CPUs, and the packet buffer is local memory of that CPU, so no remote-memory access cost is incurred.
In a preferred technical scheme of the present invention, the thread numbers and CPU numbers in step A are in a fixed correspondence: the thread number modulo the CPU count gives the CPU number for that thread. Thus, even when the number of threads exceeds the number of CPUs, the threads and packet buffers assigned to a CPU still keep a fixed correspondence with that CPU's number.
The present invention effectively avoids the overhead of CPUs accessing remote memory and of threads being scheduled across multiple CPUs, and improves the efficiency of multi-threaded packet receiving.
Description of drawings
Fig. 1 shows the logical structure of the system of the present invention.
Specific embodiments
The present invention is based on a multi-CPU server and comprises a kernel driver and an application interface library; the system tightly couples each CPU, buffer, and thread.
The driver software is responsible for allocating the buffers used to receive packets. It must request one packet buffer for each thread in the kernel; because the request is made in the kernel, the CPU to which the memory is attached can be specified by a parameter to be the thread number. That is, thread 0 is given local memory on CPU 0, thread 1 is given local memory on CPU 1, and so on.
When each thread calls the packet-receiving API for the first time, the interface library software binds the thread to the CPU corresponding to its thread number.
The implementation method and process of the invention are as follows:
(1) The driver requests local memory from each thread's corresponding CPU as the packet buffer.
When the driver is loaded, according to the predefined number of threads, it requests local memory from the CPU corresponding to each thread as that thread's packet buffer. Thread numbers and CPU numbers are in a fixed correspondence: the thread number modulo the CPU count gives the CPU number for that thread. In this way, even when the number of threads exceeds the number of CPUs, the threads and packet buffers assigned to a CPU still keep a fixed correspondence with that CPU's number.
(2) The interface library maps the packet buffer and binds the thread to the corresponding CPU.
When a thread of the application software calls the API in the interface library for the first time, it first maps the buffer with its own number from kernel space into the application's user space, and then binds itself to the CPU with the same number.
(3) The threads receive packets in parallel.
When each thread receives packets, they are written into the thread's own buffer. Throughout this process the thread is bound to its CPU and cannot migrate between CPUs, and the packet buffer is local memory of that CPU, so no remote-memory access cost is incurred.

Claims (1)

1. A performance optimization method for parallel packet receiving with multiple buffers on a multipath server, characterized in that said server comprises a kernel driver and application interface library system, and said method comprises the steps of:
A. When the driver is loaded, according to a predefined number of threads, requesting local memory from the CPU corresponding to each thread, to serve as that thread's packet buffer;
B. When a thread of the application software calls the API in the interface library for the first time, first mapping the buffer with its own number from kernel space into the user space of said application software, and then binding itself to the CPU with the same number;
C. Each thread receiving packets from its own buffer in a loop, wherein throughout this process the thread is bound to its CPU and cannot migrate between CPUs, and the packet buffer is local memory of that CPU, so no remote-memory access cost is incurred;
wherein in said step A the thread numbers and CPU numbers are in a fixed correspondence: the thread number modulo the CPU count gives the CPU number for that thread, so that when the number of threads exceeds the number of CPUs, the threads and packet buffers assigned to a CPU still keep a fixed correspondence with that CPU's number.
CN 201010611827 2010-12-17 2010-12-17 Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving Active CN102073547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010611827 CN102073547B (en) 2010-12-17 2010-12-17 Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving

Publications (2)

Publication Number Publication Date
CN102073547A CN102073547A (en) 2011-05-25
CN102073547B true CN102073547B (en) 2013-08-28

Family

ID=44032093

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010611827 Active CN102073547B (en) 2010-12-17 2010-12-17 Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving

Country Status (1)

Country Link
CN (1) CN102073547B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102769575A (en) * 2012-08-08 2012-11-07 南京中兴特种软件有限责任公司 A traffic load balancing method for smart network card
CN104881326B (en) * 2015-05-26 2018-04-13 上海帝联信息科技股份有限公司 Journal file processing method and processing device
CN105938438B (en) * 2015-11-24 2022-07-01 杭州迪普科技股份有限公司 Data message processing method and device
CN105912306B (en) * 2016-04-12 2018-05-18 电子科技大学 A kind of method of the data processing of high concurrent Platform Server
CN107168800A (en) * 2017-05-16 2017-09-15 郑州云海信息技术有限公司 A kind of memory allocation method and device
CN108536535A (en) * 2018-01-24 2018-09-14 北京奇艺世纪科技有限公司 A kind of dns server and its thread control method and device
CN111708631B (en) * 2020-05-06 2023-06-30 深圳震有科技股份有限公司 Data processing method based on multipath server, intelligent terminal and storage medium
CN111654551B (en) * 2020-06-17 2023-01-31 广东瀚阳轨道信息科技有限公司 Transmission control method and system for stress dispersion locking data of railway jointless track

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1517872A (en) * 2003-01-16 2004-08-04 国际商业机器公司 Method and device for dynamic allocation of computer resource
CN1664804A (en) * 2004-03-04 2005-09-07 国际商业机器公司 Mechanism for reducing remote memory accesses to shared data in a multi-nodal computer system
CN101477472A (en) * 2009-01-08 2009-07-08 上海交通大学 Multi-core multi-threading construction method for hot path in dynamic binary translator
WO2010004474A2 (en) * 2008-07-10 2010-01-14 Rocketic Technologies Ltd Efficient parallel computation of dependency problems
CN101634953A (en) * 2008-07-22 2010-01-27 国际商业机器公司 Method and device for calculating search space, and method and system for self-adaptive thread scheduling
CN102045199A (en) * 2010-12-17 2011-05-04 天津曙光计算机产业有限公司 Performance optimization method for multi-server multi-buffer zone parallel packet sending

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7159216B2 (en) * 2001-11-07 2007-01-02 International Business Machines Corporation Method and apparatus for dispatching tasks in a non-uniform memory access (NUMA) computer system

Also Published As

Publication number Publication date
CN102073547A (en) 2011-05-25

Similar Documents

Publication Publication Date Title
CN102073547B (en) Performance optimizing method for multipath server multi-buffer-zone parallel packet receiving
US12147849B2 (en) Work stealing in heterogeneous computing systems
CN104750543B (en) Thread creation method, service request processing method and relevant device
CN103412786B (en) High performance server architecture system and data processing method thereof
CN106293950B (en) A kind of resource optimization management method towards group system
US8413158B2 (en) Processor thread load balancing manager
CN102662740B (en) Asymmetric multi-core system and realization method thereof
CN111752971B (en) Method, device, equipment and storage medium for processing data stream based on task parallel
US20100037234A1 (en) Data processing system and method of task scheduling
CN102045199A (en) Performance optimization method for multi-server multi-buffer zone parallel packet sending
CN103617088A (en) Method, device and processor of device for distributing core resources in different types of threads of processor
CN111176806A (en) Service processing method, device and computer readable storage medium
GB2573316A (en) Data processing systems
CN118502903A (en) Thread bundle scheduling method based on general graphics processor and storage medium
CN110087324A (en) Resource allocation methods, device, access network equipment and storage medium
CN102855173A (en) Method and device for testing software performance
CN109002286A (en) Data asynchronous processing method and device based on synchronous programming
CN102520916B (en) Method for eliminating texture retardation and register management in MVP (multi thread virtual pipeline) processor
CN102508696A (en) Asymmetrical resource scheduling method and device
WO2019153681A1 (en) Smart instruction scheduler
CN102955685B (en) Multi-core DSP and system thereof and scheduler
CN109885261B (en) A method to improve the performance of storage system
JP5630798B1 (en) Processor and method
CN110764710A (en) Data access method and storage system of low-delay and high-IOPS
CN103714511A (en) GPU-based branch processing method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant