Disclosure of Invention
The disclosure provides a method and a device for improving performance of an intelligent network card, which are used for solving the defect of low read-write efficiency in the prior art and realizing rapid and efficient data reading and writing.
In a first aspect, the present disclosure provides a method for improving performance of an intelligent network card, including:
acquiring the address of the table entry pointed to by a read request instruction of a current processing module;
determining a hash value corresponding to the address according to the address of the table entry;
determining the locking state of the cell corresponding to the read request instruction according to the hash value and a pre-constructed bitmap lock table containing a plurality of cells;
and determining whether the scheduler executes a read operation according to the locking state of the cell.
According to the method for improving the performance of the intelligent network card provided by the present disclosure, the determining of the locking state of the cell corresponding to the read request instruction according to the hash value and the pre-constructed bitmap lock table containing a plurality of cells specifically includes:
taking the hash value as a dividend and the depth of the bit map lock table as a divisor to obtain a remainder;
querying the bit value in the cell of the bitmap lock table that corresponds to the remainder;
and determining the locking state of the cell according to the bit value.
The method for improving the performance of the intelligent network card provided by the present disclosure, wherein the determining whether the dispatcher executes the read operation according to the locking state specifically includes:
if the locking state of the cell is unlocked, the scheduler sends the read request instruction to the controller and changes the bit value of the cell to the locked state;
and if the locking state of the cell is locked, the scheduler does not send the read request instruction to the controller.
According to the method for improving the performance of the intelligent network card provided by the present disclosure, after the scheduler sends the read request instruction to the controller if the locking state of the cell is unlocked, the method further includes:
the controller reads the data of the table entry corresponding to the cell after receiving the read request instruction from the scheduler, and sends the data of the table entry to the current processing module.
According to the method for improving the performance of the intelligent network card provided by the present disclosure, after the scheduler does not send the read request instruction to the controller if the locking state of the cell is locked, the method further includes:
adding, by the scheduler, the read request instruction to a waiting table;
and the scheduler continues to schedule the read request instructions of other processing modules in a round-robin (polling) scheduling manner, and determines whether to execute the read operation according to the bit value of the cell corresponding to each of those read request instructions, wherein the read request instructions of the other processing modules include the read request instructions stored in the waiting table.
According to the method for improving the performance of the intelligent network card, the reading request instruction stored in the waiting table participates in the polling scheduling in a first-in first-out mode.
According to the method for improving the performance of the intelligent network card, the table entry writing operation after the table entry reading operation is completed comprises the following steps:
after the processing module updates the read data of the table entry, a write request instruction is initiated;
Acquiring an address of an entry aimed at by the write request instruction;
The controller completes the updated writing operation of the data after receiving the writing request instruction;
and after the writing operation is completed, the scheduler updates the state of the cell corresponding to the writing request instruction in the bitmap lock table to an unlocked state.
In a second aspect, the present disclosure provides an apparatus for improving performance of an intelligent network card, including:
The first processing module is used for acquiring the address of the table item pointed by the reading request instruction of the current processing module;
the second processing module is used for determining a hash value corresponding to the address according to the address of the table entry;
The third processing module is used for determining the locking state of the cell corresponding to the reading request instruction according to the hash value and a bit map locking table which is constructed in advance and contains a plurality of cells;
and the fourth processing module is used for determining whether the dispatcher executes the read operation according to the locking state of the cell.
In a third aspect, the present disclosure further provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method for improving performance of a smart network card as described in any one of the above.
In a fourth aspect, the present disclosure also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of a method of enhancing the performance of a smart network card as described in any of the above.
According to the method and the device for improving the performance of the intelligent network card provided by the present disclosure, the address of the table entry pointed to by the read request instruction of the current processing module is acquired, and a hash value corresponding to the address is determined according to the address of the table entry, so that, owing to the uniqueness of the hash value, the address of each table entry is uniquely represented. The locking state of the cell corresponding to the read request instruction is then determined according to the hash value and a pre-constructed bitmap lock table containing a plurality of cells, so that a lock is applied to the request of each processing module, and whether the scheduler executes the read operation is determined according to the locking state of the cell. By introducing the bitmap lock table structure, the single lock of the prior art is split into a plurality of locks, which reduces the collision probability of each lock and improves the read-write performance of the table entries.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some embodiments, but not all embodiments of the present disclosure. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the disclosed embodiments, are within the scope of the disclosed embodiments.
The following describes a method for improving performance of an intelligent network card according to an embodiment of the present disclosure with reference to fig. 1 to fig. 3, where the method includes:
step 100, obtaining the address of an item pointed by a read request instruction of a current processing module;
Specifically, in the prior art, in order to fully utilize the bus bandwidth, hardware designs generally adopt a multi-port read-write mode: a plurality of processing modules (proc0, proc1, ..., procM) read and write the table entry contents in parallel, a scheduler schedules the read-write requests in turn in a round-robin manner and sends them to a storage controller, and the storage controller completes the reading and writing of the table entry contents stored in the memory chips of the storage device.
With reference to fig. 1, assume that there are two processing modules, proc0 and proc1, each of which adds 1 to the table entry content. At time t1, proc0 reads the content of Entry0 (assume that Entry0.data = 0 at this time); at time t2, proc1 also initiates a read of Entry0 and reads Entry0.data = 0; at time t3, proc0 writes Entry0.data = 1 back to Entry0; and at time t4, proc1 writes Entry0.data = 1 back to Entry0. Thus, after the operations of proc0 and proc1, the content of Entry0 has increased by only 1 instead of 2, which clearly does not match the expected result.
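By way of illustration only, the interleaving described above can be reproduced in a few lines of Python. The dictionary `table` and the helper functions `read` and `write` are hypothetical stand-ins for the storage device and the controller; they are not part of the disclosure.

```python
# Hypothetical model of the unprotected interleaving at times t1..t4.
table = {"Entry0": 0}  # Entry0.data = 0 before t1

def read(entry):
    """Stand-in for a read issued through the controller."""
    return table[entry]

def write(entry, value):
    """Stand-in for a write issued through the controller."""
    table[entry] = value

proc0_copy = read("Entry0")        # t1: proc0 reads 0
proc1_copy = read("Entry0")        # t2: proc1 also reads 0
write("Entry0", proc0_copy + 1)    # t3: proc0 writes back 1
write("Entry0", proc1_copy + 1)    # t4: proc1 writes back 1 again

lost_update_result = table["Entry0"]  # one of the two increments is lost
expected_result = 2                   # two modules each adding 1
```

Without any lock, `lost_update_result` differs from `expected_result`, which is the inconsistency the bitmap lock table is intended to prevent.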
Because the external memory is divided into a plurality of table entries, a multi-port read-write mode is adopted, namely a plurality of processing modules (proc 0, proc1, procM) are used for reading and writing the contents of the table entries in parallel, and each processing module needs to determine the addresses of different table entries when performing read-write operation on one table entry.
Step 200, determining a hash value corresponding to the address according to the address of the table entry;
Specifically, after the address of the table entry to be read by each processing module is obtained, a hash algorithm is applied to the address to obtain the hash value of each table entry address. A hash algorithm maps a binary value of arbitrary length to a shorter, fixed-length binary value, which is called the hash value. A hash value is a compact numerical representation of a piece of data. If a piece of plaintext is hashed and even a single letter of it is changed, the subsequent hash produces a different value. For a well-designed hash function it is computationally infeasible to find two different inputs that hash to the same value, so the hash value of the data can be used to verify the integrity of the data. Thus, by hashing the address, the address is converted into a compact, fixed-length binary representation that identifies the table entry.
Step 300, determining the locking state of the cell corresponding to the reading request instruction according to the hash value and a bit map lock table which is constructed in advance and contains a plurality of cells;
Specifically, referring to fig. 3, the bitmap lock table locker-Bitmap constructed in the present disclosure includes a plurality of cells. The bitmap lock table locker-Bitmap is also a lock in nature, but, compared with the previous single lock, it is a set of multiple locks. Adopting a plurality of locks is equivalent to dividing the large table of entries in the storage device into smaller groups of entries, each group having its own lock. The greater the depth of the bitmap lock table locker-Bitmap, the more locks there are and, relatively speaking, the lower the collision probability for each group. Of course, from a hardware implementation perspective, the bitmap lock table locker-Bitmap cannot be designed to be as deep as the number of table entries in the storage device (millions).
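As a rough illustration of why a deeper table lowers the collision probability, consider an idealized assumption that is not stated in the disclosure: hash values are spread uniformly over the D cells of the table, and k cells are currently locked. The probability that a newly scheduled request maps to a locked cell is then:

```latex
P(\text{blocked}) = \frac{k}{D}, \qquad 0 \le k \le D
```

Under this assumption, a single lock (D = 1) blocks every request whenever it is held, whereas a table of depth D = 128 with one locked cell blocks a new request with probability 1/128.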
Preferably, the depth of the bitmap lock table locker-Bitmap in the present disclosure is a power of 2, such as 128, 256, 512, or 1024. According to the hash value of the table entry address obtained in the above step and the bitmap lock table locker-Bitmap preset in the present disclosure, the cell corresponding to each table entry address in the bitmap lock table locker-Bitmap is determined. Since the bitmap lock table locker-Bitmap is a multi-cell table in which each cell holds a single bit, the bit value in each cell is 0 or 1. A bit value of 0 may be set to indicate the unlocked state and a bit value of 1 to indicate the locked state.
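A minimal software sketch of such a table is given below, assuming that the cells are packed into a single integer used as a bit field. The class name `LockerBitmap` and its methods are hypothetical and serve only to illustrate the one-bit-per-cell layout; a hardware implementation would use a register array or similar structure.

```python
class LockerBitmap:
    """Illustrative bitmap lock table: one bit per cell, 0 = unlocked, 1 = locked."""

    def __init__(self, depth=128):
        # A power-of-two depth is assumed, matching the preferred depths above.
        if depth <= 0 or depth & (depth - 1):
            raise ValueError("depth must be a power of 2")
        self.depth = depth
        self.bits = 0  # the table is initialized to all '0' (every cell unlocked)

    def is_locked(self, cell):
        """Return True if the bit of the given cell is 1."""
        return ((self.bits >> cell) & 1) == 1

    def lock(self, cell):
        """Set the bit of the given cell to 1."""
        self.bits |= 1 << cell

    def unlock(self, cell):
        """Clear the bit of the given cell back to 0."""
        self.bits &= ~(1 << cell)
```

For example, after `bitmap = LockerBitmap(128)` and `bitmap.lock(127)`, the call `bitmap.is_locked(127)` reports the locked state while every other cell remains unlocked.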
Step 400, determining whether the dispatcher executes a read operation according to the locking state of the cell.
Specifically, when the bit value of the cell corresponding to the read request instruction is 0, the cell is unlocked and the scheduler may execute the read request instruction. When the bit value of the cell corresponding to the read request instruction is 1, the cell is in the locked state and the scheduler cannot execute the corresponding read operation.
According to the method for improving the performance of the intelligent network card provided by the present disclosure, the address of the table entry pointed to by the read request instruction of the current processing module is acquired, and a hash value corresponding to the address is determined according to the address of the table entry, so that, owing to the uniqueness of the hash value, the address of each table entry is uniquely represented. The locking state of the cell corresponding to the read request instruction is then determined according to the hash value and a pre-constructed bitmap lock table containing a plurality of cells, so that a lock is applied to the request of each processing module, and whether the scheduler executes the read operation is determined according to the locking state of the cell. By introducing the bitmap lock table structure, the single lock of the prior art is split into a plurality of locks, which reduces the collision probability of each lock and improves the read-write performance of the table entries.
According to the method for improving the performance of the intelligent network card provided by the embodiment of the present disclosure, the determining of the locking state of the cell corresponding to the read request instruction according to the hash value and the pre-constructed bitmap lock table containing a plurality of cells specifically includes:
taking the hash value as a dividend and the depth of the bit map lock table as a divisor to obtain a remainder;
querying the bit value in the cell of the bitmap lock table that corresponds to the remainder;
and determining the locking state of the cell according to the bit value.
Specifically, before each processing module (proc0, proc1, ..., procM) initiates the reading or writing of a table entry, the address of the requested table entry is sent to the AddrHash module. The AddrHash module calculates a hash value from the address by using a hash algorithm and then takes the remainder of the hash value divided by the depth of the bitmap lock table locker-Bitmap, so that the result falls within the range of the bitmap lock table locker-Bitmap.
Suppose that the address of a table entry is converted to the hash value 11111111 (binary) and that the depth of the bitmap lock table locker-Bitmap is 128. Converted to decimal, 11111111 is 255; taking 255 as the dividend and 128 as the divisor gives a remainder of 127, and the bit value of cell 127 in the bitmap lock table locker-Bitmap is then queried according to the remainder 127. If the bit value in the cell is 0, the cell is in the unlocked state; if the bit value in the cell is 1, the cell is in the locked state.
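The arithmetic of this example can be checked with the short sketch below. The function name `cell_index` is hypothetical; the masking shortcut in the last line is a general property of power-of-two divisors and is added here as an observation, not as something stated in the disclosure.

```python
def cell_index(hash_value, depth):
    """Map a hash value to a cell index: the remainder of hash_value divided by depth."""
    return hash_value % depth

hash_value = int("11111111", 2)        # binary 11111111 is decimal 255
index = cell_index(hash_value, 128)    # 255 mod 128

# For a power-of-two depth, the remainder equals the low-order bits of the hash,
# so a divider can be replaced by a simple bit mask.
masked_index = hash_value & (128 - 1)
```

Here `index` and `masked_index` both evaluate to 127, matching the cell queried in the example.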
The method for improving the performance of the intelligent network card provided by the present disclosure, wherein the determining whether the dispatcher executes the read operation according to the locking state specifically includes:
if the locking state of the cell is unlocked, the scheduler sends the read request instruction to the controller and changes the bit value of the cell to the locked state;
and if the locking state of the cell is locked, the scheduler does not send the read request instruction to the controller.
Specifically, with reference to fig. 3, the present disclosure adds a bitmap lock table locker-Bitmap after the scheduler arbitor that schedules the modules. If the cell of the bitmap lock table locker-Bitmap corresponding to the scheduled read request is '0', no other module currently locks these table entries and the current request may be executed: the scheduler issues the request to the controller and sets the corresponding bit in the bitmap lock table locker-Bitmap to '1' to lock the current operation.
If the bit value in the cell of the bitmap lock table locker-Bitmap corresponding to the scheduled read request is already '1', other modules are already operating on these table entries, and the scheduler stops scheduling until the corresponding bit in the bitmap lock table locker-Bitmap is unlocked.
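The check-and-lock decision described above might be sketched as follows. The function `addr_hash` uses a simple multiplicative hash purely as a placeholder, since the disclosure does not specify the hash algorithm of the AddrHash module; `try_schedule_read` and its return convention are likewise assumptions made for illustration.

```python
def addr_hash(address):
    # Placeholder hash (assumption): the disclosure does not name the algorithm.
    return (address * 2654435761) & 0xFFFFFFFF

def try_schedule_read(lock_bits, depth, address):
    """Return (issued, new_lock_bits) for a scheduled read request.

    issued is True when the cell was '0' and the request goes to the controller;
    in that case the cell's bit is set to '1' in the returned bitmap.
    """
    cell = addr_hash(address) % depth
    if (lock_bits >> cell) & 1:
        return False, lock_bits            # cell already '1': request is not sent
    return True, lock_bits | (1 << cell)   # cell was '0': lock it and issue

bits = 0                                   # table initialized to all '0'
issued_first, bits = try_schedule_read(bits, 128, 0x40)
issued_second, bits = try_schedule_read(bits, 128, 0x40)
```

The first request for address 0x40 is issued and locks its cell; the second request for the same address finds the bit already set and is not issued.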
According to the method for improving the performance of the intelligent network card provided by the present disclosure, after the scheduler sends the read request instruction to the controller if the locking state of the cell is unlocked, the method further includes:
the controller reads the data of the table entry corresponding to the cell after receiving the read request instruction from the scheduler, and sends the data of the table entry to the current processing module.
Specifically, after reading the value stored in the memory, the controller sends the value to the processing module that issued the read request instruction, and the read operation is complete.
According to the method for improving the performance of the intelligent network card provided by the present disclosure, after the scheduler does not send the read request instruction to the controller if the locking state of the cell is locked, the method further includes:
adding, by the scheduler, the read request instruction to a waiting table;
and the scheduler continues to schedule the read request instructions of other processing modules in a round-robin (polling) scheduling manner, and determines whether to execute the read operation according to the bit value of the cell corresponding to each of those read request instructions, wherein the read request instructions of the other processing modules include the read request instructions stored in the waiting table.
Specifically, adding the bitmap lock table locker-Bitmap as described above locks each request separately. Although this ensures the correctness of the result, it may reduce the overall read-write performance and, more troublesome, it introduces a head-of-line blocking effect. As shown in fig. 4, assume that the 0th processing module proc0 initiates an operation of reading Entry0, and then the 1st processing module proc1 initiates operations of reading Entry0, Entry1, and Entry2. The read operations of proc1 can only proceed after proc0 writes back the result for Entry0 and unlocks it. In fact, the operations of proc1 reading Entry1 and Entry2 do not conflict with the operation of proc0 reading Entry0, but these subsequent requests are blocked because the request at the head, which reads Entry0, is blocked. The head-of-line blocking effect further reduces the read-write performance of the table entries.
Therefore, in the embodiment of the present disclosure, a waiting table WAITING_TABLE structure is introduced: a request that is blocked at the head is first put into the waiting table WAITING_TABLE so that the subsequent requests are not blocked, thereby improving the overall read-write performance of the table entries.
As shown in fig. 5, when a request from a module is blocked, the scheduler still schedules the request but, instead of sending it to the controller, sends it to the waiting table WAITING_TABLE. The scheduler arbitor is therefore not blocked and can continue to schedule requests from other modules in a round-robin manner. The waiting table WAITING_TABLE is also one of the request sources of the scheduler, and the requests in it are scheduled as well. Thus, a request previously placed in the waiting table WAITING_TABLE because its table entry was locked will eventually still be scheduled. As with the requests of other modules, a request taken from the waiting table WAITING_TABLE is still checked against the lock state in the bitmap lock table locker-Bitmap: if the cell is still locked, the request is dispatched to the waiting table WAITING_TABLE again; otherwise, it is sent to the controller.
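The scenario of fig. 4 and the effect of the waiting table can be simulated with the sketch below. It is a simplified model under stated assumptions: a Python set stands in for the locked cells of the bitmap lock table, `cell_of` replaces the hash and remainder steps with a plain remainder, and `schedule_round` is a hypothetical function performing one round-robin pass in which the waiting table is polled as one more request source.

```python
from collections import deque

DEPTH = 128

def cell_of(address):
    # Stand-in for AddrHash followed by the remainder step (assumption).
    return address % DEPTH

def schedule_round(module_queues, waiting_table, locked_cells):
    """Run one round-robin pass; return the requests sent to the controller."""
    issued = []
    # The waiting table is polled as one more request source after the modules.
    for queue in list(module_queues) + [waiting_table]:
        if not queue:
            continue
        request = queue.popleft()          # FIFO order within each source
        cell = cell_of(request[1])
        if cell in locked_cells:
            waiting_table.append(request)  # park it; the scheduler moves on
        else:
            locked_cells.add(cell)         # lock the cell and issue the read
            issued.append(request)
    return issued

proc0 = deque([("proc0", 0)])
proc1 = deque([("proc1", 0), ("proc1", 1), ("proc1", 2)])
waiting_table = deque()
locked_cells = set()

round1 = schedule_round([proc0, proc1], waiting_table, locked_cells)
round2 = schedule_round([proc0, proc1], waiting_table, locked_cells)
round3 = schedule_round([proc0, proc1], waiting_table, locked_cells)
locked_cells.discard(cell_of(0))           # proc0 writes back and unlocks Entry0
round4 = schedule_round([proc0, proc1], waiting_table, locked_cells)
```

In this model the reads of Entry1 and Entry2 by proc1 are issued in the second and third passes while its read of Entry0 stays parked, and the parked request is issued in the fourth pass, once Entry0 has been unlocked.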
According to the method for improving the performance of the intelligent network card, the reading request instruction stored in the waiting table participates in the polling scheduling in a first-in first-out mode.
Specifically, the waiting table WAITING_TABLE may be implemented as a FIFO (first in, first out) queue, without increasing additional resources. The introduction of the bitmap lock table locker-Bitmap further improves the read-write performance of the table entries, improves the utilization of the memory bus, and ultimately improves the performance of the whole design.
According to the method for improving the performance of the intelligent network card, the table entry writing operation after the table entry reading operation is completed comprises the following steps:
after the processing module updates the read data of the table entry, a write request instruction is initiated;
Acquiring an address of an entry aimed at by the write request instruction;
The controller completes the updated writing operation of the data after receiving the writing request instruction;
and after the writing operation is completed, the scheduler updates the state of the cell corresponding to the writing request instruction in the bitmap lock table to an unlocked state.
Specifically, the processing module writes the updated data back into the table entry after reading the value of the table entry in the read request command and updating the data.
When writing data of an entry back to the entry, a write request instruction needs to be issued, where the write request instruction corresponds to an address of the entry. When the processing module performs read-write operation, the cells in the bit map lock table are still in the locked state, so that the state of the bit map lock table does not need to be judged, the processing module directly sends data to the controller, and the controller performs corresponding data write operation on corresponding table items.
However, after the write operation is completed, the value in the cell of the bitmap lock table locker-Bitmap corresponding to the write operation needs to be changed to 0, so that the cell returns to the unlocked state. There are two specific ways to do this. One is to store separately the position of the cell in the bitmap lock table locker-Bitmap corresponding to the read operation; after the write operation is completed, the cell is located directly from the stored position and changed from the locked state to the unlocked state.
The other is not to store the position of the cell but to recalculate the hash value from the address targeted by the write request instruction, determine the position of the corresponding cell according to the hash value, and then change the value of the cell from 1 to 0, so that the cell is in the unlocked state.
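The two unlocking options can be contrasted in a short sketch. All names here (`addr_hash`, `lock_for_read`, `unlock_with_saved_cell`, `unlock_by_rehash`, and the `saved_cells` dictionary keyed by a request identifier) are hypothetical, and the hash function is again only a placeholder for the unspecified algorithm.

```python
DEPTH = 128

def addr_hash(address):
    # Placeholder hash (assumption): the disclosure does not name the algorithm.
    return (address * 2654435761) & 0xFFFFFFFF

def lock_for_read(lock_bits, address, saved_cells, request_id):
    """Lock the cell for a read and remember its position (needed by option 1)."""
    cell = addr_hash(address) % DEPTH
    saved_cells[request_id] = cell
    return lock_bits | (1 << cell)

def unlock_with_saved_cell(lock_bits, saved_cells, request_id):
    """Option 1: look up the stored cell position and clear its bit."""
    cell = saved_cells.pop(request_id)
    return lock_bits & ~(1 << cell)

def unlock_by_rehash(lock_bits, write_address):
    """Option 2: recompute the cell from the write address and clear its bit."""
    cell = addr_hash(write_address) % DEPTH
    return lock_bits & ~(1 << cell)

saved = {}
locked_bits = lock_for_read(0, 0x1234, saved, "req0")
after_option1 = unlock_with_saved_cell(locked_bits, saved, "req0")
after_option2 = unlock_by_rehash(locked_bits, 0x1234)
```

Both options clear the same bit; the first trades a small amount of storage for avoiding a second hash computation, while the second needs no extra storage.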
For the above method, taking a complete read-write operation as an example, it can be expressed as follows:
Referring to fig. 6, when the read request of the current module is scheduled, the table entry address is first hashed, and the hash result is used as an index to look up the bitmap lock table locker-Bitmap (the table is initialized to all '0'). If the corresponding bit is '1', the table entry to be read is being rewritten by another module, so the read request is put into the waiting table WAITING_TABLE to wait for the next round of scheduling. If the corresponding bit is '0', the table entry to be read is not being processed by any other module, so the scheduler changes the corresponding bit in the bitmap lock table locker-Bitmap to '1' to lock the table entry and at the same time sends the request to the controller, which reads the table entry content. The read request scheduling is then complete.
As shown in fig. 7, when the processing module has updated the table entry content and initiates a write request, the table entry address is first hashed, and the hash result is used as an index to look up the bitmap lock table locker-Bitmap. The write request is sent to the controller, and the content of the corresponding cell in the bitmap lock table locker-Bitmap is changed from '1' to '0' to unlock the table entry. The write request scheduling is then complete.
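Putting the read flow and the write flow together, the following sketch replays the two-module increment example with the lock in place. It is a sequential software model under stated assumptions: `memory`, `schedule_read`, `schedule_write`, and `cell_of` are hypothetical names, and `cell_of` uses a plain remainder in place of the hash step.

```python
from collections import deque

DEPTH = 128
memory = {0: 0}            # address 0 holds Entry0, with Entry0.data = 0
lock_bits = 0              # locker-Bitmap, initialized to all '0'
waiting_table = deque()    # WAITING_TABLE

def cell_of(address):
    # Stand-in for the hash and remainder steps (assumption).
    return address % DEPTH

def schedule_read(module, address):
    """Return the entry data if the read is issued, or None if it is parked."""
    global lock_bits
    cell = cell_of(address)
    if (lock_bits >> cell) & 1:
        waiting_table.append((module, address))  # bit is '1': wait for next round
        return None
    lock_bits |= 1 << cell                       # bit is '0': lock, then read
    return memory[address]

def schedule_write(address, value):
    """Write the updated data, then change the cell from '1' to '0'."""
    global lock_bits
    memory[address] = value
    lock_bits &= ~(1 << cell_of(address))

v0 = schedule_read("proc0", 0)           # issued; Entry0 is now locked
v1 = schedule_read("proc1", 0)           # parked in the waiting table
schedule_write(0, v0 + 1)                # proc0 writes back and unlocks
module, address = waiting_table.popleft()
v1_retry = schedule_read(module, address)  # rescheduled; sees the updated value
schedule_write(address, v1_retry + 1)
final_value = memory[0]
```

With the lock, proc1 reads Entry0 only after proc0 has written back, so `final_value` is 2, the expected result of two increments.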
Referring to fig. 8, the present disclosure provides a device for improving performance of an intelligent network card, including:
the first processing module 81 is configured to obtain an address of an entry pointed to by the read request instruction of the current processing module;
a second processing module 82, configured to determine a hash value corresponding to the address according to the address of the table entry;
a third processing module 83, configured to determine a locking state of a cell corresponding to the read request instruction according to the hash value and a pre-constructed bitmap lock table including a plurality of cells;
A fourth processing module 84 is configured to determine whether the scheduler is performing a read operation based on the locked state of the cell.
Since the apparatus provided in the embodiments of the present disclosure may be used to perform the method described in the above embodiments, the working principle and the beneficial effects thereof are similar, and thus, details will not be described herein, and reference may be made to the descriptions of the above embodiments.
According to the device for improving the performance of the intelligent network card provided by the present disclosure, the address of the table entry pointed to by the read request instruction of the current processing module is acquired, and a hash value corresponding to the address is determined according to the address of the table entry, so that, owing to the uniqueness of the hash value, the address of each table entry is uniquely represented. The locking state of the cell corresponding to the read request instruction is then determined according to the hash value and a pre-constructed bitmap lock table containing a plurality of cells. By introducing the bitmap lock table structure, the single lock of the prior art is split into a plurality of locks, which reduces the collision probability of each lock and improves the read-write performance of the table entries.
According to the device for improving the performance of the intelligent network card provided by the present disclosure, the third processing module 83 is specifically configured for:
taking the hash value as a dividend and the depth of the bit map lock table as a divisor to obtain a remainder;
querying the bit value in the cell of the bitmap lock table that corresponds to the remainder;
and determining the locking state of the cell according to the bit value.
The device for improving the performance of the intelligent network card provided in the present disclosure, wherein the fourth processing module 84 is specifically configured to:
if the locking state of the cell is unlocked, the scheduler sends the read request instruction to the controller and changes the bit value of the cell to the locked state;
and if the locking state of the cell is locked, the scheduler does not send the read request instruction to the controller.
According to the device for improving the performance of the intelligent network card, the fourth processing module is further configured to:
And the controller reads the data of the table entry corresponding to the cell after receiving the reading request instruction of the scheduler, and sends the data of the table entry to the current processing module.
The device for improving the performance of the intelligent network card provided in the present disclosure, wherein the fourth processing module 84 is further configured to:
adding, by the scheduler, the read request instruction to a wait table;
and the scheduler continues to schedule the read request instructions of other processing modules in a round-robin (polling) scheduling manner, and determines whether to execute the read operation according to the bit value of the cell corresponding to each of those read request instructions, wherein the read request instructions of the other processing modules include the read request instructions stored in the waiting table.
According to the device for improving the performance of the intelligent network card, the reading request instruction stored in the waiting table participates in the polling scheduling in a first-in first-out mode.
According to the device for improving the performance of the intelligent network card, which is provided by the disclosure, the device further comprises:
The fifth processing module is used for initiating a write request instruction after the processing module updates the read data of the table entry;
a sixth processing module, configured to obtain an address of an entry targeted by the write request instruction;
The seventh processing module is used for completing the updated writing operation of the data after the controller receives the writing request instruction;
And the eighth processing module is used for updating the state of the cell corresponding to the write request instruction in the bitmap lock table to an unlocked state after the write operation is completed by the scheduler.
Fig. 9 illustrates a physical schematic diagram of an electronic device, which may include a processor (processor) 910, a communication interface (Communications Interface) 920, a memory 930, and a communication bus 940, where the processor 910, the communication interface 920, and the memory 930 perform communication with each other through the communication bus 940, as shown in fig. 9. The processor 910 may call a logic instruction in the memory 930 to execute a method for improving performance of an intelligent network card, where the method includes obtaining an address of an entry pointed by a read request instruction of a current processing module, determining a hash value corresponding to the address according to the address of the entry, determining a locking state of a cell corresponding to the read request instruction according to the hash value and a bit map lock table including a plurality of cells constructed in advance, and determining whether a scheduler executes a read operation according to the locking state of the cell.
Further, the logic instructions in the memory 930 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present disclosure. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
In another aspect, the disclosure further provides a computer program product, where the computer program product includes a computer program stored on a non-transitory computer readable storage medium, where the computer program includes program instructions, when the program instructions are executed by a computer, the computer is capable of executing a method for improving performance of an intelligent network card provided by the methods, where the method includes obtaining an address of a table entry pointed by a read request instruction of a current processing module, determining a hash value corresponding to the address according to the address of the table entry, determining a locking state of a cell corresponding to the read request instruction according to the hash value and a bit map lock table including a plurality of cells constructed in advance, and determining whether a scheduler performs a read operation according to the locking state of the cell.
In still another aspect, the disclosure further provides a non-transitory computer readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the method for improving performance of an intelligent network card provided above, where the method includes obtaining an address of a table entry pointed to by a read request instruction of a current processing module, determining a hash value corresponding to the address according to the address of the table entry, determining a locking state of a cell corresponding to the read request instruction according to the hash value and a pre-constructed bitmap lock table including a plurality of cells, and determining whether a scheduler performs a read operation according to the locking state of the cell.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present disclosure without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that the foregoing embodiments are merely illustrative of the technical solutions of the present disclosure, and not limiting thereof, and although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that modifications may be made to the technical solutions described in the foregoing embodiments or equivalents may be substituted for some of the technical features thereof, and these modifications or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure in essence.