CN103399758B - Hardware-accelerated methods, devices and systems - Google Patents
- Publication number
- CN103399758B CN201110459423.6A
- Authority
- CN
- China
- Prior art keywords
- service
- data processing
- configuration file
- fpga
- performance value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The present invention provides a hardware acceleration method, device and system. The device includes a service monitoring unit, a configuration loading unit and a configuration file storage area. The service monitoring unit is configured to obtain the service request amount of a data processing service and compare it with a request amount threshold, so as to obtain the performance value of the data processing service that matches the service request amount. The configuration file storage area is used to store FPGA configuration files. The configuration loading unit is configured to obtain, according to the obtained performance value, the FPGA configuration file that matches that performance value, and to load the matching FPGA configuration file so as to realize the corresponding hardware acceleration. The invention enables a single hardware acceleration device to adapt automatically to different performance combination requirements, which reduces the production cost of hardware acceleration devices.
Description
Technical Field
The present invention relates to storage technologies, and in particular, to a hardware acceleration method, apparatus, and system.
Background
There are many computationally intensive processes in current data processing. For example, in a storage system, data deduplication (abbreviated as "deduplication") and redundant data compression (abbreviated as "compression") are both commonly used and effective data reduction technologies. Both deduplication and compression involve a large amount of computation-intensive processing, such as data blocking, hash calculation on block data, and hash value comparison. This processing occupies considerable processor resources and may affect the performance of other services. To reduce the dependency of computationally intensive processing on the processor, hardware accelerators are currently used to assist the processor in performing computations. A prior-art hardware accelerator may be, for example, a hardware accelerator card with a field-programmable gate array (FPGA) chip as its core, and the FPGA hardware accelerator card is used to implement hardware acceleration of deduplication and compression processing.
However, the inventor has found the following technical defect in the current FPGA approach: the FPGA uses a fixed configuration file and can only realize the logic function corresponding to that configuration file, so the FPGA can only provide the one performance combination of functions such as deduplication and compression that the configuration file defines. For example, suppose the processing resources in the FPGA include 1000 logic units and, according to the configuration file, the FPGA allocates 200 logic units to the deduplication function and 800 logic units to the compression function. This is a performance combination with a deduplication-to-compression ratio of 1:4, where the performance combination refers to the ratio of the accelerator card's processing resources occupied by each function.
However, in practical applications, different users and different application environments call for different performance combinations of these functions. For example, a user may have heavier deduplication requirements, so that the deduplication function needs more resources than the compression function (for example, 800 logic units for deduplication and 200 logic units for compression); the design of the prior-art accelerator card clearly cannot meet such a requirement. It would be possible to produce accelerator cards for various performance combinations to meet different user requirements (each card still using a fixed configuration file and corresponding to only one performance combination), but this inevitably increases the production and management costs of the hardware acceleration device, and when a user's application environment and performance combination requirements change, an accelerator card matching the changed requirements has to be purchased again.
Disclosure of Invention
The first aspect of the present invention is to provide a hardware acceleration device, so as to automatically adapt to different performance combination requirements by using a hardware acceleration device, thereby reducing the production cost of the hardware acceleration device.
Another aspect of the present invention is to provide a hardware acceleration method to automatically adapt to different performance combination requirements by a hardware acceleration apparatus, so as to reduce the production cost of the hardware acceleration apparatus.
Another aspect of the present invention is to provide a hardware acceleration system to automatically adapt to different performance combination requirements by a hardware acceleration device, so as to reduce the production cost of the hardware acceleration device.
The hardware acceleration device provided by the invention comprises: the system comprises a service monitoring unit, a configuration loading unit, a Field Programmable Gate Array (FPGA) and a configuration file storage area;
the service monitoring unit is used for respectively acquiring service request quantities corresponding to at least two data processing services, and comparing the service request quantities with preset request quantity thresholds corresponding to the data processing services to obtain performance values of the data processing services corresponding to the service request quantities;
the configuration file storage area is used for storing FPGA configuration files of the field programmable gate array, the FPGA configuration files comprise configuration files respectively corresponding to different data processing services, and the configuration files of each data processing service comprise configuration files respectively corresponding to different service performance values;
and the configuration loading unit is used for acquiring an FPGA configuration file matched with the performance value of the data processing service according to the acquired performance value of the data processing service matched with the service request amount, and loading the matched FPGA configuration file so as to realize hardware acceleration corresponding to the FPGA configuration file.
The hardware acceleration method provided by the invention comprises the following steps:
acquiring a service request quantity of a data processing service, and comparing the service request quantity with a preset request quantity threshold corresponding to the data processing service to obtain a performance value of the data processing service matched with the service request quantity;
acquiring an FPGA configuration file matched with the performance value of the data processing service according to the acquired performance value of the data processing service matched with the service request amount, and loading the matched FPGA configuration file to realize hardware acceleration corresponding to the FPGA configuration file; the FPGA configuration file comprises configuration files respectively corresponding to different data processing services, and the configuration file of each data processing service comprises configuration files respectively corresponding to different service performance values.
The hardware acceleration system provided by the invention comprises: a Field Programmable Gate Array (FPGA) and the hardware acceleration device of the invention.
The hardware accelerating device has the technical effects that: by setting a service monitoring unit, a configuration loading unit and the like, the service monitoring unit obtains a corresponding performance value according to the acquired request quantity and instructs the configuration loading unit to load an FPGA configuration file corresponding to the performance value, hardware acceleration corresponding to the performance combination determined by the performance value can be realized, the service monitoring unit can monitor a service request in real time and load the corresponding configuration file in real time for adjustment, the problem that a hardware acceleration device cannot meet different performance combination requirements of a user is solved, the hardware acceleration device can automatically adapt to different performance combination requirements, and the production cost of the hardware acceleration device is reduced.
The hardware acceleration method has the technical effects that: the corresponding performance value is obtained according to the acquired request quantity, the FPGA configuration file corresponding to the performance value is loaded, hardware acceleration corresponding to the performance combination determined by the performance value can be realized, the service request can be monitored in real time, the corresponding configuration file can be loaded in real time for adjustment, the problem that the hardware acceleration device cannot meet different performance combination requirements of a user is solved, the hardware acceleration device can automatically adapt to different performance combination requirements, and the production cost of the hardware acceleration device is reduced.
The hardware acceleration system has the technical effects that: the corresponding performance value is obtained according to the acquired request quantity, the FPGA configuration file corresponding to the performance value is loaded, hardware acceleration corresponding to the performance combination determined by the performance value can be realized, the service request can be monitored in real time, the corresponding configuration file can be loaded in real time for adjustment, the problem that the hardware acceleration device cannot meet different performance combination requirements of a user is solved, the hardware acceleration device can automatically adapt to different performance combination requirements, and the production cost of the hardware acceleration device is reduced.
Drawings
FIG. 1 is a schematic diagram of a hardware acceleration device according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a hardware acceleration device according to another embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a hierarchical design of configuration files in another embodiment of a hardware acceleration apparatus according to the present invention;
FIG. 4 is a schematic diagram of a hardware acceleration device according to another embodiment of the present invention;
FIG. 5 is a diagram illustrating an application of another embodiment of the hardware acceleration device according to the present invention;
FIG. 6 is a flowchart illustrating a hardware acceleration method according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating an application scenario of the hardware acceleration device according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating another application scenario of the hardware acceleration device according to an embodiment of the present invention.
Detailed Description
Example one
Fig. 1 is a schematic structural diagram of a hardware acceleration device according to an embodiment of the present invention, in which the hardware acceleration device is a hardware acceleration card with an FPGA as a core.
As shown in fig. 1, the hardware acceleration apparatus of the present embodiment may include: a service monitoring unit 11, a configuration loading unit 12 and a configuration file storage area 13, wherein:
a configuration file storage area 13, configured to store a plurality of FPGA configuration files, where the FPGA configuration files include configuration files respectively corresponding to different data processing services, and the configuration file of each data processing service includes configuration files respectively corresponding to different service performance values;
The performance value of a data processing service means, for example, that 100MB of original data can be processed in 1 second, or that 200MB of original data can be processed in 1 second. An FPGA configuration file corresponding to a performance value is a file containing FPGA configuration data such that, if the FPGA is configured according to that data, the performance value of the data processing service can be achieved, for example, processing 100MB of original data in 1 second for the compression service. To obtain this performance, the FPGA needs to allocate its own processing resources. For example, suppose 1 compression channel occupies 100 logic resources and reaches a processing performance of only 20MBps. If only 100 of the FPGA's 1000 logic resources are used for the compression service, only 20MBps can be achieved; according to the configuration file corresponding to the 100MBps performance value, the FPGA instead allocates 500 logic resources to the compression service, which gives 5 compression channels, that is, a processing performance of 100MBps.
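The channel arithmetic in the preceding paragraph can be sketched as follows. This is only an illustration: the three constants are the figures quoted in the example above, while the function name and the rounding-up rule for targets that are not a multiple of 20MBps are assumptions.

```python
import math

# Figures from the example in the text; the names are illustrative.
LOGIC_PER_CHANNEL = 100   # logic resources occupied by one compression channel
MBPS_PER_CHANNEL = 20     # processing performance of one channel, in MBps
TOTAL_LOGIC = 1000        # logic resources available in the FPGA


def resources_for(target_mbps):
    """Return (channels, logic_resources) needed to reach target_mbps."""
    # Assumption: a partial channel cannot be allocated, so round up.
    channels = math.ceil(target_mbps / MBPS_PER_CHANNEL)
    logic = channels * LOGIC_PER_CHANNEL
    if logic > TOTAL_LOGIC:
        raise ValueError("target exceeds the FPGA's logic resources")
    return channels, logic


channels, logic = resources_for(100)
print(channels, logic)  # 5 channels occupying 500 logic resources
```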
The plurality of FPGA configuration files comprise configuration files respectively corresponding to different data processing services, and the configuration file of each data processing service comprises configuration files respectively corresponding to different service performance values; the plurality of profiles may correspond to a plurality of different data processing services, for example, a profile corresponding to a deduplication service, a profile corresponding to a compression service, and the like; for the same data processing service, files corresponding to different performance values of the service are also included, for example, there are a plurality of profiles corresponding to the deduplication service, including a profile corresponding to a first performance value (100 MB deduplication service is processed in 1 second), a profile corresponding to a second performance value (200 MB deduplication service is processed in 1 second), and so on.
A service monitoring unit 11, configured to obtain a service request amount of a data processing service, and compare the service request amount with a preset request amount threshold corresponding to the data processing service, to obtain a performance value of the data processing service matching the service request amount;
optionally, for example, at least two data processing services may be processed simultaneously on the hardware accelerator card of the FPGA, where the data processing services include, for example, deduplication, compression, data blocking, hashing, and the like, and the hardware accelerator card of the FPGA has functions of deduplication, compression, data blocking, hashing, and the like.
Specifically, the service monitoring unit 11 may obtain the service request amount of the data processing service, where the service request amount refers to the amount of data requested for deduplication, compression, and the like. For example, if 10 compression requests are received in 1 second and those 10 requests together ask for 50MB of data to be compressed, the request amount of the compression service is 50MB. The service monitoring unit 11 compares the service request amount with a preset request amount threshold corresponding to the data processing service. The request amount threshold may be preset in the service monitoring unit 11 and may be a single value or a range. For example, if the preset compression service request amount threshold is 80MB, the 50MB request amount is compared with 80MB; alternatively, when the threshold is a range, for example 60MB-80MB, the 50MB request amount is compared with that range.
Specifically, through the comparison, the service monitoring unit 11 can obtain the performance value of the data processing service corresponding to the service request amount. For example, suppose the configuration file initially adopted for the compression service corresponds to an 80MB performance value. By monitoring and analyzing the compression service request amount in real time, the unit finds that the current compression request amount has dropped to 50MB, below the set request amount threshold. This indicates that the current configuration file provides more performance than is needed, so the processing performance of the service should be reduced to lower power consumption, and the reduced performance value closest to the service request amount is adopted as the performance value corresponding to the service request amount. In this way the service monitoring unit 11 determines the performance value of the data processing service corresponding to the service request amount.
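As a rough sketch of the comparison just described, the request amount can be summed over a sampling window and then checked against either a single threshold or a range. The function names, the "below / within / above" labels and the choice of ten equal 5MB requests are assumptions made for illustration; the text only gives the totals.

```python
def request_amount(request_sizes_mb):
    """Total data amount (MB) requested within one sampling window."""
    return sum(request_sizes_mb)


def compare(amount_mb, threshold):
    """Compare an amount with a single threshold or a (low, high) range.

    Returns "below", "within" or "above".
    """
    if isinstance(threshold, tuple):
        low, high = threshold
    else:
        low = high = threshold
    if amount_mb < low:
        return "below"
    if amount_mb > high:
        return "above"
    return "within"


# Ten compression requests in one second, assumed to be 5 MB each: 50 MB in total.
amount = request_amount([5] * 10)
print(amount, compare(amount, 80), compare(amount, (60, 80)))
```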
It should be noted that the performance value "closest" to the service request amount in the embodiment of the present invention includes the performance value "same" as the service request amount.
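One possible reading of "closest", including the exact-match case noted above, is a nearest-value search over the performance values for which configuration files exist. The list of available values and the tie-breaking rule (prefer the higher value) are assumptions; the text does not say how ties are resolved.

```python
def closest_performance_value(amount_mb, available):
    """Pick the available performance value nearest to the request amount.

    An exact match counts as closest. Ties go to the higher value
    (an assumption; the text does not specify tie handling).
    """
    return min(available, key=lambda v: (abs(v - amount_mb), -v))


available = [20, 50, 80]  # hypothetical compression performance values, in MBps
print(closest_performance_value(50, available))  # exact match
print(closest_performance_value(65, available))  # equidistant from 50 and 80
```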
And the configuration loading unit 12 is configured to obtain an FPGA configuration file matched with the performance value of the data processing service according to the obtained performance value of the data processing service matched with the service request volume, and load a field programmable gate array FPGA configuration file corresponding to the performance value, so as to realize hardware acceleration corresponding to the FPGA configuration file.
Specifically, after the service monitoring unit 11 determines the performance value of the data processing service corresponding to the service request amount, the configuration loading unit 12 may obtain the configuration file to be loaded according to that performance value and load it. For example, the configuration file corresponding to a performance value of 100MB per second for the deduplication service is loaded. Loading a configuration file realizes hardware acceleration at the service performance value corresponding to that file. For example, suppose the FPGA hardware accelerator card needs to implement a compression service and a deduplication service, and loads the configuration file corresponding to a 100MBps performance value of the compression service and the configuration file corresponding to a 300MBps performance value of the deduplication service. The FPGA then sets its logic functions according to these configuration files and achieves the corresponding performance: after configuration, the FPGA hardware accelerator card provides a compression service at 100MBps and a deduplication service at 300MBps, so that when the card is used for hardware acceleration of data processing, the processing performance combination of the compression service and the deduplication service is 1:3.
As can be seen from the above analysis, the service monitoring unit 11 may determine the service performance value corresponding to the request amount by monitoring the service request amount of the data processing service in real time, and the configuration loading unit 12 can load the configuration file corresponding to that performance value, thereby adjusting the function of the FPGA hardware accelerator card in real time. For example, when a change in the user's application environment changes the user's processing performance combination requirements for different services, the FPGA hardware accelerator card of this embodiment may detect the change in user service requests in real time through its service monitoring unit 11, configure its logic functions by the above method, and finally implement hardware acceleration corresponding to the user's performance combination requirements. That is, it can adapt automatically to changes in performance combination requirements and adjust smoothly without interrupting service. Moreover, automatically adapting to changes in performance combination requirements can reduce the power consumption of the FPGA hardware accelerator card. All of this is achieved with a single hardware accelerator card, which reduces production and management costs compared with the multiple accelerator cards of the prior art.
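The lookup from performance value to configuration file, and the resulting performance combination, might be sketched as below. The store layout and the file names are hypothetical; only the 100MBps and 300MBps figures and the 1:3 combination come from the example above.

```python
from functools import reduce
from math import gcd

# Hypothetical store: (service, performance value in MBps) -> configuration file name.
CONFIG_STORE = {
    ("compression", 100): "compress_100.bit",
    ("compression", 200): "compress_200.bit",
    ("deduplication", 100): "dedup_100.bit",
    ("deduplication", 300): "dedup_300.bit",
}


def select_configs(performance_values):
    """Map each service's chosen performance value to its configuration file."""
    return [CONFIG_STORE[(svc, perf)] for svc, perf in performance_values.items()]


def performance_combination(performance_values):
    """Reduce the performance values to their simplest integer ratio."""
    values = list(performance_values.values())
    divisor = reduce(gcd, values)
    return tuple(v // divisor for v in values)


chosen = {"compression": 100, "deduplication": 300}
print(select_configs(chosen))
print(performance_combination(chosen))  # (1, 3), i.e. a 1:3 combination
```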
The hardware acceleration device of the embodiment of the invention can be applied to the field of data reduction of the storage system, can also be applied to operation-intensive applications such as data exclusive-or operation, data encryption and the like in the storage system, and can also be applied to other non-storage systems which need to perform hardware acceleration of various algorithms. As for the FPGA hardware accelerator card, there are two types in practical application, that is, a partially reconfigurable FPGA and a non-partially reconfigurable FPGA, and the structure and the working principle of the two types of FPGA hardware accelerator devices are described in detail with two embodiments.
Example two
Fig. 2 is a schematic structural diagram of another embodiment of the hardware acceleration apparatus of the present invention. This embodiment explains the structure and working principle of a partially reconfigurable FPGA hardware acceleration apparatus, both of which are shown in fig. 2. A partially reconfigurable FPGA is one in which part of the configuration loaded into the FPGA can be reconfigured on its own; for example, if the FPGA holds the configuration files corresponding to service A and service B, the configuration file of service B alone can be replaced.
First, the configuration file stored in the configuration file storage area 13 of the FPGA of the hardware accelerator will be described. A plurality of configuration files corresponding to different performance values of different data processing services are stored in the configuration file storage area 13, that is, a plurality of configuration files are available for loading and configuration, whereas the hardware acceleration apparatus in the prior art can only adopt a configuration file in a fixed form.
Specifically, referring to fig. 3, fig. 3 is a schematic diagram of a configuration file hierarchy design in another embodiment of the hardware acceleration apparatus of the present invention. The files in the configuration file storage area are clearly divided by functional module and may include, for example, five functional modules A, B, C, D and E, where A denotes a compression function module for processing compression services, B denotes a deduplication function module for processing deduplication services, C denotes a block function module for processing data blocking services, and so on. The functional modules are not directly coupled, that is, they are independent of each other. Each functional module includes a plurality of configuration files corresponding to different service processing performance values. For example, the deduplication function module B includes four configuration files B1, B2, B3 and B4, each corresponding to a different deduplication performance value: B1 corresponds to 100MB of deduplication data processed per second, B2 to 200MB per second, B3 to 300MB per second, and so on. After loading a configuration file, the FPGA accelerator card has the corresponding service processing performance. Which functional modules are provided, and which performance values have configuration files in each module, can be decided by the manufacturer of the hardware accelerator according to actual use requirements and is not limited herein.
In the embodiment corresponding to fig. 3, each configuration file loaded by the configuration loading unit corresponds to one performance value, and each performance value corresponds to one data processing service. Accordingly, when the configuration loading unit loads the matched FPGA configuration files, it loads the sub-configuration files that correspond to the respective data processing services and match the obtained performance values of those services.
Referring to fig. 2, three kinds of function modules A, B and C are placed in the FPGA configuration file storage area of fig. 2, and each function module includes configuration files corresponding to three performance values. It should be noted that A1, A2, B1, B2, etc. are in fact each only one configuration unit within a complete FPGA configuration file. For example, if the FPGA needs to load A2 and C3, the combination of A2 and C3 is equivalent to a complete FPGA configuration file; the FPGA is fully configured once A2 and C3 are loaded and then has the performance combination determined by A2 and C3, while A2 or C3 alone is only part of the complete configuration file. In the embodiment of the present invention, however, to simplify the description, A2 and C3 are each called configuration files, and the combination of the two is also called a configuration file. In addition, the files for different performance values in each function module are arranged in a progressive hierarchy; for example, A1 corresponds to a 100MBps performance value of the A function, A2 to a 300MBps performance value, and A3 to a 500MBps performance value. That is, this embodiment takes as an example the case in which performance values increase with the sequence number.
In the embodiment corresponding to fig. 2, a configuration file loaded by the configuration loading unit is a combination file corresponding to at least two performance values, where each performance value corresponds to one data processing service. Accordingly, when the configuration loading unit loads the matched FPGA configuration file, it loads the combination file that corresponds to the data processing services and matches the obtained performance values of those services.
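The hierarchy and the idea of a complete configuration as a combination of per-module units could be modelled roughly as follows. Module A's performance values are those given in the text; module C's values, the dictionary layout and all function names are hypothetical. The sketch also shows the partial-reconfiguration case in which only one module's unit is swapped.

```python
# Performance values (MBps) per level. Module A's figures come from the text;
# module C's are hypothetical.
LEVELS = {
    "A": {1: 100, 2: 300, 3: 500},
    "C": {1: 100, 2: 200, 3: 400},
}


def unit_name(module, level):
    """Name of one configuration unit, e.g. 'A2'."""
    if level not in LEVELS[module]:
        raise KeyError(f"no level {level} for module {module}")
    return f"{module}{level}"


def reconfigure(loaded, module, level):
    """Return a new combination in which only `module` is replaced."""
    updated = dict(loaded)
    updated[module] = level
    return updated


def describe(loaded):
    """List the units that together form the complete configuration."""
    return [unit_name(m, lvl) for m, lvl in sorted(loaded.items())]


loaded = {"A": 2, "C": 3}
print(describe(loaded))                       # ['A2', 'C3']
print(describe(reconfigure(loaded, "A", 3)))  # ['A3', 'C3']
```

Keeping the loaded state as one level per module makes the "replace only service B's file" case a single-key update, which is the property the partially reconfigurable embodiment relies on.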
In addition, the "fixed function" and the "basic function architecture" in fig. 2 are only some basic functional configurations for implementing FPGA configuration and logic functions, and are not described herein again. In this embodiment, the setting positions of the service monitoring unit, the configuration loading unit, and the configuration file storage area may be flexibly set, for example, in fig. 2, the service monitoring unit 11 and the configuration loading unit 12 are both set on the FPGA; the FPGA hardware accelerator device generally comprises an FPGA hardware accelerator card and an accelerator card driving/managing unit, wherein the FPGA hardware accelerator card comprises the FPGA; for example, the service monitoring unit and the configuration loading unit may be disposed on the FPGA, may also be disposed in an area outside the FPGA of the FPGA hardware accelerator card, or may also be disposed on the accelerator card driving/managing unit; the configuration file storage area may be set in an area outside the FPGA of the FPGA hardware accelerator card, or may be set on other storage devices, such as a magnetic disk, as long as the FPGA can access the storage area of the configuration file. The hardware accelerator in the embodiment of the present invention has flexible setting positions of each functional unit, and is not strictly limited, as long as the function of adaptively adjusting the performance combination of the present invention can be realized.
The configuration flow for performing hardware acceleration by using the FPGA hardware acceleration device of the present embodiment is as follows: referring to fig. 4, fig. 4 is a schematic structural diagram of another embodiment of the hardware acceleration device of the present invention. The traffic monitoring unit 11 of the hardware acceleration apparatus may include a comparison subunit 111, a first processing subunit 112, and a second processing subunit 113. The comparing subunit 111 may obtain a service request amount received by the hardware acceleration apparatus, and compare the service request amount with a preset request amount threshold, where the request amount threshold corresponding to each data processing service may be different, for example, for a compressed service, the request amount threshold may be preset to be 80 MB; the threshold may also be a range of values, such as 60MB-80 MB.
The first processing subunit 112 is configured to, when the service request amount is higher than the request amount threshold, indicate that performance of the service processing needs to be improved, adopt an improved performance value closest to the service request amount as the performance value matched with the service request amount; a second processing subunit 113, configured to, when the service request amount is lower than the request amount threshold, indicate that performance of the service processing needs to be reduced, adopt a reduced performance value closest to the service request amount as the performance value corresponding to the service request amount. Specifically, the performance value after the increase or the performance value after the decrease may be implemented in two ways, for example, for the implementation of the performance value after the increase, assuming that the current FPGA hardware acceleration device uses a configuration file corresponding to a 50MBps performance value of a compression service, and after the request amount comparison, it determines that the service performance of the compression service is 150MBps, and the configuration file corresponding to the compression service includes the 50MBps performance value, the 100MBps performance value, and the 150MBps performance value, at this time, the configuration file of the 50MBps performance value may be directly replaced with the configuration file of the 150M performance value (which is equivalent to directly loading a high performance configuration file), or the configuration file of the 100MBps performance value may also be added on the basis of the 50MBps performance value (which is equivalent to adding a service channel of the compression service). Similarly, for the implementation of the reduced performance value, the two manners described above may also be adopted, that is, directly adopting the performance value of the lower level, or reducing the service channel.
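The two ways of raising performance in the example above (direct replacement versus adding a channel) can be contrasted in a short sketch. Representing the loaded state as a list of performance values, one per loaded configuration file, is an assumption made for illustration, as are the function names.

```python
def replace_config(target, available):
    """Way 1: load the single configuration file that matches the target."""
    if target not in available:
        raise ValueError(f"no configuration file for {target} MBps")
    return [target]


def add_channel(loaded, target, available):
    """Way 2: keep what is loaded and add one file covering the shortfall."""
    shortfall = target - sum(loaded)
    if shortfall <= 0:
        return list(loaded)
    if shortfall not in available:
        raise ValueError(f"no configuration file for {shortfall} MBps")
    return list(loaded) + [shortfall]


available = [50, 100, 150]  # compression performance values from the text, in MBps
print(replace_config(150, available))     # [150]
print(add_channel([50], 150, available))  # [50, 100]
```

Both results deliver 150MBps in total; they differ only in whether the existing 50MBps channel stays loaded while the change is made.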
It should be noted that, when performance is adjusted according to the request amount comparison, the performance value is usually increased or decreased step by step. After each step, it is determined whether the performance matches the request amount; if not, the performance continues to be increased or decreased until it does. The number of loaded configuration files is not limited, as long as the preset performance can be achieved. In addition, multiple services are generally monitored at the same time, so the performance value finally selected for each data processing service must take the comparison results of all services into account. For example, suppose both the compression service and the blocking service need higher performance: the compression service needs 200MBps and the blocking service needs 500MBps. The total processing resource capacity of the FPGA hardware accelerator must then be considered. If raising both services as required would exceed the 600MBps processing resource capacity of the hardware accelerator, priorities may be used. Assuming that the blocking service is set to high priority, its performance requirement is met first and its performance is raised to 500MBps; the remaining 100MBps of system resources is allocated to the low-priority compression service, whose 200MBps performance requirement cannot be met at this time.
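The priority-based allocation in this example can be sketched as a simple greedy pass over the services. The function name, the dictionary layout, and the numeric priority scale are assumptions introduced for illustration; the patent only states that higher-priority services are satisfied first.

```python
def allocate_by_priority(demands, priorities, capacity):
    """Grant each service its demand (MBps) in priority order.

    Assumptions: a larger number in `priorities` means higher priority,
    and `capacity` is the total processing capacity of the accelerator.
    """
    remaining = capacity
    granted = {}
    for service in sorted(demands, key=lambda s: priorities[s], reverse=True):
        # A service gets what it asks for, limited by what is left.
        granted[service] = min(demands[service], remaining)
        remaining -= granted[service]
    return granted


# Figures from the text: 600MBps capacity, blocking has high priority.
result = allocate_by_priority(
    demands={"compression": 200, "blocking": 500},
    priorities={"blocking": 2, "compression": 1},
    capacity=600,
)
```

Here `result` grants 500MBps to the blocking service and the remaining 100MBps to the compression service. A real device would additionally have to round each grant down to a performance value for which a configuration file exists; the sketch leaves that step out.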
As can be seen from the above analysis, the service monitoring unit 11 can monitor service requests in real time, and the corresponding configuration file can be loaded in real time to make adjustments. For example, the service monitoring unit 11 determines the service performance value corresponding to the request amount by monitoring the service request amount of the data processing service in real time, and the configuration loading unit 12 retrieves the configuration file corresponding to that performance value from the configuration file storage area 13 and loads it into the FPGA, thereby adjusting the function of the FPGA hardware accelerator card in real time. When a change in the user's application environment changes the user's processing performance combination requirements for different services, the FPGA hardware accelerator card of this embodiment can detect the change in user service requests in real time through its own service monitoring unit, configure its own logic function by the above method, and finally provide hardware acceleration that corresponds to the user's performance combination requirements. In other words, it adapts automatically to changes in performance combination requirements and adjusts smoothly without interrupting service. Moreover, automatically adapting to changes in performance combination requirements can reduce the power consumption of the FPGA hardware accelerator card. Because these functions are provided by a single hardware accelerator card, production and management costs are lower than with the multiple kinds of accelerator cards used in the prior art.
The hardware acceleration device of this embodiment provides the corresponding hardware acceleration by means of the service monitoring unit and the configuration loading unit: the service monitoring unit obtains the performance value corresponding to the acquired request amount, and the configuration loading unit loads the FPGA configuration file corresponding to that performance value. This solves the problem that a hardware acceleration device cannot meet a user's different performance combination requirements, allows the device to adapt automatically to different performance combination requirements, and reduces the production cost of the hardware acceleration device.
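The cooperation between the service monitoring unit, the configuration loading unit, and the configuration file storage area can be illustrated with a small software model. Everything in it is an assumption made for the sketch: the class name, the file names, the rule that a level "matches" when it is the lowest one covering the request amount, and the use of a log in place of actually programming an FPGA.

```python
class HardwareAcceleratorModel:
    """Toy model of monitor -> loader -> configuration file storage."""

    def __init__(self, config_store, levels):
        self.config_store = config_store  # (service, MBps) -> file name
        self.levels = sorted(levels)      # available performance values
        self.current = {}                 # service -> loaded MBps level
        self.load_log = []                # stand-in for FPGA programming

    def _match_level(self, service, request_amount):
        # Start from the loaded level and move one step at a time.
        idx = self.levels.index(self.current.get(service, self.levels[0]))
        while idx + 1 < len(self.levels) and self.levels[idx] < request_amount:
            idx += 1  # step up until the level covers the request amount
        while idx > 0 and self.levels[idx - 1] >= request_amount:
            idx -= 1  # step down while a lower level still covers it
        return self.levels[idx]

    def on_request_sample(self, service, request_amount):
        level = self._match_level(service, request_amount)
        if self.current.get(service) != level:
            # Fetch the matching file from storage and "load" it.
            self.load_log.append(self.config_store[(service, level)])
            self.current[service] = level
        return level


store = {
    ("compression", 50): "compression_50.bit",    # hypothetical file names
    ("compression", 100): "compression_100.bit",
    ("compression", 150): "compression_150.bit",
}
card = HardwareAcceleratorModel(store, [50, 100, 150])
samples = [card.on_request_sample("compression", amount) for amount in (40, 120, 120, 60)]
```

A configuration file is fetched only when the matched level changes, so the repeated 120MBps sample above does not trigger a second load.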
Example three
Fig. 5 is a schematic application diagram of a hardware acceleration apparatus according to another embodiment of the present invention. It illustrates the structure and operating principle of an FPGA hardware acceleration apparatus that is not partially reconfigurable. For example, if such an FPGA holds configuration files corresponding to service A and service B, the configuration covering both services can only be replaced as a whole; a part of it, for example the file corresponding to service A, cannot be replaced on its own.
The principle of the hardware acceleration apparatus of this embodiment is substantially the same as that of the embodiment shown in fig. 2 to 4 (hereinafter referred to as the previous embodiment), so this embodiment is described only briefly, with the focus on its differences from the previous embodiment. As shown in fig. 5, the main feature of this hardware acceleration apparatus is that the structure of the configuration files stored in the configuration file storage area differs from that of the previous embodiment.
Specifically, each configuration file in the previous embodiment corresponds to a single performance value of a single data processing service; for example, the configuration file A1 corresponds to a 100MBps performance value of the data processing service A (compression service). The configuration file loaded in this embodiment is a combination file covering at least two data processing services, in which each data processing service is configured with one of its performance values. For example, the "functional performance combination 2" shown in fig. 5 may correspond to the combination of A2 and C3 in fig. 2: it includes the two functions (that is, data processing services) A and C, with performance value A2 for A and C3 for C. The configuration loading unit loads "functional performance combination 2" as a whole, rather than loading A2 and C3 separately as in the previous embodiment. For example, if the service request comparison shows that the compression service needs a performance of 100MBps and the blocking service needs a performance of 200MBps, "functional performance combination 2" may be selected, where A2 in the combination corresponds to the compression service at 100MBps and C3 corresponds to the blocking service at 200MBps.
In addition, the combined configuration file structure of this embodiment may also be applied to the previous embodiment; that is, a partially reconfigurable FPGA hardware accelerator may also use functional performance combination files. However, because the FPGA of this embodiment cannot be partially reconfigured, the per-service configuration file structure of the previous embodiment cannot be applied to this embodiment.
In the hardware acceleration device of this embodiment, the service monitoring unit obtains the performance value corresponding to the acquired request amount and instructs the configuration loading unit to load the FPGA configuration file corresponding to that performance value, so that hardware acceleration corresponding to the performance combination determined by the performance values is provided. The service monitoring unit can also monitor service requests in real time so that the corresponding configuration file is loaded in real time to make adjustments. This solves the problem that a hardware acceleration device cannot meet a user's different performance combination requirements, allows the device to adapt automatically to different performance combination requirements, and reduces the production cost of the hardware acceleration device.
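Selecting a combination file for a non-partially-reconfigurable FPGA can be sketched as a lookup over the stored combinations. The table layout and function name are assumptions for illustration; "functional performance combination 2" uses the values from the text, while "functional performance combination 1" is an invented entry added only to make the lookup non-trivial.

```python
# Each combination file fixes one performance value (MBps) per service.
COMBINATIONS = {
    "functional performance combination 1": {"compression": 50, "blocking": 100},   # invented
    "functional performance combination 2": {"compression": 100, "blocking": 200},  # from the text
}


def select_combination(required, combinations):
    """Return the name of the combination file matching every required value."""
    for name, performance in combinations.items():
        if performance == required:
            return name
    return None  # no stored combination matches these requirements


chosen = select_combination({"compression": 100, "blocking": 200}, COMBINATIONS)
```

The sketch requires an exact match. How a device should behave when no stored combination matches exactly is not specified in the text, so the function simply returns `None` in that case.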
Example four
Fig. 6 is a flowchart of an embodiment of a hardware acceleration method according to the present invention. As shown in fig. 6, the method may include:
601. Acquire a service request amount of a data processing service, and compare the service request amount with a preset request amount threshold corresponding to the data processing service to obtain a performance value of the data processing service matched with the service request amount.
When the service request amount is compared with the preset request amount threshold corresponding to the data processing service, if the service request amount is higher than the request amount threshold, an improved performance value is adopted as the performance value corresponding to the service request amount; if the service request amount is lower than the request amount threshold, a reduced performance value is adopted as the performance value matched with the service request amount.
More specifically, if the service request amount is higher than the request amount threshold, the improved performance value closest to the service request amount is adopted as the performance value matched with the service request amount; if the service request amount is lower than the request amount threshold, the reduced performance value closest to the service request amount is adopted as the performance value corresponding to the service request amount.
602. Acquire an FPGA configuration file matched with the performance value of the data processing service according to the obtained performance value of the data processing service matched with the service request amount, and load the matched FPGA configuration file to provide the hardware acceleration corresponding to the FPGA configuration file.
If the FPGA cannot be partially reconfigured, loading the matched FPGA configuration file includes: loading a configuration file that is a combination file corresponding to at least two data processing services and one performance value of each of those data processing services, where the performance value of each data processing service in the combination file corresponds to the service request amount determined by the service monitoring unit.
If the FPGA is partially reconfigurable, loading the matched FPGA configuration file includes: loading the configuration files corresponding to the different data processing services separately, where each loaded configuration file corresponds to one performance value of one data processing service.
The method of this embodiment may be executed, for example, by the hardware acceleration apparatus of any embodiment of the present invention; for the specific principles, refer to the apparatus embodiments.
According to the hardware acceleration method of this embodiment, the performance value corresponding to the acquired request amount is obtained and the FPGA configuration file corresponding to that performance value is loaded, so that hardware acceleration corresponding to the performance combination determined by the performance values is provided. Service requests can be monitored in real time and the corresponding configuration file loaded in real time to make adjustments. This solves the problem that a hardware acceleration device cannot meet a user's different performance combination requirements, allows the device to adapt automatically to different performance combination requirements, and reduces the production cost of the hardware acceleration device.
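The difference between the two loading cases in step 602 can be sketched as a single dispatch function. The function name, the two lookup tables, and the file names are assumptions for illustration only; the returned list stands in for the files that would be written to the FPGA.

```python
def files_to_load(performance, partially_reconfigurable,
                  per_service_files, combination_files):
    """Return the configuration files to load for the matched performance values.

    Assumptions: `performance` maps each service to its matched MBps value,
    `per_service_files` is keyed by (service, value), and `combination_files`
    is keyed by the frozen set of all (service, value) pairs.
    """
    if partially_reconfigurable:
        # One file per data processing service, loaded separately.
        return [per_service_files[(service, value)]
                for service, value in sorted(performance.items())]
    # A single combination file covering all services at once.
    return [combination_files[frozenset(performance.items())]]


per_service = {
    ("compression", 100): "compression_100.bit",  # hypothetical file names
    ("blocking", 200): "blocking_200.bit",
}
combined = {
    frozenset({("compression", 100), ("blocking", 200)}): "combination_2.bit",
}
matched = {"compression": 100, "blocking": 200}
separate = files_to_load(matched, True, per_service, combined)
whole = files_to_load(matched, False, per_service, combined)
```

In the partially reconfigurable case two per-service files are returned, so one service can be changed without touching the other; in the other case a single combination file replaces the whole configuration.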
Example five
Fig. 7 is a schematic view of an application scenario of the hardware acceleration device according to an embodiment of the present invention, and fig. 8 is a schematic view of another application scenario. This embodiment briefly describes scenarios in which the hardware acceleration apparatus of any embodiment of the present invention may be applied; actual use is not limited to these two scenarios.
Fig. 7 shows a typical way of performing deduplication/compression inside a storage device. The working process is as follows: the application server stores data through a storage service interface provided by the storage device; the storage service interface passes the data to the deduplication/compression application module, which stores the data in system memory; the deduplication/compression application module calls an interface provided by the accelerator card driver to instruct the hardware accelerator card to perform data blocking, hashing, compression, and other processing on the data stored in system memory; the deduplication/compression application module then reads the result data processed by the accelerator card, compares it with the data already stored on disk, and writes only the new, non-duplicate data to disk.
Fig. 8 shows a typical way of performing deduplication/compression during link replication transmission between storage devices. The working process is as follows: the link replication application in storage system A reads the data to be transmitted from disk into memory and calls the deduplication/compression application module to reduce the data; the deduplication/compression application module calls the accelerator card driver to instruct the hardware accelerator card to perform data blocking, hashing, compression, and other processing on the data stored in system memory; the link replication application module passes the reduced data from memory to the link transmission module, which transmits it to the remote storage system B; the corresponding link replication application on storage system B stores the reduced data on its disk.
With the hardware acceleration device of this embodiment, the performance value corresponding to the acquired request amount is obtained and the FPGA configuration file corresponding to that performance value is loaded, so that hardware acceleration corresponding to the performance combination determined by the performance values is provided. Service requests can be monitored in real time and the corresponding configuration file loaded in real time to make adjustments. This solves the problem that a hardware acceleration device cannot meet a user's different performance combination requirements, allows the device to adapt automatically to different performance combination requirements, and reduces the production cost of the hardware acceleration device.
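The blocking, hashing, compression, and comparison steps that the accelerator card offloads in this scenario can be illustrated in plain software. This is a sketch under stated assumptions, not the patented hardware path: fixed-size blocks, SHA-256 digests, zlib compression, and an in-memory dictionary standing in for the disk are all choices made for the example.

```python
import hashlib
import zlib


def deduplicate_and_compress(data, store, block_size=4096):
    """Split data into blocks and store only blocks not seen before.

    Assumption: `store` maps a block digest to the compressed block and
    plays the role of the data already written to disk.
    """
    recipe = []      # ordered digests needed to rebuild `data`
    new_blocks = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]     # data blocking
        digest = hashlib.sha256(block).hexdigest()   # hashing
        if digest not in store:                      # compare with stored data
            store[digest] = zlib.compress(block)     # compression
            new_blocks += 1
        recipe.append(digest)
    return recipe, new_blocks


def restore(recipe, store):
    """Rebuild the original data from its recipe."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


disk = {}
payload = b"a" * 4096 + b"b" * 4096 + b"a" * 4096
recipe, written = deduplicate_and_compress(payload, disk)
```

The payload contains three blocks but only two distinct ones, so only two compressed blocks are written; writing the same payload again adds nothing to the store.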
Example six
The embodiment of the invention also provides a hardware acceleration system, which includes a field programmable gate array (FPGA) and the hardware acceleration device of any embodiment of the invention. For the specific working principle, refer to the apparatus and method embodiments; it is not described again here.
The service monitoring unit, the configuration loading control unit, or the FPGA configuration file storage area of the hardware acceleration device may be arranged on the FPGA or outside the FPGA.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above. The storage medium includes any medium that can store program code, such as ROM, RAM, or a magnetic or optical disk.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, and some or all of their technical features may be replaced by equivalents, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. A hardware acceleration apparatus, comprising: the system comprises a service monitoring unit, a configuration loading unit and a configuration file storage area;
the service monitoring unit is used for acquiring a service request quantity of a data processing service, and comparing the service request quantity with a preset request quantity threshold corresponding to the data processing service to obtain a performance value of the data processing service matched with the service request quantity;
the configuration file storage area is used for storing FPGA configuration files of the field programmable gate array, the FPGA configuration files comprise configuration files respectively corresponding to different data processing services, and the configuration files of each data processing service comprise configuration files respectively corresponding to different service performance values;
the configuration loading unit is used for acquiring an FPGA configuration file matched with the performance value of the data processing service according to the acquired performance value of the data processing service matched with the service request amount, and loading the matched FPGA configuration file to realize hardware acceleration corresponding to the FPGA configuration file;
each configuration file is a combination file corresponding to at least two performance values, wherein each performance value corresponds to a data processing service, and correspondingly:
and the configuration loading unit loads the matched FPGA configuration file, and specifically comprises loading a combined file which corresponds to the data processing service and is matched with the performance value of the data processing service.
2. The hardware acceleration apparatus of claim 1, wherein the service monitoring unit comprises:
a comparison subunit, configured to compare the service request amount with a preset request amount threshold corresponding to the data processing service;
a first processing subunit, configured to, when the service request amount is higher than the request amount threshold, adopt the improved performance value closest to the service request amount as the performance value matched with the service request amount;
and a second processing subunit, configured to, when the service request amount is lower than the request amount threshold, adopt the reduced performance value closest to the service request amount as the performance value corresponding to the service request amount.
3. A method for hardware acceleration, comprising:
acquiring a service request quantity of a data processing service, and comparing the service request quantity with a preset request quantity threshold corresponding to the data processing service to obtain a performance value of the data processing service matched with the service request quantity;
acquiring an FPGA configuration file matched with the performance value of the data processing service according to the acquired performance value of the data processing service matched with the service request amount, and loading the matched FPGA configuration file to realize hardware acceleration corresponding to the FPGA configuration file; the FPGA configuration file comprises configuration files respectively corresponding to different data processing services, and the configuration file of each data processing service comprises configuration files respectively corresponding to different service performance values;
each configuration file is a combination file corresponding to at least two performance values, wherein each performance value corresponds to a data processing service, and the loading of the matched FPGA configuration file comprises the following steps:
and loading a combined file which corresponds to the data processing service and is matched with the acquired performance value of the data processing service.
4. The hardware acceleration method of claim 3, wherein the comparing the service request amount with a preset request amount threshold corresponding to the data processing service to obtain a performance value of the data processing service matching the service request amount comprises:
comparing the service request quantity with a preset request quantity threshold corresponding to the data processing service;
if the service request quantity is higher than the request quantity threshold, adopting the improved performance value closest to the service request quantity as the performance value matched with the service request quantity;
and if the service request quantity is lower than the request quantity threshold, adopting the reduced performance value closest to the service request quantity as the performance value corresponding to the service request quantity.
5. A hardware acceleration system, comprising: a field programmable gate array FPGA, and the hardware acceleration apparatus of any one of claims 1-2.
6. The hardware acceleration system of claim 5, wherein a service monitoring unit, a configuration loading control unit, or an FPGA configuration file storage area in the hardware acceleration device is disposed on the FPGA or outside the FPGA.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110459423.6A CN103399758B (en) | 2011-12-31 | 2011-12-31 | Hardware-accelerated methods, devices and systems |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103399758A (en) | 2013-11-20 |
CN103399758B (en) | 2016-11-23 |
Family
ID=49563392
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201110459423.6A Active CN103399758B (en) | 2011-12-31 | 2011-12-31 | Hardware-accelerated methods, devices and systems |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103399758B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105577801B (en) * | 2014-12-31 | 2019-01-11 | 华为技术有限公司 | A kind of business accelerating method and device |
CN104899085B (en) | 2015-05-29 | 2018-06-26 | 华为技术有限公司 | A kind of data processing method and device |
CN108073423B (en) | 2016-11-09 | 2020-01-17 | 华为技术有限公司 | Accelerator loading method and system and accelerator loading device |
CN108062239B (en) * | 2016-11-09 | 2020-06-16 | 华为技术有限公司 | Accelerator loading method and system and accelerator loading device |
CN106777729A (en) * | 2016-12-26 | 2017-05-31 | 中核控制系统工程有限公司 | A kind of algorithms library simulation and verification platform implementation method based on FPGA |
CN108319563B (en) * | 2018-01-08 | 2020-01-03 | 华中科技大学 | Network function acceleration method and system based on FPGA |
CN110334801A (en) * | 2019-05-09 | 2019-10-15 | 苏州浪潮智能科技有限公司 | A kind of hardware-accelerated method, apparatus, equipment and the system of convolutional neural networks |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1553502A2 (en) * | 2003-10-17 | 2005-07-13 | Kabushiki Kaisha Toshiba | Reconfigurable signal processing module |
CN101286738A (en) * | 2008-05-15 | 2008-10-15 | 华为技术有限公司 | Method, device and system for loading logic files based on equipment information |
CN101441574A (en) * | 2007-11-20 | 2009-05-27 | 中兴通讯股份有限公司 | Multiple-FPGA logical loading method in embedded system |
CN101452502A (en) * | 2008-12-30 | 2009-06-10 | 华为技术有限公司 | Method for loading on-site programmable gate array FPGA, apparatus and system |
CN102147735A (en) * | 2010-02-10 | 2011-08-10 | 华为技术有限公司 | Interface single board and business logic loading method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7878902B2 (en) * | 2003-07-16 | 2011-02-01 | Igt | Secured verification of configuration data for field programmable gate array devices |
Also Published As
Publication number | Publication date |
---|---|
CN103399758A (en) | 2013-11-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103399758B (en) | Hardware-accelerated methods, devices and systems | |
US10993127B2 (en) | Network slice instance management method, apparatus, and system | |
US9225668B2 (en) | Priority driven channel allocation for packet transferring | |
CN108989238A (en) | Method for distributing service bandwidth and related equipment | |
US7779175B2 (en) | System and method for rendezvous in a communications network | |
WO2011088767A1 (en) | Content delivery method, system and schedule server | |
CN102426552A (en) | Storage system service quality control method, device and system | |
US20140036680A1 (en) | Method to Allocate Packet Buffers in a Packet Transferring System | |
US9985893B2 (en) | Load sharing method and apparatus, and board | |
CA2718291A1 (en) | System and method for memory allocation in embedded or wireless communication systems | |
US11758532B2 (en) | Systems and methods for application aware slicing in 5G layer 2 and layer 1 using fine grain scheduling | |
CN108924203B (en) | Data copy self-adaptive distribution method, distributed computing system and related equipment | |
US10587526B2 (en) | Federated scheme for coordinating throttled network data transfer in a multi-host scenario | |
CN116301598A (en) | Setting method and device of OP of SSD and storage medium | |
US20110083136A1 (en) | Distributed processing system | |
US20120222030A1 (en) | Lazy resource management | |
US20230421513A1 (en) | Storage devices, methods of operating storage devices, and streaming systems including storage devices | |
KR20180050180A (en) | Data management system and method for distributed data processing | |
CN117978743A (en) | Equipment network transmission control method, device, system and storage medium | |
CN102647352B (en) | Message forwarding method and device as well as communication equipment | |
CN101360325B (en) | Method and apparatus for ground resource and wireless resource combined management | |
CN113258679A (en) | Power grid monitoring system channel distribution method based on server instance capacity reduction | |
CN116668379B (en) | Data transmission method and system, FDS management module, storage medium and electronic device | |
EP4439267A1 (en) | Data processing system and method and device | |
US12126502B1 (en) | Configurable quality of service provider pipeline |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20220927 Address after: No. 1899 Xiyuan Avenue, high tech Zone (West District), Chengdu, Sichuan 610041 Patentee after: Chengdu Huawei Technologies Co.,Ltd. Address before: 611731 Qingshui River District, Chengdu hi tech Zone, Sichuan, China Patentee before: HUAWEI DIGITAL TECHNOLOGIES (CHENG DU) Co.,Ltd. |