CN118520922A - Compression method and decompression method of deep learning model based on serialization - Google Patents
- Publication number
- CN118520922A (application CN202410797825.4A)
- Authority
- CN
- China
- Prior art keywords
- operation unit
- file
- deep learning
- learning model
- character string
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The disclosure provides a compression method for a deep learning model, relating to the field of artificial intelligence and, in particular, to the field of deep learning. A specific implementation scheme is as follows: acquire an initial file of a deep learning model, wherein the deep learning model comprises an operation unit, and in the initial file the operation unit and the type elements and attribute elements describing it are in an intermediate representation form; perform text serialization on the initial file to obtain a text file of the deep learning model, wherein in the text file the operation unit, the type elements, and the attribute elements are in string form; and perform binary serialization on the string-form operation unit, type elements, and attribute elements in the text file to obtain a compressed file of the deep learning model. The disclosure also provides a decompression method, a decompression apparatus, an electronic device, and a storage medium for the deep learning model.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence and, in particular, to the field of deep learning. More specifically, the present disclosure provides a compression method for a deep learning model, as well as a decompression method, an apparatus, an electronic device, a storage medium, and a computer program product for the deep learning model.
Background
In a deep learning framework, in order to ensure that the deep learning model structure can be completely stored and loaded so as to support fine-tuning and inference of the model, the necessary information of the model structure needs to be stored, while the storage space and the confidentiality of the stored result also need to be considered.
Disclosure of Invention
The present disclosure provides a compression method for a deep learning model, as well as a decompression method, an apparatus, an electronic device, a storage medium, and a computer program product for the deep learning model.
According to a first aspect, there is provided a compression method for a deep learning model, the method comprising: acquiring an initial file of the deep learning model, wherein the deep learning model comprises an operation unit, and in the initial file the operation unit and the type elements and attribute elements describing it are in an intermediate representation form; performing text serialization on the initial file to obtain a text file of the deep learning model, wherein in the text file the operation unit, the type elements, and the attribute elements are in string form; and performing binary serialization on the string-form operation unit, type elements, and attribute elements in the text file to obtain a compressed file of the deep learning model.
According to a second aspect, there is provided a decompression method for a deep learning model, the method comprising: acquiring a compressed file of the deep learning model, wherein the compressed file is obtained according to the compression method of the deep learning model, and the deep learning model comprises an operation unit; acquiring a mapping table between the string identifiers and the binary identifiers of the operation unit, the type elements, and the attribute elements; and decompressing the compressed file into an initial file according to the mapping table between string identifiers and binary identifiers.
According to a third aspect, there is provided a compression apparatus for a deep learning model, the apparatus comprising: an initial file acquisition module, configured to acquire an initial file of the deep learning model, wherein the deep learning model comprises an operation unit, and in the initial file the operation unit and the type elements and attribute elements describing it are in an intermediate representation form; a text file determination module, configured to perform text serialization on the initial file to obtain a text file of the deep learning model, wherein in the text file the operation unit, the type elements, and the attribute elements are in string form; and a compressed file determination module, configured to perform binary serialization on the string-form operation unit, type elements, and attribute elements in the text file to obtain a compressed file of the deep learning model.
According to a fourth aspect, there is provided a decompression apparatus for a deep learning model, the apparatus comprising: a compressed file acquisition module, configured to acquire a compressed file of the deep learning model, wherein the compressed file is obtained by the compression apparatus of the deep learning model, the deep learning model comprises an operation unit, and in the compressed file the operation unit and the type elements and attribute elements describing it are in binary form; a mapping table acquisition module, configured to acquire the mapping tables between the string identifiers and the binary identifiers of the operation unit, the type elements, and the attribute elements; and a decompression module, configured to decompress the compressed file into an initial file according to the mapping tables between string identifiers and binary identifiers.
According to a fifth aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method provided in accordance with the present disclosure.
According to a sixth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method provided according to the present disclosure.
According to a seventh aspect, there is provided a computer program product comprising a computer program stored on at least one of a readable storage medium and an electronic device, which, when executed by a processor, implements a method provided according to the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an exemplary system architecture to which a compression method and decompression method of a deep learning model may be applied, according to one embodiment of the present disclosure;
FIG. 2 is a flow chart of a method of compression of a deep learning model according to one embodiment of the present disclosure;
FIG. 3A is a schematic diagram of a text serialization rule for a type element according to one embodiment of the present disclosure;
FIG. 3B is a schematic diagram of a text serialization rule for an attribute element according to one embodiment of the present disclosure;
FIG. 3C is a schematic diagram of a text serialization rule for an operation unit according to one embodiment of the present disclosure;
FIG. 4 is a flow chart of a method of compression of a deep learning model according to another embodiment of the present disclosure;
FIG. 5A is a schematic diagram of a string identification and binary identification mapping table for a dialect according to one embodiment of the present disclosure;
FIG. 5B is a schematic diagram of a mapping table of string identifications and binary identifications of type elements and attribute elements according to one embodiment of the present disclosure;
FIG. 5C is a schematic diagram of a string identification and binary identification mapping table of an operation unit according to one embodiment of the present disclosure;
FIG. 6 is a flow chart of a method of decompression of a deep learning model according to one embodiment of the present disclosure;
FIG. 7 is a flow chart of a method of decompression of a deep learning model according to another embodiment of the present disclosure;
FIG. 8 is a block diagram of a compression apparatus of a deep learning model according to one embodiment of the present disclosure;
FIG. 9 is a block diagram of a decompression apparatus of a deep learning model according to one embodiment of the present disclosure; and
Fig. 10 is a block diagram of an electronic device of at least one of a compression method and a decompression method of a deep learning model according to one embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Serialization is the process of converting an object or data structure into a transmissible or storable format, while deserialization is the process of restoring data in that format to the original object or data structure. A serialization protocol is a convention that specifies how data structures or objects are converted into a format that can be transmitted or stored. Deserialization is the inverse of serialization: during deserialization, the data must be parsed and restored according to the protocol adopted during serialization, so as to ensure the integrity and consistency of the data.
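As an informal illustration (not part of the claimed scheme — JSON is used here purely as a stand-in for a serialization protocol), a minimal serialize/deserialize round trip might look like:

```python
import json

def serialize(obj):
    # Convert an in-memory structure into a storable, transmissible string,
    # following a fixed protocol (here: JSON with sorted keys).
    return json.dumps(obj, sort_keys=True)

def deserialize(text):
    # Restore the original structure by parsing according to the same protocol.
    return json.loads(text)

model = {"ops": [{"id": "pd_op.full", "attrs": {"value": 1.5}}]}
restored = deserialize(serialize(model))
assert restored == model  # integrity and consistency are preserved
```

The round-trip assertion at the end is exactly the integrity/consistency requirement the protocol must guarantee.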
In a deep learning framework, in order to ensure that the model structure can be completely stored and loaded to support fine-tuning and inference of the model, a reasonable serialization protocol can be designed to store the necessary information of the model structure. To meet different storage-space requirements, the model structure can be serialized into files with different compression levels for storage.
Serialization protocols of deep learning frameworks in the industry fall into two types: binary schemes and text schemes.
A binary scheme converts the model structure into binary data for storage, which saves storage space. However, current binary schemes require defining the model structure within the framework as a specific text structure, converting that text structure with a tool into a data structure supported by the tool, and then converting the data structure into binary data. That is, this approach requires an extra conversion of intermediate data by means of a tool, and a reverse conversion of the intermediate data during deserialization, which increases the cost of serialization and deserialization.
A text scheme maps the model structure within the framework to text. Meanwhile, a deserialization rule from text back to the model structure needs to be established, and the structure in the framework is restored according to that rule. Text schemes guarantee high readability of files, but typically occupy a relatively large amount of storage. Moreover, a text serialization protocol has low confidentiality and is easy to tamper with.
FIG. 1 is a schematic diagram of an exemplary system architecture to which the compression method and decompression method of a deep learning model may be applied, according to one embodiment of the present disclosure. It should be noted that FIG. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied, to assist those skilled in the art in understanding the technical content of the present disclosure; it does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.
As shown in FIG. 1, the system architecture 100 of this embodiment may include a deep learning framework 110 and a storage device 120. The deep learning framework 110 is a tool for building deep learning models that can help developers construct and train deep learning models quickly. For example, the deep learning framework 110 provides the basic components of a deep learning model, and a user can obtain the desired deep learning model 111 by selecting and assembling these basic components. The deep learning model 111 constructed by the deep learning framework 110 may be in the form of an intermediate representation (Intermediate Representation, IR).
The deep learning framework 110 also includes a serialization module 112 and a deserialization module 113. The serialization module 112 is configured to serialize the deep learning model 111; the serialization may convert the intermediate representation of the model structure into string or binary form, obtaining a compressed model file. Storing the compressed model file in the storage device 120 can reduce storage space. Model files in the storage device 120 may be used for fine-tuning and inference.
The deep learning framework 110 can also read the compressed model file from the storage device 120 and perform deserialization through the deserialization module 113, so as to recover the deep learning model 111 in its intermediate representation form, thereby ensuring the integrity and consistency of the model structure.
Fig. 2 is a flow chart of a method of compression of a deep learning model according to one embodiment of the present disclosure.
As shown in FIG. 2, the compression method 200 of the deep learning model includes steps S210 to S230.
In step S210, an initial file of the deep learning model is acquired.
For example, the initial file of the deep learning model may be generated when the deep learning model is built with the deep learning framework. The initial file may be a program (Program), and the structure of the deep learning model may include the program (Program), regions (Region), blocks (Block), and operation units (Operation). The program is divided into a plurality of regions, each region including at least one block, and each block including at least one operation unit. The operation unit is the basic structural unit of the deep learning model. An operation unit includes inputs, outputs, and parameters, which are described by values (Value), type elements (Type), and attribute elements (Attribute).
In the initial file, each structure of the model is in an intermediate representation (IR) form, and the model structure may be referred to as the IR structure. In one example, the deep learning framework is the PaddlePaddle framework (Paddle), and the intermediate representation may be PIR (Paddle Intermediate Representation, the new intermediate representation).
In the IR structure, the operation units, type elements, and attribute elements occur frequently, come in many varieties, and are related to IR extension, so they have a large influence on the IR structure of the model. Regions and blocks are few in number, fixed in kind, and do not change as the IR is extended.
In one example, the initial file of the deep learning model is a C++ program, and the operation units, type elements, and attribute elements are objects (i.e., IR) in the C++ program.
In step S220, the initial file is text-serialized to obtain a text file of the deep learning model.
To convert the initial file into a storable, transmissible data format, the initial file may be text-serialized, i.e., the IR structure is converted into a text structure. Specifically, the intermediate representation of each structure in the model can be converted into text form.
For example, the operation units, type elements, and attribute elements in IR form in the initial file are converted into string form. The regions and blocks in IR form in the initial file are also converted into string form.
The operation units, type elements, and attribute elements are IR organized by a dialect system (Dialect), which distinguishes different kinds of operation units. Therefore, the operation units, type elements, and attribute elements can be string-encoded based on the dialect to which each operation unit belongs, yielding string identifiers for the operation units, type elements, and attribute elements.
For the regions and blocks, according to the hierarchical relationship among regions, blocks, and operation units, the string identifiers of the operation units, type elements, and attribute elements may be added into the regions and blocks, and the regions and blocks themselves may be string-encoded in order of appearance to obtain the string identifiers of the regions and blocks.
In step S230, binary serialization is performed on the operation units, the type elements and the attribute elements in the form of character strings in the text file, so as to obtain a compressed file of the deep learning model.
To further reduce the storage space occupied by the text file and enhance confidentiality, the text file may be further binary-serialized to obtain a binary file. However, a completely binary file would impair the readability of the file.
In view of this, embodiments of the present disclosure binary-serialize the string identifiers of the operation units, type elements, and attribute elements — the objects that occur frequently, come in many varieties, and are related to the IR structure — to obtain their binary identifiers. Other objects (e.g., the string identifiers of regions and blocks) that occur less frequently, are fixed in kind, and do not change with IR extension need not be binary-serialized, but remain in string form to preserve readability.
Because the operation units, type elements, and attribute elements are numerous in kind and occur frequently, binary-serializing their string identifiers can greatly compress the text file and reduce the space occupied by file storage, while the binary format gives the file higher confidentiality.
Because the operation units, type elements, and attribute elements are related to the model structure/IR structure, binary-serializing their string identifiers makes the binary compression result map layer by layer onto the IR structure, so the text file can be converted directly into a binary file, avoiding conversion through other intermediate data and reducing serialization/deserialization and compression/decompression costs.
Because the operation units, type elements, and attribute elements are related to IR extension, binary-serializing their string identifiers allows the binary compression result to change as the IR structure is extended, which facilitates extension of the deep learning model.
According to the embodiments of the present disclosure, the IR-form initial file of the deep learning model is text-serialized into a text file, and the string identifiers of the operation units, type elements, and attribute elements in the text file are binary-serialized to obtain the compressed file.
It should be noted that the initial file and the compressed file differ in form. The initial file is a program, and the objects in the program are intermediate representations. The compressed file is a transmissible, storable file whose objects are in string or binary form. For example, the operation units, type elements, and attribute elements in the compressed file are in binary form, while the objects other than these are in string form.
According to an embodiment of the present disclosure, step S220 includes: performing string encoding on the operation units, type elements, and attribute elements in intermediate representation form to obtain their respective string identifiers; and, according to the structural relationship among regions, blocks, and operation units, adding the string identifiers of the operation units, type elements, and attribute elements into the operation units, regions, and blocks, and encoding the string identifiers of the regions and blocks in order, thereby obtaining the text file of the deep learning model.
The operation units, type elements, and attribute elements are IR organized by the dialect system. The dialect system contains a plurality of dialects, each of which defines its own operation units, type elements, and attribute elements; the various kinds of operation units in a deep learning model may come from several different dialects. Therefore, the operation units of the deep learning model can be string-encoded based on the identifier of the dialect to which they belong. For example, the dialect identifier and the unit's own identifier may be spliced to obtain a customized string identifier.
For example, for an operation unit named (or identified as) full whose dialect identifier is pd_op, the string encoding of the operation unit may be "pd_op.full".
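This splicing rule can be sketched as follows (the function name and the "." separator are illustrative assumptions based on the "pd_op.full" example):

```python
def string_id(dialect: str, name: str) -> str:
    # Splice the dialect identifier with the unit's own identifier,
    # e.g. dialect "pd_op" + name "full" -> "pd_op.full".
    return f"{dialect}.{name}"

assert string_id("pd_op", "full") == "pd_op.full"
```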
The type element Type and the attribute element Attribute can also be string-encoded according to preset encoding rules to obtain their respective string identifiers.
Fig. 3A is a schematic diagram of a text serialization rule for a type element according to one embodiment of the present disclosure.
Fig. 3B is a schematic diagram of a text serialization rule for an attribute element according to one embodiment of the present disclosure.
The type element Type and attribute element Attribute shown in FIGS. 3A-3B may be PIR under the PaddlePaddle framework (Paddle). Type elements are classified into valued and unvalued types, and attribute elements are likewise classified into valued and unvalued attributes. A valued Type/Attribute structure needs to store a value in addition to its identifier, and the range of the value can include C++ standard library data. The serialization rule for Type/Attribute is a mapping from IR to a string identifier (Id).
As shown in FIG. 3A, the unvalued types may include int8Type, float16Type, etc. The "Id" of these unvalued types is the string identifier encoded according to the custom encoding rules. The valued types may include DenseTensorType, which includes a value "Content" in addition to the string identifier (i.e., Id).
As shown in FIG. 3B, the valued attributes include IndexAttribute, ArrayAttribute, TypeAttribute, and the like. The Ids of the valued attributes are string identifiers encoded according to the custom encoding rules. A valued Attribute includes a value "Content" in addition to the string identifier (i.e., Id).
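The valued/unvalued distinction of FIGS. 3A-3B can be sketched as follows (the helper name and dictionary layout are illustrative assumptions, mirroring the Id/Content fields described above):

```python
def encode_element(elem_id, content=None):
    # Unvalued Type/Attribute: only the string identifier "Id" is stored.
    # Valued Type/Attribute: a "Content" field is stored alongside "Id".
    entry = {"Id": elem_id}
    if content is not None:
        entry["Content"] = content
    return entry

assert encode_element("int8Type") == {"Id": "int8Type"}
assert encode_element("DenseTensorType", content=[2, 3]) == {
    "Id": "DenseTensorType", "Content": [2, 3]}
```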
Next, the string identifiers of the operation units, type elements, and attribute elements may be added into each hierarchical structure according to the hierarchy of the deep learning model and the hierarchical and nested relationships among regions, blocks, and operation units. The hierarchical relationship of the deep learning model structure is shown in Table 1 below.
TABLE 1
Key (Key) | Value (Value)
"Program" | {"ModuleOp": ...}
"ModuleOp" | {"Regions": []}
"Regions" | [{Region0}, {Region1}]
{Region} | {"Id": "RegionId_x", "Blocks": []}
{Block} | {"Id": "BlockId_x", "BlockArgs": [], "Ops": []}
"Ops" | [{op0}, {op1}]
Op | {"Id": "xxx", "OpOperands": [{value}], "OpResults": [{value}], "Attr": [{attribute}], "OpResultsAttr": [{attribute}]}
As shown in Table 1, keys (Key) and values (Value) form dictionary pairs characterizing the relationships between structures. The Program is the outermost unit of the file; each file contains exactly one. The Program contains the units expressing the model structure: regions, blocks, and operation units, with the operation unit as the basic structural unit. Regions, blocks, and operation units may also be nested within one another; for example, a region contains multiple blocks, and a block may in turn nest another region.
The Program includes a region list Regions containing a plurality of regions (Region0, Region1), each region including its own region identifier (Id) and a block list Blocks containing a plurality of blocks. Each block includes its own block identifier (Id) and an operation unit list Ops containing a plurality of operation units (op0, op1). Each operation unit includes its own operation unit identifier Id, inputs (OpOperands), outputs (OpResults), attributes (Attr), and output attributes (OpResultsAttr).
The inputs and outputs are described by values (Value); each includes a list of values. These input/output values are unrelated to the Value column of Table 1. Each value in the list contains its own identifier (Id), which may be an int64 value, and a type (Type), which may be a type element as shown in FIG. 3A.
Attr holds the attributes of an operation unit (not to be confused with attribute elements) and is a list describing those attributes. Each attribute contains a name (Name), a string expressing the attribute's name, and a type (Type), which may be an attribute element as shown in FIG. 3B.
OpResultsAttr holds optional attributes of OpResults; since they are not parameters necessary for deserialization, they may be stored separately, and some of them may not be stored at all.
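The hierarchy of Table 1 can be assembled as nested dictionaries; the constructor names below are hypothetical, but the key names follow Table 1:

```python
def make_op(op_id, operands=(), results=(), attrs=()):
    # An operation unit: identifier, inputs, outputs, attributes,
    # and the optional output result attributes.
    return {"Id": op_id, "OpOperands": list(operands),
            "OpResults": list(results), "Attr": list(attrs),
            "OpResultsAttr": []}

def make_block(idx, ops):
    return {"Id": f"BlockId_{idx}", "BlockArgs": [], "Ops": ops}

def make_region(idx, blocks):
    return {"Id": f"RegionId_{idx}", "Blocks": blocks}

def make_program(regions):
    # Each file contains exactly one Program, the outermost unit.
    return {"Program": {"ModuleOp": {"Regions": regions}}}

program = make_program(
    [make_region(0, [make_block(0, [make_op("pd_op.full")])])])
```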
Fig. 3C is a schematic diagram of a text serialization rule for an operation unit according to one embodiment of the present disclosure.
As shown in FIG. 3C, "pd_op.full" is the string identifier of the operation unit "full", where "pd_op" is the dialect identifier. The operation unit has no inputs; its output value includes an Id, the type element "DenseTensorType", and the contents of that type element. "Attr" is the attribute list of the operation unit, containing attribute names, the attribute elements "FloatAttribute" and "ArrayAttribute", and the values of those attribute elements. "OpResultsAttr" is the output result attribute, including the attribute name, the attribute element, and its value.
Based on the serialization rules shown in FIGS. 3A-3C, string identifiers can be obtained for the operation units, type elements, and attribute elements. Then, according to the hierarchical relationship of operation units, regions, and blocks, the string identifiers of the type elements and attribute elements can be added into the operation units, and the string identifiers of the operation units can be added into the blocks and regions, so that the intermediate representations of most objects in the initial file are encoded into string form.
In the initial file, objects other than the operation units, type elements, and attribute elements, such as blocks and regions, may be string-encoded sequentially in order of appearance, for example by combining the order of appearance with the block's or region's own identifier. In this way, the intermediate representations of the initial file can all be encoded into string form.
According to the embodiments of the present disclosure, the intermediate representations of the operation units, type elements, and attribute elements are each encoded into their respective string identifiers, and these string identifiers are then assembled according to the hierarchical relationships of the deep learning model structure to obtain the text file in string form, so that the string serialization result corresponds one-to-one with the model structure.
Fig. 4 is a flow chart of a method of compression of a deep learning model according to another embodiment of the present disclosure.
As shown in FIG. 4, the method includes steps S410-S470.
In step S410, the initial file is checked, and it is determined whether the check is passed.
For example, embodiments of the present disclosure are directed to model structures in PIR form; therefore, it may be checked whether the initial file is a file of a PIR model structure. In addition, the initial file may further include a model version, and it may be checked whether the current version of the initial file is the latest version.
If the verification passes, step S420 is executed; otherwise, the flow ends.
In step S420, the initial file is text-serialized to obtain a text file.
For example, the operation units, type elements, and attribute elements in intermediate representation form in the initial file are string-encoded to obtain the operation unit string identifiers, type element string identifiers, and attribute element string identifiers in string form. The blocks and regions other than the operation units are string-encoded in order of appearance to obtain the string identifiers of the blocks and regions.
The operation units, the type elements, the attribute elements, the blocks and the areas in the text file are all in the form of character strings.
In step S430, it is determined whether the text file is compressed.
A text file is already a file that can be stored and transferred. When the readability requirement on the file is high, the text file may be stored directly without compression (step S470).
When the requirement on saving file storage space is high, the text file may be compressed, and the following steps S440 to S470 are performed.
In step S440, the operation unit, the type element and the attribute element in character string form in the text file are binary-serialized, and the mapping tables between character string identifications and binary identifications are stored, so as to obtain a semi-compressed file.
Binary serialization encodes a character string identification into a binary identification. The operation unit, the type element and the attribute element are organized by dialects. The binary identification of the operation unit can be obtained by splicing the binary result of the dialect identification with the binary result of the character string identification of the operation unit. Likewise, the binary identification of the type element can be obtained by splicing the binary result of the dialect identification with the binary result of the character string identification of the type element, and the binary identification of the attribute element can be obtained by splicing the binary result of the dialect identification with the binary result of the character string identification of the attribute element.
The binary encoding may be custom. For example, one byte may be reserved to represent the dialect identification, and several bytes may be reserved after the dialect identification byte to represent the operation unit, the type element or the attribute element.
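As an illustrative sketch only, a custom layout of this kind (one dialect byte followed by a multi-byte element ID) could look like the following. The function names, the big-endian two-byte payload and the exact bit assignments are assumptions made for illustration, not the patent's actual implementation:

```python
def encode_binary_id(dialect_id: int, element_id: int, is_operation: bool) -> bytes:
    """One dialect byte (1 flag bit + 7-bit dialect ID), followed by a
    two-byte element ID, per the hypothetical custom layout sketched above."""
    if not 0 <= dialect_id < 128:
        raise ValueError("dialect ID must fit in 7 bits")
    first_byte = (0x80 if is_operation else 0x00) | dialect_id
    return bytes([first_byte]) + element_id.to_bytes(2, "big")


def decode_binary_id(data: bytes) -> tuple[int, int, bool]:
    """Inverse of encode_binary_id: split the flag bit, dialect ID and element ID."""
    is_operation = bool(data[0] & 0x80)
    dialect_id = data[0] & 0x7F
    element_id = int.from_bytes(data[1:3], "big")
    return dialect_id, element_id, is_operation
```

Because encoding and decoding are exact inverses, the one-to-one mapping between the binary result and the character string identification is preserved, which is what makes later deserialization possible.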
In the binary encoding process, the mapping table between the character string identification and the binary identification of the operation unit, the mapping table between the character string identification and the binary identification of the type element, and the mapping table between the character string identification and the binary identification of the attribute element can be stored. The encodings are thus stored while a one-to-one mapping between the encoding result and the IR is ensured.
In step S450, it is determined whether the file is fully compressed.
Embodiments of the present disclosure may provide multiple compression levels, including half compression and full compression. The file obtained in step S440 is a semi-compressed file, the operation unit, the type element and the attribute element in the semi-compressed file are in binary form, and the other objects are in the form of character strings.
Because the operation units, type elements and attribute elements occur at high frequency, encoding these objects as binary data already reduces the storage space of the file significantly. However, to further reduce the storage space, the semi-compressed file may additionally be fully compressed, that is, step S460 is performed to obtain a fully compressed file.
In step S460, text compression is performed on the character strings other than the character string identifications of the operation unit, the type element, and the attribute element, to obtain a fully compressed file.
The full compression mode may be to text-compress the character strings in the semi-compressed file other than those of the operation unit, the type element and the attribute element (for example, the character string identifications of regions and blocks). Text compression refers to compressing long character strings into short character strings, which further reduces the storage space while preserving a certain readability.
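One minimal way to realize such reversible long-string-to-short-string compression is an alias table. This is a hedged sketch under the assumption that text compression works by aliasing; the "#n" alias format and all names are inventions for illustration, not the patent's scheme:

```python
def compress_strings(identifiers):
    """Replace each distinct long identifier with a short alias ('#0', '#1', ...),
    keeping an alias table so the mapping stays reversible."""
    alias_table = {}
    compressed = []
    for ident in identifiers:
        if ident not in alias_table:
            alias_table[ident] = f"#{len(alias_table)}"
        compressed.append(alias_table[ident])
    return compressed, alias_table


def decompress_strings(compressed, alias_table):
    """Recover the original long identifiers from their short aliases."""
    reverse = {alias: ident for ident, alias in alias_table.items()}
    return [reverse[alias] for alias in compressed]
```

Repeated long identifiers each cost only a few characters after aliasing, while the alias itself remains human-readable text, matching the stated goal of saving space without giving up all readability.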
In step S470, the file is stored.
In an uncompressed scenario, the text file may be stored. In a semi-compressed scenario, a semi-compressed file may be stored. In a full compression scenario, a full compression file may be stored.
The embodiment of the disclosure provides a plurality of compression levels, and compressed files with different compression levels can be obtained, so that the storage requirement and the readability requirement of a user on the deep learning model file are met.
The string identification and binary identification mapping table of the operation unit, the type element, and the attribute element will be described below.
Fig. 5A is a schematic diagram of a string identification and binary identification mapping table for a dialect according to one embodiment of the present disclosure.
As shown in fig. 5A, DialectIdMap is the mapping table between the character string identifications and the binary identifications of dialects. One byte is reserved in the mapping table to represent the binary identification of a dialect. The first bit of the byte may be custom-encoded as 0 or 1 to indicate whether a type element/attribute element or an operation unit is encoded next; for example, 0 indicates that a type element/attribute element follows, and 1 indicates that an operation unit follows. The remaining 7 bits identify different dialects, based on the assumption that the number of dialects to be serialized is less than 128; if the number of dialects exceeds this, an extension code is occupied, and compatibility with version upgrades needs to be recorded.
As shown in fig. 5A, the binary identification of the dialect "Builtin" is 10000000, and it is followed by the binary identification of an operation unit. The binary identifications of the other dialects in the mapping table are similar and will not be described in detail here.
Fig. 5B is a schematic diagram of a mapping table of string identifications and binary identifications of type elements and attribute elements according to one embodiment of the present disclosure.
As shown in fig. 5B, AttrTypeIdMap is the mapping table between the character string identifications and the binary identifications of attribute elements/type elements. The binary encoding of an attribute element/type element is DialectID + AttrTypeID.
Following the DialectID comes the AttrTypeID, which is expressed in two bytes. The value of the first bit indicates whether the element is a type element (Type) or an attribute element (Attr): 0 is Type and 1 is Attr. The remaining 15 bits represent the ID, so 128 x 256 (i.e., 32768) IDs can be expressed. For example, the encoding of TypeID = 1 is 00000000/00000001, and the encoding of AttrID = 3 is 10000000/00000011.
As shown in fig. 5B, the final compression results are as follows: the complete binary identification of pir.BoolType is 00000000/00000000/00000001, and the complete binary identification of paddle::dialect::SelectedRowsType is 00000010/00000000/00000001.
The compressed correspondence is recorded in AttrTypeIdMap, and AttrTypeIdMap may be saved into the compressed file, as shown in fig. 5B.
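The two-byte AttrTypeID layout just described (one Type/Attr flag bit plus a 15-bit ID) can be sketched directly. The function names are hypothetical; only the bit layout follows the description above:

```python
def encode_attrtype_id(is_attr: bool, ident: int) -> bytes:
    """Two-byte AttrTypeID: first bit 0 = Type, 1 = Attr; remaining 15 bits: the ID."""
    if not 0 <= ident < (1 << 15):
        raise ValueError("ID must fit in 15 bits")
    value = (0x8000 if is_attr else 0x0000) | ident
    return value.to_bytes(2, "big")


def decode_attrtype_id(data: bytes) -> tuple[bool, int]:
    """Recover (is_attr, ID) from a two-byte AttrTypeID."""
    value = int.from_bytes(data, "big")
    return bool(value & 0x8000), value & 0x7FFF
```

Under this layout, TypeID = 1 encodes to the bytes 00000000/00000001 and AttrID = 3 to 10000000/00000011, matching the worked example in the text.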
Fig. 5C is a schematic diagram of a string identification and binary identification mapping table of an operation unit according to one embodiment of the present disclosure.
As shown in fig. 5C, opIdMap is a mapping table of string identifications and binary identifications of operation units. The operation unit is coded in DialectIDOperationID, and the operation ids in Dialect are mutually exclusive and expressed in a plurality of bytes.
As shown in fig. 5C, the final compression results are as follows: the complete encoding of combine is 10000000/00000000/00000000, and the complete encoding of pd_op.full is 10000010/00000000/00000000.
The compressed correspondence is recorded in OpIdMap, and OpIdMap may be saved into the compressed file, as shown in fig. 5C.
According to the embodiment of the disclosure, the operation unit, the type element and the attribute element are binary-encoded according to the dialect identification, the mapping tables between the character string identifications and the binary identifications of the operation unit, the type element and the attribute element are generated and stored, and deserialization can later be performed using these mapping tables.
According to an embodiment of the disclosure, the disclosure further provides a decompression method of the deep learning model.
Fig. 6 is a flow chart of a method of decompression of a deep learning model according to one embodiment of the present disclosure.
As shown in fig. 6, the decompression method 600 of the deep learning model includes steps S610 to S630.
In step S610, a compressed file of the deep learning model is acquired.
For example, the compressed file is obtained according to the compression method of the deep learning model described above. In the compressed file, the operation unit, the type element, and the attribute element are in binary form.
In step S620, the string identification and binary identification mapping table of each of the operation unit, the type element, and the attribute element is acquired.
The mapping table may be generated and saved when binary encoding the operation unit, the type element, and the attribute element. The mapping table stores the mapping relation between the character string identification and the binary identification of the operation unit, the type element and the attribute element.
In step S630, the compressed file is decompressed into an initial file according to the string identification and binary identification mapping table.
According to the mapping tables, the binary identification of the operation unit can be decoded into a character string identification, the binary identification of the type element can be decoded into a character string identification, and the binary identification of the attribute element can be decoded into a character string identification.
Next, the string identification of the operation unit may be decoded into an intermediate expression according to the string encoding rule of the operation unit. The string identification of the type element may be decoded into an intermediate representation according to the string encoding rules of the type element. The string identification of the attribute element may be decoded into an intermediate representation according to the string encoding rules of the attribute element.
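The two-stage decoding above (binary identification back to character string identification via the saved mapping table, then character string identification back to an intermediate expression via the encoding rule) can be sketched as follows. All names are illustrative assumptions, and the string decoder here is a stand-in for the real per-kind encoding rules:

```python
def decompress_ids(binary_ids, mapping_table, string_decoder):
    """Stage 1: invert the string-id -> binary-id mapping table.
    Stage 2: apply a decoder that turns each string identification back
    into a (stand-in) intermediate expression."""
    reverse = {binary_id: string_id for string_id, binary_id in mapping_table.items()}
    string_ids = [reverse[b] for b in binary_ids]
    return [string_decoder(s) for s in string_ids]
```

Because both stages are built from stored one-to-one mappings, the recovered intermediate expressions are identical to those in the initial file, which is the data-consistency guarantee the text refers to.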
According to the embodiment of the disclosure, the operation unit, the type element and the attribute element of the compressed file of the deep learning model can be restored to the character string form based on the mapping table by acquiring the character string identification and the binary identification mapping table, and then restored to the IR form, so that the initial file of the deep learning model is obtained, and the data consistency of the deep learning model is ensured.
Fig. 7 is a flow chart of a method of decompression of a deep learning model according to another embodiment of the present disclosure.
As shown in fig. 7, the present embodiment includes steps S710 to S770.
In step S710, a string identification and binary identification mapping table is obtained.
In step S720, binary identifications of the operation unit, the type element, and the attribute element are decoded into character string identifications based on the mapping table.
In step S730, the string identifications of the operation unit, the type element, and the attribute element are decoded into intermediate expressions based on the string encoding rule.
The specific implementation manner of obtaining the mapping table, performing binary decoding according to the mapping table, and performing character string decoding based on the character string encoding rule is described above, and will not be repeated here.
In step S740, it is determined whether the compressed file is half-compressed or full-compressed.
Embodiments of the present disclosure provide for multiple compression levels. In the case of half compression, the compressed file is a half compressed file. The operation unit, the type element, and the attribute element in the semi-compressed file are in binary form, and the objects (such as regions and blocks) other than the operation unit, the type element, and the attribute element are in the form of long strings.
In the case of full compression, the compressed file is a full compressed file. The operation unit, the type element, and the attribute element in the full compression file are in binary form, and the objects (such as regions and blocks) other than the operation unit, the type element, and the attribute element are in the form of short strings. The short character string is obtained by text compression of the long character string.
In the case of half compression, step S750 is performed. In the case of full compression, steps S760 to S770 are performed.
In step S750, the character string identifications of the objects other than the operation unit, the type element, and the attribute element are decoded into intermediate expressions.
According to an embodiment of the present disclosure, the deep learning model further includes a region and a block, the block including at least one operation unit, the region including at least one block. String identifications of objects other than the operation unit, the type element, and the attribute element include string identifications of areas and blocks.
According to an embodiment of the present disclosure, long text string identifications of regions and blocks are decoded into intermediate expressions according to string identification encoding rules of the regions and blocks, respectively, in response to a compression level being half-compression.
For example, the regions and blocks are encoded in order of appearance and their own identity, and the string identity of the regions and blocks in the compressed file may be decoded into an intermediate representation based on this rule.
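The order-of-appearance rule for regions and blocks is trivially reversible, which is what makes this decoding step possible. A hedged sketch, where the "kind_index" identifier format is an assumption for illustration:

```python
def encode_in_order(kinds):
    """Assign each region/block an identifier combining its own tag with its
    order of appearance among objects of the same kind."""
    counters: dict[str, int] = {}
    ids = []
    for kind in kinds:  # kind is e.g. "region" or "block"
        idx = counters.get(kind, 0)
        counters[kind] = idx + 1
        ids.append(f"{kind}_{idx}")
    return ids


def decode_in_order(ident):
    """Recover (kind, appearance index) from an order-of-appearance identifier."""
    kind, idx = ident.rsplit("_", 1)
    return kind, int(idx)
```

Since the identifier deterministically records both what the object is and where it appeared, decoding needs no extra mapping table, unlike the dialect-organized operation units, type elements and attribute elements.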
In step S760, short character strings of other objects than the operation unit, the type element, and the attribute element are decompressed into long character strings.
In step S770, the long string is decoded into intermediate expressions.
According to an embodiment of the present disclosure, in response to the compression level being full compression, the short text string identifications of the region and the block are decompressed to long text string identifications, and the long text string identifications of the region and the block are decoded to intermediate representations, respectively, according to the string identification encoding rules of the region and the block.
For example, the short string of the region and the block may be decompressed into a long string identifier, and then the long string identifier of the region and the block may be decoded into an intermediate representation according to the string encoding rule of the region and the block.
Embodiments of the present disclosure may obtain a complete initial file by decoding an operation unit, a type element, and an attribute element, and decoding an area and a block.
According to an embodiment of the disclosure, the disclosure further provides a compression device of the deep learning model and a decompression device of the deep learning model.
Fig. 8 is a block diagram of a compression apparatus of a deep learning model according to one embodiment of the present disclosure.
As shown in fig. 8, the compression apparatus 800 of the deep learning model includes an initial file acquisition module 810, a text file determination module 820, and a compressed file determination module 830.
The initial file acquisition module 810 is configured to acquire an initial file of a deep learning model, where the deep learning model includes an operation unit, and a type element and an attribute element for describing the operation unit are in an intermediate expression form in the initial file.
The text file determining module 820 is configured to perform text serialization on the initial file to obtain a text file of the deep learning model, where in the text file, the operation unit, the type element and the attribute element are in a character string form.
The compressed file determining module 830 is configured to binary sequence the operation unit, the type element, and the attribute element in the form of a character string in the text file, so as to obtain a compressed file of the deep learning model.
According to an embodiment of the present disclosure, the deep learning model further includes a region and a block, the block including at least one operation unit, the region including at least one block. The text file determining module comprises a first character string encoding unit and a second character string encoding unit.
The first character string coding unit is used for carrying out character string coding on the operation unit, the type element and the attribute element in the intermediate expression form to obtain respective character string identifications of the operation unit, the type element and the attribute element.
The second character string coding unit is used for adding the character string identifications of the operation unit, the type element and the attribute element into the operation unit, the region and the block according to the structural relation among the region, the block and the operation unit, and coding the character string identifications of the region and the block according to the sequence to obtain the text file of the deep learning model.
Intermediate expressions are derived from the dialect system organization. The first character string coding unit is used for coding the character string identifications of the operation unit, the type elements and the attribute elements according to the dialect identifications and preset coding rules.
The compressed file determining module is used for respectively encoding the character string identifiers of the operation unit, the type element and the attribute element in the text file into binary identifiers to obtain a compressed file of the deep learning model.
The compression apparatus 800 of the deep learning model further includes a mapping table saving module.
The mapping table storage module is used for storing the mapping relations among the character string identifiers and the binary identifiers of the operation unit, the type element and the attribute element into the respective mapping tables.
The compression apparatus 800 of the deep learning model further includes a compression level determination module.
The compression level determination module is used for determining the compression level of the deep learning model, wherein the compression level comprises half compression and full compression.
The compressed file determining module is used for executing the step of binary serialization of the operation unit, the type element and the attribute element in the form of the character string in the text file under the condition that the compressed level determining module determines that the compressed level is half-compressed, so as to obtain a half-compressed file serving as the compressed file of the deep learning model.
The compression apparatus 800 of the deep learning model further includes a full compression file determination module.
The full compression file determining module is used for respectively encoding the character string identifications of the operation unit, the type element and the attribute element in the text file into binary identifications in response to the compression level being full compression, and performing text compression on the character strings in the text file other than the character string identifications of the operation unit, the type element and the attribute element, so as to obtain a full compression file as the compressed file of the deep learning model.
Fig. 9 is a block diagram of a decompression apparatus of a deep learning model according to one embodiment of the present disclosure.
As shown in fig. 9, the decompression apparatus 900 of the deep learning model includes a compressed file acquisition module 910, a mapping table acquisition module 920, and a decompression module 930.
The decompression module comprises a first decoding unit and a second decoding unit.
The first decoding unit is used for respectively decoding the binary identifications of the operation unit, the type element and the attribute element into character string identifications according to the character string identifications and the binary identification mapping table.
The second decoding unit is used for respectively decoding the character string identifiers of the operation unit, the type element and the attribute element into intermediate expressions according to the coding rules of the character string identifiers of the operation unit, the type element and the attribute element.
The objects in the compressed file except the operation unit, the type element and the attribute element are in the form of character strings. The decompression apparatus 900 of the deep learning model further includes a compression level determination module.
And the compression level determining module is used for determining the compression level of the deep learning model, wherein the compression level comprises half compression and full compression.
The decompression module 930 is further configured to decompress the compressed file into an initial file according to the compression level.
In the case of half compression, the character strings of objects in the compressed file other than the operation unit, the type element and the attribute element are long text string identifications. In the case of full compression, they are short text string identifications. The short text is obtained by text compression of the long text.
The decompression module further comprises a first decompression unit and a second decompression unit.
The first decompression unit is configured to decode long text string identifications of objects other than the operation unit, the type element, and the attribute element into intermediate expressions in response to the compression level being half-compressed.
The second decompression unit is used for decompressing the short text string identifications of other objects except the operation unit, the type element and the attribute element into long text string identifications and decoding the long text string identifications into intermediate expressions in response to the compression level being full compression.
The deep learning model further includes a region and a block, the block including at least one operation unit, the region including at least one block; other objects than the operation unit, the type element, and the attribute element include areas and blocks.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 10 shows a schematic block diagram of an example electronic device 1000 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 10, the apparatus 1000 includes a computing unit 1001 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1002 or a computer program loaded from a storage unit 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data required for the operation of the device 1000 can also be stored. The computing unit 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
Various components in device 1000 are connected to I/O interface 1005, including: an input unit 1006 such as a keyboard, a mouse, and the like; an output unit 1007 such as various types of displays, speakers, and the like; a storage unit 1008 such as a magnetic disk, an optical disk, or the like; and communication unit 1009 such as a network card, modem, wireless communication transceiver, etc. Communication unit 1009 allows device 1000 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1001 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 1001 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1001 performs the respective methods and processes described above, for example, at least one of a compression method and a decompression method of the deep learning model. For example, in some embodiments, at least one of the compression method and decompression method of the deep learning model may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1008. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 1000 via ROM 1002 and/or communication unit 1009. When the computer program is loaded into the RAM 1003 and executed by the computing unit 1001, one or more steps of at least one of the compression method and decompression method of the deep learning model described above may be performed. Alternatively, in other embodiments, the computing unit 1001 may be configured to perform at least one of a compression method and a decompression method of the deep learning model in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (27)
1. A compression method of a deep learning model, comprising:
acquiring an initial file of a deep learning model, wherein the deep learning model comprises an operation unit, and the operation unit and type elements and attribute elements for describing the operation unit are in an intermediate expression form in the initial file;
carrying out text serialization on the initial file to obtain a text file of the deep learning model, wherein in the text file, the operation unit, the type element and the attribute element are in a character string form; and
And carrying out binary serialization on the operation units, the type elements and the attribute elements in the form of character strings in the text file to obtain the compressed file of the deep learning model.
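The two-stage pipeline of claim 1 can be sketched in Python as follows. All names here (`text_serialize`, the token layout, the IR dictionaries) are hypothetical illustrations, not the patented implementation:

```python
# Hypothetical sketch of claim 1's two-stage pipeline: an intermediate-expression
# file is first text-serialized (operation units, type elements and attribute
# elements become character strings), then that string form is binary-serialized.

def text_serialize(ir_ops):
    """Render each operation unit with its type and attribute elements as one string."""
    lines = []
    for op in ir_ops:
        attrs = ",".join(f"{k}={v}" for k, v in sorted(op["attrs"].items()))
        lines.append(f"{op['name']}:{op['type']}:{attrs}")
    return "\n".join(lines)

def binary_serialize(text_file):
    """Encode the character-string form into bytes (UTF-8 as a stand-in)."""
    return text_file.encode("utf-8")

initial_ir = [{"name": "conv2d", "type": "f32", "attrs": {"stride": 2}}]
text_file = text_serialize(initial_ir)         # character string form
compressed_file = binary_serialize(text_file)  # binary form
```

The point of the intermediate text stage is that every operation unit, type element and attribute element becomes a discrete string token that the binary stage can then map to compact identifiers.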
2. The method of claim 1, wherein the deep learning model further comprises a region and a block, the block comprising at least one operation unit, and the region comprising at least one block; wherein performing text serialization on the initial file to obtain the text file of the deep learning model comprises:
performing character string encoding on the operation unit, the type element and the attribute element in the intermediate expression form, to obtain respective character string identifiers of the operation unit, the type element and the attribute element; and
adding the respective character string identifiers of the operation unit, the type element and the attribute element into the operation unit, the region and the block according to the structural relationship among the region, the block and the operation unit, and encoding the character string identifiers sequentially by region and block, to obtain the text file of the deep learning model.
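A minimal sketch of the hierarchical encoding in claim 2, assuming a nested region → block → operation-unit structure; the textual syntax below is invented for illustration:

```python
# Hypothetical sketch of claim 2: walk the region/block/operation-unit hierarchy
# in order, emitting a string identifier line for each operation unit together
# with its type element and attribute elements.

def encode_text_file(regions):
    lines = []
    for r_idx, region in enumerate(regions):
        lines.append(f"region{r_idx} {{")
        for b_idx, block in enumerate(region):
            lines.append(f"  block{b_idx} {{")
            for op in block:
                # one line per operation unit: name, type element, attribute elements
                lines.append(f"    {op['name']} : {op['type']} {op['attrs']}")
            lines.append("  }")
        lines.append("}")
    return "\n".join(lines)

model = [[[{"name": "matmul", "type": "f16", "attrs": {"transpose": False}}]]]
text = encode_text_file(model)
```

Encoding region by region and block by block preserves the structural relationship, so the text file can be parsed back into the same hierarchy during decompression.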
3. The method of claim 2, wherein the intermediate expression is organized by a dialect system; wherein performing character string encoding on the operation unit, the type element and the attribute element in the intermediate expression form to obtain the respective character string identifiers of the operation unit, the type element and the attribute element comprises:
encoding the character string identifiers of the operation unit, the type element and the attribute element according to a preset encoding rule based on a dialect identifier.
4. The method of claim 1, wherein performing binary serialization on the operation unit, the type element and the attribute element in character string form in the text file to obtain the compressed file of the deep learning model comprises:
encoding the character string identifiers of the operation unit, the type element and the attribute element in the text file into binary identifiers, respectively, to obtain the compressed file of the deep learning model.
5. The method of claim 4, further comprising:
storing mapping relationships between the character string identifiers and the binary identifiers of the operation unit, the type element and the attribute element into respective mapping tables.
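Claims 4 and 5 amount to replacing each distinct string identifier with a fixed-width binary identifier and remembering the mapping. A sketch under that reading — the 2-byte width and the token layout are assumptions:

```python
import struct

# Hypothetical sketch of claims 4-5: map each distinct character string
# identifier to a short binary identifier, and keep the mapping table so the
# decompression side can invert the encoding.

def build_mapping_table(string_ids):
    """Assign each distinct string identifier a 2-byte big-endian binary identifier."""
    return {s: struct.pack(">H", i) for i, s in enumerate(sorted(set(string_ids)))}

def binarize(string_ids, table):
    return b"".join(table[s] for s in string_ids)

string_ids = ["conv2d", "f32", "stride=2", "conv2d"]  # repeated tokens shrink well
table = build_mapping_table(string_ids)
blob = binarize(string_ids, table)                    # 4 tokens -> 8 bytes
```

Because operation names, types and attributes repeat heavily across a model, replacing each occurrence with a fixed-width identifier is where most of the size reduction comes from.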
6. The method of claim 1, further comprising:
determining a compression level of the deep learning model, the compression level comprising half compression and full compression; and
in response to the compression level being half compression, performing the binary serialization on the operation unit, the type element and the attribute element in character string form in the text file, to obtain a half-compressed file as the compressed file of the deep learning model.
7. The method of claim 6, further comprising:
in response to the compression level being full compression, encoding the character string identifiers of the operation unit, the type element and the attribute element in the text file into binary identifiers, respectively, and performing text compression on the character string identifiers of objects in the text file other than those of the operation unit, the type element and the attribute element, to obtain a fully compressed file as the compressed file of the deep learning model.
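Claims 6 and 7 distinguish half compression (binarize only the identifier tokens) from full compression (additionally text-compress the remaining strings). A sketch with `zlib` standing in for the unspecified text compressor; the function and level names are illustrative:

```python
import zlib

# Hypothetical sketch of claims 6-7. `binary_ids` is the already-binarized
# identifier stream; `other_text` is everything else still in string form.

def compress_model(binary_ids, other_text, level):
    if level == "half":
        # half compression: other objects stay as long text
        return binary_ids + other_text.encode("utf-8")
    if level == "full":
        # full compression: other objects are text-compressed into short text
        return binary_ids + zlib.compress(other_text.encode("utf-8"))
    raise ValueError(f"unknown compression level: {level}")
```

For repetitive model text the extra text-compression stage typically shrinks the non-identifier portion substantially, which is the trade-off the full-compression level buys at the cost of an extra decompression step.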
8. A method of decompression of a deep learning model, comprising:
acquiring a compressed file of a deep learning model, wherein the compressed file is obtained according to the method of any one of claims 1 to 7, the deep learning model comprises an operation unit, and in the compressed file, the operation unit, and a type element and an attribute element for describing the operation unit, are in binary form;
acquiring mapping tables between character string identifiers and binary identifiers of the operation unit, the type element and the attribute element; and
decompressing the compressed file into an initial file according to the mapping tables between the character string identifiers and the binary identifiers.
9. The method of claim 8, wherein decompressing the compressed file into the initial file according to the mapping tables between the character string identifiers and the binary identifiers comprises:
decoding the binary identifiers of the operation unit, the type element and the attribute element into character string identifiers, respectively, according to the mapping tables; and
decoding the character string identifiers of the operation unit, the type element and the attribute element into intermediate expressions, respectively, according to the encoding rules of the character string identifiers of the operation unit, the type element and the attribute element.
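Decompression (claims 8 and 9) inverts the mapping table and then decodes the recovered string identifiers. A sketch, assuming 2-byte binary identifiers and an invented table format:

```python
# Hypothetical sketch of claims 8-9: invert the mapping table to turn binary
# identifiers back into character string identifiers, from which the
# intermediate expressions can then be reconstructed.

def debinarize(blob, table, width=2):
    """`table` maps string identifier -> binary identifier; invert it and decode."""
    inverse = {v: k for k, v in table.items()}
    return [inverse[blob[i:i + width]] for i in range(0, len(blob), width)]

table = {"conv2d": b"\x00\x00", "f32": b"\x00\x01"}
tokens = debinarize(b"\x00\x00\x00\x01\x00\x00", table)
# tokens == ["conv2d", "f32", "conv2d"]
```

Because the binary identifiers are fixed-width and the table is a bijection, the binary-to-string step is lossless; the remaining string-to-intermediate-expression step just reverses the preset encoding rule.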
10. The method according to claim 8 or 9, wherein objects in the compressed file other than the operation unit, the type element and the attribute element are in character string form; the method further comprises:
determining a compression level of the deep learning model, the compression level comprising half compression and full compression; and
decompressing the compressed file into the initial file according to the compression level,
wherein, in a case of half compression, the character strings of the objects in the compressed file other than the operation unit, the type element and the attribute element are in a long text form;
in a case of full compression, the character strings of the objects in the compressed file other than the operation unit, the type element and the attribute element are in a short text form; and
the short text is obtained by compressing the long text.
11. The method of claim 10, wherein decompressing the compressed file into the initial file according to the compression level comprises:
in response to the compression level being half compression, decoding the long text character string identifiers of the objects other than the operation unit, the type element and the attribute element into intermediate expressions; and
in response to the compression level being full compression, decompressing the short text character string identifiers of the objects other than the operation unit, the type element and the attribute element into long text character string identifiers, and decoding the long text character string identifiers into intermediate expressions.
12. The method of claim 10, wherein the deep learning model further comprises a region and a block, the block comprising at least one operation unit, and the region comprising at least one block; the objects other than the operation unit, the type element and the attribute element comprise the region and the block.
13. A compression apparatus for a deep learning model, comprising:
an initial file acquisition module configured to acquire an initial file of a deep learning model, wherein the deep learning model comprises an operation unit, and in the initial file, the operation unit, and a type element and an attribute element for describing the operation unit, are in an intermediate expression form;
a text file determining module configured to perform text serialization on the initial file to obtain a text file of the deep learning model, wherein in the text file, the operation unit, the type element and the attribute element are in a character string form; and
a compressed file determining module configured to perform binary serialization on the operation unit, the type element and the attribute element in character string form in the text file, to obtain a compressed file of the deep learning model.
14. The apparatus of claim 13, wherein the deep learning model further comprises a region and a block, the block comprising at least one operation unit, the region comprising at least one block; the text file determining module includes:
a first character string encoding unit configured to perform character string encoding on the operation unit, the type element and the attribute element in the intermediate expression form, to obtain respective character string identifiers of the operation unit, the type element and the attribute element; and
a second character string encoding unit configured to add the respective character string identifiers of the operation unit, the type element and the attribute element into the operation unit, the region and the block according to the structural relationship among the region, the block and the operation unit, and to encode the character string identifiers sequentially by region and block, to obtain the text file of the deep learning model.
15. The apparatus of claim 14, wherein the intermediate expression is organized by a dialect system; and the first character string encoding unit is configured to encode the character string identifiers of the operation unit, the type element and the attribute element according to a preset encoding rule based on a dialect identifier.
16. The apparatus of claim 13, wherein the compressed file determining module is configured to encode the character string identifiers of the operation unit, the type element, and the attribute element in the text file into binary identifiers, respectively, to obtain the compressed file of the deep learning model.
17. The apparatus of claim 16, further comprising:
a mapping table storage module configured to store mapping relationships between the character string identifiers and the binary identifiers of the operation unit, the type element and the attribute element into respective mapping tables.
18. The apparatus of claim 13, further comprising:
a compression level determining module configured to determine a compression level of the deep learning model, the compression level comprising half compression and full compression;
wherein the compressed file determining module is configured to, when the compression level determining module determines that the compression level is half compression, perform the binary serialization on the operation unit, the type element and the attribute element in character string form in the text file, to obtain a half-compressed file as the compressed file of the deep learning model.
19. The apparatus of claim 18, further comprising:
a fully compressed file determining module configured to, in response to the compression level being full compression, encode the character string identifiers of the operation unit, the type element and the attribute element in the text file into binary identifiers, respectively, and perform text compression on the character string identifiers of objects in the text file other than those of the operation unit, the type element and the attribute element, to obtain a fully compressed file as the compressed file of the deep learning model.
20. A decompression apparatus of a deep learning model, comprising:
a compressed file acquisition module configured to acquire a compressed file of a deep learning model, wherein the compressed file is obtained by the apparatus of any one of claims 13 to 19, the deep learning model comprises an operation unit, and in the compressed file, the operation unit, and a type element and an attribute element for describing the operation unit, are in binary form;
a mapping table acquisition module configured to acquire mapping tables between character string identifiers and binary identifiers of the operation unit, the type element and the attribute element; and
a decompression module configured to decompress the compressed file into an initial file according to the mapping tables between the character string identifiers and the binary identifiers.
21. The apparatus of claim 20, wherein the decompression module comprises:
a first decoding unit configured to decode the binary identifiers of the operation unit, the type element and the attribute element into character string identifiers, respectively, according to the mapping tables; and
a second decoding unit configured to decode the character string identifiers of the operation unit, the type element and the attribute element into intermediate expressions, respectively, according to the encoding rules of the character string identifiers of the operation unit, the type element and the attribute element.
22. The apparatus of claim 20 or 21, wherein objects in the compressed file other than the operation unit, the type element and the attribute element are in character string form; the apparatus further comprises:
a compression level determining module configured to determine a compression level of the deep learning model, the compression level comprising half compression and full compression;
wherein the decompression module is further configured to decompress the compressed file into the initial file according to the compression level,
wherein, in a case of half compression, the character strings of the objects in the compressed file other than the operation unit, the type element and the attribute element are in a long text form;
in a case of full compression, the character strings of the objects in the compressed file other than the operation unit, the type element and the attribute element are in a short text form; and
the short text is obtained by compressing the long text.
23. The apparatus of claim 22, wherein the decompression module comprises:
a first decompression unit configured to, in response to the compression level being half compression, decode the long text character string identifiers of the objects other than the operation unit, the type element and the attribute element into intermediate expressions; and
a second decompression unit configured to, in response to the compression level being full compression, decompress the short text character string identifiers of the objects other than the operation unit, the type element and the attribute element into long text character string identifiers, and decode the long text character string identifiers into intermediate expressions.
24. The apparatus of claim 22, wherein the deep learning model further comprises a region and a block, the block comprising at least one operation unit, and the region comprising at least one block; the objects other than the operation unit, the type element and the attribute element comprise the region and the block.
25. An electronic device, comprising:
at least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 12.
26. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1 to 12.
27. A computer program product comprising a computer program stored on at least one of a readable storage medium and an electronic device, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410797825.4A CN118520922A (en) | 2024-06-19 | 2024-06-19 | Compression method and decompression method of deep learning model based on serialization |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118520922A true CN118520922A (en) | 2024-08-20 |
Family
ID=92285229
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410797825.4A Pending CN118520922A (en) | 2024-06-19 | 2024-06-19 | Compression method and decompression method of deep learning model based on serialization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118520922A (en) |
- 2024-06-19: Application CN202410797825.4A filed in China (CN); published as CN118520922A; status: Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7924183B2 (en) | Method and system for reducing required storage during decompression of a compressed file | |
US11463102B2 (en) | Data compression method, data decompression method, and related apparatus, electronic device, and system | |
US11514003B2 (en) | Data compression based on key-value store | |
CN107395209B (en) | Data compression method, data decompression method and equipment thereof | |
CN113590858B (en) | Target object generation method and device, electronic equipment and storage medium | |
CN107888197B (en) | Data compression method and device | |
CN110518917B (en) | LZW data compression method and system based on Huffman coding | |
US8849726B2 (en) | Information processing apparatus and control method for the same | |
CN112466285B (en) | Offline voice recognition method and device, electronic equipment and storage medium | |
US20200294629A1 (en) | Gene sequencing data compression method and decompression method, system and computer-readable medium | |
US9479195B2 (en) | Non-transitory computer-readable recording medium, compression method, decompression method, compression device, and decompression device | |
CN111131403A (en) | Message coding and decoding method and device for Internet of things equipment | |
US11606103B2 (en) | Data compression method, data compression device, data decompression method, and data decompression device | |
EP3846021B1 (en) | Data output method, data acquisition method, device, and electronic apparatus | |
CN114614829A (en) | Satellite data frame processing method and device, electronic equipment and readable storage medium | |
CN113163198B (en) | Image compression method, decompression method, device, equipment and storage medium | |
US20090055395A1 (en) | Method and Apparatus for XML Data Processing | |
US20150248432A1 (en) | Method and system | |
CN118520922A (en) | Compression method and decompression method of deep learning model based on serialization | |
CN113220651A (en) | Operation data compression method and device, terminal equipment and storage medium | |
CN115604365B (en) | Data encoding and decoding method, device, electronic equipment and readable storage medium | |
CN114095037B (en) | Application program updating method, updating data compression method, device and equipment | |
CN111767280A (en) | Data processing method, device and storage medium | |
CN112054805B (en) | Model data compression method, system and related equipment | |
CN115904240A (en) | Data processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||