CN117271456B - Data serialization method, anti-serialization method, electronic device, and storage medium - Google Patents
- Publication number
- CN117271456B (application number CN202311564963.XA)
- Authority
- CN
- China
- Prior art keywords
- view
- data
- address
- trivial
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
- G06F16/172—Caching, prefetching or hoarding of files
Abstract
The application provides a data serialization method, a deserialization method, an electronic device, and a storage medium, and relates to the technical field of data processing. During serialization, data to be processed and the data structure type corresponding to the data to be processed are obtained; a view corresponding to the data to be processed is determined based on the data structure type; and the data to be processed is serialized according to the view. During deserialization, the serialized data is acquired; a view corresponding to the serialized data is determined; and the serialized data is deserialized according to the view. In this embodiment, because the view stores data as a compressed byte stream, serializing the data with the view reduces the volume of the serialized data; and because the data is deserialized according to the start address and the end address of the view, the data does not need to be copied into memory, which shortens deserialization time and improves deserialization efficiency.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data serialization method, a deserialization method, an electronic device, and a storage medium.
Background
Serialization is the process of converting an object into a byte stream that can be stored or transmitted. Deserialization is the process of restoring a byte stream to an object. When objects are stored across platforms, transported across networks, and communicated between processes, serialization and de-serialization are required.
At present, some serialization and deserialization methods widely used in the industry need to store pointers during serialization and store a corresponding offset for each field in an object, which results in a larger volume of serialized data. During deserialization, all fields must be parsed from the binary data and copied into memory before a user can access them, so deserialization takes a long time.
Disclosure of Invention
The embodiments of the application provide a data serialization method, a deserialization method, an electronic device, and a storage medium, so as to reduce the volume of serialized data and improve deserialization efficiency.
In a first aspect, an embodiment of the present application provides a data serialization method, including: acquiring data to be processed and data structure types corresponding to the data to be processed; determining a view corresponding to the data to be processed based on the data structure type; according to the view, carrying out serialization processing on the data to be processed; wherein the view stores data in a compressed byte stream and describes the stored data with data between a start address and an end address.
In a second aspect, embodiments of the present application provide a data deserializing method, including: acquiring data after serialization processing; determining a view corresponding to the serialized data; the view describes the data before the serialization processing corresponding to the data after the serialization processing by using the data between the start address and the end address; performing deserialization processing on the serialized data according to the view; wherein the view stores data in a compressed byte stream and describes the stored data with data between a start address and an end address.
In a third aspect, embodiments of the present application provide an electronic device including a memory, a processor, and a computer program stored on the memory, the processor implementing the method of any one of the above when the computer program is executed.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having a computer program stored therein, which when executed by a processor, implements a method as in any of the above.
Compared with the prior art, the application has the following advantages:
The application provides a data serialization method, a deserialization method, an electronic device, and a storage medium. During serialization, data to be processed and the data structure type corresponding to the data to be processed are obtained; a view corresponding to the data to be processed is determined based on the data structure type; and the data to be processed is serialized according to the view. During deserialization, the serialized data is acquired; a view corresponding to the serialized data is determined; and the serialized data is deserialized according to the view. The view stores data as a compressed byte stream and describes the stored data with the data between a start address and an end address. In this embodiment, because the view stores data as a compressed byte stream, serializing the data with the view reduces the volume of the serialized data; and because the data is deserialized according to the start address and the end address of the view, the data does not need to be copied into memory, which shortens deserialization time and improves deserialization efficiency.
The foregoing description is merely an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of this specification, and so that the above and other objects, features, and advantages of the present application are more apparent, a detailed description of the present application is given below.
Drawings
In the drawings, the same reference numerals refer to the same or similar parts or elements throughout the several views unless otherwise specified. The figures are not necessarily drawn to scale. It is appreciated that these drawings depict only some embodiments according to the application and are not to be considered limiting of its scope.
Fig. 1 is a schematic diagram of an application scenario provided in the present application.
Fig. 2 is a flowchart of a data serialization method according to an embodiment of the present application.
FIG. 3 is a flow chart of a data de-serialization method according to an embodiment of the present application.
Fig. 4 is a block diagram of a data serialization apparatus according to an embodiment of the present application.
Fig. 5 is a block diagram of a data de-serializing apparatus according to an embodiment of the present application.
Fig. 6 is a block diagram of an electronic device used to implement an embodiment of the present application.
Detailed Description
Hereinafter, only certain exemplary embodiments are briefly described. As will be recognized by those of skill in the pertinent art, the described embodiments may be modified in various different ways without departing from the spirit or scope of the present application. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
In order to facilitate understanding of the technical solutions of the embodiments of the present application, the following describes related technologies of the embodiments of the present application. The following related technologies may be optionally combined with the technical solutions of the embodiments of the present application, which all belong to the protection scope of the embodiments of the present application.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
Example 1
The data serialization method and the data deserialization method provided by the technical solution of the application can be applied to scenarios such as cross-platform storage, network transmission, and inter-process communication. Fig. 1 is a schematic diagram of an application scenario of the data serialization method provided in the present application. As shown in Fig. 1, in a scenario in which a server sends data to a user terminal, the server serializes an object as follows: the server acquires data to be processed and the data structure type corresponding to the data to be processed; determines a view corresponding to the data to be processed based on the data structure type; and serializes the data to be processed according to the view. The user terminal receives the serialized data and deserializes it as follows: it acquires the serialized data; determines a view corresponding to the serialized data; and deserializes the serialized data according to the view. The view stores data as a compressed byte stream and describes the stored data with the data between a start address and an end address.
In this embodiment, because the view stores data as a compressed byte stream, serializing the data with the view reduces the volume of the serialized data; and because the data is deserialized according to the start address and the end address of the view, the data does not need to be copied into memory, which shortens deserialization time and improves deserialization efficiency.
Example two
The embodiment of the application provides a data serialization method, which can be applied to a computing device, and the computing device can include: server, user terminal, etc. FIG. 2 is a flowchart of a data serialization method according to an embodiment of the present application, including:
Step S201, obtaining data to be processed and a data structure type corresponding to the data to be processed.
Step S202, based on the data structure type, determining a view corresponding to the data to be processed.
Step S203, according to the view, serializing the data to be processed.
Wherein the view stores data in a compressed byte stream and describes the stored data with data between a start address and an end address.
The data to be processed may be a plurality of types of data structures, and may include a plurality of objects. At compile time, the data structure type of the data to be processed is obtained through a type system, for example, which fields the data to be processed includes, what type of data each field is, e.g., floating point type, string type, etc. Determining a data structure type of the data to be processed according to the types of the fields in the data to be processed, wherein the data structure type can comprise any one of the following components: a trivial structure, a non-trivial structure, a random container, a string, an ordered set, an ordered map, a hash set, a hash map, an optional, a union, a bit set, a variable length structure, a variable length integer container, or a shared object.
A trivial structure is one in which all fields are trivial fields, that is, fixed-length fields such as integer (int) and floating-point (float) types. A non-trivial structure is a structure that includes at least one non-trivial field, that is, a field of non-fixed length; it may include both trivial and non-trivial fields. A random container is a data structure that sequentially stores several elements of the same type, where the total number of elements is not fixed. A string is a data structure that is fixed in length and contains a plurality of characters. An ordered set is a data structure that includes a plurality of elements arranged in order. An ordered map is a set of key-value pairs in which each key corresponds to an associated value and the whole set of mappings is ordered by key size. A hash set is a set of values of the same data type without repeated values. A hash map contains several key-value pairs, with one associated value for each key, and the set of mappings as a whole is unordered. An optional may be in one of two states: null or non-null (holding an actual value). When an optional is used, processing may begin by determining whether it contains a value. If it does, the actual value may be obtained by an unwrapping operation; if it is empty, corresponding processing can be performed to avoid an exception. A union is a type-safe enumeration that stores different types of data in the same memory space: values of different types may be stored at different points in time, but only one value can be stored at a time, and the stored value can be one of up to 256 different types. A bit set (bitset) represents and manipulates a bit sequence; it typically has a fixed length and uses the binary bit as its base unit for storing and manipulating a set of Boolean (bool) values.
A variable length structure is either a variable length integer itself or a structure in which all fields are variable length integers. The length of a variable length structure can be adjusted dynamically according to actual needs; that is, memory space can be allocated dynamically according to the amount of data stored, so as to accommodate data of different sizes. When the element type of a sequential container is a variable length structure, the container is referred to as a variable length integer container. A shared object is used to share data among multiple threads or processes. It provides a mechanism whereby multiple concurrent execution units can access and modify the same data object simultaneously to achieve data sharing and communication.
The corresponding view is determined from the data structure type, because each data structure type is associated with a view. The views corresponding to the data structure types are, respectively: a trivial structure view, a non-trivial structure view, a random container view, a string view, an ordered set view, an ordered map view, a hash set view, a hash map view, an optional view, a union view, a bit set (bitset) view, a variable length structure view, a variable length integer container view, and a shared object view.
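The association between a data structure type and its view can be illustrated with a compile-time classification. The following is a minimal sketch, not the method of the application: the names ViewKind, is_vector, view_kind_of, Point, and Named are hypothetical, only four of the listed view kinds are covered, and std::is_trivially_copyable is used as a stand-in for "all fields are fixed-length", which is an assumption.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical subset of the view kinds listed above.
enum class ViewKind { TrivialStruct, NonTrivialStruct, RandomContainer, String };

// Detects std::vector so it can be mapped to the random container view.
template <typename T> struct is_vector : std::false_type {};
template <typename T, typename A>
struct is_vector<std::vector<T, A>> : std::true_type {};

// Chooses a view kind at compile time from the type of the data.
// Assumption: "trivial" is approximated by std::is_trivially_copyable.
template <typename T>
constexpr ViewKind view_kind_of() {
    return std::is_same<T, std::string>::value      ? ViewKind::String
         : is_vector<T>::value                      ? ViewKind::RandomContainer
         : std::is_trivially_copyable<T>::value     ? ViewKind::TrivialStruct
                                                    : ViewKind::NonTrivialStruct;
}

struct Point { int x; float y; };            // only fixed-length fields
struct Named { int id; std::string name; };  // contains a variable-length field
```

With these definitions, view_kind_of<Point>() evaluates to the trivial structure kind and view_kind_of<Named>() to the non-trivial structure kind, without any run-time type inspection.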
A view is a data structure that stores data as a compressed byte stream and describes the data it stores with the data between a start address and an end address. The start address and the end address may be addresses in memory or on disk. Each view corresponds to a different processing mode. The data to be processed is serialized according to the processing mode of its corresponding view, and the serialized data corresponding to the data to be processed is obtained by storing the view.
First, the data to be processed and the data structure type corresponding to the data to be processed are obtained; second, a view corresponding to the data to be processed is determined based on the data structure type; finally, the data to be processed is serialized according to the view. The view stores data as a compressed byte stream and describes the stored data with the data between a start address and an end address. In this embodiment, because the view stores data as a compressed byte stream, serializing the data with the view reduces the volume of the serialized data.
In one implementation, step S203 of serializing the data to be processed according to the view includes: compressing the data to be processed into a byte stream, and arranging the byte stream according to the binary layout of the view corresponding to the data to be processed.
In practical applications, data to be processed of different data structure types is compressed into a byte stream in different ways, and the binary layouts of the corresponding views also differ.
For a trivial structure, the fields are first aligned. The specific alignment rules may be set as needed; for example, the alignment rules of the C language may be used. The aligned fields of the trivial structure are then copied byte by byte and closely arranged according to the binary layout of the trivial structure view.
In one example, the data to be processed is the following trivial structure:
the binary layout and number of bytes occupied for this trivial structure view are shown in table 1:
In the table, the first entry represents the stored data of the first field of vehicle_data, a 64-bit integer that represents a unique identifier (ID). The second entry represents the stored data of the second field of vehicle_data, an array named name with 20 array elements. Padding represents the padding bytes added. The last entry represents the stored data of the third field of vehicle_data, the speed of the vehicle.
In this embodiment, the trivial structure view does not need to store offsets for the trivial fields, which further reduces storage space.
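The byte-copy treatment of a trivial structure can be sketched as follows. This is an illustration under stated assumptions rather than the layout of the application: the field names id and speed, the type double for the speed, and the helper names serialize and read_speed are hypothetical; only the 64-bit identifier, the 20-element array name, and the presence of padding come from the description above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed field names and types; the description gives them only in words.
struct vehicle_data {
    std::uint64_t id;  // 64-bit unique identifier
    char name[20];     // array with 20 elements
    double speed;      // vehicle speed; the type double is an assumption
};

// Serializes by copying the aligned object representation byte for byte.
// No per-field offsets are written: every field sits at a fixed position.
std::array<unsigned char, sizeof(vehicle_data)> serialize(const vehicle_data& v) {
    std::array<unsigned char, sizeof(vehicle_data)> out{};
    std::memcpy(out.data(), &v, sizeof v);
    return out;
}

// Reads one field straight from the byte stream using its fixed offset.
double read_speed(const unsigned char* view_begin) {
    double s;
    std::memcpy(&s, view_begin + offsetof(vehicle_data, speed), sizeof s);
    return s;
}
```

Because the position of every field is known from the type alone, a reader can fetch speed without parsing id or name first. With the assumed types, common 64-bit ABIs place padding bytes between name and speed, which corresponds to the Padding entry of the layout.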
The compression processing for different types of data to be processed is described in the following embodiments.
In one implementation, compressing the data to be processed into a byte stream includes: compressing the number of bytes occupied by the offset, relative to the view, of the sub-view corresponding to each of a plurality of elements in the data to be processed. A sub-view describes an element of the data to be processed using the data between a start address and an end address, and the range between the start address and the end address of the sub-view is contained within the range between the start address and the end address of the view.
Wherein the data to be processed is made up of a plurality of elements, which may be fields or other types of data structures in the container. For non-trivial structures and random containers containing non-trivial elements, the number of bytes occupied by the offset of the child view of the non-trivial element may be compressed when serializing, where the non-trivial element may include non-trivial fields, or other types of data types in the container that are not fixed in length. The length of the trivial element is fixed and no offset may be stored.
Each element corresponds to a sub-view, and the data between the start address and the end address of the sub-view may describe the element. The range between the starting address and the ending address of the sub-view is included between the starting address and the ending address of the view, and the inline storage mode enables data of each element to be accessed according to the starting address and the ending address of the sub-view when deserialization is carried out, and the data is not required to be accessed by jumping to the part outside the view, so that the data access speed is increased.
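The relationship between a view and its sub-views can be sketched with a pair of addresses. In this minimal sketch the names View, sub_view, and contains are hypothetical, and the range is treated as half-open, which is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical view type: the bytes between a start and an end address.
struct View {
    const unsigned char* begin;
    const unsigned char* end;  // one past the last byte (assumed half-open)
    std::size_t length() const { return static_cast<std::size_t>(end - begin); }
};

// Builds a sub-view from an offset and a length relative to the parent.
// Only addresses are computed; no bytes are copied.
inline View sub_view(View parent, std::size_t offset, std::size_t length) {
    return View{parent.begin + offset, parent.begin + offset + length};
}

// True when the child range lies entirely inside the parent range,
// which is the inline storage property described above.
inline bool contains(View parent, View child) {
    return parent.begin <= child.begin && child.begin <= child.end &&
           child.end <= parent.end;
}
```

Accessing an element then amounts to reading bytes between two addresses inside the parent range, so no jump outside the view is needed.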
The specific implementation manner of compressing the byte number occupied by the offset is as follows:
In one implementation, compressing the number of bytes occupied by the offset, relative to the view, of the sub-view corresponding to each of the plurality of elements in the data to be processed includes: determining, based on the length of the view and the shortest length of the view, the number of bytes occupied by the offset of each element's sub-view relative to the view, so as to compress the number of bytes occupied by the offsets. The shortest length of the view is the sum of the length of the trivial elements in the data to be processed and the shortest length of the non-trivial elements in the data to be processed; the length of a trivial element is a fixed value, and the length of a non-trivial element is a non-fixed value.
In practical applications, the offset is the distance between the starting position of the sub-view and the starting position of the view. The offset is an unsigned integer stored in plain binary form. The length of the offset is determined by a compression algorithm. For a non-trivial structure, the number of bytes occupied by the offset is determined dynamically based on the length of the view and the shortest length of the view, where length means the number of bytes occupied.
In one example, for a non-trivial structure view, the sum of the lengths of the n sub-views is calculated recursively to be x (the length of a sub-view is the number of bytes occupied by the field described by that sub-view), the total length of the trivial fields of the view is y, and the shortest length of the view is z. Let the number of bytes k occupied by each offset belong to the set {1, 2, 4, 8}; the smallest k is found such that the inequality $x + y + nk - z \le 2^{8k}$ holds, where the length of the view is $x + y + nk$.
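The search for the smallest offset width can be sketched as a loop over the candidate widths. This is a minimal sketch under an assumption about the boundary: it requires the quantity x + y + nk - z to be representable in a k-byte unsigned integer, that is, to be at most 2^(8k) - 1; if the intended bound differs, only the boundary case changes. The function name offset_width is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// x: sum of the sub-view lengths, y: total length of the trivial fields,
// n: number of offsets stored, z: shortest length of the view.
// Returns the smallest k in {1, 2, 4, 8} for which x + y + n*k - z fits
// in a k-byte unsigned integer (assumed reading of the inequality).
// Precondition: z <= x + y, since z is the shortest possible length.
inline unsigned offset_width(std::uint64_t x, std::uint64_t y,
                             std::uint64_t n, std::uint64_t z) {
    const unsigned candidates[] = {1, 2, 4};
    for (unsigned k : candidates) {
        // k appears on both sides, so each candidate is tested in turn.
        const std::uint64_t bound = x + y + n * k - z;
        const std::uint64_t largest = (std::uint64_t{1} << (8 * k)) - 1;
        if (bound <= largest) return k;
    }
    return 8;  // any 64-bit bound fits in 8 bytes
}
```

For example, with x = 300, y = 20, n = 3, and z = 30, one byte is not enough because 293 exceeds 255, so two bytes are used for each offset.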
In another example, the length of the view minus the shortest length of the structure and the length of the offset table yields an upper bound on the offset, and the number of bytes occupied by the offset is dynamically determined based on the range of the upper bound, as shown in table 2:
in one example, the data to be processed is the following non-trivial structure:
the binary layout of the foo structure view is shown in table 3:
Wherein the trivial section is formed by sequentially arranging and closely attaching the trivial fields in the non-trivial structure body. Unlike a trivial structure, the fields are directly and closely arranged without considering the memory alignment problem. The offset table is an array of unsigned integers. Assuming that there are n non-trivial fields within the non-trivial structure, the offset table has a length of n-1. The offset table is empty when n is 1. The offset of the non-trivial field is recorded in the offset table, the i-th unsigned integer in the offset table is equal to the distance of the start address of the i+1-th non-trivial field in the structure relative to the end of the offset table, and the shortest length of the first i non-trivial fields is subtracted. Since the start address of the first non-trivial field is the address of the end of the offset table, the end address of the first non-trivial field is the start address plus the shortest length of the field, the offset of the first non-trivial field does not need to be recorded in the offset table, thereby saving the volume of the view. The non-trivial sections are formed by sequentially arranging the non-trivial fields in the structure.
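The rule for filling the offset table can be sketched as follows. This is an illustration of the description above rather than a definitive implementation; the names FieldSize, build_offset_table, and field_start are hypothetical, and offsets are kept as 64-bit values before any width compression is applied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct FieldSize {
    std::uint64_t length;    // actual serialized length of a non-trivial field
    std::uint64_t shortest;  // shortest possible length of that field
};

// Builds the offset table for n non-trivial fields; it has n - 1 entries.
// Entry i holds the start of field i + 1, measured from the end of the
// offset table, minus the shortest lengths of the first i fields.
std::vector<std::uint64_t> build_offset_table(const std::vector<FieldSize>& fields) {
    std::vector<std::uint64_t> table;
    std::uint64_t start = 0;         // start of the next field
    std::uint64_t shortest_sum = 0;  // shortest lengths seen so far
    for (std::size_t i = 0; i + 1 < fields.size(); ++i) {
        start += fields[i].length;
        shortest_sum += fields[i].shortest;
        table.push_back(start - shortest_sum);
    }
    return table;
}

// Recovers the start of field `index` (0-based) from the table.
std::uint64_t field_start(const std::vector<std::uint64_t>& table,
                          const std::vector<FieldSize>& fields, std::size_t index) {
    if (index == 0) return 0;  // the first field begins right after the table
    std::uint64_t shortest_sum = 0;
    for (std::size_t i = 0; i < index; ++i) shortest_sum += fields[i].shortest;
    return table[index - 1] + shortest_sum;
}
```

Subtracting the shortest lengths keeps the stored numbers small, so a narrower offset width is more often sufficient, and when every field has its shortest length all entries are 0.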
In addition, the non-trivial field may be null, and in this case, the view length corresponding to the structure is the minimum length of the structure, that is, the upper bound of the offset in table 3 is 0. Since all non-trivial fields of the structure are null, the offset table may be omitted to compress the memory space.
Taking the structure foo in the above embodiment as an example, if all the non-trivial fields of foo are null, that is, the string, container, hash_map, and set fields are all null and only the field bar has a non-zero shortest length (because it contains a trivial int field), the binary layout of the view of foo is as shown in table 4:
It can be seen that, with null compression, the offset table is omitted and every non-trivial field takes its shortest length; a field that contains no trivial fields has a shortest length of 0 and can be ignored directly.
If the data to be processed is a random container, the binary layout of the random container view is shown in table 5:
At the head of the random container view is an unsigned integer field, size, which indicates the number of elements in the random container. It is followed by an offset table consisting of a total of size-1 unsigned integers (offsets), each representing the distance of the (i-1)-th non-trivial element of the random container relative to the end of the offset table, minus (i-2) times the minimum length of a random container element. Finally, the view stores the size non-trivial elements, closely packed. Since the start address of the first non-trivial element is the address of the end of the offset table, and the end address of the first non-trivial element is the start address plus the shortest length of the element, the offset of the first non-trivial element does not need to be recorded in the offset table, which reduces the volume of the view.
The number of bytes occupied by the offset of the sub-view of the non-trivial element of the random container with respect to the view may be compressed, and the specific implementation manner is the same as the implementation method of offset compression of the non-trivial structure in the above embodiment, which is not repeated here.
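Locating an element of a random container view without copying can be sketched as follows. The byte order of the stored unsigned integers is not stated in the description, so little-endian is assumed here, and the names read_uint and element_at are hypothetical. The sketch also assumes a non-empty container, since an empty one has a zero-length view.

```cpp
#include <cassert>
#include <cstdint>

// Reads an unsigned integer of `width` bytes (little-endian is an assumption).
inline std::uint64_t read_uint(const unsigned char* p, unsigned width) {
    std::uint64_t v = 0;
    for (unsigned i = 0; i < width; ++i)
        v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    return v;
}

// Returns a pointer to element `index` (0-based) inside the view.
// size_width / offset_width: bytes used by the size field and by each offset.
// min_len: shortest possible length of one element.
inline const unsigned char* element_at(const unsigned char* view, unsigned size_width,
                                       unsigned offset_width, std::uint64_t min_len,
                                       std::uint64_t index) {
    const std::uint64_t size = read_uint(view, size_width);
    const unsigned char* table = view + size_width;
    const unsigned char* elements = table + (size - 1) * offset_width;  // end of table
    if (index == 0) return elements;  // the first element has no table entry
    const std::uint64_t stored =
        read_uint(table + (index - 1) * offset_width, offset_width);
    // Add back the shortest lengths that were subtracted when the offset was stored.
    return elements + stored + index * min_len;
}
```

The returned pointer refers to bytes inside the original buffer, so the element is accessed in place; its end is the start of the next element or the end of the view.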
Furthermore, the number of bytes occupied by the number of elements of the random container may also be compressed, and in one example, the compression algorithm is as follows:
Let x belong to the set {1, 2, 4, 8} and find the smallest x such that the total length of the random container view is less than or equal to $(256^x - 1) \times (\text{shortest length of a random container element} + 1) + x$; this smallest x is the number of bytes occupied by the length field of the random container. Here, the total length of the random container view = the unsigned integer field (this field stores the number of random container elements; 4 bytes are needed) + the offset table length (× size of the container) + the sum of the lengths of the individual sub-views stored by the random container.
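The choice of the width of the length field can be sketched in the same way as the offset width. This sketch assumes the bound reads (256^x - 1) × (shortest element length + 1) + x; the function name size_field_width is hypothetical, and the shortest element length is assumed small enough that the product does not overflow 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// total_length: total length of the random container view in bytes.
// min_elem_len: shortest possible length of one element.
// Returns the smallest x in {1, 2, 4, 8} satisfying the assumed bound
// total_length <= (256^x - 1) * (min_elem_len + 1) + x.
inline unsigned size_field_width(std::uint64_t total_length,
                                 std::uint64_t min_elem_len) {
    const unsigned candidates[] = {1, 2, 4};
    for (unsigned x : candidates) {
        const std::uint64_t max_count = (std::uint64_t{1} << (8 * x)) - 1;  // 256^x - 1
        if (total_length <= max_count * (min_elem_len + 1) + x) return x;
    }
    return 8;
}
```

A reader that knows only the total length of the view and the element type can repeat the same computation, so the width itself does not have to be stored.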
In addition, when the number of elements of the random container is zero, the random container is empty. At this point, the length of the random container view is zero.
When the element stored in the random container is a trivial element, since the length of the trivial element is fixed, the offset table and the size of the random container may be omitted from the view, thereby optimizing the storage space.
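When the elements are trivial, the element count and every element position follow from the view length alone. The following minimal sketch uses int elements; the function names serialize_ints, element_count, and element_at are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Serializes fixed-length elements back to back: no size field, no offset table.
std::vector<unsigned char> serialize_ints(const std::vector<int>& v) {
    std::vector<unsigned char> out(v.size() * sizeof(int));
    if (!v.empty()) std::memcpy(out.data(), v.data(), out.size());
    return out;
}

// The element count is implied by the length of the view.
std::size_t element_count(std::size_t view_length) {
    return view_length / sizeof(int);
}

// Reads element i directly from the byte stream at a computed position.
int element_at(const unsigned char* view, std::size_t i) {
    int value;
    std::memcpy(&value, view + i * sizeof(int), sizeof value);
    return value;
}
```

An empty container produces a zero-length byte stream, which matches the rule that the view of an empty random container has length zero.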
In one example, the structure vector is composed of n trivial elements of the int type, the binary layout of the view of vector is shown in table 6:
For data to be processed of the string type, since the element type stored in the view is a fixed-length character type, the data can be processed in the same way as a trivial structure, and the details are not repeated here. To be compatible with C language strings, a '\0' character is added at the end of the string; it is not counted in the string length. The character types in the string may include, but are not limited to, char8_t, char16_t, or char32_t.
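The handling of the terminating character can be sketched for single-byte characters. This sketch assumes the '\0' is stored in the byte stream immediately after the characters; the function names serialize_string and string_length are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Writes the characters followed by a terminating '\0' for C compatibility.
std::vector<unsigned char> serialize_string(const std::string& s) {
    std::vector<unsigned char> out(s.size() + 1);  // zero-filled, so the last byte is '\0'
    std::memcpy(out.data(), s.data(), s.size());
    return out;
}

// String length as reported for the view: the terminator is not counted.
std::size_t string_length(const std::vector<unsigned char>& view) {
    return view.empty() ? 0 : view.size() - 1;
}
```

Because the terminator is present in the bytes, the start address of the view can be passed to a C function that expects a null-terminated string without making a copy.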
For the data to be processed of the ordered set type, the binary layout of the view is the same as the binary layout of the random container view, and the arrangement order of the elements in the binary layout of the ordered set view is the same as the arrangement order of the elements in the ordered set.
For data to be processed of the ordered map type, the binary layout of the view is the same as that of a random container view whose elements are key-value pairs, and the key-value pairs in the binary layout of the ordered map view are arranged in order of the keys from large to small. In addition, the ordered map also performs null compression: when the number of elements of the ordered map is zero, the ordered map is null, and the length of the ordered map view is zero.
For hash set type data to be processed, in one example, the binary layout of its view is as shown in Table 7:
wherein the binary layout of the hash set view begins with an unsigned integer representing the total number of elements of the hash set, followed by a random container. The length of this random container is equal to the bucket size bucket_size of the hash table; each element of this random container is itself a random container < key, value >, and the elements contained in that inner random container are the elements of the hash set.
In addition, the length field (size) in the binary layout of the hash set view is variable length, and in one example, the compression algorithm is as follows:
Let x belong to the set {1, 2, 4, 8}, and find the minimum x such that the total length of the hash set view is less than or equal to $(256^x - 1) + x$; this minimum x is the number of bytes occupied by the length field of the hash set view. Here, the total length of the hash set view = x + the total length of the random container view contained in the hash set view. The hash set also performs null compression: when the number of elements of the hash set is zero, the length of the view is zero.
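The width selection for the hash set length field can be sketched in Python as follows. This is a minimal illustration only, assuming that the expression "256 x-1" in the text denotes 256^x - 1 (the largest value an x-byte unsigned integer can hold); the function name length_field_bytes and the parameter payload_len are hypothetical and do not come from the application.

```python
def length_field_bytes(payload_len: int) -> int:
    """Return the smallest x in {1, 2, 4, 8} such that
    x + payload_len <= (256**x - 1) + x.

    payload_len (hypothetical name) is the total length in bytes of the
    random container view contained in the hash set view.
    """
    if payload_len < 0:
        raise ValueError("payload_len must be non-negative")
    for x in (1, 2, 4, 8):
        total_len = x + payload_len            # total length of the hash set view
        if total_len <= (256 ** x - 1) + x:    # bound as read from the text
            return x
    raise ValueError("view too large for an 8-byte length field")
```

Because the bound depends only on the total length, a reader that knows the start and end addresses of the view can repeat the same search to recover x without any extra stored metadata.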
For data to be processed of the hash mapping type, the layout of its view is the same as that of a hash set view whose elements are key-value pairs. The compression method for the length field in the binary layout of the hash map view is the same as for the hash set view. The hash map also performs null compression: when the number of elements of the hash map is zero, the length of the view is zero.
For data to be processed of the optional type, in one example in which the optional stores an element of type T, when the optional is not empty, the binary layout of its view is as shown in Table 8:
wherein the binary layout of the optional view begins with a one-byte placeholder, all bits of which should be 0, followed by the element stored in the optional. When the optional is null, the length of the optional view is zero. In addition, the placeholder itself can be compressed: when the minimum length of the element stored in the optional is not zero, the one-byte placeholder at the head of the optional view, which serves to distinguish a null value from a non-null value, can be omitted.
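The rules for the optional view can be illustrated with a short Python sketch. It assumes that the placeholder is always omitted whenever the element's minimum length is non-zero (the text only says it can be omitted), and that end addresses are exclusive; the function name optional_subview and its parameters are hypothetical.

```python
from typing import Optional, Tuple


def optional_subview(start: int, end: int, elem_min_len: int) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the stored element's sub-view, or None if null.

    start and end delimit the optional view (end exclusive).
    elem_min_len is the minimum length in bytes of the stored element type.
    """
    if end == start:
        return None                 # zero-length view: the optional is null
    if elem_min_len > 0:
        return (start, end)         # placeholder omitted: a non-empty view is the element
    return (start + 1, end)         # skip the one-byte all-zero placeholder
```

The placeholder is only needed when an element may itself serialize to zero bytes, since in that case a zero-length view could not otherwise be told apart from a present but empty element.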
For data to be processed of the union type, in one example, the binary layout of its view is as shown in Table 9:
wherein index occupies one byte and records which type the union stores. It is followed by an element (value) of that type. The minimum length of this view is 1 byte.
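Reading a union view can be sketched as follows. This is an illustrative Python sketch only: the function name union_subview is hypothetical, and the view is modelled as a bytes object whose first byte is index and whose remaining bytes are the sub-view of value.

```python
def union_subview(view: bytes):
    """Split a union view into (index, value_view) without copying the value."""
    if len(view) < 1:
        raise ValueError("a union view is at least 1 byte long")
    buf = memoryview(view)
    index = buf[0]          # one byte recording which type is stored
    return index, buf[1:]   # remaining bytes form the sub-view of value
```

The returned memoryview refers to the original buffer, so the stored value is located but not copied.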
For data to be processed of the bit set type, in one example in which a boolean (bool) array has a length of 11, the binary layout of its view is shown in Table 10:
wherein each bit corresponds to one element of the boolean array, and the remaining positions are padding. The length of the view is [formula omitted in source].
For data to be processed of the dynamic bit set (dynamic_bitset) type among the bit set (bitset) types, where the length of the boolean array is variable, the binary layout of the dynamic bit set view is shown in Table 11:
wherein the binary layout of the dynamic bit set view begins with an unsigned integer length representing the length of the boolean array. The subsequent binary data (bits) is equivalent to a bit set view of the corresponding length.
Since the length field length is variable-length, it can be compressed as follows:
The total length of the bit set (bitset) view is len. Let x belong to the set {1, 2, 4, 8}, and find the minimum x satisfying the length condition [formula omitted in source]; this minimum x is the number of bytes occupied by the length field length. Here, the total length of the bitset view is [formula omitted in source].
When the length of the dynamic_bitset is zero, the type is considered to be null and null compression is performed, so the length of the view is zero.
For the data to be processed of the variable length structure type, in one example, the variable length integer type supported by the variable length structure is as shown in table 12:
for example, if there is a variable length structure:
The binary layout of the variable length structure view is shown in table 13:
For data to be processed of the variable-length integer container type, the binary layout of its view is shown in Table 14:
wherein the view begins with an unsigned integer representing the length of the variable-length integer container, followed by the sub-views of N variable-length structure elements.
The number of bytes occupied by the length field of the variable-length integer container may be compressed; the compression algorithm is as follows:
Let x belong to the set {1, 2, 4, 8}, and find the minimum x such that the total length of the variable-length integer container view is less than or equal to $(256^x - 1)/(\text{shortest length of the variable-length integer container element}) + x$; this minimum x is the number of bytes occupied by the length field of the variable-length integer container. Here, the total length of the variable-length integer container view = x + the sum of the lengths of the variable-length integer fields. When the number of elements contained in the variable-length integer container is zero, the container is considered to be empty, and the corresponding view length is zero.
For the data to be processed for the shared object type, in one example, the binary layout of the shared object view is as shown in Table 15:
where object represents the object stored by the shared object, and 0xFF represents the relative offset pointing to the address where the object is actually located; 0xFF occupies 1 byte and is immediately followed by the object.
In yet another example, the binary layout of the shared object view is as shown in Table 16:
where offset is an unsigned integer, the length of the integer may be 1, 2, 4 or 8 bytes. The integer is the relative offset of the location where the object is actually stored with respect to the view start address. In addition, when the object pointed to by the view is null, the view is null and the length is zero.
Example III
An embodiment of the present application provides a data deserialization method, which can be applied to a computing device such as a server or a user terminal. FIG. 3 is a flowchart of a data deserialization method according to an embodiment of the present application, which includes:
Step S301, acquiring the serialized data.
Step S302, determining a view corresponding to the serialized data.
Step S303, performing deserialization processing on the serialized data according to the view.
Wherein the view stores data in a compressed byte stream and describes the stored data using data between a start address and an end address.
The serialized data is a compressed and stored binary byte stream, and deserialization is the process of converting the binary byte stream back into an object.
The types of data structures prior to the serialization process may include, but are not limited to, any of the following: a trivial structure, a non-trivial structure, a random container, a string, an ordered set, an ordered map, a hash set, a hash map, an optional, a union, a bit set, a variable length structure, a variable length integer container, or a shared object.
The view corresponding to the serialized data includes: a trivial structure view, a non-trivial structure view, a random container view, a string view, an ordered set view, an ordered map view, a hash set view, a hash map view, an optional view, a union view, a bit set (bitset) view, a variable-length structure view, a variable-length integer container view, or a shared object view.
In practical application, after determining the data to be subjected to deserialization, the data does not need to be copied into a memory, but each field in the data is accessed according to the binary layout rule of the view corresponding to the data, so that the deserialization of the object is realized.
The data deserialization method provided by the embodiment of the application comprises the steps of firstly, obtaining data after serialization processing; secondly, determining a view corresponding to the serialized data; finally, according to the view, performing deserialization on the serialized data; wherein the view stores data in a compressed byte stream and describes the stored data with data between a start address and an end address. In this embodiment, the data is deserialized according to the start address and the end address of the view, and the data does not need to be copied into the memory, so that the deserializing time is reduced, and the deserializing efficiency is improved.
In one implementation, step S302, determining a view corresponding to the serialized data includes:
step S3021, acquiring a start address, an end address, and a binary layout of a view;
step S3022, determining a start address and an end address of a sub-view corresponding to a plurality of elements of the data before the serialization processing according to the start address and the end address of the view and the binary layout of the view; wherein the child view describes elements in the data before the serialization process using the data between the start address and the end address; the range between the start address and the end address of the sub-view is contained between the start address and the end address of the view.
Wherein each element corresponds to a sub-view, and the data between the start address and the end address of the sub-view describes the element. The range between the start address and the end address of the sub-view is contained between the start address and the end address of the view. With this inline storage, the data of each element can be accessed during deserialization according to the start address and end address of its sub-view, without jumping to any location outside the view, which increases the data access speed.
In practical application, according to the starting address, the ending address and the binary layout of the views carried in the data after the serialization processing, the starting address and the ending address of the sub-view corresponding to each element of the data before the serialization can be calculated, and each element can be directly accessed according to the starting address and the ending address of each sub-view, so that the deserialization is realized.
In an example, for a binary layout of a trivial structure view as shown in table 1, since each field of the trivial structure is a trivial field, the length of the trivial field is fixed, the starting address of the view is the starting address of the sub-view of the first field, and the ending address of the sub-view of the first field can be obtained from the starting address of the sub-view of the first field and the length of the first field; the end address of the sub-view of the first field is the start address of the sub-view of the second field, and then the end address of the sub-view of the second field can be obtained according to the start address of the sub-view of the second field and the length of the second field, and the start address and the end address of all the sub-views can be obtained by the same method.
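The chained address calculation for a trivial structure view can be sketched in Python as follows. This is a minimal sketch, not the application's implementation: the function name trivial_subviews is hypothetical, addresses are modelled as integer offsets into a buffer, and the usage lines assume a packed little-endian layout, which the text does not specify.

```python
import struct
from itertools import accumulate


def trivial_subviews(view_start: int, field_lengths):
    """Return [(start, end), ...] for each fixed-length field, end exclusive.

    Each field's sub-view starts where the previous one ends; the first
    sub-view starts at the start address of the view.
    """
    ends = [view_start + e for e in accumulate(field_lengths)]
    starts = [view_start] + ends[:-1]
    return list(zip(starts, ends))


# Usage: a structure with an int32, an int64 and a uint8 field
# (packed little-endian layout is an assumption of this sketch).
buf = struct.pack("<iqB", 7, -1, 255)
views = trivial_subviews(0, [4, 8, 1])
second = struct.unpack_from("<q", buf, views[1][0])[0]  # read in place, no copy of buf
```

Since every field length is fixed, the same addresses could also be precomputed once per structure type rather than per view.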
In an example, for a binary layout of a random container view as shown in table 6, since the number of elements of the random container is n, the length of the trivial element can be obtained by dividing the length of the random container view by n, and then the start address and the end address of the sub-view of each element are obtained according to the start position of the view and the length of the trivial element.
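For a random container of trivial elements, the division described above can be sketched as follows. The function name trivial_container_subviews is hypothetical, the element count n is taken as given (as in the text), and end addresses are treated as exclusive.

```python
def trivial_container_subviews(view_start: int, view_end: int, n: int):
    """Return [(start, end), ...] for n equal-length trivial elements."""
    view_len = view_end - view_start
    if n <= 0 or view_len % n != 0:
        raise ValueError("view length must be a positive multiple of n")
    elem_len = view_len // n          # length of one trivial element
    return [
        (view_start + i * elem_len, view_start + (i + 1) * elem_len)
        for i in range(n)
    ]
```

Any single element can also be addressed directly as view_start + i * elem_len without building the whole list, which is what makes access constant-time.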
In one example, for the binary layout of the optional view shown in Table 8, the placeholder occupies one byte at the head of the view; therefore, the start address of the view plus one byte is the start address of the sub-view of the element, and the end address of the view is the end address of the sub-view of the element.
In one example, for the binary layout of the union view shown in Table 9, index occupies one byte at the head of the view; therefore, the start address of the view plus one byte is the start address of the sub-view of the element value, and the end address of the view is the end address of the sub-view of the element value.
In an example, for a binary layout of a bit set view as shown in table 10, the start address of the view is the start address of the child view of the boolean array, and the end address of the view is the end address of the child view of the boolean array.
In one example, for the binary layout of the dynamic bit set view shown in Table 11, the end address of the view minus its start address is the total length len of the view. Let x belong to the set {1, 2, 4, 8}, and find the minimum x satisfying the length condition [formula omitted in source]; this minimum x is the number of bytes occupied by the length field length. The start address of the view plus the number of bytes occupied by the length field length is the start address of the sub-view of the bitset, and the end address of the view is the end address of the sub-view of the bitset.
In one example, for the binary layout of the shared object view shown in Table 15, since 0xFF occupies 1 byte at the head of the view, the start address of the view plus one byte is the start address of the sub-view of the object, and the end address of the view is the end address of the sub-view of the object.
In one example, for a binary layout of a shared object view as shown in Table 16, the actual storage location of the object may be derived from the starting address and relative offset of the view.
In one implementation, step S3022, determining, according to the start address and the end address of the view and the binary layout of the view, the start address and the end address of the sub-view respectively corresponding to the plurality of elements of the data before the serialization processing includes: determining an offset of a child view of a non-trivial element of a plurality of elements stored in a binary layout relative to the view; and determining the starting address and the ending address of the sub-view respectively corresponding to a plurality of elements of the data before the serialization processing according to the starting address, the ending address and a plurality of offsets of the view.
Wherein the offset is the distance of the starting position of the child view of the non-trivial element from the starting position of the view. According to the starting address and the ending address of the view and the offset of each sub-view, the starting address and the ending address of each sub-view can be calculated, and according to the starting address and the ending address of each sub-view, the data corresponding to each element can be directly accessed.
In one example, for the binary layout of a random container view as shown in Table 5, the start address and end address of the sub-view of any element in the random container can be obtained in constant time. For example, to calculate the start address and end address of the i-th element, the start address of the sub-view of the i-th element is obtained by reading the offset of the (i-1)-th element from the offset table and adding to it the tail address of the offset table and (i-2) times the minimum length of the elements in the random container. The end address of the sub-view of the i-th element is the start address of the sub-view of the (i+1)-th element, and can therefore be calculated by the same algorithm.
For a non-trivial structure view, a specific implementation of determining the offset is given in the following example:
In one implementation, the view comprises a non-trivial structure view; determining an offset, relative to the view, of a sub-view of a non-trivial element among a plurality of elements stored in the binary layout comprises: determining the offset of the sub-view of the current non-trivial element relative to the view based on the start address of the view, the start address and end address of the sub-views of the trivial elements among the plurality of elements, and the shortest lengths of the non-trivial elements preceding the current non-trivial element.
Wherein the offset is the distance of the starting address of the child view of the non-trivial element relative to the starting address of the view.
In one example, for a binary layout of a non-trivial structure view as shown in table 3, since the length and order of the trivial fields are fixed, the start address and end address of the sub-view of each trivial field in the non-trivial structure can be calculated from the start address of the view.
When calculating the sub-view of the i-th non-trivial field, it is only necessary to read the (i-1)-th and i-th elements of the offset table and add to them the sums of the shortest lengths of the first i-1 and the first i non-trivial fields, respectively, to obtain the distances of the start address and the end address of the sub-view of the i-th non-trivial field from the tail of the offset table. Adding the total length of the trivial fields and the length of the offset table to the former gives the distance between the start address of the sub-view of the i-th non-trivial field and the start address of the view, that is, the offset of the sub-view of the i-th non-trivial field relative to the view.
Because the length of the offset table and the lengths of the trivial fields are known, the tail address of the offset table can be calculated from the start address of the view. The tail address of the offset table, plus the (i-1)-th offset stored in the offset table, plus the sum of the shortest lengths of the first i-1 non-trivial fields, is the start address of the sub-view of the i-th non-trivial field; the tail address of the offset table, plus the i-th offset stored in the offset table, plus the sum of the shortest lengths of the first i non-trivial fields, is the end address of the sub-view of the i-th non-trivial field.
Since the shortest length of each non-trivial field is known, a prefix sum of the shortest lengths can be pre-calculated for each non-trivial field, namely the sum of the shortest length of the current non-trivial field and the shortest lengths of all non-trivial fields preceding it. That is, the prefix sum of the i-th non-trivial field is the sum of the shortest lengths of the first i non-trivial fields. In this way, the start address and the end address of any sub-view can be obtained in constant time.
Here, i is an integer greater than 1, because the start address of the sub-view of the first non-trivial field is the tail of the offset table. For the sub-view of the last non-trivial field, the end address is the end address of the view.
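The prefix-sum calculation for a non-trivial structure view can be sketched in Python as follows. This is a sketch under stated assumptions rather than the application's implementation: the offset table is assumed to be already decoded into a list of m-1 integers for m non-trivial fields, the shortest-length sums are taken over the non-trivial fields, end addresses are exclusive, and all names (nontrivial_subview, trivial_total_len, offset_table_len, offsets, min_lens) are hypothetical.

```python
from itertools import accumulate


def nontrivial_subview(view_start, view_end, trivial_total_len,
                       offset_table_len, offsets, min_lens, i):
    """Return (start, end) of the sub-view of the i-th non-trivial field (1-based)."""
    m = len(min_lens)
    if not 1 <= i <= m:
        raise IndexError("field index out of range")
    if len(offsets) != m - 1:
        raise ValueError("expected one offset entry per field except the last")
    # prefix[k] = sum of the shortest lengths of the first k non-trivial fields;
    # a real implementation would precompute this once per structure type.
    prefix = [0] + list(accumulate(min_lens))
    tail = view_start + trivial_total_len + offset_table_len  # tail of the offset table
    # First field starts at the tail of the offset table.
    start = tail if i == 1 else tail + offsets[i - 2] + prefix[i - 1]
    # Last field ends at the end of the view.
    end = view_end if i == m else tail + offsets[i - 1] + prefix[i]
    return start, end
```

With the prefix sums precomputed, each lookup costs at most two offset-table reads and a few additions, which matches the constant-time claim in the text.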
In one example, for the case where the non-trivial fields in the non-trivial structure are null, with the binary layout of the non-trivial structure view as shown in Table 4, the start address of the sub-view of the i-th non-trivial field is obtained simply by taking the start address of the non-trivial structure view, adding the total length of the trivial fields, and adding the sum of the shortest lengths of the first i-1 non-trivial fields. The operation of acquiring the sub-views can therefore also be completed in constant time.
In one implementation, step S303, performing, according to the view, deserialization processing on the serialized data, includes: sequentially accessing the data after the serialization processing according to the start address and the end address of the sub-view respectively corresponding to the plurality of elements, and sequentially obtaining the data before the serialization processing respectively corresponding to the plurality of elements.
In practical application, for views corresponding to different data types, data before serialization processing corresponding to a plurality of elements can be obtained sequentially according to the sequence of each sub-view.
In one example, for the binary layout of a variable-length structure view as shown in Table 13, the start address and end address of the sub-view of each field are calculated iteratively: first the start address and end address of the sub-view of x0, then those of x1, then those of y0, and finally those of y1, after which x0, x1, y0, and y1 are accessed in sequence.
Wherein, the starting address of the sub-view of x0 is the starting address of the view, the ending address is the starting address of the view plus the minimum length of x0, the starting address of the sub-view of x1 is the ending address of the sub-view of x0, the ending address of the sub-view of x1 is the ending address of the sub-view of x0 plus the minimum length of x1, and so on, to obtain the starting address and ending address corresponding to x0, x1, y0, y1 respectively.
In one example, for the binary layout of a variable-length integer container view as shown in Table 14, the start address of the view is subtracted from the end address to obtain the total length of the view. Let x belong to the set {1, 2, 4, 8}, and find the minimum x such that $\text{total length} \le (256^x - 1)/(\text{shortest length of the variable-length integer container element}) + x$; this minimum x is the number of bytes occupied by the length field of the variable-length integer container. The start address of the view plus x is the start address of the sub-view of element1; adding the minimum length of element1 to that start address gives the end address of the sub-view of element1, which is also the start address of the sub-view of element2; adding the minimum length of element2 gives the end address of the sub-view of element2; and the sub-views corresponding to element1 … element N are obtained in the same way.
In one implementation, step S303, performing, according to the view, deserialization processing on the serialized data, includes: inquiring target elements needing to be subjected to deserialization; and accessing the data after the serialization processing according to the starting address and the ending address of the sub-view corresponding to the target element to obtain the data before the serialization processing corresponding to the target element.
For the ordered set view and the ordered map view, binary search may be used to query the target element, after which the start address and end address of the sub-view of the target element are calculated. For the hash set view and the hash map view, the target element is queried using separate chaining (open hashing), after which the start address and end address of the sub-view of the target element are calculated.
Here, for the ordered set view and the ordered map view, the binary layout is the same as that of the random container view. The start address and end address of the view are known; during the binary search, the subscript i of the offset table corresponding to the target element is determined, and the start address and end address of the sub-view can then be obtained by random access. The random container view provides an algorithm for obtaining the start address and end address of a sub-view from a given subscript; see the calculation of the start and end addresses of the sub-views of the random container view in the above embodiment.
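A binary search performed directly on a view, without copying the elements out, can be sketched as follows for the simplest case. The sketch assumes an ordered set of trivial 4-byte signed integers stored in ascending order with no offset table (trivial elements allow the offset table to be omitted), and assumes little-endian encoding; the function name find_in_ordered_int_view is hypothetical.

```python
import struct


def find_in_ordered_int_view(view, target: int):
    """Binary-search a view of sorted little-endian int32 values.

    Returns the (start, end) byte offsets of the matching sub-view
    relative to the view start, or None if the target is absent.
    """
    buf = memoryview(view)
    elem_len = 4
    if len(buf) % elem_len:
        raise ValueError("view length is not a multiple of the element length")
    lo, hi = 0, len(buf) // elem_len
    while lo < hi:
        mid = (lo + hi) // 2
        value = struct.unpack_from("<i", buf, mid * elem_len)[0]  # read in place
        if value == target:
            return (mid * elem_len, (mid + 1) * elem_len)
        if value < target:
            lo = mid + 1
        else:
            hi = mid
    return None
```

For non-trivial elements the same loop applies, with the sub-view of subscript mid obtained from the offset table instead of by multiplication.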
In one example, for the binary layout of a hash set view as shown in Table 7, the target element is queried using separate chaining. First, the length of the container < key, value > in the view is calculated, which is the bucket size bucket_size of the hash table; the hash value of the key passed in by the user is calculated and taken modulo bucket_size, which gives the bucket_id of the bucket in which the element is located. Then, the element corresponding to bucket_id can be accessed randomly. That element is itself a random access container whose members are elements of the hash table, so it is only necessary to traverse the hash table elements in that random access container to determine whether the target element exists in the set. If, on reaching the i-th element, that element is found to be equal to the target element being queried, the start and end addresses of the sub-view of that element are the start and end addresses of the sub-view of the target element.
The traversal of the random access container proceeds as follows. The start address of the first sub-view is obtained by adding, to the start address of the view of the random access container, the number of bytes occupied by the unsigned integer at the head and by the offset table. The start addresses of the second through n-th sub-views are then determined from the fields of the offset table. The start address of the i-th sub-view is the end address of the (i-1)-th sub-view, and the end address of the last sub-view is the same as the end address of the random container view. Thus, by traversing the fields of the offset table, the start address and end address of each sub-view can be obtained in turn.
For the hash map view, the target element is searched for using the same method, and the start address and end address of the sub-view of the target element are then calculated.
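The bucket lookup can be illustrated with the following Python sketch. It models only the lookup logic, not the binary layout: each bucket is a plain Python list standing in for an inner random container view, and zlib.crc32 is used as a stand-in hash function because the application does not name one. The names bucket_id, build_buckets, and contains are hypothetical.

```python
import zlib


def bucket_id(key: bytes, bucket_size: int) -> int:
    """Hash the key and take it modulo the bucket size (crc32 is an assumption)."""
    return zlib.crc32(key) % bucket_size


def build_buckets(keys, bucket_size: int):
    """Distribute keys into bucket_size buckets, skipping duplicates."""
    buckets = [[] for _ in range(bucket_size)]
    for key in keys:
        bucket = buckets[bucket_id(key, bucket_size)]
        if key not in bucket:
            bucket.append(key)
    return buckets


def contains(buckets, key: bytes) -> bool:
    """Jump to one bucket, then traverse only that bucket's elements."""
    if not buckets:
        return False                       # empty view: null-compressed hash set
    bucket = buckets[bucket_id(key, len(buckets))]
    return any(elem == key for elem in bucket)
```

A reader and a writer must of course agree on the hash function, since the bucket index is recomputed at lookup time rather than stored.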
Compared with the related art, the deserialization method and apparatus of the present application deserialize faster; in the related art, more memory-read operations and jump operations are needed to access a field. For example, to read a string field contained in a structure, the following operations are required in the related art:
1. The 32-bit offset of the structure is read, and a jump is made to the position of the vtable according to the offset.
2. The first member n of the vtable of the structure is read to obtain the number of member fields of the structure.
3. The 16-bit offset corresponding to the string field in the vtable is read, and a jump is made to the position of the string's offset according to that offset.
4. The 16-bit offset of the string is read, and a jump is made to the actual storage location of the string according to the offset.
It can be seen that four memory reads and four jumps are needed before the string is actually read.
The present scheme requires far fewer reads and jumps; the specific operations are as follows:
1. From the view of the structure, the offset of the string within the structure is read directly, and the sub-view of the string is obtained from that offset.
2. If the structure in step 1 contains only one non-trivial field, namely the string, the sub-view of the string can be calculated directly without reading the offset of the string within the structure.
Compared with the four read-and-jump operations in the related art, the present scheme needs to read the offset at most once. Inline storage therefore brings significant advantages in both access speed and storage space.
In addition, in the related art, the offset is a fixed-length 32-bit signed integer, and data exceeding 32 bits is not supported, so the serialized data cannot exceed 2 GB; otherwise the representable range of a 32-bit signed integer would be exceeded. In the present application, the size of the offset can be dynamically compressed according to the view size, so serialized data beyond the 32-bit range can be supported.
Example IV
Corresponding to the application scenario and the method provided by the embodiments of the present application, an embodiment of the present application further provides a data serialization apparatus. Fig. 4 is a block diagram of a data serialization apparatus according to an embodiment of the present application, where the apparatus includes:
the first obtaining module 401 is configured to obtain data to be processed and a data structure type corresponding to the data to be processed.
A first determining module 402, configured to determine, based on the data structure type, a view corresponding to the data to be processed.
A serialization module 403, configured to perform serialization processing on the data to be processed according to the view.
Wherein the view stores data in a compressed byte stream and describes the stored data with data between a start address and an end address.
The data serialization apparatus provided by the embodiment of the present application first obtains data to be processed and the data structure type corresponding to the data to be processed; second, it determines a view corresponding to the data to be processed based on the data structure type; finally, it serializes the data to be processed according to the view. The view stores data in a compressed byte stream and describes the stored data with data between a start address and an end address. In this embodiment, since the view stores data as a compressed byte stream, using the view to serialize the data reduces the volume of the serialized data.
In one implementation, serialization module 403 is configured to:
and compressing the data to be processed into a byte stream, and arranging according to the binary layout of the view corresponding to the data to be processed.
In one implementation, serialization module 403, when compressing the data to be processed into byte streams, is configured to:
compressing the byte number occupied by the offset of the sub view corresponding to each of a plurality of elements in the data to be processed relative to the view;
wherein the child view describes elements of the data to be processed using data between a start address and an end address; the range between the start address and the end address of the sub-view is contained between the start address and the end address of the view.
In one implementation, the serialization module 403 is configured to, when performing compression processing on the number of bytes occupied by the offset of the sub-view corresponding to each of the plurality of elements in the data to be processed relative to the view:
based on the length of the view and the shortest length of the view, determining the byte number occupied by the offset of the sub-view corresponding to the elements relative to the view respectively so as to realize compression processing of the byte number occupied by the offset;
wherein the shortest length of the view is the sum of the length of the trivial element in the data to be processed and the shortest length of the non-trivial element in the data to be processed; the length of the trivial element is a fixed value; the length of the nontrivial element is a non-fixed value; the length represents the number of bytes occupied.
For the functions of each module in the embodiments of the present application, reference may be made to the corresponding descriptions in the above methods; the modules have the corresponding beneficial effects, which are not repeated here.
Example V
Corresponding to the application scenario and the method provided by the embodiments of the present application, an embodiment of the present application further provides a data deserialization apparatus. Fig. 5 is a block diagram of a data deserialization apparatus according to an embodiment of the present application, where the apparatus includes:
a second obtaining module 501, configured to obtain the serialized data;
a second determining module 502, configured to determine a view corresponding to the serialized data;
a deserializing module 503, configured to deserialize the serialized data according to the view;
wherein the view stores data in a compressed byte stream and describes the stored data with data between a start address and an end address.
The data deserialization apparatus provided by this embodiment first acquires the serialized data, then determines the view corresponding to the serialized data, and finally deserializes the serialized data according to the view, where the view stores data as a compressed byte stream and describes the stored data using the data between a start address and an end address. Because the data is deserialized according to the start address and the end address of the view, it does not need to be copied into memory, which reduces deserialization time and improves deserialization efficiency.
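As a rough illustration of why no copy is needed, the following Python sketch models a view as a start/end window over a shared buffer, using the standard `memoryview` type. The class name `View` and its methods are hypothetical and are not taken from the patent; the sketch only shows the containment rule for sub-views and the copy-free access.

```python
class View:
    """Hypothetical view: a [start, end) window over a shared byte buffer."""

    def __init__(self, buffer, start=0, end=None):
        self._mv = memoryview(buffer)
        self.start = start
        self.end = len(self._mv) if end is None else end
        if not 0 <= self.start <= self.end <= len(self._mv):
            raise ValueError("view bounds fall outside the buffer")

    def sub_view(self, offset, length):
        """Return a sub-view whose range is contained in this view's range."""
        start = self.start + offset
        end = start + length
        if offset < 0 or length < 0 or end > self.end:
            raise ValueError("sub-view must lie within its parent view")
        return View(self._mv, start, end)

    def bytes_view(self):
        """Slice of the underlying buffer; slicing a memoryview does not copy."""
        return self._mv[self.start:self.end]


# Usage: a 4-byte little-endian integer followed by a 5-byte string payload.
buf = bytearray(b"\x2a\x00\x00\x00hello")
root = View(buf)
number = int.from_bytes(root.sub_view(0, 4).bytes_view(), "little")
text = root.sub_view(4, 5)
```

Because `text` only records bounds into `buf`, a later in-place change to `buf` is visible through the sub-view, which is the behaviour one would expect from a copy-free design.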
In one implementation, the second determining module 502 is configured to:
acquiring a starting address, an ending address and a binary layout of the view;
determining a starting address and an ending address of a sub-view respectively corresponding to a plurality of elements of the data before serialization processing according to the starting address and the ending address of the view and the binary layout of the view;
wherein the sub-view describes an element of the data before the serialization processing using the data between a start address and an end address; the range between the start address and the end address of the sub-view is contained within the range between the start address and the end address of the view.
In one implementation, the view comprises a non-trivial structural body view; when determining the start address and the end address of the sub-view respectively corresponding to each of a plurality of elements of the data before serialization processing according to the start address and the end address of the view and the binary layout of the view, the second determining module 502 is configured to perform:
determining the offsets, relative to the view, of the sub-views of the non-trivial elements among the plurality of elements, the offsets being stored in the binary layout;
and determining the start address and the end address of the sub-view respectively corresponding to each of the plurality of elements of the data before the serialization processing according to the start address and the end address of the view and the plurality of offsets.
In one implementation, when determining the offsets, relative to the view, of the sub-views of the non-trivial elements stored in the binary layout, the second determining module 502 is configured to perform:
determining the offset of the sub-view of the current non-trivial element relative to the view based on the start address of the view, the start addresses and end addresses of the sub-views of the trivial elements among the plurality of elements, and the shortest lengths of the non-trivial elements preceding the current non-trivial element.
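The address computation above can be sketched in Python under several assumptions that the text does not confirm: trivial elements are laid out first, non-trivial elements follow in order, each stored value is the extra distance beyond the smallest possible offset, and a non-trivial sub-view ends where the next one starts (or at the view's end address). The function name `sub_view_bounds` and its parameters are hypothetical, and the sketch leaves out where the stored values themselves sit in the layout.

```python
def sub_view_bounds(view_start, view_end, trivial_lens,
                    nontrivial_min_lens, stored_deltas):
    """Return (start, end) addresses for every element's sub-view.

    Assumed layout (not confirmed by the source): trivial elements first,
    then non-trivial elements; stored_deltas[i] is the offset of the i-th
    non-trivial element minus its smallest possible offset.
    """
    if len(nontrivial_min_lens) != len(stored_deltas):
        raise ValueError("one stored delta is required per non-trivial element")

    bounds = []
    cursor = view_start
    for length in trivial_lens:          # fixed-size elements
        bounds.append((cursor, cursor + length))
        cursor += length
    trivial_total = cursor - view_start

    starts = []
    min_before = 0                       # shortest lengths of earlier non-trivial elements
    for min_len, delta in zip(nontrivial_min_lens, stored_deltas):
        offset = trivial_total + min_before + delta
        starts.append(view_start + offset)
        min_before += min_len

    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else view_end
        if not view_start <= start <= end <= view_end:
            raise ValueError("sub-view falls outside the view")
        bounds.append((start, end))
    return bounds


# Two trivial elements (4 and 8 bytes) and two non-trivial payloads (5 and 3 bytes).
example = sub_view_bounds(100, 120, [4, 8], [0, 0], [0, 5])
```

In this example the second non-trivial element starts 5 bytes later than its smallest possible position, because the first payload occupies 5 bytes rather than its shortest length of 0.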
In one implementation, the deserializing module 503 is configured to perform:
sequentially accessing the serialized data according to the start address and the end address of the sub-view respectively corresponding to each of the plurality of elements, and sequentially obtaining the data before the serialization processing respectively corresponding to each of the plurality of elements.
In one implementation, the deserializing module 503 is configured to perform:
querying a target element to be deserialized;
and accessing the serialized data according to the start address and the end address of the sub-view corresponding to the target element to obtain the data before the serialization processing corresponding to the target element.
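The two access patterns above (reading every element in order, or reading only a queried target element) can be contrasted with a short Python sketch. It assumes the sub-view bounds are already known and held in a dictionary keyed by element name; the function names `read_all` and `read_one` and the field names are hypothetical.

```python
def read_all(buffer, bounds):
    """Sequential access: yield (name, bytes-like slice) for every element,
    in the order the bounds were listed."""
    mv = memoryview(buffer)
    for name, (start, end) in bounds.items():
        yield name, mv[start:end]


def read_one(buffer, bounds, target):
    """Targeted access: touch only the sub-view of the requested element."""
    start, end = bounds[target]
    return memoryview(buffer)[start:end]


# Hypothetical serialized record: a 4-byte integer, then two short payloads.
data = b"\x07\x00\x00\x00abcxy"
bounds = {"id": (0, 4), "name": (4, 7), "tag": (7, 9)}

name_only = bytes(read_one(data, bounds, "name"))
everything = {name: bytes(chunk) for name, chunk in read_all(data, bounds)}
```

Targeted access is the cheaper path when a caller needs a single field, since the other sub-views are never sliced or decoded.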
For the functions of each module in this embodiment, refer to the corresponding descriptions in the methods above; the modules have the corresponding beneficial effects, which are not repeated here.
Fig. 6 is a block diagram of an electronic device used to implement an embodiment of the present application. As shown in Fig. 6, the electronic device includes a memory 610 and a processor 620, the memory 610 storing a computer program executable on the processor 620. The processor 620, when executing the computer program, implements the methods of the above-described embodiments. There may be one or more memories 610 and one or more processors 620.
The electronic device further includes:
a communication interface 630, configured to communicate with external devices for interactive data transmission.
If the memory 610, the processor 620, and the communication interface 630 are implemented independently, they may be connected to each other and communicate with each other through a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in Fig. 6, but this does not mean that there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the memory 610, the processor 620, and the communication interface 630 are integrated on a chip, the memory 610, the processor 620, and the communication interface 630 may communicate with each other through internal interfaces.
An embodiment of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the methods provided in the embodiments of the present application.
An embodiment of the present application also provides a chip that includes a processor configured to call instructions stored in a memory and run them, so that a communication device equipped with the chip executes the methods provided by the embodiments of the present application.
An embodiment of the present application also provides a chip including an input interface, an output interface, a processor, and a memory, which are connected through an internal connection path; the processor is configured to execute code in the memory, and when the code is executed, the processor performs the methods provided by the embodiments of the present application.
It should be appreciated that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. It is noted that the processor may be a processor supporting the Advanced RISC Machines (ARM) architecture.
Further optionally, the memory may include a read-only memory and a random access memory. The memory may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may include read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory may include random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, for example static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct Rambus random access memory (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions in accordance with the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of the different embodiments or examples, provided they do not contradict each other.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
Any process or method described in flowcharts or otherwise herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order, depending on the functions involved.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered listing of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device (such as a computer-based system, a processor-containing system, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them).
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. All or part of the steps of the method embodiments described above may be completed by a program instructing the associated hardware; when executed, the program performs one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules described above, if implemented in the form of software functional modules and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
The foregoing is merely exemplary embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think of various changes or substitutions within the technical scope of the present application, which should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (8)
1. A method of serializing data, the method comprising:
acquiring data to be processed and a data structure type corresponding to the data to be processed;
determining a view corresponding to the data to be processed based on the data structure type;
according to the view, carrying out serialization processing on the data to be processed;
the view stores data in a compressed byte stream mode, and the stored data is described by utilizing the data between a starting address and an ending address;
wherein the serializing the data to be processed according to the view includes:
compressing the data to be processed into byte streams, and arranging according to the binary layout of the view corresponding to the data to be processed;
wherein the compressing the data to be processed into a byte stream includes:
compressing the number of bytes occupied by the offsets, relative to the view, of the sub-views respectively corresponding to a plurality of elements in the data to be processed;
wherein the sub-view describes an element of the data to be processed using the data between a start address and an end address; the range between the start address and the end address of the sub-view is contained between the start address and the end address of the view.
2. The method according to claim 1, wherein the compressing the byte count occupied by the offset of the sub-view corresponding to each of the plurality of elements in the data to be processed relative to the view includes:
determining the byte number occupied by the offset of the sub-view corresponding to a plurality of elements relative to the view based on the length of the view and the shortest length of the view, so as to realize compression processing of the byte number occupied by the offset;
wherein the shortest length of the view is the sum of the length of the trivial element in the data to be processed and the shortest length of the non-trivial element in the data to be processed; the length of the trivial element is a fixed value; the length of the nontrivial element is a non-fixed value; the length represents the number of bytes occupied.
3. A method of deserializing data, the method comprising:
acquiring data after serialization processing;
determining a view corresponding to the serialized data;
performing deserialization processing on the serialized data according to the view;
the view stores data in a compressed byte stream mode, and the stored data is described by utilizing the data between a starting address and an ending address;
wherein the determining the view corresponding to the serialized data includes:
acquiring a starting address and an ending address of the view and a binary layout of the view;
determining a starting address and an ending address of a sub-view respectively corresponding to a plurality of elements of the data before the serialization processing according to the starting address and the ending address of the view and the binary layout of the view;
wherein the sub-view describes an element of the data before the serialization processing using the data between a start address and an end address; the range between the start address and the end address of the sub-view is contained between the start address and the end address of the view;
wherein the view comprises a non-trivial structural body view; the determining, according to the start address and the end address of the view and the binary layout of the view, the start address and the end address of the sub-view respectively corresponding to the multiple elements of the data before the serialization processing includes:
determining an offset of a sub-view of a non-trivial element of the plurality of elements stored in the binary layout relative to the view;
and determining the starting address and the ending address of the sub-view respectively corresponding to a plurality of elements of the data before the serialization processing according to the starting address and the ending address of the view and a plurality of offsets.
4. A method according to claim 3, wherein said determining an offset of a sub-view of a non-trivial element of said plurality of elements stored in said binary layout relative to said view comprises:
determining an offset of a sub-view of a current non-trivial element relative to the view based on a start address of the view, start addresses and end addresses of sub-views of trivial elements in the plurality of elements, and a shortest length of the non-trivial elements before the current non-trivial element.
5. A method according to claim 3, wherein said deserializing said serialized data from said view comprises:
and sequentially accessing the data after the serialization processing according to the starting address and the ending address of the sub-view respectively corresponding to the plurality of elements, and sequentially obtaining the data before the serialization processing respectively corresponding to the plurality of elements.
6. A method according to claim 3, wherein de-serializing the serialized data in accordance with the view comprises:
querying a target element to be deserialized;
and accessing the data after the serialization processing according to the starting address and the ending address of the sub-view corresponding to the target element to obtain the data before the serialization processing corresponding to the target element.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory, the processor implementing the method of any one of claims 1-2 or 3-6 when the computer program is executed.
8. A computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the method of any of claims 1-2 or 3-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311564963.XA CN117271456B (en) | 2023-11-22 | 2023-11-22 | Data serialization method, anti-serialization method, electronic device, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117271456A (en) | 2023-12-22 |
CN117271456B (en) | 2024-03-26 |
Family
ID=89209135
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311564963.XA Active CN117271456B (en) | 2023-11-22 | 2023-11-22 | Data serialization method, anti-serialization method, electronic device, and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117271456B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102981884A (en) * | 2012-11-22 | 2013-03-20 | 用友软件股份有限公司 | Serializing device and serializing method |
CN105512305A (en) * | 2015-12-14 | 2016-04-20 | 北京奇虎科技有限公司 | Serialization-based document compression and decompression method and device |
CN106155630A (en) * | 2015-04-14 | 2016-11-23 | 阿里巴巴集团控股有限公司 | Sequencing method, unserializing method, serializing device and unserializing device |
US10439923B1 (en) * | 2016-09-22 | 2019-10-08 | Amazon Technologies, Inc. | Deserialization service |
CN113315801A (en) * | 2020-06-08 | 2021-08-27 | 阿里巴巴集团控股有限公司 | Method and system for storing blockchain data |
CN114153896A (en) * | 2021-11-23 | 2022-03-08 | 计易数据科技(上海)有限公司 | Serialization and deserialization method, apparatus, device and medium thereof |
CN114253553A (en) * | 2021-12-24 | 2022-03-29 | 珠海金山数字网络科技有限公司 | Data processing method and device |
CN116561202A (en) * | 2022-01-27 | 2023-08-08 | 京东科技控股股份有限公司 | Method and device for serializing object |
CN116774910A (en) * | 2022-03-11 | 2023-09-19 | 腾讯科技(深圳)有限公司 | Network data processing method, device, equipment, storage medium and program product |
Non-Patent Citations (2)
Title |
---|
Accelerating Data Serialization/Deserialization Protocols with In-Network Compute; Shiyi Cao, et al.; 2022 IEEE/ACM International Workshop on Exascale MPI (ExaMPI); pp. 22-30 * |
Exploring Serialization and Deserialization; 舒尹 (Shu Yin); 《通讯世界》; Vol. 26, No. 1; pp. 190-191 * |
Also Published As
Publication number | Publication date |
---|---|
CN117271456A (en) | 2023-12-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||