Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In order to at least partially solve the above technical problems, according to one aspect of the present disclosure, the following technical solutions are provided:
a data processing method, comprising:
acquiring a plurality of first data, wherein the plurality of first data has a first length;
encoding the plurality of first data into second data, the second data having a second length;
in response to performing a first task, reading the second data;
and converting the second data into a plurality of third data, wherein the plurality of third data has the second length and the values of the plurality of third data are the same as the values of the plurality of first data.
Further, after the encoding of the plurality of first data into the second data, the method further includes:
storing the second data in a cache of a processor, wherein the processor addresses the cache in units of the second length.
Further, the encoding of the plurality of first data into second data includes:
generating a plurality of data segments according to the sequence of the plurality of first data;
and arranging the plurality of data segments according to the sequence to generate the second data.
Further, the encoding of the plurality of first data into second data includes:
combining the plurality of first data two by two to obtain at least one first data set;
and encoding the two first data in each of the at least one first data set to obtain at least one second data.
Further, the converting of the second data into a plurality of third data includes:
decoding the second data to obtain the plurality of first data;
and converting the plurality of first data into the plurality of third data, wherein the plurality of third data and the plurality of first data have the same values.
Further, the converting of the plurality of first data into a plurality of third data includes:
padding the high-order bits of the first data with 0s to generate the third data.
Further, the decoding of the second data to obtain the plurality of first data includes:
segmenting the second data to obtain a plurality of data segments;
and performing a second calculation on the plurality of data segments to obtain the plurality of first data.
In order to achieve the above object, according to another aspect of the present disclosure, the following technical solution is provided:
a data processing apparatus, comprising:
a first data acquisition module for acquiring a plurality of first data, wherein the plurality of first data has a first length;
an encoding module for encoding the plurality of first data into second data, the second data having a second length;
a second data reading module for reading the second data in response to performing a first task;
and a data conversion module for converting the second data into a plurality of third data, wherein the plurality of third data has the second length and the values of the plurality of third data are the same as the values of the plurality of first data.
Further, the data processing apparatus is further configured for:
storing the second data in a cache of a processor, wherein the processor addresses the cache in units of the second length.
Further, the encoding module is further configured for:
generating a plurality of data segments according to the sequence of the plurality of first data;
and arranging the plurality of data segments according to the sequence to generate the second data.
Further, the encoding module is further configured for:
combining the plurality of first data two by two to obtain at least one first data set;
and encoding the two first data in each of the at least one first data set to obtain at least one second data.
Further, the data conversion module is further configured for:
decoding the second data to obtain the plurality of first data;
and converting the plurality of first data into the plurality of third data, wherein the plurality of third data and the plurality of first data have the same values.
Further, the data conversion module is further configured for:
padding the high-order bits of the first data with 0s to generate the third data.
Further, the data conversion module is further configured for:
segmenting the second data to obtain a plurality of data segments;
and performing a second calculation on the plurality of data segments to obtain the plurality of first data.
In order to achieve the above object, according to another aspect of the present disclosure, the following technical solution is provided:
an electronic device, comprising:
a memory for storing non-transitory computer-readable instructions; and
a processor for executing the computer-readable instructions such that, when executing them, the processor performs any of the above data processing methods.
In order to achieve the above object, according to another aspect of the present disclosure, the following technical solution is provided:
a computer-readable storage medium storing non-transitory computer-readable instructions which, when executed by a computer, cause the computer to perform any of the above data processing methods.
In order to achieve the above object, according to still another aspect of the present disclosure, the following technical solution is further provided:
a data processing terminal, comprising any one of the above data processing apparatuses.
The present disclosure discloses a data processing method, an apparatus, an electronic device, and a computer-readable storage medium. The method comprises: acquiring a plurality of first data, the plurality of first data having a first length; encoding the plurality of first data into second data, the second data having a second length; reading the second data in response to performing a first task; and converting the second data into a plurality of third data, wherein the plurality of third data has the second length and has the same values as the plurality of first data. By encoding the data, embodiments of the present disclosure save storage space; by then decoding the data to the length required for calculation, they allow the stored data to be used in more calculation processes.
The foregoing description is only an overview of the technical solutions of the present disclosure. In order that the technical means of the present disclosure may be understood more clearly and implemented in accordance with the specification, and in order that the above and other objects, features and advantages of the present disclosure may be more readily apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; and the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
The embodiment of the disclosure provides a data processing method. As shown in fig. 1, the data processing method mainly includes the following steps S101 to S104.
Step S101, acquiring a plurality of first data, wherein the plurality of first data have a first length;
Optionally, the first data is quantized data in a quantized network model, such as quantization parameters. Illustratively, the first data is an 8-bit quantization parameter, i.e., its length is 8 bits. Quantization approximates the original data with lower-precision data that requires less storage space. For example, if the original parameters of a network model are 32-bit data and are quantized into 8-bit parameters for calculation, the same storage space can hold 4 times as many parameters, which greatly increases the number of usable parameters.
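As an illustration of the storage saving described above, the following is a minimal sketch of quantizing 32-bit floating-point parameters to 8-bit codes. The disclosure does not specify a quantization scheme; the min/max linear mapping and the name quantize_uint8 are assumptions made only for this example.

```python
def quantize_uint8(values):
    """Map floats to 8-bit codes in [0, 255] with a linear (affine) scheme.

    Assumption: the source does not say which quantization scheme is used;
    min/max linear quantization is chosen here only for illustration.
    """
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant input
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # hypothetical 32-bit parameters
codes, scale, offset = quantize_uint8(weights)

bits_fp32 = 32 * len(weights)  # storage for the original parameters
bits_int8 = 8 * len(weights)   # storage for the quantized parameters
print(codes, bits_fp32 // bits_int8)
```

The ratio printed at the end is the factor of 4 mentioned in the text: the same space holds four times as many 8-bit parameters as 32-bit ones.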
Step S102, encoding the plurality of first data into second data, wherein the second data has a second length;
Some processors do not support calculation on 8-bit data, and their addressing length is at least 16 bits; that is, their cache (RAM) uses 16 bits as the unit of storage, so even a single 8-bit datum occupies 16 bits of storage space. Thus, if an 8-bit quantized network model with N parameters is used on a 16-bit processor, the cache needs at least 16N bits to store the N parameters, although the actual size of the parameters is only 8N bits. In step S102, the plurality of first data is therefore encoded so that it can accommodate an addressing length greater than the first length.
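The 16N versus 8N comparison can be checked with a short calculation. This is only a sketch; the function name cache_bits_needed and its parameters are hypothetical.

```python
def cache_bits_needed(num_params, param_bits=8, word_bits=16, packed=False):
    """Bits of word-addressed cache needed to hold num_params parameters."""
    if packed:
        per_word = word_bits // param_bits   # parameters that fit in one word
        words = -(-num_params // per_word)   # ceiling division
    else:
        words = num_params                   # one parameter occupies a whole word
    return words * word_bits


n = 1000
unpacked = cache_bits_needed(n)              # 16 * N bits
packed = cache_bits_needed(n, packed=True)   # 8 * N bits
print(unpacked, packed)
```

With two 8-bit parameters per 16-bit word, the packed layout needs half the cache of the unpacked layout.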
Optionally, the step S102 includes:
step S201, generating a plurality of data segments according to the sequence of the plurality of first data;
step S202, arranging the plurality of data segments according to the order to generate the second data.
The plurality of first data and the plurality of data segments are in one-to-one correspondence, i.e., each first data generates one data segment. For example, the plurality of first data are two 8-bit quantized data, 0x12 and 0x34 in hexadecimal; a first data segment is generated from 0x12 and a second data segment is generated from 0x34, and the two data segments are then arranged into the second data according to the sequence of the first data. In one example, the first data may be used directly as the data segments, i.e., the original values of the two 8-bit quantized data are used as the high-order bits and the low-order bits of one 16-bit datum, respectively, giving 0x1234. Thus, a plurality of first data of shorter length can be encoded into one second data of longer length to suit the storage space and computation of a processor with a longer addressing length.
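The direct arrangement in this example, where the first data are used unchanged as the data segments, can be sketched as follows. The function name pack_direct is hypothetical.

```python
def pack_direct(first, second):
    """Place two 8-bit values in the high and low bytes of one 16-bit word."""
    if not (0 <= first <= 0xFF and 0 <= second <= 0xFF):
        raise ValueError("both inputs must fit in 8 bits")
    return (first << 8) | second


second_data = pack_direct(0x12, 0x34)
print(hex(second_data))  # 0x1234
```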
Optionally, the step S201 includes performing a first calculation on the plurality of first data in sequence to obtain the plurality of data segments, wherein each first data in the sequence may be calculated differently. For example, 0 is added to the first of the first data, and a first preset value is added to the second of the first data. Taking the two 8-bit quantized data 0x12 and 0x34 as an example, 0x12 is left unchanged and used as the high-order data of the second data, and 0x34 plus the preset value 128 (0x80) becomes 0xb4, which is used as the low-order data of the second data, so the second data is 0x12b4.
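A minimal sketch of this first calculation is given below. It reproduces the value 0x12b4 that the decoding example later in the description starts from. The name encode_pair is hypothetical, and keeping only the low 8 bits of the sum is an assumption made so that the result always fits in the low-order byte.

```python
PRESET = 0x80  # the preset value 128 from the example


def encode_pair(first, second, preset=PRESET):
    """Encode two 8-bit values into one 16-bit word using the first calculation."""
    # Add 0 to the first value and the preset value to the second value.
    # Assumption: the sum is reduced modulo 256 so it fits in 8 bits.
    high = (first + 0) & 0xFF
    low = (second + preset) & 0xFF
    return (high << 8) | low


print(hex(encode_pair(0x12, 0x34)))
```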
Optionally, the step S102 includes:
Step S301, combining the plurality of first data two by two to obtain at least one first data set;
Step S302, encoding two first data in the at least one first data set to obtain at least one second data.
Illustratively, the plurality of first data are 16 8-bit quantized data, such as:
0x12, 0x56, 0xab, 0x2a, 0x35, 0x89, 0x65, 0xcd;
0x34, 0x78, 0xcd, 0x3b, 0x67, 0x34, 0xab, 0xde.
The 16 8-bit quantized data are combined two by two into the groups (0x12, 0x34), (0x56, 0x78), (0xab, 0xcd), (0x2a, 0x3b), (0x35, 0x67), (0x89, 0x34), (0x65, 0xab), (0xcd, 0xde).
The two 8-bit quantized data in each group are encoded with the calculation described above, which yields 8 16-bit data: 0x12b4, 0x56f8, 0xab4d, 0x2abb, 0x35e7, 0x89b4, 0x652b, 0xcd5e. In these values, when the sum of the second datum and 128 exceeds 8 bits, only the low 8 bits are kept (e.g., 0xcd + 0x80 = 0x14d is stored as 0x4d).
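The grouping and encoding of the 16 example values can be sketched as below. The variable names are hypothetical. Pairing the i-th value of the first row with the i-th value of the second row follows the groups listed in the example, and the modulo-256 addition is an assumption consistent with the listed result 0xab4d; under that assumption the last word comes out as 0xcd5e.

```python
row_a = [0x12, 0x56, 0xAB, 0x2A, 0x35, 0x89, 0x65, 0xCD]
row_b = [0x34, 0x78, 0xCD, 0x3B, 0x67, 0x34, 0xAB, 0xDE]

# Step S301: combine the first data two by two.
groups = list(zip(row_a, row_b))

# Step S302: encode each group into one 16-bit word
# (high byte unchanged, low byte plus 0x80, kept to 8 bits).
encoded = [(a << 8) | ((b + 0x80) & 0xFF) for a, b in groups]

print(", ".join(f"0x{w:04x}" for w in encoded))
```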
Through the steps, the plurality of first data are converted into second data with a second length so as to adapt to the addressing length of the processor, so that the data storage is more compact, and the storage space is saved.
Optionally, after the step S102, the method further includes:
Storing the second data in a cache of a processor, wherein the processor addresses the cache in units of the second length.
In the foregoing embodiment, when performing model calculation, the parameters of the quantization model need to be transferred into the cache of the processor, and because the addressing length of the processor is large and the length of the parameters of the quantization model is short, the parameters are encoded into the second length in step S102 and then transferred into the cache of the processor to perform the calculation task of the model.
Step S103, in response to executing the first task, reading the second data;
Optionally, in response to the processor executing the first task, such as a model calculation task, the second data related to the first task, such as the encoded parameters of the 8-bit model, is read from the cache of the processor.
Step S104, the second data are converted into a plurality of third data, wherein the third data have a second length, and the values of the third data are the same as the values of the first data.
Since the processor only supports calculation on data whose length is at least the second length, and the second data is encoded data that cannot be used directly for calculation, in step S104 the second data is converted into a plurality of third data; each third data has the same value as the corresponding first data, but its length is the second length.
Optionally, the step S104 includes:
step S401, decoding the second data to obtain the plurality of first data;
step S402, converting the plurality of first data into a plurality of third data, wherein the plurality of third data and the plurality of first data have the same value.
In the above step, the second data is restored to a plurality of first data by a decoding method opposite to the encoding method, and then the first data is converted to third data having a second length.
Optionally, the step S401 includes:
step S501, segmenting the second data to obtain a plurality of data segments;
step S502, performing a second calculation on the plurality of data segments to obtain the plurality of first data.
Taking the 16-bit data 0x12b4 from the example above, it is first divided into high-order data and low-order data; the high-order data is left unchanged as 0x12, and 128 (0x80) is subtracted from the low-order data 0xb4 to obtain 0x34.
Optionally, the step S402 includes padding the high-order bits of the first data with 0s to generate the third data. Padding the high-order bits of 0x12 with 0s generates 0x0012, and padding the high-order bits of 0x34 with 0s generates 0x0034. In this way, each generated third data has the same value as its corresponding first data but is 16 bits long, so that the third data can be used for calculation in the processor.
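Steps S501, S502 and S402 can be sketched together as follows. The function names are hypothetical, and the subtraction is taken modulo 256 as an assumption. Because Python integers have no fixed width, the zero padding is shown by serializing each value as a 2-byte big-endian word with the standard struct module.

```python
import struct


def decode_word(word, preset=0x80):
    """Recover the two 8-bit first data from one 16-bit second data."""
    # S501: segment the word into a high and a low 8-bit data segment.
    high_seg = (word >> 8) & 0xFF
    low_seg = word & 0xFF
    # S502: second calculation, the inverse of the first calculation.
    # Assumption: the subtraction is reduced modulo 256.
    return (high_seg - 0) & 0xFF, (low_seg - preset) & 0xFF


def to_third_data(value):
    """S402: pad the high-order bits with 0s to obtain a 16-bit word."""
    return struct.pack(">H", value)  # 2 bytes, high byte is 0x00


first_a, first_b = decode_word(0x12B4)
third = [to_third_data(first_a), to_third_data(first_b)]
print([t.hex() for t in third])
```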
Optionally, the step S104 includes:
Step S601, shifting the second data to obtain the first of the third data;
and step S602, performing a third calculation on the second data to obtain the second of the third data.
Taking the above example, the second data is the 16-bit data 0x12b4. Shifting 0x12b4 right by 8 bits gives 0x0012; performing an AND operation on 0x12b4 and 0x00ff gives 0x00b4, and subtracting 128 (0x80) from it gives 0x0034.
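The shift-and-mask variant can be sketched as below. The name convert_shift_mask is hypothetical. The final mask with 0x00ff after the subtraction is an assumption that the text does not state.

```python
def convert_shift_mask(word, preset=0x80):
    """Convert one 16-bit second data directly into two 16-bit third data."""
    # S601: shift right by 8 bits to obtain the first of the third data.
    first_third = (word >> 8) & 0xFFFF
    # S602: mask the low byte, then subtract the preset value.
    # Assumption: the result is masked to 8 bits so a borrow cannot
    # spill into the high-order byte.
    second_third = ((word & 0x00FF) - preset) & 0x00FF
    return first_third, second_third


print([f"0x{v:04x}" for v in convert_shift_mask(0x12B4)])
```

Without the final mask, a low byte below 0x80 (such as 0x4d in 0xab4d) would give a negative result after the subtraction, which on a 16-bit unsigned register would leave nonzero high-order bits.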
Through the operation in step S104 described above, the 16-bit encoded data in the above example is converted into 16-bit calculation data. As described above, 0x12b4, 0x56f8, 0xab4d, 0x2abb, 0x35e7, 0x89b4, 0x652b, and 0xcd5e are converted into: 0x0012, 0x0056, 0x00ab, 0x002a, 0x0035, 0x0089, 0x0065, 0x00cd, 0x0034, 0x0078, 0x00cd, 0x003b, 0x0067, 0x0034, 0x00ab, 0x00de. Each datum has the same value as the original data but its length becomes 16 bits, so that it is suited to calculation on a 16-bit processor.
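The batch conversion in this example, together with a check that encoding followed by decoding returns the original values, can be sketched as follows. The function names are hypothetical and the arithmetic is assumed to be modulo 256. The output lists all first elements followed by all second elements, matching the order of the converted values listed above.

```python
def encode(a, b, preset=0x80):
    return (a << 8) | ((b + preset) & 0xFF)


def decode(word, preset=0x80):
    return word >> 8, ((word & 0xFF) - preset) & 0xFF


encoded = [0x12B4, 0x56F8, 0xAB4D, 0x2ABB, 0x35E7, 0x89B4, 0x652B, 0xCD5E]
firsts = [decode(w)[0] for w in encoded]
seconds = [decode(w)[1] for w in encoded]
third = firsts + seconds
print(", ".join(f"0x{v:04x}" for v in third))

# Exhaustive check over every pair of 8-bit values.
round_trip_ok = all(
    decode(encode(a, b)) == (a, b) for a in range(256) for b in range(256)
)
```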
In the above embodiment, steps S101 and S102 encode a plurality of first data into second data of the second length, so that a cache that stores data in units of the second length can hold more first data; steps S103 and S104 then convert the first data to the second length, so that a processor that supports data of at least the second length can use the first data for calculation. This solves the problems of wasted storage space and of the processor being unable to use the data directly.
The above embodiment discloses a data processing method including acquiring a plurality of first data having a first length, encoding the plurality of first data into second data having a second length, reading the second data in response to performing a first task, and converting the second data into a plurality of third data, wherein the plurality of third data has the second length and the plurality of third data has the same value as the plurality of first data. The embodiment of the disclosure saves the storage space by encoding the data, and then decodes the data to adapt to the data length required in the calculation process, so that the memory space can be saved, and more calculation processes can be applied.
It will be appreciated by those skilled in the art that obvious modifications (e.g., combinations of the listed modes) or equivalent substitutions may be made on the basis of the above-described embodiments.
In the foregoing, although the steps in the embodiments of the data processing method are described in the above order, it should be clear to those skilled in the art that the steps in the embodiments of the present disclosure are not necessarily performed in the above order, but may be performed in reverse order, parallel, cross, etc., and other steps may be further added to those skilled in the art on the basis of the above steps, and these obvious modifications or equivalent manners are also included in the protection scope of the present disclosure and are not repeated herein.
The following are apparatus embodiments of the present disclosure, which may be used to perform the steps implemented by the method embodiments of the present disclosure. For convenience of explanation, only the parts relevant to the embodiments of the present disclosure are shown; for specific technical details not disclosed here, refer to the method embodiments of the present disclosure.
Example two
In order to solve the technical problems that a 16bit quantization model in the prior art increases the computational burden of a terminal and increases memory occupation, an embodiment of the present disclosure provides a data processing apparatus 700. The apparatus 700 may perform the steps of the data processing method embodiment described in the first embodiment. As shown in fig. 7, the apparatus mainly includes a first data acquisition module 701, an encoding module 702, a second data reading module 703 and a data conversion module 704, wherein,
A first data acquisition module 701, configured to acquire a plurality of first data, where the plurality of first data has a first length;
an encoding module 702, configured to encode the plurality of first data into second data, where the second data has a second length;
a second data reading module 703 for reading the second data in response to performing the first task;
The data conversion module 704 is configured to convert the second data into a plurality of third data, where the plurality of third data has a second length, and a value of the plurality of third data is the same as a value of the plurality of first data.
Further, the data processing apparatus 700 is further configured for:
storing the second data in a cache of a processor, wherein the processor addresses the cache in units of the second length.
Further, the encoding module 702 is further configured for:
generating a plurality of data segments according to the sequence of the plurality of first data;
and arranging the plurality of data segments according to the sequence to generate the second data.
Further, the encoding module 702 is further configured for:
combining the plurality of first data two by two to obtain at least one first data set;
and encoding the two first data in each of the at least one first data set to obtain at least one second data.
Further, the data conversion module 704 is further configured for:
decoding the second data to obtain the plurality of first data;
and converting the plurality of first data into the plurality of third data, wherein the plurality of third data and the plurality of first data have the same values.
Further, the data conversion module 704 is further configured for:
padding the high-order bits of the first data with 0s to generate the third data.
Further, the data conversion module 704 is further configured for:
segmenting the second data to obtain a plurality of data segments;
and performing a second calculation on the plurality of data segments to obtain the plurality of first data.
For detailed descriptions of the working principles, the technical effects of the embodiments of the data processing apparatus, and the like, reference may be made to the related descriptions in the foregoing embodiments of the data processing method, which are not repeated herein.
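The division of the apparatus 700 into modules 701 to 704 might be mirrored in software roughly as sketched below. The class name, method names and the list standing in for the processor cache are hypothetical. Pairing adjacent elements and requiring an even number of first data are simplifying assumptions of this sketch only.

```python
class DataProcessingApparatus:
    """Hypothetical sketch of apparatus 700; all names are illustrative only."""

    def __init__(self, preset=0x80):
        self.preset = preset
        self.cache = []  # stands in for the 16-bit-addressed processor cache

    def acquire(self, source):  # first data acquisition module 701
        data = list(source)
        if any(not 0 <= v <= 0xFF for v in data):
            raise ValueError("first data must be 8-bit values")
        return data

    def encode(self, first_data):  # encoding module 702
        if len(first_data) % 2:
            raise ValueError("this sketch assumes an even number of first data")
        pairs = zip(first_data[0::2], first_data[1::2])
        self.cache = [(a << 8) | ((b + self.preset) & 0xFF) for a, b in pairs]
        return self.cache

    def read(self):  # second data reading module 703
        return list(self.cache)

    def convert(self, second_data):  # data conversion module 704
        third = []
        for word in second_data:
            third.append(word >> 8)
            third.append(((word & 0xFF) - self.preset) & 0xFF)
        return third


apparatus = DataProcessingApparatus()
first = apparatus.acquire([0x12, 0x34, 0x56, 0x78])
apparatus.encode(first)
third = apparatus.convert(apparatus.read())
print([f"0x{v:04x}" for v in third])
```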
Example three
Referring now to fig. 8, a schematic diagram of an electronic device 800 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 8 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 8, the electronic device 800 may include a processing means (e.g., a central processor, a graphics processor, etc.) 801, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage means 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic device 800 are also stored. The processing device 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
In general, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 807 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 808 including, for example, magnetic tape, hard disk, etc.; and a communication device 809. The communication device 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While fig. 8 shows an electronic device 800 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 809, or installed from storage device 808, or installed from ROM 802. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 801.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to electrical wiring, fiber optic cable, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be included in the electronic device or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: perform 8-bit quantization on the network parameters of a neural network model to obtain first network parameters; encode the first network parameters to obtain 16-bit second network parameters supported by a terminal processor, wherein each 16-bit second network parameter is obtained by encoding two first network parameters; and decode the second network parameters into 16-bit network parameters during the calculation process of the neural network model.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of the unit does not in any way constitute a limitation of the unit itself, for example the first acquisition unit may also be described as "unit acquiring at least two internet protocol addresses".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to herein is not limited to the specific combinations of features described above, but also covers other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by mutually substituting the features described above with technical features having similar functions disclosed in the present disclosure (but not limited thereto).
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.