CN112286579B - Data processing method, device, computer readable storage medium and computer equipment - Google Patents
- Publication number
- CN112286579B (application number CN201910673627.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- target
- mask
- written
- updated
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30018—Bit or string instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/3012—Organisation of register space, e.g. banked or distributed register file
- G06F9/3013—Organisation of register space, e.g. banked or distributed register file according to data content, e.g. floating-point registers, address registers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
- G06F9/3887—Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present application relates to a data processing method, apparatus, computer readable storage medium and computer device, the method comprising: acquiring a data set, and acquiring a preset mask corresponding to the data set; determining target data from the data set according to a preset mask; taking the target data as data to be written, and writing the data to be written into a register; updating a preset mask; determining reference comparison data and data to be compared from the data to be written, comparing the reference comparison data with the data to be compared in full quantity to obtain a first output mask, and determining first output data from the reference comparison data according to the first output mask; and obtaining target data from the corresponding data set according to the updating preset mask, obtaining updated target data, taking the updated target data as data to be written, and returning to the step of writing the data to be written into the register until the data set is traversed, thereby obtaining target output data. The application can improve the CPU operation performance.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, computer readable storage medium, and computer device.
Background
The instruction set is a program hard-wired inside a CPU (central processing unit) to guide and optimize CPU operations. With the development of instruction set technology, various instruction sets have emerged, such as SIMD (Single Instruction, Multiple Data) instruction sets, the SSE (Streaming SIMD Extensions) instruction set, and the AVX series of instruction sets. The CPU typically uses these instruction sets to accelerate data processing. At present, the AVX512 instruction set, which builds on the SSE and AVX-series instruction sets, has been newly proposed; however, when the CPU processes data with the AVX512 instruction set using the traditional method, operation performance is severely reduced.
Disclosure of Invention
Based on this, and in view of the technical problem that performance drops sharply when the CPU processes data using the AVX512 instruction set in the conventional method, it is necessary to provide a data processing method, apparatus, computer-readable storage medium and computer device capable of improving the operation performance of the CPU.
A data processing method includes:
Acquiring a first data set and a second data set, and acquiring a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set;
Determining first target data from the first data set according to a preset first mask, determining second target data from the second data set according to a preset second mask, taking the first target data as first data to be written, and taking the second target data as second data to be written;
writing first data to be written into a first register, and writing second data to be written into a second register;
updating a preset first mask and a preset second mask according to the first data to be written and the second data to be written to obtain a corresponding updated first mask and an updated second mask;
Determining reference comparison data and data to be compared from the first data to be written and the second data to be written, performing full comparison on the reference comparison data and the data to be compared to obtain a full comparison result, obtaining a first output mask according to the full comparison result, and determining first output data from the reference comparison data according to the first output mask;
Obtaining updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask respectively, taking the updated first target data as the first data to be written and the updated second target data as the second data to be written, returning to the step of writing the first data to be written into the first register and writing the second data to be written into the second register, and repeating until the first data set and the second data set are traversed, then combining the first output data corresponding to each traversal to obtain target output data.
A data processing apparatus comprising:
The data acquisition module is used for acquiring a first data set and a second data set, and acquiring a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set;
the target data determining module is used for determining first target data from a first data set according to a preset first mask, determining second target data from a second data set according to a preset second mask, taking the first target data as first data to be written, and taking the second target data as second data to be written;
the data writing module is used for writing first data to be written into the first register and writing second data to be written into the second register;
The mask updating module is used for updating a preset first mask and a preset second mask according to the first data to be written and the second data to be written to obtain a corresponding updated first mask and an updated second mask;
The output determining module is used for determining reference comparison data and data to be compared from the first data to be written and the second data to be written, comparing the reference comparison data with the data to be compared in full quantity to obtain a full quantity comparison result, obtaining a first output mask according to the full quantity comparison result, and determining the first output data from the reference comparison data according to the first output mask;
The traversing module is used for obtaining updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask respectively, taking the updated first target data as the first data to be written and the updated second target data as the second data to be written, returning to the step of writing the first data to be written into the first register and writing the second data to be written into the second register, and repeating until the first data set and the second data set are traversed, then combining the first output data corresponding to each traversal to obtain target output data.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the program:
Acquiring a first data set and a second data set, and acquiring a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set;
Determining first target data from the first data set according to a preset first mask, determining second target data from the second data set according to a preset second mask, taking the first target data as first data to be written, and taking the second target data as second data to be written;
writing first data to be written into a first register, and writing second data to be written into a second register;
updating a preset first mask and a preset second mask according to the first data to be written and the second data to be written to obtain a corresponding updated first mask and an updated second mask;
Determining reference comparison data and data to be compared from the first data to be written and the second data to be written, performing full comparison on the reference comparison data and the data to be compared to obtain a full comparison result, obtaining a first output mask according to the full comparison result, and determining first output data from the reference comparison data according to the first output mask;
Obtaining updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask respectively, taking the updated first target data as the first data to be written and the updated second target data as the second data to be written, returning to the step of writing the first data to be written into the first register and writing the second data to be written into the second register, and repeating until the first data set and the second data set are traversed, then combining the first output data corresponding to each traversal to obtain target output data.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, causes the processor to perform the steps of:
Acquiring a first data set and a second data set, and acquiring a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set;
Determining first target data from the first data set according to a preset first mask, determining second target data from the second data set according to a preset second mask, taking the first target data as first data to be written, and taking the second target data as second data to be written;
writing first data to be written into a first register, and writing second data to be written into a second register;
updating a preset first mask and a preset second mask according to the first data to be written and the second data to be written to obtain a corresponding updated first mask and an updated second mask;
Determining reference comparison data and data to be compared from the first data to be written and the second data to be written, performing full comparison on the reference comparison data and the data to be compared to obtain a full comparison result, obtaining a first output mask according to the full comparison result, and determining first output data from the reference comparison data according to the first output mask;
Obtaining updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask respectively, taking the updated first target data as the first data to be written and the updated second target data as the second data to be written, returning to the step of writing the first data to be written into the first register and writing the second data to be written into the second register, and repeating until the first data set and the second data set are traversed, then combining the first output data corresponding to each traversal to obtain target output data.
According to the data processing method, apparatus, computer-readable storage medium and computer device described above, updated first target data and updated second target data are obtained from the corresponding data sets according to the updated first mask and the updated second mask; the updated first target data is taken as the first data to be written and the updated second target data as the second data to be written; and execution returns to the step of writing the first data to be written into the first register and the second data to be written into the second register. Because the target data is obtained through the updated masks in each cycle, each cycle can process the maximum amount of data in the first data set and the second data set that can be processed, rather than a fixed quantity of data. This reduces the number of cycles in the CPU operation and improves the operation performance of the CPU.
Drawings
FIG. 1 is a diagram of an application environment for a data processing method in one embodiment;
FIG. 2 is a flow diagram of a data processing method in one embodiment;
FIG. 3 is a flow chart of obtaining first target data according to an embodiment;
FIG. 4 is a flow chart of obtaining second target data according to an embodiment;
FIG. 5 is a flow chart of updating a preset mask in one embodiment;
FIG. 6 is a flowchart of updating a start position according to one embodiment;
FIG. 7 is a flow diagram of determining an update first mask in one embodiment;
FIG. 8 is a flow diagram of determining an updated second mask in one embodiment;
FIG. 9 is a flow chart of obtaining first output data according to an embodiment;
FIG. 10 is a flow diagram of traversing the remaining elements in one embodiment;
FIG. 11 is a flow chart of obtaining target output data according to an embodiment;
FIG. 12 is a schematic diagram illustrating a portion of the operation of a data processing method in one embodiment;
FIG. 13 is a schematic diagram illustrating a portion of the operation of a data processing method in another embodiment;
FIG. 14 is a flow chart of a data processing method in one embodiment;
FIG. 15 is a block diagram of a data processing apparatus in one embodiment;
FIG. 16 is a block diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
FIG. 1 is a diagram of an application environment for a data processing method in one embodiment. Referring to fig. 1, the data processing method is applied to a data processing system. The data processing system includes a terminal 102 and a server 104, and the CPU of the server 104 supports the AVX512 instruction set. The terminal 102 and the server 104 are connected through a network. The terminal 102 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
Specifically, the terminal 102 receives a processing instruction for the first data set and the second data set, and sends the first data set and the second data set to the server 104 according to the processing instruction. The server 104 obtains the first data set and the second data set, obtains a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set, determines first target data from the first data set according to the preset first mask, determines second target data from the second data set according to the preset second mask, takes the first target data as first data to be written, and takes the second target data as second data to be written. The server 104 writes the first data to be written into the first register and writes the second data to be written into the second register. The server 104 updates the preset first mask and the preset second mask according to the first data to be written and the second data to be written, and obtains the corresponding updated first mask and updated second mask. The server 104 determines reference comparison data and data to be compared from the first data to be written and the second data to be written, performs a full comparison of the reference comparison data with the data to be compared to obtain a full comparison result, obtains a first output mask according to the full comparison result, and determines first output data from the reference comparison data according to the first output mask.
The server 104 then obtains updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask respectively, takes the updated first target data as the first data to be written and the updated second target data as the second data to be written, returns to the step of writing the first data to be written into the first register and the second data to be written into the second register, and repeats until the first data set and the second data set are traversed. The server 104 combines the first output data corresponding to each traversal to obtain target output data and sends the target output data to the terminal 102 for display.
As shown in fig. 2, in one embodiment, a data processing method is provided. The present embodiment is mainly exemplified by the application of the method to the terminal 102 or the server 104 in fig. 1. Referring to fig. 2, the data processing method specifically includes the steps of:
S202, acquiring a first data set and a second data set, and acquiring a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set.
The data set refers to a set of data arranged in a certain order. The first data set and the second data set are data sets with the same data type, and the data sets can be a text data set, a graph data set, a list attribute set, a friend data set and the like. The mask is a string of binary codes, predefined and set. Each mask has a corresponding data set, and the mask is used to obtain data elements from the corresponding data set. The preset first mask and the preset second mask may be the same or different. For example, the preset first mask may be 0xFFFF (1111 1111 1111 1111) and the preset second mask may be 0xFFFF (1111 1111 1111 1111).
Specifically, the server or the terminal acquires a first data set and a second data set which need to be operated on, and then acquires a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set. The CPU of the server or terminal supports the AVX512 instruction set.
In one embodiment, the server may obtain the first data set and the second data set from a database in which the data is stored.
In one embodiment, the server may also obtain the first data set and the second data set sent by the terminal, and then obtain the corresponding preset first mask and preset second mask. The CPU of the server supports the AVX512 instruction set.
In one embodiment, the terminal may obtain a first data set and a second data set entered by a user.
In one embodiment, the terminal may also obtain the first data set and the second data set from the server according to the user's selection.
In one embodiment, the server may obtain a first original data set and a second original data set, and arrange the first original data set and the second original data set in a certain order to obtain the first data set and the second data set.
S204, determining first target data from the first data set according to a preset first mask, determining second target data from the second data set according to a preset second mask, taking the first target data as first data to be written, and taking the second target data as second data to be written.
The target data is obtained by sequentially selecting elements in the set from the data set according to a preset mask. The first data to be written refers to an element selected from the first data set according to the first mask in each cycle. The second data to be written refers to elements selected from the second data set according to the second mask in each cycle.
Specifically, AVX512 instructions are used to sequentially select, from the first data set, as many elements as there are target bits in the preset first mask, giving the first target data. Likewise, AVX512 instructions are used to sequentially select, from the second data set, as many elements as there are target bits in the preset second mask, giving the second target data. The target bit may be bit 1 or bit 0. The first target data is taken as the first data to be written, and the second target data as the second data to be written.
For example, since the number of 1 bits in the preset first mask 0xFFFF is 16, the _mm512_mask_expandload instruction is used to select 16 elements in sequence from the first data set, giving the first target data. Since the number of 1 bits in the preset second mask 0xFFFF is also 16, the _mm512_mask_expandload instruction is used to select 16 elements in sequence from the second data set, giving the second target data. The first target data is taken as the first data to be written, and the second target data as the second data to be written.
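The masked selection described above can be modelled in scalar code. The following is a minimal Python sketch of expand-load behaviour, not the patented implementation: the function name mask_expandload, the lanes parameter and the sample data are illustrative assumptions, and the caller is assumed to guarantee that enough unread elements remain.

```python
def mask_expandload(src, mask, data, start, lanes=16):
    """Scalar model of a masked expand-load over `lanes` lanes (hypothetical helper).

    Lanes whose mask bit is 1 receive consecutive elements of `data`
    beginning at `start`; lanes whose bit is 0 keep their value from `src`.
    Returns the new register contents and the next unread position.
    Assumes `data` holds enough elements for every 1 bit in `mask`.
    """
    out = list(src)
    pos = start
    for lane in range(lanes):
        if (mask >> lane) & 1:
            out[lane] = data[pos]
            pos += 1
    return out, pos


# Illustrative data only: with mask 0xFFFF all 16 lanes are filled.
first_data_set = list(range(100, 140))
register = [0] * 16
register, next_pos = mask_expandload(register, 0xFFFF, first_data_set, 0)
```

With a partial mask such as 0b0101 over four lanes, only lanes 0 and 2 are overwritten and the remaining lanes keep their previous contents.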
S206, writing the first data to be written into the first register, and writing the second data to be written into the second register.
Wherein the registers are integral parts of the central processing unit. Registers are high-speed memory elements of limited memory capacity that can be used to temporarily store instructions, data, and addresses. The first register may be a preset register in which first data to be written is written, and the second register is a preset register in which second data to be written is written.
Specifically, after the first data to be written is obtained, it is written into the first register, and after the second data to be written is obtained, it is written into the second register.
And S208, updating a preset first mask and a preset second mask according to the first data to be written and the second data to be written, and obtaining a corresponding updated first mask and an updated second mask.
Specifically, AVX512 instructions are used to update the preset first mask according to each element of the first data to be written in the first register and the second target element of the second data to be written in the second register, giving the corresponding updated first mask. AVX512 instructions are likewise used to update the preset second mask according to the first target element of the first data to be written in the first register and each element of the second data to be written in the second register, giving the corresponding updated second mask. The target element may be the maximum value in the target data.
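One possible reading of this update step is sketched below in Python. The comparison rule is an assumption that the text above does not state: a lane is marked for replacement (bit 1) when its element is not greater than the maximum of the other register, which is reasonable for ascending data sets because such an element cannot match anything loaded later. The function name update_masks and the sample registers are hypothetical.

```python
def update_masks(reg_a, reg_b):
    """Return (updated_first_mask, updated_second_mask) as integers.

    ASSUMPTION (not confirmed by the source): bit i of a mask is set when the
    element in lane i is <= the maximum element of the other register.
    """
    max_a, max_b = max(reg_a), max(reg_b)
    mask_a = sum(1 << i for i, a in enumerate(reg_a) if a <= max_b)
    mask_b = sum(1 << i for i, b in enumerate(reg_b) if b <= max_a)
    return mask_a, mask_b


# Illustrative four-lane registers.
reg_a = [1, 3, 5, 7]
reg_b = [2, 3, 4, 5]
mask_a, mask_b = update_masks(reg_a, reg_b)
# mask_a == 0b0111: lanes holding 1, 3 and 5 can be replaced; 7 is retained.
# mask_b == 0b1111: every lane of reg_b can be replaced.
```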
S210, determining reference comparison data and data to be compared from the first data to be written and the second data to be written, comparing the reference comparison data with the data to be compared in full quantity to obtain a full quantity comparison result, obtaining a first output mask according to the full quantity comparison result, and determining first output data from the reference comparison data according to the first output mask.
The reference comparison data refers to the data to be written whose elements serve as the reference during comparison; each of its elements is compared with the elements of the data to be compared. The data to be compared refers to the data to be written against which the reference comparison data is compared. The reference comparison data may be the first data to be written or the second data to be written. When the reference comparison data is the first data to be written, the data to be compared is the second data to be written, and when the reference comparison data is the second data to be written, the data to be compared is the first data to be written.
Full comparison refers to comparing each element in the reference comparison data with all elements in the data to be compared. The full comparison result for an element is either "elements are the same" or "elements are different". "Elements are the same" means that an element in the reference comparison data is the same as at least one element in the data to be compared. "Elements are different" means that an element in the reference comparison data is different from all elements in the data to be compared.
The first output mask is obtained by binary encoding the full comparison result corresponding to each element in the reference comparison data. For example, an element of the reference comparison data whose full comparison result is "elements are the same" may be encoded as 1, and an element whose result is "elements are different" may be encoded as 0. Encoding all elements in the reference comparison data yields the first output mask. The first output data consists of the elements of the reference comparison data that are output according to the first output mask. For example, when the intersection of the first data set and the second data set needs to be calculated, the bits of the first output mask that indicate identical elements are taken as target bits, and the elements corresponding to those target bits are selected from the reference comparison data to obtain the first output data.
Specifically, the first target data is determined as the reference comparison data and the second target data as the data to be compared. Each element in the first target data is compared with all elements in the second target data to obtain a comparison result, each element in the first target data is encoded according to the comparison result to obtain a first output mask, and the first output data is determined from the first target data according to the first output mask.
In one embodiment, the second target data is determined as the reference comparison data and the first target data as the data to be compared. Each element in the second target data is compared with all elements in the first target data using AVX512 instructions to obtain a comparison result, each element in the second target data is encoded according to the comparison result to obtain a first output mask, and the first output data is determined from the second target data using AVX512 instructions according to the first output mask.
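The full comparison and the derivation of the first output mask can be illustrated with a scalar Python sketch. It follows the encoding given above (1 for an element that equals at least one element of the data to be compared, 0 otherwise); the function name full_compare and the sample values are illustrative assumptions, and a vectorised version would replace the nested loops with AVX512 compare instructions.

```python
def full_compare(reference, to_compare):
    """Compare every reference element with all elements of `to_compare`.

    Returns the first output mask (bit i is 1 when reference[i] equals at
    least one element of `to_compare`) and the elements selected by it.
    """
    mask = 0
    for lane, ref in enumerate(reference):
        if any(ref == other for other in to_compare):
            mask |= 1 << lane
    output = [ref for lane, ref in enumerate(reference) if (mask >> lane) & 1]
    return mask, output


# Illustrative four-lane example with the first data as the reference.
first_to_write = [1, 3, 5, 7]
second_to_write = [2, 3, 4, 5]
out_mask, first_output = full_compare(first_to_write, second_to_write)
# out_mask == 0b0110 and first_output == [3, 5]
```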
In one embodiment, after step S206, step S210 may be performed first, and then step S208 may be performed. That is, after step S206, the steps of determining the reference comparison data and the data to be compared from the first data to be written and the second data to be written, performing full-scale comparison of the reference comparison data and the data to be compared to obtain a full-scale comparison result, obtaining a first output mask according to the full-scale comparison result, determining the first output data from the reference comparison data according to the first output mask, and updating the preset first mask and the preset second mask according to the first data to be written and the second data to be written are performed to obtain the corresponding updated first mask and the updated second mask.
S212, respectively obtaining updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask, taking the updated first target data as first data to be written, and taking the updated second target data as second data to be written.
Specifically, AVX512 instructions are used to sequentially select, from the first data set, as many elements as there are target bits in the updated first mask, giving the updated first target data, and to sequentially select, from the second data set, as many elements as there are target bits in the updated second mask, giving the updated second target data. The updated first target data is taken as the first data to be written, and the updated second target data as the second data to be written.
S214, judging whether the first data set and the second data set are traversed. When the first data set and the second data set are traversed, step S216 is executed; when they are not traversed, execution returns to step S206 and the loop continues.
S216, combining the first output data corresponding to each traversal to obtain target output data.
The first data set and the second data set are considered traversed when each element in the first data set has been compared with all elements in the second data set, that is, when traversal of all elements in the first data set is complete, or when each element in the second data set has been compared with all elements in the first data set, that is, when traversal of all elements in the second data set is complete.
The target output data refers to data obtained from the first data set and the second data set. For example, the elements in the target output data may be elements in both the first data set and the second data set.
Specifically, once each element in the first data set has been compared with each element of the second data set, the first data set and the second data set are traversed. At this point, step S216 is executed to combine the first output data obtained in each cycle into the target output data. When the first data set and the second data set are not yet traversed, execution returns to step S206, that is, the first data to be written is written into the first register and the second data to be written is written into the second register, and the loop continues. The data written into a register in each cycle overwrites data written in the previous cycle, and data that is not overwritten is retained in the register.
According to the data processing method described above, updated first target data and updated second target data are obtained from the corresponding data sets according to the updated first mask and the updated second mask; the updated first target data is taken as the first data to be written and the updated second target data as the second data to be written; and execution returns to the step of writing the first data to be written into the first register and the second data to be written into the second register. Because the target data is obtained through the updated masks in each cycle, each cycle can process the maximum amount of data in the first data set and the second data set that can be processed, rather than a fixed quantity of data. This reduces the number of cycles in the CPU operation and improves the CPU operation performance.
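The loop formed by steps S206 to S216 can be sketched end to end as a scalar Python model that computes an intersection. Several points are assumptions rather than statements from the text: both data sets are ascending and free of duplicates, a lane is replaced when its element is not greater than the maximum of the other register, traversal ends when either register holds no valid element, and LANES is set to 4 for readability. The names expand_load and intersect are hypothetical.

```python
LANES = 4  # hypothetical register width in elements


def expand_load(reg, mask, data, pos):
    """Fill lanes whose mask bit is 1 with the next unread elements of `data`.

    Lanes with bit 0 keep their old value; None marks a lane with no data left.
    """
    reg = list(reg)
    for lane in range(LANES):
        if (mask >> lane) & 1:
            if pos < len(data):
                reg[lane] = data[pos]
                pos += 1
            else:
                reg[lane] = None
    return reg, pos


def intersect(first_set, second_set):
    """Intersection of two ascending, duplicate-free integer lists."""
    reg_a, reg_b = [None] * LANES, [None] * LANES
    mask_a = mask_b = (1 << LANES) - 1      # preset masks: all lanes
    pos_a = pos_b = 0
    result = []
    while True:
        # S206: write the data to be written into the two registers
        reg_a, pos_a = expand_load(reg_a, mask_a, first_set, pos_a)
        reg_b, pos_b = expand_load(reg_b, mask_b, second_set, pos_b)
        valid_a = [a for a in reg_a if a is not None]
        valid_b = [b for b in reg_b if b is not None]
        if not valid_a or not valid_b:
            break                           # S214: one data set is traversed
        # S210: full comparison, first register as reference comparison data
        result.extend(a for a in valid_a if a in valid_b)
        # S208 (assumed rule): replace lanes not greater than the other maximum
        max_a, max_b = max(valid_a), max(valid_b)
        mask_a = sum(1 << i for i, a in enumerate(reg_a) if a is None or a <= max_b)
        mask_b = sum(1 << i for i, b in enumerate(reg_b) if b is None or b <= max_a)
    return result                           # S216: combined first output data
```

Under these assumptions each cycle refills as many lanes as were consumed instead of a fixed block, which is the behaviour the text credits with reducing the number of cycles. The output follows lane order, so it is not necessarily ascending.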
In one embodiment, as shown in fig. 3, determining the first target data from the first data set according to a preset first mask includes the steps of:
S302, calculating a first number of target bits in a preset first mask and acquiring a preset starting position corresponding to a first data set.
S304, sequentially selecting a first number of elements from the first data set according to a preset starting position corresponding to the first data set to obtain first target data.
The target bit refers to a bit value in the mask, and may be 1 or 0. The first number refers to the number of target bits in the first mask. The preset starting position refers to the preset position from which elements are selected from the data set. For example, selection may be set to start from the first element in the data set, or from the last element, and so on.
Specifically, the number of target bits contained in the preset first mask is calculated, and the preset starting position corresponding to the first data set is obtained. AVX instructions are then used to select elements sequentially from the preset starting position in the first data set until the first number of elements has been selected, which yields the first target data. For example, suppose the number of target bits contained in the first mask is calculated to be 16 and the preset starting position is the first element in the first data set. A _mm512_mask_expandload instruction is then used to select elements starting from the first element in the first data set until 16 elements have been selected in sequence, and the first target data is obtained from these 16 elements.
In the above embodiment, the first target data is obtained by selecting a first number of elements from a starting position in the first data set according to a first number of target bits in a preset first mask. The first target data obtained each time is the maximum number which can be processed, and the processing efficiency of the data is improved.
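For illustration only, the selection in S302 and S304 can be modeled in scalar form. The following is a minimal Python sketch, not the AVX implementation: `popcount` and `expand_load` are hypothetical helper names, and the sketch assumes that a bit value of 1 marks a register position to be filled while positions with bit 0 keep their previous contents.

```python
def popcount(mask):
    """Number of bits set to 1 in an integer mask."""
    return bin(mask).count("1")


def expand_load(register, mask, data, start):
    """Scalar model of a masked expand-load (hypothetical helper).

    Consecutive elements of `data`, beginning at index `start`, are written
    into the positions of `register` whose mask bit is 1 (position 0 is the
    lowest bit). Positions whose bit is 0 keep their old value. Returns the
    new register contents and the number of elements consumed.
    """
    result = list(register)
    taken = 0
    for lane in range(len(register)):
        if (mask >> lane) & 1:
            result[lane] = data[start + taken]
            taken += 1
    return result, taken


data_set = list(range(100, 132))   # illustrative first data set
preset_mask = 0xFFFF               # 16 target bits, all set to 1
first_target, first_number = expand_load([0] * 16, preset_mask, data_set, 0)
```

With all 16 bits set, the first 16 elements of the data set are selected; with a partially set mask, only the marked positions are overwritten and the number of elements consumed equals the number of 1 bits.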
In one example, as shown in fig. 4, determining the second target data from the second data set according to the preset second mask includes the steps of:
S402, calculating a second number of target bits in the second mask and acquiring a preset starting position corresponding to the second data set.
S404, sequentially selecting a second number of elements from the second data set according to the preset starting position corresponding to the second data set to obtain second target data.
The target bit refers to a bit value in the mask, which may be 1 or 0. The second number refers to the number of target bits in the second mask. The preset starting position refers to the preset position from which elements are selected from the data set. For example, selection may be set to start from the first element in the data set, or from the last element, and so on.
Specifically, the number of target bits contained in the preset second mask is calculated, and the preset starting position corresponding to the second data set is obtained. AVX instructions are then used to select elements sequentially from the preset starting position in the second data set until the second number of elements has been selected, which yields the second target data. For example, suppose the number of target bits contained in the second mask is calculated to be 16 and the preset starting position is the first element in the second data set. A _mm512_mask_expandload instruction is then used to select elements starting from the first element in the second data set until 16 elements have been selected in sequence, and the second target data is obtained from these 16 elements.
In the above embodiment, the second target data is obtained by selecting a second number of elements from the starting position in the second data set according to a second number of target bits in the preset second mask. The second target data obtained each time is the maximum number which can be processed, and the processing efficiency of the data is improved.
In one example, as shown in fig. 5, updating the preset first mask and the preset second mask according to the first target data and the second target data, to obtain corresponding updated first mask and updated second mask includes:
S502, determining a second target element according to the size of each element in the second data to be written, comparing the second target element with each element in the first data to be written to obtain a first comparison result, and determining the updated first mask according to the first comparison result.
The second target element is obtained according to the size of each element in the second data to be written; it may be the largest element in the second data to be written, or the element ranked highest when the second data to be written is ordered. The updated first mask is the mask obtained after comparing the second target element with each element in the first data to be written, and it is used to select elements from the first data set in the next cycle. For each element in the first data to be written, the first comparison result is either that the second target element is not smaller than that element or that the second target element is smaller than that element.
Specifically, AVX instructions are used to determine the second target element according to the size of each element in the second data to be written and to compare the second target element with each element in the first data to be written, which gives the first comparison result; the updated first mask is determined according to the first comparison result. The bits of the preset first mask correspond one-to-one to the elements in the first data to be written, so the corresponding bit of the first mask can be updated according to the result of comparing the second target element with each element in the first data to be written. For example, when the second target element is not smaller than an element in the first data to be written, the bit of the preset first mask corresponding to that element is updated to 1; when the second target element is smaller than an element in the first data to be written, the bit of the preset first mask corresponding to that element is updated to 0. When the comparison is completed, the updated first mask is obtained.
S504, determining a first target element according to the size of each element in the first data to be written, comparing the first target element with each element in the second data to be written to obtain a second comparison result, and determining the updated second mask according to the second comparison result.
The first target element is obtained according to the size of each element in the first data to be written; it may be the largest element in the first data to be written, or the element ranked highest when the first data to be written is ordered. The updated second mask is the mask obtained after comparing the first target element with each element in the second data to be written, and it is used to select elements from the second data set in the next cycle. For each element in the second data to be written, the second comparison result is either that the first target element is not smaller than that element or that the first target element is smaller than that element.
Specifically, AVX instructions are used to determine the first target element according to the size of each element in the first data to be written and to compare the first target element with each element in the second data to be written, which gives the second comparison result; the updated second mask is determined according to the second comparison result. The corresponding bit of the preset second mask can be updated according to the result of comparing the first target element with each element in the second data to be written. For example, when the first target element is not smaller than an element in the second data to be written, the bit of the preset second mask corresponding to that element is updated to 1; when the first target element is smaller than an element in the second data to be written, the bit of the preset second mask corresponding to that element is updated to 0. When the comparison is completed, the updated second mask is obtained.
In the above embodiment, the updated first mask is obtained by comparing the second target element with each element in the first data to be written, and the updated second mask is obtained by comparing the first target element with each element in the second data to be written. In the next cycle, elements can be quickly selected from the data sets according to the updated masks and written into the registers, which improves running efficiency. In each cycle, the first data set and the second data set advance by the maximum step size, where the step size is the number of elements that are marked as processed in a cycle and are not used in the next cycle. The maximum step size is the largest step that can be taken while keeping the calculation logic correct. Because each cycle processes the largest possible number of elements, the number of cycles required in the whole operation is minimized. In addition, the loop body is executed without conditional statements, which avoids the risk of the CPU pipeline being flushed by a branch misprediction and improves CPU performance.
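As a hedged illustration of S502 and S504, the mask update can be modeled with plain lists. The sketch below is in Python rather than AVX instructions; `update_masks` is a hypothetical name, the target elements are assumed to be the largest elements, and bit 1 is assumed to mark an element that has been fully processed and may be replaced in the next cycle.

```python
def update_masks(va, vb):
    """Return (mask_a, mask_b) for the next cycle as lists of 0/1 bits.

    A position of va gets bit 1 when the largest element of vb is not smaller
    than the element stored there. The mask for vb is built symmetrically
    from the largest element of va.
    """
    max_a = max(va)   # first target element
    max_b = max(vb)   # second target element
    mask_a = [1 if max_b >= x else 0 for x in va]
    mask_b = [1 if max_a >= y else 0 for y in vb]
    return mask_a, mask_b


va = [2, 4, 6, 8]     # illustrative first data to be written
vb = [1, 2, 3, 4]     # illustrative second data to be written
mask_a, mask_b = update_masks(va, vb)
step_a = sum(mask_a)  # elements of the first data set retired this cycle
step_b = sum(mask_b)  # elements of the second data set retired this cycle
```

In this example the second register is retired completely, while only the two elements of the first register that do not exceed 4 are retired; the register holding the smaller maximum is always retired in full, so every cycle makes progress.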
In one example, as shown in fig. 6, after determining a first target element according to the size of each element in the first data to be written, comparing the first target element with each element in the second data to be written to obtain a second comparison result, and determining the updated second mask according to the second comparison result, the method further includes:
S602, a preset first starting position corresponding to the first data set and a preset second starting position corresponding to the second data set are obtained.
S604, calculating an updated first number of target bits in the updated first mask, and determining an updated first starting position according to the preset first starting position and the updated first number.
The preset first starting position is the preset position from which elements are selected from the first data set; it may be the first element in the first data set or the last element in the first data set. The updated first number is the number of target bits in the updated first mask, and the updated first starting position is obtained according to the preset first starting position and the updated first number. The starting position is updated in each cycle, so that elements are selected from the first data set according to the updated starting position in the next cycle.
Specifically, the preset first starting position corresponding to the first data set is obtained, the updated first number of target bits in the updated first mask is calculated, and the updated first starting position is determined according to the preset first starting position and the updated first number. For example, the preset first starting position may be set as a pointer i=0 that points to the first element in the first data set. When the updated first number of target bits in the updated first mask is calculated to be n=16, the pointer for the updated first starting position is i=16, that is, it points to the 17th element in the first data set. When the first target data is selected in the next cycle, selection starts from the 17th element in the first data set.
S606, calculating an updated second quantity of target bits in the updated second mask, and determining an updated second starting position according to the preset second starting position and the updated second quantity.
The preset second starting position is the preset position from which elements are selected from the second data set; it may be the first element in the second data set or the last element in the second data set. The updated second number refers to the number of target bits in the updated second mask, which may be the sum of the number of bits 1 and the number of bits 0. The updated second starting position is obtained according to the preset second starting position and the updated second number. The starting position is updated in each cycle, so that elements are selected from the second data set according to the updated starting position in the next cycle.
Specifically, the preset second starting position corresponding to the second data set is obtained, the updated second number of target bits in the updated second mask is calculated, and the updated second starting position is determined according to the preset second starting position and the updated second number. For example, the preset second starting position may be set as a pointer j=0 that points to the first element in the second data set. When the updated second number of target bits in the updated second mask is calculated to be n=16, the pointer for the updated second starting position is j=16, that is, it points to the 17th element in the second data set. When the second target data is selected in the next cycle, selection starts from the 17th element in the second data set.
In the above embodiment, the starting position for selecting elements is updated from the preset starting position and the number of target bits in the updated mask, which improves the efficiency of selecting the target data in each cycle.
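The pointer arithmetic of S604 and S606 can be sketched as follows. This is an illustrative Python model only; `advance` is a hypothetical helper, and the sketch assumes that the target bits counted are the bits equal to 1 and that the starting position is a zero-based index moving forward through the data set.

```python
def advance(start, mask_bits):
    """Updated starting position: old position plus the number of 1 bits."""
    return start + sum(mask_bits)


i = 0                                # preset first starting position
i = advance(i, [1] * 16)             # updated first mask with 16 target bits
j = 0                                # preset second starting position
j = advance(j, [1] * 8 + [0] * 8)    # updated second mask with 8 target bits
```

After the update, i is 16, so the next selection from the first data set begins at its 17th element, matching the example above; j is 8 because only eight positions of the second register are to be refilled.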
In one example, as shown in fig. 7, comparing the second target element with each element in the first data to be written to obtain a first comparison result, and determining the updated first mask according to the first comparison result includes:
S702, when the second target element exceeds the target first element in the first data to be written, obtaining a first mark corresponding to the target first element.
The target first element refers to an element in the first data to be written that does not exceed the second target element; for example, the target first element may be less than or equal to the second target element. The first mark is the mark of the target first element and represents the result of comparing the second target element with the target first element.
Specifically, when the second target element exceeds the target first element in the first data to be written, the target first element is marked with the corresponding first mark. For example, if the second target element is 10 and the target first element is 8, then 8 is less than 10 and 8 is marked with the corresponding mark 1.
S704, when the second target element does not exceed the target second element of the first data to be written, obtaining a second mark corresponding to the target second element.
The target second element refers to an element in the first data to be written that exceeds the second target element; for example, the target second element may be larger than the second target element. The second mark is the mark of the target second element and represents the result of comparing the second target element with the target second element.
Specifically, when the second target element does not exceed the target second element in the first data to be written, the target second element is marked with the corresponding second mark. For example, if the second target element is 10 and the target second element is 11, then 11 is greater than 10 and 11 is marked with the corresponding mark 0.
S706, determining the updated first mask according to the first mark corresponding to the target first element and the second mark corresponding to the target second element.
Specifically, each element in the first data to be written is marked, and the updated first mask is then obtained, in element order, from the first mark corresponding to each target first element and the second mark corresponding to each target second element.
In the above embodiment, the comparison result of the second target element and each element in the first data to be written is marked, and the updated first mask is obtained according to the mark, so that the subsequent use is convenient.
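For illustration, the marking in S702 to S706 can be followed by packing the marks into a single integer, in the way a hardware mask register holds one bit per element. The Python sketch below is an assumption-laden model: `mark_elements` and `marks_to_mask` are hypothetical names, and element 0 is assumed to map to the lowest bit.

```python
def mark_elements(first_to_write, second_target):
    """Mark 1 for elements not exceeding the second target element, else 0."""
    return [1 if second_target >= e else 0 for e in first_to_write]


def marks_to_mask(marks):
    """Pack per-element marks into an integer; element 0 is the lowest bit."""
    mask = 0
    for lane, mark in enumerate(marks):
        mask |= (mark & 1) << lane
    return mask


marks = mark_elements([8, 11], 10)   # 8 gets mark 1, 11 gets mark 0
updated_first_mask = marks_to_mask(marks)
```

With the values from the example above, the marks are 1 and 0, and the packed mask has only its lowest bit set.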
In one example, as shown in fig. 8, comparing the first target element with each element in the second data to be written to obtain a second comparison result, and obtaining the updated second mask according to the second comparison result includes:
S802, when the first target element exceeds a target third element in the second data to be written, a first mark corresponding to the target third element is obtained.
The target third element refers to an element in the second data to be written that does not exceed the first target element; for example, the target third element may be less than or equal to the first target element. The first mark is the mark of the target third element and represents the result of comparing the first target element with the target third element.
Specifically, when the first target element exceeds the target third element in the second data to be written, the target third element is marked with the corresponding first mark. For example, if the first target element is 11 and the target third element is 8, then 8 is smaller than 11 and 8 is marked with the corresponding mark 1.
S804, when the first target element does not exceed the target fourth element of the second data to be written, obtaining a second mark corresponding to the target fourth element.
The target fourth element refers to an element in the second data to be written that exceeds the first target element; for example, the target fourth element may be greater than the first target element. The second mark is the mark of the target fourth element and represents the result of comparing the first target element with the target fourth element.
Specifically, when the first target element does not exceed the target fourth element in the second data to be written, the target fourth element is marked with the corresponding second mark. For example, if the first target element is 11 and the target fourth element is 12, then 12 is greater than 11 and 12 is marked with the corresponding mark 0.
S806, determining the updated second mask according to the first mark corresponding to the target third element and the second mark corresponding to the target fourth element.
Specifically, each element in the second data to be written is marked, and the updated second mask is then obtained, in element order, from the first mark corresponding to each target third element and the second mark corresponding to each target fourth element.
In the above embodiment, the comparison result of the first target element and each element in the second data to be written is marked, and the updated second mask is obtained according to the mark, so that the subsequent use is convenient.
In one example, as shown in fig. 9, the comparing the reference comparison data with the data to be compared in full, to obtain a full comparison result, obtaining a first output mask according to the full comparison result, and determining the first output data from the reference comparison data according to the first output mask includes:
S902, comparing elements in the reference comparison data with elements in the data to be compared.
S904, when the target element in the reference comparison data is the same as any element in the data to be compared, obtaining a first bit corresponding to the target element in the reference comparison data.
S906, when all the target elements in the reference comparison data are different from the elements in the data to be compared, obtaining a second bit corresponding to the target elements in the reference comparison data.
Wherein the first bit may be 1 and the second bit may be 0.
Specifically, each element in the reference comparison data is compared with all elements in the data to be compared. When a target element in the reference comparison data is identical to any one of the elements in the data to be compared, the bit value of the output mask bit corresponding to that target element is the first bit. When the target element in the reference comparison data is different from all elements in the data to be compared, the bit value of the output mask bit corresponding to that target element is the second bit.
S908, determining a first output mask according to the bits corresponding to the elements in the reference comparison data, and obtaining first output data according to the elements in the reference comparison data that correspond to the first bit in the first output mask.
Specifically, the first output mask corresponding to the elements in the reference comparison data is obtained from the results of comparing the elements in the reference comparison data with the elements in the data to be compared, namely the first bits and the second bits. The elements in the reference comparison data whose bits in the first output mask are the first bit are selected to obtain the first output data.
In the above embodiment, the first output data is obtained from the elements that are common to the reference comparison data and the data to be compared, which gives the processing result for the reference comparison data and the data to be compared.
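The full comparison of S902 to S908 can be sketched in scalar form as follows. This Python model is illustrative only: `full_compare` is a hypothetical name, the first bit is taken to be 1 and the second bit 0, and a set lookup stands in for the all-pairs comparison that vector instructions would perform.

```python
def full_compare(reference, to_compare):
    """Return (first_output_mask, first_output_data).

    The mask holds bit 1 for each element of `reference` that equals at
    least one element of `to_compare`, and bit 0 otherwise. The output data
    are the elements of `reference` whose mask bit is 1.
    """
    present = set(to_compare)
    output_mask = [1 if e in present else 0 for e in reference]
    output_data = [e for e, bit in zip(reference, output_mask) if bit == 1]
    return output_mask, output_data


first_output_mask, first_output_data = full_compare([1, 2, 3, 4], [2, 4, 6, 8])
```

Here the reference comparison data is the first list; the elements 2 and 4 appear in both lists, so their mask bits are 1 and they form the first output data.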
In one example, after obtaining the updated first target data and the updated second target data from the corresponding data sets according to the updated first mask and the updated second mask, respectively, the method further includes:
Rearranging the updated second mask and the updated second target data corresponding to the updated second mask to obtain a rearranged second mask and rearranged second target data.
Specifically, the updated second mask and the updated second target data corresponding to the updated second mask are rearranged according to the corresponding sequence, so as to obtain rearranged second mask and rearranged second target data, the rearranged second target data can be used as second data to be written into the register, and the comparison efficiency can be improved when the full comparison is performed.
In one embodiment, the updated first mask and the updated first target data corresponding to the updated first mask may be rearranged to obtain the rearranged first mask and the rearranged first target data.
In one example, as shown in fig. 10, until the first data set and the second data set are traversed, combining the first output data corresponding to each traversal to obtain the target output data includes:
S1002, calculating the number of residual elements in the first data set, calculating a preset first mask bit number, and acquiring the residual elements in the first data set and the residual elements in the second data set when the number of residual elements in the first data set is less than the preset first mask bit number.
The remaining number of elements refers to the number of elements remaining in the data set when each cycle is completed.
Specifically, the number of remaining elements in the first data set and the preset first mask bit number are calculated, and when the number of remaining elements in the first data set is less than the preset first mask bit number, it is indicated that the number of remaining elements in the first data set cannot support completion of one cycle. At this time, the remaining elements in the first data set are acquired, and the remaining elements in the second data set are detected. When there are no remaining elements in the second data set, then all traversals of the elements in the second data set are completed. And directly combining the first output data corresponding to each cycle traversal to obtain target output data. And directly acquiring the residual elements in the second data set when the residual elements exist in the second data set.
S1004, comparing the residual elements in the first data set with the residual elements in the second data set to obtain residual output data, determining that the first data set and the second data set are traversed, and combining the first output data and the residual output data corresponding to each traversal to obtain target output data.
The remaining output data refers to elements which are simultaneously contained in the remaining elements in the first data set and the remaining elements in the second data set.
Specifically, a comparison algorithm is used to compare the remaining elements in the first data set with the remaining elements in the second data set to obtain the remaining output data. The comparison algorithm may be the classical scalar method; in a program, a C++ STL (Standard Template Library) library function may be called directly to perform the comparison. In each step, the front element of the remaining elements in the first data set is compared with the front element of the remaining elements in the second data set. If the two elements are the same, a remaining output element is obtained. If they are different, the data set whose front element is smaller advances by one element and the comparison is performed again. This continues until all remaining elements in the first data set and the second data set have been compared, and the remaining output elements obtained are taken as the remaining output data. At this point, it is determined that the first data set and the second data set have been traversed, and the first output data corresponding to each traversal and the remaining output data are combined to obtain the target output data.
In the above embodiment, when the elements in the data set cannot complete the cycle, the comparison algorithm is used for comparison, so that the accuracy of obtaining the target output data is improved.
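The scalar comparison of the remaining elements can be sketched as a two-pointer merge. The Python code below is a minimal illustration rather than the library function referred to above; `scalar_intersection` is a hypothetical name, and the sketch assumes that both inputs are sorted in ascending order and that, on a mismatch, the side holding the smaller front element advances.

```python
def scalar_intersection(rest_a, rest_b):
    """Elements contained in both ascending lists, found by a linear merge."""
    i = j = 0
    remaining_output = []
    while i < len(rest_a) and j < len(rest_b):
        if rest_a[i] == rest_b[j]:
            remaining_output.append(rest_a[i])   # common element found
            i += 1
            j += 1
        elif rest_a[i] < rest_b[j]:
            i += 1                               # front of rest_a is smaller
        else:
            j += 1                               # front of rest_b is smaller
    return remaining_output


remaining_output_data = scalar_intersection([1, 3, 5, 7], [3, 4, 5, 8])
```

The merge stops as soon as either list is exhausted, so it handles the case in which the second data set has no remaining elements without any special treatment.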
In a specific embodiment, the data processing method comprises the steps of:
S1102, a first data set and a second data set are acquired, and a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set are acquired.
S1104a, calculating a first number of target bits in a preset first mask and acquiring a preset starting position corresponding to the first data set.
S1104b, sequentially selecting a first number of elements from the first data set according to a preset starting position corresponding to the first data set to obtain first target data.
S1106a, calculating a second number of target bits in the second mask and acquiring a preset starting position corresponding to the second data set.
S1106b, sequentially selecting a second number of elements from the second data set according to the preset starting position corresponding to the second data set to obtain second target data, taking the first target data as first data to be written, and taking the second target data as second data to be written.
S1108, writing the first data to be written into the first register, and writing the second data to be written into the second register.
S1110a, determining a second target element according to the size of each element in the second data to be written, comparing the second target element with each element in the first data to be written to obtain a first comparison result, and determining the updated first mask according to the first comparison result.
S1110b, determining a first target element according to the size of each element in the first data to be written, comparing the first target element with each element in the second data to be written to obtain a second comparison result, and determining the updated second mask according to the second comparison result.
S1112a, a preset first starting position corresponding to the first data set and a preset second starting position corresponding to the second data set are obtained.
S1112b, calculating an updated first number of target bits in the updated first mask, and determining an updated first starting position according to the preset first starting position and the updated first number.
S1112c, calculating an updated second number of target bits in the updated second mask, and determining an updated second start position according to the preset second start position and the updated second number.
S1114a, determining reference comparison data and data to be compared from the first data to be written and the second data to be written, and comparing the elements in the reference comparison data with the elements in the data to be compared.
S1114b, when the target element in the reference comparison data is the same as any element in the data to be compared, obtaining the first bit corresponding to the target element in the reference comparison data.
S1114c, when the target element in the reference comparison data is different from all elements in the data to be compared, obtaining a second bit corresponding to the target element in the reference comparison data.
S1114d, determining a first output mask according to the bits corresponding to the elements in the reference comparison data, and obtaining first output data according to the elements in the reference comparison data that correspond to the first bit in the first output mask.
S1116, obtaining updated first target data from the first data set according to the updated first mask and the updated first starting position, obtaining updated second target data from the second data set according to the updated second mask and the updated second starting position, taking the updated first target data as first data to be written, and taking the updated second target data as second data to be written.
S1118, judging whether the first data set and the second data set have been traversed. When the traversal is not completed, execution returns to step S1108, that is, the first data to be written is written into the first register and the second data to be written is written into the second register. When the traversal is completed, step S1120 is performed.
S1120, combining the first output data corresponding to each traversal to obtain target output data.
FIG. 12 is a schematic diagram illustrating a portion of the operation of a data processing method in one embodiment. Set input_a and set input_b, and compute the intersection of set input_a and set input_b. In loop0 loop, 16 elements are selected from the set input_a according to the number of bits 1 in the preset mask_a, so as to obtain first target data, and the first target data is written into the register va. And selecting 16 elements from the set input_b according to the number of bits 1 in the preset mask_b to obtain second target data, and writing the second target data into the register vb. Comparing the maximum value 16 in the register va with the element in the register vb, the mask_b in the loop of loop 1 is obtained. Comparing the maximum value 31 in the register vb with the elements in the register va, the mask_a in the loop of loop 1 is obtained. The register va is compared with the full amount of the register vb to obtain the first output data (1, 3,5,7,9,11,13, 15). And selecting 16 elements from the set input_a according to the number of bits 1 in mask_a in the loop of loop 1, writing the elements into a register va, and covering the data written in the loop of loop0 to obtain the register va in the loop of loop 1.8 elements are selected from the set input_b according to the number of bits 1 in mask_b in loop 1 (the elements with bits 0 of mask in loop 1 are skipped during selection), selected data (33,35,37,39,41,43,45,47) are written into the register vb, and the data written in loop0 loop is covered, so that the register vb in loop 1 loop is obtained. At this time, the loop is repeated, and the next loop is entered for execution until the element traversal in the set input_a and the set input_b is completed. And combining the obtained first output data to obtain target output data. The target output data is the intersection of the set input_a and the set input_b.
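For illustration, the flow of steps S1102 to S1120 and the walkthrough above can be modeled end to end in scalar form. The following Python sketch is not the AVX implementation. It assumes ascending, duplicate-free inputs, takes the largest register element as the target element, and treats bit 1 as "refill this position". The names `intersect` and `scalar_intersection`, the contents chosen for input_a and input_b, and the handling of elements still held in the registers when the loop ends are assumptions made for the sketch.

```python
def scalar_intersection(xs, ys):
    """Two-pointer intersection of two ascending lists (used for the tail)."""
    i = j = 0
    out = []
    while i < len(xs) and j < len(ys):
        if xs[i] == ys[j]:
            out.append(xs[i])
            i += 1
            j += 1
        elif xs[i] < ys[j]:
            i += 1
        else:
            j += 1
    return out


def intersect(a, b, width=16):
    """Scalar model of the mask-driven loop; returns elements common to a and b."""
    va, vb = [None] * width, [None] * width
    mask_a, mask_b = [1] * width, [1] * width   # preset masks: fill every position
    i = j = 0                                   # preset starting positions
    output = []
    # Continue while both data sets can supply the elements the masks ask for.
    while len(a) - i >= sum(mask_a) and len(b) - j >= sum(mask_b):
        # Refill only the positions whose mask bit is 1; others are retained.
        for lane in range(width):
            if mask_a[lane]:
                va[lane] = a[i]
                i += 1
            if mask_b[lane]:
                vb[lane] = b[j]
                j += 1
        # Full comparison: keep the elements of va that also occur in vb.
        in_vb = set(vb)
        output.extend(x for x in va if x in in_vb)
        # Mask update: an element is retired when the other register's
        # largest element is not smaller than it.
        max_a, max_b = max(va), max(vb)
        mask_a = [1 if x <= max_b else 0 for x in va]
        mask_b = [1 if y <= max_a else 0 for y in vb]
    # Tail: elements still retained in the registers plus the unread remainder.
    rest_a = sorted(x for x, bit in zip(va, mask_a) if not bit) + a[i:]
    rest_b = sorted(y for y, bit in zip(vb, mask_b) if not bit) + b[j:]
    output.extend(scalar_intersection(rest_a, rest_b))
    return output


input_a = list(range(1, 65))        # assumed contents: 1, 2, ..., 64
input_b = list(range(1, 128, 2))    # assumed contents: 1, 3, ..., 127
result = intersect(input_a, input_b)
```

With these assumed inputs, the first cycle loads 1 to 16 into va and the odd numbers 1 to 31 into vb, retires all of va and eight positions of vb, and the second cycle loads 33 to 47 into the retired positions of vb, as in the walkthrough. The output is collected in register order, so it may need sorting if an ordered result is required.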
In one embodiment, the data in the register vb and the mask bits in mask_b are rearranged before each loop's load is performed. As shown in fig. 14, which is a schematic diagram of part of the operation of a data processing method in one embodiment, the data in the register vb and the mask bits in mask_b are rearranged correspondingly before loop 1, so as to improve operation efficiency.
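The text does not state which permutation is applied. One plausible reading, shown here purely as an assumption, is a stable partition that moves the retained elements (mask bit 0) to the front and the slots to be refilled (mask bit 1) to the back, so that the next load touches a contiguous range. The function name `rearrange` is hypothetical.

```python
def rearrange(values, mask):
    """Move retained entries (mask bit 0) first and refill slots (mask bit 1) last."""
    kept = [v for v, m in zip(values, mask) if m == 0]
    stale = [v for v, m in zip(values, mask) if m == 1]
    # The mask is permuted together with the data so that bits still match slots.
    new_mask = [0] * len(kept) + [1] * len(stale)
    return kept + stale, new_mask
```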
In one example, as shown in fig. 14, the method further comprises:
S1402, when the data set is a friend data set, determining corresponding common friend data according to the first friend data set and the second friend data set.
The friend data set refers to a set of personal user identifications added by a personal user of an instant messaging tool. A personal user can add a plurality of personal user identifications, that is, a plurality of friends, through the instant messaging tool. A personal user identification uniquely identifies a personal user and may be a code, a name, or the like. The instant messaging tool may be Tencent QQ, WeChat, YY voice, IS, full circle, mobile flyer, LAHOO (music tiger), LASIN (music message), fastMsg, ant ao, and the like. The first friend data set refers to the set of personal user identifications added by the first user. The second friend data set refers to the set of personal user identifications added by the second user. The first user and the second user may be users in the same user group, and the user group may be a QQ group, a WeChat group, or the like. The common friend data refers to the personal user identifications that appear in both the first friend data set and the second friend data set.
Specifically, when the data set is a friend data set, a first friend data set of a first user and a second friend data set of a second user are obtained, and a preset first mask corresponding to the first friend data set and a preset second mask corresponding to the second friend data set are obtained. First target data is determined from the first friend data set according to the preset first mask, second target data is determined from the second friend data set according to the preset second mask, the first target data is taken as first data to be written, and the second target data is taken as second data to be written. The first data to be written is written into a first register, and the second data to be written is written into a second register. The preset first mask and the preset second mask are updated according to the first data to be written and the second data to be written. Reference comparison data and data to be compared are determined from the first data to be written and the second data to be written, the reference comparison data is compared in full with the data to be compared to obtain a full comparison result, a first output mask is obtained according to the full comparison result, and first output data is determined from the reference comparison data according to the first output mask. Updated first target data and updated second target data are then obtained from the corresponding friend data sets according to the updated first mask and the updated second mask, the updated first target data is taken as first data to be written, the updated second target data is taken as second data to be written, and the process returns to the step of writing the first data to be written into the first register and writing the second data to be written into the second register, until the first friend data set and the second friend data set have been traversed. The first output data corresponding to each traversal are combined to obtain the common friend data, and the common friend data is sent to a terminal corresponding to the first user and a terminal corresponding to the second user for display. Alternatively, the number of personal user identifications in the common friend data may be counted and sent to the terminal corresponding to the first user and the terminal corresponding to the second user for display.
S1404, when the data set is text data, corresponding common text data is determined according to the first text data and the second text data.
Specifically, the data set may be text data. Text refers to a representation of written language; from a grammatical point of view, it is typically a sentence or a combination of sentences having a complete, systematic meaning (message). A text may be a sentence, a paragraph, or a discourse. In the broad sense, "text" is any utterance fixed by writing (Li Keer). In the narrow sense, "text" is a literary entity composed of language and words, that is, a "work", which forms an independent, self-contained system relative to the author and the world.
In one embodiment, the text data may be a paper. A paper generally refers to an article that conducts research in an academic field and describes academic research results, including academic papers, graduation papers, scientific papers, achievement papers, and the like. When a user needs to check a paper for duplication, the user's paper is acquired and used as the first text data. The paper data in a paper database is acquired, and each paper in the database may be used as second text data. The corresponding common text data, that is, the repeated paper data, is determined according to the first text data and each second text data. A repetition rate is determined from the repeated paper data and each second text data. The repetition rate and the repeated paper data are sent to the terminal corresponding to the user's paper for display. The second text corresponding to the highest repetition rate may also be sent to the terminal corresponding to the user's paper for display.
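As a rough illustration of the duplicate check, the sketch below treats each paper as a set of sentences. Several things here are assumptions and not taken from the source: the sentence-level granularity, the function name `repetition_report`, the definition of the repetition rate as the share of a database paper's distinct sentences that also occur in the user's paper, and the use of Python's built-in set intersection in place of the register-based traversal.

```python
def repetition_report(user_sentences, database):
    """Return per-paper (rate, common sentences) and the title with the highest rate."""
    user = set(user_sentences)
    report = {}
    for title, sentences in database.items():
        unique = set(sentences)
        common = user & unique                      # repeated paper data
        rate = len(common) / len(unique) if unique else 0.0
        report[title] = (rate, sorted(common))
    # Paper with the highest repetition rate, or None for an empty database.
    best = max(report, key=lambda t: report[t][0]) if report else None
    return report, best
```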
In one embodiment, the text data may be text in a web page. The text in a first web page and the text in a second web page are obtained, and the keywords common to both are determined according to the text in the first web page and the text in the second web page. The address of the second web page is acquired, and the common keywords in the first web page are associated with the address of the second web page. The address of the first web page is acquired, and the common keywords in the second web page are associated with the address of the first web page. When the user clicks on a common keyword in the first web page, the user may jump to the second web page. When the user clicks on a common keyword in the second web page, the user may jump to the first web page.
S1406, when the data set is a list attribute, determining a corresponding common list attribute according to the first list attribute and the second list attribute.
Wherein the data objects are stored in the form of data tuples in a data table, rows of the data table corresponding to the data objects and columns corresponding to the attributes. An attribute is a data field that represents a characteristic of a data object. The list attribute is a field that refers to a column in the data table. The common list attribute refers to the same column field.
Specifically, when the data set is a list attribute, a first list attribute input by a user, that is, a column attribute of a first data table, may be acquired, a column attribute corresponding to each data table in the database may be acquired, and a column attribute of each data table in the database may be used as each second list attribute. And determining the common list attribute of the first list attribute and each second list attribute according to the first list attribute and each second list attribute, and obtaining the data table with the most common list attribute with the first list attribute.
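The selection of the data table sharing the most column attributes can be sketched as below. The function name `best_matching_table` and the dictionary layout (table name mapped to its column names) are hypothetical, and built-in set intersection is used here instead of the register-based method.

```python
def best_matching_table(first_columns, tables):
    """Return the table sharing the most column names with first_columns."""
    first = set(first_columns)
    # Common list attributes between the first table and every candidate table.
    common = {name: first & set(cols) for name, cols in tables.items()}
    best = max(common, key=lambda name: len(common[name]))
    return best, common[best]
```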
S1408, when the data set is graph data, corresponding common sub-graph data is determined from the first graph data and the second graph data.
Wherein, the graph is a data structure used for representing the relation between entity sets, and the graph data comprises the corresponding relation between entity machines and entity sets.
Specifically, when the data set is graph data, sub-graph matching is performed according to the first graph data and the second graph data, and common sub-graph data of the first graph data and the second graph data is determined.
In one embodiment, when the data set is graph data, the graph data may be a knowledge graph of geographic locations. A geographic path can be determined according to the initial geographic location input by the user and the knowledge graph of geographic locations for the terminal's geographic location.
It should be understood that, although the steps in the flowcharts of figs. 2-11 and 14 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited to that order, and the steps may be executed in other orders. Moreover, at least some of the steps in figs. 2-11 and 14 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
As shown in fig. 15, which is a schematic diagram of a data processing apparatus 1500 in one embodiment, the apparatus includes:
A data obtaining module 1502, configured to obtain a first data set and a second data set, and obtain a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set;
A target data determining module 1504, configured to determine first target data from the first data set according to a preset first mask, determine second target data from the second data set according to a preset second mask, use the first target data as first data to be written, and use the second target data as second data to be written;
a data writing module 1506, configured to write the first data to be written into the first register and write the second data to be written into the second register;
A mask updating module 1508 for updating a preset first mask and a preset second mask according to the first data to be written and the second data to be written;
The output determining module 1510 is configured to determine reference comparison data and data to be compared from the first data to be written and the second data to be written, compare the reference comparison data with the data to be compared in full, obtain a full comparison result, obtain a first output mask according to the full comparison result, and determine the first output data from the reference comparison data according to the first output mask;
The traversing module 1512 is configured to obtain updated first target data and updated second target data from corresponding data sets according to the updated first mask and the updated second mask, take the updated first target data as first data to be written, take the updated second target data as second data to be written, return to writing the first data to be written into the first register, and write the second data to be written into the second register, until the first data set and the second data set are traversed, and combine the first output data corresponding to each traversal to obtain the target output data.
In one embodiment, the target data determining module 1504 is further configured to calculate a first number of target bits in the preset first mask and obtain a preset starting position corresponding to the first data set; and sequentially selecting a first number of elements from the first data set according to a preset starting position corresponding to the first data set to obtain first target data.
In one example, the target data determining module 1504 is further configured to calculate a second number of target bits in the second mask and obtain a preset starting position corresponding to the second data set; and sequentially selecting a second number of elements from the second data set according to the preset starting position corresponding to the second data set to obtain second target data.
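The selection step for either data set amounts to counting the target bits in the mask and taking that many consecutive elements from the starting position. A minimal sketch follows, with a list of 0/1 values standing in for the mask and a hypothetical function name `select_by_mask`; it assumes that the target bits are the bits equal to 1, as in the FIG. 12 description.

```python
def select_by_mask(data, start, mask):
    """Take as many consecutive elements from `start` as there are 1 bits in `mask`."""
    count = sum(mask)                 # number of target bits (bits equal to 1)
    return data[start:start + count]
```

For the odd numbers 1 to 63, a starting position of 16 and a mask containing eight 1 bits, this selects (33, 35, ..., 47).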
In one example, the mask updating module 1508 is further configured to determine a second target element according to a size of each element in the second data to be written, compare the second target element with each element in the first data to be written to obtain a first comparison result, and determine to update the first mask according to the first comparison result; determining a first target element according to the size of each element in the first data to be written, comparing the first target element with each element in the second data to be written to obtain a second comparison result, and determining to update a second mask according to the second comparison result.
In one example, the mask updating module 1508 is further configured to obtain a preset first starting position corresponding to the first data set and a preset second starting position corresponding to the second data set; calculating an updating first quantity of target bits in the updating first mask, and determining an updating first starting position according to a preset first starting position and the updating first quantity; and calculating an updated second quantity of target bits in the updated second mask, and determining an updated second starting position according to the preset second starting position and the updated second quantity.
In one example, the mask updating module 1508 is further configured to obtain a first flag corresponding to a target first element when the second target element exceeds the target first element in the first data to be written; when the second target element does not exceed the target second element of the first data to be written, obtaining a second mark corresponding to the target second element; and determining to update the first mask according to the first mark corresponding to the target first element and the second mark corresponding to the target second element.
In one example, the mask updating module 1508 is further configured to obtain a first flag corresponding to a target third element when the first target element exceeds the target third element in the second data to be written; when the first target element does not exceed the target fourth element of the second data to be written, obtaining a second mark corresponding to the target fourth element; and determining to update the second mask according to the first mark corresponding to the target third element and the second mark corresponding to the target fourth element.
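The mask update for both registers can be sketched as follows, taking the target element of each register to be its maximum, as in the FIG. 12 description. The function name `update_masks` is hypothetical, the first mark is encoded as 1 and the second mark as 0, and an element equal to the other register's maximum is treated as consumed, which is an assumption not stated in the text.

```python
def update_masks(va, vb):
    """Mark with 1 every slot whose value is at or below the other register's maximum."""
    max_a, max_b = max(va), max(vb)   # first and second target elements
    mask_a = [1 if max_b >= x else 0 for x in va]
    mask_b = [1 if max_a >= x else 0 for x in vb]
    return mask_a, mask_b
```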
In one example, the output determination module 1510 is further configured to compare the elements in the reference comparison data with the elements in the data to be compared; when a target element in the reference comparison data is the same as any element in the data to be compared, obtain a first bit corresponding to the target element in the reference comparison data; when a target element in the reference comparison data differs from all elements in the data to be compared, obtain a second bit corresponding to the target element in the reference comparison data; and determine a first output mask according to the bits corresponding to the elements in the reference comparison data, and obtain first output data from the elements in the reference comparison data that correspond to the first bits in the first output mask.
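The full comparison and the resulting output mask can be illustrated as below. The function name `full_compare` is hypothetical, the first bit is encoded as 1 and the second bit as 0, and a Python set lookup replaces the element-by-element register comparison.

```python
def full_compare(reference, candidates):
    """Return the output mask over `reference` and the elements it selects."""
    lookup = set(candidates)
    # Bit 1: the reference element equals some element of the data to be compared.
    out_mask = [1 if x in lookup else 0 for x in reference]
    output = [x for x, bit in zip(reference, out_mask) if bit]
    return out_mask, output
```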
In one example, the data processing apparatus 1500 further comprises:
and the rearrangement module is used for rearranging the updated second mask and the updated second target data corresponding to the updated second mask to obtain the rearranged second mask and the rearranged second target data.
In one example, the traversal module 1512 is further configured to calculate a number of remaining elements in the first data set, calculate a preset first mask bit, and obtain remaining elements in the first data set and remaining elements in the second data set when the number of remaining elements in the first data set is less than the preset first mask bit; and comparing the residual elements in the first data set with the residual elements in the second data set to obtain residual output data, determining that the first data set and the second data set are traversed, and combining the first output data and the residual output data corresponding to each traversal to obtain target output data.
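The remainder handling can be sketched as two small helpers. Both names (`needs_tail`, `combine_with_tail`) are hypothetical, and the remaining elements are compared with a plain set lookup, which is an assumption about how the residual comparison might be carried out.

```python
def needs_tail(remaining_in_first, mask_bits):
    """True when too few elements remain to fill the preset number of mask bits."""
    return remaining_in_first < mask_bits

def combine_with_tail(block_outputs, rest_a, rest_b):
    """Append the matches among the remaining elements to the per-traversal outputs."""
    lookup = set(rest_b)
    tail = [x for x in rest_a if x in lookup]          # remaining output data
    return [x for block in block_outputs for x in block] + tail
```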
In one example, the data processing apparatus 1500 further comprises:
The common friend determining module is used for determining corresponding common friend data according to the first friend data set and the second friend data set when the data set is the friend data set;
The common text determining module is used for determining corresponding common text data according to the first text data and the second text data when the data set is the text data;
the list attribute determining module is used for determining a corresponding common list attribute according to the first list attribute and the second list attribute when the data set is the list attribute;
And the sub-graph data determining module is used for determining corresponding common sub-graph data according to the first graph data and the second graph data when the data set is the graph data.
In one embodiment, a computer device is provided, which may be a server or a terminal.
When the computer device is a server, its internal structure may be as shown in fig. 16. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a data processing method.
When the computer equipment is a terminal, the internal structure of the computer equipment also comprises a display screen, an input device, a camera, a sound collecting device, a loudspeaker and the like, wherein the display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, the input device of the computer equipment can be a touch layer covered on the display screen, can also be a key, a track ball or a touch pad arranged on a shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the structure shown in FIG. 16 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, the data processing apparatus provided by the present application may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 16. The memory of the computer device may store the program modules constituting the data processing apparatus, such as the data acquisition module, the target data determination module, the data writing module, the mask updating module, the output determination module, and the traversal module shown in fig. 15. The computer program constituted by the respective program modules causes the processor to execute the steps in the data processing method of the respective embodiments of the present application described in this specification.
For example, the computer apparatus shown in fig. 16 may execute step S202 by the data acquisition module in the data processing device shown in fig. 15. The computer device may perform step S204 through the target data determination module. The computer device may execute step S206 through the data writing module. The computer device may execute step S208 through the mask update module. The computer device may perform step S210 through the output determination module. The computer device may execute step S212 through the traversal module.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the data processing method described above. The steps of the data processing method herein may be the steps of the data processing method of the above-described respective embodiments.
In one embodiment, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the data processing method described above. The steps of the data processing method herein may be the steps of the data processing method of the above-described respective embodiments.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a non-volatile computer readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The foregoing examples illustrate only a few embodiments of the application and are described in detail herein without thereby limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.
Claims (14)
1. A data processing method, comprising:
Acquiring a first data set and a second data set, and acquiring a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set;
determining first target data from the first data set according to the preset first mask, determining second target data from the second data set according to the preset second mask, taking the first target data as first data to be written, and taking the second target data as second data to be written;
Writing the first data to be written into a first register, and writing the second data to be written into a second register;
Updating the preset first mask and the preset second mask according to the first data to be written and the second data to be written to obtain a corresponding updated first mask and an updated second mask;
determining reference comparison data and data to be compared from the first data to be written and the second data to be written, performing full comparison on the reference comparison data and the data to be compared to obtain a full comparison result, obtaining a first output mask according to the full comparison result, and determining first output data from the reference comparison data according to the first output mask;
And respectively obtaining updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask, taking the updated first target data as first data to be written, taking the updated second target data as second data to be written, returning to the step of writing the first data to be written into a first register and writing the second data to be written into a second register, and executing until the first data set and the second data set are traversed, and combining the first output data corresponding to each traversal to obtain the target output data.
2. The method of claim 1, wherein the determining first target data from the first data set according to the preset first mask comprises:
Calculating a first number of target bits in the preset first mask and acquiring a preset starting position corresponding to the first data set;
and sequentially selecting the first number of elements from the first data set according to a preset starting position corresponding to the first data set to obtain the first target data.
3. The method of claim 1, wherein the determining second target data from the second data set according to the preset second mask comprises:
Calculating a second number of target bits in the second mask and acquiring a preset starting position corresponding to the second data set;
and sequentially selecting the second number of elements from the second data set according to a preset starting position corresponding to the second data set to obtain the second target data.
4. The method of claim 1, wherein updating the preset first mask and the preset second mask based on the first data to be written and the second data to be written results in corresponding updated first mask and updated second mask, comprising:
determining a second target element according to the size of each element in the second data to be written, comparing the second target element with each element in the first data to be written to obtain a first comparison result, and determining the updated first mask according to the first comparison result;
Determining a first target element according to the size of each element in the first data to be written, comparing the first target element with each element in the second data to be written to obtain a second comparison result, and determining the updated second mask according to the second comparison result.
5. The method of claim 4, wherein after determining a first target element according to the size of each element in the first data to be written, comparing each element in the first target element and the second data to be written to obtain a second comparison result, determining the updated second mask according to the second comparison result, further comprising:
Acquiring a preset first starting position corresponding to the first data set and a preset second starting position corresponding to the second data set;
calculating an updating first quantity of target bits in the updating first mask, and determining an updating first starting position according to the preset first starting position and the updating first quantity;
and calculating an updated second quantity of target bits in the updated second mask, and determining an updated second starting position according to the preset second starting position and the updated second quantity.
6. The method of claim 4, wherein comparing the second target element with each element in the first data to be written to obtain a first comparison result, and determining the updated first mask according to the first comparison result comprises:
when the second target element exceeds the target first element in the first data to be written, a first mark corresponding to the target first element is obtained;
When the second target element does not exceed the target second element in the first data to be written, a second mark corresponding to the target second element is obtained;
And determining the updated first mask according to the first mark corresponding to the target first element and the second mark corresponding to the target second element.
7. The method of claim 4, wherein comparing the first target element with each element in the second data to be written to obtain a second comparison result, and obtaining the updated second mask according to the second comparison result comprises:
When the first target element exceeds a target third element in the second data to be written, a first mark corresponding to the target third element is obtained;
when the first target element does not exceed the target fourth element in the second data to be written, obtaining a second mark corresponding to the target fourth element;
and determining the updated second mask according to the first mark corresponding to the target third element and the second mark corresponding to the target fourth element.
8. The method according to claim 1, wherein the comparing the reference comparison data with the data to be compared in full to obtain a full comparison result, obtaining a first output mask according to the full comparison result, and determining first output data from the reference comparison data according to the first output mask includes:
comparing the elements in the reference comparison data with the elements in the data to be compared;
When the target element in the reference comparison data is the same as any element in the data to be compared, obtaining a first bit corresponding to the target element in the reference comparison data;
When all the target elements in the reference comparison data are different from the elements in the data to be compared, obtaining a second bit corresponding to the target elements in the reference comparison data;
And determining a first output mask according to the bits corresponding to the elements in the reference comparison data, and obtaining first output data according to the elements in the reference comparison data that correspond to the first bits in the first output mask.
9. The method of claim 1, further comprising, after the obtaining updated first target data and second target data from the corresponding data sets according to the updated first mask and the updated second mask:
And rearranging the updated second mask and updated second target data corresponding to the updated second mask to obtain rearranged second mask and rearranged second target data.
10. The method of claim 1, wherein the combining the first output data corresponding to each traversal until the first data set and the second data set traversal are completed, to obtain the target output data, comprises:
Calculating the number of residual elements in the first data set, calculating the preset first mask bit number, and acquiring residual elements in the first data set and residual elements in the second data set when the number of residual elements in the first data set is smaller than the preset first mask bit number;
and comparing the residual elements in the first data set with the residual elements in the second data set to obtain residual output data, determining that the first data set and the second data set are traversed, and combining the first output data and the residual output data corresponding to each traversal to obtain the target output data.
11. The method according to claim 1, wherein the method further comprises:
when the data sets are friend data sets, determining corresponding common friend data according to the first friend data set and the second friend data set;
when the data sets are text data, determining corresponding common text data according to the first text data and the second text data;
when the data sets are list attributes, determining a corresponding common list attribute according to the first list attribute and the second list attribute;
when the data sets are graph data, determining corresponding common sub-graph data according to the first graph data and the second graph data.
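As a hedged illustration of the first use case in claim 11, common friends can be computed as the intersection of two sorted lists of friend identifiers. The sketch below is a plain scalar two-pointer intersection, shown only as a baseline for what the result looks like; it is not the register-and-mask procedure of claim 1, and the function name and the identifiers are made up.

```python
def common_elements(first, second):
    """Two-pointer intersection of two sorted lists of unique identifiers."""
    out = []
    i = j = 0
    while i < len(first) and j < len(second):
        if first[i] == second[j]:
            out.append(first[i])  # identifier present in both lists
            i += 1
            j += 1
        elif first[i] < second[j]:
            i += 1
        else:
            j += 1
    return out


first_friends = [101, 204, 305, 410]   # hypothetical friend IDs of user A
second_friends = [204, 305, 999]       # hypothetical friend IDs of user B
print(common_elements(first_friends, second_friends))  # [204, 305]
```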
12. A data processing apparatus, the apparatus comprising:
a data acquisition module, configured to acquire a first data set and a second data set, and to acquire a preset first mask corresponding to the first data set and a preset second mask corresponding to the second data set;
a target data determining module, configured to determine first target data from the first data set according to the preset first mask, determine second target data from the second data set according to the preset second mask, take the first target data as first data to be written, and take the second target data as second data to be written;
a data writing module, configured to write the first data to be written into a first register and write the second data to be written into a second register;
a mask updating module, configured to update the preset first mask and the preset second mask according to the first data to be written and the second data to be written, to obtain a corresponding updated first mask and updated second mask;
an output determining module, configured to determine reference comparison data and data to be compared from the first data to be written and the second data to be written, perform a full comparison of the reference comparison data with the data to be compared to obtain a full comparison result, obtain a first output mask according to the full comparison result, and determine first output data from the reference comparison data according to the first output mask;
and a traversing module, configured to obtain updated first target data and updated second target data from the corresponding data sets according to the updated first mask and the updated second mask, respectively, take the updated first target data as the first data to be written and the updated second target data as the second data to be written, return to the step of writing the first data to be written into the first register and writing the second data to be written into the second register, repeat until the first data set and the second data set have been traversed, and combine the first output data corresponding to each traversal to obtain the target output data.
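One possible reading of the target data determining module and the data writing module is sketched below, for illustration only. It assumes that bit i of a preset mask selects the element at position `start + i` of the data set, and that a Python list stands in for a register; the claim does not fix this encoding, and `select_by_mask`, `start` and `width` are hypothetical names.

```python
def select_by_mask(data_set, start, mask, width):
    """Return the elements of data_set chosen by mask, beginning at index start.

    Lane i is selected when bit i of mask is 1 (assumed encoding); lanes that
    would read past the end of the data set are skipped.
    """
    return [
        data_set[start + lane]
        for lane in range(width)
        if (mask >> lane) & 1 and start + lane < len(data_set)
    ]


first_data_set = [10, 20, 30, 40, 50]
# "Write" the selected target data into a stand-in for the first register.
first_register = select_by_mask(first_data_set, start=0, mask=0b1011, width=4)
print(first_register)  # [10, 20, 40]
```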
13. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method of any one of claims 1 to 11.
14. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910673627.6A CN112286579B (en) | 2019-07-24 | 2019-07-24 | Data processing method, device, computer readable storage medium and computer equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910673627.6A CN112286579B (en) | 2019-07-24 | 2019-07-24 | Data processing method, device, computer readable storage medium and computer equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112286579A (en) | 2021-01-29 |
CN112286579B (en) | 2024-05-24 |
Family
ID=74419222
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910673627.6A Active CN112286579B (en) | 2019-07-24 | 2019-07-24 | Data processing method, device, computer readable storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112286579B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104011649A (en) * | 2011-12-23 | 2014-08-27 | 英特尔公司 | Apparatus and method for propagating conditionally evaluated values in simd/vector execution |
CN104603746A (en) * | 2012-09-28 | 2015-05-06 | 英特尔公司 | Vector move instruction controlled by read and write masks |
CN106534392A (en) * | 2015-09-10 | 2017-03-22 | 阿里巴巴集团控股有限公司 | Positioning information acquiring method, positioning method and apparatus |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8972697B2 (en) * | 2012-06-02 | 2015-03-03 | Intel Corporation | Gather using index array and finite state machine |
Non-Patent Citations (1)
Title |
---|
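Power consumption analysis method based on Data Encryption Standard masking; Tao Wenqing; Gu Xingyuan; Li Jing; Computer Engineering; 2015-05-15 (Issue 05); full text * |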
Also Published As
Publication number | Publication date |
---|---|
CN112286579A (en) | 2021-01-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110765763A (en) | Error correction method and device for speech recognition text, computer equipment and storage medium | |
KR102482391B1 (en) | A method for presenting candidate words as substitutes for an input string received at an electronic device | |
WO2019085474A1 (en) | Calculation engine implementing method, electronic device, and storage medium | |
CN111159329A (en) | Sensitive word detection method and device, terminal equipment and computer-readable storage medium | |
CN108595338A (en) | Test case write method, device, computer equipment and storage medium | |
CN115080039A (en) | Front-end code generation method, apparatus, computer equipment, storage medium and product | |
CN111339166A (en) | Thesaurus-based matching recommendation method, electronic device and storage medium | |
CN111602129B (en) | Smart search for notes and ink | |
CN112286579B (en) | Data processing method, device, computer readable storage medium and computer equipment | |
US11194885B1 (en) | Incremental document object model updating | |
CN113177407A (en) | Data dictionary construction method and device, computer equipment and storage medium | |
US20080275959A1 (en) | Distributed Search in a Casual Network of Servers | |
CN110580333A (en) | data table processing method, searching method, device, equipment and storage medium | |
CN113010550B (en) | Batch object generation and batch processing method and device for structured data | |
JP2012173745A (en) | Database analysis device and database analysis program | |
CN116127098A (en) | Knowledge graph construction method and device | |
CN115809304A (en) | Method and device for analyzing field-level blood margin, computer equipment and storage medium | |
CN107688948A (en) | Claims Resolution data processing method, device, computer equipment and storage medium | |
JP2018181121A (en) | Analyzer, analysis program and analysis method | |
CN113297273A (en) | Method and device for querying metadata and electronic equipment | |
CN119149549B (en) | Method, device, electronic device and computer program product for data table screening | |
CN114090928B (en) | Nested HTML entity decoding method and device, computer equipment and storage medium | |
CN118778971B (en) | Code reverse analysis method, system, electronic equipment, medium and product | |
CN115169335B (en) | Invoice data calibration method and device, computer equipment and storage medium | |
CN116089725A (en) | Search recommendation method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |