Disclosure of Invention
In view of the defects of the prior art, the invention aims to solve the following technical problems of the existing single-queue I/O management method for multiple external memory devices: the data of one I/O request spans multiple external memory devices; multiple threads contend for devices and file locks; the parallel capability is limited; and the I/O performance of the multiple external memory devices is not fully exploited.
To achieve the above object, in a first aspect, an embodiment of the present invention provides a multi-external memory device multi-queue based I/O management method, where the method includes:
S1. Dividing all edge block files after graph data partitioning into stripe units of equal size in update order, and striping the stripe units cyclically, in increasing order, into a plurality of striped files;
S2. Performing address mapping on the original I/O request using a striping mode consistent with step S1;
S3. Judging whether the original I/O request needs to be decomposed; if so, decomposing the original I/O request into a plurality of new I/O requests aligned with stripe unit boundaries; otherwise, mapping the original I/O request directly, without decomposition, into a new I/O request of equal length;
S4. Prefetching and merging the new I/O requests;
S5. Distributing the new I/O requests to the corresponding I/O task queues of the corresponding external memory devices.
Specifically, step S2 is as follows:
The address of the original I/O request is mapped using the striping mode of step S1 to obtain the starting striped file number of the original I/O request and its starting offset address within the starting striped file.
Specifically, step S3 is as follows:
if (starting offset address within the starting striped file of the original I/O request) % (stripe unit size) + data length > stripe unit size, the original I/O request is decomposed, along the stripe unit boundaries, into a plurality of new I/O requests aligned with those boundaries;
if (starting offset address within the starting striped file of the original I/O request) % (stripe unit size) + data length ≤ stripe unit size, the original I/O request is not decomposed and is directly mapped into a new I/O request of equal length. Here % denotes the remainder operation. For example, with a stripe unit size of 64 KB, a request whose in-unit offset is 60 KB and whose length is 16 KB satisfies 60 + 16 > 64 and is split into a 4 KB piece and a 12 KB piece.
Specifically, step S4 is as follows:
1) For a new I/O request directly mapped from an original I/O request: judging whether its size equals the I/O buffer size; if so, prefetch merging is not considered and the flow enters step S5; otherwise, judging whether the new I/O request was prefetched and merged before; if so, the dispatch of the new I/O request is skipped; otherwise the flow enters step 2). For the first new I/O request after decomposition, right-aligned to a stripe unit, and the following new I/O requests equal to the stripe unit size: judging whether the new I/O request was prefetched and merged before; if so, the dispatch of the new I/O request is skipped; otherwise, prefetch merging is not considered and the flow enters step S5. For the last new I/O request after decomposition, left-aligned to a stripe unit and shorter than the stripe unit size: judging whether the new I/O request was prefetched and merged before; if so, the dispatch of the new I/O request is skipped; otherwise the flow enters step 2);
2) determining a maximum prefetch offset for a current new I/O request in a current active block;
3) judging whether the maximum prefetch offset is smaller than the termination offset of the current active edge block; if so, updating the prefetch offset in the original large linear address space to the maximum prefetch offset, updating the length of the current new I/O request to the difference between the prefetch offset and the starting offset of the current new I/O request in the original large linear address space, and entering step 5); otherwise, updating the prefetch offset to the termination offset of the current active edge block and entering step 4);
4) judging whether the adjacent edge block is active; if not, terminating prefetching, updating the length of the current new I/O request to the difference between the prefetch offset and the starting offset of the current new I/O request in the original large linear address space, and entering step 5); if it is active and the maximum prefetch offset is smaller than the termination offset of the adjacent edge block, updating the prefetch offset to the maximum prefetch offset and entering step 5); otherwise, updating the prefetch offset to the termination offset of the adjacent edge block and repeating step 4);
5) the new I/O request is merged.
Specifically, whether a new I/O request was prefetched and merged before is determined as follows:
if cur_offset_new < prefetch_offset, the current new I/O request has already been prefetched and merged; if cur_offset_new ≥ prefetch_offset, the current new I/O request has not been prefetched and merged. Here cur_offset_new is the starting offset of the currently processed new I/O request in the original large linear address space, and prefetch_offset is the prefetch offset indicating the address of the prefetch position in the original large linear address space.
To achieve the above object, in a second aspect, the embodiments of the present invention provide a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the I/O management method as described above.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
(1) Through the I/O management based on multiple queues for multiple external memory devices, the invention maintains a dedicated I/O task queue for each external memory device and performs the mapping, decomposition, prefetching, merging, and dispatching of I/O requests at the application layer, ensuring that each I/O request is served by only one external memory device and avoiding the situation where the data of one I/O request spans multiple external memory devices;
(2) through the I/O management based on multiple queues for multiple external memory devices, each application I/O thread is dedicated to one external memory device, which reduces the contention of multiple threads on devices and file locks and avoids the limited parallel service capability of the multiple external memory devices caused by each application I/O thread initiating I/O requests against different external memory devices;
(3) by prefetching and merging the new I/O requests, the invention merges a plurality of small new I/O requests into one large I/O request, effectively reducing the number of accesses to the external memory devices and allowing the I/O performance of the multiple external memory devices to be fully exploited.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
This patent provides an I/O management method based on multiple queues for multiple external memory devices, implemented with multiple external memory devices and a multi-queue thread pool. FIG. 2 is a diagram of the multi-queue I/O management system for multiple external memory devices according to the present invention. As shown in FIG. 2, the mapping, decomposition, prefetching, merging, and dispatching of I/O requests are performed at the application layer; a dedicated I/O task queue is maintained for each external memory device; each decomposed new I/O request, which does not span multiple external memory devices, is added to the corresponding I/O task queue according to the number of the striped file where its requested data is located; and the application I/O threads are bound to the I/O task queues, so that each application I/O thread is dedicated to a specific external memory device.
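As a concrete illustration of this queue-per-device, thread-per-queue binding, the following C++ sketch models the structure under stated assumptions: IoTask, TaskQueue, and the device count N_DEVICES are illustrative names rather than identifiers from the patent, and the actual read syscall on the device is elided.

```cpp
// A minimal sketch of per-device I/O task queues with bound worker threads.
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct IoTask {
    int fd;             // file descriptor of the striped file on one device
    long long offset;   // starting offset inside that striped file
    std::size_t length; // request length in bytes
};

class TaskQueue {       // one dedicated queue per external memory device
    std::queue<IoTask> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void push(IoTask t) {
        std::lock_guard<std::mutex> lk(m_);
        q_.push(t);
        cv_.notify_one();
    }
    IoTask pop() {      // blocks until a task arrives
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        IoTask t = q_.front();
        q_.pop();
        return t;
    }
};

int main() {
    const std::size_t N_DEVICES = 4;  // assumed device count for illustration
    std::vector<TaskQueue> multi_task_queues(N_DEVICES);
    std::vector<std::thread> io_threads;
    for (std::size_t i = 0; i < N_DEVICES; ++i)
        io_threads.emplace_back([&, i] {            // thread i serves queue i only
            IoTask t = multi_task_queues[i].pop();  // ...issue the device read here
            (void)t;
        });
    for (std::size_t i = 0; i < N_DEVICES; ++i)     // one demo task per queue
        multi_task_queues[i].push({-1, 0, 0});
    for (auto& th : io_threads) th.join();
}
```

Because each worker only ever pops from its own queue, no file lock is shared between application I/O threads, which is the contention reduction claimed above.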
Fig. 3 is a flowchart of an I/O management method based on multiple queues of multiple external memory devices according to the present invention. As shown in fig. 3, the method comprises the steps of:
S1. Dividing all edge block files after graph data partitioning into stripe units of equal size in update order, and striping the stripe units cyclically, in increasing order, into a plurality of striped files;
S2. Performing address mapping on the original I/O request using a striping mode consistent with step S1;
S3. Judging whether the original I/O request needs to be decomposed; if so, decomposing the original I/O request into a plurality of new I/O requests aligned with stripe unit boundaries; otherwise, mapping the original I/O request directly, without decomposition, into a new I/O request of equal length;
S4. Prefetching and merging the new I/O requests;
S5. Distributing the new I/O requests to the corresponding I/O task queues of the corresponding external memory devices.
S1. Dividing all edge block files after graph data partitioning into stripe units of equal size in update order, and striping the stripe units cyclically, in increasing order, into a plurality of striped files.
The original graph data is partitioned two-dimensionally to obtain a plurality of edge block files, and the edge block files after graph partitioning are striped at the application layer: all edge block files after the original graph data partitioning are divided, in update order, into coarse-grained stripe units of equal size, and the stripe units are distributed cyclically and uniformly, in increasing order, across N consecutive striped files. The striped files are numbered consecutively from 0 to N-1 and stored one-to-one on the N external memory devices.
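The round-robin layout of step S1 can be sketched as follows. This is a simplification under stated assumptions: the edge block files are modeled as a single in-memory byte sequence already ordered by update sequence, and the striped files as in-memory buffers rather than files on N real devices.

```cpp
// A minimal sketch of step S1: cut the update-ordered edge block data into
// stripe units of S bytes and assign unit u to striped file u % N.
#include <algorithm>
#include <cstddef>
#include <vector>

std::vector<std::vector<unsigned char>>
stripe(const std::vector<unsigned char>& edge_blocks, std::size_t S, std::size_t N) {
    std::vector<std::vector<unsigned char>> striped_files(N);
    for (std::size_t u = 0; u * S < edge_blocks.size(); ++u) {
        std::size_t begin = u * S;
        std::size_t end = std::min(edge_blocks.size(), begin + S);
        auto& f = striped_files[u % N];  // cyclic assignment in increasing order
        f.insert(f.end(), edge_blocks.begin() + begin, edge_blocks.begin() + end);
    }
    return striped_files;
}
```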
S2. Performing address mapping on the original I/O request using a striping mode consistent with step S1.
The address of the original I/O request is mapped using the striping mode of step S1 to obtain the starting striped file number of the original I/O request and its starting offset address within the starting striped file. The original I/O request is delimited using the edge block boundary as the primary boundary and the I/O buffer size as the secondary boundary, so the size of an original I/O request is the edge block size or the I/O buffer size.
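Under the round-robin layout above, the mapping of step S2 reduces to modular arithmetic. The following sketch derives the starting striped file number SI_0 and the in-file starting offset SO_0 from the starting offset O in the original large linear address space, the stripe unit size S, and the striped file count N; the function and struct names are illustrative.

```cpp
// A minimal sketch of the step S2 address mapping.
#include <cstdint>

struct StripeAddr {
    std::uint64_t SI0; // starting striped file number, 0..N-1
    std::uint64_t SO0; // starting offset address within that striped file
};

StripeAddr map_address(std::uint64_t O, std::uint64_t S, std::uint64_t N) {
    std::uint64_t u = O / S;           // global stripe unit index
    return { u % N,                    // unit u lives in striped file u % N
             (u / N) * S + (O % S) };  // full units already in that file, plus remainder
}
```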
S3. Judging whether the original I/O request needs to be decomposed; if so, decomposing the original I/O request into a plurality of new I/O requests aligned with stripe unit boundaries; otherwise, mapping the original I/O request directly, without decomposition, into a new I/O request of equal length.
Whether the original I/O request needs to be decomposed is judged according to its starting offset address within the starting striped file, its data length, and the stripe unit size: if decomposition is needed, the original I/O request is decomposed into a plurality of new I/O requests aligned with stripe unit boundaries; otherwise, it is not decomposed and is directly mapped into a new I/O request of equal length.
If (starting offset address within the starting striped file of the original I/O request) % (stripe unit size) + data length > stripe unit size, the data of the original I/O request spans stripe units of multiple striped files, and the original I/O request is decomposed, along the stripe unit boundaries, into a plurality of new I/O requests aligned with those boundaries. Each new I/O request corresponds to one striped file, and the striped files correspond one-to-one to the external memory devices, so each decomposed new I/O request targets exactly one external memory device and does not span multiple devices.
If (starting offset address within the starting striped file of the original I/O request) % (stripe unit size) + data length ≤ stripe unit size, the data of the original I/O request lies entirely within one stripe unit of the starting striped file, and the original I/O request is not decomposed and is directly mapped into a new I/O request of equal length.
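A minimal sketch of this decomposition rule follows, under the assumption that offsets are given in the original large linear address space (the remainder O % S equals the in-unit offset used in the text, since stripe units are S-aligned in both spaces); the NewIo struct is illustrative, not an identifier from the patent.

```cpp
// A minimal sketch of step S3: split a request that crosses a stripe unit
// boundary into a right-aligned first piece, full middle units, and a
// left-aligned tail; otherwise map it 1:1.
#include <algorithm>
#include <cstdint>
#include <vector>

struct NewIo {
    std::uint64_t offset; // starting offset in the original large linear space
    std::uint64_t length;
};

std::vector<NewIo> decompose(std::uint64_t O, std::uint64_t len, std::uint64_t S) {
    std::vector<NewIo> out;
    std::uint64_t in_unit = O % S;      // offset inside the current stripe unit
    if (in_unit + len <= S) {           // fits in one stripe unit: map 1:1
        out.push_back({O, len});
        return out;
    }
    std::uint64_t first = S - in_unit;  // right-aligned piece up to the boundary
    out.push_back({O, first});
    for (std::uint64_t done = first; done < len; ) {
        std::uint64_t piece = std::min<std::uint64_t>(S, len - done);
        out.push_back({O + done, piece});  // full units, then left-aligned tail
        done += piece;
    }
    return out;
}
```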
FIG. 4 is a schematic diagram of the address mapping and decomposition process of an original I/O request provided by the present invention, where FIG. 4(a) shows the case where the original I/O request data lies within a single stripe unit, and FIG. 4(b) shows the case where the original I/O request spans multiple stripe units. For convenience of the subsequent description, the following symbols are defined: Block_i is the i-th edge block file accessed in update order; IO_j is the j-th original I/O request; SIO_k is the k-th new I/O request after address mapping and decomposition; SU_m is the m-th stripe unit of a striped file; Stripe_file_n is the n-th striped file; MAX_IOSIZE is the I/O buffer size, i.e., the maximum original I/O request size; S is the stripe unit size; where i, j, k, m, n = 0, 1, ….
As shown in FIG. 4(a), when the original I/O request data lies within a single stripe unit, the original I/O request information includes its starting offset address O in the original large linear space and its I/O request data length. After address mapping, the starting striped file number SI_0 of the new I/O request, the starting offset address SO_0 within the starting striped file, and the mapped I/O request data length are obtained. The I/O manager assembles the new I/O request information from the file descriptor fin_profiles[SI_0] corresponding to SI_0, together with SO_0 and the mapped data length. The new I/O request is dispatched to the corresponding I/O task queue Multi_task_queues[SI_0].
As shown in FIG. 4(b), when the original I/O request spans multiple stripe units, the original I/O request information likewise includes its starting offset address O in the original large linear space and its I/O request data length. After address mapping and decomposition, each new I/O request obtains its starting striped file number SI_i, its starting offset address SO_i within the starting striped file, and its decomposed I/O request data length. The I/O manager assembles the new I/O request information from the file descriptor fin_profiles[SI_i] corresponding to SI_i, together with SO_i and the decomposed data length.
S4. Prefetching and merging the new I/O requests.
FIG. 5 illustrates the distribution of new I/O request data after address mapping and decomposition according to the present invention, where FIG. 5(a) is example one and FIG. 5(b) is example two. To solve the problem of an original I/O request spanning multiple external memory devices, the invention maps and decomposes the original I/O request into a plurality of new I/O requests, so that each new I/O request lies within one stripe unit on one external memory device. However, since the boundary of each edge block after partitioning cannot be guaranteed to align with the stripe unit boundaries, the following situations may occur after mapping and decomposition:
1) An original I/O request no larger than the stripe unit size S may, after mapping, fall inside one stripe unit of a striped file without touching the left or right boundary of the stripe unit, as in the address mapping of IO_{j+2} and IO_{j+3} in FIG. 5(a); or it may be mapped and decomposed into two new I/O requests, the first right-aligned to the previous stripe unit and the second left-aligned to the next stripe unit, as in the address mapping of IO_{j+1} in FIG. 5(a).
2) An original I/O request larger than the stripe unit size S is mapped and decomposed into at least two new I/O requests: the first is right-aligned to the first stripe unit and no larger than S, the last is left-aligned to the last stripe unit and no larger than S, and those in the middle equal the stripe unit size, as in the SIO_k obtained by mapping and decomposing IO_j in FIG. 5(a).
Therefore, there must be cases where the data of two or more small new I/O requests lies within one stripe unit; FIG. 5 shows the possible distributions of new I/O request data after the original I/O requests are mapped and decomposed. In FIG. 5(a), SIO_k and SIO_{k+1} are obtained by mapping and decomposing the original I/O requests IO_j and IO_{j+1}, which request the data of Block_i, and their requested data fall into SU_m. If Block_i is an active edge block, then while processing IO_j the mapping and decomposition of IO_{j+1} can be processed ahead of time, and SIO_k and SIO_{k+1} can be merged into one new I/O request, realizing prefetching of the graph data. SIO_{k+2} is the new I/O request left-aligned to SU_{m+1}, obtained by mapping and decomposing IO_{j+1}, the last original I/O request for the current active edge block Block_i. If Block_{i+1} is also an active edge block, then SIO_{k+3}, obtained by mapping IO_{j+2} for Block_{i+1}, can be prefetched and merged with SIO_{k+2}. In the same way, whether to prefetch and merge is decided by judging the activity of Block_{i+2} and Block_{i+3}; if both are active, the 4 new I/O requests whose requested data fall into SU_{m+1} can be merged into one large, continuous new I/O request. In FIG. 5(b), while processing SIO_k, obtained by decomposing IO_j for the active edge block Block_i, the request SIO_{k+1}, which also targets Block_i and whose requested data falls into the same stripe unit, can be directly prefetched and merged; prefetch merging is then considered according to the activity of the subsequent adjacent edge blocks, and if all of them are active, the 5 new I/O requests whose requested data fall into SU_m can be merged into one new I/O request.
Prefetch merge is specifically as follows:
1) For a new I/O request directly mapped from an original I/O request: judging whether its size equals the I/O buffer size MAX_IOSIZE; if so, prefetch merging is not considered and the flow enters step S5; otherwise, judging whether the new I/O request was prefetched and merged before; if so, the dispatch of the new I/O request is skipped; otherwise the flow enters step 2). For the first new I/O request after decomposition, right-aligned to a stripe unit, and the following new I/O requests equal to the stripe unit size: judging whether the new I/O request was prefetched and merged before; if so, the dispatch of the new I/O request is skipped; otherwise, prefetch merging is not considered and the flow enters step S5. For the last new I/O request after decomposition, left-aligned to a stripe unit and shorter than the stripe unit size: judging whether the new I/O request was prefetched and merged before; if so, the dispatch of the new I/O request is skipped; otherwise the flow enters step 2);
To implement this prefetch merging function, the multi-queue based I/O manager maintains two important pointers: cur_offset_new, the starting offset of the currently processed new I/O request in the original large linear address space, and prefetch_offset, the prefetch offset indicating the address of the prefetch position in the original large linear address space. cur_offset_new is updated every time a new I/O request is processed, and prefetch_offset is updated once a prefetch merge completes. By comparing cur_offset_new with prefetch_offset, it is determined whether to skip the prefetch merge processing and dispatch of the current new I/O request:
(1) cur_offset_new < prefetch_offset indicates that the current new I/O request has already been prefetched, merged, and dispatched to the corresponding I/O task queue, so its processing is skipped;
(2) cur_offset_new ≥ prefetch_offset indicates that the current new I/O request has not been processed, so its prefetch merging is considered, and the large new I/O request after prefetch merging is dispatched to the corresponding I/O task queue.
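The skip test itself is a single comparison. The following sketch names the two pointers as in the text; the surrounding state struct is assumed glue, not an identifier from the patent.

```cpp
// A minimal sketch of the skip test driven by the two I/O manager pointers.
#include <cstdint>

struct IoManagerState {
    std::uint64_t cur_offset_new  = 0; // start of current new I/O request in linear space
    std::uint64_t prefetch_offset = 0; // how far prefetch merging has already covered
};

// True when the current new I/O request was already covered by an earlier
// prefetch merge, so its prefetch processing and dispatch must be skipped.
bool already_prefetched(const IoManagerState& st) {
    return st.cur_offset_new < st.prefetch_offset;
}
```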
2) Determining a maximum prefetch offset for a current new I/O request in a current active block;
If the starting offset of the current new I/O request in the original large linear address space plus MAX_IOSIZE is greater than the offset of the right boundary of the current stripe unit in the original large linear address space, the maximum prefetch offset is the offset of the right boundary of the current stripe unit in the original large linear address space; otherwise, the maximum prefetch offset is the starting offset of the current new I/O request in the original large linear address space plus MAX_IOSIZE.
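Equivalently, the maximum prefetch offset is the smaller of the two candidate limits, the I/O buffer cap and the current stripe unit's right boundary, as in this sketch (names illustrative):

```cpp
// A minimal sketch of step 2): the prefetch window is capped both by the
// I/O buffer (MAX_IOSIZE) and by the right boundary of the current stripe
// unit, so a merged request never grows past either limit.
#include <algorithm>
#include <cstdint>

std::uint64_t max_prefetch_offset(std::uint64_t cur_offset_new,
                                  std::uint64_t max_iosize,
                                  std::uint64_t unit_right_boundary) {
    return std::min(cur_offset_new + max_iosize, unit_right_boundary);
}
```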
3) Judging whether the maximum prefetch offset is smaller than the termination offset of the current active edge block; if so, updating the prefetch offset in the original large linear address space to the maximum prefetch offset, updating the length of the current new I/O request to the difference between the prefetch offset and the starting offset of the current new I/O request in the original large linear address space, and turning to 5); otherwise, updating the prefetch offset to the termination offset of the current active edge block and turning to 4);
4) judging whether the next adjacent edge block is active; if not, terminating prefetching, updating the length of the current new I/O request to the difference between the prefetch offset and the starting offset of the current new I/O request in the original large linear address space, and turning to 5); if it is active and the maximum prefetch offset is smaller than the termination offset of the adjacent edge block, updating the prefetch offset to the maximum prefetch offset and turning to 5); otherwise, updating the prefetch offset to the termination offset of the adjacent edge block and repeating step 4);
5) the new I/O request is merged.
The new I/O requests are merged based on the updated prefetch offset in the original large linear address space and the length of the current new I/O request.
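Steps 2) to 5) can be read as one loop over edge blocks. The following sketch assumes each edge block is described by its termination offset in the original large linear address space plus an active flag; this EdgeBlock representation and the walk over it are assumptions for illustration, not the patented data layout.

```cpp
// A minimal sketch of steps 3)-5): extend the current new I/O request by
// prefetching within the current active block, then across adjacent active
// blocks, never past the maximum prefetch offset from step 2).
#include <cstddef>
#include <cstdint>
#include <vector>

struct EdgeBlock {
    std::uint64_t end_offset; // termination offset in the linear address space
    bool active;
};

// Returns the merged request length and updates prefetch_offset.
std::uint64_t prefetch_merge(std::uint64_t cur_offset_new,
                             std::uint64_t max_prefetch,     // from step 2)
                             const std::vector<EdgeBlock>& blocks,
                             std::size_t cur,                // current active block index
                             std::uint64_t& prefetch_offset) {
    if (max_prefetch < blocks[cur].end_offset) {             // step 3): stop inside block
        prefetch_offset = max_prefetch;
        return prefetch_offset - cur_offset_new;
    }
    prefetch_offset = blocks[cur].end_offset;
    for (std::size_t nb = cur + 1; nb < blocks.size(); ++nb) { // step 4): cross blocks
        if (!blocks[nb].active) break;                       // inactive neighbor: stop
        if (max_prefetch < blocks[nb].end_offset) {
            prefetch_offset = max_prefetch;
            break;
        }
        prefetch_offset = blocks[nb].end_offset;             // swallow whole block, repeat
    }
    return prefetch_offset - cur_offset_new;                 // step 5): merged length
}
```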
S5. Distributing the new I/O requests to the corresponding I/O task queues of the corresponding external memory devices.
Each new I/O request is dispatched to its corresponding I/O task queue Multi_task_queues[SI_i].
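A dispatch sketch follows, reusing the illustrative IoTask and TaskQueue types from the thread-binding sketch earlier in this description (it compiles when combined with that sketch). The striped file number SI_i doubles as the queue index, which is exactly what keeps each request on a single device.

```cpp
// A minimal sketch of step S5, assuming the IoTask and TaskQueue types
// defined in the earlier per-device queue sketch.
#include <cstddef>
#include <vector>

void dispatch(std::vector<TaskQueue>& multi_task_queues,
              std::size_t SIi, const IoTask& req) {
    multi_task_queues[SIi].push(req);  // one queue per device; never crosses devices
}
```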
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.