Disclosure of Invention
In view of the foregoing problems in the prior art, an object of an embodiment of the present disclosure is to provide a method, an apparatus, a device, a storage medium and a product for reconstructing a log, so as to improve reconstruction accuracy of the log, thereby providing accurate and reliable data support for exploration and development decisions of an oil-gas field.
In order to solve the above technical problems, the specific technical solutions of the embodiments of the present specification are as follows:
In one aspect, embodiments of the present disclosure provide a log reconstruction method, the method comprising:
Acquiring an original logging curve of a specified period, wherein the original logging curve comprises a logging curve to be reconstructed and a plurality of basic logging curves, and the basic logging curves are logging curves of other types except the logging curve to be reconstructed;
Performing characterization learning on the original logging curve to obtain an initial feature vector matrix;
Inputting the initial feature vector matrix into a pre-trained curve reconstruction model to obtain a reconstruction curve of the logging curve to be reconstructed, wherein the curve reconstruction model comprises at least one decoding layer, the decoding layer comprises an asymmetric causal attention layer, a first normalization layer, a feedforward network layer and a second normalization layer, the asymmetric causal attention layer is used for extracting space-time association weights of the initial feature vector matrix, and the space-time association weights represent space-time association degrees among feature representations in the initial feature vector matrix.
Further, the performing characterization learning on the original log to obtain an initial eigenvector matrix includes:
aligning the well logging curve to be reconstructed and the plurality of basic well logging curves in a time dimension;
dividing the aligned logging curves to be reconstructed and each basic logging curve according to a preset time sequence length to obtain a plurality of time sequence sections corresponding to the logging curves to be reconstructed and each basic logging curve;
Inputting the plurality of time sequence segments into a pre-constructed residual model for characterization learning to obtain the well logging curves to be reconstructed and characteristic representation sequences of each basic well logging curve, wherein the characteristic representation sequences comprise characteristic representations corresponding to the plurality of time sequence segments;
Adding time sequence information to the feature representation in each feature representation sequence, and constructing an initial feature vector matrix according to each feature representation sequence after adding the time sequence information.
Further, the residual model comprises an input layer, at least one hidden layer, a residual connecting layer and an output layer;
the input layer is used for converting the time sequence segment into a first characteristic representation;
the hidden layer is used for converting the first characteristic representation into a second characteristic representation;
the residual connection layer is used for converting the time sequence segment into a third characteristic representation;
and the output layer is used for adding the second characteristic representation and the third characteristic representation and carrying out normalization operation to obtain the characteristic representation of the time sequence section.
Further, the curve reconstruction model further comprises a linear mapping layer and a softmax layer, and the inputting the initial eigenvector matrix into a pre-trained curve reconstruction model to obtain the reconstruction curve of the logging curve to be reconstructed comprises:
inputting the initial eigenvector matrix into the decoding layer to obtain a first eigenvector matrix;
Inputting the first eigenvector matrix into the linear mapping layer to map the first eigenvector matrix to a numerical space corresponding to the logging curve to be reconstructed to obtain a target characteristic representation sequence;
Inputting the target feature representation sequence into the softmax layer to obtain a predicted value corresponding to each feature representation in the target feature representation sequence;
And splicing the predicted values according to the time sequence information of the feature representation in the target feature representation sequence to obtain a reconstruction curve of the logging curve to be reconstructed.
Further, the inputting the initial eigenvector matrix into the decoding layer to obtain a first eigenvector matrix includes:
Inputting the initial feature vector matrix into the asymmetric causal attention layer to extract space-time association weights of the initial feature vector matrix, and mapping the space-time association weights to the initial feature vector matrix to obtain a second feature vector matrix;
inputting the second eigenvector matrix into the first normalization layer for normalization to obtain a third eigenvector matrix;
Inputting the third eigenvector matrix into the feedforward network layer to extract local nonlinear characteristics of the third eigenvector matrix, and mapping the local nonlinear characteristics to the second eigenvector matrix to obtain a fourth eigenvector matrix;
And inputting the fourth eigenvector matrix into the second normalization layer for normalization to obtain a first eigenvector matrix.
Further, the extracting the space-time association weight of the initial feature vector matrix and mapping the space-time association weight to the initial feature vector matrix to obtain a second feature vector matrix includes:
performing linear transformation on the initial feature vector matrix to obtain a query matrix, a key matrix and a value matrix, wherein the query matrix is generated according to the feature representation sequence of the well logging curve to be reconstructed, and the key matrix and the value matrix are generated according to the feature representation sequences of the well logging curve to be reconstructed and the basic well logging curve;
constructing a space-time correlation matrix according to the time sequence information of each feature representation in the initial feature vector matrix and the physical mechanism of each logging curve, wherein each element value in the space-time correlation matrix represents whether time correlation and space correlation exist among the feature representations;
Calculating to obtain the attention score between the query matrix and the key matrix according to the space-time correlation matrix;
normalizing the attention score by using a softmax function to obtain a space-time correlation weight;
and carrying out weighted summation on the value matrix according to the space-time association weight to obtain a second eigenvector matrix.
Further, the number of rows and columns of the space-time correlation matrix is the same as the total number of feature representations in the initial feature vector matrix, and the construction of the space-time correlation matrix according to the time sequence information of each feature representation in the initial feature vector matrix and the physical mechanism of each logging curve comprises the following steps:
determining a characteristic representation corresponding to a logging curve to be reconstructed and a characteristic representation corresponding to a basic logging curve from the initial characteristic vector matrix;
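Judging, according to the time sequence information of each characteristic representation, whether the characteristic representation corresponding to the basic well logging curve has a time correlation with the characteristic representation corresponding to the well logging curve to be reconstructed;
Judging, according to the physical mechanism of each logging curve, whether the characteristic representation corresponding to the basic well logging curve has a spatial correlation with the characteristic representation corresponding to the well logging curve to be reconstructed;
Assigning the elements corresponding to the feature representations that have both the time correlation and the spatial correlation a first set value, and assigning the elements corresponding to the feature representations that do not have both the time correlation and the spatial correlation a second set value.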
Further, the determining, according to the physical mechanism of each log, whether the spatial correlation exists between the feature representation corresponding to the base log and the feature representation corresponding to the log to be reconstructed includes:
calculating the characterization correlation of each basic well logging curve and the well logging curve to be reconstructed according to the characterization characteristics of each basic well logging curve and the well logging curve to be reconstructed in different stratum;
calculating the matching degree of each basic well logging curve and the well logging curve to be reconstructed according to the historical well logging interpretation experience knowledge;
calculating to obtain a spatial correlation value of each basic well logging curve and the well logging curve to be reconstructed according to the characterization correlation and the matching degree;
And if the spatial correlation value is larger than a preset threshold value, determining that the spatial correlation exists between the basic well logging curve and the well logging curve to be reconstructed.
Further, the determining, according to the timing information of each feature representation, whether the feature representation corresponding to the base log has a time correlation with the feature representation corresponding to the log to be reconstructed includes:
judging whether the time sequence position of a feature representation corresponding to the basic well logging curve is not later than the time sequence position of a feature representation corresponding to the well logging curve to be reconstructed according to the time sequence information of each feature representation;
if yes, determining that time correlation exists between the two characteristic representations;
if not, determining that no time correlation exists between the two characteristic representations.
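The mask construction and attention steps described above can be illustrated with a minimal NumPy sketch. It is one possible reading, not the disclosed implementation. Assumptions made here: the first and second set values are 1 and 0, the attention score is a scaled dot product that is masked wherever the space-time correlation matrix holds the second set value, only rows belonging to the curve to be reconstructed are populated, and the per-curve spatial flags in `spatial_ok` are supplied by the caller (including an entry for the curve to be reconstructed itself). The names `build_mask`, `asymmetric_causal_attention`, `spatial_ok` and the weight matrices are hypothetical.

```python
import numpy as np

def build_mask(time_ids, var_ids, target_var, spatial_ok):
    """T x T space-time correlation matrix (rows: queries, columns: keys).

    An element is 1 (assumed first set value) when the row belongs to the curve
    to be reconstructed, the key is not later than the query, and the key's
    curve is flagged as spatially correlated; otherwise 0 (second set value).
    """
    time_ids = np.asarray(time_ids)
    var_ids = np.asarray(var_ids)
    is_target_row = (var_ids == target_var)[:, None]
    temporal = time_ids[None, :] <= time_ids[:, None]
    spatial = np.asarray(spatial_ok, dtype=bool)[var_ids][None, :]
    return (is_target_row & temporal & spatial).astype(float)

def asymmetric_causal_attention(x, mask, target_rows, w_q, w_k, w_v):
    """Queries come from the target curve only; keys and values from all tokens."""
    q = x[target_rows] @ w_q
    k = x @ w_k
    v = x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # scaling is an assumption
    scores = np.where(mask[target_rows] > 0.5, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)                          # masked entries become 0
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Three curves (0: gamma, 1: resistivity, 2: density = target), three segments each.
time_ids = [0, 1, 2, 0, 1, 2, 0, 1, 2]
var_ids = [0, 0, 0, 1, 1, 1, 2, 2, 2]
spatial_ok = [True, False, True]                      # hypothetical flags
mask = build_mask(time_ids, var_ids, target_var=2, spatial_ok=spatial_ok)

rng = np.random.default_rng(0)
dim = 4
x = rng.normal(size=(9, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
target_rows = np.flatnonzero(np.asarray(var_ids) == 2)
out, weights = asymmetric_causal_attention(x, mask, target_rows, w_q, w_k, w_v)
```

In this sketch every query row keeps at least its own token, so the softmax is always well defined; a caller that disables the target curve in `spatial_ok` would need to handle fully masked rows separately.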
In another aspect, embodiments of the present disclosure provide a log reconstruction apparatus, the apparatus comprising:
an acquisition module, used for acquiring an original logging curve of a specified period, wherein the original logging curve comprises a logging curve to be reconstructed and a plurality of basic logging curves, and the basic logging curves are logging curves of other types except the logging curve to be reconstructed;
the characterization learning module is used for performing characterization learning on the original logging curve to obtain an initial feature vector matrix;
The reconstruction module is used for inputting the initial feature vector matrix into a pre-trained curve reconstruction model to obtain a reconstruction curve of the logging curve to be reconstructed, wherein the curve reconstruction model comprises at least one decoding layer, the decoding layer comprises an asymmetric causal attention layer, a first normalization layer, a feed-forward network layer and a second normalization layer, the asymmetric causal attention layer is used for extracting space-time correlation weights of the initial feature vector matrix, and the space-time correlation weights represent space-time correlation degrees among feature representations in the initial feature vector matrix.
In yet another aspect, embodiments of the present disclosure further provide a computer device including a memory, a processor, and a computer program stored on the memory, which when executed by the processor, performs instructions of any one of the methods described above.
In yet another aspect, embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor of a computer device, performs instructions of any of the methods described above.
In yet another aspect, embodiments of the present disclosure also provide a computer program product which, when executed by a processor of a computer device, performs the instructions of any of the methods described above.
By adopting the above technical solution, the log reconstruction method provided by the embodiments of the present specification performs characterization learning on the original logging curve, mapping it to a high-dimensional feature space to obtain an initial feature vector matrix. The matrix is then processed by a pre-trained curve reconstruction model that includes an asymmetric causal attention mechanism, and the missing logging curve is generated from the correlations among the logging curves, thereby reconstructing the logging curve to be reconstructed. The asymmetric causal attention mechanism can fully mine the complex space-time association between the logging curve to be reconstructed and the plurality of basic logging curves. This significantly improves the accuracy of the reconstructed curve, copes better with data loss, distortion and similar problems in actual logging, provides more reliable basic data for subsequent work such as geological interpretation and reservoir evaluation, and reduces the risk of oil and gas exploration and development.
The foregoing description is only a summary of some embodiments of the present disclosure, and it is to be understood that the following detailed description of the preferred embodiments is provided, along with the accompanying figures, in order to provide a better understanding of the nature of some embodiments of the disclosure.
Detailed Description
The technical solutions of the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
It should be noted that the terms "first," "second," and the like in the description and the claims, and in the foregoing figures, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the present description described herein may be capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or device.
The user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party. In addition, the technical scheme described in the embodiment of the application accords with relevant regulations on data acquisition, storage, use, processing and the like.
In the fields of petroleum exploration and geological research, logging is a key means of acquiring the physical properties, reservoir characteristics and fluid information of underground rock formations. It obtains information on the underground structure by recording the response of the formation to physical quantities such as electricity, sound and radioactivity. As a visual representation of these responses, the logging curve plays an irreplaceable role in identifying formation characteristics and evaluating hydrocarbon reservoir properties. In actual logging operations, however, logging curves often suffer from gaps, distortion or poor quality for a variety of reasons. For example, tool failure, complex borehole conditions (e.g., enlarged borehole diameter, collapsed borehole wall), interference signals during logging, and incomplete coverage of some logged intervals may all result in incomplete or inaccurate logging curves. These problems can seriously affect the accuracy of subsequent geologic interpretation and resource assessment and increase the risk of exploration and development. Because re-logging is costly and operationally difficult, log reconstruction techniques have been developed, which aim to recover a missing or distorted logging curve through an algorithm or model. Log reconstruction can restore the integrity of the logging curve, filter noise and errors out of the original data, and improve the accuracy of the data.
Traditional log reconstruction methods include the empirical model method, the multivariate fitting method, machine-learning-based methods and the like. The empirical model method relies on the experience of an engineer, who selects logging curves from adjacent wells to fit the data of the current well; it places high demands on the adjacent wells and is difficult to operate. The multivariate fitting method fits data at the same depth point by nonlinear fitting. Its drawback is that the model ignores the complexity and strong heterogeneity of the underground formation and greatly simplifies the real formation conditions, so the predictions are poor and it is difficult to meet the accuracy requirements of logging interpretation and reservoir characterization. With the development of artificial intelligence, log reconstruction methods based on machine learning and deep learning have gradually become a research hotspot. However, existing deep learning methods fail to adequately mine the correlation information between different logging parameters when processing a logging curve. Logging curves have obvious sequence characteristics in the depth direction, and different types of logging curves have complex associations in spatial distribution and time evolution; this association information is important for accurately reconstructing the logging curve to be reconstructed. Therefore, how to effectively extract the space-time correlation features in logging curves and improve reconstruction accuracy under complex geological conditions has become a key problem to be solved in the field of log reconstruction.
To solve the above-mentioned problems, the present embodiment provides a log reconstruction method. FIG. 1 is a schematic diagram of the steps of a log reconstruction method provided in the present embodiment. Although the present disclosure provides the method operation steps described in the examples or flowcharts, more or fewer operation steps may be included based on conventional or non-creative labor. The order of steps recited in the embodiments is merely one of many possible orders of execution and does not represent a unique order of execution. When an actual system or apparatus product executes the method, the steps may be executed sequentially or in parallel according to the method shown in the embodiments or the drawings. As shown in FIG. 1, the method may include:
S101, acquiring an original well logging curve of a designated period, wherein the original well logging curve comprises a well logging curve to be reconstructed and a plurality of basic well logging curves, and the basic well logging curves are well logging curves of other types except the well logging curve to be reconstructed.
A logging curve is a series of data obtained by measuring the physical and chemical properties of a downhole formation with logging instruments during oil and gas exploration and development; in essence, it reflects formation attribute characteristics acquired at different times (equivalently, different depths). The original logging curve is the unprocessed raw data set collected during the specified period, including the logging curve to be reconstructed and other logging curves that can serve as references. In this embodiment, the logging curves include caliper (borehole diameter), natural gamma ray, resistivity, acoustic, density and neutron curves, among others. The logging curve to be reconstructed is the specific type of logging curve that is partly or wholly missing, distorted or otherwise defective and needs to be recovered or corrected. The basic logging curves are the logging curves of other types; they serve as the reference basis in the reconstruction process and provide information support for recovering the logging curve to be reconstructed. For example, when the logging curve to be reconstructed is a density curve, the basic logging curves may be at least two of the caliper, natural gamma ray, resistivity, acoustic and neutron curves. It should be noted that a logging curve selected as a basic logging curve should be complete, i.e., a logging curve that does not itself require reconstruction.
S102, performing characterization learning on the original logging curve to obtain an initial eigenvector matrix.
It will be appreciated that a logging curve is, in essence, time series data indexed by depth or time, and it contains a significant amount of redundant information and noise which, if fed directly into the model, would reduce reconstruction efficiency and accuracy. In the embodiments of the present specification, characterization learning is performed on the original logging curve so that key features are extracted automatically and the complex original logging curve is converted into a structured initial feature vector matrix. Each feature vector in the matrix corresponds to the feature representation of one dimension of the original logging curve, so that the deeper regularities of the data can be mined more efficiently in the subsequent reconstruction process.
S103, inputting the initial feature vector matrix into a pre-trained curve reconstruction model to obtain a reconstruction curve of the logging curve to be reconstructed, wherein the curve reconstruction model comprises at least one decoding layer, the decoding layer comprises an asymmetric causal attention layer, a first normalization layer, a feed-forward network layer and a second normalization layer, the asymmetric causal attention layer is used for extracting space-time correlation weights of the initial feature vector matrix, and the space-time correlation weights represent space-time correlation degrees among feature representations in the initial feature vector matrix.
It will be appreciated that there are both temporal and spatial correlations between logging curves. Temporal correlation refers to the correlation of the individual feature representations over time, i.e., the feature representation at a given time is correlated with the feature representations at and before that time. In actual logging, underground geologic bodies may change abruptly and the logging values at adjacent depths may differ significantly; relying too heavily on temporal correlation may force these abrupt changes to be smoothed, resulting in reconstruction distortion. Spatial correlation refers to the fact that different logging curves at the same depth are constrained by the same geological conditions and therefore have a definite physical correlation. By using spatial correlation, the model can reconstruct a curve through multi-parameter cross-verification: when one curve is missing or noisy, the information in the other curves provides a reliable constraint. For example, a resistivity curve can be reconstructed with the assistance of the acoustic time difference and density curves, which reduces the influence of errors in any single curve and improves reconstruction precision. In oil and gas exploration and development, the formation characteristics of an underground geologic body are jointly reflected by logging curves acquired from different dimensions by multiple logging methods. A logging curve is in essence an indirect observation signal of geological characteristics, and the ultimate aim of reconstruction is to recover the real underground geological properties, such as lithology, porosity and oil-bearing properties, from the curves, rather than simply making a curve conform to its own historical pattern.
Therefore, in log reconstruction, the contribution of the spatial correlation among the logging curves should be as large as possible, and the contribution of the temporal correlation of the reconstructed curve itself should be as small as possible. This avoids error propagation, brings the reconstructed curve closer to the real underground geological features, and meets the needs of practical geological analysis such as reservoir evaluation and lithology division. The above analysis shows that the correlations between the feature representations in the initial feature vector matrix are asymmetric. Based on this principle, the embodiments of the present specification design an asymmetric causal attention mechanism, which extracts the space-time correlation weights between the feature representations and accurately captures the space-time correlation between the basic logging curves and the logging curve to be reconstructed, so that the missing logging curve can be generated from that correlation.
By adopting the above technical solution, the log reconstruction method provided by the embodiments of the present specification performs characterization learning on the original logging curve, mapping it to a high-dimensional feature space to obtain an initial feature vector matrix. The matrix is then processed by a pre-trained curve reconstruction model that includes an asymmetric causal attention mechanism, and the missing logging curve is generated from the correlations among the logging curves, thereby reconstructing the logging curve to be reconstructed. The asymmetric causal attention mechanism can fully mine the complex space-time association between the logging curve to be reconstructed and the plurality of basic logging curves. This significantly improves the accuracy of the reconstructed curve, copes better with data loss, distortion and similar problems in actual logging, provides more reliable basic data for subsequent work such as geological interpretation and reservoir evaluation, and reduces the risk of oil and gas exploration and development.
It should be noted that, the method for reconstructing a logging curve provided by the embodiment of the specification solves the problem of low accuracy of fine interpretation of logging data caused by incomplete and missing logging curves, and can be applied to reconstruction of acoustic logging curves, reconstruction of density logging curves and reconstruction of other logging curves which are common in logging interpretation, and also is applicable to calculation of reservoir parameters in logging interpretation, including calculation of porosity, permeability, water saturation, clay content and the like.
In this embodiment of the present disclosure, referring to FIG. 2, step S102 of performing characterization learning on the original logging curve to obtain an initial feature vector matrix specifically includes the following steps:
S201, aligning the logging curve to be reconstructed and the plurality of basic logging curves in the time dimension.
Specifically, the original logging curve is time series data recorded continuously along depth/time. Because of differences in the lowering speed of different logging instruments, depth calibration errors and the like, different logging curves corresponding to the same geological depth point may be offset from one another. Thus, to eliminate the time offset between the logging curves, the logging curve to be reconstructed and the basic logging curves are aligned in the time dimension. In some embodiments of the present disclosure, to eliminate the influence of differences in dimension (units) and amplitude among logging curves on subsequent processing, each logging curve is normalized to a standard normal scale (zero mean and unit variance) before characterization learning is performed on it.
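As an illustration of the alignment and normalization described above, the following minimal sketch resamples a curve onto a shared grid and then standardizes it. The alignment strategy (linear interpolation onto a common depth grid) is an assumption, since the text does not specify how alignment is performed; the function names `align_to_grid` and `zscore` and the sample values are hypothetical.

```python
import numpy as np

def align_to_grid(depth, values, grid):
    """Resample one curve onto a shared grid by linear interpolation.

    Assumed alignment strategy; `depth` must be increasing.
    """
    return np.interp(grid, depth, values)

def zscore(values, eps=1e-8):
    """Standardize a curve to zero mean and unit variance."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / (values.std() + eps)

# Shared grid and a hypothetical gamma-ray curve recorded with a 0.2 offset.
grid = np.linspace(1000.0, 1010.0, 11)
gr_depth = grid + 0.2
gr_values = 50.0 + 2.0 * (gr_depth - 1000.0)

aligned = align_to_grid(gr_depth, gr_values, grid)
normalized = zscore(aligned)
```

Note that `np.interp` holds the first and last sample values constant outside the recorded range, so the ends of an aligned curve may need separate handling in practice.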
S202, dividing the aligned logging curves to be reconstructed and each basic logging curve according to a preset time sequence length to obtain a plurality of time sequence sections corresponding to the logging curves to be reconstructed and each basic logging curve.
It can be understood that the normalized logging curve is divided, in time order, into a plurality of time sequence segments (patches) of equal length according to the preset time sequence length. Using the time sequence segment as the basic processing unit allows subsequent learning to focus on local geological features and avoids interference from irrelevant information in the global long sequence. In the embodiments of the present specification, the time sequence segments may or may not overlap, as determined by the step size S. For a logging curve sequence of length L (seq_len), a preset time sequence length P (Patch_len) and a step size S, the number of time sequence segments is $N = \lfloor (L - P)/S \rfloor + 1$. As shown in FIG. 3, a logging curve sequence containing 12 data points (L = 12), with a preset time sequence length of 8 and a step size of 1, is divided into N = (12 - 8)/1 + 1 = 5 time sequence segments, i.e., Patch_01, Patch_02, Patch_03, Patch_04 and Patch_05.
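The segmentation rule can be checked with a short sketch. This is a minimal illustration using NumPy; the function name `split_into_patches` is hypothetical, and the example reproduces the 12-point case of FIG. 3.

```python
import numpy as np

def split_into_patches(curve, patch_len, stride):
    """Split a 1-D logging curve into equal-length time sequence segments."""
    curve = np.asarray(curve, dtype=float)
    seq_len = curve.shape[0]
    if seq_len < patch_len:
        raise ValueError("curve is shorter than one time sequence segment")
    # N = floor((L - P) / S) + 1
    n_patches = (seq_len - patch_len) // stride + 1
    starts = np.arange(n_patches) * stride
    return np.stack([curve[s:s + patch_len] for s in starts])

# 12 data points, segment length 8, step size 1 -> 5 overlapping segments.
curve = np.arange(12.0)
patches = split_into_patches(curve, patch_len=8, stride=1)
print(patches.shape)  # (5, 8)
```

With a step size equal to the segment length the segments do not overlap; any trailing samples that do not fill a whole segment are dropped in this sketch.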
S203, inputting the plurality of time sequence segments into a pre-constructed residual model for characterization learning to obtain the feature representation sequences of the logging curve to be reconstructed and of each basic logging curve, wherein each feature representation sequence comprises the feature representations corresponding to the plurality of time sequence segments.
It can be appreciated that characterization learning is performed on the time sequence segments by a pre-constructed residual model, which maps the logging curves into a high-dimensional feature space, i.e., converts them into vectors of fixed dimension. In this embodiment of the present disclosure, the structure of the residual model is shown in FIG. 4; the residual model includes an input layer, at least one hidden layer, a residual connection layer and an output layer. The input layer converts the time sequence segment into a first feature representation of dimension hidden_dim. The hidden layer converts the first feature representation into a second feature representation of dimension out_dim. The residual connection layer converts the time sequence segment into a third feature representation of dimension out_dim. The output layer adds the second feature representation and the third feature representation, performs a normalization operation, and outputs the feature representation of the time sequence segment, whose dimension is out_dim. In this way, the residual model converts each time sequence segment (e.g., Patch_01) into a corresponding feature representation (e.g., token_01), thereby obtaining the feature representation sequences corresponding to the logging curve to be reconstructed and to each basic logging curve.
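The four-part structure of the residual model can be sketched as a forward pass in NumPy. This is a minimal sketch under stated assumptions, not the trained model of the embodiment: the layers are taken to be linear maps, the activation is ReLU, the normalization is layer normalization, there is a single hidden layer, and the weights are random. The class name `ResidualEmbedding` and all weight names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    """Normalize each feature vector to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

class ResidualEmbedding:
    """Maps time sequence segments of length patch_len to out_dim vectors."""

    def __init__(self, patch_len, hidden_dim, out_dim):
        self.w_in = rng.normal(0.0, 0.1, (patch_len, hidden_dim))
        self.b_in = np.zeros(hidden_dim)
        self.w_hid = rng.normal(0.0, 0.1, (hidden_dim, out_dim))
        self.b_hid = np.zeros(out_dim)
        self.w_res = rng.normal(0.0, 0.1, (patch_len, out_dim))
        self.b_res = np.zeros(out_dim)

    def __call__(self, patches):
        # Input layer: segment -> first representation (ReLU is an assumption).
        first = np.maximum(patches @ self.w_in + self.b_in, 0.0)
        # Hidden layer: first -> second representation of dimension out_dim.
        second = first @ self.w_hid + self.b_hid
        # Residual connection layer: segment -> third representation.
        third = patches @ self.w_res + self.b_res
        # Output layer: add the two paths, then normalize.
        return layer_norm(second + third)

model = ResidualEmbedding(patch_len=8, hidden_dim=32, out_dim=16)
tokens = model(rng.normal(size=(5, 8)))  # 5 segments -> 5 feature representations
```

The residual path projects the raw segment directly to out_dim, so the output keeps a linear view of the input alongside the nonlinear hidden path.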
And S204, adding time sequence information to the feature representation in each feature representation sequence, and constructing an initial feature vector matrix according to each feature representation sequence added with the time sequence information.
It can be understood that the timing information of a log carries key geological meaning. If the timing information is missing, the model cannot distinguish the order of features at different depths and therefore cannot learn the real space-time association. Timing information is therefore added to each feature representation. The timing information includes time ID information and variable ID information, wherein the time ID information represents the number of the Patch and its sequential relationship, and the variable ID information represents the number of the log. The time ID and the variable ID can then be converted into position-coded vectors by absolute position coding or relative position coding. The feature representation of each time sequence segment is then fused with the time ID coding vector and the variable ID coding vector in the feature dimension to obtain a feature representation containing position information. This feature representation not only retains the original logging characteristics of the variable, but also clearly distinguishes different curves and different depth segments in vector space through the coding of the time ID and the variable ID, providing a basis for the subsequent attention mechanism to capture cross-curve spatial association and cross-time-sequence association.
In the embodiment of the present disclosure, the rows and columns of the initial eigenvector matrix respectively correspond to different log curves and different time periods, and each element in the matrix is a characteristic representation of a curve with time information in a certain time period. As shown in fig. 5, assuming that there are 2 base curves and 1 curve to be reconstructed, wherein the base curves are gamma curves and resistivity curves, the curve to be reconstructed is a density curve, and each log curve is divided into 3 time-sequence segments, i.e. the gamma curves have a characteristic representation sequence of [ token_0, token_1, token_2], the resistivity curves have a characteristic representation sequence of [ token_3, token_4, token_5], and the density curves have a characteristic representation sequence of [ token_6, token_7, token_8], the initial feature vector matrix is a 3×3 structure, and each cell is a characteristic representation of each curve in the corresponding time-sequence segment.
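As a rough illustration of how the time ID and variable ID could be attached and the matrix assembled, the sketch below uses sinusoidal absolute position coding and fuses the codes by concatenation along the feature dimension. Both choices are assumptions (element-wise addition or a learned embedding would be equally compatible with the description), and the names `sinusoidal_encoding` and `build_feature_matrix` are hypothetical.

```python
import math


def sinusoidal_encoding(index, dim):
    """Absolute sinusoidal coding of an integer ID (one common choice)."""
    enc = []
    for i in range(dim):
        angle = index / (10000 ** (2 * (i // 2) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc


def build_feature_matrix(token_sequences, id_dim=2):
    """token_sequences[v][t] is the feature representation of curve v in segment t.

    Returns matrix[v][t] = token ++ time-ID code ++ variable-ID code,
    so rows correspond to curves and columns to time sequence segments.
    """
    matrix = []
    for var_id, sequence in enumerate(token_sequences):
        row = []
        for time_id, token in enumerate(sequence):
            fused = (list(token)
                     + sinusoidal_encoding(time_id, id_dim)
                     + sinusoidal_encoding(var_id, id_dim))
            row.append(fused)
        matrix.append(row)
    return matrix


# 3 curves (gamma, resistivity, density), 3 segments each, toy 4-dimensional tokens.
tokens = [[[0.0] * 4 for _ in range(3)] for _ in range(3)]
matrix = build_feature_matrix(tokens)
print(len(matrix), len(matrix[0]), len(matrix[0][0]))  # 3 3 8
```

Concatenation keeps the two codes in separate coordinates, so a token at (segment 1, curve 0) stays distinguishable from one at (segment 0, curve 1) even when the underlying features are identical.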
In the present embodiment, as shown in fig. 6, the curve reconstruction model includes one or more decoding layers, a linear mapping layer, and a softmax layer. As shown in fig. 7, step S103 of inputting the initial eigenvector matrix into a pre-trained curve reconstruction model to obtain a reconstructed curve of the log to be reconstructed specifically includes the following steps:
S301, inputting the initial eigenvector matrix into the decoding layer to obtain a first eigenvector matrix;
S302, inputting the first eigenvector matrix into the linear mapping layer to map the first eigenvector matrix to a numerical space corresponding to the logging curve to be reconstructed to obtain a target characteristic representation sequence;
S303, inputting the target feature representation sequence into the softmax layer to obtain a predicted value corresponding to each feature representation in the target feature representation sequence;
And S304, splicing the predicted values according to the time sequence information of the feature representation in the target feature representation sequence to obtain a reconstruction curve of the well logging curve to be reconstructed.
Specifically, the curve reconstruction model first extracts, through the decoding layer, the key space-time association features related to the curve to be reconstructed from the initial feature vector matrix to obtain a first feature vector matrix. The first feature vector matrix is a feature vector matrix that has undergone deep association mining; compared with the initial feature vector matrix, its features focus on the association information that is valuable for reconstruction, and redundant or interfering features are filtered out, which lays a foundation for the subsequent reconstruction. A logging curve is in essence a numerical representation of geological attributes, whereas the feature vectors in the first feature vector matrix are abstract vector representations produced by multi-layer processing. The linear mapping layer therefore projects these abstract high-level features, through a linear transformation such as matrix multiplication, into the numerical space corresponding to the logging curve to be reconstructed; for example, if the logging curve to be reconstructed is an acoustic logging curve, the features are mapped to the numerical range corresponding to microseconds/meter. The target feature representation sequence is a feature sequence matched with the numerical dimension of the curve to be reconstructed, and its dimension is consistent with the original data dimension of the curve to be reconstructed, which makes it convenient to generate a specific predicted value for each time sequence segment. In log reconstruction, the softmax layer is used to convert the target feature representations into interpretable specific predicted values, that is, each feature representation is converted into a single predicted value by normalization or linear transformation.
Because the predicted values are partial results divided by time sequence segment, whereas a logging curve is a continuous depth-value sequence, the predicted values of all the time sequence segments are connected in turn according to the order of the time IDs, i.e. the depth order, so that a complete sequence covering the whole specified period is formed.
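The splicing step can be sketched in a few lines of Python. The sketch assumes that each time sequence segment contributes its predicted values as a short list tagged with its time ID and that the segments do not overlap; the function name `splice_predictions` is hypothetical.

```python
def splice_predictions(predictions):
    """predictions: iterable of (time_id, values) pairs, in any order.

    Returns one continuous list ordered by time ID (i.e. by depth).
    """
    curve = []
    for _, values in sorted(predictions, key=lambda item: item[0]):
        curve.extend(values)
    return curve


segments = [(2, [2.5, 2.6]), (0, [2.1, 2.2]), (1, [2.3, 2.4])]
print(splice_predictions(segments))  # [2.1, 2.2, 2.3, 2.4, 2.5, 2.6]
```

If the segments overlap (step size smaller than the segment length), the overlapping points would additionally need to be merged, for example by averaging; the text does not specify how.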
In this embodiment of the present disclosure, as shown in fig. 8, in step S301, the initial feature vector matrix is input into the decoding layer to obtain a first feature vector matrix, which specifically includes the following steps:
S401, inputting the initial feature vector matrix into the asymmetric causal attention layer to extract space-time association weights of the initial feature vector matrix, and mapping the space-time association weights to the initial feature vector matrix to obtain a second feature vector matrix;
S402, inputting the second eigenvector matrix into the first normalization layer for normalization to obtain a third eigenvector matrix;
S403, inputting the third eigenvector matrix into the feedforward network layer to extract local nonlinear characteristics of the third eigenvector matrix, and mapping the local nonlinear characteristics to the second eigenvector matrix to obtain a fourth eigenvector matrix;
S404, inputting the fourth eigenvector matrix into the second normalization layer for normalization to obtain a first eigenvector matrix.
Specifically, the asymmetric causal attention layer quantifies, through calculation, the degree of association between different logging curves in different time sequence segments of the initial features, namely the space-time association weights. The asymmetry lies in that the spatial association between different logging curves, such as the association between the resistivity curve and the acoustic curve at the same time sequence position, is preferentially amplified, while the temporal self-association of the logging curve to be reconstructed, namely the association of that logging curve with itself at different time sequence positions, is suppressed, which is consistent with a reconstruction target in which mutual association contributes preferentially. The calculated space-time association weights are then fused with the initial feature vector matrix by weighting, so that the mutual association information valuable for reconstruction in the feature matrix is strengthened and the redundant and erroneous self-association information is weakened, finally yielding the second feature vector matrix. The first normalization layer normalizes the second feature vector matrix: by calculating the mean and variance of the features in the matrix, it converts each feature value into a normalized value and eliminates the numerical scale differences between feature dimensions. This prevents the model from paying excessive attention to features with large values while ignoring key features with small values, stabilizes the gradients of the model during training, and improves convergence efficiency.
The feedforward network layer consists of a fully connected layer and a nonlinear activation function. It performs a nonlinear transformation on the local features in the third feature vector matrix to capture the complex nonlinear geological relations in the logging curves, and then maps the extracted local nonlinear features onto the second feature vector matrix through a linear transformation, finally obtaining the fourth feature vector matrix. The fourth feature vector matrix is a feature matrix that fuses global association with local nonlinear relations: it contains the macroscopic association between different curves and also captures the fine geological features of local depth segments, which compensates for the attention layer's insufficient capture of fine local features and, by mining deeper geological rules through nonlinear transformation, improves the characterization capability of the features. The second normalization layer normalizes the fourth feature vector matrix again, on the same principle as the first normalization layer. Its purpose is to eliminate the feature distribution shift that the nonlinear transformation of the feedforward network layer may cause, so that the numerical scale of the output features is stabilized again and reliable input is provided for the subsequent linear mapping layer and softmax layer.
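The order of operations in steps S401 to S404 can be sketched as below. The sketch makes several assumptions: the normalization is layer normalization over each feature vector, "mapping onto the second feature vector matrix" is read as a residual addition, and the attention and feedforward components are passed in as callables (toy stand-ins are used here). All function names are hypothetical.

```python
import math


def layer_norm(vec, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def decoding_layer(tokens, attention, feed_forward):
    """tokens: list of feature vectors.

    attention maps the whole list of vectors; feed_forward maps one vector.
    """
    second = attention(tokens)                           # S401: attention output
    third = [layer_norm(vec) for vec in second]          # S402: first normalization
    fourth = [                                           # S403: FFN output added onto
        [s + f for s, f in zip(sec, feed_forward(thi))]  #       the second matrix (assumed)
        for sec, thi in zip(second, third)
    ]
    return [layer_norm(vec) for vec in fourth]           # S404: second normalization


def identity_attention(tokens):
    """Toy stand-in for the asymmetric causal attention layer."""
    return [list(t) for t in tokens]


def relu_ffn(vec):
    """Toy stand-in for the feedforward network layer."""
    return [max(0.0, v) for v in vec]


out = decoding_layer([[1.0, 2.0, 3.0], [4.0, 0.0, -4.0]], identity_attention, relu_ffn)
print(len(out), len(out[0]))  # 2 3
```

Passing the two sub-layers in as callables keeps the sketch focused on the sequence of the four steps rather than on any particular attention or feedforward implementation.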
In this embodiment of the present disclosure, as shown in fig. 9, step S401 extracts a space-time correlation weight of the initial feature vector matrix, maps the space-time correlation weight to the initial feature vector matrix, and obtains a second feature vector matrix, and specifically includes the following steps:
S501, performing linear transformation on the initial feature vector matrix to obtain a query matrix, a key matrix and a value matrix, wherein the query matrix is generated according to the feature representation sequence of the well logging curve to be reconstructed, and the key matrix and the value matrix are generated according to the feature representation sequences of the well logging curve to be reconstructed and the basic well logging curve;
S502, constructing a space-time correlation matrix according to time sequence information of each feature representation in the initial feature vector matrix and a physical mechanism of each logging curve, wherein each element value in the space-time correlation matrix represents whether time correlation and space correlation exist among the feature representations;
S503, calculating attention scores between the query matrix and the key matrix according to the space-time correlation matrix;
S504, normalizing the attention score by using a softmax function to obtain a space-time correlation weight;
And S505, carrying out weighted summation on the value matrix according to the space-time correlation weight to obtain a second eigenvector matrix.
Specifically, step S501 converts the initial feature vector into a query-key-value triplet suitable for calculating the relevance, thereby laying a foundation for the subsequent attention score calculation. The query matrix Q is generated based on the characteristic representation sequence of the well logging curve to be reconstructed only, the target which the model wants to query is represented, namely, the correlation information which the well logging curve to be reconstructed needs to acquire from other data, the key matrix K and the value matrix V are generated based on the well logging curve to be reconstructed and the characteristic representation sequences of all the basic well logging curves, wherein the keys are used for calculating the similarity with the query, namely, the correlation degree, and the values are used for storing the characteristic information which needs to be weighted and fused.
In step S502, the number of rows and columns of the space-time correlation matrix is the same as the total number of feature representations in the initial feature vector matrix, for example, if the total number of feature representations in the initial feature vector matrix is 9, the structure size of the space-time correlation matrix is 9×9. In this embodiment of the present disclosure, as shown in fig. 10, the construction of the space-time correlation matrix according to the time sequence information represented by each feature in the initial feature vector matrix and the physical mechanism of each log curve specifically includes the following steps:
S601, determining a characteristic representation corresponding to a well logging curve to be reconstructed and a characteristic representation corresponding to a basic well logging curve from the initial characteristic vector matrix;
S602, judging whether the characteristic representation corresponding to the basic well logging curve has time correlation with the characteristic representation corresponding to the well logging curve to be reconstructed according to the time sequence information of each characteristic representation, and judging whether the characteristic representation corresponding to the basic well logging curve has space correlation with the characteristic representation corresponding to the well logging curve to be reconstructed according to the physical mechanism of each well logging curve;
and S603, if so, assigning the elements corresponding to the feature representations with the time association and the space association as a first set value, and assigning the elements corresponding to the feature representations without the time association and the space association as a second set value.
In this embodiment of the present disclosure, as shown in fig. 11, step S602 of determining whether a spatial correlation exists between a feature representation corresponding to a base log and a feature representation corresponding to a log to be reconstructed according to a physical mechanism of each log may include the following steps:
S701, calculating the characterization correlation of each basic well logging curve and the well logging curve to be reconstructed according to the characterization characteristics of each basic well logging curve and the well logging curve to be reconstructed in different stratum;
S702, calculating the matching degree of each basic well logging curve and the well logging curve to be reconstructed according to the empirical knowledge of the historical well logging interpretation;
S703, calculating to obtain a spatial correlation value of each basic well logging curve and the well logging curve to be reconstructed according to the characterization correlation and the matching degree;
and S704, if the spatial correlation value is larger than a preset threshold value, determining that spatial correlation exists between the basic well logging curve and the well logging curve to be reconstructed.
It can be understood that the characterization features in step S701 refer to the specific numerical rules or change patterns that a logging curve exhibits in different strata (such as sandstone, mudstone, limestone, and shale). For different strata of the same well, numerical features of the basic logging curve and of the logging curve to be reconstructed, such as the mean, the variance, and the slope of change, are extracted, and the strength of association between the two curves is calculated by means of the Pearson correlation coefficient, the Spearman rank correlation, feature distribution similarity, or the like. The empirical knowledge of historical logging interpretation in step S702 refers to empirical rules, summarized in long-term practice, about how the physical meanings of different logging curves are related. For example, the matching degree between natural gamma and acoustic time difference is high, because natural gamma reflects the argillaceous content, the acoustic time difference reflects the rock compactness, and a high argillaceous content is usually related to the rock porosity. In some embodiments, the matching degree may be quantified by counting how often the two curves were effectively used together to interpret the formation in historical wells. Step S703 may calculate the spatial correlation value by a weighted summation algorithm. The preset threshold in step S704 is a critical value of association strength set according to actual requirements. If the spatial correlation value is greater than the preset threshold, the basic curve and the curve to be reconstructed are significantly associated in terms of both characterization features and practical experience, and it can be determined that there is spatial correlation between the two, that is, their responses to the stratum attributes that vary with depth depend on each other, and the basic curve can be used as an effective input for reconstructing the curve to be reconstructed.
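Steps S701 to S704 can be illustrated with the following sketch. It assumes that the characterization correlation is the absolute Pearson correlation coefficient of two sampled curves, that the matching degree is supplied as a number in [0, 1], and that the weighted summation uses a single weight `alpha`; the names, the weight, and the threshold value are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant curve carries no correlation information
    return cov / (norm_x * norm_y)


def spatial_association(base, target, matching_degree, alpha=0.5, threshold=0.6):
    """Return (spatial correlation value, whether spatial correlation exists)."""
    characterization = abs(pearson(base, target))                       # S701
    value = alpha * characterization + (1 - alpha) * matching_degree    # S703
    return value, value > threshold                                     # S704


value, associated = spatial_association(
    base=[1.0, 2.0, 3.0, 4.0],
    target=[2.0, 4.0, 6.0, 8.0],
    matching_degree=0.8,  # S702: assumed to come from historical interpretation statistics
)
print(round(value, 3), associated)  # 0.9 True
```

In practice the correlation would be computed per stratum and then aggregated, and the weight and threshold would be tuned to the field; the sketch collapses this to a single interval for brevity.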
In this embodiment of the present disclosure, as shown in fig. 12, step S602 of determining whether a time correlation exists between a feature representation corresponding to a base log and a feature representation corresponding to a log to be reconstructed according to timing information of each feature representation may include the following steps:
S801, judging whether the time sequence position of a feature representation corresponding to the basic well logging curve is not later than the time sequence position of a feature representation corresponding to the well logging curve to be reconstructed according to the time sequence information of each feature representation;
S802, if yes, determining that time correlation exists between the two feature representations;
if not, determining that there is no time correlation between the two feature representations.
As shown in fig. 13, taking density curve reconstruction as an example, it is assumed that the density curve is reconstructed from the gamma curve and the resistivity curve, that each curve includes 3 Tokens, and that each Token of the curve to be reconstructed is reconstructed from the Tokens of the basic curves at the current time and before the current time. In one embodiment, assuming that the density curve to be reconstructed includes token_6, token_7, and token_8, then token_6 may be reconstructed from token_0 of the gamma curve and token_3 of the resistivity curve; token_7 may be reconstructed from token_0 and token_1 of the gamma curve and token_3 and token_4 of the resistivity curve; and token_8 may be reconstructed from token_0, token_1, and token_2 of the gamma curve and token_3, token_4, and token_5 of the resistivity curve. Note that token_7 is reconstructed without considering the contribution of token_6, and token_8 is reconstructed without considering the contributions of token_6 and token_7, because token_6 and token_7 are themselves reconstructed values that contain errors; excluding them avoids propagating those errors. The association between Tokens can be represented by a space-time association matrix M, in which 1 indicates that an association exists and the attention operation is performed, and 0 indicates that no association exists and the attention operation is not performed. In the density curve reconstruction scene, the space-time association matrix is shown in table 1, where Token is abbreviated as Tk.
TABLE 1
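The association rule described for the density example can be sketched in Python as follows. The sketch only builds the rows for the Tokens of the curve to be reconstructed, because the text does not state how rows for basic-curve Tokens are filled; the function name `build_association_mask` and the Token numbering (curve index times number of segments plus segment index) are assumptions made for illustration.

```python
def build_association_mask(n_curves, n_segments, target_curve, spatially_associated):
    """Return {query_token: [0/1 per key token]} for the target curve's tokens.

    A key token is marked 1 only if it belongs to a basic curve that is spatially
    associated with the target curve AND its time sequence position is not later
    than that of the query token.
    """
    mask = {}
    for q_time in range(n_segments):
        query = target_curve * n_segments + q_time
        row = []
        for curve in range(n_curves):
            for k_time in range(n_segments):
                spatial = curve != target_curve and curve in spatially_associated
                temporal = k_time <= q_time
                row.append(1 if spatial and temporal else 0)
        mask[query] = row
    return mask


# Curves: 0 = gamma (token_0..2), 1 = resistivity (token_3..5), 2 = density (token_6..8).
mask = build_association_mask(3, 3, target_curve=2, spatially_associated={0, 1})
for token, row in mask.items():
    print(f"token_{token}:", row)
# token_6: [1, 0, 0, 1, 0, 0, 0, 0, 0]
# token_7: [1, 1, 0, 1, 1, 0, 0, 0, 0]
# token_8: [1, 1, 1, 1, 1, 1, 0, 0, 0]
```

The last three columns are always 0, which reflects that already reconstructed density Tokens do not contribute to later ones.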
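In the embodiment of the present disclosure, in step S503, after the space-time correlation matrix is obtained, the attention score between the query matrix and the key matrix is calculated by the following formula:
$$\mathrm{Score} = M \odot \frac{QK^{T}}{\sqrt{d_k}};$$
wherein M is the space-time correlation matrix, Q is the query matrix, K is the key matrix, $\odot$ denotes element-wise multiplication, and $\sqrt{d_k}$ is the scaling factor, $d_k$ being the dimension of the key vectors.
The space-time correlation weight is then calculated and applied by the following formula:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(\mathrm{Score})\,V$$
wherein $\mathrm{softmax}(\mathrm{Score})$ is the space-time correlation weight and V is the value matrix. Finally, the value matrix is weighted and summed according to the space-time correlation weights to obtain the second eigenvector matrix.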
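Steps S503 to S505 can be sketched as a masked scaled dot-product attention in plain Python. One assumption is made explicit: positions whose association value is 0 are excluded by setting their score to negative infinity before the softmax, which is a common way of ensuring that "no attention operation is performed" for them; the text does not state which mechanism is used. The function name `masked_attention` is hypothetical.

```python
import math


def masked_attention(queries, keys, values, mask):
    """queries: list of query vectors; keys/values: one vector per key token.

    mask[i][j] is 1 if query i may attend to key j, else 0.
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    out_dim = len(values[0])
    outputs = []
    for q, mask_row in zip(queries, mask):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) if allowed else -math.inf
            for k, allowed in zip(keys, mask_row)
        ]
        peak = max(scores)
        if peak == -math.inf:             # no associated key at all
            outputs.append([0.0] * out_dim)
            continue
        exps = [math.exp(s - peak) for s in scores]   # numerically stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]           # space-time association weights
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, values)) for i in range(out_dim)]
        )
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
values = [[1.0], [2.0], [100.0]]
out = masked_attention([[1.0, 0.0]], keys, values, mask=[[1, 1, 0]])
print(out)  # a value between 1 and 2: the third key is masked out
```

Because the masked key receives exactly zero weight, its value (100.0 here) cannot leak into the output, which mirrors the requirement that already reconstructed Tokens do not contribute.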
In order to verify that the reconstruction accuracy of the curve reconstruction model in the embodiment of the present specification meets the production requirement, reconstruction of an acoustic curve and of a density curve is taken as an example: the complete log curves of 30 wells are selected as training samples and the log curves of 10 wells are selected as test samples, the curve reconstruction model is trained and tested, and the results are shown in table 2.
TABLE 2
The absolute error of the density curve reconstruction is 0.014, against a production requirement of less than 0.025, so it can be seen that the density curve reconstruction meets the production requirement; the absolute error of the acoustic logging curve reconstruction is 1.28, against a production requirement of less than 2, so the acoustic logging curve reconstruction also meets the production requirement. FIG. 14 is a graph comparing the reconstructed density curve of a well with the actual density curve, in which the abscissa is the well depth, the ordinate is the density value corresponding to each well depth, the dashed line is the reconstructed log, and the solid line is the actual log; the reconstructed log approximately coincides with the actual log.
Based on the above-mentioned method for reconstructing a well logging curve, the embodiment of the present disclosure correspondingly further provides a device for reconstructing a well logging curve. The device may include a system (including a distributed system), software (applications), modules, components, servers, clients, etc. that employ the methods described in the embodiments of the present specification, in combination with the necessary hardware. Based on the same inventive concept, the devices provided in one or more embodiments of the present specification are described in the following embodiments. Because the way in which the device solves the problem is similar to that of the method, the implementation of the device in the embodiment of the present disclosure may refer to the implementation of the foregoing method, and repeated details are omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Specifically, fig. 15 is a schematic block diagram of an embodiment of a log reconstruction device provided in an embodiment of the present disclosure, and referring to fig. 15, the log reconstruction device provided in an embodiment of the present disclosure includes:
An acquisition module 1501, configured to acquire an original log of a specified period, where the original log includes a log to be reconstructed and a plurality of base logs, where the base logs are other types of logs except the log to be reconstructed;
The characterization learning module 1502 is configured to perform characterization learning on the original log to obtain an initial feature vector matrix;
The reconstruction module 1503 is configured to input the initial feature vector matrix into a pre-trained curve reconstruction model to obtain a reconstructed curve of the logging curve to be reconstructed, where the curve reconstruction model includes at least one decoding layer, the decoding layer includes an asymmetric causal attention layer, a first normalization layer, a feedforward network layer, and a second normalization layer, and the asymmetric causal attention layer is configured to extract a space-time correlation weight of the initial feature vector matrix, where the space-time correlation weight represents a space-time correlation degree between feature representations in the initial feature vector matrix.
The beneficial effects obtained by the device provided by the embodiment of the present disclosure are consistent with those obtained by the above method, and will not be described herein.
Referring to fig. 16, a computer device 1602 is also provided in an embodiment of the present disclosure based on a log reconstruction method as described above, wherein the method is run on the computer device 1602. The computer device 1602 may include one or more processors 1604, such as one or more Central Processing Units (CPUs), each of which may implement one or more hardware threads. The computer device 1602 may also include any memory 1606 for storing any kind of information, such as code, settings, data, and the like. By way of non-limiting example, memory 1606 may include any one or more combinations of any type of RAM, any type of ROM, a flash memory device, a hard disk, an optical disk, and the like. More generally, any memory may store information using any technique. Further, any memory may provide volatile or non-volatile retention of information. Further, any memory may represent fixed or removable components of the computer device 1602. In one case, when the processor 1604 executes associated instructions stored in any memory or combination of memories, the computer device 1602 can perform any of the operations of the associated instructions. The computer device 1602 also includes one or more drive mechanisms 1608, such as a hard disk drive mechanism, an optical disk drive mechanism, and the like, for interacting with any memory.
The computer device 1602 may also include an input/output module 1610 (I/O) for receiving various inputs (via an input device 1612) and for providing various outputs (via an output device 1614). One particular output mechanism may include a presentation device 1616 and an associated Graphical User Interface (GUI) 1618. In other embodiments, the input/output module 1610 (I/O), the input device 1612, and the output device 1614 may be omitted, with the computer device 1602 serving merely as one computer device in a network. The computer device 1602 may also include one or more network interfaces 1620 for exchanging data with other devices via one or more communication links 1622. One or more communication buses 1624 couple the above-described components together.
Communication link 1622 may be implemented in any manner, for example, through a local area network, a wide area network (e.g., the internet), a point-to-point connection, etc., or any combination thereof. Communication link 1622 may include any combination of hardwired links, wireless links, routers, gateway functions, name servers, etc., governed by any protocol or combination of protocols.
Corresponding to the above method, the embodiments of the present description also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above method.
The present description also provides computer-readable instructions, wherein the program therein causes a processor to perform the above-described method when the processor executes the instructions.
The present description also provides a computer program product comprising at least one instruction or at least one program loaded into and executed by a processor to implement the above-described method.
It should be understood that, in various embodiments of the present disclosure, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation of the embodiments of the present disclosure.
It should also be understood that, in the embodiments of the present specification, the term "and/or" is merely one association relationship describing the association object, meaning that three relationships may exist. For example, A and/or B may mean that A alone, both A and B, and B alone are present. In the present specification, the character "/" generally indicates that the front and rear related objects are an or relationship.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the various example components and steps have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present specification.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this specification, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or elements, or may be an electrical, mechanical, or other form of connection.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purposes of the embodiments of the present description.
In addition, each functional unit in each embodiment of the present specification may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present specification in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present specification. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The principles and embodiments of the present invention have been described in the present specification by using specific examples, which are provided to assist in understanding the method and core ideas of the present invention, and modifications will be apparent to those skilled in the art from the teachings of the present invention, and it is intended that the present invention not be limited to these examples.