Disclosure of Invention
In view of the above, the present invention aims to provide a method for preventing lost circulation in geothermal drilling, which constructs a support vector machine model from drilling data in advance so as to prevent lost circulation caused by the combined effect of multiple parameter features.
To achieve the above object, the present invention provides a method for preventing lost circulation in geothermal drilling, the method comprising:
acquiring a drilling history data set, wherein the drilling history data set comprises lost circulation data and non-lost circulation data, and the lost circulation data and the non-lost circulation data have the same plurality of original parameter characteristics;
marking lost circulation data with lost circulation labels, marking non-lost circulation data with non-lost circulation labels, and randomly dividing the drilling history data set into training set data and test set data;
carrying out principal component analysis on a plurality of original parameter features in the drilling history data set to obtain parameter features after dimension reduction;
mapping the training set data into a multidimensional space according to the parameter characteristics after dimension reduction;
building a support vector machine model, and separating lost circulation data and non-lost circulation data in the multidimensional space through the lost circulation labels and non-lost circulation labels in the training set data;
carrying out an accuracy test on the support vector machine model by using the test set data, and correcting the support vector machine model.
Further, after the model is corrected, current drilling data are collected in real time and mapped into the multidimensional space according to the parameter features after dimension reduction, and the data of the original parameter features are adjusted so that the current drilling data lie on the non-lost circulation side of the support vector machine model.
Further, if the current drilling data indicate that lost circulation has occurred, the current drilling data are marked with a lost circulation label and added to the training data of the support vector machine model to retrain the model;
meanwhile, the corresponding lost circulation point is located according to the current drilling data, and plugging material is added at the lost circulation point to plug it.
Still further, the principal component analysis of the plurality of parameter features in the drilling history data set includes:
S1, after data standardization processing is carried out on the data in the drilling history data set, constructing a multidimensional random vector matrix;
S2, calculating a covariance matrix of the multidimensional random vector matrix;
S3, carrying out eigendecomposition on the covariance matrix to obtain a group of eigenvalues and corresponding eigenvectors;
S4, sorting the eigenvalues from largest to smallest and selecting the eigenvectors corresponding to the first k eigenvalues as principal components.
Further, the data standardization processing for the data in the drilling history data set includes:
aligning the numerical data by adopting a nondimensionalization criterion;
carrying out structural mapping processing on the text data;
carrying out normalization processing on the aligned numerical data and the structurally mapped text data.
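Still further, the multidimensional random vector matrix is expressed as:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix} = (X_1, X_2, \ldots, X_p)$$
wherein $X$ represents the multidimensional random vector matrix, $x_{np}$ represents the value of the p-th parameter feature for the n-th sample, and $X_p$ represents the column vector corresponding to the p-th parameter feature.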
Further, the process of constructing the support vector machine model includes:
According to the distribution of lost circulation data and non-lost circulation data in the multidimensional space, a classification hyperplane of lost circulation data and non-lost circulation data is obtained, a function expression of the classification hyperplane is obtained, and all lost circulation data are arranged on one side of the classification hyperplane and all non-lost circulation data are arranged on the other side of the classification hyperplane.
Further, the distance between any lost circulation data point and all non-lost circulation data points is calculated, the non-lost circulation data point closest to the lost circulation data point is selected, the center point between the two is calculated, and the classification hyperplane is adjusted according to the center point so that the center point is located on the classification hyperplane.
Furthermore, testing the accuracy of the model by using the test set data and correcting the model comprises the following steps:
mapping the test set data one by one into the multidimensional space according to the parameter features after dimension reduction, and calculating the accuracy of the model according to how the lost circulation data and non-lost circulation data in the test set data are distributed on the two sides of the classification hyperplane;
after the accuracy test, adding the test set data to the training set data and iteratively training the model.
Compared with the prior art, the invention has the beneficial effects that:
According to the method for preventing lost circulation in geothermal drilling, a drilling history data set is collected and its multiple parameter features are analyzed to obtain the deep parameter features that cause lost circulation. The drilling history data set is mapped into a high-dimensional space according to these deep parameter features, and a support vector machine model generates a classification hyperplane in that space which separates lost circulation data from non-lost circulation data. When a site is drilled, the deep parameter features of the site are mapped into the high-dimensional space in advance to judge whether the current drilling point is at risk of lost circulation. The current drilling point can then be moved to a non-lost circulation state by adjusting some of the parameter features, so that lost circulation is avoided and valuable reference data are accumulated for subsequent geological development work.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
The embodiment provides a method for preventing lost circulation in geothermal drilling, which is applicable to the situation of geothermal resource drilling.
Geothermal drilling processes include geological exploration, well site selection, drilling equipment installation, drilling, geothermal well testing, and completion.
Geological exploration: the stratigraphy, structure, magmatic rocks, hydrogeological conditions and the like of the target area are investigated and analyzed in detail by geological survey, geophysical exploration, geochemical exploration and similar methods, and the occurrence conditions and distribution range of geothermal resources are preliminarily determined.
Well site selection: the most suitable drilling position is selected according to the geological exploration results, so that the geothermal well can accurately penetrate the geothermal reservoir.
Drilling equipment installation: the drilling equipment, including the drilling rig, drill pipe, drill bit and the like, is installed in preparation for the subsequent drilling work.
Drilling: the drilling equipment is used to drill underground until the geothermal reservoir is reached. During drilling, suitable drilling methods, flushing media, drilling tools and the like must be employed to overcome the particular difficulties encountered.
Geothermal well testing: the geothermal well is tested during or after drilling to determine the temperature, flow rate and other parameters of the geothermal resource, providing a basis for subsequent exploitation and utilization.
Completion: after drilling is finished, completion treatment is carried out on the geothermal well, including installing a wellhead device, sealing the well and the like, to ensure that the geothermal well can operate stably over the long term.
It should be noted that the prior art takes a variety of precautions against lost circulation, including geological investigation, rational design of the well structure, drilling fluid adjustment, standardized drilling operation, and recording of abnormal parameters. However, these precautions only reduce the risk of lost circulation to a certain extent. They cannot avoid lost circulation caused by the combined action of multiple characteristic parameters, and they require repeated tripping, repeated adjustment of drilling fluid density and drilling rate, and even repeated casing replacement, which seriously affects drilling progress and efficiency.
To solve the above problems, the embodiment of the invention provides a method for preventing lost circulation in geothermal drilling. Principal component analysis is carried out on historical drilling data to obtain the main parameter features, and a support vector machine model is constructed to separate lost circulation data from non-lost circulation data in space, so that lost circulation caused by the combined effect of multiple parameter features is prevented.
As shown in fig. 1, the method specifically includes the following steps:
S110, acquiring a drilling history data set, wherein the drilling history data set comprises lost circulation data and non-lost circulation data, and the lost circulation data and the non-lost circulation data have the same plurality of original parameter features.
Wherein the plurality of original parameter features includes, but is not limited to, the following:
Drilling duration: the time the drill bit spends drilling at a given depth, which reflects the nature of the formation; harder or more brittle formations extend the drilling duration and may even cause the drill bit to become stuck.
Drilling depth: the maximum depth reached during drilling, an important basis for evaluating drilling capacity and formation structure.
Drilling rate: reflects the drilling efficiency of the drill bit in a particular formation, and is an important parameter for assessing the drilling process and formation characteristics.
Drill bit wear: a record of the wear condition of the drill bit during drilling, which is significant for optimizing drill bit selection and improving drilling efficiency.
Lithology data: rock type, mineral composition, structure and the like, which are the basis for evaluating formation properties and reservoir conditions.
Formation data: a record of the stratigraphic sequence, thickness, lithology changes and the like revealed during drilling, which is very important for constructing a geological model and predicting the distribution of oil and gas reservoirs.
Porosity, permeability and saturation: reflect the storage capacity and fluid flow capacity of the rock, and are important bases for evaluating oil and gas reservoir characteristics.
Downhole temperature and pressure: reflect the formation temperature and pressure conditions, and are significant for understanding oil and gas generation, migration and accumulation.
Fluid resistivity and natural gamma rays: reflect formation fluid properties and rock radioactivity characteristics, and are important means of evaluating reservoir properties and characteristics.
Electrical logging, acoustic logging and nuclear logging data: provide rich formation physical property information such as resistivity, acoustic velocity and density, and play a key role in formation division and reservoir evaluation.
Drilling fluid parameters: the density, viscosity, fluid loss and the like of the drilling fluid, which reflect drilling fluid performance and downhole conditions, and are significant for maintaining wellbore stability and preventing formation contamination.
Drilling accident records: records of accidents such as stuck pipe, blowout and lost circulation during drilling and the measures taken to handle them, which provide guidance for improving drilling safety and reducing accident risk.
Core and cuttings samples: important physical data for geological research, which can be used for petrographic, mineralogical, geochemical and other analyses.
In addition, the lost circulation data in the drilling history data set refer to the above original parameter features collected when lost circulation occurs; this information may be collected by devices or methods such as sensors and acoustic logging. The information includes numerical data, expressed as numerical values, such as drilling depth and drilling rate, as well as text data such as drill bit wear and lithology data.
Text data must be converted into numerical data before principal component analysis can be performed; the conversion of text data includes structural mapping and the like.
The non-lost circulation data refer to the same parameter data collected in a normal drilling state.
S120, marking lost circulation data with lost circulation labels, marking non-lost circulation data with non-lost circulation labels, and randomly dividing a drilling history data set into training set data and test set data.
The lost circulation data and non-lost circulation data are labeled here for the purpose of distinguishing the two types of data in space later. The data set is divided for the subsequent model construction and training and accuracy testing.
Typically the ratio of training set data to test set data is 4:1, i.e. 80% of the data belongs to the training set data and the remaining 20% belongs to the test set data throughout the drilling history data set.
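The random 4:1 division can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's actual implementation: the function name `split_dataset`, the fixed random seed, the toy feature matrix, and the 1/0 encoding of the lost circulation and non-lost circulation labels are all hypothetical.

```python
import numpy as np

def split_dataset(features, labels, test_ratio=0.2, seed=0):
    """Randomly split samples into training and test sets (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(features))      # shuffled sample indices
    n_test = int(round(len(features) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (features[train_idx], labels[train_idx],
            features[test_idx], labels[test_idx])

# Toy data (assumed): 10 samples with 3 parameter features each.
# Label 1 = lost circulation, label 0 = non-lost circulation (assumed encoding).
features = np.arange(30, dtype=float).reshape(10, 3)
labels = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])

X_train, y_train, X_test, y_test = split_dataset(features, labels)
print(X_train.shape, X_test.shape)  # (8, 3) (2, 3)
```

Fixing the seed only makes the illustration repeatable; any random permutation satisfies the division described above.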
S130, carrying out principal component analysis on the plurality of original parameter features in the drilling history data set to obtain the parameter features after dimension reduction.
Principal component analysis can reveal the inherent relationships among multiple variables; its principle is to reduce the dimension of the variables while retaining as much of the information contained in the original variables as possible. Dimension reduction can take two forms. One discards part of the features: K features are selected from the original P features as main features and the rest are discarded. The other combines several features into one, finally converting the P features into K features.
Principal component analysis comprises:
constructing a multidimensional random vector (p-dimensional random vector):
$$X = (x_1, x_2, \ldots, x_p)^T,$$
with principal components $Y_i$ ($i = 1, 2, \ldots, k$; $k \le p$).
The principal components satisfy:
$$Y_i = a_i^T X,$$
wherein $a_i$ is a $p \times 1$ numerical vector satisfying $a_i^T a_i = 1$, chosen so that the variance of $Y_i$ is maximized, and the principal components $Y_1, Y_2, \ldots, Y_k$ are mutually uncorrelated.
The original P-dimensional characteristic can be reduced to a K-dimensional characteristic through principal component analysis, so that the main parameter characteristic in the drilling history data set is screened, and the lost circulation data and the non-lost circulation data can be conveniently mapped in a multidimensional space.
S140, mapping the training set data into a multidimensional space according to the parameter features after dimension reduction.
The drilling history data set after principal component analysis comprises only K parameter features. These features are arranged in a fixed order, and each lost circulation data point and non-lost circulation data point in the training set data is projected into the multidimensional space.
S150, constructing a support vector machine model, and separating lost circulation data and non-lost circulation data in the multidimensional space through the lost circulation labels and non-lost circulation labels in the training set data.
After the training set data are mapped into the multidimensional space, the two types of data can be distinguished directly, because lost circulation data carry lost circulation labels and non-lost circulation data carry non-lost circulation labels, and the distribution of the two types of data in the space can be observed. The support vector machine model then divides the two types of data by means of a function or hyperplane expression.
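The idea of dividing the two labeled classes with a hyperplane can be sketched with a small linear support vector machine trained by subgradient descent on the regularized hinge loss. The training procedure, the hyperparameters (`lam`, `lr`, `epochs`), the +1/-1 label encoding and the toy data are assumptions made for illustration; the embodiment does not specify a kernel or a solver.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b of the hyperplane w.x + b = 0 by subgradient descent.

    y must contain +1 (lost circulation, assumed) or -1 (non-lost circulation).
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                      # samples inside or beyond the margin
        grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Return +1 on the lost circulation side of the hyperplane, -1 otherwise."""
    return np.where(X @ w + b >= 0, 1, -1)

# Toy two-dimensional training data (assumed values after dimension reduction).
X = np.array([[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [3.0, 2.0],
              [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0], [-3.0, -2.0]])
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

w, b = train_linear_svm(X, y)
print(predict(X, w, b))
```

On linearly separable data such as this toy set, every training sample ends up on the side of the hyperplane that matches its label.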
S160, testing the accuracy of the support vector machine model by using the test set data, and correcting the support vector machine model.
After the support vector machine model finds such a function or hyperplane expression, the test set data are input into the support vector machine model and the model's classification results on the test set data are tallied; the proportion of correct classification results is taken as the accuracy of the model. After the accuracy test is finished, all of the test set data are added to the training set data and the model is updated.
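The accuracy calculation and the subsequent merge of test data into the training data can be sketched as follows. The hyperplane coefficients, the sample values and the +1/-1 label encoding are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def accuracy(w, b, X_test, y_test):
    """Proportion of test samples that fall on the side matching their label."""
    predicted = np.where(X_test @ w + b >= 0, 1, -1)
    return float(np.mean(predicted == y_test))

# Hypothetical trained hyperplane: x1 + x2 = 0.
w = np.array([1.0, 1.0])
b = 0.0

# Hypothetical test set: +1 = lost circulation, -1 = non-lost circulation.
X_test = np.array([[2.0, 1.0], [-1.0, -3.0], [0.5, -2.0], [1.0, 1.0]])
y_test = np.array([1, -1, 1, 1])

acc = accuracy(w, b, X_test, y_test)
print(acc)  # 0.75: the third sample lies on the wrong side

# After testing, the test samples are appended to the training data
# so that the model can be retrained on the enlarged set.
X_train = np.array([[3.0, 2.0], [-2.0, -2.0]])
y_train = np.array([1, -1])
X_merged = np.vstack([X_train, X_test])
y_merged = np.concatenate([y_train, y_test])
```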
This embodiment provides a method for preventing lost circulation in geothermal drilling. By analyzing the main parameter features and establishing a support vector machine model, lost circulation data can be separated from non-lost circulation data, which helps prevent lost circulation accidents. Moreover, because many of the parameter features are adjustable, such as drilling rate, drill pipe replacement and drilling fluid parameters, a lost circulation state can be avoided during drilling by adjusting them in advance.
Example 2
Fig. 2 is a flowchart of a method for preventing lost circulation in geothermal drilling according to a second embodiment of the present invention. This embodiment further optimizes the method of the first embodiment. As shown in fig. 2, the method specifically includes the following steps:
S210, acquiring a drilling history data set, wherein the drilling history data set comprises lost circulation data and non-lost circulation data, and the lost circulation data and the non-lost circulation data have the same plurality of original parameter features.
Consistent with the first embodiment, the original parameter features include, but are not limited to, drilling duration, drilling depth, drilling rate, drill bit wear, lithology data, formation data, porosity, permeability, saturation, downhole temperature and pressure, fluid resistivity, natural gamma rays, electrical logging, acoustic logging and nuclear logging data, drilling fluid parameters, drilling accident records, and core and cuttings samples.
In addition, the original parameter characteristics can also comprise water level, water quality, water temperature, recharging amount and the like of the recharging well.
S220, marking lost circulation data with lost circulation labels, marking non-lost circulation data with non-lost circulation labels, and randomly dividing a drilling history data set into training set data and test set data.
The lost circulation data and the non-lost circulation data are labeled for distinguishing the two types of data in space later. The data set is divided for the subsequent model construction and training and accuracy testing.
S230, after data standardization processing is carried out on the data in the drilling history data set, constructing a multidimensional random vector matrix.
Since principal component analysis is sensitive to the scale of the data, the data must be standardized so that each feature has a mean of 0 and a standard deviation of 1. Standardization helps eliminate the effect of different units of measurement on the results.
The data standardization processing process comprises the following steps:
Aligning the numerical data by adopting a nondimensionalization criterion. Nondimensionalization, also called standardization or normalization of data, is a commonly used data preprocessing method. Its main purpose is to eliminate the influence of units among different features or indices through data transformation, so that the data are comparable for further analysis.
Carrying out structural mapping processing on the text data. For example, according to rock type, loose rock, clastic rock and carbonate rock are mapped to the numbers 2, 4 and 6, respectively.
Carrying out normalization processing on the aligned numerical data and the structurally mapped text data, mapping them into the interval [0, 1]. For example, 2, 4 and 6 are mapped to 0, 0.5 and 1.
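The structural mapping and normalization described above can be sketched as follows, using the rock type example from the text. The dictionary name `ROCK_CODES` and the helper `min_max` are hypothetical, and min-max scaling is assumed as the normalization that maps 2, 4 and 6 to 0, 0.5 and 1.

```python
import numpy as np

# Structural mapping of text data to numbers, as in the rock type example.
ROCK_CODES = {"loose rock": 2, "clastic rock": 4, "carbonate rock": 6}

def min_max(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:                      # constant feature: avoid division by zero
        return np.zeros_like(values)
    return (values - values.min()) / span

rock_types = ["loose rock", "clastic rock", "carbonate rock"]
codes = [ROCK_CODES[r] for r in rock_types]   # [2, 4, 6]
scaled = min_max(codes)
print(scaled)                                 # [0.  0.5 1. ]
```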
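The multidimensional random vector matrix is expressed as:
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix} = (X_1, X_2, \ldots, X_p)$$
wherein $X$ represents the multidimensional random vector matrix, $x_{np}$ represents the value of the p-th parameter feature for the n-th sample, and $X_p$ represents the column vector corresponding to the p-th parameter feature. Each element in the matrix lies in the interval [0, 1].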
S240, calculating a covariance matrix of the multidimensional random vector matrix.
For the normalized data, the covariance matrix is calculated. The covariance matrix describes the correlation between features, where the diagonal elements are the variances of each feature and the off-diagonal elements are the covariances between the features.
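The covariance matrix is expressed as:
$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1p} \\ r_{21} & r_{22} & \cdots & r_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ r_{p1} & r_{p2} & \cdots & r_{pp} \end{pmatrix}$$
wherein $R$ represents the covariance matrix, $i$ represents the row index of the matrix and $j$ represents the column index of the matrix, and any element of the covariance matrix is calculated as:
$$r_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)$$
wherein $r_{ij}$ represents the element in row $i$ and column $j$ of the covariance matrix, $x_{ki}$ represents the element in row $k$ and column $i$ of the multidimensional random vector matrix (the value of the i-th parameter feature for the k-th sample), $\bar{x}_i$ represents the average value of the i-th column of the multidimensional random vector matrix, and $\bar{x}_j$ represents the average value of the j-th column of the multidimensional random vector matrix.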
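The covariance calculation can be sketched as follows. The sketch assumes that rows of the data matrix are samples, that columns are parameter features, and that the sample covariance is normalized by n - 1; if a divisor of n is intended, only that one line changes. The function name and toy matrix are hypothetical.

```python
import numpy as np

def covariance_matrix(X):
    """Covariance between feature columns of X (rows = samples, columns = features)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    centered = X - X.mean(axis=0)          # subtract each column's mean
    return centered.T @ centered / (n - 1) # assumed n - 1 normalization

# Toy normalized data (assumed): 3 samples, 2 features.
X = np.array([[0.0, 1.0],
              [0.5, 0.5],
              [1.0, 0.0]])
R = covariance_matrix(X)
print(R)
# [[ 0.25 -0.25]
#  [-0.25  0.25]]
```

The diagonal entries are the variances of the two features, and the negative off-diagonal entries show that the features move in opposite directions in this toy data.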
S250, carrying out eigendecomposition on the covariance matrix to obtain a group of eigenvalues and corresponding eigenvectors.
$$R = \lambda_1 a_1 a_1^T + \lambda_2 a_2 a_2^T + \cdots + \lambda_p a_p a_p^T$$
Here $a_1, a_2, \ldots, a_p$ are the eigenvectors of the covariance matrix, each of which defines one principal component as a linear combination of the original parameter features. The eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$ represent the importance (i.e., the variance) of each principal component, while the eigenvectors describe the directions of the principal components in the data.
S260, sorting the eigenvalues from largest to smallest and selecting the eigenvectors corresponding to the first k eigenvalues as principal components.
The selection criteria typically include an eigenvalue greater than 1, or a cumulative variance contribution rate reaching a certain value (e.g., above 80%).
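Both selection criteria can be sketched in a few lines. The function name `choose_k`, the default threshold and the eigenvalues below are hypothetical values used only for illustration.

```python
import numpy as np

def choose_k(eigenvalues, threshold=0.8):
    """Smallest k whose largest-k eigenvalues reach the cumulative variance threshold."""
    ordered = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    cumulative = np.cumsum(ordered) / ordered.sum()
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(ordered))        # guard against rounding at threshold = 1.0

eigenvalues = [2.5, 1.2, 0.2, 0.1]     # assumed eigenvalues of a covariance matrix

k_cumulative = choose_k(eigenvalues)               # cumulative shares: 0.625, 0.925, ...
k_kaiser = int(np.sum(np.asarray(eigenvalues) > 1))  # eigenvalue-greater-than-1 rule
print(k_cumulative, k_kaiser)  # 2 2
```

The two criteria agree here, but they need not in general; which one governs is a design choice the text leaves open.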
After principal component selection, principal component scores may also be calculated by projecting the raw data onto the selected principal component to obtain a score for each sample on each principal component. These scores may be used for subsequent analysis and processing, such as clustering, classification, and the like. Finally, the selected principal components can be analyzed and interpreted to understand the characteristics and information of the original data they represent.
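A minimal sketch of the eigendecomposition and score calculation is given below. It assumes the sample covariance computed by `numpy.cov`, uses `numpy.linalg.eigh` for the symmetric eigenproblem, and works on a made-up data matrix; the function names `pca_fit` and `pca_scores` are hypothetical.

```python
import numpy as np

def pca_fit(X, k):
    """Return the feature means, the k largest eigenvalues and their eigenvectors."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)            # columns are parameter features
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    return mean, eigvals[order], eigvecs[:, order]

def pca_scores(X, mean, components):
    """Project centered samples onto the selected principal components."""
    return (np.asarray(X, dtype=float) - mean) @ components

# Toy normalized data (assumed): 5 samples, 3 parameter features.
X = np.array([[0.1, 0.2, 0.9],
              [0.4, 0.5, 0.7],
              [0.5, 0.4, 0.5],
              [0.8, 0.9, 0.2],
              [0.9, 0.8, 0.1]])

mean, eigvals, components = pca_fit(X, k=2)
scores = pca_scores(X, mean, components)
print(scores.shape)  # (5, 2)
```

The variance of each score column equals the corresponding eigenvalue, which is the sense in which eigenvalues measure the importance of each principal component.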
S270, mapping the training set data into a multidimensional space according to the parameter features after dimension reduction.
Each lost circulation data point and non-lost circulation data point in the training set data is projected into the multidimensional space.
S280, constructing a support vector machine model, and separating lost circulation data and non-lost circulation data in the multidimensional space through lost circulation labels and non-lost circulation labels in the training set data.
According to the distribution of lost circulation data and non-lost circulation data in the multidimensional space, a classification hyperplane of lost circulation data and non-lost circulation data is obtained, a function expression of the classification hyperplane is obtained, and all lost circulation data are arranged on one side of the classification hyperplane and all non-lost circulation data are arranged on the other side of the classification hyperplane.
The distance between any lost circulation data point and all non-lost circulation data points is calculated, the non-lost circulation data point closest to the lost circulation data point is selected, the center point between the two is calculated, and the classification hyperplane is adjusted according to the center point so that the center point is located on the classification hyperplane.
Taking two-dimensional space as an example, if there are a lost circulation data point $(x_1, y_1)$ and a non-lost circulation data point $(x_2, y_2)$, the distance between them is
$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}.$$
For three-dimensional space, with a lost circulation data point $(x_1, y_1, z_1)$ and a non-lost circulation data point $(x_2, y_2, z_2)$, the distance between them is
$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}.$$
Point-to-point distances in higher-dimensional spaces are calculated by analogy.
The classification hyperplane can enable lost circulation data to be separated from non-lost circulation data.
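The distance calculation and the center point adjustment can be sketched as follows. The text does not say how the hyperplane is adjusted, so the sketch assumes the simplest option: the normal vector `w` is kept fixed and only the offset `b` is changed so that the hyperplane w.x + b = 0 passes through the center point. The function names and point values are hypothetical.

```python
import numpy as np

def nearest_center_point(lost_point, non_lost_points):
    """Distance to the nearest non-lost point and the midpoint between the two."""
    lost_point = np.asarray(lost_point, dtype=float)
    non_lost_points = np.asarray(non_lost_points, dtype=float)
    # Euclidean distance to every non-lost circulation point, in any dimension.
    distances = np.linalg.norm(non_lost_points - lost_point, axis=1)
    nearest = non_lost_points[np.argmin(distances)]
    return float(distances.min()), (lost_point + nearest) / 2.0

def offset_through(w, point):
    """Offset b such that the hyperplane w.x + b = 0 contains the given point."""
    return -float(np.dot(w, point))

lost = [0.0, 0.0]                       # assumed lost circulation data point
non_lost = [[3.0, 4.0], [6.0, 8.0]]     # assumed non-lost circulation data points

distance, center = nearest_center_point(lost, non_lost)
w = np.array([3.0, 4.0])                # assumed hyperplane normal
b = offset_through(w, center)
print(distance, center, b)  # 5.0 [1.5 2. ] -12.5
```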
S290, testing the accuracy of the model by using the test set data, and correcting the model.
Testing the accuracy of the model by using the test set data and correcting the model comprises the following steps:
mapping the test set data one by one into the multidimensional space according to the parameter features after dimension reduction, and calculating the accuracy of the model according to how the lost circulation data and non-lost circulation data in the test set data are distributed on the two sides of the classification hyperplane;
after the accuracy test, adding the test set data to the training set data and iteratively training the model.
After the model is corrected, current drilling data are collected in real time and mapped into the multidimensional space according to the parameter features after dimension reduction, and the data of the original parameter features are adjusted so that the current drilling data lie on the non-lost circulation side of the support vector machine model.
Because some of the parameter features are adjustable, such as drilling rate and other factors that can be controlled manually, these parameters can be adjusted in advance, before the drill bit reaches a given depth, so that the current drilling point is more likely to remain in a normal drilling state and a lost circulation state is avoided.
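One way to picture this adjustment is sketched below for a linear hyperplane. The sketch assumes that a positive score w.z + b means the lost circulation side, that the reduced coordinates are obtained as z = (x - mean) A with A the matrix of selected principal components, and that feature 0 is an adjustable quantity such as normalized drilling rate. The function name, all numbers and the safety `margin` are hypothetical, and physical limits on the adjusted parameter are not modeled.

```python
import numpy as np

def adjust_feature(x, mean, components, w, b, j, margin=0.1):
    """Shift original feature j until the sample scores at most -margin."""
    x = np.asarray(x, dtype=float).copy()
    w_orig = components @ w                  # hyperplane normal in original features
    score = float((x - mean) @ w_orig + b)
    if score <= -margin:                     # already on the non-lost side
        return x, score
    if w_orig[j] == 0:
        raise ValueError("feature j does not influence the score")
    x[j] -= (score + margin) / w_orig[j]     # exact for a linear score
    return x, float((x - mean) @ w_orig + b)

mean = np.zeros(3)                           # assumed feature means
components = np.array([[1.0, 0.0],           # assumed principal components
                       [0.0, 1.0],
                       [0.0, 0.0]])
w, b = np.array([1.0, 1.0]), -0.8            # assumed trained hyperplane

current = [0.9, 0.2, 0.3]                    # assumed current drilling data
adjusted, new_score = adjust_feature(current, mean, components, w, b, j=0)
print(adjusted, new_score)
```

In this toy case the first feature is lowered from 0.9 to about 0.5, which moves the score from about 0.3 to about -0.1, that is, onto the assumed non-lost circulation side.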
If the current drilling data indicate that lost circulation has occurred, the current drilling data are marked with a lost circulation label and added to the training data of the support vector machine model to retrain the model.
Meanwhile, the corresponding lost circulation point is located according to the current drilling data, and plugging material is added at the lost circulation point to plug it.
For the first spud section, 80 m³ of pre-hydrated bentonite slurry with a density of 1.05 g/cm³ is prepared before drilling. For the second spud section, polyacrylamide potassium salt is made up into a dilute solution and added to the bentonite slurry, and drilling proceeds after the slurry is stirred uniformly. To maintain the polymer content above 0.5%, 50-80 kg of polymer is added every 120 m of drilling or every 24 h. Solids control is strengthened: the vibrating screen is used for 100% of the drilling process, the utilization rate of the desander and the desilter reaches 85%, and the mud pit is cleaned promptly, so that the sand content and drill cuttings content of the mud are kept as low as possible. For the third and fourth spud sections, polyacrylamide potassium salt is likewise made up into a dilute solution, added to the bentonite slurry and stirred uniformly before drilling, and an anti-collapse lubricant is added to ensure wellbore stability and prevent collapse. Strictly controlling the bentonite content at a suitable level and strengthening solids control are the keys to controlling drilling mud performance. During drilling, the vibrating screen, desander and desilter are used and the mud pit is cleaned promptly, so that the sand content and drill cuttings content of the drilling mud are kept as low as possible.
This embodiment provides a method for preventing lost circulation in geothermal drilling. On the basis of the first embodiment, the main parameter features are analyzed by constructing a covariance matrix and a support vector machine model is established, so that the model separates lost circulation data from non-lost circulation data and lost circulation accidents can thereby be prevented.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.