CN118690271A

CN118690271A - An intelligent identification method for early warning signals of pipe sticking accidents under long-tail distribution

Info

Publication number: CN118690271A
Application number: CN202410737173.5A
Authority: CN
Inventors: 汪敏; 张浩洋; 乔豁通; 刘丽艳; 韩雄; 卢齐; 冯万富
Original assignee: Southwest Petroleum University
Current assignee: Southwest Petroleum University
Priority date: 2024-06-07
Filing date: 2024-06-07
Publication date: 2024-09-24

Abstract

The present invention discloses an intelligent method for identifying early warning signals of stuck pipe accidents under a long-tail distribution, and belongs to the field of artificial intelligence. The method comprises the following steps: step S10, collecting logging time series data, cleaning and labeling the data, and constructing a stuck pipe accident data set; step S20, building a channel-time series hybrid stuck pipe prediction network based on the stuck pipe accident data set; step S30, training the network with the loss function to fit the data, yielding a trained channel-time series hybrid stuck pipe prediction network model; step S40, predicting stuck pipe accidents with the trained model. The method automatically learns the latent patterns of the logging time series, detects changes in the precursor parameters of pipe sticking, and issues timely early warnings, thereby reducing the economic losses and safety hazards caused by stuck pipe accidents.

Description

An intelligent identification method for early warning signals of pipe sticking accidents under long-tail distribution

Technical Field

The present invention relates to a method for predicting stuck pipe accidents, and in particular to an intelligent method for identifying early warning signals of stuck pipe accidents under a long-tail distribution. Specifically, logging data are processed with the data processing method of the invention; a matching deep learning model learns the correlation between the features and the early warning signals of pipe sticking; and a loss function designed for the long-tail problem improves the recognition accuracy on minority-class samples. Identifying the early warning signals in this way achieves the goal of predicting stuck pipe accidents. The invention belongs to the field of artificial intelligence.

Background Art

Drilling is a core part of the petroleum industry. Pipe sticking is a drilling accident in which the drill string loses its freedom of movement while being tripped in or out of the hole: it meets an obstruction in some section of the wellbore and can neither be moved up or down nor rotated. Nearly one third of lost drilling time is caused by stuck pipe. A stuck pipe accident not only prolongs drilling time, causes the loss of drilling tools and may force the wellbore to be abandoned, but also seriously threatens the safety of the crew. The causes of pipe sticking are intricate, including formation characteristics and wellbore stability, and these factors are themselves interrelated in complex ways. This complexity and diversity make stuck pipe a highly challenging problem, so research on identifying and predicting stuck pipe accidents is of great significance for guiding drilling operations. At present, judgments about pipe sticking on site rely essentially on human experience, and no mature stuck pipe prediction software is available at the rig site.

Drilling operations are affected by complex geographical and environmental factors, so every drilling process is different, and the type of sticking that occurs varies accordingly. Typical types include differential pressure sticking, sand settling sticking and key seat sticking, of which differential pressure sticking is the most common. Differential pressure sticking requires three basic conditions: 1. the hydrostatic pressure of the drilling fluid column exceeds the formation pressure; 2. a permeable formation is present, with a thick mud cake on the borehole wall; 3. the drilling tool remains stationary for a period of time. We regard the drilling tool pressing against the mud cake as a gradual process rather than an instantaneous event. This process is the lead time of differential pressure sticking: the difference between the hydrostatic pressure and the pore pressure gradually increases, the mud cake gradually thickens, the pressure within the mud cake gradually falls, and the pipe finally becomes stuck. Other types of sticking likewise show precursor signals before they occur. If these early warning signals can be detected, the crew can respond in time and reduce losses.

However, stuck pipe accidents are low-probability events, so samples of their early warning signals are scarce, and a model trained on so few samples tends to recognize them poorly. The main current approach to small-sample recognition is resampling, which balances the sample counts of the classes directly. It brings some improvement but has two problems: 1. oversampling repeatedly draws the precursor signal class, which causes the model to overfit; 2. undersampling discards most of the normal-condition samples, which loses information and lowers model accuracy. It is therefore necessary to improve the recognition accuracy on stuck pipe precursor samples without altering the class distribution, so that the crew can act in time and prevent the pipe from sticking.

Summary of the invention

In view of the above problems, the present invention fuses multi-source on-site monitoring data and proposes a stuck pipe prediction method based on detecting stuck pipe precursor signals under a long-tail distribution. The method senses changes in each parameter, automatically discovers complex feature patterns, and detects stuck pipe precursor signals with higher accuracy and better generalization.

To achieve the above object, the technical solution adopted by the present invention is as follows:

S1. Obtain logging time series data from different work areas; normalize the logging data and fill in missing values; label the stuck pipe accidents using expert knowledge to obtain a set of labeled logging time series; and divide the logging time series into samples, which fall into two classes: early warning signal samples of stuck pipe accidents and normal samples.

S2. Construct the channel-time series hybrid stuck pipe prediction network, and train and test the model with cross-validation: when the data of a particular well are used for testing, that well's data are removed from the training set. The network outputs the probability that a logging time series sample is a precursor signal of a stuck pipe accident.

S3. Further optimize the network with a loss function designed for long-tail distributions, so that the network can still pick out stuck pipe samples when the positive and negative logging samples are imbalanced. A modulation coefficient is introduced so that the model pays more attention to the rare positive samples, i.e. the early warning signals of stuck pipe accidents.
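The evaluation scheme in step S2, where the well under test is excluded from training, amounts to a leave-one-well-out split. The following is a minimal sketch under stated assumptions: the `(well_id, features, label)` tuple layout, the well names and the function name `leave_one_well_out` are illustrative and do not come from the patent.

```python
from collections import defaultdict

def leave_one_well_out(samples):
    """Yield (held_out_well, train, test) splits.

    `samples` is a list of (well_id, features, label) tuples; this layout
    is an assumption made for illustration.
    """
    by_well = defaultdict(list)
    for sample in samples:
        by_well[sample[0]].append(sample)
    for well in sorted(by_well):
        test = by_well[well]
        # Every sample from the held-out well is removed from training.
        train = [s for w, group in by_well.items() if w != well for s in group]
        yield well, train, test

# Toy data: label 1 marks a precursor sample, label 0 a normal sample.
samples = [
    ("well_A", [0.1, 0.2], 0),
    ("well_A", [0.3, 0.1], 1),
    ("well_B", [0.5, 0.4], 0),
    ("well_C", [0.9, 0.8], 0),
]
splits = list(leave_one_well_out(samples))
```

Splitting by well rather than by sample keeps windows from the same well out of both sides of the split, so the test score reflects performance on a well the model has not seen.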

Further, step S1 is implemented as follows: obtain historical and real-time logging data from different work areas and store them as time series. The parameters include bit position, torque, standpipe pressure, and so on. Check the continuity of the data and fill in missing values; label stuck pipe events based on the drilling log.

S11. Missing values are filled with the K-nearest-neighbor imputation method.

S12. Based on the drilling log, the logging parameter time series covering the 15 minutes before a stuck pipe accident is divided into samples of three minutes each, which serve as stuck pipe early warning signal samples.

S13. Logging parameter time series recorded under normal operating conditions are drawn at random from the drilling database and likewise divided into samples of three minutes each, which serve as normal drilling signal samples.

S14. The engineering parameters related to stuck pipe accidents are selected using expert knowledge. The selected parameters are time, well depth, bit position, drilling time, hook height, weight on bit, rotary speed, casing pressure, standpipe pressure, inlet flow rate, and outlet flow rate.

Preferably, step S2 is implemented as follows:

S21. Determine the sampling interval parameter s. Uniformly downsample the original logging sequence to decompose it into s subsequences.

S22. Learn temporal features on each subsequence separately, using a two-layer fully connected network with the GELU activation function, to obtain the time-dimension feature representation.

S23. Recombine the s subsequences in their original order to fuse the temporal features.

S24. Learn the correlations between parameters with a simplified linear model to obtain the channel-dimension features.

S25. A residual connection layer deeply fuses the temporal features from step S23 with the channel features from step S24.

S26. A linear mapping layer maps the fused features to the classes (normal / precursor signal).

S27. A Softmax layer converts the output into the probability of a stuck pipe early warning signal.

Step S3 is implemented as follows:

S31. The model predicts, for each sample, the probability that it belongs to the stuck pipe class. The cross entropy between each sample's true label and the prediction is computed as the base loss.

S32. A modulation coefficient is introduced to construct the Focal Loss, which modulates the base loss.

S33. Through the chosen modulation factor, the Focal Loss reduces the loss contribution of easy samples that the model already classifies correctly.

S34. The Focal Loss is used as the final loss, and the model parameters are updated by gradient descent.
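To illustrate how the modulation in steps S32 and S33 shrinks the contribution of easy samples, here is a worked example. The values γ = 2 and α = 1 are assumptions chosen for readability; the patent does not state the values it uses.

```latex
\begin{aligned}
\text{easy sample, } p_t = 0.9:&\quad (1-p_t)^{\gamma} = 0.1^{2} = 0.01,
  &\quad -\ln 0.9 \approx 0.105 \;\Rightarrow\; 0.01 \times 0.105 \approx 0.0011 \\
\text{hard sample, } p_t = 0.3:&\quad (1-p_t)^{\gamma} = 0.7^{2} = 0.49,
  &\quad -\ln 0.3 \approx 1.204 \;\Rightarrow\; 0.49 \times 1.204 \approx 0.590
\end{aligned}
```

Under plain cross entropy the hard sample contributes roughly 11 times the loss of the easy one; after modulation the ratio is about 560, so the gradient is dominated by the samples the model still gets wrong.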

S4. Input the actual logging time series data, extract the specified parameter columns, and check for missing data; if any data are missing, fill in the missing values. The resulting prediction data set is fed to the model, which outputs the probability that each sample is a stuck pipe precursor signal.

Compared with the prior art, the present invention has the following beneficial effects:

(1) The invention treats the detection of stuck pipe precursor signals as equivalent to predicting the stuck pipe accident, which avoids the uncertainty introduced by parameter prediction;

(2) The invention combines logging time series data with a deep learning model, so that stuck pipe precursor signals can be identified accurately and an alarm raised in time;

(3) The invention effectively improves the recognition accuracy on the small-sample stuck pipe precursor class without causing overfitting or underfitting.

The present invention is a channel-time series hybrid method for detecting stuck pipe precursor signals. It senses the latent variation patterns of the individual logging parameters and the implicit correlations among them, and extracts the characteristics of stuck pipe accident data. Where the stuck pipe sample distribution is imbalanced, a long-tail distribution loss function is used to address the class imbalance.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings of the embodiments are briefly introduced below. The drawings described below relate only to some embodiments of the present invention and do not limit it.

FIG. 1 is a schematic flow diagram of the present invention;

FIG. 2 is a schematic diagram of the factorized time-channel mixing network of the present invention.

DETAILED DESCRIPTION

Specific embodiments of the present invention are described below so that those skilled in the art can understand the invention. It should be clear, however, that the invention is not limited to the scope of these embodiments. To a person of ordinary skill in the art, variations that fall within the spirit and scope of the invention as defined by the appended claims are obvious, and all inventions and creations that make use of the inventive concept are protected.

The channel-time series hybrid stuck pipe prediction method of the present invention comprises the following steps:

S1. Collect logging data from different work areas. Statistics for some of the parameters are shown in Table 1.

Table 1. Statistics of some parameters

After the data are loaded, they are checked for missing values, which the present invention fills with the k-nearest-neighbor method. First, the number of nearest neighbors k used for filling is chosen. Next, the distance between the record containing the missing value and every other record is computed, using the Euclidean distance or another distance measure:

$$d_{xy} = \sqrt{\sum_{j} (x_j - y_j)^2}$$

where $d_{xy}$ is the Euclidean distance between a record with missing values and a complete record, $x$ is the record with missing values, $y$ is the complete record, and $x_j$, $y_j$ are their values for the j-th parameter. Once the Euclidean distance from the incomplete record to every complete record has been computed, the k records closest to the incomplete record are selected. The missing value is filled with the mean or the weighted mean of these k nearest neighbors. Finally, the filled value is written to the position of the missing value.
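The k-nearest-neighbor filling procedure can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the distance is computed only over the parameters the incomplete record actually observes, uses the unweighted mean, and the function name `knn_impute` and the toy values are hypothetical.

```python
import math

def knn_impute(row, complete_rows, k=3):
    """Fill None entries in `row` with the mean of the k nearest complete rows.

    Assumption: the Euclidean distance uses only the columns observed in `row`.
    """
    observed = [j for j, v in enumerate(row) if v is not None]

    def distance(other):
        return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

    neighbours = sorted(complete_rows, key=distance)[:k]
    filled = list(row)
    for j, v in enumerate(row):
        if v is None:
            # Unweighted mean of the neighbours' values in the missing column.
            filled[j] = sum(n[j] for n in neighbours) / len(neighbours)
    return filled

# Toy records with three parameters; the third complete record is an outlier.
complete = [
    [1.0, 10.0, 100.0],
    [1.1, 11.0, 110.0],
    [5.0, 50.0, 500.0],
    [0.9, 9.0, 90.0],
]
filled = knn_impute([1.0, None, 100.0], complete, k=3)
```

With k = 3 the outlier record is never selected, so the filled value is the mean of the three nearby records. A weighted mean, which the text also permits, would replace the plain average with weights that decrease with distance.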

According to the on-site operation log, the data from the 15 minutes before each stuck pipe accident are labeled as stuck pipe precursor signals and divided into samples of three minutes each. Normal data samples are extracted with the same three-minute time span.
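The sample construction above can be sketched as a windowing step. The code below assumes timestamps in seconds and a record every 10 seconds; the patent does not state the sampling rate, and the function name `precursor_windows` is hypothetical.

```python
def precursor_windows(records, accident_time, lead=15 * 60, span=3 * 60):
    """Cut the `lead` seconds before an accident into `span`-second windows.

    `records` is a list of (timestamp_seconds, values) tuples sorted by time.
    """
    start = accident_time - lead
    windows = []
    for w_start in range(start, accident_time, span):
        w_end = w_start + span
        windows.append([values for t, values in records if w_start <= t < w_end])
    return windows

# Synthetic log: one record every 10 s for 20 minutes (assumed sampling rate).
records = [(t, [float(t)]) for t in range(0, 20 * 60, 10)]
windows = precursor_windows(records, accident_time=20 * 60)
```

Each accident therefore yields five precursor samples (15 minutes divided into 3-minute windows), which makes the scarcity of positive samples relative to normal samples concrete.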

For stuck pipe prediction, the overall trend of each parameter matters more than its specific values, and the parameters differ greatly in magnitude. Therefore, to speed up model convergence, each parameter is normalized to the range 0 to 1, which removes the influence of the parameters' different units. Let $x$ be the original value of a parameter and $x'$ the normalized value; the conversion formula is:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of that parameter.
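Scaling a parameter into the range 0 to 1 is commonly done with min-max normalization, sketched below per parameter column. The function name, the handling of a constant column and the torque values are assumptions for illustration.

```python
def min_max_normalize(column):
    """Scale one parameter column to [0, 1] using its own minimum and maximum."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no trend; map it to zeros to avoid dividing by zero.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

torque = [12.0, 15.0, 18.0, 30.0]   # toy values in arbitrary units
scaled = min_max_normalize(torque)
```

In a train/test setting the minimum and maximum would normally be taken from the training data and reused for test data, so that no information from the test well leaks into preprocessing.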

The method is described in detail below with reference to the accompanying drawings. FIG. 1 shows the overall framework of the network.

S2. Construct the factorized time-channel mixing network to model how each logging parameter varies and how the parameters are correlated. Features are extracted along the time dimension and the spatial (channel) dimension separately, yielding the mapping between the features and the stuck pipe precursor signal.

Step S2 is described in detail below with reference to FIG. 2.

The structure of the factorized time-channel mixing prediction network is shown in FIG. 2. It consists of two parts: temporal interaction and channel mixing.

First, a temporal interaction module is designed to fully capture the subtle variations of the different parameters.

The original logging time series is downsampled into s interleaved subsequences; each subsequence learns its time-dimension features separately through a linear layer; and finally the features of the subsequences are spliced back together in the original order, which avoids the loss of features that the decomposition would otherwise cause. Let the input sequence be $X_h$; the interleaved sampling then proceeds as follows:

1. Uniform downsampling: $X_h$ is downsampled at interval s into s subsequences:

$$X_{h,i} = X_h[i-1::s,\,:], \quad 1 \le i \le s \quad (2)$$

where i indexes the i-th subsequence and [::s] denotes taking one sample every s steps. The value of s is generally chosen empirically.

2. Feature learning: a temporal feature extractor is applied to each subsequence $X_{h,i}$ separately. Linear interactions are used here to capture the temporal patterns of stuck pipe data, with a multilayer perceptron with two hidden layers serving as the feature extraction module. Limiting the network to two hidden layers reduces the risk of overfitting. The high-level features extracted by the multilayer perceptron give the subsequent time series prediction task a more discriminative input and improve the prediction accuracy of the model. The GELU activation function is used to avoid the situation in which an activation unit outputs zero for a long time when its input is negative. Learning the parameters of this model yields the time-dimension feature representation $X_{h,i}'$ as given by Eq. (3), whose combining operator is addition along the column direction.

3. Feature aggregation: the features $X_{h,i}'$ of the s subsequences obtained from Eq. (3) are re-aggregated in the original order to give the new time series representation $X_h^*$:

$$X_h^* = [X_{h,1}'[0],\, X_{h,2}'[0],\, \ldots,\, X_{h,1}'[1],\, X_{h,2}'[1],\, \ldots] \quad (4)$$

After this processing, the original redundant sequence has been downsampled into several subsequences, features have been learned on each subsequence separately, and the subsequence features have been re-aggregated in the original order, which reduces redundancy while encoding the time-dimension information. Adjusting the downsampling step s controls the degree of downsampling and thus the balance between information content and redundancy.
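The three steps (interleaved split, per-subsequence feature learning, re-aggregation in the original order) can be sketched in plain Python. This is an illustration under assumptions, not the patented network: the extractor is a small two-layer perceptron with GELU and random, untrained weights, the layer sizes are arbitrary, and the exact form of Eq. (3) is not reproduced.

```python
import math
import random

def gelu(v):
    # Exact GELU: v * Phi(v), with Phi the standard normal CDF.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def linear(vec, weight, bias):
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]

def make_mlp(length, hidden, rng):
    """Return a function mapping a length-`length` series to the same length."""
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(length)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(length)]
    b1, b2 = [0.0] * hidden, [0.0] * length
    return lambda vec: linear([gelu(h) for h in linear(vec, w1, b1)], w2, b2)

def temporal_mix(x, s, extractors):
    """x: T rows (time) by C channels, with T divisible by s."""
    subs = [x[i::s] for i in range(s)]            # Eq. (2): x[i-1::s] for i = 1..s
    mixed = []
    for sub, extract in zip(subs, extractors):
        columns = list(zip(*sub))                 # one short series per channel
        new_columns = [extract(list(col)) for col in columns]
        mixed.append([list(row) for row in zip(*new_columns)])
    # Eq. (4): step j of subsequence i returns to position j * s + i.
    return [mixed[i][j] for j in range(len(x) // s) for i in range(s)]

rng = random.Random(0)
x = [[float(t), float(2 * t)] for t in range(6)]   # 6 time steps, 2 channels
round_trip = temporal_mix(x, 2, [lambda col: col] * 2)
features = temporal_mix(x, 2, [make_mlp(3, 4, rng) for _ in range(2)])
```

With identity extractors the split followed by the merge returns the input unchanged, which confirms that the interleaving in Eq. (2) and the ordering in Eq. (4) are inverses of each other.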

Second, a channel mixing module is designed to capture the correlations between channels.

We further wish to capture the relationships among parameters such as torque and rate of penetration, and therefore design a channel mixing capture module.

The correlation between channels is expressed by Eq. (5). In Eq. (5), one term represents noise, one represents the channel dependency (correlation) after denoising, and two further terms represent the decomposed channel interactions. Eq. (5) resembles a linear expression, so the channel dependency of the time series data can be learned with a linear model, Eq. (6), in which σ(·) is the activation function.

In this way the correlations between channels are extracted in a minimal manner, which lowers both the time complexity and the space complexity of the model.

Concretely, the original data pass through a transposition, a linear projection layer and the GELU activation function to give the representation of the correlations between parameters. The transposition makes the entire sequence of each parameter a feature channel. The linear projection layer maps the features to a lower-dimensional space in order to compress them and reduce redundant components. The GELU activation together with a linear projection layer performs the dimension transformation that keeps the final output dimension consistent with the input.
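The compress-activate-restore pattern of the channel mixing module can be sketched as below. Several points are assumptions: the projection is applied across the channel axis at every time step (so the transposition is implicit), the bottleneck width of 2 is arbitrary, the weights are random rather than trained, and the names `channel_mix`, `w_down` and `w_up` are hypothetical.

```python
import math
import random

def gelu(v):
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def matvec(weight, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]

def channel_mix(x, w_down, w_up):
    """x: T rows by C channels. Each row is projected C -> m -> C."""
    mixed = []
    for row in x:
        hidden = [gelu(h) for h in matvec(w_down, row)]  # compress to m dimensions
        mixed.append(matvec(w_up, hidden))               # restore to C dimensions
    return mixed

rng = random.Random(0)
n_channels, bottleneck = 4, 2
w_down = [[rng.uniform(-0.5, 0.5) for _ in range(n_channels)] for _ in range(bottleneck)]
w_up = [[rng.uniform(-0.5, 0.5) for _ in range(bottleneck)] for _ in range(n_channels)]
x = [[0.1, 0.4, 0.3, 0.9], [0.2, 0.5, 0.3, 0.8], [0.3, 0.6, 0.2, 0.7]]
channel_features = channel_mix(x, w_down, w_up)
```

Because the bottleneck is narrower than the number of channels, the module needs 2·C·m weights instead of C², which is one way to read the claimed reduction in time and space complexity.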

A residual connection then fuses the temporal mixing features and the channel mixing features more deeply. A linear mapping maps the original input length to the target prediction length, which avoids the error accumulation of iterative prediction. At this point the encoded representation of the parameter space has been obtained.

Finally, a linear layer and a Softmax layer give the probabilities of the stuck pipe and non-stuck classes, and the index of the larger probability is output as the result. This completes the detection of stuck pipe precursor signal samples.
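The fusion and classification head can be sketched as follows. Assumptions: the residual fusion is an element-wise sum of two feature maps of equal shape, the fused map is flattened before the linear layer, class index 0 means normal and index 1 means precursor, and the weights shown are hand-picked toy values.

```python
import math

def softmax(logits):
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(temporal_feat, channel_feat, head_weight, head_bias):
    # Residual fusion: element-wise sum of the two feature maps.
    fused = [[a + b for a, b in zip(r1, r2)]
             for r1, r2 in zip(temporal_feat, channel_feat)]
    flat = [v for row in fused for v in row]
    logits = [sum(w * v for w, v in zip(row, flat)) + b
              for row, b in zip(head_weight, head_bias)]
    probs = softmax(logits)
    # Output the index of the larger probability.
    return probs, max(range(len(probs)), key=probs.__getitem__)

temporal_feat = [[1.0, 0.0], [0.0, 1.0]]
channel_feat = [[0.5, 0.5], [0.5, 0.5]]
head_weight = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
probs, label = classify(temporal_feat, channel_feat, head_weight, [0.0, 0.0])
```

Here the logits are 1.5 and 0.5, so the first class receives the larger probability and `label` is 0.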

S3. Because stuck pipe event samples make up an extremely small share of all samples, the Focal Loss function is adopted to improve the network's ability to identify stuck pipe samples, which ultimately enables prediction and early warning of stuck pipe accidents.

The Focal Loss function is:

$$FL(p_t) = -\alpha\,(1 - p_t)^{\gamma}\,\log(p_t) \quad (7)$$

where $p_t$ is the probability the model assigns to the sample's true class (for a positive sample, the predicted probability of the positive class), so $1 - p_t$ is the probability that the model's prediction is wrong. α is a weighting coefficient and γ is the modulation factor: as γ increases, the loss function focuses more on the hard-to-classify stuck pipe samples.
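A minimal sketch of the focal loss for a single binary sample is shown below, written with the leading minus sign of the standard formulation so that the loss is non-negative. The defaults α = 0.25 and γ = 2 are common choices in the focal loss literature, not values stated in the patent, and the function name is illustrative.

```python
import math

def focal_loss(p_pos, label, alpha=0.25, gamma=2.0):
    """Focal loss for one sample.

    p_pos: predicted probability of the positive (precursor) class.
    label: 1 for a precursor sample, 0 for a normal sample.
    """
    p_t = p_pos if label == 1 else 1.0 - p_pos   # probability of the true class
    p_t = min(max(p_t, 1e-12), 1.0)              # guard against log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.95, 1)   # confident and correct: small loss
hard = focal_loss(0.30, 1)   # precursor sample the model mostly misses
plain_ce = focal_loss(0.30, 1, alpha=1.0, gamma=0.0)
```

Setting γ = 0 and α = 1 recovers ordinary cross entropy, which shows that the factor $(1 - p_t)^{\gamma}$ is the only thing that separates the two losses.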

To optimize model performance and prevent overfitting, the model is trained with an early stopping strategy. Early stopping terminates the iterations based on validation set performance and aims to improve the generalization ability of the model. At the end of each training epoch, the model is evaluated on an independent validation data set. If the validation performance (such as the loss value or the accuracy) shows no significant improvement over several consecutive epochs, training is terminated early.
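The stopping rule can be sketched as a loop over per-epoch validation losses. The `patience` and `min_delta` parameters and the loss values are assumptions for illustration; the patent says only "several consecutive epochs" and "no significant improvement".

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs fail to improve the
    best loss by more than `min_delta`.
    """
    best, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1

losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.60, 0.63, 0.5]
best_epoch, stop_epoch = early_stopping_epoch(losses, patience=3)
```

In this toy run the best loss is reached at epoch 2 and training stops at epoch 5, so the later drop to 0.5 is never seen; a larger `patience` trades extra training time for the chance to catch such late improvements.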

S4. Input the test data samples and check them for missing data; if data are missing, fill them in, and then feed the complete test data to the model. The output is the stuck pipe precursor signal identification result, i.e. the probability that the sample is a stuck pipe precursor signal.
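The prediction step can be sketched as a thin wrapper that fills gaps only when they exist and then calls the model. Both callables below are stand-ins: `fill_with_column_mean` replaces the k-nearest-neighbor filling for brevity, and `toy_model` is a placeholder scorer, not the trained network.

```python
import math

def predict_precursor(sample, impute, model):
    """Return P(precursor) for one window, filling missing values first if needed."""
    has_missing = any(v is None for row in sample for v in row)
    if has_missing:
        sample = impute(sample)
    return model(sample)

def fill_with_column_mean(sample):
    # Stand-in imputer: replace None with the mean of the observed column values.
    means = []
    for col in zip(*sample):
        seen = [v for v in col if v is not None]
        means.append(sum(seen) / len(seen))
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in sample]

def toy_model(sample):
    # Placeholder scorer: mean of all values squashed into (0, 1).
    flat = [v for row in sample for v in row]
    return 1.0 / (1.0 + math.exp(-sum(flat) / len(flat)))

window = [[0.2, None], [0.4, 0.6]]
probability = predict_precursor(window, fill_with_column_mean, toy_model)
```

Passing the imputer and the model as arguments keeps the wrapper independent of how either one is implemented, so the same call works for historical test data and for real-time windows.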

The above description does not limit the present invention in any form. Although the invention has been disclosed through the above embodiments, they are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the invention, use the technical content disclosed above to make changes or modifications that result in equivalent embodiments. Any simple modification, equivalent change or modification made to the above embodiments according to the technical essence of the invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the present invention.

Claims

1. A method for intelligently identifying early warning signals of stuck pipe accidents under a long-tail distribution, the method being implemented by a computer and comprising:

S1: obtaining initial samples, the initial samples being logging data;

inputting the logging data into a channel-time series hybrid stuck pipe precursor signal detection network model, the network model including a factorized time-channel mixing prediction network and outputting the probability that a sample is an early warning signal of a stuck pipe accident;

S2: training the channel-time series hybrid stuck pipe precursor signal detection model, wherein the method predicts stuck pipe accidents by identifying their early warning signals, captures the nonlinear characteristics of stuck pipe accidents and the correlations among them from the time and channel dimensions of the logging data, and trains the model with a loss function designed for the scarcity of stuck pipe accidents, so as to identify the early warning signals and thereby provide early warning of stuck pipe accidents;

S3: training the network with the long-tail loss function to fit the data, and obtaining a trained channel-time series hybrid stuck pipe prediction network model;

S4: predicting stuck pipe accidents using the channel-time series hybrid stuck pipe precursor signal detection network.

2. The method for intelligently identifying early warning signals of stuck pipe accidents under a long-tail distribution according to claim 1, characterized in that step S1 comprises: obtaining logging time series data from different work areas; normalizing the logging data and filling in missing values; labeling the stuck pipe accidents using expert knowledge to obtain a set of labeled logging time series; and dividing the logging time series into samples, which comprise early warning signal samples of stuck pipe accidents and normal samples.

3. The method for intelligently identifying early warning signals of stuck pipe accidents under a long-tail distribution according to claim 1, characterized in that step S2 comprises: constructing the channel-time series hybrid stuck pipe prediction network; and training and testing the model with cross-validation, wherein, when the data of a particular well are used for testing, that well's data are removed from the training set; the probability that a logging time series sample is a precursor signal of a stuck pipe accident is thereby obtained.

4. The method for intelligently identifying early warning signals of stuck pipe accidents under a long-tail distribution according to claim 1, characterized in that step S3 comprises: further optimizing the network with a loss function designed for long-tail distributions, so that stuck pipe samples can be identified despite the imbalance between positive and negative samples in the logging data; and introducing a modulation coefficient so that the model pays more attention to the rare positive samples of stuck pipe early warning signals.