
CN118690271A - An intelligent identification method for early warning signals of pipe sticking accidents under long-tail distribution - Google Patents

Info

Publication number
CN118690271A
Authority
CN
China
Prior art keywords
stuck
drill
data
time series
accidents
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410737173.5A
Other languages
Chinese (zh)
Inventor
汪敏
张浩洋
乔豁通
刘丽艳
韩雄
卢齐
冯万富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Petroleum University
Priority to CN202410737173.5A
Publication of CN118690271A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G06F18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02 Agriculture; Fishing; Forestry; Mining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2123/00 Data types
    • G06F2123/02 Data types in the time domain, e.g. time-series data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G06F2218/04 Denoising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Primary Health Care (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mining & Mineral Resources (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Animal Husbandry (AREA)
  • Agronomy & Crop Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an intelligent method for identifying early warning signals of stuck pipe accidents under a long-tail distribution, and belongs to the field of artificial intelligence. The method comprises the following steps: step S10, collecting mud-logging time series data, cleaning and labeling the data, and constructing a stuck pipe accident dataset; step S20, building a channel-temporal hybrid stuck pipe prediction network from the dataset; step S30, training the network with a long-tail loss function to fit the data, obtaining a trained channel-temporal hybrid stuck pipe prediction network model; step S40, predicting stuck pipe accidents with the trained model. The method automatically learns latent patterns in logging time series, detects changes in the precursor parameters of stuck pipe, and issues timely warnings, reducing the economic losses and safety hazards caused by stuck pipe accidents.

Description

An intelligent identification method for early warning signals of pipe sticking accidents under long-tail distribution

Technical Field

The invention relates to a method for predicting stuck pipe accidents, and in particular to an intelligent method for identifying early warning signals of stuck pipe accidents under a long-tail distribution. Specifically, mud-logging data are processed with the data processing method of the invention; a corresponding deep learning model learns the relationship between the features and the early warning signals of stuck pipe; and a loss function designed for the long-tail problem improves recognition accuracy on minority-class samples. Recognizing the early warning signals in this way achieves stuck pipe accident prediction. The invention belongs to the field of artificial intelligence.

Background Art

Drilling is a core part of the petroleum industry. Stuck pipe is a drilling accident in which the drill string loses its freedom of movement while tripping in or out of the hole: the string can be neither moved up and down nor rotated, and is held at some section of the wellbore. Nearly one third of lost drilling time is caused by stuck pipe. Stuck pipe accidents not only prolong drilling, cause loss of drilling tools, and lead to abandonment of the wellbore, but also seriously threaten the safety of operating personnel. The causes are intricate, including formation characteristics and wellbore stability, and these factors are themselves interrelated in complex ways. The complexity and diversity of stuck pipe make it a highly challenging problem, so research on identifying and predicting stuck pipe accidents is of great value in guiding drilling operations. At present, stuck pipe is judged on site essentially from human experience, and no complete stuck pipe prediction software is in use at drilling sites.

Drilling operations are affected by complex geographical and environmental factors, and every drilling process is different. Under such varied conditions, the types of stuck pipe accident also differ. Typical types include differential sticking, sand-settling sticking, and key-seat sticking, of which differential sticking is the most common. Differential sticking has three basic conditions: 1. the hydrostatic pressure of the drilling fluid column exceeds the formation pressure; 2. a permeable formation is present, with a thick mud cake on the borehole wall; 3. the drilling tool remains stationary for a period of time. The drilling tool coming into tight contact with the mud cake is regarded as a process rather than an instantaneous event. This process is the lead time of differential sticking: the difference between hydrostatic pressure and pore pressure gradually increases, the mud cake gradually thickens, the pressure within the mud cake gradually falls, and the pipe finally sticks. Other types of stuck pipe accident also show warning signals before they occur. If these early warning signals can be detected, the site can respond in time and reduce losses.

However, stuck pipe accidents are low-probability events, so early warning signal samples are scarce, and the small number of learnable samples leads to poor recognition performance. The main current approach to small-sample recognition is resampling, which balances the sample counts between classes directly. It gives some improvement but has the following problems: 1. oversampling repeatedly samples the precursor signal class, which causes the model to overfit; 2. undersampling discards most samples of normal operating conditions, which loses information and lowers model accuracy. It is therefore necessary to improve the recognition accuracy of stuck pipe precursor samples without distorting the class distribution, so that the site can respond in time and stuck pipe can be avoided.

Summary of the Invention

To address the above problems, the invention fuses multi-source field monitoring data and proposes a stuck pipe prediction method based on precursor signal detection under a long-tail distribution. The method senses changes in each parameter, discovers complex feature patterns automatically, and detects stuck pipe precursor signals with higher accuracy and better generalization.

To achieve this, the invention adopts the following technical solution:

S1. Acquire logging time series data from different work areas; normalize the logging data and fill in missing values; label stuck pipe accidents using expert knowledge to obtain labeled logging time series; divide the logging time series into samples, classed as early warning signal samples and normal samples.

S2. Construct the channel-temporal hybrid stuck pipe prediction network, and train and test the model with cross-validation: when the data of a particular well are used for testing, that well's data are removed from the dataset during training. The network outputs the probability that a logging time series sample is a precursor signal of a stuck pipe accident.

S3. Further optimize the network with a loss function for long-tail distributions, so that the network can still learn from stuck pipe samples when positive and negative logging samples are imbalanced. A modulating factor is introduced so that the model focuses more on the rare positive samples of early warning signals.

Further, S1 is implemented as follows: historical and real-time logging data from different work areas are acquired and stored as time series. The parameters include bit position, torque, standpipe pressure, and others. Data continuity is checked and missing values are filled; stuck pipe labels are assigned from the drilling logs.

S11. Missing values are filled by K-nearest-neighbor imputation.

S12. Based on the drilling logs, the logging parameter time series for the 15 minutes before each stuck pipe accident is divided into samples of three minutes each, which serve as early warning signal samples.

S13. Logging parameter time series under normal operating conditions are drawn at random from the drilling database and likewise divided into three-minute samples, which serve as normal samples.

S14. Engineering parameters related to stuck pipe accidents are selected using expert knowledge. The selected parameters are time, well depth, bit position, drilling time, hook height, weight on bit, rotary speed, casing pressure, standpipe pressure, inlet flow rate, and outlet flow rate.

Preferably, S2 is implemented in the following steps:

S21. Determine the sampling interval s. Uniformly downsample the original logging sequence, decomposing it into s subsequences.

S22. Learn temporal features on each subsequence separately, using a two-layer fully connected network with GELU activation, to obtain the temporal feature representation.

S23. Recombine the s subsequences in their original order, fusing the temporal features.

S24. Learn the relationships between parameters with a simplified linear model to obtain channel features.

S25. A residual connection layer fuses the temporal features from step S23 with the channel features from step S24.

S26. A linear mapping layer maps the fused features to the classes (normal / precursor signal).

S27. A Softmax layer converts the output into the probability of a stuck pipe precursor signal.

S3 is implemented in the following steps:

S31. The model predicts, for each sample, the probability that it belongs to the stuck pipe class. The cross entropy between each sample's true label and the prediction is computed as the base loss.

S32. A modulating factor is introduced to construct the Focal Loss, which modulates the base loss.

S33. Through the modulating factor, Focal Loss reduces the loss contribution of easy samples that the model already classifies correctly.

S34. Focal Loss is used as the final loss, and the model parameters are updated by gradient descent.

S4. Input actual logging time series data, extract the specified parameter columns, and check for missing data; if any data are missing, fill the missing values. The resulting prediction dataset is fed to the model, which outputs the probability that each sample is a stuck pipe precursor signal.
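The cross-validation rule in step S2, where the well under test is removed from the training data, amounts to a leave-one-well-out split. The sketch below illustrates that idea only; the record layout and the `well_id` field are hypothetical, since the source does not specify how samples are stored.

```python
from collections import defaultdict

def leave_one_well_out(samples):
    """Yield (held_out_well, train, test) so that no well appears in both sets.

    `samples` is a list of dicts carrying a hypothetical "well_id" key.
    """
    by_well = defaultdict(list)
    for sample in samples:
        by_well[sample["well_id"]].append(sample)
    for well in sorted(by_well):
        test = by_well[well]
        # Training data is everything that does not come from the held-out well.
        train = [s for w, group in by_well.items() if w != well for s in group]
        yield well, train, test

samples = [
    {"well_id": "A", "label": 0},
    {"well_id": "A", "label": 1},
    {"well_id": "B", "label": 0},
    {"well_id": "C", "label": 0},
]
folds = list(leave_one_well_out(samples))
```

Splitting by well rather than by sample keeps windows from the same well out of both sets, so the test score reflects performance on an unseen well.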

Compared with the prior art, the invention has the following beneficial effects:

(1) The invention treats detection of stuck pipe precursor signals as equivalent to stuck pipe accident prediction, avoiding the uncertainty introduced by parameter forecasting;

(2) The invention combines logging time series data with a deep learning model, so that precursor signals can be identified accurately and alarms raised in time;

(3) The invention improves recognition accuracy for stuck pipe precursor signals when samples are scarce, without causing overfitting or underfitting.

The invention is a channel-temporal hybrid method for detecting precursor signals of stuck pipe accidents. It captures both the latent variation patterns of individual logging parameters and the implicit relationships among them, and extracts the features of stuck pipe accident data. Where stuck pipe sample data are imbalanced, a long-tail loss function is used to address the class imbalance.

Brief Description of the Drawings

To illustrate the technical solutions of the embodiments more clearly, the drawings are briefly introduced below. The drawings described relate only to some embodiments of the invention and do not limit it.

Fig. 1 is a flow diagram of the invention;

Fig. 2 is a schematic diagram of the factorized temporal and channel mixing network of the invention.

Detailed Description

Specific embodiments of the invention are described below so that those skilled in the art can understand it. The invention is not limited to the scope of these embodiments: to a person of ordinary skill in the art, variations within the spirit and scope of the invention as defined by the appended claims are obvious, and all inventions that make use of the inventive concept are protected.

The channel-temporal hybrid stuck pipe prediction method of the invention comprises the following steps:

S1. Collect logging data from different work areas. Statistics for some of the parameters are shown in Table 1.

Table 1. Statistics of some parameters

After the data are loaded, they are checked for missing values, which the invention fills using the k-nearest-neighbor method. First, the number of nearest neighbors k is chosen. Next, the distance between the sample containing the missing value and each other sample is computed, using the Euclidean distance or another distance metric:

$$d_{xy} = \sqrt{\sum_{j} (x_j - y_j)^2}$$

where $d_{xy}$ is the Euclidean distance between a sample with missing values and a complete sample, $x$ is the sample with missing values, $y$ is the complete sample, and $j$ runs over the parameters observed in $x$. After the distances to all complete samples have been computed, the k samples nearest to the incomplete sample are selected, and their mean or weighted mean is used to fill the missing value. Finally, the filled value is written to the position of the missing value.
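The imputation procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation: the function name `knn_impute`, the default `k`, and the choice of the unweighted mean are assumptions, and distances are computed only over the columns that are observed in the incomplete row.

```python
import numpy as np

def knn_impute(data, k=3):
    """Fill NaNs in each incomplete row from its k nearest complete rows.

    Distance is Euclidean over the columns observed in the incomplete row;
    the fill value is the unweighted mean of the neighbours.
    """
    data = np.asarray(data, dtype=float)
    filled = data.copy()
    complete = data[~np.isnan(data).any(axis=1)]
    if len(complete) == 0:
        raise ValueError("need at least one complete row")
    for r in np.where(np.isnan(data).any(axis=1))[0]:
        missing = np.isnan(data[r])
        diffs = complete[:, ~missing] - data[r, ~missing]
        dist = np.sqrt((diffs ** 2).sum(axis=1))
        nearest = np.argsort(dist)[:k]
        filled[r, missing] = complete[nearest][:, missing].mean(axis=0)
    return filled

rows = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [9.0, 90.0, 900.0],
    [1.5, np.nan, 150.0],
])
imputed = knn_impute(rows, k=2)
```

Because the distance sums squared differences across columns, the columns need comparable scales for the neighbour choice to be meaningful.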

According to the on-site drilling logs, the data from the 15 minutes before each stuck pipe accident are labeled as stuck pipe precursor signals and divided into samples of three minutes each. Normal data samples are extracted with the same three-minute span.
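As a rough sketch of this labeling rule, the helper below computes the row ranges of the precursor windows before an incident. It assumes non-overlapping windows and a fixed sampling period; the 5-second period in the example and the function name are hypothetical, since the source states neither.

```python
def precursor_windows(incident_index, sample_period_s, lead_s=15 * 60, window_s=3 * 60):
    """Return (start, end) row-index pairs covering the lead time before an incident.

    Indices are half-open [start, end) and the last window ends at the incident.
    """
    rows_per_window = window_s // sample_period_s
    n_windows = lead_s // window_s
    first = incident_index - n_windows * rows_per_window
    if first < 0:
        raise ValueError("not enough history before the incident")
    return [
        (first + i * rows_per_window, first + (i + 1) * rows_per_window)
        for i in range(n_windows)
    ]

# Hypothetical example: incident at row 1000 of a series sampled every 5 seconds.
windows = precursor_windows(incident_index=1000, sample_period_s=5)
```

Normal samples would be cut with the same `window_s` from periods far from any incident, so both classes share one input length.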

For stuck pipe prediction, the overall trend of each parameter matters more than its specific values, and the parameters differ widely in magnitude. To speed up model convergence, each parameter is therefore normalized to the range 0 to 1, which removes the effect of differing units. Let the original value of a parameter be $x$ and the normalized value be $x'$; the conversion formula is:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of that parameter.
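A minimal sketch of scaling each parameter to the range 0 to 1, assuming min-max scaling applied per column. The handling of constant columns (mapped to 0) is an illustrative choice, not something the source specifies.

```python
import numpy as np

def min_max_normalize(data, eps=1e-12):
    """Scale each column (parameter) of `data` to the range [0, 1]."""
    data = np.asarray(data, dtype=float)
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    # Constant columns have zero span; divide by 1 so they map to 0 instead of NaN.
    span = np.where(hi - lo > eps, hi - lo, 1.0)
    return (data - lo) / span

raw = np.array([[100.0, 5.0], [150.0, 5.0], [200.0, 5.0]])
scaled = min_max_normalize(raw)
```

In a training setup, the minimum and maximum would normally be taken from the training wells only and reused for the test well, so that no test information leaks into the scaling.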

The method is described in detail below with reference to the drawings. Fig. 1 shows the overall framework of the network.

S2. Construct the factorized temporal and channel mixing network to model both how each logging parameter varies and how the parameters relate to one another. Features are extracted separately along the temporal and spatial (channel) dimensions, yielding the mapping between the features and the stuck pipe precursor signal.

Step S2 is described in detail below with reference to Fig. 2.

The structure of the factorized temporal and channel mixing prediction network is shown in Fig. 2. It consists of two parts: temporal interaction and channel mixing.

First, a temporal interaction module is designed to capture the subtle variations of the different parameters.

The original logging time series is downsampled into s interleaved subsequences. Each subsequence learns temporal features on its own through a linear layer, and the subsequence features are then spliced back together in the original order, which avoids the feature loss that decomposition would otherwise cause. Let the input sequence be $X_h$, with rows indexing time steps and columns indexing logging parameters. Interleaved sampling then proceeds as follows:

1. Uniform downsampling: downsample $X_h$ with stride s, dividing it into s subsequences:

$$X_{h,i} = X_h[\,i-1::s,\ :\,], \quad 1 \le i \le s \tag{2}$$

where i indexes the i-th subsequence and [::s] means taking one sample every s steps. The value of s is generally chosen empirically.

2. Feature learning: a temporal feature extractor is applied to each subsequence $X_{h,i}$ separately. Linear interaction is used to capture the temporal patterns of stuck pipe data, with a multilayer perceptron containing two hidden layers as the feature extraction module. Limiting the network to two hidden layers reduces the risk of overfitting. The higher-level features extracted by the perceptron give the subsequent time series prediction task a more discriminative input and improve prediction accuracy. The GELU activation function is used so that units do not output zero persistently when their inputs are negative. The parameters of the linear model are learned to obtain the temporal feature representation $X'_{h,i}$ of each subsequence, as given by formula (3).

In formula (3), the addition operator denotes addition along the column direction.

3. Feature aggregation: the features $X'_{h,i}$ of the s subsequences, obtained from formula (3), are re-aggregated in their original order to give a new time series representation $X_h^{*}$:

$$X_h^{*} = [\,X'_{h,1}[0],\ X'_{h,2}[0],\ \ldots,\ X'_{h,1}[1],\ X'_{h,2}[1],\ \ldots\,] \tag{4}$$

After this processing, the original redundant sequence has been downsampled into several subsequences, features have been learned on each subsequence separately, and the subsequence features have been re-aggregated in the original order, which reduces redundancy while encoding temporal information. Adjusting the downsampling stride s controls the degree of downsampling, balancing information content against redundancy.
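The interleaved split of formula (2) and the re-aggregation of formula (4) map directly onto strided slicing. The sketch below shows only these two steps, with the per-subsequence feature extractor left out (an identity), so that the round trip can be checked; the function names are illustrative.

```python
import numpy as np

def split_interleaved(x, s):
    """Formula (2): subsequence i (1-based) is x[i-1::s, :]."""
    return [x[i::s, :] for i in range(s)]

def merge_interleaved(parts, length):
    """Formula (4): put each subsequence's rows back at their original positions."""
    s = len(parts)
    out = np.empty((length, parts[0].shape[1]), dtype=parts[0].dtype)
    for i, part in enumerate(parts):
        out[i::s, :] = part
    return out

x = np.arange(14, dtype=float).reshape(7, 2)   # 7 time steps, 2 channels
parts = split_interleaved(x, s=3)
restored = merge_interleaved(parts, length=len(x))
```

When the sequence length is not a multiple of s, the subsequences differ in length by one row, which a per-subsequence linear layer would have to account for (for example by padding).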

Second, a channel mixing module is designed to capture the relationships between channels.

The aim is to capture the interrelationships among parameters such as torque and rate of penetration, which is the purpose of the channel mixing module.

The relationships between channels are expressed by formula (5). Its terms are a noise term, the denoised channel dependency, and two factors representing the decomposed channel interaction. Formula (5) has the form of a linear expression, so the channel dependency of the time series data can be learned with a linear model, formula (6), in which σ(·) is the activation function.

This extracts the relationships between channels in a very simple way and reduces the time and space complexity of the model.
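The symbols of formulas (5) and (6) are not legible in this text. One form consistent with the description (a noise term, a denoised dependency, two interaction factors, and a linear model with activation σ) is a low-rank factorization followed by a two-layer linear map. Every symbol below is an assumption introduced for illustration, not taken from the source:

```latex
% Assumed form of formula (5): channel data = denoised dependency + noise,
% with the denoised part factorized into two low-rank interaction factors U and V.
X^{C} = \tilde{X}^{C} + \varepsilon \approx U V + \varepsilon

% Assumed form of formula (6): a linear model with activation \sigma
% that learns the denoised channel dependency.
\tilde{X}^{C} = \sigma\left( X^{C} W_{1} + b_{1} \right) W_{2} + b_{2}
```

Under this reading, $W_1$ projects the channels to a lower dimension and $W_2$ projects back, which is what keeps the parameter count small.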

Concretely, the original data pass through a transposition, a linear projection layer, and a GELU activation function, finally yielding a representation of the relationships between parameters. The transposition makes the whole series of each parameter a feature channel. The linear projection layer maps the features into a lower-dimensional space in order to compress them and reduce redundant components. The GELU activation together with a linear projection layer performs the dimension transformation, ensuring that the final output has the same dimension as the input.
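A minimal NumPy sketch of a project-down, GELU, project-up block of this kind. The source does not state which axis the projection acts on or what the layer sizes are; this sketch assumes the projection mixes across the parameter axis, uses random untrained weights, and picks hypothetical sizes (36 time steps, 11 parameters, rank 4).

```python
import numpy as np

def gelu(x):
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def channel_mix(x, w_down, b_down, w_up, b_up):
    """Project channels to a lower dimension, apply GELU, project back.

    x has shape (time, channels); the result has the same shape.
    """
    hidden = gelu(x @ w_down + b_down)      # (time, rank)
    return hidden @ w_up + b_up             # (time, channels)

rng = np.random.default_rng(0)
time_steps, channels, rank = 36, 11, 4      # hypothetical sizes
x = rng.normal(size=(time_steps, channels))
w_down = rng.normal(scale=0.1, size=(channels, rank))
b_down = np.zeros(rank)
w_up = rng.normal(scale=0.1, size=(rank, channels))
b_up = np.zeros(channels)

mixed = channel_mix(x, w_down, b_down, w_up, b_up)
fused = x + mixed                           # residual connection with the input
```

Keeping the output shape equal to the input shape is what makes the residual sum in the last line possible.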

A residual connection further fuses the temporal mixing features with the channel mixing features. A linear mapping then maps the original input length directly to the target prediction length, which avoids the error accumulation of iterative prediction. This yields the encoded representation of the parameter space.

Finally, a linear layer and a Softmax layer give the probabilities of the two classes (stuck pipe precursor or not), and the index of the larger probability is output as the prediction. This completes the detection of stuck pipe precursor signal samples.
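The final step, turning two scores into probabilities and picking the larger, can be sketched as follows. Treating index 1 as the precursor class is an assumption made for the example; the source does not fix the class order.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Return (predicted_index, probabilities); index 1 is taken to mean 'precursor'."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__), probs

label, probs = classify([0.2, 1.5])
```

For early warning, the probability itself is often more useful than the hard label, since an alarm threshold other than 0.5 can then be chosen.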

S3. Because stuck pipe event samples make up an extremely small proportion of all samples, the Focal Loss function is used to improve the network's ability to recognize them, which enables prediction and early warning of stuck pipe accidents.

The Focal Loss function is:

$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t) \tag{7}$$

where $p_t$ is the probability the model assigns to the sample's true class, so $1 - p_t$ is large when the sample is misclassified; $\alpha$ is a weighting factor that balances the classes; and $\gamma$ is a focusing parameter: as $\gamma$ increases, the loss concentrates more on hard-to-classify stuck pipe samples.
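A per-sample sketch of the focal loss, written with the conventional leading minus sign so that the loss is non-negative. The default values of `alpha` and `gamma` are illustrative only; the source does not give the values it uses.

```python
import math

def focal_loss(p_positive, label, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one sample.

    p_positive: predicted probability of the precursor class.
    label: 1 for a precursor sample, 0 for a normal sample.
    """
    # p_t is the probability assigned to the sample's true class.
    p_t = p_positive if label == 1 else 1.0 - p_positive
    p_t = min(max(p_t, eps), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.95, 1)      # confidently correct precursor sample
hard = focal_loss(0.10, 1)      # badly misclassified precursor sample
plain_ce = -math.log(0.10)      # cross entropy of the hard sample
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary cross entropy, which is a convenient sanity check; larger `gamma` shrinks the contribution of easy samples such as `easy` far more than that of `hard`.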

To optimize model performance and prevent overfitting, the model is trained with early stopping. Early stopping is a termination rule based on validation performance, intended to improve generalization. At the end of each training epoch, the model is evaluated on an independent validation dataset. If validation performance (such as the loss value or accuracy) does not improve significantly over a set number of consecutive epochs, training is terminated early.
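The early stopping rule can be sketched as a small counter over validation losses. The `patience` and `min_delta` parameters are hypothetical names for "a set number of consecutive epochs" and "significant improvement"; the source gives no values for either.

```python
class EarlyStopping:
    """Stop when validation loss has not improved by `min_delta` for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=2)
history = [0.90, 0.70, 0.71, 0.69, 0.70, 0.72, 0.60]   # made-up validation losses
stopped_at = next((epoch for epoch, loss in enumerate(history) if stopper.step(loss)), None)
```

In this made-up history, training stops before the final, lower loss is reached, which illustrates the trade-off in choosing `patience`: too small a value can end training before a later improvement.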

S4: Input the test data samples and check them for missing values. If any values are missing, fill them, then feed the complete test data into the model. The output is the stuck-pipe precursor signal recognition result, namely the probability that the sample is a stuck-pipe precursor signal.
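Step S4 can be sketched as a check-fill-predict pipeline. The source does not say which filling method is used, so forward-fill with a back-fill for leading gaps is an assumption here, as are the function names, the sample layout (a list of channels, each a list of readings), and the `model` callable.

```python
import math


def is_missing(v):
    """Treat None and NaN as missing readings."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def fill_missing(series):
    """Forward-fill gaps, then back-fill any gap at the start (assumed method)."""
    filled = list(series)
    last = None
    for i, v in enumerate(filled):
        if is_missing(v):
            filled[i] = last  # stays None if nothing valid has been seen yet
        else:
            last = v
    nxt = None
    for i in range(len(filled) - 1, -1, -1):
        if is_missing(filled[i]):
            filled[i] = nxt
        else:
            nxt = filled[i]
    if any(is_missing(v) for v in filled):
        raise ValueError("channel contains no valid readings")
    return filled


def predict(sample, model):
    """Complete the sample channel by channel, then query the model.

    `model` is any callable that returns the probability that the sample
    is a stuck-pipe precursor signal (hypothetical interface).
    """
    complete = [fill_missing(channel) for channel in sample]
    return model(complete)
```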

The above description does not limit the present invention in any form. Although the present invention has been disclosed through the above embodiments, they are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make changes or modifications that result in equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the present invention.

Claims (4)

1. A method for intelligently identifying early warning signals of pipe sticking accidents under a long-tail distribution, the method being implemented by a computer, characterized in that the method comprises:

S1: obtaining initial samples, the initial samples being logging data; inputting the logging data into a channel-time-series mixing stuck-pipe precursor signal detection network model, wherein the network model comprises a factorized temporal and channel mixing prediction network and outputs the probability that a sample is an early warning signal of a stuck-pipe accident.

S2: training the channel-time-series mixing stuck-pipe precursor signal detection method, wherein the method predicts stuck-pipe accidents by identifying their early warning signals; the method captures the various nonlinear features of stuck-pipe accidents and the relationships among them along both the time and channel dimensions of the logging data, and trains the model with a loss function designed for the rarity of stuck-pipe accidents, thereby identifying early warning signals and achieving early warning and prediction of stuck-pipe accidents.

S3: training the network with a loss function for the long-tail problem and fitting the data to obtain a trained channel-time-series mixing stuck-pipe prediction network model.

S4: predicting stuck-pipe accidents using the channel-time-series mixing stuck-pipe precursor signal detection network.

2. The method for intelligently identifying early warning signals of pipe sticking accidents under a long-tail distribution according to claim 1, characterized in that step S1 comprises: obtaining logging time series data from different work areas; normalizing the logging data and filling in missing values; labeling stuck-pipe accidents using expert knowledge to obtain a number of labeled logging time series; and dividing the logging time series data into a number of samples, classified as stuck-pipe early warning signal samples and normal samples.

3. The method for intelligently identifying early warning signals of pipe sticking accidents under a long-tail distribution according to claim 1, characterized in that step S2 constructs the channel-time-series mixing stuck-pipe prediction network and trains and tests the model using cross-validation, wherein, when the data of a specific oil well is used for testing, the data of that well is removed from the dataset during the training phase; and the probability that a logging time series data sample is a precursor signal of a stuck-pipe accident is obtained.

4. The method for intelligently identifying early warning signals of pipe sticking accidents under a long-tail distribution according to claim 1, characterized in that step S3 further optimizes the network using a loss function for long-tail distributions, addressing the network's mining of stuck-pipe samples under the imbalance between positive and negative samples in the logging data; a modulating coefficient is introduced so that the model pays more attention to the rare positive samples of stuck-pipe early warning signals.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410737173.5A 2024-06-07 2024-06-07 An intelligent identification method for early warning signals of pipe sticking accidents under long-tail distribution (Pending)

Publications (1)

Publication Number Publication Date
CN118690271A (en) 2024-09-24

Family

ID=92773395

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118940201A (en) * 2024-10-15 2024-11-12 西南石油大学 A method for early warning of stuck pipe based on time series anomaly detection

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508827A (en) * 2018-11-14 2019-03-22 西南石油大学 A kind of drilling failure Early-warning Model based on time recurrent neural network
CN110778307A (en) * 2019-10-24 2020-02-11 西南石油大学 Drill jamming early warning and type diagnosis method
US20230074074A1 (en) * 2021-09-02 2023-03-09 Southwest Petroleum University Intelligent recognition method for while-drilling safety risk based on convolutional neural network
CN116066062A (en) * 2021-11-03 2023-05-05 中石化石油工程技术服务有限公司 A real-time early warning method for stuck pipe based on abnormal diagnosis of parameter variation trend
CN117345194A (en) * 2022-06-23 2024-01-05 中国石油天然气股份有限公司 Differential pressure stuck probability prediction method and system based on Bayesian belief network
CN117459300A (en) * 2023-11-15 2024-01-26 东北大学 An industrial control system intrusion detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhe Li et al.: "MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing", arXiv, 9 February 2023 (2023-02-09), pages 1-14 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination