CN117933316B - Groundwater level probability forecasting method based on interpretable Bayesian convolution network

Info

Publication number: CN117933316B
Application number: CN202410339570.7A
Authority: CN
Inventors: 莫绍星; 彭泽辰; 吴吉春; 施小清; 曾献奎
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2024-03-25
Filing date: 2024-03-25
Publication date: 2024-05-31
Anticipated expiration: 2044-03-25
Also published as: CN117933316A

Abstract

The invention discloses a groundwater level probabilistic forecasting method based on an interpretable Bayesian convolutional network. It uses a state-of-the-art time series forecasting model together with a Bayesian method to produce reliable probabilistic forecasts of groundwater level, and uses an interpretation algorithm to identify and quantify the contribution of each input feature to the forecast. The method converts a one-dimensional time series into a two-dimensional space based on its periodic characteristics and then extracts those periodic features with a convolutional network, enabling reliable groundwater level forecasts. The invention combines the Monte Carlo dropout Bayesian method with the SHAP interpretability method to quantify both the uncertainty of the forecast and the contribution of the input features, yielding probabilistic and interpretable groundwater level forecasts. Based on groundwater level monitoring data and meteorological data, the invention can reliably forecast groundwater level changes one month ahead and provides decision support for the optimal allocation of groundwater resources and for ecological environment protection.

Description

A probabilistic groundwater level forecasting method based on interpretable Bayesian convolutional network

Technical Field

The present invention relates to the interdisciplinary field of hydrology and deep learning, and specifically to a groundwater level probabilistic forecasting method based on an interpretable Bayesian convolutional network.

Background

Groundwater resources play an important role in agriculture, industry and drinking water supply, and groundwater is also essential to maintaining ecological and environmental security. The groundwater level is a direct indicator of the availability of groundwater resources, and changes in groundwater level are closely related to ecosystem stability. Forecasting groundwater levels and estimating their future trends allows groundwater exploitation plans and ecological protection strategies to be formulated and optimized in time, supporting the sustainable development of water resources and the protection of the ecological environment.

Many groundwater level forecasting techniques exist. They fall into two main categories: traditional physics-driven numerical models, and data-driven machine learning and deep learning methods. Deep learning methods avoid the large parameter requirements of numerical models and often forecast better than structurally simpler machine learning models, so they have been widely applied to groundwater level forecasting in recent years. Deep learning forecasting methods nevertheless still face several challenges. Forecast accuracy needs further improvement, and accuracy at long lead times in particular remains unsatisfactory. The black-box nature of deep learning models makes their forecasts hard to interpret and leaves the forecasting mechanism unclear. In addition, most existing forecasting models are deterministic: they give only a single forecast value and cannot quantify the uncertainty of the forecast.

Summary of the Invention

The purpose of the present invention is to provide a groundwater level probabilistic forecasting method based on an interpretable Bayesian convolutional network that can reliably forecast future groundwater level changes and quantify the uncertainty of the forecast through a Bayesian method. The SHAP interpretation algorithm it adopts calculates the contribution of each input feature to the result, addressing the unclear forecasting mechanism noted in the background above.

To solve the above technical problems, the present invention provides the following technical solution: a groundwater level probabilistic forecasting method based on an interpretable Bayesian convolutional network, the steps of which include:

determining the scope of the study area and obtaining time series monitoring data from the groundwater level monitoring wells in the study area;

obtaining meteorological factor data at the same geographical locations as the groundwater level monitoring wells, for the time period covered by the corresponding time series monitoring data;

dividing the time period covered by the time series monitoring data at a time node into two periods, the period before the time node being called the training period and the period after it being called the test period;

reorganizing the training period data and the test period data separately to construct an input-output sample training set and test set;

building an interpretable Bayesian convolutional network (XBCN) and, based on the input-output sample training set, training the XBCN groundwater level probabilistic forecasting model;

applying the Monte Carlo dropout method (MC-dropout) in the XBCN groundwater level probabilistic forecasting model: during the forecasting stage, a portion of the model's neurons is randomly dropped with a given probability, and this is repeated N times to obtain a forecast ensemble consisting of N forecast results;

based on the input-output sample test set, calculating the forecast accuracy of the trained XBCN groundwater level probabilistic forecasting model for each monitoring well over the next n time steps, in terms of the coefficient of determination (R²), the root mean square error (RMSE) and the Kling-Gupta efficiency (KGE), and calculating the forecast confidence interval from the ensemble of N forecasts;

using the coefficient of determination, the root mean square error and the Kling-Gupta efficiency as criteria to judge whether the XBCN groundwater level probabilistic forecasting model meets the required forecast accuracy and, if it does not, continuing to train, optimize and adjust the model until the forecast accuracy is met;

based on the XBCN groundwater level probabilistic forecasting model that meets the forecast accuracy, using the SHAP (SHapley Additive exPlanations) interpretation algorithm to obtain the contributions of groundwater level, time and meteorological factors to the forecast groundwater level change;

determining the main influencing factors according to the contributions of groundwater level, time and meteorological factors to the forecast groundwater level change, and inputting the real-time monitoring data corresponding to the main influencing factors into the XBCN groundwater level probabilistic forecasting model that meets the forecast accuracy to produce forecasts. The groundwater level, time and meteorological factors whose contribution is greater than or equal to a threshold are called the main influencing factors.

According to the above technical solution, the meteorological factors include precipitation, evapotranspiration, temperature and the like.

According to the above technical solution, the training period data are used to train the forecasting model, and the test period data are used to evaluate the forecast accuracy of the model.

According to the above technical solution, the groundwater level data, the meteorological factor data and the corresponding time data of the m historical time steps are used as input samples, and the groundwater level data of the n future time steps are used as output samples, to construct the input-output sample training set and test set.

According to the above technical solution, the meteorological factors, the groundwater level and the corresponding time data are input into the interpretable Bayesian convolutional network. They are processed in sequence by a window normalization (Normalization) layer, an embedding (Embedding) layer and a fully connected layer, and then passed to multiple stacked time modules (TimesBlock), which together form the interpretable Bayesian convolutional network groundwater level probabilistic forecasting model, referred to for short as the XBCN model;

The basic principle of TimesBlock is to exploit the multi-period superposition property of time series. Based on the fast Fourier transform, the time series is transformed from a one-dimensional vector into a two-dimensional space, and two-dimensional convolutional layers then extract the latent features of the series. The extracted features are converted back to one-dimensional space, and the groundwater level forecast is finally obtained through a fully connected layer and a denormalization layer.
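The one-dimensional to two-dimensional conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it uses a naive discrete Fourier transform from the Python standard library in place of a fast Fourier transform (the amplitudes are the same), it keeps only the single strongest period, and the function names `dominant_period` and `fold_to_2d` are hypothetical.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in time steps) of the strongest non-zero frequency."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    best_k, best_amp = 1, -1.0
    # Naive DFT over the positive frequencies; an FFT would give the same amplitudes.
    for k in range(1, n // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return n // best_k


def fold_to_2d(series, period):
    """Zero-pad to a multiple of `period`, then cut into rows of one period each."""
    padded = list(series) + [0.0] * (-len(series) % period)
    return [padded[i:i + period] for i in range(0, len(padded), period)]


# Synthetic 24-step series with a 6-step cycle (illustrative data only).
series = [math.sin(2 * math.pi * t / 6) for t in range(24)]
period = dominant_period(series)
grid = fold_to_2d(series, period)
```

In the folded grid each row holds one cycle, so a two-dimensional convolution sees variation within a cycle along a row and variation between cycles down a column.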

According to the above technical solution, the constructed XBCN groundwater level probabilistic forecasting model is trained on the training set samples with the mean square error as the loss function, and the model is optimized automatically during training to obtain the optimal parameter combination.

In the test phase, the coefficient of determination, the root mean square error and the Kling-Gupta efficiency are used as the judgment criteria, where the coefficient of determination (R²) is:

$$R^2 = 1 - \frac{\sum_{t=1}^{T} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2};$$

the root mean square error (RMSE) is:

$$\mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (y_t - \hat{y}_t)^2};$$

and the Kling-Gupta efficiency (KGE) is:

$$\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}, \quad r = \frac{\mathrm{cov}(\hat{y}, y)}{\sigma_{\hat{y}} \sigma_{y}}, \quad \alpha = \frac{\sigma_{\hat{y}}}{\sigma_{y}}, \quad \beta = \frac{\mu_{\hat{y}}}{\mu_{y}};$$

In these formulas, $y_t$ is the true value, $\bar{y}$ is the mean of the true values, $\hat{y}_t$ is the forecast value, and $T$ is the number of observations. In the Kling-Gupta efficiency, $r$, $\alpha$ and $\beta$ are respectively the correlation coefficient, the relative variability and the bias ratio between the forecast series and the true series, where $\mathrm{cov}(\hat{y}, y)$ is the covariance of the forecast series and the true series, $\sigma_{\hat{y}}$ and $\sigma_{y}$ are the standard deviations of the forecast series and the true series, and $\mu_{\hat{y}}$ and $\mu_{y}$ are the means of the forecast series and the true series.
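The three accuracy metrics named above can be computed with a few lines of standard-library Python. The sketch below assumes the usual definitions: the Kling-Gupta relative variability is taken as the ratio of standard deviations and the bias ratio as the ratio of means, and population standard deviations are used. The function names are hypothetical.

```python
import math
from statistics import fmean, pstdev


def r2(obs, pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_obs = fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root mean square error between observed and forecast values."""
    return math.sqrt(fmean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def kge(obs, pred):
    """Kling-Gupta efficiency from correlation, variability ratio and bias ratio."""
    mu_o, mu_p = fmean(obs), fmean(pred)
    sd_o, sd_p = pstdev(obs), pstdev(pred)
    cov = fmean([(o - mu_o) * (p - mu_p) for o, p in zip(obs, pred)])
    r = cov / (sd_o * sd_p)      # correlation coefficient
    alpha = sd_p / sd_o          # relative variability (assumed: ratio of std devs)
    beta = mu_p / mu_o           # bias ratio (assumed: ratio of means)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Illustrative values only: a forecast that is offset from the observations by 1.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [2.0, 3.0, 4.0, 5.0]
scores = (r2(observed, forecast), rmse(observed, forecast), kge(observed, forecast))
```

A constant offset leaves the correlation and variability terms perfect, so the whole KGE penalty in this example comes from the bias ratio.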

According to the above technical solution, when each forecast value is generated, a portion of the neurons of the trained XBCN groundwater level probabilistic forecasting model is randomly dropped with a given probability, and a groundwater level forecast value is obtained from the remaining neurons. This process is repeated N times to obtain a forecast ensemble consisting of N forecast values.

According to the above technical solution, the mean $\mu$ and the standard deviation $\sigma$ of the N forecast values are calculated from the forecast ensemble, and the upper and lower limits of the forecast confidence interval are obtained as $\mu \pm z\sigma$ (z takes a value between 1.64 and 2.58).

According to the above technical solution, the confidence level of the forecast confidence interval ranges from 90% to 99%.
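As a small worked example of the interval calculation, the sketch below takes an ensemble of forecast values for one time step and returns the mean plus and minus z standard deviations. It assumes the population standard deviation (the text does not say which estimator is used), and the ensemble values and the function name `confidence_interval` are illustrative only.

```python
from statistics import fmean, pstdev


def confidence_interval(ensemble, z=1.96):
    """Return (lower, upper) limits as mean -/+ z * standard deviation."""
    mu = fmean(ensemble)
    sigma = pstdev(ensemble)  # assumption: population standard deviation
    return mu - z * sigma, mu + z * sigma


# Hypothetical ensemble of groundwater level forecasts for one time step.
ensemble = [10.0, 10.2, 9.8, 10.0]
lower, upper = confidence_interval(ensemble)         # z = 1.96, 95% level
wide_lower, wide_upper = confidence_interval(ensemble, z=2.58)  # 99% level
```

Raising z from 1.96 to 2.58 widens the interval around the same mean, which is the trade-off between the 95% and 99% confidence levels.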

According to the above technical solution, the SHAP interpretation algorithm explains the forecast of the XBCN groundwater level probabilistic forecasting model as the sum of the contributions of the meteorological factors, the groundwater level and time, calculated as:

$$f(x) = g(x) = \phi_0 + \sum_{j=1}^{M} \phi_j;$$

where $g$ is the explanation model, $f(x)$ is the predicted value for a sample in the data set, $\phi_0$ is a constant equal to the average predicted value over all training samples, $M$ is the number of input features in sample $x$, and $\phi_j$ is the attribution value of input feature $j$ in the sample prediction $f(x)$. The input features include meteorological factor features, groundwater level features and time features.

The attribution value of feature j can be calculated by the following formula:

$$\phi_j = \sum_{S \subseteq F \setminus \{j\}} \frac{|S|! \, (p - |S| - 1)!}{p!} \left[ f_x(S \cup \{j\}) - f_x(S) \right];$$

where $F$ is the set of all input features, $p$ is the number of input features, $F \setminus \{j\}$ denotes all possible input feature sets after excluding feature $j$, and $f_x(S)$ is the prediction obtained using only the feature subset $S$. The input features include meteorological factor features, groundwater level features and time features; in this model p = 6.
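To make the attribution formula concrete, the sketch below computes exact Shapley values by enumerating every subset of the other features, weighting each marginal contribution by |S|!(p - |S| - 1)!/p!. It is an illustration only: `value_fn` stands in for "the prediction using only feature subset S", the additive toy model and its effect sizes are invented, and practical SHAP implementations approximate this sum rather than enumerating it.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, p):
    """Exact Shapley values for features 0..p-1, given value_fn(frozenset) -> float."""
    phi = []
    for j in range(p):
        others = [i for i in range(p) if i != j]
        total = 0.0
        for size in range(p):  # subset sizes 0 .. p-1
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature j when added to subset s.
                total += weight * (value_fn(s | {j}) - value_fn(s))
        phi.append(total)
    return phi


# Hypothetical additive model with p = 6 features: each present feature adds a fixed effect.
effects = [0.5, -0.2, 0.1, 0.0, 0.3, 1.0]


def toy_value(subset):
    return sum(effects[i] for i in subset)


phi = shapley_values(toy_value, len(effects))
```

For an additive model each attribution equals that feature's own effect, and the attributions sum to the difference between the full prediction and the empty-set baseline, which is the additivity property the explanation model relies on.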

Compared with the prior art, the beneficial effects of the present invention are as follows. The invention proposes a groundwater level probabilistic forecasting method based on an interpretable Bayesian convolutional network that provides reliable probabilistic forecasts of groundwater level at monitoring wells. The method improves groundwater level forecast accuracy, which helps decision makers plan and dispatch water resources and use them more efficiently. The invention integrates a Bayesian method into the convolutional network and quantifies the uncertainty of the forecast in the form of confidence intervals, achieving probabilistic forecasting of groundwater level and thereby improving the reliability of water resource management decisions and reducing decision risk. The invention also integrates the SHAP interpretation algorithm: by calculating the contribution of each input feature to the result, it makes the model more interpretable and therefore more reliable and credible, and real-time monitoring data for the high-contribution input features can support future real-time groundwater level forecasting. Compared with the prior art, the invention improves groundwater level forecast accuracy, quantifies forecast uncertainty and enhances model interpretability.

Brief Description of the Drawings

The accompanying drawings provide a further understanding of the present invention and constitute a part of the specification. Together with the embodiments of the present invention, they serve to explain the invention and do not limit it. In the drawings:

Figure 1 is a schematic diagram of the overall framework of the groundwater level probabilistic forecasting method based on an interpretable Bayesian convolutional network according to this embodiment;

Figure 2(a) is a schematic diagram of the TimesBlock module structure;

Figure 2(b) is a schematic diagram of the detailed structure of the XBCN groundwater level probabilistic forecasting model;

Figure 3 shows the forecast accuracy evaluation of the XBCN groundwater level probabilistic forecasting model on the test set;

Figure 4 shows the forecast curve for one well in the test set of the XBCN groundwater level probabilistic forecasting model;

Figure 5 shows the contribution of each input feature to groundwater level change when forecasting at different time steps in this embodiment.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative work fall within the scope of protection of the present invention.

A semi-arid region in northwest China is taken as an example. The groundwater level fluctuation trend of the monitoring wells in the region is forecast, the uncertainty of the forecast is quantified, and the contribution of each input feature to the forecast is analyzed, to illustrate the groundwater level probabilistic forecasting method based on an interpretable Bayesian convolutional network. The specific steps include:

Phase 1: Obtain the 5-day groundwater level monitoring data for 2010-2017 from 45 groundwater monitoring wells in the region, together with the geographical locations of the wells. Based on the well locations and the time period covered by the monitoring data, extract the daily precipitation, daily evapotranspiration and daily mean air temperature data at the same locations for 2010-2017, and convert the daily meteorological data into 5-day data on the same time scale as the groundwater level series.

Phase 2: Divide the groundwater level and meteorological factor data into a training period and a test period with July 2015 as the boundary. That is, for each monitoring well, the first 65% of the groundwater level and meteorological factor data form the training set and the last 35% form the test set.
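The chronological split in Phase 2 can be sketched as below. The point of the example is that the series is cut once at a time node rather than shuffled, so every test value lies after every training value. The function name `chronological_split` and the integer stand-in data are hypothetical.

```python
def chronological_split(series, train_fraction=0.65):
    """Cut a time-ordered series once: earlier part for training, later part for testing."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Stand-in for one well's time-ordered record (index = time step).
record = list(range(100))
train, test = chronological_split(record)
```

Because the cut respects time order, the test period contains no information that was available during training.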

Phase 3: Reorganize the training set and test set data obtained in Phase 2 separately to construct the input-output sample sets. Specifically, the groundwater level, precipitation, evapotranspiration, air temperature, month and date of the 24 historical time steps are used as input data, and the groundwater level of the next 6 time steps is used as output data. The sliding window for input samples is 24 time steps long (1 time step = 5 days), and the sliding window for output samples is 6 time steps long. On the time series, the input window is moved in chronological order, one time step at a time, from the start of the series to the point one output-sample length before the end, giving the input samples for each well. The output window is moved in the same way, one time step at a time, from the point one input-sample length after the start to the end of the series, giving the output samples for each groundwater monitoring well.
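The sliding-window reorganization in Phase 3 can be sketched as follows, with m = 24 input steps and n = 6 output steps moved one step at a time. The function name `make_windows`, the feature column order and the placeholder data are assumptions made for illustration.

```python
def make_windows(features, target, m=24, n=6):
    """Pair each m-step input window with the n target steps that follow it.

    features: list of per-time-step feature rows
    target:   list of groundwater levels, same length as features
    """
    inputs, outputs = [], []
    for start in range(len(target) - m - n + 1):
        inputs.append(features[start:start + m])
        outputs.append(target[start + m:start + m + n])
    return inputs, outputs


# Placeholder series of 40 time steps; each row is
# [level, precipitation, evapotranspiration, temperature, month, day] (assumed order).
levels = [float(t) for t in range(40)]
rows = [[lvl, 0.0, 0.0, 0.0, 1, 1] for lvl in levels]
X, Y = make_windows(rows, levels)
```

A series of 40 steps yields 40 - 24 - 6 + 1 = 11 samples: the first output window starts right after the first 24 inputs, and the last output window ends at the final time step.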

Phase 4: Feed the input samples of the training set into the interpretable Bayesian convolutional network groundwater level probabilistic forecasting model (XBCN model) for training and obtain the outputs. The mean square error is used as the loss function, and the model is optimized automatically during training to obtain the optimal parameter combination. The Monte Carlo dropout (MC-dropout) method is used in the model: when each forecast value is generated, a portion of the neurons of the trained XBCN model is randomly dropped with a given probability, and a groundwater level forecast value is obtained from the remaining neurons. This process is repeated N times to obtain a forecast ensemble consisting of N forecast values. Model training uses a dropout rate of 0.05, a batch size of 16 and a learning rate of 0.001. Figure 2(a) shows the structure of the TimesBlock module, where softmax denotes the normalized exponential function; the Inception module is prior art and is not explained in detail here.
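The MC-dropout procedure can be illustrated with a toy network. The sketch below is not the XBCN model: it is a hypothetical one-hidden-layer network with fixed, made-up weights, written with the standard library only, and the number of passes (100) is an assumed value for N. What it shows is the mechanism: dropout stays active at forecast time, so repeated passes over the same input give a spread of forecasts.

```python
import random


def forward(x, w_hidden, w_out, drop_rate, rng):
    """One stochastic pass: ReLU hidden units, each kept with probability 1 - drop_rate."""
    hidden = []
    for weights in w_hidden:
        h = max(0.0, sum(w * xi for w, xi in zip(weights, x)))
        keep = rng.random() >= drop_rate
        # Inverted dropout: rescale kept units so the expected activation is unchanged.
        hidden.append(h / (1.0 - drop_rate) if keep else 0.0)
    return sum(w * h for w, h in zip(w_out, hidden))


def mc_dropout_forecast(x, w_hidden, w_out, drop_rate=0.05, n_passes=100, seed=0):
    """Repeat the stochastic pass n_passes times and return the forecast ensemble."""
    rng = random.Random(seed)
    return [forward(x, w_hidden, w_out, drop_rate, rng) for _ in range(n_passes)]


# Made-up weights and input, for illustration only.
w_hidden = [[0.1 * (i + 1), 0.05, 0.2] for i in range(8)]
w_out = [0.1] * 8
x = [0.2, 0.5, 0.1]
ensemble = mc_dropout_forecast(x, w_hidden, w_out)
```

With the dropout rate set to zero every pass returns the same value, which is the deterministic forecast the method is designed to move beyond.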

Phase 5: Feed the input samples of the test set into the XBCN groundwater level probabilistic forecasting model to obtain groundwater level forecasts for the next 1-6 time steps. Forecast accuracy is evaluated with the coefficient of determination (R²), the root mean square error (RMSE) and the Kling-Gupta efficiency (KGE). If the forecast accuracy does not meet the requirements, return to Phase 4 to optimize and adjust the model parameters until the forecast accuracy is met.

Phase 6: Obtain the XBCN groundwater level probabilistic forecasting model that meets the forecast accuracy. The evaluation of the model accuracy metrics is shown in Figure 3, where panels (a)-(f) are the results for forecasts 1-6 time steps ahead. Averaging the forecast accuracy over the 1-6 time steps gives, for the 45 wells in the region, an average root mean square error (RMSE) of 0.21, an average coefficient of determination (R²) of 0.73 and an average Kling-Gupta efficiency (KGE) of 0.85. Based on the ensemble of N forecast results obtained in Phase 4, the mean $\mu$ and standard deviation $\sigma$ of the N forecast values are calculated, and the upper and lower limits of the 95% forecast confidence interval are obtained as $\mu \pm z\sigma$ (z = 1.96). The forecast curve for one groundwater monitoring well in the region is shown in Figure 4, from which it can be seen that the forecast curve of the XBCN groundwater level probabilistic forecasting model fits the observed values well.

Phase 7: Based on the XBCN groundwater level probabilistic forecasting model that meets the forecast accuracy obtained in Phase 5, use the SHAP algorithm to calculate the contribution of each input feature to the output (the SHAP value) for all samples. The absolute SHAP values are averaged over all samples and normalized to give the SHAP result plot in Figure 5. The vertical axis lists the model input features, from top to bottom: date, month, precipitation, evapotranspiration, air temperature and historical groundwater level. The horizontal axis is the normalized mean absolute SHAP value of each feature. Panels (a)-(f) of Figure 5 show the SHAP results for forecasts 1-6 time steps ahead. Figure 5 shows that historical groundwater level is the most important input feature. As the forecast lead time increases, the influence of historical groundwater level gradually weakens and the influence of the meteorological factors and the month feature gradually grows, possibly because the response of groundwater level to precipitation and evapotranspiration is lagged. The month is also a relatively important input variable and can, to some extent, reduce the impact of outliers in the meteorological input data on the results.
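The aggregation behind the Figure 5 bars (mean of absolute SHAP values over samples, then normalization) can be sketched as below. The text does not say which normalization is used; this example assumes the shares are scaled to sum to 1. The SHAP values are invented and the function name `normalized_importance` is hypothetical.

```python
def normalized_importance(shap_rows):
    """Mean absolute SHAP value per feature, scaled so the shares sum to 1.

    shap_rows: one list of per-feature SHAP values for each sample
    """
    n_samples = len(shap_rows)
    n_features = len(shap_rows[0])
    mean_abs = [sum(abs(row[j]) for row in shap_rows) / n_samples
                for j in range(n_features)]
    total = sum(mean_abs)
    return [value / total for value in mean_abs]


# Invented SHAP values: two samples, three features.
shap_rows = [[1.0, -1.0, 2.0],
             [-3.0, 1.0, 0.0]]
importance = normalized_importance(shap_rows)
```

Taking absolute values before averaging matters: the first feature has SHAP values of opposite sign in the two samples, and a plain mean would understate its influence.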

Phase 8: Based on the input feature contributions quantified in Phase 7, focus on collecting real-time groundwater level monitoring data and input them into the XBCN groundwater level probabilistic forecasting model that meets the forecast accuracy obtained in Phase 5. This enables real-time, reliable forecasts of future groundwater levels and provides technical support for the rational planning of water resources and the real-time early warning of ecological water levels.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device.

Finally, it should be noted that the above is only a preferred embodiment of the present invention and is not intended to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or replace some of their technical features with equivalents. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A groundwater level probability forecasting method based on an interpretable Bayesian convolutional network, characterized in that it comprises the following steps:

Determining the extent of the study area and obtaining time-series monitoring data from the groundwater level monitoring wells in the study area;

Obtaining meteorological factor data at the same geographical locations as the groundwater level monitoring wells, for the time period covered by the corresponding time-series monitoring data;

Dividing the time period covered by the time-series monitoring data at a time node into two periods, the period before the time node being the training period and the period after the time node being the test period;

Reorganizing the training period data and the test period data to construct an input-output sample training set and an input-output sample test set;

Building an interpretable Bayesian convolutional network, and training an interpretable Bayesian convolutional network groundwater level probability forecasting model on the input-output sample training set;

In the forecasting stage, using the Monte Carlo dropout method to randomly drop some neurons of the interpretable Bayesian convolutional network groundwater level probability forecasting model N times according to a dropout probability, thereby obtaining a forecast set consisting of N forecast results of the model;

Based on the input-output sample test set, calculating the coefficient of determination, the root mean square error and the Kling-Gupta efficiency coefficient as measures of the forecast accuracy of the trained interpretable Bayesian convolutional network groundwater level probability forecasting model for each monitoring well over the next n time steps, and calculating the forecast confidence interval from the forecast set of N forecast results;

Using the coefficient of determination, the root mean square error and the Kling-Gupta efficiency coefficient as criteria to judge whether the interpretable Bayesian convolutional network groundwater level probability forecasting model meets the required forecast accuracy; if it does not, continuing to train the interpretable Bayesian convolutional network to optimize and adjust the model until the forecast accuracy is met;

Based on the interpretable Bayesian convolutional network groundwater level probability forecasting model that meets the forecast accuracy, using the SHAP interpretation algorithm to obtain the contributions of groundwater level, time and meteorological factors to the forecast groundwater level changes;

Determining the main influencing factors according to the contributions of groundwater level, time and meteorological factors to the forecast groundwater level changes, and inputting the real-time monitoring data corresponding to the main influencing factors into the interpretable Bayesian convolutional network groundwater level probability forecasting model that meets the forecast accuracy to produce forecasts.
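The three accuracy measures named in claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the coefficient of determination is assumed to be the usual 1 - SS_res/SS_tot form, and the Kling-Gupta efficiency is assumed to be the commonly used form built from the correlation, the ratio of standard deviations and the ratio of means.

```python
import numpy as np

def r2_score(obs, sim):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    ss_res = np.sum((obs - sim) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(obs, sim):
    """Root mean square error between observed and forecast levels."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def kge(obs, sim):
    """Kling-Gupta efficiency (assumed form: correlation, variability ratio, bias ratio)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # ratio of standard deviations
    beta = sim.mean() / obs.mean()       # ratio of means
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))

# Toy example: a forecast that is biased upward by exactly one unit
obs = np.array([1.0, 2.0, 3.0, 4.0])
sim = obs + 1.0
scores = (r2_score(obs, sim), rmse(obs, sim), kge(obs, sim))
```

A perfect forecast gives a coefficient of determination of 1, a root mean square error of 0 and a Kling-Gupta efficiency of 1. The thresholds at which a model counts as meeting the forecast accuracy are not stated in the claims, so none are assumed here.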

2. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that the meteorological factors include precipitation, evapotranspiration and temperature.

3. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that the training period data are used to train the forecasting model, and the test period data are used to evaluate the forecast accuracy of the model.

4. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that the groundwater level data, meteorological factor data and corresponding time data of the past m time steps are used as input samples, and the groundwater level data of the next n time steps are used as output samples, to construct the input-output sample training set and test set.
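The sample construction in claim 4 amounts to sliding a window over the series. The sketch below is illustrative only: `make_samples`, the toy feature columns and the split index are hypothetical, and the window is assumed to advance one time step at a time.

```python
import numpy as np

def make_samples(features, level, m, n):
    """Build input-output samples with a sliding window.

    features: (T, F) array, one row per time step (e.g. groundwater level,
              meteorological factors, time encoding).
    level:    (T,) groundwater level series.
    Returns X with shape (S, m, F) and Y with shape (S, n), S = T - m - n + 1.
    """
    features = np.asarray(features, float)
    level = np.asarray(level, float)
    X, Y = [], []
    for start in range(len(level) - m - n + 1):
        X.append(features[start:start + m])          # past m steps as input
        Y.append(level[start + m:start + m + n])     # next n steps as output
    return np.stack(X), np.stack(Y)

# Toy data: 20 time steps, three feature columns (all values are made up)
T = 20
level = np.arange(T, dtype=float)
precip = np.ones(T)
month = (np.arange(T) % 12).astype(float)
features = np.column_stack([level, precip, month])

split = 14  # hypothetical time node separating training and test periods
X_train, Y_train = make_samples(features[:split], level[:split], m=4, n=2)
X_test, Y_test = make_samples(features[split:], level[split:], m=4, n=2)
```

Splitting before windowing, as done here, keeps every test window entirely inside the test period, so no training sample overlaps the data used for evaluation.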

5. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that the meteorological factor data, groundwater level data and corresponding time data are input into the interpretable Bayesian convolutional network, processed in sequence by a window normalization layer, an embedding layer and a fully connected layer, and then input into multiple stacked TimesBlock time modules, to build the interpretable Bayesian convolutional network groundwater level probability forecasting model.
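The abstract describes converting the one-dimensional series into two dimensions according to its periodicity before convolution. One way this folding step could look is sketched below. It assumes the dominant period is estimated from the FFT amplitude spectrum, which the claims do not specify; the function names are hypothetical, and the convolution that would follow is omitted.

```python
import numpy as np

def dominant_period(x):
    """Estimate the dominant period of a 1-D series from its FFT amplitudes.

    Assumes the series is not constant, so a non-zero frequency dominates.
    """
    x = np.asarray(x, float)
    amp = np.abs(np.fft.rfft(x - x.mean()))
    amp[0] = 0.0                     # ignore the zero-frequency term
    freq = int(np.argmax(amp))       # number of cycles inside the window
    return len(x) // freq

def fold_to_2d(x, period):
    """Zero-pad x to a multiple of `period` and reshape to (cycles, period)."""
    x = np.asarray(x, float)
    pad = (-len(x)) % period
    return np.pad(x, (0, pad)).reshape(-1, period)

t = np.arange(48)
series = np.sin(2 * np.pi * t / 12)  # synthetic series with a 12-step cycle
p = dominant_period(series)
grid = fold_to_2d(series, p)         # each row holds one full cycle
```

In the folded grid, values along a row are neighbours within one cycle and values down a column share the same phase in successive cycles, which is what lets a two-dimensional convolution see both kinds of variation at once.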

6. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that the constructed interpretable Bayesian convolutional network groundwater level probability forecasting model is trained on the training set samples, with the mean square error as the loss function, and the model is automatically trained and optimized to obtain the optimal parameter combination.
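Training against a mean square error loss can be illustrated on a much simpler model than the network in the claims. The sketch below fits a linear model by plain gradient descent; the data, learning rate and iteration count are made up for the example, and the optimizer actually used for the network is not stated in the claims.

```python
import numpy as np

def mse(pred, target):
    """Mean square error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))          # synthetic inputs
true_w = np.array([0.5, -1.0, 2.0])    # weights used to generate the targets
y = X @ true_w

w = np.zeros(3)                        # parameters to be learned
lr = 0.1                               # hypothetical learning rate
losses = []
for _ in range(200):
    pred = X @ w
    losses.append(mse(pred, y))
    grad = 2.0 * X.T @ (pred - y) / len(y)   # gradient of the MSE w.r.t. w
    w -= lr * grad
```

The recorded losses fall as the parameters approach the values that generated the data, which is the behaviour the loss function is meant to drive in the full model as well.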

7. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that, when generating each forecast value, some neurons of the trained interpretable Bayesian convolutional network groundwater level probability forecasting model are randomly dropped according to a dropout probability, a groundwater level forecast value is then obtained from the remaining neurons, and this is repeated N times to obtain a forecast set consisting of N groundwater level forecast values.
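The repeated stochastic forecasting in claim 7 can be sketched with a tiny one-hidden-layer network standing in for the trained model. Everything here is an assumption made for illustration: the weights are random rather than trained, the dropout probability of 0.2 and N = 100 are arbitrary, and the usual inverted-dropout rescaling is applied.

```python
import numpy as np

def mc_dropout_forecasts(x, W1, b1, W2, b2, p_drop, N, rng):
    """Run N forward passes with dropout kept active at forecast time."""
    forecasts = []
    for _ in range(N):
        h = np.maximum(0.0, x @ W1 + b1)        # hidden activations (ReLU)
        keep = rng.random(h.shape) >= p_drop    # randomly drop hidden neurons
        h = h * keep / (1.0 - p_drop)           # rescale the surviving neurons
        forecasts.append(h @ W2 + b2)           # forecast from remaining neurons
    return np.stack(forecasts)                  # shape (N, n_outputs)

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(5, 16)), np.zeros(16)   # stand-in weights, not trained
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)
x = rng.normal(size=5)                            # one input sample

ensemble = mc_dropout_forecasts(x, W1, b1, W2, b2, p_drop=0.2, N=100, rng=rng)
mean, std = ensemble.mean(axis=0), ensemble.std(axis=0)
```

Because a different subset of neurons is dropped on each pass, the N forecasts differ, and their spread serves as the estimate of forecast uncertainty.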

8. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that, based on the forecast set, the mean and the standard deviation of the N forecast values are calculated, and the upper and lower limits of the forecast confidence interval are obtained from the mean and the standard deviation.

9. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that the confidence level of the forecast confidence interval ranges from 90% to 99%.
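One way to turn the forecast set into a confidence interval is sketched below. The interval formula itself is not legible in this text, so the sketch assumes the N forecasts are approximately normally distributed and uses mean ± z × standard deviation, with z taken from the standard normal quantile for the chosen confidence level. The function name and the example values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def confidence_interval(forecasts, level=0.95):
    """Lower and upper limits from the mean and standard deviation of N forecasts.

    Assumes a normal spread: limits are mean -/+ z * std, where z is the
    two-sided standard normal quantile for `level`.
    """
    forecasts = np.asarray(forecasts, float)
    mu = forecasts.mean(axis=0)
    sigma = forecasts.std(axis=0)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mu - z * sigma, mu + z * sigma

# Two made-up forecasts for a single time step: mean 2.0, standard deviation 1.0
forecasts = np.array([[1.0], [3.0]])
lower90, upper90 = confidence_interval(forecasts, level=0.90)
lower99, upper99 = confidence_interval(forecasts, level=0.99)
```

Raising the confidence level from 90% to 99% increases z from about 1.645 to about 2.576, so the interval widens around the same mean.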

10. The groundwater level probability forecasting method based on an interpretable Bayesian convolutional network according to claim 1, characterized in that the SHAP interpretation algorithm interprets the forecast result of the interpretable Bayesian convolutional network groundwater level probability forecasting model as the sum of the contributions of the meteorological factors, the groundwater level and time, calculated as:

$$g(x) = \phi_0 + \sum_{i=1}^{M} \phi_i ;$$

where $g$ is the explanation model, $x$ is a sample in the data set whose forecast value is being explained, $\phi_0$ is the constant equal to the average forecast value over all training samples, $M$ is the number of input features in sample $x$, and $\phi_i$ is the contribution of input feature $i$ of sample $x$ to its forecast value. The input features include meteorological factor features, groundwater level features and time features.
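The additive decomposition in claim 10 can be checked numerically on a small example. The sketch below computes exact Shapley values by enumerating feature subsets, which is feasible only for a handful of features and is not how the SHAP library approximates them for large models. The toy model `f`, the background data and the function name are all hypothetical; features absent from a subset are replaced by background values and the model output is averaged.

```python
import itertools
import math
import numpy as np

def shapley_values(f, x, background):
    """Exact Shapley values of model f at sample x, relative to background data."""
    x = np.asarray(x, float)
    background = np.asarray(background, float)
    M = len(x)

    def value(subset):
        # Fix the features in `subset` to x, leave the rest at background values
        data = background.copy()
        idx = list(subset)
        if idx:
            data[:, idx] = x[idx]
        return float(np.mean([f(row) for row in data]))

    phi = np.zeros(M)
    for i in range(M):
        others = [j for j in range(M) if j != i]
        for k in range(M):
            weight = math.factorial(k) * math.factorial(M - k - 1) / math.factorial(M)
            for S in itertools.combinations(others, k):
                phi[i] += weight * (value(S + (i,)) - value(S))
    phi0 = value(())          # average model output over the background samples
    return phi0, phi

def f(row):
    """Toy stand-in for the forecasting model: one linear term, one interaction."""
    return 2.0 * row[0] + row[1] * row[2]

rng = np.random.default_rng(1)
background = rng.normal(size=(20, 3))   # stand-in for training samples
x = np.array([1.0, 2.0, 3.0])
phi0, phi = shapley_values(f, x, background)
reconstructed = phi0 + phi.sum()        # should equal f(x)
```

The base value plus the per-feature contributions reproduces the model output for the sample, which is the property the formula in claim 10 expresses. Ranking the absolute contributions is one way the main influencing factors could then be identified.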