Abnormal data analysis method for dynamic water level monitoring
Technical Field
The invention belongs to the technical field of hydrological data processing, and particularly relates to an abnormal data analysis method for dynamic water level monitoring.
Background
During hydrological observation, the recorded water level data are highly sensitive to human activities and to system changes in the hydrological process, and a certain number of abnormal values are often present. Most abnormal values arise when the measuring instrument is manually lifted out of the water, which makes the recorded water pressure and relative water depth abnormal. Such values do not faithfully reflect the hydrological process and need to be removed by reasonable technical means.
In the prior art, the analysis of abnormal monitoring data goes no further than screening out and rejecting data that are very obviously abnormal. Abnormal data that are hard to identify locally are usually identified and rejected step by step while the data are being used, which involves a large workload, long time consumption, low efficiency and strong uncertainty. For the centralized analysis and processing of abnormal values in large batches of monitoring data, no corresponding scientific method, or similar method of analysis and rejection, exists.
Disclosure of Invention
To provide a method for analyzing and rejecting abnormal water level data of a water body that is scientifically sound and accurate in its judgments, convenient to operate, and improves the efficiency of data checking, the invention provides an abnormal data analysis method for dynamic water level monitoring, realized by the following technical scheme.
An abnormal data analysis method for water level dynamic monitoring comprises the following steps:
S1, continuously recording the precipitation process at an underwater sampling point of the basin under test over a long-term period; continuously recording the water pressure and water temperature at the sampling point at a fixed time interval Δt within the long-term period, to obtain a continuous water pressure data sequence and a continuous water temperature data sequence of the sampling point for the long-term period;
S2, converting each water pressure value in the continuous water pressure data sequence into the relative water depth from the water surface to the sampling point, to obtain a continuous relative water depth data sequence of the sampling point for the long-term period;
S3, removing obviously abnormal data from the continuous relative water depth data sequence, with reference to the precipitation process recorded in step S1 over the long-term period, to obtain a plurality of relative water depth data subsequences A; each relative water depth data subsequence A comprises a plurality of relative water depth data a separated by the same time interval Δt;
S4, calculating the increment x between the relative water depth data a at every 2 adjacent moments in each relative water depth data subsequence A to obtain a plurality of increment subsequences X, and combining all the increment subsequences X to obtain an increment sequence XX;
S5, estimating the confidence interval of the increments x in the increment sequence XX at a specific confidence level, and determining the upper and lower limit thresholds of the confidence interval, namely the upper and lower limit thresholds within which each increment is accepted;
S6, screening out all abnormal increments in the increment sequence XX that fall outside the confidence interval, taking the upper and lower limit thresholds of the confidence interval as the reference;
S7, further analyzing the rationality of each abnormal increment, with reference to the precipitation process and the water temperature data sequence recorded in step S1 over the long-term period, and eliminating the unreasonable abnormal increments.
At a sampling point in the water area under test, a suitable instrument records a large amount of dynamic water pressure data over a long-term period (e.g., 1 month, half a year, 1 year or 3 years), with all data measured at the same short time interval (e.g., 5 min or 10 min). On this basis, the invention screens and eliminates abnormal data by controlling the water level increment. In the first stage, steps S1 to S3, obviously abnormal data are removed in advance by judging the rise and fall of the water level during rainfall; this is the most readily adopted conventional means. In the second stage, the relative water depth data subsequences that remain after this removal are used to calculate the increment (i.e., the water level difference) between the relative water depth data at adjacent moments, and the increments are combined into the increment sequence of the sampling point for the specific time interval over the long-term period. The increments are assumed to follow a certain random distribution function; the confidence interval of the increment at a specific confidence level is estimated to determine the range within which an increment is accepted, and every abnormal increment outside this range is screened out. Finally, whether each abnormal increment can be accepted is judged from the actual conditions, such as precipitation and water temperature, at the moment of that increment.
Preferably, in step S2, each water pressure value in the continuous water pressure data sequence is converted into the relative water depth from the water surface to the sampling point by the formula:
$H = (P - P_0)/9.8$,
where $H$ is the relative water depth of the sampling point at the given moment, in m; $P$ is the water pressure at the sampling point, in kPa; and $P_0$ is the atmospheric pressure above the water surface at the sampling point, in kPa.
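The conversion in step S2 can be sketched in Python as follows. This is a minimal illustration rather than part of the claimed method. The names `pressure_to_depth` and `depth_sequence` are hypothetical, and the sketch assumes that P is the pressure recorded by the submerged sensor and P0 the atmospheric pressure above the water surface, both in kPa, so that a submerged sensor yields a positive depth.

```python
from typing import List

# Divisor taken from the formula H = (P - P0) / 9.8 (kPa per metre of water).
KPA_PER_METRE = 9.8


def pressure_to_depth(p_kpa: float, p0_kpa: float) -> float:
    """Return the relative water depth H in metres.

    Assumption: p_kpa is the pressure at the submerged sampling point and
    p0_kpa is the atmospheric pressure above the water surface.
    """
    return (p_kpa - p0_kpa) / KPA_PER_METRE


def depth_sequence(pressures_kpa: List[float], p0_kpa: float) -> List[float]:
    """Convert a whole water pressure sequence into a relative depth sequence."""
    return [pressure_to_depth(p, p0_kpa) for p in pressures_kpa]
```

A reading equal to the atmospheric pressure maps to a depth of 0, which is the kind of value that step S3 treats as obviously abnormal.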
Preferably, in step S3, obviously abnormal data in the continuous relative water depth data sequence are identified as follows: when the relative water depth at a given moment is 0 or negative, the relative water depth at that moment is obviously abnormal data;
removing the obviously abnormal data yields n relative water depth data subsequences A, each comprising m relative water depth data a separated by the same time interval Δt, namely
$A_i = \{a_{i,1}, a_{i,2}, \ldots, a_{i,m}\}, \quad 1 \le i \le n$,
where $A_i$ is the i-th relative water depth data subsequence, which contains the m relative water depth data $a_{i,1}, a_{i,2}, \ldots, a_{i,m}$.
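One possible way to carry out this removal is sketched below. The function name `split_valid_runs` is hypothetical. The sketch assumes the input is the continuous relative water depth sequence sampled at a uniform interval Δt, and it lets each resulting subsequence have its own length.

```python
from typing import List


def split_valid_runs(depths: List[float]) -> List[List[float]]:
    """Drop readings that are 0 or negative and return the remaining
    contiguous runs, i.e. the subsequences A_1 ... A_n."""
    runs: List[List[float]] = []
    current: List[float] = []
    for h in depths:
        if h > 0:
            # Valid reading: extend the current run.
            current.append(h)
        elif current:
            # Obviously abnormal reading: close the current run.
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs
```

Splitting at the removed readings keeps the spacing inside every subsequence equal to Δt, so that later differences are never taken across a gap.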
Preferably, step S4 is specifically: for each relative water depth data subsequence $A_i = \{a_{i,1}, a_{i,2}, \ldots, a_{i,m}\}$, calculating the increment x between adjacent data, such as $a_{i,m}$ and $a_{i,m-1}$, to obtain the increment subsequence $X_i$; each increment subsequence contains m-1 increments x, namely
$X_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,m-1}\}, \quad x_{i,m-1} = a_{i,m} - a_{i,m-1}$,
where $x_{i,m-1}$ is the increment between the 2 adjacent relative water depth data $a_{i,m}$ and $a_{i,m-1}$ in the relative water depth data subsequence $A_i$;
then all the increment subsequences $X_i$ are combined to obtain the increment sequence XX, $XX = \{X_1, X_2, \ldots, X_n\}$, i.e., the increment sequence XX contains the increments of all the relative water depth data of the sampling point at the specific time interval Δt over the long-term period.
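The increment calculation of step S4 can be illustrated with a short Python sketch. The names `increments` and `pooled_increments` are hypothetical, and each subsequence is assumed to be a plain list of relative water depths at uniform spacing Δt.

```python
from typing import List


def increments(run: List[float]) -> List[float]:
    """Return the m-1 increments of one subsequence A_i:
    each value is a later reading minus the reading just before it."""
    return [later - earlier for earlier, later in zip(run, run[1:])]


def pooled_increments(runs: List[List[float]]) -> List[float]:
    """Combine the increment subsequences X_1 ... X_n into one flat sequence XX."""
    xx: List[float] = []
    for run in runs:
        xx.extend(increments(run))
    return xx
```

A subsequence containing a single reading contributes no increment, and no increment is formed between the last reading of one subsequence and the first reading of the next.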
Preferably, step S5 is specifically: assuming that the increments x in the increment sequence XX follow a specific distribution function, the confidence interval of the increments x at a specific confidence level is estimated, and the upper and lower limit thresholds within which an increment x is accepted at that confidence level are determined.
More preferably, in step S5, the specific distribution function is a normal distribution, a Gumbel distribution, or a t distribution.
More preferably, in step S5, the particular confidence level is 90%, 95%, or 99%.
More preferably, in step S5, the particular confidence level is 95%.
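For the normal distribution case, steps S5 and S6 can be sketched as below. This is an illustration under stated assumptions: the names `acceptance_interval` and `flag_abnormal` are hypothetical, the interval is taken as the mean plus or minus z times the standard deviation (the form used in Example 1), and the sample mean and sample standard deviation are used as estimates. The Gumbel and t distribution options are not covered by this sketch.

```python
from statistics import NormalDist, mean, stdev
from typing import List, Tuple


def acceptance_interval(xx: List[float], confidence: float = 0.95) -> Tuple[float, float]:
    """Return (lower, upper) thresholds for the increments, assuming they
    are normally distributed. confidence must lie strictly between 0 and 1."""
    mu = mean(xx)
    sigma = stdev(xx)  # sample standard deviation; needs at least 2 increments
    # Two-sided quantile of the standard normal, about 1.96 for 95 %.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mu - z * sigma, mu + z * sigma


def flag_abnormal(xx: List[float], lower: float, upper: float) -> List[int]:
    """Return the positions in XX whose increment lies outside [lower, upper]."""
    return [k for k, x in enumerate(xx) if x < lower or x > upper]
```

Raising the confidence level from 90% to 95% or 99% widens the interval, so fewer increments are flagged for the rationality check in step S7.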
Preferably, in step S7, the rationality of each abnormal increment obtained in step S6 is further analyzed as follows:
with reference to the precipitation process and the water temperature data recorded in step S1 for the sampling point over the long-term period: when an abnormal increment is smaller than the lower limit threshold accepted for the increment x at the specific confidence level and the water temperature changes obviously at that moment, or when an abnormal increment is larger than the upper limit threshold accepted for the increment x at the specific confidence level, the water temperature changes obviously at that moment and no rainfall occurs, the abnormal increment is judged to be rejected, and the correspondingly recorded water pressure and relative water depth data are rejected as well; when an abnormal increment is larger than the upper limit threshold accepted for the increment x at the specific confidence level, the water temperature is constant or changes only slightly, and rainfall occurs, the abnormal increment is judged to be caused by the rainfall, and the correspondingly recorded water pressure and relative water depth data are accepted.
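The judgment rule of step S7 can be written as a small decision function. The sketch below is an illustration only: the name `judge_increment` and the result labels are hypothetical, the caller is assumed to have decided already whether the water temperature "changes obviously" and whether rainfall occurred at that moment, and combinations that the rule above does not cover are returned as "undetermined" rather than guessed.

```python
def judge_increment(x: float, lower: float, upper: float,
                    temp_changed: bool, rainfall: bool) -> str:
    """Classify one increment as 'normal', 'reject', 'accept' or 'undetermined'.

    temp_changed: True if the water temperature changes obviously at that moment.
    rainfall:     True if rainfall occurs at that moment.
    """
    if lower <= x <= upper:
        return "normal"  # inside the confidence interval, not abnormal
    if x < lower:
        # Sharp drop together with an obvious temperature change.
        return "reject" if temp_changed else "undetermined"
    # From here on, x > upper.
    if temp_changed and not rainfall:
        return "reject"
    if not temp_changed and rainfall:
        return "accept"  # sharp rise explained by rainfall
    return "undetermined"  # case not covered by the stated rule
```

For a rejected increment, the correspondingly recorded water pressure and relative water depth data would be removed as well.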
Compared with the prior art, the invention has the beneficial effects that:
1. The invention provides a method for identifying, screening and eliminating abnormal fluctuations in the dynamically changing water level of any water body. The method is suitable for analyzing and processing long-term, large-batch data, and the whole calculation can be carried out directly in the monitoring equipment or on a computer, which effectively reduces the time spent on data checking and the uncertainty of processing abnormal data manually;
2. The water level increment is used as the control variable, and its confidence interval at a specific confidence level (e.g., 95%) is estimated with a specific statistical model (e.g., a normal distribution curve), so the principle and basis of the judgment are scientific and reasonable;
3. Abnormal water level changes that may be caused by human activities and system changes, such as data reading, high-intensity precipitation and water temperature change, are analyzed case by case, which improves the reliability of the monitoring data.
Drawings
FIG. 1 is a flow chart of the abnormal data analysis method for dynamic water level monitoring of Example 1;
FIG. 2 is a graph of the precipitation process monitored and recorded at a sampling point of a basin from July 1 to July 31, 2017, with a recording time interval of 5 min;
FIG. 3 is a graph of the relative water depth data monitored and recorded at the sampling point of the basin from July 1 to July 31, 2017, with a recording time interval of 5 min;
FIG. 4 is a graph of the increment data calculated from the relative water depth data of FIG. 3 in Example 1, with a recording time interval of 5 min;
FIG. 5 is a graph of the acceptable relative water depth data obtained in Example 1 after removing, based on the data of FIG. 2 and FIG. 4, the abnormal relative water depth data corresponding to the abnormal increments.
Detailed Description
The technical solutions of the embodiments in this patent will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of this patent, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the patent without making creative efforts, shall fall within the protection scope of the patent.
Example 1
This embodiment analyzes abnormal data from the dynamic water level monitoring of a sampling point in a water area from July 1 to July 31, 2017, and specifically comprises the following steps:
S1, continuously recording the precipitation process of the basin under test from July 1 to July 31, 2017 (the whole month of July), with a HOBO pressure water level meter used for the monitoring; continuously recording the water pressure and water temperature at the sampling point at a time interval Δt of 5 min within this period, to obtain a continuous water pressure data sequence and a continuous water temperature data sequence of the sampling point for the period;
S2, converting each water pressure value in the continuous water pressure data sequence of step S1 by the formula
$H = (P - P_0)/9.8$
into the relative water depth from the water surface to the sampling point, to obtain a continuous relative water depth data sequence of the sampling point for July, where $H$ is the relative water depth of the sampling point at the given moment, in m; $P$ is the water pressure at the sampling point, in kPa; and $P_0$ is the atmospheric pressure above the water surface at the sampling point, in kPa;
S3, with reference to the precipitation process of the sampling point in July recorded in step S1, when the relative water depth at a given moment in July is 0 or negative, the relative water depth at that moment is obviously abnormal data.
Removing the obviously abnormal data yields n relative water depth data subsequences A, each comprising m relative water depth data a, with a recording interval of 5 min between every two adjacent relative water depth data a in each subsequence A, namely
$A_i = \{a_{i,1}, a_{i,2}, \ldots, a_{i,m}\}, \quad 1 \le i \le n$,
where $A_i$ is the i-th relative water depth data subsequence, which contains the m relative water depth data $a_{i,1}, a_{i,2}, \ldots, a_{i,m}$;
S4, calculating the increment x between the relative water depth data a at every 2 adjacent moments in each relative water depth data subsequence A to obtain an increment subsequence X, and combining all the increment subsequences X to obtain an increment sequence XX, specifically as follows:
for each relative water depth data subsequence $A_i = \{a_{i,1}, a_{i,2}, \ldots, a_{i,m}\}$, calculating the increment x between adjacent data, such as $a_{i,m}$ and $a_{i,m-1}$, to obtain the increment subsequence $X_i$; each increment subsequence contains m-1 increments x, namely
$X_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,m-1}\}, \quad x_{i,m-1} = a_{i,m} - a_{i,m-1}$,
where $x_{i,m-1}$ is the increment between the 2 adjacent relative water depth data $a_{i,m}$ and $a_{i,m-1}$ in the relative water depth data subsequence $A_i$;
then all the increment subsequences $X_i$ are combined to obtain the increment sequence XX, $XX = \{X_1, X_2, \ldots, X_n\}$, i.e., the increment sequence XX contains the increments of all the relative water depth data of the sampling point at the specific time interval Δt over the long-term period.
S5, assuming that the increments x in the increment sequence XX follow a normal distribution, estimating the confidence interval of the increments x at the 95% confidence level, and determining the upper and lower limit thresholds within which an increment x is accepted at the 95% confidence level;
S6, screening out all abnormal increments in the increment sequence of step S4 that fall outside the confidence interval, taking the upper and lower limit thresholds of the confidence interval as the reference;
S7, with reference to the precipitation process and the water temperature data recorded in step S1 for the sampling point over the period: when an abnormal increment is smaller than the lower limit threshold accepted for the increment x at the 95% confidence level and the water temperature changes obviously at that moment, or when an abnormal increment is larger than the upper limit threshold accepted for the increment x at the 95% confidence level, the water temperature changes obviously at that moment and no rainfall occurs, the abnormal increment is judged to be rejected, and the correspondingly recorded water pressure and relative water depth data are rejected as well; when an abnormal increment is larger than the upper limit threshold accepted for the increment x at the 95% confidence level, the water temperature is constant or changes only slightly, and rainfall occurs, the abnormal increment is judged to be caused by the rainfall, and the correspondingly recorded water pressure and relative water depth data are accepted.
The flow chart of the analysis method of this embodiment is shown in FIG. 1. The recorded precipitation process graph, the relative water depth data graph, the increment data graph, and the acceptable relative water depth data graph generated after removing the abnormal relative water depth data at the moments of the abnormal increments are shown in FIG. 2 to FIG. 5, respectively.
As can be seen from FIG. 2, owing to rainfall there are several steep rises and falls within the rainfall periods, and the observed values at these moments might be abnormal values. However, a combined comparison of FIG. 2 to FIG. 4 shows that the water level changes in FIG. 3 and FIG. 4 contain no abnormal relative water depth data that fail to match the precipitation process of FIG. 2.
According to step S5, the mean and standard deviation of the water level increments over each 5 min interval were obtained with the norm function in Matlab. In this embodiment, the mean μ of the 5 min increments is -1.2 mm and the standard deviation σ is 158.6 mm; the 95% confidence interval is therefore (μ - 1.96σ, μ + 1.96σ), i.e., (-312.1, 309.7) mm.
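The arithmetic behind this interval can be checked directly from the reported values. The working below assumes that the standard deviation is expressed in the same unit as the mean, i.e. mm, and rounds to one decimal place.

```latex
\begin{aligned}
1.96\,\sigma &= 1.96 \times 158.6 = 310.856,\\
\mu - 1.96\,\sigma &= -1.2 - 310.856 = -312.056 \approx -312.1\ \text{mm},\\
\mu + 1.96\,\sigma &= -1.2 + 310.856 = 309.656 \approx 309.7\ \text{mm}.
\end{aligned}
```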
Finally, against the 95% confidence interval, the rationality of every abnormal 5 min increment falling outside the interval between July 1 and July 31 was analyzed and judged, and the abnormal relative water depth data were listed and summarized in Table 1 below.
Table 1. Summary of abnormal relative water depth data for Example 1
In all 6 abnormal periods listed in Table 1, no rainfall occurred, the water level dropped sharply within a short time and then rose rapidly, and the water temperature likewise rose sharply within a short time and then fell rapidly. The increments and relative water depths in these periods can therefore be judged to be abnormal values and need to be removed.