
CN119046750A - Water quality detection method and system - Google Patents

Publication number
CN119046750A
Authority
CN
China
Prior art keywords
water quality
data
model
training
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411050291.5A
Other languages
Chinese (zh)
Inventor
袁兵
杨建波
贺艳红
何伟
冯莲慧
陈小明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Yunnuoxin Testing Technology Co ltd
Original Assignee
Sichuan Yunnuoxin Testing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Yunnuoxin Testing Technology Co ltd
Priority to CN202411050291.5A
Publication of CN119046750A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/18 Water
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2123/00 Data types
    • G06F2123/02 Data types in the time domain, e.g. time-series data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Pathology (AREA)
  • Immunology (AREA)
  • Biochemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Food Science & Technology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to the technical field of water quality detection and discloses a water quality detection method and system intended to overcome the limitations of traditional water quality detection methods and to achieve real-time prediction and monitoring of water quality changes. A data acquisition layer first collects historical and real-time water quality data covering a variety of pollutant concentrations and environmental parameters. A data preprocessing layer performs data cleaning, missing value processing and outlier detection to ensure data quality. A model training and verification layer uses a machine learning algorithm to train and verify a water quality change prediction model and optimizes model performance through parameter tuning. A prediction and application layer inputs real-time data into the model, predicts future water quality changes, analyzes the prediction results to identify potential pollution events, and adjusts the water treatment strategy accordingly. The invention improves the timeliness and accuracy of water quality detection and supports water resource management and environmental protection.

Description

Water quality detection method and system
Technical Field
The invention relates to the technical field of water quality detection, in particular to a water quality detection method and system.
Background
Water quality safety is one of the core problems of environmental protection and water resource management, and is important for maintaining ecological balance and human health. Currently, water quality detection relies mainly on manual on-site sampling followed by laboratory analysis, supplemented by some automatic monitoring stations and mobile monitoring equipment.
Although manual on-site sampling followed by laboratory analysis can provide relatively accurate water quality data, the method has the disadvantages of a complex workflow, long turnaround time, and dependence on the skill of the personnel and the preservation condition of the sample. It cannot reflect changes in water quality in real time. In sudden pollution events in particular, it is difficult to provide early warning information in time, which poses challenges for pollution control and emergency response.
Traditional water quality detection methods also lack the ability to predict future water quality trends. They are primarily concerned with current water quality conditions and cannot provide information about how water quality may change in the future. This is a significant drawback for giving timely early warning of potential water quality problems and for taking treatment action in advance. Therefore, a water quality detection method and system are proposed.
Disclosure of Invention
The invention aims to provide a water quality detection method and a water quality detection system that solve the problems described in the Background.
To achieve this purpose, the invention provides the following technical solution. A water quality detection method comprises the following steps:
collecting historical water quality data, specifically covering water source samples from different time periods, measuring and recording the concentrations of various pollutants in the water source samples with a water quality detection instrument, including ammonia nitrogen, total phosphorus, heavy metals and total bacteria count, and simultaneously recording the flow rate, temperature, pH value, Chemical Oxygen Demand (COD) and Biological Oxygen Demand (BOD) of the water source to form a complete historical water quality data set;
preprocessing the collected historical water quality data, including data cleaning, missing value processing and abnormal value detection;
dividing the preprocessed historical water quality data into a training set and a test set for training and verifying a model;
training a water quality change prediction model with a machine learning algorithm, wherein the water quality change prediction model is used for predicting the trend of water quality parameters over a future period according to current water quality detection data;
performing parameter tuning and model training on the selected machine learning algorithm, and optimizing the algorithm parameters and model structure through iteration;
verifying the trained model with the test set, and evaluating the prediction performance and generalization capability of the model;
deploying a real-time sensor at a water quality monitoring point, periodically collecting water quality data, and keeping the collected data consistent with the historical data in parameter types and measurement units;
sending the water quality data collected in real time to a data center through a data transmission system, preprocessing the real-time data in the data center, including data cleaning and format conversion, so as to meet the input requirements of the machine learning model, and inputting the preprocessed real-time data into the trained water quality change prediction model;
predicting, by the water quality change prediction model, changes in the water quality parameters over a future period according to the historical data and the patterns learned during training;
adjusting the parameters and operation strategies of the water treatment process according to the prediction result and the real-time water quality data.
Preferably, in preprocessing the historical water quality data, the data cleaning method comprises the following steps:
identifying and removing duplicate records in the data set;
checking data consistency, and correcting or deleting records with format errors or unreasonable values;
encoding non-numeric data into a numeric form suitable for analysis.
Preferably, in preprocessing the historical water quality data, the missing value processing method comprises:
filling missing values of numerical variables by mean filling, median filling or mode filling;
filling missing values of categorical variables with the most frequently occurring class;
applying an interpolation method, including linear interpolation and polynomial interpolation, to estimate and fill missing values in time series data.
Preferably, in preprocessing the historical water quality data, the abnormal value detection method includes:
calculating a Z-score value for each data point by the Z-score method and, if the absolute value of the Z-score is larger than a preset threshold, determining the data point to be an abnormal value, wherein the formula is as follows:
$$Z = \frac{x - \mu}{\sigma}$$
where x is the raw data, μ is the mean of the data, σ is the standard deviation of the data, and Z is the calculated Z-score value.
Preferably, a long short-term memory (LSTM) network model is selected to construct the water quality prediction model, and the specific method comprises:
designing a multi-layer LSTM neural network structure comprising an input layer, one or more LSTM hidden layers and an output layer;
configuring the input layer to receive and process training data of a plurality of time steps;
setting a suitable number of LSTM units in the LSTM hidden layer to learn and memorize the time dependence in the data and capture the water quality change trend;
configuring the output layer to generate a prediction result of future water quality parameters.
Preferably, initializing the parameters of the water quality prediction model comprises:
S1, initializing the weight parameters and bias parameters of the LSTM neural network;
S2, setting a learning rate that controls the step length of network weight updates, so as to ensure the stability and convergence speed of training;
S3, determining the batch size, namely the number of samples used in each training iteration, to balance training speed and memory use;
S4, setting the number of training rounds, namely the number of passes over the whole data set;
S5, selecting adaptive moment estimation (Adam) as the optimizer for adjusting the network weights during training;
S6, defining the mean square error (MSE) as the loss function for quantifying the difference between model predictions and actual changes.
Preferably, the specific method for training the water quality prediction model is as follows:
a. performing iterative training on the LSTM neural network using the training data set, wherein in each iteration the training data is used as input and the corresponding future change values are used as target output;
b. computing the output of the network by forward propagation, and calculating the error between the predicted value and the actual value with the defined loss function;
c. applying the back propagation algorithm and the optimizer to update the weight parameters and bias parameters of the network according to the calculated error, so as to minimize the prediction error;
d. during training, periodically evaluating the performance of the model on the verification data set, judging whether overfitting occurs by monitoring the loss function value and the verification error, and adjusting the model structure or terminating training early accordingly.
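Step d can be illustrated with a minimal early-stopping sketch. It assumes a list of per-epoch verification losses is already available; the function name and the `patience` and `min_delta` parameters are hypothetical and are not specified in the source.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch index at which training would stop early, or None.

    Training stops once the verification loss has failed to improve on its
    best value by more than min_delta for `patience` consecutive epochs.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss      # new best verification loss: reset the counter
            waited = 0
        else:
            waited += 1      # no improvement this epoch
            if waited >= patience:
                return epoch
    return None
```

A rising verification loss while the training loss keeps falling is the usual sign of overfitting that this check is meant to catch.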
Preferably, a water quality detection system comprises:
a data acquisition layer, comprising a real-time sensor module, used for periodically acquiring water quality data at water quality monitoring points, wherein the water quality data comprise the concentrations of various pollutants, flow rate, temperature, pH value, Chemical Oxygen Demand (COD) and Biological Oxygen Demand (BOD) in water source samples from different time periods;
a data preprocessing layer, comprising a data cleaning module, a missing value processing module and an abnormal value detection module, used for preprocessing the collected water quality data to form a complete historical water quality data set, and for preprocessing the real-time data, including data cleaning and format conversion, so as to meet the input requirements of the machine learning model;
a model training and verification layer, comprising a machine learning algorithm module, a parameter tuning module and a model training module, used for training a water quality change prediction model with the machine learning algorithm, performing parameter tuning and model training on the selected machine learning algorithm, iteratively optimizing the algorithm parameters and model structure, verifying the trained model with a test set, evaluating the prediction performance and generalization capability of the model, and selecting the model parameters with the best performance as the final water quality change prediction model parameters;
a prediction and application layer, comprising a real-time data input module, a water quality change prediction module and a result analysis module, used for inputting the preprocessed real-time data into the trained water quality change prediction model, predicting changes in the water quality parameters over a future period, receiving the prediction result output by the machine learning model, analyzing the prediction result, identifying potential high pollution loads or harmful events, and adjusting the parameters and operation strategies of the water treatment process according to the prediction result and the real-time water quality data.
Compared with the prior art, the invention has the beneficial effects that:
According to the invention, the real-time sensor is deployed, so that water quality data can be periodically collected and sent to the data center for analysis, and the real-time monitoring of water quality change is realized. By combining with a machine learning prediction model, the water quality parameter change trend in a period of time in the future can be predicted, and timely early warning is provided for potential high pollution load or harmful events, so that sudden pollution events are effectively treated.
Traditional methods rely on manual sampling and laboratory analysis, and are complex in flow and time-consuming. According to the invention, the machine learning algorithm is utilized to carry out deep analysis on the water quality data, so that the accuracy of water quality detection is improved. In the sudden pollution event, the invention can provide early warning information rapidly, is helpful for starting an emergency response mechanism in time, controls pollution diffusion and protects ecological environment and human health.
Drawings
FIG. 1 is a step diagram of a water quality detection method according to the present invention;
FIG. 2 is a training flow chart of a water quality prediction model.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to FIGS. 1-2, the present invention provides a water quality detection method comprising the following steps:
Step 1, collecting historical water quality data: water source samples from different time periods are covered to ensure the diversity and representativeness of the data. Using a water quality testing instrument, the concentrations of various contaminants in the water source samples are measured and recorded, including but not limited to ammonia nitrogen, total phosphorus, heavy metals, and total bacteria count. The water source flow rate, temperature, pH, Chemical Oxygen Demand (COD), and Biological Oxygen Demand (BOD) are recorded at the same time to form a complete historical water quality data set.
Step 2, data preprocessing: the collected historical water quality data are preprocessed, including data cleaning, missing value processing and abnormal value detection, to ensure the quality and consistency of the data.
Step 3, data division: the preprocessed historical water quality data are divided into a training set and a test set for training and verifying the model.
Step 4, model training: a water quality change prediction model is trained with a machine learning algorithm. The model predicts the trend of water quality parameters over a future period from the current water quality detection data.
Step 5, parameter tuning and model training: parameter tuning and model training are performed on the selected machine learning algorithm, and the prediction performance and generalization capability of the model are improved by iteratively optimizing the algorithm parameters and model structure.
Step 6, model verification: the trained model is verified with the test set, and its prediction performance and generalization capability are evaluated. According to the verification result, the model parameters with the best performance are selected as the final water quality change prediction model parameters.
Step 7, deploying real-time sensors: real-time sensors are deployed at water quality monitoring points and periodically collect water quality data. The collected data must be consistent with the historical data in parameter types and units of measurement.
Step 8, data transmission and preprocessing: the water quality data acquired in real time are sent to a data center through a data transmission system. In the data center, the real-time data are preprocessed, including data cleaning and format conversion, to meet the input requirements of the machine learning model.
Step 9, real-time prediction: the preprocessed real-time data are input into the trained water quality change prediction model. The model predicts changes in the water quality parameters over a future period according to the historical data and the patterns learned during training.
Step 10, result analysis and application: the prediction result output by the machine learning model is received and analyzed to identify potential high pollution loads or harmful events. The parameters and operation strategies of the water treatment process are adjusted according to the prediction result and the real-time water quality data to cope with potential water quality problems.
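The result analysis in Step 10 could be sketched as a simple comparison of predicted values against alert limits. This is only an illustration: the source does not say how high pollution loads are identified, and the parameter names and limit values below are hypothetical placeholders rather than values from the patent or from any particular standard.

```python
# Hypothetical alert limits (mg/L); real limits depend on the applicable standard.
LIMITS = {"ammonia_nitrogen": 1.0, "total_phosphorus": 0.2, "cod": 20.0}

def flag_exceedances(forecast, limits=LIMITS):
    """Return (parameter, step, value) for every prediction above its limit.

    forecast maps a parameter name to a list of predicted values,
    one per future time step. Parameters without a limit are ignored.
    """
    alerts = []
    for name, values in forecast.items():
        limit = limits.get(name)
        if limit is None:
            continue
        for step, value in enumerate(values):
            if value > limit:
                alerts.append((name, step, value))
    return alerts
```

The returned list could then drive whatever adjustment of the treatment process an operator or control system chooses to apply.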
The invention is further illustrated below in connection with Examples 1 to 3:
Example 1:
In preprocessing the historical water quality data, data cleaning is implemented through the following stages:
Identifying and removing duplicate records in the data set: data processing software or the deduplication functions of a programming language (such as Python or R) are used to compare all records in the historical water quality data set one by one. Identical records are identified by comparing their fields (e.g., sampling time, sampling location, contaminant concentration). The identified duplicates are deleted from the data set so that each record is unique, avoiding bias in subsequent analysis.
Checking data consistency and correcting or deleting records with format errors or unreasonable values: each record in the data set is checked to ensure that its format is correct and its data are reasonable. Records with format errors, such as incorrect date formats or inconsistent numerical formats, are corrected to ensure the consistency and readability of the data. Records with unreasonable data, such as contaminant concentrations outside the normal range or negative flow rates, are verified further. If verification confirms that the data are erroneous, the record is deleted from the data set.
Encoding non-numerical data into a numerical form suitable for analysis: non-numerical data in the data set, such as descriptive information on water quality category or pollution degree, are identified.
A reasonable coding scheme is designed according to the characteristics of the non-numerical data and the analysis requirements. For example, the water quality category may be coded by quality level, with "excellent" coded as 1, "good" as 2, and "poor" as 3.
The non-numerical data are then converted with the coding scheme into numerical form for use in subsequent analysis and modeling.
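The three cleaning stages can be sketched in plain Python as follows. The field names (`time`, `flow_rate`, `ph`, `quality`), the accepted date formats, and the 0 to 14 pH range check are assumptions made for illustration. The sketch also simply drops implausible records, whereas the text calls for verifying them first.

```python
from datetime import datetime

# Hypothetical coding scheme for the water quality category.
QUALITY_CODES = {"excellent": 1, "good": 2, "poor": 3}
# Date formats accepted on input; all are normalised to datetime objects.
DATE_FORMATS = ("%Y-%m-%d %H:%M", "%d/%m/%Y %H:%M")

def parse_time(text):
    """Try each known date format; return None if none of them matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            pass
    return None

def clean_records(records):
    """Deduplicate, validate and encode a list of raw record dicts."""
    seen = set()
    cleaned = []
    for rec in records:
        # Stage 1: drop records whose fields are all identical to an earlier one.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        # Stage 2: correct the date format, drop records that cannot be parsed.
        when = parse_time(rec["time"])
        if when is None:
            continue
        # Stage 2 (continued): drop physically unreasonable values.
        if rec["flow_rate"] < 0 or not 0 <= rec["ph"] <= 14:
            continue
        # Stage 3: encode the non-numerical category as a number.
        code = QUALITY_CODES.get(rec["quality"])
        if code is None:
            continue
        cleaned.append({"time": when, "flow_rate": rec["flow_rate"],
                        "ph": rec["ph"], "quality": code})
    return cleaned
```

In a real pipeline the rejected records would more likely be logged for manual verification than silently discarded.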
In preprocessing the historical water quality data, missing value processing is implemented through the following stages:
Filling missing values of numerical variables:
First, the numerical variables in the data set are identified and checked for missing values.
Missing values of numerical variables can be filled by mean filling, median filling or mode filling. The choice of method depends on the distribution of the variable and the analysis requirements. For example, if the variable is approximately normally distributed, mean filling may be more appropriate, and if the variable is noticeably skewed, median or mode filling may be more appropriate. The missing values of the numerical variables are filled with the selected method to ensure the completeness of the data set.
Filling missing values of categorical variables:
The categorical variables in the data set are identified and checked for missing values. Missing values of categorical variables are filled with the most frequently occurring class. This approach is based on the statistical mode: the most frequent class is considered the most likely true value of the missing entry. The most frequent class is determined by counting the occurrences of each class and is then used to fill the missing values of the categorical variable.
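The filling strategies for numerical and categorical variables can be sketched with the Python standard library. Missing entries are assumed to be represented by `None`; the function names are illustrative only.

```python
from statistics import mean, median, mode

def fill_numeric(values, strategy="median"):
    """Replace None entries with the mean, median or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

def fill_categorical(values):
    """Replace None entries with the most frequently occurring class."""
    observed = [v for v in values if v is not None]
    most_common = mode(observed)
    return [most_common if v is None else v for v in values]
```

With a skewed column such as `[1.0, None, 3.0, 100.0]`, the median fill inserts 3.0, while the mean fill would insert a value pulled upward by the single large observation.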
Estimating and filling missing values in time series data:
The time series data in the data set are identified and checked for missing values. Missing values in time series data can be estimated and filled by interpolation. Interpolation methods include linear interpolation and polynomial interpolation, and the choice depends on the characteristics of the time series and the analysis requirements. Linear interpolation is a simple method which assumes that the series changes linearly between the two observations before and after the missing value. The missing value is estimated from the slope of the line joining those two observations. Polynomial interpolation is a more complex method that fits a polynomial function to the time series data and estimates the missing values from the fit. It can accommodate more complex patterns of change in the series.
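Linear interpolation as described above can be sketched as follows. Missing values are assumed to be `None`, and `times` is assumed to hold numeric, strictly increasing time stamps. Gaps at the very start or end of the series are left unfilled, because they have no observation on one side.

```python
def interpolate_linear(times, values):
    """Fill None entries on the straight line between neighbouring observations."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is not None:
            continue
        before = [k for k in known if k < i]
        after = [k for k in known if k > i]
        if not before or not after:
            continue  # no observation on one side: leave the gap
        lo, hi = before[-1], after[0]
        # Slope of the line joining the two surrounding observations.
        slope = (values[hi] - values[lo]) / (times[hi] - times[lo])
        filled[i] = values[lo] + slope * (times[i] - times[lo])
    return filled
```

Using the actual time stamps rather than list positions keeps the estimate correct when samples are not evenly spaced.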
In preprocessing the historical water quality data, abnormal value detection is implemented through the following stages:
Application of the Z-score method:
First, the mean (μ) and standard deviation (σ) of each numerical variable in the data set are calculated. These two statistics are used in the subsequent Z-score calculation.
Next, for each data point x in the data set, the formula Z = (x - μ)/σ is applied, where Z is the resulting Z-score value.
This formula expresses the distance of the data point x from the data set mean μ in units of the standard deviation σ.
Identification and processing of outliers:
After the Z-score of each data point is obtained, a threshold must be set. The threshold is typically chosen according to the nature of the data and the analysis requirements; for example, 2.5, 3 or 3.5 may be chosen.
The absolute value of the Z-score of each data point is then compared with the threshold. If it is greater than the threshold, the data point is considered an outlier.
The identified outliers can be handled in several ways. For example, they may be deleted directly to maintain data consistency, or, if they have a practical meaning (e.g., they represent a particular event), they may be retained and handled specially.
In practical applications, the Z-score method may need to be tuned to the specific data and analysis requirements. For example, the threshold may be adjusted to identify outliers more accurately. In addition, other abnormal value detection methods (such as distribution-based or distance-based methods) can be combined with it to further improve the accuracy and reliability of abnormal value detection.
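The Z-score check can be written in a few lines. This sketch uses the population standard deviation (`statistics.pstdev`), which is an assumption: the source does not state whether the population or the sample standard deviation is intended.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose |Z| = |x - mu| / sigma exceeds threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing can be an outlier
    return [i for i, x in enumerate(values)
            if abs((x - mu) / sigma) > threshold]
```

Note that a single extreme value inflates both μ and σ, so in very short series even a gross error may fail to reach a threshold of 3.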
Example 2:
When constructing the water quality prediction model, a long short-term memory (LSTM) network is selected as the core algorithm. The specific implementation comprises the following:
Designing the network structure: a multi-layer LSTM neural network is designed that includes an input layer, one or more LSTM hidden layers, and an output layer. The structure is intended to exploit the strengths of LSTM in processing time series data and to capture how water quality parameters change over time.
Configuring the input layer: the input layer is configured to receive and process training data spanning a plurality of time steps. This means that the input layer receives a time-ordered series of water quality parameter data, such as contaminant concentration and water temperature at different points in time.
By taking data from a plurality of time steps as input, the LSTM model can better learn how the water quality parameters change over time, which provides rich information for the subsequent prediction task.
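Multi-time-step inputs are commonly prepared with a sliding window over the time-ordered series. The sketch below shows this for a single parameter, followed by a chronological training/test split. The window length, forecast horizon and 80/20 split are illustrative assumptions, not values given in the source.

```python
def make_windows(series, n_steps, horizon=1):
    """Turn a time-ordered series into (input window, future target) pairs.

    Each input holds n_steps consecutive values; the target is the value
    `horizon` steps after the end of the window.
    """
    samples = []
    for start in range(len(series) - n_steps - horizon + 1):
        window = series[start:start + n_steps]
        target = series[start + n_steps + horizon - 1]
        samples.append((window, target))
    return samples

def chronological_split(samples, train_fraction=0.8):
    """Split without shuffling, so the test set lies entirely after the training set."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]
```

Keeping the split chronological avoids evaluating the model on time steps that precede data it was trained on.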
Setting the LSTM hidden layer: a suitable number of LSTM units is set in the LSTM hidden layer. These LSTM units are the core of the model and are responsible for learning and memorizing the time dependencies in the data.
Each LSTM unit controls the flow of information through its internal gating mechanisms (forget gate, input gate and output gate), which allows it to capture the time series characteristics of the water quality parameters effectively.
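The gating mechanism can be made concrete with one forward time step of a standard LSTM cell, written in plain Python. This follows the commonly used LSTM formulation. The parameter layout (`params[name] = (W, U, b)` as nested lists) is an assumption for illustration and is not taken from the source.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, U, b, x, h):
    """Compute W.x + U.h + b for one gate; returns a list of length len(b)."""
    return [
        sum(w * xi for w, xi in zip(W[j], x))
        + sum(u * hi for u, hi in zip(U[j], h))
        + b[j]
        for j in range(len(b))
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'o', 'g' to (W, U, b)."""
    f = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]    # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], x, h_prev)]    # input gate
    o = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]    # output gate
    g = [math.tanh(z) for z in affine(*params["g"], x, h_prev)]  # candidate state
    # The forget gate scales the old cell state; the input gate admits new content.
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    # The output gate decides how much of the cell state is exposed as hidden state.
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c
```

Processing a window of measurements means calling `lstm_step` once per time step and passing the returned `h` and `c` into the next call.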
Configuring a plurality of LSTM hidden layers can further enhance the learning ability of the model and its ability to capture complex water quality trends.
Configuring the output layer: the output layer is configured to generate the prediction result for future water quality parameters. Depending on the prediction task, the output layer may contain one or more neurons, each corresponding to a different water quality parameter.
The neurons of the output layer receive the output of the LSTM hidden layer, process it further, and generate the predicted values of the future water quality parameters.
Model training and prediction: the designed LSTM model is trained with historical water quality data. During training, the model minimizes the prediction error by repeatedly adjusting its internal parameters. After training is completed, the LSTM model can be used to predict future water quality parameters: when data from new time steps are input, the model outputs the corresponding prediction.
The parameters of the water quality prediction model are initialized to include:
Initializing the weight and bias parameters of the LSTM neural network are the core of the network learning, which are continually adjusted during training to minimize the prediction error. When initializing these parameters, it is common to use a randomly generated manner, such as generating initial values using a normal distribution or a uniform distribution. More complex initialization methods, such as He or Glorot based initialization strategies, may also be employed to ensure the rationality and validity of parameter initialization.
Setting the learning rate: the learning rate is an important hyperparameter that controls the step size of network weight updates. Setting a proper learning rate is important for the stability and convergence speed of training. The learning rate is typically determined experimentally, and a learning rate decay strategy can be used to adjust it dynamically during training.
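One possible learning rate decay strategy is exponential decay, sketched below. The source does not say which decay schedule is used, so the schedule, the function name, and all numeric values here are assumptions.

```python
def exponential_decay(initial_lr, decay_rate, decay_steps, epoch):
    """Return initial_lr * decay_rate ** (epoch / decay_steps)."""
    return initial_lr * decay_rate ** (epoch / decay_steps)


# Hypothetical schedule: start at 0.01 and halve the rate every 10 epochs.
schedule = [exponential_decay(0.01, 0.5, 10, epoch) for epoch in range(0, 31, 10)]
```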
Determining the batch size: the batch size is the number of samples used in each training iteration. Its choice must balance training speed and memory usage. A smaller batch size reduces memory usage but requires more iterations per training round and yields noisier gradient estimates, whereas a larger batch size can shorten each training round on parallel hardware at the cost of higher memory usage. The batch size needs to be determined based on the specific data set and hardware conditions.
Setting the training rounds: the number of training rounds (epochs) is the number of complete passes over the whole data set. Setting a proper number of training rounds is critical for the model to learn adequately and converge. The number of training rounds needs to be determined experimentally, and an early stopping strategy can be used to prevent overfitting.
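The early stopping strategy mentioned above can be sketched as follows, assuming that a validation loss is recorded after every training round. The `patience` and `min_delta` parameters and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. If that never
    happens, the index of the last epoch is returned.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss   # new best validation loss
            wait = 0
        else:
            wait += 1     # no improvement in this epoch
            if wait >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical validation losses: the best value occurs at epoch 2.
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9], patience=3)
```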
Selecting the optimizer: the optimizer adjusts the network weights during training to minimize the loss function. Adaptive moment estimation (Adam) is chosen as the optimizer because it combines the advantages of the momentum method and RMSprop and adapts the effective step size of each parameter during training. The Adam optimizer is particularly effective for large-scale data and complex network structures.
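For reference, a single Adam update for one scalar parameter can be sketched as below, using the standard update rule with its commonly used default hyperparameters. The toy objective `(theta - 3) ** 2` and the learning rate of 0.1 are assumptions chosen only to show the update in action.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (momentum term)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment (RMSprop-like term)
    m_hat = m / (1 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy example: minimize (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
```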
Defining the loss function: the loss function quantifies the gap between the model's predictions and the actual changes. The mean squared error (MSE) is defined as the loss function because it measures the average squared error between the predicted values and the actual values. MSE is a commonly used regression loss function and is suitable for continuous value prediction tasks.
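The MSE loss can be written out in a few lines. The sample values below are made up for illustration.

```python
def mean_squared_error(actual, predicted):
    """MSE = (1 / n) * sum((y_i - y_hat_i) ** 2)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)


# Hypothetical measured and predicted values of one water quality parameter.
loss = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```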
Example 3:
A water quality detection system comprises a data acquisition layer, a data preprocessing layer, a model training and verifying layer and a prediction and application layer.
For the data acquisition layer:
A real-time sensor module is designed and implemented that can interface with various water quality monitoring sensors (e.g., contaminant concentration sensors, flow rate sensors, temperature sensors, pH sensors, etc.), ensuring that water quality data can be collected periodically from the water quality monitoring points.
The real-time sensor module is configured to cover water source samples in different time periods, so that the collected data contains key water quality parameters such as the concentrations of various pollutants, flow rate, temperature, pH value, chemical oxygen demand (COD), and biological oxygen demand (BOD).
A scheduled data acquisition task is implemented so that the real-time sensor module automatically collects water quality data at a preset time interval and stores the data in a local or remote database.
For the data preprocessing layer:
A data cleaning module is developed that preprocesses the collected water quality data, including removing duplicate data, correcting erroneous data, and handling inconsistent data, so as to ensure the accuracy and consistency of the data.
A missing value handling module is implemented that identifies missing values in the dataset and fills them using an appropriate filling strategy (e.g., mean filling, median filling, or mode filling) to ensure the integrity of the dataset.
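The filling strategies named above can be sketched with the Python standard library. The sketch assumes that missing readings are represented by `None`; that representation, the function name, and the sample pH values are assumptions.

```python
import statistics


def fill_missing(values, strategy="mean"):
    """Replace None entries with the mean, median or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot fill a series with no observed values")
    fillers = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }
    fill_value = fillers[strategy](observed)
    return [fill_value if v is None else v for v in values]


# Hypothetical pH series with one missing reading.
ph_series = [7.0, None, 7.4, 7.0]
filled_mean = fill_missing(ph_series, "mean")
filled_median = fill_missing(ph_series, "median")
filled_mode = fill_missing(ph_series, "mode")
```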
An outlier detection module is designed and implemented that detects outliers in the dataset using the Z-score method or other statistical methods and handles them appropriately (for example, by deleting or replacing them), so as to ensure that the data distribution is reasonable and the data is accurate.
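A minimal sketch of Z-score outlier detection follows. It computes z = (x - μ) / σ for each point and flags the points whose absolute Z-score exceeds a threshold. The threshold of 3.0 and the sample readings are assumptions; the source only speaks of a preset threshold.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the indices of values whose |z| = |x - mean| / std exceeds threshold."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be flagged
    return [i for i, x in enumerate(values) if abs((x - mu) / sigma) > threshold]


# Hypothetical pH readings: twenty normal values and one spike.
readings = [7.0] * 20 + [12.0]
outlier_indices = zscore_outliers(readings)
cleaned = [x for i, x in enumerate(readings) if i not in outlier_indices]
```

Note that with very few samples no point can reach a large Z-score, so the threshold may need to be lowered for short series.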
A data format conversion function is developed that converts the preprocessed water quality data into the input format required by the machine learning model, including steps such as feature extraction and data normalization or standardization, so that the data meets the input requirements of the machine learning model.
A function for constructing the historical water quality data set is implemented: the preprocessed water quality data is organized in time order to form a complete historical water quality data set, which provides data support for subsequent model training and verification.
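The normalization and time-ordered organization described above might look like the following sketch, which scales a series to the range [0, 1] and cuts it into fixed-length input windows with a future target value. The window length, the prediction horizon, and the sample values are assumptions.

```python
def min_max_normalize(series):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]  # constant series carries no scale information
    return [(x - lo) / (hi - lo) for x in series]


def make_windows(series, window, horizon=1):
    """Build (inputs, target) pairs from a time-ordered series.

    Each sample uses `window` consecutive values as input and the value
    `horizon` steps after the end of the window as the target.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Hypothetical COD readings ordered by time.
cod = [2.0, 4.0, 6.0, 8.0, 10.0]
normalized = min_max_normalize(cod)
samples = make_windows(normalized, window=3)
```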
The model training and verification layer comprises:
A machine learning algorithm module, which integrates several machine learning libraries, such as TensorFlow and PyTorch, and provides a rich choice of algorithms for water quality change prediction. A suitable machine learning algorithm, such as LSTM, is selected according to the characteristics of the water quality data and the prediction requirements, and model training is performed.
A parameter tuning module, which designs a parameter tuning strategy, including defining the parameter search space and setting the tuning target (such as minimizing the prediction error). Parameter optimization methods such as grid search and random search are applied to the selected machine learning algorithm. The model performance under different parameter combinations is evaluated through iterative training, and the optimal parameter combination is selected as the basis for model training.
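Grid search over a parameter search space can be sketched as below. The `evaluate` callback stands in for training the model with a given parameter combination and returning its validation error; the toy scoring function and the parameter names `learning_rate` and `hidden_units` are hypothetical.

```python
import itertools


def grid_search(search_space, evaluate):
    """Try every parameter combination and return the one with the lowest score."""
    names = sorted(search_space)
    best_params, best_score = None, float("inf")
    for combo in itertools.product(*(search_space[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)  # e.g. validation error of a trained model
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params):
    """Stand-in for real training; smallest at learning_rate=0.01, hidden_units=32."""
    return abs(params["learning_rate"] - 0.01) + abs(params["hidden_units"] - 32) / 100


space = {"learning_rate": [0.1, 0.01, 0.001], "hidden_units": [16, 32, 64]}
best_params, best_score = grid_search(space, toy_validation_error)
```

Random search follows the same pattern but samples a fixed number of combinations instead of enumerating all of them.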
A model training module, which trains the selected machine learning algorithm using the preprocessed historical water quality data set as training data. During training, a cross-validation strategy, such as K-fold cross-validation, is applied to evaluate the generalization ability of the model. Key indicators of the training process, such as loss function values and accuracy, are recorded for subsequent analysis and optimization.
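The index bookkeeping behind K-fold cross-validation can be sketched as follows. The sketch uses contiguous, unshuffled folds; for time series data a forward-chaining split that never validates on data older than the training data may be preferable, which the source does not discuss. The function name and the sizes are assumptions.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, validation_indices) pairs. Each
    sample appears in exactly one validation fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Distribute the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


folds = k_fold_indices(10, 3)
```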
The prediction and application layer comprises:
A real-time data input module, in which a real-time data interface is designed and implemented to receive the preprocessed real-time water quality data from the data acquisition layer. A data caching mechanism is implemented to improve the efficiency of accessing and processing real-time data during model prediction.
A water quality change prediction module, which inputs the real-time data into the trained water quality change prediction model, executes model inference, and predicts the changes in water quality parameters over a future period. A real-time output function is implemented so that prediction results are delivered promptly and accurately to subsequent processing modules or the user interface.
A result analysis module, which analyzes the prediction results output by the machine learning model and identifies potential high pollution loads or harmful events. An early warning mechanism is designed and implemented: when a prediction result exceeds a preset threshold, an early warning signal is triggered automatically to remind the relevant personnel to take countermeasures.
A visualization tool or report generation function is provided that presents the prediction results to the user as charts, reports, and the like, helping the user intuitively understand the water quality trend and the prediction results.
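The early warning mechanism of the result analysis module can be illustrated with a simple threshold check over predicted values. The parameter names, threshold values, and predictions below are invented for illustration and are not regulatory limits or values from the source.

```python
def check_alerts(predictions, thresholds):
    """Return (step, parameter, value, limit) for every prediction above its limit.

    predictions is a list of dicts, one per future time step, mapping a
    parameter name to its predicted value. Parameters without a configured
    threshold are ignored.
    """
    alerts = []
    for step, reading in enumerate(predictions):
        for name, value in reading.items():
            limit = thresholds.get(name)
            if limit is not None and value > limit:
                alerts.append((step, name, value, limit))
    return alerts


# Hypothetical thresholds and predictions for three future time steps.
thresholds = {"ammonia_nitrogen": 1.0, "cod": 20.0}
predictions = [
    {"ammonia_nitrogen": 0.6, "cod": 15.0, "temperature": 18.2},
    {"ammonia_nitrogen": 1.2, "cod": 18.0, "temperature": 18.5},
    {"ammonia_nitrogen": 1.4, "cod": 22.5, "temperature": 18.9},
]
alerts = check_alerts(predictions, thresholds)
```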
A decision support and application module, which provides decision support for the water treatment process based on the prediction results and the real-time water quality data, and adjusts process parameters and operating strategies to optimize the treatment effect. The module is integrated into a wider water quality management system and cooperates with other modules (such as the data acquisition layer and the data preprocessing layer) to realize closed-loop management of water quality monitoring, prediction, early warning, and treatment.
An API (application programming interface) or SDK (software development kit) is provided so that third-party systems or applications can integrate and use the prediction function of the water quality detection system.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (8)

1. A water quality detection method, characterized in that the method comprises:
collecting historical water quality data covering water source samples over different time periods; using water quality testing instruments to measure and record the concentrations of various pollutants in the water source samples, including ammonia nitrogen, total phosphorus, heavy metals, and total bacterial count; and simultaneously recording the flow rate, temperature, pH value, chemical oxygen demand (COD), and biological oxygen demand (BOD) of the water source to form a complete historical water quality data set;
preprocessing the collected historical water quality data, including data cleaning, missing value processing, and outlier detection;
dividing the preprocessed historical water quality data into a training set and a test set for model training and verification;
training a water quality change prediction model using a machine learning algorithm, wherein the water quality change prediction model is used to predict the trend of water quality parameters over a future period based on current water quality detection data;
performing parameter tuning and model training on the selected machine learning algorithm by iteratively optimizing the algorithm parameters and the model structure;
verifying the trained model using the test set and evaluating the prediction performance and generalization ability of the model; and, based on the verification results, selecting the best-performing model parameters as the final parameters of the water quality change prediction model;
deploying real-time sensors at water quality monitoring points to collect water quality data regularly, the collected data being consistent with the historical data in parameter type and measurement unit;
sending the water quality data collected in real time to a data center through a data transmission system; in the data center, preprocessing the real-time data, including data cleaning and format conversion, to meet the input requirements of the machine learning model; and inputting the preprocessed real-time data into the trained water quality change prediction model;
predicting, by the water quality change prediction model, the changes in water quality parameters over a future period based on the historical data and the patterns learned during training;
receiving the prediction results output by the machine learning model, analyzing the prediction results, and identifying potential high pollution loads or harmful events; and adjusting the parameters and operating strategies of the water treatment process based on the prediction results and the real-time water quality data.

2. The water quality detection method according to claim 1, characterized in that, in preprocessing the historical water quality data, the data cleaning comprises:
identifying and removing duplicate records in the data set;
checking data consistency, and correcting or deleting records that are incorrectly formatted or unreasonable;
encoding non-numeric data to convert it into a numeric form suitable for analysis.

3. The water quality detection method according to claim 1, characterized in that, in preprocessing the historical water quality data, the missing value processing comprises:
filling missing values of numerical variables using mean filling, median filling, or mode filling;
for categorical variables, filling missing values with the most frequently occurring category;
applying interpolation methods, including linear interpolation and polynomial interpolation, to estimate and fill missing values in time series data.

4. The water quality detection method according to claim 1, characterized in that, in preprocessing the historical water quality data, the outlier detection comprises:
using the Z-score method to calculate the Z-score value of each data point, a data point being regarded as an outlier if the absolute value of its Z-score is greater than a preset threshold, the formula being:
z = (x - μ) / σ
where x is the original data, μ is the mean of the data, σ is the standard deviation of the data, and z is the calculated Z-score value.

5. The water quality detection method according to claim 4, characterized in that a long short-term memory (LSTM) network model is selected to construct the water quality prediction model, specifically: designing a multi-layer LSTM neural network structure comprising an input layer, one or more LSTM hidden layers, and an output layer; configuring the input layer to receive and process training data of multiple time steps; setting an appropriate number of LSTM units in the LSTM hidden layer to learn and memorize the time dependencies in the data and capture the trend of water quality changes; and configuring the output layer to generate the prediction results of future water quality parameters.

6. The water quality detection method according to claim 5, characterized in that the step of initializing the parameters of the water quality prediction model comprises: S1, initializing the weight parameters and bias parameters of the LSTM neural network; S2, setting the learning rate, which controls the step size of network weight updates, to ensure the stability and convergence speed of training; S3, determining the batch size, that is, the number of samples used in each training iteration, to balance training speed and memory usage; S4, setting the training rounds, that is, the number of training passes over the entire data set; S5, selecting adaptive moment estimation (Adam) as the optimizer for adjusting the network weights during training; S6, defining the mean squared error (MSE) as the loss function for quantifying the gap between the model predictions and the actual changes.

7. The water quality detection method according to claim 6, characterized in that the specific method of training the water quality prediction model is:
a. iteratively training the LSTM neural network using the divided training data set, where in each iteration the training data is used as the input and the corresponding future change value is used as the target output;
b. calculating the output of the network through forward propagation, and using the defined loss function to calculate the error between the predicted value and the actual value;
c. applying the back-propagation algorithm and the optimizer to update the weight parameters and bias parameters of the network according to the calculated error, so as to minimize the prediction error;
d. during training, periodically evaluating the performance of the model on a validation data set, monitoring the loss function value and the validation error to determine whether overfitting occurs, and adjusting the model structure or terminating training early accordingly.

8. A water quality detection system, characterized by comprising:
a data acquisition layer, including a real-time sensor module, for regularly collecting water quality data at water quality monitoring points, covering the concentrations of various pollutants, flow rate, temperature, pH value, chemical oxygen demand (COD), and biological oxygen demand (BOD) in water source samples over different time periods;
a data preprocessing layer, including a data cleaning module, a missing value processing module, and an outlier detection module, for preprocessing the collected water quality data to form a complete historical water quality data set, and for preprocessing the real-time data, including data cleaning and format conversion, to meet the input requirements of the machine learning model;
a model training and verification layer, including a machine learning algorithm module, a parameter tuning module, and a model training module, for training a water quality change prediction model using a machine learning algorithm, performing parameter tuning and model training on the selected machine learning algorithm by iteratively optimizing the algorithm parameters and the model structure, verifying the trained model using the test set, evaluating the prediction performance and generalization ability of the model, and selecting the best-performing model parameters as the final parameters of the water quality change prediction model;
a prediction and application layer, including a real-time data input module, a water quality change prediction module, and a result analysis module, for inputting the preprocessed real-time data into the trained water quality change prediction model, predicting the changes in water quality parameters over a future period, receiving the prediction results output by the machine learning model, analyzing the prediction results, identifying potential high pollution loads or harmful events, and adjusting the parameters and operating strategies of the water treatment process according to the prediction results and the real-time water quality data.
CN202411050291.5A 2024-08-01 2024-08-01 Water quality detection method and system Pending CN119046750A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411050291.5A CN119046750A (en) 2024-08-01 2024-08-01 Water quality detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411050291.5A CN119046750A (en) 2024-08-01 2024-08-01 Water quality detection method and system

Publications (1)

Publication Number Publication Date
CN119046750A true CN119046750A (en) 2024-11-29

Family

ID=93586616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411050291.5A Pending CN119046750A (en) 2024-08-01 2024-08-01 Water quality detection method and system

Country Status (1)

Country Link
CN (1) CN119046750A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118913738A (en) * 2024-07-19 2024-11-08 上检(浙江)机动车检测技术有限公司 Comprehensive experimental device for refrigerating and heat exchanging of air conditioner
CN119416129A (en) * 2025-01-07 2025-02-11 湖南亿康环保科技有限公司 A sewage monitoring method and system in rainstorm scenarios based on deep learning
CN119849709A (en) * 2025-03-19 2025-04-18 湖南师范大学 Water quality prediction method, device, equipment and medium
CN119905158A (en) * 2025-03-31 2025-04-29 武汉龙净环保工程有限公司 A method for optimizing wet desulfurization parameters based on data processing
CN120069228A (en) * 2025-04-25 2025-05-30 中国科学院南海海洋研究所 Multi-element synchronous rapid prediction method, equipment and storage medium for water environment pollution

Similar Documents

Publication Publication Date Title
CN119046750A (en) Water quality detection method and system
CN118350678B (en) Water environment monitoring data processing method and system based on Internet of things and big data
CN118553338B (en) Multi-parameter prediction method for water quality of marine pasture
CN118566770A (en) Estimation method and system for state of health value of battery system
CN118861957B (en) Air quality detection method based on multi-sensor monitoring
CN118518842B (en) Water quality detection method and system based on multi-mode data fusion and intelligent optimization
CN113868957B (en) Remaining life prediction and uncertainty quantification calibration method based on Bayesian deep learning
CN106295121A (en) Landscape impoundments Bayes's water quality grade Forecasting Methodology
CN119558171B (en) Methods and models for inversion of key parameters in numerical model of biodegradation of groundwater pollutants based on neural network algorithm
CN111898673A (en) A Dissolved Oxygen Content Prediction Method Based on EMD and LSTM
CN115185937A (en) SA-GAN architecture-based time sequence anomaly detection method
van Oosterom et al. Optimal maintenance policies for a safety‐critical system and its deteriorating sensor
CN118606650A (en) A method, system, device and storage medium for measuring the importance of water quality influencing factors
CN117578441A (en) Method for improving power grid load prediction precision based on neural network
CN117093919A (en) Geotechnical engineering geological disaster prediction method and system based on deep learning
CN119560050A (en) A method for detecting and evaluating the toxicity of environmental pollutants
CN119027107B (en) Intelligent diagnosis and maintenance method, device and equipment for incinerator and storage medium
CN119358848A (en) Carbon emission evaluation system and method for railway transportation engineering based on the whole life cycle
CN116013426A (en) Site ozone concentration prediction method with high space-time resolution
CN118761514A (en) Method, equipment, device and medium for predicting trends using artificial intelligence technology
CN118228029A (en) An integrated management method and system for multidimensional data
CN118297414A (en) Mine water hazard forecasting electronic equipment and computer program product
CN117649209A (en) Enterprise revenue auditing method, system, equipment and storage medium
CN118759603B (en) A retractable weather station with protection function
CN119129857B (en) Water quality prediction method and software system for water supply network based on spatiotemporal graph neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination