
CN120338210B - Reservoir operation method based on deep learning adaptive dynamic network and reinforcement learning - Google Patents

Reservoir operation method based on deep learning adaptive dynamic network and reinforcement learning

Info

Publication number
CN120338210B
CN120338210B (application CN202510819807.6A)
Authority
CN
China
Prior art keywords
reservoir
reinforcement learning
flow
data
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510819807.6A
Other languages
Chinese (zh)
Other versions
CN120338210A (en)
Inventor
郑冬燕
汤国和
李善综
邱文丰
林木隆
赖永泉
郭淑慧
巫美强
李伟
蒋永强
陈毅锋
王晗
李敏
刘和昌
李嘉第
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhujiang Water Resources Comprehensive Technology Center Of Zhujiang Water Resources Commission Of Ministry Of Water Resources
Original Assignee
Zhujiang Water Resources Comprehensive Technology Center Of Zhujiang Water Resources Commission Of Ministry Of Water Resources
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhujiang Water Resources Comprehensive Technology Center Of Zhujiang Water Resources Commission Of Ministry Of Water Resources filed Critical Zhujiang Water Resources Comprehensive Technology Center Of Zhujiang Water Resources Commission Of Ministry Of Water Resources
Priority to CN202510819807.6A priority Critical patent/CN120338210B/en
Publication of CN120338210A publication Critical patent/CN120338210A/en
Application granted granted Critical
Publication of CN120338210B publication Critical patent/CN120338210B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06N 3/0455: Auto-encoder networks; Encoder-decoder networks
    • G06N 3/092: Reinforcement learning
    • G06Q 10/06312: Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling
    • G06Q 10/0637: Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G06Q 50/06: Energy or water supply
    • Y02A 10/40: Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping


Abstract

The invention discloses a reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning, in the technical field of reservoir dispatching. The method acquires real-time water level data, meteorological data, and radar image data of regional cloud cover and surface features of the reservoir, fuses the three data sources, and inputs the fused data into a preset adaptive dynamic Transformer network model to obtain a predicted output comprising a reservoir water level sequence, an inflow sequence, and a discharge flow sequence. A reinforcement learning state space is constructed from these predictions and input into a preset reinforcement learning network whose action space comprises flood discharge, power generation flow, and ecological flow. The reinforcement learning network is optimized to maximize the cumulative discounted reward, where the reward function is a weighted combination of a flood control reward, a power generation reward, and an ecological reward. The invention improves the collaborative management efficiency of multiple objectives such as reservoir flood control safety, power generation benefit, and ecological protection.

Description

Reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning
Technical Field
The invention relates to the technical field of reservoir dispatching, and in particular to a reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning.
Background
As global climate change intensifies, extreme weather events occur frequently, and traditional reservoir scheduling methods struggle to adapt to complex and changeable hydrological and meteorological environments. Conventional scheduling is often based on fixed rules or offline historical experience and makes little effective use of real-time weather and water level data; when facing sudden heavy rain or abrupt drought-flood transitions, it is prone to decision delays and water level control errors, which increase flood control risk, reduce power generation efficiency, and harm the stability of the downstream ecosystem. Currently applied deep-learning reservoir scheduling methods usually rely on a static, fixed network structure that lacks real-time adaptive capability and cannot adjust the network structure as hydrological prediction requirements change dynamically, resulting in unstable prediction accuracy and insufficient model generalization.
A reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning is developed to solve these problems.
Disclosure of Invention
The invention provides a reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning, which aims to solve the problems of unstable prediction accuracy and insufficient generalization capability in existing reservoir dispatching methods.
The invention realizes the above purpose through the following technical scheme:
The reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning comprises the following steps:
acquiring real-time water level data, meteorological data, and radar image data of regional cloud cover and surface features of the reservoir, and fusing the three data sources;
inputting the fused data into a preset adaptive dynamic Transformer network model to obtain a predicted output comprising a reservoir water level sequence, an inflow sequence, and a discharge flow sequence;
constructing a reinforcement learning state space comprising the water level sequence, inflow sequence, and discharge flow sequence predicted by the Transformer network model for a preset number of future time steps, together with the latest real-time meteorological data sequence;
inputting the reinforcement learning state space into a preset reinforcement learning network whose action space comprises flood discharge, power generation flow, and ecological flow; the optimization objective of the reinforcement learning network is to maximize the cumulative discounted reward, the reward function being a weighted combination of a flood control reward term, a power generation reward term, and an ecological reward term; through continuous iterative optimization, the reinforcement learning network converges and outputs an optimal combination of flood discharge, power generation flow, and ecological flow.
Further, the real-time water level data and meteorological data are preprocessed before data fusion, the preprocessing steps comprising:
performing spatial interpolation on the real-time water level data, meteorological data, and radar image data of regional cloud cover and surface features using a Kriging interpolation algorithm based on spatial distance weights, to obtain continuous data covering the whole reservoir region;
detecting outliers in the continuous data and eliminating them in real time;
and smoothing and denoising the data after outlier removal.
Further, when the error between each predicted output of the adaptive dynamic Transformer network model and the actual observation exceeds a preset upper threshold, a preset number of encoder layers are automatically added to the model; when the error falls below a preset lower threshold, a preset number of encoder layers are automatically removed; and when the error lies between the lower and upper thresholds, the model remains unchanged.
Further, the result error is a weighted sum of the mean square error between the predicted and true reservoir water level, the mean square error between the predicted and true inflow, and the mean square error between the predicted and true discharge flow.
Further, on each prediction the adaptive dynamic Transformer network model outputs, based on a dynamic attention mechanism, the weights w_flood, w_gen, and w_eco of the flood control, power generation, and ecological reward terms in the reward function.
Further, the target weights α_flood, α_gen, and α_eco are updated from the weights w_flood, w_gen, and w_eco as follows:
α_flood = exp(w_flood) / (exp(w_flood) + exp(w_gen) + exp(w_eco));
α_gen = exp(w_gen) / (exp(w_flood) + exp(w_gen) + exp(w_eco));
α_eco = exp(w_eco) / (exp(w_flood) + exp(w_gen) + exp(w_eco));
After real-time adjustment, the dynamic attention weights are used as the target weighting coefficients in the next reinforcement learning decision, ensuring that the system accurately adapts to the real-time demands of the different scheduling objectives according to the real-time state.
The dynamic attention mechanism uses a trainable target weight vector and computes the attention weight of each target in real time with a Softmax function.
Further, R_t is the per-step reward in the cumulative discounted reward, α_flood is the target weight coefficient of the flood control reward, α_gen is the target weight coefficient of the power generation reward, α_eco is the target weight coefficient of the ecological reward, Z_pred is the predicted water level, Z_limit is the flood control limit water level, k_gen is the power generation economic benefit coefficient per unit flow, Q_gen,t is the power generation flow, Q_eco,t is the current ecological flow, Q_eco,ideal is the ideal ecological flow, R_flood is the flood control reward term, R_gen is the power generation reward term, R_eco is the ecological reward term, and t is the time step.
Further, the opening of each flood discharge gate of the reservoir and the power output scheme of the generator sets are determined from the output optimal combination of flood discharge, power generation flow, and ecological flow, forming specific execution instructions, comprising:
computing the flood discharge gate opening in real time from the output flood discharge and the gate flow-opening relation;
determining, in real time from the output power generation flow, the number of units to start or stop and the load distribution of each unit;
constructing an execution instruction from the gate opening, the number of units started or stopped, and the load distribution of each unit;
and judging whether the total release from flood discharge and power generation is not lower than the ecological flow; if so, the ecological water demand is considered covered and no additional scheduling is needed; otherwise, water is supplemented through a dedicated ecological gate orifice or a low-load unit according to the deficit, the deficit being the difference between the ecological flow and the total release from flood discharge and power generation.
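The ecological water-balance check in the last step can be sketched as follows; this is a minimal illustration, and the function name and calling convention are assumptions rather than part of the patent:

```python
def eco_supplement(q_flood, q_gen, q_eco_required):
    """Extra release (m^3/s) needed through the dedicated ecological gate.

    If the total release from flood discharge and power generation already
    covers the required ecological flow, no additional scheduling is needed;
    otherwise the shortfall is supplemented.
    """
    total_release = q_flood + q_gen
    return max(0.0, q_eco_required - total_release)
```

For example, with 10 m³/s of flood discharge and 20 m³/s of generation flow against a 50 m³/s ecological requirement, 20 m³/s would be supplemented.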
Further, a historical database of long-term running root mean square errors is constructed; the historical trends of result errors and decision errors are statistically analyzed at regular intervals, and the preset error thresholds are dynamically adjusted according to these trends.
Further, acquiring the real-time water level data, meteorological data, and radar image data of regional cloud cover and surface features of the reservoir comprises:
acquiring real-time water level data monitored by wave-type water level sensors deployed in the reservoir area and at key sections of the main inflow river channel;
acquiring meteorological data from a meteorological data acquisition system deployed in the reservoir area, comprising a precipitation sensor, a wind speed sensor, and a temperature and humidity sensor;
and acquiring radar image data of regional cloud cover and surface features by satellite remote sensing observation of the reservoir and upstream basin, using a high-resolution radar remote sensing satellite with a spatial resolution of 30 m.
The invention has the following beneficial effects:
Compared with the prior art, the reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning comprehensively improves the collaborative management efficiency of multiple objectives such as reservoir flood control safety, power generation benefit, and ecological protection, improves the stability of prediction accuracy, and improves the model's generalization capability.
Drawings
FIG. 1 is a flowchart of a reservoir dispatching method based on deep learning adaptive dynamic network and reinforcement learning in an embodiment.
Fig. 2 is a schematic diagram of the adaptive dynamic Transformer network structure in step S2 in the embodiment.
FIG. 3 is a flowchart of a reinforcement learning multi-objective collaborative optimization module in step S3 in an embodiment.
Fig. 4 is a functional framework diagram of the intelligent co-scheduling platform in step S6 in the embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
Furthermore, the terms "first," "second," and the like, are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
The following describes specific embodiments of the present invention in detail with reference to the drawings.
As shown in fig. 1, the flow of the reservoir dispatching method based on a deep-learning adaptive dynamic network and reinforcement learning is implemented as follows:
and S1, real-time data fusion and preprocessing.
In a practical implementation of the invention, 8 ultrasonic water level sensors (model: MB7389) are deployed in the reservoir area and at key sections of the main inflow river channel, with a water level measurement accuracy of ±2 cm; real-time water level data are acquired every minute and uploaded to the data fusion center in real time over the LoRa wireless transmission protocol. Meanwhile, a meteorological data acquisition system composed of weather stations is installed in the reservoir area, specifically comprising a precipitation sensor (model: TB4, accuracy ±0.5 mm), a wind speed sensor (model: WindSonic, accuracy ±0.2 m/s), and a temperature and humidity sensor (model: HMP155, temperature accuracy ±0.5 °C, humidity accuracy ±3%); meteorological data are acquired every 5 minutes. In addition, a high-resolution radar remote sensing satellite with a spatial resolution of 30 m observes the reservoir and the upstream basin once per hour, producing radar image data of regional cloud cover and surface features in GeoTIFF format. The data fusion module first applies a Kriging interpolation algorithm based on spatial distance weights to the water level measuring points and meteorological sensor data, obtaining continuous data covering the whole reservoir area. Outlier detection after data fusion is implemented with the Z-score method; regional cloud cover is combined with meteorological data to anticipate conditions such as sudden heavy rain, and surface features are used to distinguish different runoff responses under the same rainfall for more accurate scheduling.
The specific formula is:
Z = (x_t − μ) / σ,
where x_t is the real-time measurement, μ is the mean of the past 24 hours of historical data, σ is the standard deviation, and Z is the standard score. When |Z| exceeds the preset threshold, the system automatically marks the datum as an outlier and eliminates it in real time. Finally, after smoothing and denoising with a Savitzky-Golay filter, the data are standardized into sequence form and periodically fed to the input of the subsequent network model every 10 minutes.
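The S1 pipeline (Z-score rejection against the recent history, then Savitzky-Golay smoothing) can be sketched as below. The |Z| threshold of 3 and the filter window are assumptions, since the text does not fix them, and flagged outliers are replaced by the mean here so the sequence stays evenly spaced:

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess(series, z_thresh=3.0, window=11, polyorder=2):
    """Z-score outlier rejection followed by Savitzky-Golay smoothing.

    mu and sigma stand in for the mean/std of the past 24 h of history.
    """
    x = np.asarray(series, dtype=float)
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma if sigma > 0 else np.zeros_like(x)
    x = np.where(np.abs(z) > z_thresh, mu, x)   # eliminate flagged outliers
    return savgol_filter(x, window_length=window, polyorder=polyorder)
```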
Step S2: accurate prediction with an adaptive dynamic network based on the Transformer structure.
A Transformer network structure is adopted, taking the standardized data sequence processed in step S1 as the network input, to achieve accurate short-term (next 30 minutes) prediction of reservoir water level, inflow, and discharge flow. The specific embodiment is as follows:
Step 2.1: as shown in fig. 2, the initial Transformer network is built from 4 standard Transformer encoder layers, each comprising a multi-head self-attention sub-layer, a feed-forward sub-layer, layer normalization, and residual connections. Each self-attention sub-layer contains 8 attention heads of dimension 64, so the output dimension of each self-attention sub-layer is 512. A single attention head is computed as:
Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V,
where Q, K, and V are the query, key, and value matrices, each of head dimension d_k = 64. The multi-head attention output is:
MultiHead(x) = Concat(head_1, ..., head_8) W^O,
where W^O is a linear transformation parameter matrix.
The feed-forward network consists of two linear transformation layers with structure 512-2048-512:
FFN(x) = max(0, xW_1 + b_1) W_2 + b_2.
The network's input is the standardized data of the past 120 minutes (12 time steps, each containing water level, precipitation, wind speed, temperature and humidity, and other features, for a total input dimension of 128), and its prediction output is the water level, inflow, and discharge flow for the next 3 time steps (30 minutes; output dimension 3), together with the weights of the flood control, power generation, and ecological reward terms in the reward function (output dimension 3). Training uses the Adam optimizer with an initial learning rate of 0.001 and a mean square error (MSE) loss:
MSE = (1/N) Σ_{i=1}^{N} (y_i − ŷ_i)²,
where y_i is the actual observation, ŷ_i is the network's prediction, and N is the number of samples in a batch. During training, the branch that outputs the reward-term weights does not participate, i.e. its gradients are cut off.
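The encoder arithmetic above can be checked with a small NumPy sketch using random weights only, to verify the 8-head, 64-dimension geometry that recombines into a 512-dimensional output (this is not the trained model):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d_k)) @ v

rng = np.random.default_rng(0)
T, d_model, n_heads = 12, 512, 8        # 12 input time steps, 8 heads
d_head = d_model // n_heads             # 64 dimensions per head

x = rng.standard_normal((T, d_model))
heads = []
for _ in range(n_heads):
    wq, wk, wv = (0.02 * rng.standard_normal((d_model, d_head)) for _ in range(3))
    heads.append(attention(x @ wq, x @ wk, x @ wv))
out = np.concatenate(heads, axis=-1)    # concatenated heads: shape (12, 512)
```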
Step 2.2: adaptive dynamic adjustment of the network structure. The network dynamic adjustment module automatically adjusts the number of Transformer layers according to the result error E_t computed in real time:
E_t = w_1 · MSE(Z_t) + w_2 · MSE(Q_in,t) + w_3 · MSE(Q_out,t),
where each term is the mean square error between the network's predicted and true values, the weights w_1, w_2, w_3 are determined by offline cross-validation, Z denotes water level, Q_in denotes inflow, Q_out denotes discharge flow, and t denotes the time step.
Taking the latest W prediction rounds as a sliding window, first compute the moving average μ_E and the standard deviation σ_E of the composite error sequence, where the standard deviation takes the weighted covariance form:
σ_E = sqrt(wᵀ Σ w),
where w is the weight vector determined by offline cross-validation and Σ is the covariance matrix of the three RMSE series within the window. The thresholds are then constructed dynamically:
θ_up = μ_E + λ_up · σ_E;
θ_low = μ_E − λ_down · σ_E;
where the relaxation coefficients λ_up and λ_down are determined offline on the historical dataset by Bayesian optimization. As the overall result error gradually decreases, μ_E and σ_E shrink and θ_up is adjusted down accordingly; conversely, when the result error rises sharply, θ_up is dynamically raised to avoid false triggers. Whenever the real-time error E_t exceeds the current upper threshold θ_up, the layer-adding operation is triggered: the number of network layers is increased by 1, with a maximum increase of 3 over the initial 4 layers. When the result error falls below the lower threshold θ_low, the number of layers is decreased by 1. The adjustment rule is:
N ← min(N + 1, 7) if E_t > θ_up; N ← N − 1 if E_t < θ_low; otherwise N unchanged.
the dynamic adjustment mechanism of the network layer number can effectively balance the calculation load and the prediction precision, and the real-time property and the stability of the prediction are maintained.
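Step 2.2's threshold logic can be sketched as follows. Here `lam_up`/`lam_down` stand in for the Bayesian-optimised relaxation coefficients, and the floor of 4 layers is an assumption (the text only caps the increase at 3 over the initial 4 layers):

```python
import numpy as np

def adjust_layers(n_layers, error_window, lam_up=1.0, lam_down=1.0,
                  n_min=4, n_max=7):
    """Sliding-window layer adjustment for the adaptive Transformer.

    Thresholds are rebuilt each round from the window's mean and std;
    the newest error decides whether to add or remove an encoder layer.
    """
    mu = float(np.mean(error_window))
    sigma = float(np.std(error_window))
    theta_up = mu + lam_up * sigma
    theta_low = mu - lam_down * sigma
    e_t = error_window[-1]                 # latest composite error E_t
    if e_t > theta_up:
        return min(n_layers + 1, n_max)    # add an encoder layer
    if e_t < theta_low:
        return max(n_layers - 1, n_min)    # remove an encoder layer
    return n_layers
```

An error spike relative to the window raises the layer count, a sustained drop lowers it, and anything between the two thresholds leaves the network unchanged.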
Step 2.3: implementation of the dynamic attention mechanism, which is designed to adaptively adjust the predicted weights of the three objectives of flood control, power generation, and ecology. The target weights α_flood, α_gen, and α_eco are updated from the reward-term weights w_flood, w_gen, and w_eco as follows:
α_flood = exp(w_flood) / (exp(w_flood) + exp(w_gen) + exp(w_eco));
α_gen = exp(w_gen) / (exp(w_flood) + exp(w_gen) + exp(w_eco));
α_eco = exp(w_eco) / (exp(w_flood) + exp(w_gen) + exp(w_eco));
After real-time adjustment, the dynamic attention weights are used as the target weighting coefficients in the next reinforcement learning decision, ensuring that the system accurately adapts to the real-time demands of the different scheduling objectives according to the real-time state.
The dynamic attention mechanism uses a trainable target weight vector and computes the attention weight of each target in real time with a Softmax function.
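A direct implementation of this Softmax weighting is a few lines (a sketch; the raw weight values in the usage note are illustrative):

```python
import numpy as np

def target_weights(w):
    """Map raw reward-term weights (w_flood, w_gen, w_eco) to target
    weighting coefficients alpha via a numerically stable Softmax."""
    w = np.asarray(w, dtype=float)
    e = np.exp(w - w.max())    # subtract the max for numerical stability
    return e / e.sum()
```

Equal raw weights yield alpha = (1/3, 1/3, 1/3); raising w_flood shifts weight toward the flood control objective while the three coefficients always sum to 1.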
Step S3: implementation of the reinforcement learning multi-objective collaborative optimization module.
Step 3.1: define the reinforcement learning state space for multi-objective optimization, comprising meteorological data and the predicted water level, inflow, and discharge flow sequences output by the adaptive network. As shown in fig. 3, the reinforcement learning state space is constructed. Based on the prediction result output by the dynamic Transformer network in step S2, the state vector of the reinforcement learning module is defined as S_t. Specifically, the state vector contains the predicted water level sequence for the next 3 time steps Z_t = [Z_{t+1}, Z_{t+2}, Z_{t+3}], the inflow prediction sequence Q_in,t = [Q_in,t+1, Q_in,t+2, Q_in,t+3], the discharge flow prediction sequence Q_out,t = [Q_out,t+1, Q_out,t+2, Q_out,t+3], and the latest real-time meteorological data sequence M_t, where the meteorological data comprise precipitation, wind speed, and temperature and humidity. After concatenation, the dimension of the state space is kept between 100 and 500, and the state is input to the reinforcement learning network after standardization.
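Assembling S_t can be sketched as below, with three 3-step prediction sequences plus a meteorological vector; the per-vector z-score normalisation is an assumption, as the text only says the state is standardised:

```python
import numpy as np

def build_state(z_pred, q_in_pred, q_out_pred, meteo):
    """Concatenate predicted water level, inflow, and discharge sequences
    with the latest meteorological readings into the RL state vector S_t,
    then standardise it."""
    s = np.concatenate([z_pred, q_in_pred, q_out_pred, meteo]).astype(float)
    return (s - s.mean()) / (s.std() + 1e-8)
```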
And 3.2, defining a reinforcement learning action space. The motion vector a t contains three variables of flood discharge, power generation flow and ecological flow, and is specifically set as follows:
Flood discharge flow regulation range is 50 to 5000 m3/s, and step length is 50m 3/s;
The power generation flow adjustment range is 100 to 2000 m3/s, and the step length is 20m 3/s;
the ecological flow regulating range is 10 to 500 m3/s, and the step length is 10 m3/s;
The discrete combinations of the action space are generated by grid search, and the state vector is mapped to the action space through a three-layer fully connected network, so that the reinforcement learning model can perform efficient action selection.
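The grid-search enumeration of the discrete action space can be sketched directly from the ranges and step sizes above; the full Cartesian product is large (100 × 96 × 50 = 480,000 combinations), which in practice is scored by the fully connected network rather than searched exhaustively.

```python
import itertools
import numpy as np

# Discrete action grids taken from the stated ranges and step sizes.
flood = np.arange(50, 5000 + 1, 50)   # flood discharge, m^3/s
power = np.arange(100, 2000 + 1, 20)  # power generation flow, m^3/s
eco   = np.arange(10, 500 + 1, 10)    # ecological flow, m^3/s

# Grid-search style enumeration of all (Q_flood, Q_gen, Q_eco) combinations.
actions = list(itertools.product(flood, power, eco))
print(len(flood), len(power), len(eco), len(actions))  # -> 100 96 50 480000
```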
And 3.3, the concrete implementation of the reinforcement learning network. The reinforcement learning network adopts a deep Q network, with the following structure:
An input layer for inputting a state vector S t;
the hidden layer consists of a 4-layer convolutional network and a 2-layer fully connected network, in which the convolution kernel sizes are [8×8, 4×4, 3×3] in sequence, the convolution strides are [4, 2, 1, 1], and the output feature dimensions are 128, 64 and 64 respectively;
and an output layer, namely the Q value of each action combination.
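The Q-network interface (state in, one Q-value per discrete action, greedy selection) can be sketched with a small dense network; this numpy stand-in replaces the convolutional stack described above, and its layer shapes are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class TinyQNet:
    """Dense stand-in for the deep Q network: maps S_t to Q(S_t, a)
    for each discrete action combination. Shapes are illustrative."""
    def __init__(self, state_dim, n_actions, hidden=64):
        self.W1 = rng.normal(0, 0.1, (state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.1, (hidden, n_actions))
        self.b2 = np.zeros(n_actions)

    def q_values(self, s):
        h = relu(s @ self.W1 + self.b1)   # hidden representation
        return h @ self.W2 + self.b2      # one Q-value per action

net = TinyQNet(state_dim=13, n_actions=8)
q = net.q_values(rng.normal(size=13))
best = int(np.argmax(q))                  # greedy action index
print(q.shape, best)
```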
The optimization of the reinforcement learning network aims at maximizing the cumulative discounted reward Σ_t γ^t·R_t. The reward function is defined as the weighted result of the three targets (flood control, power generation, ecology), with the specific formula:
R_t = α_flood·R_flood,t + α_gen·R_gen,t + α_eco·R_eco,t;
wherein the reward weights α_flood, α_gen and α_eco come from the calculation result of the dynamic attention mechanism in step 2.3 and are dynamically updated in real time. Specifically, each target reward is defined as follows:
The flood control reward term R_flood gives negative feedback when the predicted water level exceeds the flood control limit water level, and is defined as R_flood,t = −max(0, Ĥ_t − H_limit), where Ĥ_t is the predicted water level and H_limit is the flood control limit water level.
The power generation reward term R_gen is calculated from the economic benefit of the current power generation flow, and is defined as R_gen,t = k_gen·Q_gen,t, where k_gen is the economic benefit coefficient of power generation per unit flow and Q_gen,t is the power generation flow;
The ecological reward term R_eco is calculated from the degree to which the downstream ecological flow deviates from the ideal ecological flow: R_eco,t = −|Q_eco,t − Q_eco,ideal|, where Q_eco,t is the current ecological flow and Q_eco,ideal is the ideal ecological flow.
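Putting the three terms together, a sketch of the weighted reward computation; the max-type flood penalty and absolute-deviation ecology penalty are reconstructions of the verbal definitions above, not formulas quoted from the patent, and all numeric values are placeholders.

```python
def reward(h_pred, h_limit, q_gen, q_eco, q_eco_ideal, alpha, k_gen=0.3):
    """Weighted multi-objective reward R_t.

    Flood term: penalize predicted water level above the flood limit.
    Generation term: generation flow times economic coefficient k_gen.
    Ecology term: penalize deviation from the ideal ecological flow.
    The penalty forms and k_gen value are assumptions consistent with
    the verbal definitions in the text.
    """
    r_flood = -max(0.0, h_pred - h_limit)
    r_gen = k_gen * q_gen
    r_eco = -abs(q_eco - q_eco_ideal)
    a_flood, a_gen, a_eco = alpha
    return a_flood * r_flood + a_gen * r_gen + a_eco * r_eco

r = reward(h_pred=103.2, h_limit=103.0, q_gen=800.0,
           q_eco=90.0, q_eco_ideal=100.0, alpha=(0.6, 0.3, 0.1))
print(round(r, 3))  # -> 70.88
```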
In the training process, the parameters of the dynamic Transformer network are frozen, only the layer outputting the weights is activated, and it is trained jointly with the reinforcement learning network.
And finally converging the reinforcement learning network to the optimal strategy of multi-objective collaborative optimization through continuous iterative optimization, and outputting an optimal combination scheme of each flow.
The learning rate of the reinforcement learning network is between 0.001 and 0.005; the dynamic ranges of the target weight coefficients in the reward function are 0.4-0.8 for flood control, 0.1-0.5 for power generation and 0.1-0.3 for ecology; the target weights are updated every 30 minutes.
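Keeping the dynamically updated weights inside the stated ranges can be done with a small clamping utility; the final renormalization to sum 1 is an added assumption, since the text only specifies the per-target ranges.

```python
def clamp_weights(alpha):
    """Clamp (flood, generation, ecology) weights to their allowed dynamic
    ranges, then renormalize so they sum to 1. The renormalization step is
    an assumption; the text only specifies the ranges."""
    bounds = [(0.4, 0.8), (0.1, 0.5), (0.1, 0.3)]
    clipped = [min(max(a, lo), hi) for a, (lo, hi) in zip(alpha, bounds)]
    total = sum(clipped)
    return [c / total for c in clipped]

alpha = clamp_weights([0.9, 0.05, 0.05])  # out-of-range raw weights
print(alpha)
```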
And 4, implementing a real-time reservoir dispatching decision module.
In this embodiment, the real-time reservoir dispatching decision module determines in real time the opening of each flood discharge gate of the reservoir and the output scheme of the generator set from the optimal strategy output by the reinforcement learning multi-objective collaborative optimization module, and forms concrete execution instructions. Specifically, the flood gate opening control command G_t is calculated in real time from the flood discharge flow Q_flood,t output by the reinforcement learning module through the gate flow-opening relation G_t = f⁻¹(Q_flood,t), where f(·) is the empirical relationship function between flood discharge and gate opening, measured according to the hydraulic characteristics of the gate.
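Given a measured gate rating table Q = f(G), the inverse relation G_t = f⁻¹(Q_flood,t) can be evaluated by monotone interpolation; the rating points below are hypothetical placeholders, not measured data.

```python
import numpy as np

# Assumed (hypothetical) gate rating curve: opening (m) -> discharge (m^3/s).
openings  = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
discharge = np.array([0.0, 400.0, 900.0, 1500.0, 2200.0, 3000.0])

def gate_opening(q_flood):
    """Invert the flow-opening relation G_t = f^{-1}(Q_flood) by linear
    interpolation on the monotone rating table, clipping out-of-range flows."""
    q = np.clip(q_flood, discharge[0], discharge[-1])
    return float(np.interp(q, discharge, openings))

print(gate_opening(1200.0))  # -> 2.5 (between the 2 m and 3 m points)
```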
The output adjustment instruction of the generator set is based on the power generation flow Q_gen,t output by reinforcement learning, with the number of started units and the load distribution of each unit determined in real time according to P_i = ρ·g·Q_i·H_t·η_i, subject to Σ_{i=1..N} Q_i = Q_gen,t, where P_i is the power generated by the i-th unit, ρ is the water density, g is the gravitational acceleration, H_t is the real-time water head, η_i is the unit efficiency, and N is the number of started units.
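A sketch of the unit power relation and a simple equal-load dispatch; the per-unit flow capacity, the efficiency value and the equal-split rule are assumptions (real load distribution would be an optimization in its own right).

```python
import math

RHO = 1000.0  # water density, kg/m^3
G = 9.81      # gravitational acceleration, m/s^2

def unit_power(q_i, head, eta=0.9):
    """Hydropower of one unit: P_i = rho * g * Q_i * H * eta (watts)."""
    return RHO * G * q_i * head * eta

def dispatch(q_gen, head, unit_capacity=250.0):
    """Split the total generation flow equally over the minimum number of
    units whose per-unit flow capacity is not exceeded (equal-load
    assumption; capacity and efficiency values are illustrative)."""
    n = max(1, math.ceil(q_gen / unit_capacity))
    q_each = q_gen / n
    return n, q_each, unit_power(q_each, head)

n, q_each, p = dispatch(q_gen=800.0, head=50.0)
print(n, q_each, round(p / 1e6, 2))  # units, flow per unit, MW per unit
```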
On this basis, the system checks whether the total release Q_flood,t + Q_gen,t is not lower than Q_eco,t; if so, the ecological water demand is considered covered and no additional scheduling is needed; if not, water is supplemented according to the deficit Q_supplement,t = Q_eco,t − (Q_flood,t + Q_gen,t) through a dedicated ecological gate or a low-load unit, so that the downstream ecological flow reaches the standard.
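The ecological water check reduces to a small deficit computation:

```python
def eco_supplement(q_flood, q_gen, q_eco_required):
    """Return the extra release needed so the total discharge covers the
    ecological flow requirement; 0 when flood + generation already do."""
    total = q_flood + q_gen
    return max(0.0, q_eco_required - total)

print(eco_supplement(50.0, 30.0, 100.0))   # -> 20.0 (deficit to supplement)
print(eco_supplement(500.0, 300.0, 100.0)) # -> 0.0 (already covered)
```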
Under extreme rainfall conditions (for example, when the rain intensity in the reservoir area exceeds 30 mm/h), the system automatically raises the flood control target weight α_flood to 0.85 (more generally, to between 0.7 and 0.9), and the reinforcement learning network outputs a high-intensity flood discharge strategy in real time, responding quickly to the storm event, so that the reservoir water level is always strictly controlled below the flood control limit water level and flood risk is avoided.
And 5, implementing a rolling optimization and feedback updating mechanism.
Step 5.1, real-time evaluation of the result error E_t and dynamic network update. The system automatically calculates the result error E_t once per hour, calculates the upper and lower thresholds according to step 2.2, and triggers the update process of the Transformer adaptive dynamic network structure and parameters according to the adjustment rule, so as to reduce subsequent result errors. The network update is realized through a gradient descent algorithm with a learning rate of 0.001, and real-time prediction resumes immediately after the parameter adjustment is completed.
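The threshold-triggered structure update can be expressed as a pure rule, with the +1/−1 layer steps and the cap of 3 added layers taken from the adjustment rule stated in the claims:

```python
def adjust_layers(n_layers, err, theta_low, theta_up, n_base, max_extra=3):
    """Rule-based layer-count update for the adaptive Transformer network:
    add one encoder layer when the error exceeds the upper threshold
    (capped at n_base + max_extra), remove one when below the lower
    threshold, otherwise keep the structure unchanged."""
    if err > theta_up and n_layers < n_base + max_extra:
        return n_layers + 1
    if err < theta_low and n_layers > 1:
        return n_layers - 1
    return n_layers

n = adjust_layers(4, err=0.09, theta_low=0.03, theta_up=0.07, n_base=4)
print(n)  # -> 5 (error above the upper threshold, one layer added)
```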
And 5.2, the reinforcement learning strategy is finely tuned regularly. And automatically counting multi-objective optimization results of reinforcement learning decisions by the system at the end of each day, wherein the multi-objective optimization results comprise flood control risk reduction conditions, power generation income conditions and ecological flow guarantee conditions, and comparing and analyzing with the previous cycle targets. And (3) performing parameter fine adjustment of the reinforcement learning network according to summarized data every week, wherein an experience playback strategy is adopted in the fine adjustment process, the state-action-rewarding data of the last week are stored, and parameters of the reinforcement learning network are updated by a method of randomly extracting samples in batches, so that continuous optimization and generalization performance of the network are ensured.
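The experience replay described above (store the last period's state-action-reward transitions, sample random minibatches for fine-tuning) can be sketched as:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size experience replay: stores (state, action, reward,
    next_state) tuples and samples random minibatches; the oldest
    transitions are evicted automatically once capacity is reached."""
    def __init__(self, capacity=10_000):
        self.buf = deque(maxlen=capacity)

    def push(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buf, min(batch_size, len(self.buf)))

buf = ReplayBuffer(capacity=5)
for t in range(8):                  # older transitions are evicted
    buf.push((t, 0, 0.0, t + 1))
batch = buf.sample(3)
print(len(buf.buf), len(batch))  # -> 5 3
```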
And 5.3, establishing a long-term error database and periodic optimization. The system establishes a historical database of long-term operation errors and, every month, performs a statistical analysis of the historical trends of the result error E_t and the decision error. According to the historical error trend, the prediction error threshold (initially 0.05 m, adjusted within a range of ±0.01 m according to the trend) and the reinforcement learning target weights α are adjusted dynamically; the dynamic adjustment range of the target weights is controlled within ±10% to ensure stability and adaptability in long-term operation.
And 6, deploying and implementing the intelligent collaborative scheduling platform.
As shown in fig. 4, the implementation of the intelligent collaborative scheduling platform is realized through an integrated intelligent management platform deployed on a high-performance server, and the platform has functional modules of real-time data monitoring, automatic execution of prediction decision, real-time alarm, historical data analysis and the like. The platform adopts a distributed architecture, is deployed on an Intel Xeon high-performance server (CPU 32 core, memory 128 GB) and has disaster recovery backup and remote Web access interfaces. The real-time data monitoring module receives and displays sensing data such as water level, weather and the like in real time, the self-adaptive network prediction module updates a prediction result every 10 minutes, the reinforcement learning decision module outputs an optimal scheduling instruction in real time and automatically transmits the optimal scheduling instruction to the on-site execution equipment, and the automatic scheduling execution module completes real-time execution of gate opening and power generation load. The response time of the system alarm mechanism is controlled within 30 seconds, so that the system alarm mechanism can automatically detect events such as water level overrun, extreme weather abnormality and the like, trigger real-time alarm and send the alarm to management personnel through short messages and mails. The platform visual interface comprises a real-time reservoir state (such as a water level real-time curve and a gate opening indication diagram), a meteorological trend diagram and a multi-objective optimization decision chart, supports quick query and analysis of historical data, and provides comprehensive technical support for reservoir comprehensive management in a chart and report mode.
And 7, monitoring the long-term running state of the system and performing performance evaluation.
In order to ensure the long-term stability and adaptability of the system, a comprehensive long-term running-state monitoring and performance evaluation mechanism is established, and the overall performance of the system is evaluated periodically. A comprehensive system operation evaluation report is generated once per quarter, the evaluation indexes comprising the flood control risk reduction rate, the power generation benefit improvement rate and the ecological flow guarantee rate.
The evaluation error of each index is controlled within ±5%. According to the quarterly evaluation results, if any index shows a significant downward trend (a decrease of more than 10%), the update and structure-adjustment processes of the adaptive dynamic network and the reinforcement learning network parameters are automatically triggered to restore system performance. In addition, a system operation log and abnormal-event recording mechanism is established; changes of the scheduling strategy, prediction anomalies and parameter adjustment records are stored and managed over the long term, forming a complete operation database for annual technology upgrades and system maintenance decisions.
The invention comprehensively improves the collaborative management efficiency of a plurality of targets such as reservoir flood control safety, power generation benefit, ecological protection and the like through a real-time data fusion preprocessing technology, a self-adaptive dynamic network structure design and a multi-target reinforcement learning decision mechanism. Especially in extreme weather conditions, the invention can quickly respond and dynamically adjust the scheduling target weight, and effectively reduce decision delay and risk in the traditional reservoir scheduling mode. Through a dynamic network structure and a rolling updating and feedback optimization mechanism for reinforcement learning decision, the invention can keep high self-adaptability and stability for a long time, and meets the strict requirements of real-time and accuracy of actual reservoir management. The method can be widely applied to intelligent and refined management of large and medium-sized reservoirs, and the reservoir scheduling decision level and the safe operation guarantee capability are obviously improved.
The invention provides a reservoir dispatching method based on a deep learning adaptive dynamic network and reinforcement learning, which effectively improves the accuracy and completeness of reservoir real-time state data by fusing and precisely preprocessing multi-source data from real-time water level sensors, a meteorological data acquisition system and high-resolution satellite remote sensing equipment. By designing an adaptive dynamic network and a dynamic attention mechanism based on the Transformer structure, the invention realizes short-term accurate prediction of reservoir water level, warehouse-in flow and discharge flow, and can automatically and dynamically adjust the network structure according to the result error, ensuring a good balance between prediction accuracy and computational efficiency. Meanwhile, the invention combines a deep reinforcement learning method to construct a multi-objective collaborative optimization decision module that outputs the optimal combined strategy of flood discharge, power generation and ecological flow in real time based on the dynamic prediction results, realizing real-time dynamic intelligent collaborative regulation of the reservoir's multiple objectives. In the real-time scheduling execution link, the method can quickly convert the optimization strategy into accurate execution instructions for gate opening and generator set output, respond rapidly under extreme weather conditions, and ensure flood control safety.
The invention provides a complete and innovative rolling optimization feedback updating mechanism, which automatically triggers network parameter fine adjustment and strategy updating through real-time evaluation of result errors and reinforcement of learning decision effects, and effectively improves the stability and adaptability of long-term operation. The deployment of the intelligent collaborative scheduling platform realizes the real-time monitoring of data, the automatic execution of predictive decision-making and the rapid alarm of abnormal events, and effectively improves the automation, refinement and intellectualization level of reservoir management. In addition, the invention further establishes a monitoring and evaluating mechanism for the long-term running state and performance of the system, can evaluate the realization conditions of flood control safety, power generation benefit and ecological protection targets regularly, and adaptively optimizes the network structure and parameters according to the evaluation result, so that the long-term running performance of the system is kept in an optimal state. Experimental verification in an actual reservoir environment shows that the implementation of the method can obviously reduce the flood control risk of the reservoir, improve the economic benefit of power generation and effectively ensure ecological flow, has the outstanding advantages of rapid real-time response, strong multi-objective collaborative optimization capability and high long-term operation stability, and can effectively meet the management requirements of a modern reservoir in complex weather and hydrologic environments.
The reservoir dispatching method based on the deep learning adaptive dynamic network and reinforcement learning thus solves the problems described above.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that it will be apparent to those skilled in the art that several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the scope of the invention.

Claims (7)

1. A reservoir operation method based on a deep learning adaptive dynamic network and reinforcement learning, characterized by comprising:
obtaining real-time water level data of a reservoir, meteorological data, and radar image data of regional cloud cover and surface features, and performing data fusion on the three;
inputting the fused data into a preset adaptive dynamic Transformer network model to obtain a prediction output, the prediction output comprising a reservoir water level sequence, a warehouse-in (inflow) flow sequence and a downward-leakage (outflow) flow sequence;
constructing a reinforcement learning state space, the state space comprising the predicted reservoir water level sequence, inflow sequence and outflow sequence at the future preset time steps output by the Transformer network model, together with the latest real-time meteorological data sequence;
inputting the reinforcement learning state space into a preset reinforcement learning network, wherein the action space of the reinforcement learning network comprises the flood discharge, the power generation flow and the ecological flow, the optimization objective of the reinforcement learning network is to maximize the cumulative discounted reward, and the reward function is a weighted function of a flood control reward term, a power generation reward term and an ecological reward term; through continuous iterative optimization, the reinforcement learning network finally converges and outputs the optimal combination scheme of flood discharge, power generation flow and ecological flow;
when the error between a predicted output of the adaptive dynamic Transformer network model and the actual observation exceeds a preset upper threshold, a preset number of encoder layers is automatically added to the model; when the result error is lower than a preset lower threshold, a preset number of encoder layers is automatically removed; when the result error lies between the lower threshold and the upper threshold, the adaptive dynamic Transformer network model remains unchanged;
the result error is the weighted sum of the mean square error between the predicted and true reservoir water level, the mean square error between the predicted and true inflow, and the mean square error between the predicted and true outflow;
the adaptive dynamic adjustment of the adaptive dynamic Transformer network model automatically adjusts the number of Transformer layers according to the result error E_t calculated in real time, where the result error is calculated as
E_t = w_H·RMSE_H + w_Qin·RMSE_Qin + w_Qout·RMSE_Qout,
where RMSE denotes the mean square deviation between the predicted values output by the network and the true values, the weights w are determined through offline cross-validation, H denotes the water level, Q_in denotes the warehouse-in flow, Q_out denotes the discharge flow, and t denotes time;
taking the latest W prediction cycles as a sliding window, the sliding mean μ_E and the standard deviation σ_E of the comprehensive error sequence E_t are first calculated, the standard deviation taking the weighted covariance form
σ_E = sqrt(wᵀ·Σ·w),
where w is the index weight vector determined by offline cross-validation and Σ is the covariance matrix of the three RMSE types within the window; the thresholds are then constructed dynamically:
θ_up = μ_E + k_up·σ_E; θ_low = μ_E − k_low·σ_E,
where the relaxation coefficients k_up and k_low are determined offline on historical data sets through Bayesian optimization; when the overall result error gradually decreases, μ_E shrinks and the thresholds are lowered accordingly; conversely, when the result error increases sharply, the upper threshold is raised dynamically to avoid misjudgment, while a real-time result error E_t exceeding the new upper threshold still triggers the layer-adding operation; each time the result error between a prediction and the actual observation exceeds the upper threshold, the number of network layers increases by 1, with a total increase of at most 3; when the result error is lower than the lower threshold, the number of network layers decreases by 1;
determining the opening of each flood discharge gate of the reservoir and the output scheme of the generator set according to the output optimal combination scheme of flood discharge, power generation flow and ecological flow, and forming concrete execution instructions, specifically comprising:
calculating the flood gate opening in real time according to the output flood discharge and the gate flow-opening relation formula;
determining in real time the number of started and stopped units and the load distribution of each unit according to the output power generation flow;
constructing execution instructions from the flood gate opening, the number of started and stopped units and the load distribution of each unit;
judging whether the total "flood discharge + power generation" release is not lower than the ecological flow; if so, the ecological water demand is considered covered and no additional scheduling is needed; if not, water is supplemented through a dedicated ecological gate or a low-load unit according to the deficit, the deficit being the difference between the ecological flow and the total "flood discharge + power generation" release.
2. The reservoir operation method based on a deep learning adaptive dynamic network and reinforcement learning according to claim 1, characterized in that the real-time water level data and the meteorological data are preprocessed before data fusion, the preprocessing steps comprising:
performing spatial interpolation on the real-time water level data, the meteorological data and the radar image data of regional cloud cover and surface features using a Kriging interpolation algorithm based on spatial distance weights, to obtain continuous data covering the entire reservoir area;
performing outlier detection on the continuous data and removing outliers in real time;
smoothing and denoising the data after outlier removal.
3. The reservoir operation method based on a deep learning adaptive dynamic network and reinforcement learning according to claim 1, characterized in that the adaptive dynamic Transformer network model outputs, each time based on a dynamic attention mechanism, the weights of the flood control reward term, the power generation reward term and the ecological reward term in the reward function.
4. The reservoir operation method based on a deep learning adaptive dynamic network and reinforcement learning according to claim 3, characterized in that the target weights are updated according to the weights of the flood control reward term, the power generation reward term and the ecological reward term; the dynamic attention mechanism adopts a target weight vector of trainable parameters and calculates the attention weight of each target in real time with a Softmax function.
5. The reservoir operation method based on a deep learning adaptive dynamic network and reinforcement learning according to claim 4, characterized in that the reward function is
R_t = α_flood·R_flood,t + α_gen·R_gen,t + α_eco·R_eco,t,
where R_t is the cumulative discounted reward, α_flood, α_gen and α_eco are the target weight coefficients of the flood control, power generation and ecological reward terms, Ĥ_t is the predicted water level, H_limit is the flood control limit water level, k_gen is the economic benefit coefficient of power generation per unit flow, Q_gen,t is the power generation flow, Q_eco,t is the current ecological flow, Q_eco,ideal is the ideal ecological flow, and R_flood, R_gen and R_eco are the flood control, power generation and ecological reward terms respectively.
6. The reservoir operation method based on a deep learning adaptive dynamic network and reinforcement learning according to claim 1, characterized in that a historical database of long-term running root mean square errors is constructed, the historical trends of the result error and the decision error are statistically analyzed periodically, and the preset error thresholds are adjusted dynamically according to the historical error trend.
7. The reservoir operation method based on a deep learning adaptive dynamic network and reinforcement learning according to claim 1, characterized in that obtaining the real-time water level data of the reservoir, the meteorological data and the radar image data of regional cloud cover and surface features comprises:
obtaining real-time water level data monitored by wave-type water level sensors, the wave-type water level sensors being deployed at key sections of the reservoir area and the main inflow rivers;
obtaining meteorological data from a meteorological data acquisition system deployed in the reservoir area, the meteorological data acquisition system comprising precipitation sensors, wind speed sensors and temperature-humidity sensors;
obtaining the radar image data of regional cloud cover and surface features by conducting satellite remote sensing observation of the reservoir and the upstream basin area using a high-resolution radar remote sensing satellite with a spatial resolution of 30 m.
CN202510819807.6A 2025-06-19 2025-06-19 Reservoir operation method based on deep learning adaptive dynamic network and reinforcement learning Active CN120338210B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510819807.6A CN120338210B (en) 2025-06-19 2025-06-19 Reservoir operation method based on deep learning adaptive dynamic network and reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510819807.6A CN120338210B (en) 2025-06-19 2025-06-19 Reservoir operation method based on deep learning adaptive dynamic network and reinforcement learning

Publications (2)

Publication Number Publication Date
CN120338210A CN120338210A (en) 2025-07-18
CN120338210B true CN120338210B (en) 2025-10-21

Family

ID=96370187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510819807.6A Active CN120338210B (en) 2025-06-19 2025-06-19 Reservoir operation method based on deep learning adaptive dynamic network and reinforcement learning

Country Status (1)

Country Link
CN (1) CN120338210B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120745946B (en) * 2025-08-27 2025-11-18 水利部珠江水利委员会珠江水利综合技术中心 Dynamic multi-objective optimization control system and method for reservoir flood period water level

Citations (3)

Publication number Priority date Publication date Assignee Title
CN115952958A (en) * 2023-03-14 2023-04-11 珠江水利委员会珠江水利科学研究院 Reservoir Group Joint Optimal Dispatch Method Based on MADDPG Reinforcement Learning
CN117575873A (en) * 2024-01-15 2024-02-20 安徽大学 Flood warning method and system integrating meteorological and hydrological sensitivity
CN119721368A (en) * 2024-12-13 2025-03-28 河海大学 A method and system for predicting power generation capacity of a hydropower station

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN111369181B (en) * 2020-06-01 2020-09-29 北京全路通信信号研究设计院集团有限公司 Train autonomous scheduling deep reinforcement learning method and device
US20250075602A1 (en) * 2023-08-30 2025-03-06 Saudi Arabian Oil Company Predicting gas lift equipment failure with deep learning techniques


Also Published As

Publication number Publication date
CN120338210A (en) 2025-07-18

Similar Documents

Publication Publication Date Title
CN113222296B (en) Flood control scheduling method based on digital twinning
CN119401452B (en) Power load prediction method, system equipment and medium based on multi-source heterogeneous data feature fusion
JP4807565B2 (en) Flow prediction device
CN114611778B (en) Reservoir water level early warning method and system based on warehousing flow
CN115343784A (en) Local air temperature prediction method based on seq2seq-attention model
CN120338210B (en) Reservoir operation method based on deep learning adaptive dynamic network and reinforcement learning
CN115271186B (en) A reservoir water level prediction and early warning method based on delay factor and PSO RNN Attention model
CN120106490A (en) Cascade power station dispatching parameter access and influencing factor prediction method
CN120067606A (en) Intelligent water conservancy dynamic monitoring and early warning method based on deep learning
CN119168115A (en) A dam seepage prediction method based on multi-scale convolutional neural network and bidirectional long short-term memory neural network
Liu et al. Lstm-based hazard source detection and risk assessment model for the shandong yellow river basin
CN119829912A (en) Marine environment forecasting method integrating daily climate state and machine learning model
CN119067269B (en) A method and system for correcting wind speed prediction in an integrated wind farm
CN115186879A (en) Water level prediction method based on deep learning meshed watershed inundation model
CN120726779A (en) A method and system for predicting and alarming water level in an expansion tank of a hydropower station
CN119811044A (en) An intelligent monitoring and early warning method and system for small hydropower stations in mountainous areas
CN120726765B (en) A method and device for coordinated flood warning issuance between upstream and downstream areas
CN118759603B (en) A retractable weather station with protection function
CN120652826B (en) Hydraulic engineering water delivery quantity adjusting method and system based on artificial intelligence
CN119168322B (en) A method for early warning analysis of full-pipe operation of sewage pipe network
CN121209593A (en) A Reservoir Flow Control Method and System Based on Artificial Intelligence
CN121146392A (en) Cascade reservoir flood limit water level joint application scheduling method and system
CN119721681A (en) A dam safety state memory maintenance method and system for digital twin water conservancy large model
CN120997018A (en) Flood disaster forecasting method based on spatiotemporal feature fusion
CN121456786A (en) Wind power ultra-short-term prediction method based on transform-LSTM fusion model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant