
CN118519340B - Intelligent heating control method based on deep reinforcement learning and adaptive control - Google Patents


Info

Publication number
CN118519340B
Authority
CN
China
Prior art keywords
thermal power
time period
heat load
heating
heat
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410578917.3A
Other languages
Chinese (zh)
Other versions
CN118519340A (en)
Inventor
路亮亮
杨雪平
刘文韬
于文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shijiazhuang Huadian Heat Supply Group Co ltd
Original Assignee
Shijiazhuang Huadian Heat Supply Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shijiazhuang Huadian Heat Supply Group Co ltd filed Critical Shijiazhuang Huadian Heat Supply Group Co ltd
Priority to CN202410578917.3A priority Critical patent/CN118519340B/en
Publication of CN118519340A publication Critical patent/CN118519340A/en
Application granted granted Critical
Publication of CN118519340B publication Critical patent/CN118519340B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply


Abstract


The present invention relates to the technical field of heating network control, and discloses an intelligent heating control method based on deep reinforcement learning and adaptive control, comprising the following steps: S1: collecting outdoor environment data and heat load data of each heating station, and performing data preprocessing; S2: training a deep reinforcement learning model based on the outdoor environment data and heat load data; S3: formulating a heating strategy based on the deep reinforcement learning model; S4: executing the heating strategy through an adaptive control algorithm. The present invention improves energy utilization efficiency, reduces unnecessary energy loss, enhances the stability and reliability of the entire heating system, ensures heating quality, and optimizes energy distribution.

Description

Intelligent heat supply regulation and control method based on deep reinforcement learning and self-adaptive control
Technical Field
The invention relates to the technical field of heat supply pipe network regulation and control, in particular to an intelligent heat supply regulation and control method based on deep reinforcement learning and self-adaptive control.
Background
With accelerating urbanization and the growing emphasis on environmental protection and energy conservation, intelligent and refined management of heating systems has become an inevitable trend in the industry. In northern regions especially, central heating systems guarantee residents' winter heating demand, and their operating efficiency and energy utilization directly affect a city's energy consumption structure and environmental performance. Modern heating systems have evolved from simple temperature control to automatic control systems that realize basic on-demand heating. Automation devices such as sensors and controllers enable a heating system to monitor indoor and outdoor temperature changes in real time and adjust the operating conditions of the heating equipment accordingly. Advanced data analysis has also entered the heating field: by mining and analyzing historical data, future heat load demand can be predicted, enabling a degree of predictive regulation.
However, conventional heating regulation is mostly based on fixed rules or simplified mathematical models and struggles to cope with complex, changeable external conditions and shifting user heat demand. A sudden weather change, for example, can increase the error of an existing prediction model, so the heating strategy cannot be adjusted promptly and accurately, causing over- or under-heating. Lacking a dynamic optimization mechanism, such systems also find it difficult to achieve a globally optimal energy configuration, which greatly limits heating efficiency. Current heating strategies mostly rely on statically set thresholds and empirical rules and lack the capacity for online learning and continuous optimization. Faced with rapidly changing environmental conditions and user habits, such one-size-fits-all regulation easily wastes heating resources or unbalances supply and demand. Because the regulation means are relatively crude and imprecise, a large deviation often exists between energy consumption and actual heat demand, raising operating costs and aggravating environmental pollution. In short, although the heating industry has advanced in automation and informatization, many challenges remain in practice.
For example, Chinese patent CN110736129B discloses an intelligent balance control system and method for an urban heating network. The system comprises a first adjusting component, a second adjusting component, a plate heat exchanger unit, and a control cabinet. A first regulating valve is installed on the primary supply pipe, a first temperature sensor on the primary return pipe, a second regulating valve and a second temperature sensor on the secondary supply pipe, and a third regulating valve on the bypass pipe. Each valve is adjusted according to the measured temperatures, separating the regulation of the primary network from that of the secondary network: the primary network is regulated by its return-water temperature, the secondary network by its supply-water temperature. When the secondary network's supply-temperature demand changes, the hydraulic conditions of the primary network are unaffected, keeping the primary network hydraulically stable. This decouples the operation of the primary and secondary networks and solves the problem of hydraulic imbalance between them.
The patent application published as CN108592173A discloses a heating network regulation method comprising: acquiring intelligent monitoring information, corresponding module information, temperature information of the heating network, pipeline information, and heat source information; deriving the network's heating load from current outdoor temperature information and building information; and, from the heating load, heat dissipation conditions, pipeline information, and heat source information, obtaining the operating values of the heating network that minimize total time consumption. That invention overcomes drawbacks of conventional heating networks and reduces their total energy consumption.
The problem with the prior art is that it struggles to cope fully with complex, changeable external conditions and shifting user heat demand; its regulation means are relatively crude and imprecise, leaving a large deviation between energy consumption and actual heat demand.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an intelligent heating control method based on deep reinforcement learning and adaptive control that improves energy utilization efficiency, reduces unnecessary energy loss, enhances the stability and reliability of the whole heating system, and optimizes energy distribution while guaranteeing heating quality.
In order to solve the technical problems, the invention provides the following technical scheme:
An intelligent heat supply regulation and control method based on deep reinforcement learning and self-adaptive control comprises the following steps: S1, collecting outdoor environment data and heat load data of each heating power station, and preprocessing the data;
S2, training a deep reinforcement learning model based on the outdoor environment data and the heat load data;
S3, formulating a heating strategy based on the deep reinforcement learning model;
and S4, executing the heating strategy through an adaptive control algorithm.
As a preferable scheme of the intelligent heat supply regulation and control method based on deep reinforcement learning and self-adaptive control, the outdoor environment data comprises temperature, humidity, wind speed and solar radiation intensity;
The heat load data represent the amount of heat a thermal station must supply to keep all users' indoor temperatures at the standard heating temperature.
The preprocessing comprises missing-value filling, outlier detection and replacement, data normalization, and time-synchronized integration; time-synchronized integration aligns the heat load data and outdoor environment data by time to form a corresponding time series.
As an optimal scheme of the intelligent heat supply regulation and control method based on deep reinforcement learning and self-adaptive control, the method for training the deep reinforcement learning model comprises the following steps:
S100, setting a time period for heat supply regulation, constructing a state space, an action set and a reward function, constructing a policy network and initializing parameters;
S200, selecting an action from the action set and executing it;
S300, calculating the cumulative reward value and updating the parameters of the policy network;
S400, entering the next time period, generating a new action set and updating the feature vector of the state space;
S500, repeating steps S200-S400 until the cumulative reward converges, completing the training of the policy network; saving the policy network and deploying the deep reinforcement learning model.
As a preferable scheme of the intelligent heat supply regulation and control method based on deep reinforcement learning and self-adaptive control, the time period is the minimum time unit for heat supply regulation and control;
the state space is composed of the outdoor environment data and heat load data;
the action set is dynamically generated based on the state space and the time period, and the method comprises the following steps:
For any time period A, the heat load data of each thermal station in the period corresponding to A in each of the previous n years are extracted, n being a positive integer; the n heat load values of the i-th station are averaged and the mean denoted E_Ai, where i ranges over 1, 2, ..., m and m denotes the number of thermal stations; an arithmetic sequence Q of k elements is set, Q = {q_1, q_2, ..., q_k}, where q_1 is the smallest element of Q with value in (0, 1) and q_k is the largest element with value in (1, 2); for time period A, the action set A_a then takes the form:
A_a = {a_1, a_2, ..., a_k};
with each action of the specific form:
a_j = {q_j·E_A1, q_j·E_A2, ..., q_j·E_Am};
wherein a_j denotes the j-th action in the action set, j ranges over 1, 2, ..., k, and q_j denotes the j-th element of the arithmetic sequence Q.
As a preferred scheme of the intelligent heating control method based on deep reinforcement learning and adaptive control, the reward function R(s_j, a_j) and the plant's total heat generation E_j are given by the equations reconstructed below, wherein s_j denotes the feature vector of the state space in the j-th time period; a_j denotes the action executed in the j-th time period; R(s_j, a_j) denotes the reward for executing a_j when the state feature vector is s_j; α is a weight parameter and β a proportionality coefficient; E_ij denotes the heat load supply allocated to the i-th thermal station in the j-th time period; E_j denotes the total heat generated by the thermal power plant in the j-th time period; and η_i denotes the heat loss coefficient when the plant delivers heat load to the i-th thermal station.
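The two displayed equations are rendered only as images in the source and did not survive extraction. The following reconstruction is an assumption consistent with the variable definitions above and with the stated aims of saving total energy while balancing each station's supply against its demand; the demand term \hat{E}_{ij} in particular is a hypothetical symbol introduced here:

```latex
% Assumed reconstruction -- not the patent's verbatim formulas.
% Reward: penalize total heat production (weight \alpha) and the
% per-station supply-demand mismatch (scale \beta):
R(s_j, a_j) = -\,\alpha\, E_j \;-\; \beta \sum_{i=1}^{m} \bigl|\, E_{ij} - \hat{E}_{ij} \,\bigr|
% Total heat generation, inflated by each delivery's loss coefficient \eta_i:
E_j = \sum_{i=1}^{m} \left( 1 + \eta_i \right) E_{ij}
```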
As a preferred scheme of the method, the feature vector of the state space consists of historical outdoor environment data, real-time outdoor environment data, and the historical heat load data of each thermal station. It is updated as follows: let the period now being entered be B; extract the heat load data and outdoor environment data of each thermal station in the period corresponding to B in each of the previous n years, acquire the real-time outdoor environment data, and assemble these into the feature vector of the state space for period B.
As a preferred scheme of the method, the heating strategy is formulated as follows: at the start of each heating-regulation period, real-time outdoor environment parameters are collected and fed into the deep reinforcement learning model; the model automatically generates the action set and outputs the action with the highest selection probability; based on that action, the heat load supply the plant delivers to each thermal station in the current period is determined, and the plant's total heat generation for the period is calculated.
As a preferred scheme of the method, executing the heating strategy specifically comprises adaptively controlling the plant's total heat generation and adaptively controlling the heat load supply the plant delivers to each thermal station. The total heat generation is adaptively controlled as follows:
Collecting fuel consumption in real time;
A PID controller is designed for the thermal power plant, and the error term e_0(t) is calculated as:
e_0(t) = E_j − E(t);
wherein E(t) denotes the actual heat generated up to the current moment:
E(t) = η_0 · m(t) · H_v;
wherein η_0 denotes the boiler thermal efficiency, m(t) denotes the fuel mass consumed so far, and H_v denotes the heating value of the fuel;
The control output u_0(t) at the current moment is calculated with the PID algorithm, wherein u_0(t) denotes the adjusted fuel supply;
Based on u_0(t), the fuel supply is adjusted through the hardware equipment;
The actual heat generation is continuously monitored and the fuel supply adjusted until the actual heat generation reaches the total heat generation.
As a preferred scheme of the intelligent heating control method based on deep reinforcement learning and adaptive control, the heat load supply delivered by the thermal power plant to each thermal station is adaptively controlled as follows:
Each thermal station's primary supply temperature, primary return temperature, and instantaneous primary supply flow are collected in real time;
A PID controller is designed for each thermal station, and the error term e_i(t) of the i-th station's controller is calculated as:
e_i(t) = E_ij − E_i(t);
wherein E_i(t) denotes the heat load actually received by the i-th thermal station:
E_i(t) = c · (T_ri − T_si) · Q_i(t);
wherein c is the specific heat capacity of water; T_ri denotes the primary return temperature of the i-th station; T_si denotes its primary supply temperature; and Q_i(t) denotes the total primary water supplied to the i-th station from the start of the current period to the current moment, obtained by integrating the instantaneous primary supply flow over time;
Each station's control output u_i(t) at the current moment is calculated with the PID algorithm, wherein u_i(t) denotes the i-th station's branch-valve opening and circulation-pump frequency;
The heat load actually received by each station is continuously monitored, and the branch-valve opening and circulation-pump frequency are adjusted until each station's received heat load reaches its allocated heat load supply.
Compared with the prior art, the invention has the following beneficial effects:
Applying deep reinforcement learning to heating regulation shifts the system from passive on-demand heating to active prediction and optimization of the heating strategy, markedly raising its level of intelligence. Deep reinforcement learning handles macroscopic strategy planning, taking into account complex factors such as seasonality, weather changes, and predicted user demand, while adaptive control responds rapidly to current actual conditions. This overcomes the limitations of either technique alone, reduces the complexity and burden of the deep reinforcement learning model, and accelerates network convergence; combining the two organically satisfies both the long-term stable, efficient operation of the heating system and short-term flexible regulation needs.
The deep-learning-based heating strategy matches actual heat demand more accurately, avoiding over- or under-supply; this improves energy utilization, reduces unnecessary energy loss, and supports energy conservation and emission reduction. Executing the strategy with an adaptive control algorithm allows the plant's total heat generation and each station's received heat load supply to be adjusted dynamically according to actual conditions, strengthening the stability and reliability of the whole heating system and optimizing energy distribution while guaranteeing heating quality.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:
FIG. 1 is a flow chart of an intelligent heat supply regulation and control method based on deep reinforcement learning and self-adaptive control provided by the invention;
FIG. 2 is a flow chart of a method for training a deep reinforcement learning model for formulating a heating strategy in accordance with the present invention.
Detailed Description
The following detailed description of the invention refers to the accompanying drawings and specific embodiments. The specific features of the embodiments elaborate the technical solution of the invention without limiting it, and, absent conflict, the embodiments and their technical features may be combined with one another.
Example 1
This embodiment describes an intelligent heating control method based on deep reinforcement learning and adaptive control. Referring to FIG. 1, the method comprises the following steps:
S1, collecting outdoor environment data and heat load data of each thermal station, and preprocessing the data;
The outdoor environment data comprise temperature, humidity, wind speed, and solar radiation intensity, and can be used to predict each thermal station's heat load demand. Temperature is one of the most significant factors in heat load demand: lower outdoor temperatures increase the heating load. In the heating season, high humidity makes the human body feel colder and raises the demand for heating. Wind affects convective heat exchange at the building's outer surfaces: higher wind speeds accelerate heat dissipation through the building envelope and increase heat loss. Solar radiation directly affects a building's heat gain; especially for buildings with large glass curtain walls or south-facing windows, sunlight can raise indoor temperatures noticeably and reduce heating demand.
The heat load data represent the amount of heat a thermal station must supply to keep all users' indoor temperatures at the standard heating temperature.
The preprocessing comprises missing value filling, abnormal value detection and replacement, data normalization and data synchronous integration;
the method for synchronously integrating the data is to correspond the heat load data and the outdoor environment data according to time to form a time sequence corresponding to the heat load data and the outdoor environment data.
The method for filling the missing values is one of linear interpolation, polynomial interpolation and prediction based on a time sequence model;
The method of data normalization is one of min-max normalization and Z-score normalization. Because outdoor environment parameters and heat load data usually have different dimensions and numeric ranges, they must be converted to a common scale so that the deep reinforcement learning model can process all input features. After preprocessing, the collected data can serve as valid input to the deep reinforcement learning model, from which an intelligent model that adjusts the heating strategy to real-time environmental conditions is constructed; a sketch of the pipeline follows.
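A minimal sketch of this preprocessing pipeline, assuming pandas and hypothetical column names ('timestamp' plus one column per feature); linear interpolation and min-max normalization are the options chosen here from the alternatives listed above:

```python
import pandas as pd

def preprocess(load_df: pd.DataFrame, weather_df: pd.DataFrame) -> pd.DataFrame:
    """Missing-value filling, outlier replacement, min-max normalization,
    and time-synchronized integration, as described above."""
    # Time-synchronized integration: align both sources on a shared timestamp.
    df = load_df.merge(weather_df, on="timestamp", how="inner").set_index("timestamp")

    # Missing-value filling by linear interpolation (one of the listed options).
    df = df.interpolate(method="linear")

    # Outlier detection and replacement: clip values beyond 3 standard deviations.
    for col in df.columns:
        mu, sigma = df[col].mean(), df[col].std()
        df[col] = df[col].clip(mu - 3 * sigma, mu + 3 * sigma)

    # Min-max normalization so all features share the same scale.
    return (df - df.min()) / (df.max() - df.min())
```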
S2, training a deep reinforcement learning model based on the outdoor environment data and the heat load data, wherein referring to FIG. 2, the method comprises the following steps:
S100, setting a time period for heat supply regulation, constructing a state space, an action set and a reward function, constructing a policy network and initializing parameters;
The time period is the minimum time unit for heat supply regulation and control, for example, one day is set as one time period;
the state space is composed of the outdoor environment data and heat load data;
the action set is dynamically generated based on the state space and the time period, and the method comprises the following steps:
For any time period A, the heat load data of each thermal station in the period corresponding to A in each of the previous n years are extracted, n being a positive integer; the n heat load values of the i-th station are averaged and the mean denoted E_Ai, where i ranges over 1, 2, ..., m and m denotes the number of thermal stations; an arithmetic sequence Q of k elements is set, Q = {q_1, q_2, ..., q_k}, where q_1 is the smallest element of Q with value in (0, 1) and q_k is the largest element with value in (1, 2); for time period A, the action set A_a then takes the form:
A_a = {a_1, a_2, ..., a_k};
with each action of the specific form:
a_j = {q_j·E_A1, q_j·E_A2, ..., q_j·E_Am};
wherein a_j denotes the j-th action in the action set, j ranges over 1, 2, ..., k, and q_j denotes the j-th element of the arithmetic sequence Q;
This scheme anchors the candidate actions to each station's historical heat load demand: for example, the i-th station's optional actions may be set to 0.8, 0.9, 1.0, 1.1, and 1.2 times its historical average heat load. By comparing real-time outdoor environment parameters with historical ones, the policy network refines these history-based actions, which improves the training speed of the deep reinforcement learning model; a sketch of the action-set construction follows.
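A short sketch of the action-set construction, assuming NumPy; the defaults q_min = 0.8 and q_max = 1.2 with k = 5 reproduce the 0.8x-1.2x example above (the text requires q_1 in (0, 1) and q_k in (1, 2)):

```python
import numpy as np

def make_action_set(hist_loads: np.ndarray, k: int = 5,
                    q_min: float = 0.8, q_max: float = 1.2) -> np.ndarray:
    """Build the action set A_a for one control period.

    hist_loads has shape (n, m): the heat load of each of the m stations in
    the matching period of each of the previous n years. Returns shape
    (k, m), where row j is action a_j = {q_j * E_Ai}.
    """
    e_avg = hist_loads.mean(axis=0)     # E_Ai for i = 1..m
    q = np.linspace(q_min, q_max, k)    # arithmetic sequence Q
    return q[:, None] * e_avg[None, :]  # broadcast: (k, 1) * (1, m) -> (k, m)
```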
The reward function follows the form reconstructed earlier, wherein s_j denotes the feature vector of the state space in the j-th time period; a_j denotes the action executed in that period; R(s_j, a_j) denotes the reward for executing a_j when the state feature vector is s_j; α is a weight parameter and β a proportionality coefficient, both set by those skilled in the art according to actual requirements; E_ij denotes the heat load supply allocated to the i-th thermal station in the j-th time period, determined by the action selected for that period; and E_j denotes the plant's total heat generation in the j-th time period, computed from the per-station heat load supplies, wherein η_i denotes the heat loss coefficient when the plant delivers heat load to the i-th thermal station, determined experimentally by those skilled in the art.
The reward function balances the plant's total heat production against each station's supply and demand, saving total energy while keeping the per-station heat load supplies as uniform as possible; a computational sketch follows.
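A sketch of the reward computation under the reconstructed functional form given earlier; since the source renders the actual equation as an image, this form is an assumption, not the patent's verbatim equation:

```python
import numpy as np

def reward(e_alloc: np.ndarray, e_demand: np.ndarray, eta: np.ndarray,
           alpha: float = 1.0, beta: float = 1.0) -> float:
    """Penalize total heat production (weight alpha) and per-station
    supply-demand mismatch (scale beta) -- an assumed reconstruction.

    e_alloc:  E_ij for every station i under the chosen action
    e_demand: each station's heat load demand in the same period
    eta:      per-station heat loss coefficients eta_i
    """
    e_total = float(np.sum((1.0 + eta) * e_alloc))       # assumed E_j
    mismatch = float(np.sum(np.abs(e_alloc - e_demand)))
    return -alpha * e_total - beta * mismatch
```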
The policy network comprises an input layer, a hidden layer, and an output layer: the input layer receives the feature vector of the state space, the hidden layer further extracts features of the state, and the output layer generates the selection probability of each action in the action set under the current state. A softmax function converts the outputs into a probability distribution so that the selection probabilities of all actions sum to 1; a minimal sketch follows.
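A minimal policy-network sketch in PyTorch matching this description; state_dim, the hidden width, and k (the number of actions) are illustrative choices:

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Input layer -> hidden layer -> output layer with softmax."""
    def __init__(self, state_dim: int, k: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),   # input layer -> hidden features
            nn.ReLU(),
            nn.Linear(hidden, k),           # one logit per action in A_a
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # softmax guarantees the selection probabilities sum to 1
        return torch.softmax(self.net(state), dim=-1)
```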
S200, selecting an action from the action set and executing it;
An action is selected as follows:
input the feature vector of the current state space into the policy network to obtain the selection probability of each action in the action set under the current state;
set a threshold parameter ε with value in (0, 0.15);
generate a random number r in [0, 1]; if r ≥ ε, execute the action with the highest selection probability; if r < ε, select an action from the action set uniformly at random and execute it, as sketched below;
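The ε-greedy rule above, sketched in Python:

```python
import numpy as np

def select_action(probs: np.ndarray, epsilon: float = 0.1,
                  rng: np.random.Generator | None = None) -> int:
    """Epsilon-greedy rule from the text: epsilon lies in (0, 0.15); draw
    r in [0, 1]; exploit when r >= epsilon, explore uniformly otherwise."""
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() >= epsilon:
        return int(np.argmax(probs))        # action with highest probability
    return int(rng.integers(len(probs)))    # uniform random exploration
```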
S300, calculating the cumulative reward value and updating the parameters of the policy network;
the cumulative reward is calculated as:
R_N = Σ_{j=1}^{N} β^j · R(s_j, a_j);
wherein R_N denotes the current cumulative reward; N denotes the number of actions executed so far; β denotes the discount factor and β^j its j-th power; R(s_j, a_j) denotes the reward for executing action a_j in state s_j; and j ranges over 1, 2, ..., N.
The parameters of the policy network are updated as:
δ ← δ + η · ∇_δ(L_N);
wherein δ denotes any parameter of the policy network; ∇_δ(·) denotes the gradient of the bracketed function with respect to δ; η is the learning rate; and L_N is the objective function:
L_N = ln(p(s_N, a_N) · R_N);
wherein p(s_N, a_N) denotes the selection probability of action a_N in the environmental state s_N.
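A sketch of one policy update, reusing the PolicyNet sketch above. The printed objective L_N = ln(p(s_N, a_N)·R_N) is ambiguous in its parenthesization; this sketch follows the standard REINFORCE form ln p(s_N, a_N)·R_N and should be read as an interpretation, not the patent's verbatim method:

```python
import torch

def update_policy(policy: "PolicyNet", optimizer: torch.optim.Optimizer,
                  states: torch.Tensor, actions: torch.Tensor,
                  rewards: list, beta: float = 0.99) -> None:
    """One gradient ascent step on L_N (implemented as descent on -L_N)."""
    # Cumulative discounted reward R_N = sum_{j=1..N} beta^j * R(s_j, a_j)
    r_n = sum(beta ** j * r for j, r in enumerate(rewards, start=1))

    probs = policy(states[-1])              # action distribution at state s_N
    log_p = torch.log(probs[actions[-1]])   # ln p(s_N, a_N)
    loss = -log_p * r_n                     # maximize L_N <=> minimize -L_N
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```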
S400, entering the next time period, generating a new action set and updating the feature vector of the state space;
The feature vector of the state space consists of historical outdoor environment data, real-time outdoor environment data, and the historical heat load data of each thermal station. It is updated as follows: let the period now being entered be B; extract the heat load data and outdoor environment data of each thermal station in the period corresponding to B in each of the previous n years, acquire the real-time outdoor environment data, and assemble these into the feature vector of the state space for period B.
During training of the deep reinforcement learning model, the real-time outdoor environment data are obtained by reading the outdoor environment data of the corresponding period from the state space; after training, in actual deployment, they are collected in real time by sensors and other detection equipment.
S500, repeating steps S200-S400 until the cumulative reward converges, completing the training of the policy network; saving the policy network and deploying the deep reinforcement learning model.
After multiple iterations, the cumulative reward stabilizes and no longer fluctuates significantly; it is then considered to have converged, and the policy network is able to make decisions that maximize the cumulative reward. A simple convergence check is sketched below.
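One way to operationalize this convergence test; the window size and tolerance are illustrative choices, not values from the source:

```python
def has_converged(reward_history: list, window: int = 50, tol: float = 1e-3) -> bool:
    """Treat the cumulative reward as converged when its range over a
    recent window is negligible relative to its current magnitude."""
    if len(reward_history) < window:
        return False
    recent = reward_history[-window:]
    return max(recent) - min(recent) < tol * max(1.0, abs(recent[-1]))
```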
S3, formulating a heating strategy based on the deep reinforcement learning model, wherein the method comprises the following steps:
Determining, based on the action with the highest selection probability, the heat load supply the thermal power plant delivers to each thermal station in the current heating-regulation period, and calculating the plant's total heat generation for that period.
And S4, executing the heating strategy through the adaptive control algorithm, which comprises adaptively controlling the total heat generation of the thermal power plant and adaptively controlling the heat load supply delivered by the plant to each thermal station.
The method for adaptively controlling the total heat generation amount of the thermal power plant comprises the following steps:
Collecting fuel consumption in real time;
A PID controller is designed for the thermal power plant, and the error term e_0(t) is calculated as:
e_0(t) = E_j − E(t);
wherein E(t) denotes the actual heat generated up to the current moment:
E(t) = η_0 · m(t) · H_v;
wherein η_0 denotes the boiler thermal efficiency, i.e. the fraction of the heat released by fuel combustion that is converted into boiler feed-water heat; m(t) denotes the fuel mass consumed so far; and H_v denotes the heating value of the fuel (the heat released by complete combustion of a unit mass, usually given in the fuel's product documentation);
The control output u_0(t) at the current moment is calculated with the PID algorithm, wherein u_0(t) denotes the adjusted fuel supply;
The control output u_0(t) is mapped into the adjustment range of the actual fuel supply, and the fuel valve is driven via the combustion controller or a servo motor to control the fuel supply precisely;
The actual heat generation is continuously monitored and the fuel supply adjusted until the actual heat generation reaches the total heat generation;
The PID controller's parameters are initially tuned through on-site commissioning and empirical formulas, according to the boiler's response characteristics and the system's stability requirements, and are adjusted periodically; a sketch of this control loop follows.
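A sketch of the plant-level loop, assuming hypothetical instrumentation callables read_fuel_mass and set_fuel_supply; the PID gains are placeholders standing in for the on-site tuning described above:

```python
import time

class PID:
    """Textbook discrete PID; gains are placeholders for on-site tuning."""
    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err: float, dt: float) -> float:
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

def regulate_total_heat(target_e_j: float, read_fuel_mass, set_fuel_supply,
                        eta0: float, h_v: float, dt: float = 60.0) -> None:
    """Drive E(t) = eta0 * m(t) * H_v toward the target E_j. read_fuel_mass
    and set_fuel_supply stand in for the plant's fuel metering and
    combustion-controller interfaces (hypothetical callables)."""
    pid = PID(kp=0.5, ki=0.05, kd=0.1)          # placeholder gains
    while True:
        e_t = eta0 * read_fuel_mass() * h_v     # actual heat produced so far
        err = target_e_j - e_t                  # e_0(t) = E_j - E(t)
        if abs(err) < 1e-3 * target_e_j:        # close enough to the target
            break
        set_fuel_supply(pid.step(err, dt))      # u_0(t): adjusted fuel supply
        time.sleep(dt)                          # wait one control interval
```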
The heat load supply delivered by the thermal power plant to each thermal station is adaptively controlled as follows:
Each thermal station's primary supply temperature, primary return temperature, and instantaneous primary supply flow are collected in real time; the heating-network data are acquired by installing flow meters and thermometers on the main pipe network.
A PID controller is designed for each thermal station, and the error term e_i(t) of the i-th station's controller is calculated as:
e_i(t) = E_ij − E_i(t);
wherein E_i(t) denotes the heat load actually received by the i-th thermal station:
E_i(t) = c · (T_ri − T_si) · Q_i(t);
wherein c is the specific heat capacity of water; T_ri denotes the primary return temperature of the i-th station; T_si denotes its primary supply temperature; and Q_i(t) denotes the total primary water supplied to the i-th station from the start of the current period to the current moment, obtained by integrating the instantaneous primary supply flow over time;
Each station's control output u_i(t) at the current moment is calculated with the PID algorithm, wherein u_i(t) denotes the i-th station's branch-valve opening and circulation-pump frequency;
Adjusting the branch-valve openings and circulation-pump frequencies changes how the primary supply water delivered by the plant is apportioned among the thermal stations, and hence how the heat load supply is distributed among them;
The heat load actually received by each station is continuously monitored, and the branch-valve opening and circulation-pump frequency are adjusted until each station's received heat load reaches its allocated heat load supply; a per-station sketch follows.
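A per-station sketch reusing the PID helper from the previous block; read_temps_flow and set_valve_and_pump are hypothetical interfaces to the station's meters, branch valve, and circulation pump, and the sign convention (T_ri − T_si) follows the text as printed:

```python
import time

def regulate_station(target_e_ij: float, read_temps_flow, set_valve_and_pump,
                     c: float = 4186.0, dt: float = 60.0) -> None:
    """Drive E_i(t) = c * (T_ri - T_si) * Q_i(t) toward the allocated
    supply E_ij. c is the specific heat of water in J/(kg*K)."""
    pid = PID(kp=0.8, ki=0.02, kd=0.0)          # placeholder gains
    q_total = 0.0                               # Q_i(t): integral of flow
    while True:
        t_ri, t_si, flow = read_temps_flow()    # return temp, supply temp, flow
        q_total += flow * dt                    # integrate flow over time
        e_it = c * (t_ri - t_si) * q_total      # heat actually received, E_i(t)
        err = target_e_ij - e_it                # e_i(t) = E_ij - E_i(t)
        if abs(err) < 1e-3 * abs(target_e_ij):
            break
        set_valve_and_pump(pid.step(err, dt))   # valve opening & pump frequency
        time.sleep(dt)                          # wait one control interval
```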
In deep reinforcement learning, adaptive control helps improve performance and convergence speed. By monitoring and adjusting system parameters in real time, adaptive control solves part of the low-level control problem and, especially when the system's dynamic characteristics change, performs local optimization faster, preserving system stability.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The embodiments of the invention have been described above with reference to the accompanying drawings, but the invention is not limited to these embodiments, which are illustrative rather than restrictive. Those of ordinary skill in the art may derive many further forms without departing from the spirit of the invention and the scope of the claims, all of which fall within the protection of the invention.

Claims (6)

1. An intelligent heating control method based on deep reinforcement learning and adaptive control, characterized by comprising the following steps:
S1: collect outdoor environment data and the heat load data of each thermal station, and perform data preprocessing;
S2: train a deep reinforcement learning model based on the outdoor environment data and heat load data; the model is trained as follows:
S100: set the time period for heating regulation; construct the state space, the action set, and the reward function; construct the policy network and initialize its parameters;
the time period is the minimum time unit of heating regulation;
the state space is composed of the outdoor environment data and heat load data;
the action set is generated dynamically from the state space and the time period, as follows: for any time period A, extract the heat load data of each thermal station in the period corresponding to A in each of the previous n years, n being a positive integer; average the n heat load values of the i-th station, denoted E_Ai, where i ranges over 1, 2, ..., m and m denotes the number of thermal stations; set an arithmetic sequence Q of k elements, Q = {q_1, q_2, ..., q_k}, where q_1 is the smallest element of Q with value in (0, 1) and q_k is the largest element with value in (1, 2); for time period A, the action set A_a then takes the form:
A_a = {a_1, a_2, ..., a_k};
wherein each action has the specific form:
a_j = {q_j·E_A1, q_j·E_A2, ..., q_j·E_Am};
wherein a_j denotes the j-th action in the action set, j ranges over 1, 2, ..., k, and q_j denotes the j-th element of the arithmetic sequence Q;
the reward function is calculated as follows:
wherein s_j denotes the feature vector of the state space in the j-th time period; a_j denotes the action executed in the j-th time period; R(s_j, a_j) denotes the reward for executing a_j when the state feature vector is s_j; α is a weight parameter and β a proportionality coefficient; E_ij denotes the heat load supply allocated to the i-th thermal station in the j-th time period, and the corresponding demand term denotes the heat load demand of the i-th thermal station in that period; E_j denotes the total heat generated by the thermal power plant in the j-th time period, calculated as follows:
wherein η_i denotes the heat loss coefficient when the thermal power plant delivers heat load to the i-th thermal station;
S200: select an action from the action set and execute it;
S300: calculate the cumulative reward value and update the parameters of the policy network;
S400: enter the next time period, generate a new action set, and update the feature vector of the state space;
S500: repeat steps S200-S400 until the cumulative reward converges, completing training of the policy network; save the policy network and deploy the deep reinforcement learning model;
S3: formulate a heating strategy based on the deep reinforcement learning model;
S4: execute the heating strategy through an adaptive control algorithm.
2. The intelligent heating control method based on deep reinforcement learning and adaptive control according to claim 1, characterized in that the outdoor environment data comprise temperature, humidity, wind speed, and solar radiation intensity;
the heat load data represent the amount of heat a thermal station must supply to keep all users' indoor temperatures at the standard heating temperature;
the preprocessing comprises missing-value filling, outlier detection and replacement, data normalization, and time-synchronized integration, wherein time-synchronized integration aligns the heat load data and outdoor environment data by time to form a corresponding time series.
3. The intelligent heating control method based on deep reinforcement learning and adaptive control according to claim 2, characterized in that the feature vector of the state space consists of historical outdoor environment data, real-time outdoor environment data, and the historical heat load data of each thermal station, and is updated as follows: let the period now being entered be B; extract the heat load data and outdoor environment data of each thermal station in the period corresponding to B in each of the previous n years, acquire real-time outdoor environment data, and assemble these into the feature vector of the state space for period B.
4. The intelligent heating control method based on deep reinforcement learning and adaptive control according to claim 3, characterized in that the heating strategy is formulated as follows: at the start of each heating-regulation period, collect real-time outdoor environment parameters and input them into the deep reinforcement learning model; the model automatically generates the action set and outputs the action with the highest selection probability; based on that action, determine the heat load supply the thermal power plant delivers to each thermal station in the current period, and calculate the plant's total heat generation for the current period.
5. The intelligent heating control method based on deep reinforcement learning and adaptive control according to claim 4, characterized in that executing the heating strategy specifically comprises adaptively controlling the plant's total heat generation and adaptively controlling the heat load supply delivered to each thermal station, wherein the total heat generation is adaptively controlled as follows:
collect fuel consumption in real time;
design a PID controller for the thermal power plant and compute the error term e_0(t):
e_0(t) = E_j − E(t);
wherein E(t) denotes the actual heat generated at the current moment:
E(t) = η_0 · m(t) · H_v;
wherein η_0 denotes the boiler thermal efficiency, m(t) the fuel mass consumed so far, and H_v the heating value of the fuel;
compute the control output u_0(t) at the current moment with the PID algorithm, wherein u_0(t) denotes the adjusted fuel supply;
based on u_0(t), adjust the fuel supply through the hardware equipment;
continuously monitor the actual heat generation and adjust the fuel supply until it reaches the total heat generation.
6. The intelligent heating control method based on deep reinforcement learning and adaptive control according to claim 5, characterized in that the heat load supply delivered by the plant to each thermal station is adaptively controlled as follows:
collect, in real time, each thermal station's primary supply temperature, primary return temperature, and instantaneous primary supply flow;
design a PID controller for each thermal station and compute the error term e_i(t) of the i-th station's controller:
e_i(t) = E_ij − E_i(t);
wherein E_i(t) denotes the heat load actually received by the i-th thermal station:
E_i(t) = c · (T_ri − T_si) · Q_i(t);
wherein c is the specific heat capacity of water; T_ri denotes the primary return temperature of the i-th station; T_si denotes its primary supply temperature; Q_i(t) denotes the total primary water supplied to the i-th station from the start of the current period to the current moment, obtained by integrating the instantaneous primary supply flow over time;
compute each station's control output u_i(t) at the current moment with the PID algorithm, wherein u_i(t) denotes the i-th station's branch-valve opening and circulation-pump frequency;
continuously monitor the heat load actually received by each station and adjust the branch-valve opening and circulation-pump frequency until each station's received heat load reaches its allocated heat load supply.
CN202410578917.3A 2024-05-10 2024-05-10 Intelligent heating control method based on deep reinforcement learning and adaptive control Active CN118519340B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410578917.3A CN118519340B (en) 2024-05-10 2024-05-10 Intelligent heating control method based on deep reinforcement learning and adaptive control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410578917.3A CN118519340B (en) 2024-05-10 2024-05-10 Intelligent heating control method based on deep reinforcement learning and adaptive control

Publications (2)

Publication Number Publication Date
CN118519340A CN118519340A (en) 2024-08-20
CN118519340B true CN118519340B (en) 2025-02-11

Family

ID=92276615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410578917.3A Active CN118519340B (en) 2024-05-10 2024-05-10 Intelligent heating control method based on deep reinforcement learning and adaptive control

Country Status (1)

Country Link
CN (1) CN118519340B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118780941A (en) * 2024-09-12 2024-10-15 广东联航智能科技有限公司 Energy-saving management early warning method and system based on big data
CN119554904B (en) * 2025-01-24 2025-05-06 深圳大学 Temperature storage device based on chemical reversible thermal effect and control method of temperature storage device
CN120043149A (en) * 2025-04-24 2025-05-27 北京智合瑞行能源科技有限公司 Heat network balancing system and method based on heat station load characteristic difference weighted algorithm

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109631150A (en) * 2018-12-19 2019-04-16 天津宏达瑞信科技有限公司 A kind of accurate heat supply method of temperature based on secondary network, pressure and user's room temperature

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102715372B1 * 2022-02-25 2024-10-14 Kyung Hee University Industry-Academic Cooperation Foundation An auto-adaptive controller tuning system for wastewater treatment process using data-driven smart decisions of offline reinforcement learning
CN114909706B (en) * 2022-04-24 2024-05-07 常州英集动力科技有限公司 Two-level network balance regulation and control method based on reinforcement learning algorithm and differential pressure control
CN116697445A (en) * 2023-06-06 2023-09-05 巴云海 Heat supply control method for deep learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109631150A (en) * 2018-12-19 2019-04-16 天津宏达瑞信科技有限公司 A kind of accurate heat supply method of temperature based on secondary network, pressure and user's room temperature

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on heat allocation optimization for multiple thermal stations based on policy gradient; Tan Mengyuan; China Master's Theses Full-text Database, Engineering Science and Technology II; 2023-03-31; 2023(03); C038-894 *

Also Published As

Publication number Publication date
CN118519340A (en) 2024-08-20

Similar Documents

Publication Publication Date Title
CN118519340B (en) Intelligent heating control method based on deep reinforcement learning and adaptive control
CN110458443B (en) Smart home energy management method and system based on deep reinforcement learning
Kang et al. A novel approach of day-ahead cooling load prediction and optimal control for ice-based thermal energy storage (TES) system in commercial buildings
CN111561732B (en) Heat exchange station heat supply adjusting method and system based on artificial intelligence
CN111580382B (en) Unit-level heat supply adjusting method and system based on artificial intelligence
Lu et al. Data augmentation strategy for short-term heating load prediction model of residential building
Yuan et al. Analysis and evaluation of the operation data for achieving an on-demand heating consumption prediction model of district heating substation
CN118316092B (en) Intelligent control method, device, system, electronic device and storage medium for coordinated power consumption of source, grid, load and storage
CN115333168B (en) A field-level control strategy for offshore wind farms based on distributed rolling optimization
CN116734424A (en) Control method of indoor thermal environment based on RC model and deep reinforcement learning
CN119713515B (en) Optimal control method of central air-conditioning system for building load forecasting
CN115455835A (en) Optimized operation method of renewable energy-containing multi-heat-source networked heating system
CN117913812B (en) Deep learning-based interactive regulation and control method for wind-solar power supply and flexible load
CN115471006A (en) Power supply planning method and system considering wind power output uncertainty
CN112560160A (en) Model and data-driven heating ventilation air conditioner optimal set temperature obtaining method and equipment
CN116995659A (en) Flexible operation method of heat pump system considering renewable energy source consumption
CN119382251B (en) Cooperative optimization method of multiple types of power sources based on complementary characteristics
CN119647672B (en) A near-zero energy building energy supply method and system based on multi-objective optimization
Li et al. Enhancing demand response and heating performance of air source heat pump through optimal water temperature scheduling: Method and application
CN119227998A (en) Energy management strategy for electric-thermal integrated energy system based on DQN-CE algorithm with noise network and self-attention mechanism
CN116485582A (en) Heat supply optimization regulation and control method and device based on deep learning
CN116454996A (en) Hydropower station real-time load distribution method based on deep reinforcement learning
CN114372645A (en) Energy supply system optimization method and system based on multi-agent reinforcement learning
EP4109200A1 (en) Acquiring a user consumption pattern of domestic hot water and controlling domestic hot water production based thereon
CN116107206A (en) Greenhouse equipment control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant