CN113825356B

CN113825356B - Energy-saving control method and device for cold source system, electronic equipment and storage medium

Info

Publication number: CN113825356B
Application number: CN202110856943.4A
Authority: CN
Inventors: 林依挺; 吴俊杰; 夏恒; 贾庆山; 王宇恒; 唐静娴; 陆翔
Original assignee: Tsinghua University; Tencent Technology Shenzhen Co Ltd
Current assignee: Tsinghua University; Tencent Technology Shenzhen Co Ltd
Priority date: 2021-07-28
Filing date: 2021-07-28
Publication date: 2023-11-28
Anticipated expiration: 2041-07-28
Also published as: CN113825356A

Abstract

The embodiments of the present application disclose an energy-saving control method and device for a cold source system, an electronic device, and a storage medium. In the embodiments, a current state quantity of the cold source system and a target control strategy of a preset control model are acquired; predicted state quantities of the cold source system in a plurality of dimensions in a target period are predicted according to the current state quantity and the target control strategy, and the predicted state quantities of the plurality of dimensions are fused; profit calculation is performed on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value; the preset control model is used to determine, based on the profit value, a total profit value of the cold source system in a preset time period; when the total profit value does not meet a preset condition, the target control strategy is adjusted according to the total profit value, and prediction continues with the adjusted control strategy as the target control strategy; and when the total profit value meets the preset condition, a trained control model is output for controlling the cold source system. The scheme can effectively realize energy-saving control of the cold source system.

Description

Energy-saving control method and device for cold source system, electronic equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to an energy-saving control method and apparatus for a cold source system, an electronic device, and a storage medium.

Background

A data center is a machine room that houses servers in the fields of communication and information technology. It is used to transfer, accelerate, display, compute, and store data over internet infrastructure. With the continuous development of information technology and people's growing material and cultural needs, more and more enterprises recognize that data processing, storage, and exchange strongly affect enterprise value; data is becoming an enterprise's most important asset, and data centers are developing rapidly. A data center generally consumes a large amount of electric energy and generates a large amount of heat. In the prior art, however, the data center is cooled only by an air conditioning system, which cannot meet the need for real-time, effective energy-saving control of the data center and is neither energy-efficient nor environmentally friendly enough.

Disclosure of Invention

The embodiment of the application provides an energy-saving control method, an energy-saving control device, electronic equipment and a storage medium for a cold source system, which can effectively realize energy-saving control of the cold source system and greatly reduce energy consumption of the cold source system.

The embodiment of the application provides an energy-saving control method of a cold source system, which comprises the following steps:

acquiring a current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises state quantities of a current period and of a preset period before the current period;

predicting the predicted state quantity of a plurality of dimensions of the cold source system in a target period according to the current state quantity and a target control strategy, and fusing the predicted state quantity of the plurality of dimensions to obtain a fused predicted state quantity;

carrying out profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in a target period;

determining a total profit value of the cold source system in a preset time period based on the profit value by adopting a preset control model, wherein the preset time period comprises at least one target period;

when the total profit value does not meet a preset condition, adjusting the target control strategy according to the total profit value to obtain an adjusted control strategy, and continuing to predict the predicted state quantity of the cold source system in multiple dimensions in the target period by taking the adjusted control strategy as the target control strategy;

and when the total profit value meets the preset condition, outputting a trained control model for controlling the cold source system.
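The steps above form an iterate-until-satisfied loop, which can be sketched as follows. This is a minimal illustration only, not the implementation of the application: the function name `train_control_policy`, the callables `act`, `predict_fused`, `profit`, and `adjust`, the representation of a strategy as a list of floats, and the use of a simple threshold as the "preset condition" are all assumptions made for the sketch.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def train_control_policy(
    initial_state: Vector,
    policy: Vector,
    act: Callable[[Vector, Vector], Vector],            # (policy, state) -> control quantity
    predict_fused: Callable[[Vector, Vector], Vector],  # (state, control) -> fused predicted state
    profit: Callable[[Vector], float],                  # fused predicted state -> profit value
    adjust: Callable[[Vector, float], Vector],          # (policy, total profit) -> adjusted policy
    horizon: int,                                       # number of target periods in the preset time period
    target_total: float,                                # assumed preset condition: total >= target_total
    max_iters: int = 100,
) -> Tuple[Vector, float]:
    """Adjust the policy until the total profit over the horizon meets the condition."""
    for _ in range(max_iters):
        state, total = initial_state, 0.0
        # Roll the simulation forward one target period at a time.
        for _ in range(horizon):
            control = act(policy, state)
            state = predict_fused(state, control)
            total += profit(state)
        if total >= target_total:
            # Preset condition met: output the trained policy.
            return policy, total
        # Preset condition not met: adjust the strategy and predict again.
        policy = adjust(policy, total)
    raise RuntimeError("preset condition not met within max_iters")
```

In a real system, `adjust` would be the update rule of a reinforcement learning algorithm and `predict_fused` would be the fused simulation model; here they are placeholders supplied by the caller.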

Correspondingly, the embodiment of the application also provides an energy-saving control device of the cold source system, which comprises:

the acquisition unit is used for acquiring the current state quantity of the cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises state quantities of a current period and of a preset period before the current period;

the prediction unit is used for predicting the predicted state quantity of the multiple dimensions of the cold source system in the target period according to the current state quantity and the target control strategy, and fusing the predicted state quantity of the multiple dimensions to obtain a fused predicted state quantity;

the calculation unit is used for carrying out profit calculation on the fused prediction state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in a target period;

the determining unit is used for determining the total profit value of the cold source system in a preset time period based on the profit value by adopting a preset control model, wherein the preset time period comprises at least one target period;

the adjusting unit is used for adjusting the target control strategy according to the total profit value when the total profit value does not meet the preset condition, obtaining an adjusted control strategy, and taking the adjusted control strategy as the target control strategy to continue predicting the predicted state quantity of the cold source system in multiple dimensions in the target period;

and the control unit is used for outputting a trained control model for controlling the cold source system when the total profit value meets the preset condition.

Optionally, in some embodiments, the prediction unit includes a determination subunit, a first prediction subunit, a second prediction subunit, and a fusion subunit, as follows:

the determining subunit is used for determining the currently adopted target control quantity according to the current state quantity and the target control strategy;

the first prediction subunit is configured to predict a first predicted state quantity of the cold source system in a target period by using a data driving sub-model, where the data driving sub-model is a data model obtained based on historical data training of the cold source system, and the historical data includes a historical state quantity and a historical control quantity of the cold source system;

the second prediction subunit is configured to predict a second predicted state quantity of the cold source system in a target period by using a mechanism energy consumption sub-model, where the mechanism energy consumption sub-model is a physical model that is built based on reference energy consumption of internal equipment of the cold source system;

and the fusion subunit is used for fusing the first predicted state quantity and the second predicted state quantity to obtain a fused predicted state quantity.

Optionally, in some embodiments, the mechanism energy consumption sub-model includes a chiller energy consumption module, a chilled water pump energy consumption module, and a cooling water pump energy consumption module, the second predicted state quantity includes a second predicted chiller power, a second predicted chilled water pump power, and a second predicted cooling water pump power, and the second prediction subunit includes a first module, a second module, and a third module, as follows:

the first module is used for predicting the second predicted water chilling unit power of the cold source system in the target period by utilizing the water chilling unit energy consumption module;

the second module is used for predicting second predicted chilled water pump power of the cold source system in a target period by utilizing the chilled water pump energy consumption module;

and the third module is used for predicting the second predicted cooling water pump power of the cold source system in the target period by using the cooling water pump energy consumption module.

Optionally, in some embodiments, the current state quantity includes a target cooling water outlet temperature and a target chilled water return temperature, the target control quantity includes a target chilled water outlet temperature and a target chilled water flow, and the first module may be specifically configured to obtain a cold water model parameter of the chiller energy consumption module; and calculating the second predicted water chilling unit power of the cold source system in the target period based on the target cooling water outlet temperature, the target chilled water return temperature, the target chilled water outlet temperature, the target chilled water flow and the cold water model parameters.

Optionally, in some embodiments, the target control quantity includes a target chilled water pump flow, and the second module may be specifically configured to obtain chilled water model parameters of the chilled water pump energy consumption module; and calculating a second predicted chilled water pump power of the cold source system in a target period based on the target chilled water pump flow and the chilled water model parameters.

Optionally, in some embodiments, the target control amount includes a target cooling water pump flow, and the third module may be specifically configured to obtain a cooling model parameter of the cooling water pump energy consumption module; and calculating the second predicted cooling water pump power of the cold source system in the target period based on the target cooling water pump flow and the cooling model parameters.
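As an illustration of how a pump energy consumption module might map a target flow and its model parameters to a predicted power, the sketch below uses a polynomial in flow. The polynomial form is an assumption chosen for the example (it is consistent with the pump affinity laws, under which shaft power grows roughly with the cube of flow); the text above does not specify the functional form, and the function name `pump_power` and the coefficient layout are hypothetical.

```python
from typing import Sequence


def pump_power(flow: float, params: Sequence[float]) -> float:
    """Predicted pump power for a given flow.

    Assumed model: power = params[0] + params[1]*flow + params[2]*flow**2 + ...
    The coefficients stand in for the module's fitted model parameters.
    """
    if flow < 0:
        raise ValueError("flow must be non-negative")
    return sum(coeff * flow ** i for i, coeff in enumerate(params))
```

The same function could serve both the chilled water pump module and the cooling water pump module, each with its own fitted parameters.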

Optionally, in some embodiments, the fusion subunit may specifically be configured to determine a first weight of the first predicted state quantity based on the prediction errors of the data-driven sub-model and the mechanism energy consumption sub-model, and determine a second weight of the second predicted state quantity based on the prediction errors of the data-driven sub-model and the mechanism energy consumption sub-model; and fusing the first predicted state quantity and the second predicted state quantity based on the first weight and the second weight to obtain a fused predicted state quantity.
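One simple way to derive both weights from the two sub-models' prediction errors is inverse-error weighting, where the sub-model with the smaller error receives the larger weight. The sketch below assumes that rule; the exact weighting formula is not given in the paragraph above, and the names `fusion_weights` and `fuse` are hypothetical.

```python
from typing import List, Sequence, Tuple


def fusion_weights(err_data: float, err_mech: float) -> Tuple[float, float]:
    """Return (first weight, second weight) from the two prediction errors.

    Assumed rule: each weight is proportional to the *other* model's error,
    so a more accurate model contributes more. The weights sum to 1.
    """
    if err_data < 0 or err_mech < 0:
        raise ValueError("prediction errors must be non-negative")
    total = err_data + err_mech
    if total == 0:
        return 0.5, 0.5  # both models are exact: weight them equally
    return err_mech / total, err_data / total


def fuse(first: Sequence[float], second: Sequence[float],
         err_data: float, err_mech: float) -> List[float]:
    """Fuse the data-driven and mechanism predictions element by element."""
    if len(first) != len(second):
        raise ValueError("predictions must have the same length")
    w1, w2 = fusion_weights(err_data, err_mech)
    return [w1 * a + w2 * b for a, b in zip(first, second)]
```

For instance, with a data-driven error of 1.0 and a mechanism error of 3.0, the data-driven prediction receives weight 0.75 and the mechanism prediction 0.25.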

Optionally, in some embodiments, the computing unit includes a first computing subunit, a second computing subunit, and a third computing subunit, as follows:

the first calculation subunit is used for carrying out rewarding calculation on the fused predicted state quantity according to a preset rewarding function to obtain a rewarding value of the cold source system in a target period;

the second calculating subunit is configured to perform punishment calculation on the fused predicted state quantity based on a preset constraint condition, so as to obtain a punishment value of the cold source system in a target period;

and the third calculation subunit is used for calculating the benefit value of the cold source system in the target period based on the reward value and the penalty value.

Optionally, in some embodiments, the post-fusion predicted state quantity includes post-fusion predicted chiller power, post-fusion predicted chilled water pump power, and post-fusion predicted cooling water pump power, and the first computing subunit may be specifically configured to determine a reward weight of the post-fusion predicted state quantity; and calculating the rewarding value of the cold source system in the target period based on the fused predicted water chilling unit power, the fused predicted chilled water pump power, the fused predicted cooling water pump power and the rewarding weight.

Optionally, in some embodiments, the second computing subunit may be specifically configured to obtain a load of the data center device, a refrigeration capacity coefficient of the cold source system, and a refrigeration capacity of the cold source system in the target period; and calculating a punishment value of the cold source system in a target period based on the load of the data center equipment, the refrigerating capacity coefficient and the refrigerating capacity.
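The reward, penalty, and profit calculations described above can be sketched as follows. The specific forms are assumptions for illustration: the reward is taken as the negative weighted sum of the three fused predicted powers (lower energy use gives a higher reward), and the penalty is taken as proportional to any shortfall of refrigeration capacity below the refrigeration capacity coefficient times the equipment load. The function names and the `penalty_scale` parameter are hypothetical.

```python
from typing import Sequence


def reward_value(powers: Sequence[float], weights: Sequence[float]) -> float:
    """Negative weighted sum of chiller, chilled water pump, and cooling water pump power."""
    if len(powers) != len(weights):
        raise ValueError("powers and weights must have the same length")
    return -sum(w * p for w, p in zip(weights, powers))


def penalty_value(it_load: float, capacity_coeff: float,
                  cooling_capacity: float, penalty_scale: float = 1.0) -> float:
    """Penalty for violating the assumed constraint capacity >= coeff * load."""
    shortfall = capacity_coeff * it_load - cooling_capacity
    return penalty_scale * max(0.0, shortfall)


def profit_value(powers: Sequence[float], weights: Sequence[float],
                 it_load: float, capacity_coeff: float,
                 cooling_capacity: float, penalty_scale: float = 1.0) -> float:
    """Profit of one target period: reward minus penalty."""
    return (reward_value(powers, weights)
            - penalty_value(it_load, capacity_coeff, cooling_capacity, penalty_scale))
```

With this design the penalty is zero whenever the constraint is satisfied, so the profit value reduces to the energy-based reward in normal operation.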

Optionally, in some embodiments, the target period is a period next to the current period, and the determining unit may be specifically configured to update the target control policy of the preset control model based on the benefit value, to obtain an updated control policy; taking the target time period as the current time period, and taking the updated control strategy as the target control strategy of the preset control model to continuously predict the next time period of the cold source system until obtaining the benefit value of each time period in the preset time period; and determining the total profit value of the cold source system in the preset time period based on the profit value of each time period in the preset time period.

Optionally, in some embodiments, the control unit may be specifically configured to obtain a current state quantity of the cold source system, where the current state quantity includes a current period and a state quantity of a preset period before the current period; determining a system control strategy of the cold source system by utilizing the trained control model based on the current state quantity; and controlling the cold source system according to the system control strategy.

In addition, the embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to execute the steps in the energy-saving control method of any cold source system provided by the embodiment of the application.

In addition, the embodiment of the application also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps in the energy-saving control method of any cold source system provided by the embodiment of the application when executing the program.

According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium, the computer instructions being read from the computer readable storage medium by a processor of a computer device, the computer instructions being executed by the processor such that the computer device performs the method provided in various alternative implementations of the energy saving control aspect of the cold source system described above.

The embodiments can acquire the current state quantity of the cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises state quantities of a current period and of a preset period before the current period; then predict the predicted state quantities of the cold source system in a plurality of dimensions in a target period according to the current state quantity and the target control strategy, and fuse the predicted state quantities of the plurality of dimensions to obtain a fused predicted state quantity; then perform profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in the target period; determine the total profit value of the cold source system in a preset time period based on the profit value by adopting the preset control model, wherein the preset time period comprises at least one target period; when the total profit value does not meet a preset condition, adjust the target control strategy according to the total profit value to obtain an adjusted control strategy, and continue predicting the predicted state quantities of the cold source system in the plurality of dimensions in the target period by taking the adjusted control strategy as the target control strategy; and when the total profit value meets the preset condition, output a trained control model for controlling the cold source system. The scheme can effectively realize energy-saving control of the cold source system and greatly reduce its energy consumption.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1a is a schematic view of a scenario of an energy-saving control method of a cold source system according to an embodiment of the present application;

FIG. 1b is a first flowchart of an energy-saving control method of a cold source system according to an embodiment of the present application;

FIG. 1c is a schematic diagram of a cold source system according to an embodiment of the present application;

FIG. 1d is a second flowchart of an energy-saving control method of a cold source system according to an embodiment of the present application;

FIG. 1e is a schematic diagram of a simulation model of a cold source system according to an embodiment of the present application;

FIG. 2a is a third flowchart of an energy-saving control method of a cold source system according to an embodiment of the present application;

FIG. 2b is a fourth flowchart of an energy-saving control method of a cold source system according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an energy-saving control device of a cold source system according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

The principles of the present application are illustrated as being implemented in a suitable computing environment. In the description that follows, specific embodiments of the application are described with reference to steps and symbols of operations performed by one or more computers, unless otherwise indicated. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by the computer's processing unit of electronic signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the format of the data. However, although the principles of the application are described in the foregoing terms, this is not meant to be limiting, and those skilled in the art will recognize that various steps and operations described below may also be implemented in hardware.

The term "unit" as used herein may be regarded as a software object executing on the computing system. The various components, units, engines, and services described herein may be viewed as implementing objects on the computing system. The apparatus and method may be implemented in software, but may also be implemented in hardware, which is within the scope of the present application.

The terms "first," "second," and "third," etc. in this disclosure are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the list of steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.

The embodiment of the application provides an energy-saving control method and device for a cold source system, electronic equipment and a storage medium. The energy-saving control device of the cold source system can be integrated in electronic equipment, and the electronic equipment can be a server or a terminal and other equipment.

The energy-saving control method of the cold source system provided by the embodiment of the application relates to a machine learning technology in the field of artificial intelligence, and can train a control model and a strategy by utilizing the machine learning of the artificial intelligence, thereby outputting corresponding control quantity according to the working condition of the cold source system, realizing energy-saving control of the cold source system of a data center and realizing efficient control strategy optimization.

Among these, artificial intelligence (AI) is the theory, method, technology, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Artificial intelligence software technology mainly includes directions such as computer vision and machine learning/deep learning.

Among them, machine learning (ML) is a multi-domain interdiscipline that involves probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

For example, as shown in FIG. 1a, the electronic device integrated with the energy-saving control device of the cold source system may first acquire a current state quantity of the cold source system and a target control strategy of a preset control model, where the current state quantity includes state quantities of a current period and of a preset period before the current period; then predict the predicted state quantities of the cold source system in a plurality of dimensions in a target period according to the current state quantity and the target control strategy, and fuse the predicted state quantities of the plurality of dimensions to obtain a fused predicted state quantity; then perform profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in the target period; determine the total profit value of the cold source system in a preset time period based on the profit value by adopting the preset control model, where the preset time period includes at least one target period; when the total profit value does not meet a preset condition, adjust the target control strategy according to the total profit value to obtain an adjusted control strategy, and continue predicting the predicted state quantities of the cold source system in the plurality of dimensions in the target period by taking the adjusted control strategy as the target control strategy; and when the total profit value meets the preset condition, output a trained control model for controlling the cold source system.

The scheme can use the sensor detection data accumulated by the data center to learn and construct a simulation model of the cold source system. Based on the simulation model, a reward function is then designed that takes into account the operating constraints of the system, the energy consumption optimization target, and the like, and a reinforcement learning algorithm is used to optimize the control strategy, finally achieving energy saving in the cold source system. The scheme can model and optimize a data center cold source system and, following the fluctuation of equipment load in the data center machine room and subject to a suitable refrigeration capacity constraint, obtain more energy-efficient control parameters for the cold source system, thereby reducing energy consumption. Combining a data-driven model with a mechanism-driven model improves both accuracy and interpretability. Based on the trained control model and strategy, the corresponding control quantity can be output directly according to the working condition of the cold source system, realizing efficient control strategy optimization. As the system runs, its working-condition characteristics may change; the model and the strategy can be updated iteratively and continuously to achieve real-time dynamic adjustment, giving the scheme good adaptability.

Details are described below. The order in which the following embodiments are described is not intended to limit the preferred order of the embodiments.

The embodiment will be described from the perspective of an energy-saving control device of a cold source system, where the energy-saving control device of the cold source system may be integrated in an electronic device, and the electronic device may be a server or a terminal; the terminal may include a mobile phone, a tablet computer, a notebook computer, a personal computer (Personal Computer, PC), and the like.

An energy-saving control method of a cold source system comprises the following steps: acquiring a current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises state quantities of a current period and of a preset period before the current period; then predicting the predicted state quantities of the cold source system in a plurality of dimensions in a target period according to the current state quantity and the target control strategy, and fusing the predicted state quantities of the plurality of dimensions to obtain a fused predicted state quantity; then performing profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in the target period; determining the total profit value of the cold source system in a preset time period based on the profit value by adopting the preset control model, wherein the preset time period comprises at least one target period; when the total profit value does not meet a preset condition, adjusting the target control strategy according to the total profit value to obtain an adjusted control strategy, and continuing to predict the predicted state quantities of the cold source system in the plurality of dimensions in the target period by taking the adjusted control strategy as the target control strategy; and when the total profit value meets the preset condition, outputting a trained control model for controlling the cold source system.

As shown in FIG. 1b, the specific flow of the energy-saving control method of the cold source system may be as follows:

101. Acquire a current state quantity of the cold source system and a target control strategy of a preset control model, where the current state quantity includes state quantities of a current period and of a preset period before the current period.

The cold source system can be a system that dissipates heat from and cools equipment, such as servers, in a data center machine room. For a data center, the cold source system serves as the cold source of the machine room at the terminal end and greatly affects the energy efficiency of the whole data center. It is therefore necessary to study a model of the cold source system and to optimize its control with a suitable optimization method, so as to save energy in the data center.

For example, the data center cold source system can be composed of a water chilling unit, a chilled water pump, a cooling water pump, a cooling tower, and other devices. The basic structure of the cold source system is shown in FIG. 1c and includes a water chilling unit, a chilled water circulation pump, a cooling tower, and the like. Three cycles are formed among the devices: a chilled water cycle, a refrigerant cycle, and a cooling water cycle, which are coupled to and affect one another. To save energy in the cold source system, control quantities such as the temperature and flow of the chilled water and the cooling water can be optimized to reduce the energy consumption of the water chilling unit, the chilled water pump, and the cooling water pump. The cold source system can include a plurality of water chilling units, a plurality of chilled water pumps, and a plurality of cooling water pumps. The water chilling units need not correspond one-to-one to the chilled water pumps and cooling water pumps; that is, one water chilling unit does not necessarily correspond to one chilled water pump and one cooling water pump, and the configuration can be set according to actual requirements.

Because a control quantity of the cold source system influences the system over several periods, the system states and the control quantities adopted in the current period and in several preceding periods can be used together to study the cold source system. This improves control accuracy and yields a better optimized, more efficient and more energy-saving control strategy. Therefore, the current state quantity may include the state quantities of the current period and of a preset period before the current period.

A large number of sensors are generally deployed in the cold source system of a data center, and their data are partly redundant; much of the measured data is irrelevant to the energy-saving optimization of the cold source system. A set of key features can therefore be selected. For example, the data can be divided into three categories according to the unit they belong to: chiller state quantities, water pump state quantities, and environment and external variables. These are subdivided into 11 items. For example, the state quantities of the cold source system considered may be as shown in Table 1 below:

Table 1: State quantities considered in the study

The main control quantities in the cold source system of a data center fall roughly into two groups: temperature and flow. For example, the temperature group may include the chilled water outlet temperature of the chiller and the cooling water return temperature; the flow group may include the chilled water flow and the cooling water flow of the chiller. For example, the control quantities of the cold source system considered may be as shown in Table 2 below:

Table 2: Control quantities considered in the study

The control strategy may refer to a strategy and a method for controlling the cold source system. For example, a control model, such as a preset control model, may be preset, and then a reinforcement learning algorithm may be used to optimize the control strategy of the preset control model, so as to find an optimal control strategy, and further control the cold source system, so as to implement optimization of the cold source system.

For example, the control quantity currently adopted may be determined from the current state quantity of the cold source system and the target control strategy.

The preset period may be set in various manners, for example, may be flexibly set according to the actual application requirement, or may be preset and stored in the electronic device. In addition, the preset period may be built in the electronic device, or may be stored in a memory and transmitted to the electronic device, or the like. The preset period may refer to a preset number of periods, for example, the preset period may be 5 periods, that is, the preset period before the current period may be 5 periods before the current period, and so on.

102. Predicting the predicted state quantity of a plurality of dimensions of the cold source system in a target period according to the current state quantity and a target control strategy, and fusing the predicted state quantity of the plurality of dimensions to obtain a fused predicted state quantity.

Modeling of the data center may employ a data-driven approach, i.e., learning neural network parameters from historical data. For example, a deep neural network can be trained directly on historical data to build a purely data-driven model. A linear model may also be considered, but it is not supported by a physical mechanism and lacks interpretability. Generally speaking, a data-driven model captures the numerical relation between input and output quantities, so it predicts small-range fluctuations accurately but is hard to interpret. A mechanism model describes the relation between energy consumption and the related control variables well, is highly interpretable, and predicts large trends accurately. Therefore, to improve the accuracy of cold source system prediction, the data-driven and mechanism approaches are combined, that is, a purely data-based model is combined with a mechanism model that has a physical basis.

The data-driven method fits historical data with an artificial neural network; it captures the associations between features well, but overfitting occurs easily and generalization is poor. In the mechanism-driven method, equipment such as the chiller and the chilled/cooling water pumps has reference physical energy consumption models whose parameters can be fitted with historical data; these models reflect the relation between energy consumption and the key features well and generalize strongly, but respond relatively slowly to changes in secondary features. Fusing the two combines their advantages and makes the model more accurate.

To improve the efficiency of energy-saving control of the cold source system, a simulation model may be constructed first. For example, as shown in fig. 1d, historical data may be preprocessed and key features selected. The cold source system model is then built from the processed historical data, giving a simulation model that can simulate the running state of the real cold source system (that is, given the current system state and control quantity as input, the simulation model is expected to return the corresponding new state). The simulation model predicts the state quantities of the cold source system in multiple dimensions in the target period and fuses these predicted state quantities. For example, the simulation model may be a fusion of a data-driven model and a mechanism energy consumption model, i.e., the simulation model may include the data-driven sub-model, the mechanism energy consumption sub-model and the fusion sub-model (which fuses the outputs of the data-driven sub-model and the mechanism energy consumption sub-model) of the preset control model. For example, as shown in fig. 1e, the state quantities and control quantities of the current and previous steps (i.e., the current period and the preset period before the current period) are input into the model, and the model is expected to output the state quantity of the system for the next period. Based on the constructed simulation model, and considering the operating constraints and the energy consumption optimization target of the system, a reward function is designed, the control strategy of the cold source system is optimized with a reinforcement learning method, and the model and the strategy are verified and tested accordingly. This improves the accuracy of the preset control model and further optimizes the energy saving of the data center cold source system.

For example, the preset control model may include a data-driven sub-model and a mechanism energy consumption sub-model. Specifically, the target control quantity currently adopted may be determined according to the current state quantity and the target control strategy. A first predicted state quantity of the cold source system in the target period is predicted with the data-driven sub-model, which is a data model trained on historical data of the cold source system, the historical data including historical state quantities and historical control quantities of the cold source system. A second predicted state quantity of the cold source system in the target period is predicted with the mechanism energy consumption sub-model, which is a physical model established from the reference energy consumption of the internal equipment of the cold source system. The first predicted state quantity and the second predicted state quantity are then fused to obtain a fused predicted state quantity. The first predicted state quantity may represent the relationship between the current state quantity of the cold source system and the state quantity of the target period, and the second predicted state quantity may represent the relationship between the energy consumption of the cold source system and the current state quantity.

For example, the preset control model may include a fusion sub-model, and specifically, the fusion sub-model may be used to fuse the first predicted state quantity and the second predicted state quantity to obtain a fused predicted state quantity.

The data-driven sub-model may fit a state transition function from historical data using a machine learning method, e.g., an artificial neural network (a recurrent neural network or one of its variants), a regression tree (XGBoost), and the like.

For example, denote the system state in period t as $S_t$ (the states considered are shown in Table 1) and the control quantity adopted as $a_t$ (the control quantities considered are shown in Table 2). Taking the multi-stage influence of the control quantity into account, the system states and the adopted control quantities of the current period and several preceding periods are taken as inputs, and the system state of the next period is output, namely:

$$S_{t+1} = f(S_{t-L+1}, a_{t-L+1}, \ldots, S_t, a_t)$$

where L is the window length considered (i.e., the preset period), that is, the number of periods that may affect the next state quantity; this value may be determined according to the specific characteristics of the system. The data-driven sub-model predicts a first predicted state quantity of the cold source system in the target period, which comprises a first predicted chiller power, a first predicted chilled water pump power and a first predicted cooling water pump power. That is, for the next period, the first predicted chiller power is $P'_{ch}$, the first predicted chilled water pump power is $P'_{chp}$, and the first predicted cooling water pump power is $P'_{cp}$.
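The windowed input of the state transition function can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper name `make_window_input`, the feature dimensions and the oldest-first ordering are assumptions chosen for the example.

```python
import numpy as np

def make_window_input(states, controls, window):
    """Concatenate the last `window` (state, control) pairs, oldest first.

    Mirrors the argument list (S_{t-L+1}, a_{t-L+1}, ..., S_t, a_t) of f.
    """
    if len(states) < window or len(controls) < window:
        raise ValueError("need at least `window` periods of history")
    pairs = [
        np.concatenate([s, a])
        for s, a in zip(states[-window:], controls[-window:])
    ]
    return np.concatenate(pairs)

# Hypothetical history: 3 state features and 2 control features per period.
states = [np.array([1.0, 2.0, 3.0]),
          np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0])]
controls = [np.array([0.1, 0.2]),
            np.array([0.3, 0.4]),
            np.array([0.5, 0.6])]

# Window length L = 2: only the two most recent periods are used.
x = make_window_input(states, controls, window=2)
```

The resulting vector `x` would be the input to whichever learned model plays the role of f.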

The mechanism energy consumption sub-model may use historical data to fit the parameters of the physical energy consumption models of the equipment in the cold source system. For example, the chiller, the chilled water pump and the cooling water pump have relatively clear and simple physical mechanism models, and using these mechanism models as an aid can improve prediction accuracy.

For example, the mechanism energy consumption sub-model includes a chiller energy consumption module, a chilled water pump energy consumption module and a cooling water pump energy consumption module, and the second predicted state quantity includes a second predicted chiller power, a second predicted chilled water pump power and a second predicted cooling water pump power, and specifically, the chiller energy consumption module may be used to predict the second predicted chiller power of the cold source system in the target period; predicting second predicted chilled water pump power of the cold source system in a target period by utilizing a chilled water pump energy consumption module; and predicting the second predicted cooling water pump power of the cold source system in the target period by using a cooling water pump energy consumption module.

The energy consumption of the water chilling unit is related to factors such as condensation temperature, evaporation temperature, chilled water flow and the like. The condensing temperature in the water chilling unit can be represented by the outlet water temperature of cooling water, and the evaporating temperature can be represented by the outlet water temperature of chilled water. Therefore, a water chiller energy consumption model (i.e., a water chiller energy consumption module) can be provided with respect to the cooling water outlet temperature, the chilled water outlet temperature, and the unit load. For example, the current state quantity comprises a target cooling water outlet temperature and a target chilled water return temperature, the target control quantity comprises a target chilled water outlet temperature and a target chilled water flow, and specifically, cold water model parameters of the energy consumption module of the water chiller can be obtained; and calculating the second predicted water chilling unit power of the cold source system in the target period based on the target cooling water outlet temperature, the target chilled water return temperature, the target chilled water outlet temperature, the target chilled water flow and the cold water model parameters.

For example, the specific expression of the energy consumption module of the water chiller may be as follows:

where $Q_{ch} = (T_{chwr} - T_{chws}) \cdot m_{chw}$, and $\alpha_i$, $i = 0, 1, \ldots, 5$, are the parameters to be determined of the chiller energy consumption model. $P''_{ch}$ is the second predicted chiller power, i.e., the second predicted chiller input power, $T_{cws}$ is the cooling water outlet temperature, $T_{chws}$ is the chilled water outlet temperature, $T_{chwr}$ is the chilled water return temperature, and $m_{chw}$ is the chilled water flow.

The power of the chilled water pump is related to factors such as its flow, head (lift) and efficiency. Since the head and the efficiency are generally constant, the energy consumption model of the chilled water pump can be established as a function of the flow. For example, the target control quantity includes a target chilled water pump flow. Specifically, the freezing model parameters of the chilled water pump energy consumption module may be obtained, and the second predicted chilled water pump power of the cold source system in the target period is calculated from the target chilled water pump flow and the freezing model parameters.

For example, the specific expression of the chilled water pump energy consumption module may be as follows:

where $f_i$, $i = 0, 1, 2$, are the parameters to be determined of the chilled water pump energy consumption model. $P''_{chp}$ is the second predicted chilled water pump power, i.e., the second predicted chilled water pump input power, and $m_{chp}$ is the chilled water pump flow. Here $m_{chw}$ and $m_{chp}$ denote the same quantity: the chilled water flow of the chiller is the chilled water pump flow.

The power of the cooling water pump is related to factors such as its flow, head (lift) and efficiency. Since the head and the efficiency are generally constant, the energy consumption model of the cooling water pump can be established as a function of the flow. For example, the target control quantity includes a target cooling water pump flow. Specifically, the cooling model parameters of the cooling water pump energy consumption module may be obtained, and the second predicted cooling water pump power of the cold source system in the target period is calculated from the target cooling water pump flow and the cooling model parameters.

For example, the specific expression of the cooling water pump energy consumption module may be as follows:

where $g_i$, $i = 0, 1, 2$, are the parameters to be determined of the cooling water pump energy consumption model. $P''_{cp}$ is the second predicted cooling water pump power, i.e., the second predicted cooling water pump input power, and $m_{cp}$ is the cooling water pump flow.

Based on the above mechanism model, the model parameters to be determined can be fitted using historical data.
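As an illustration of fitting such parameters, the following sketch fits a three-parameter pump model by ordinary least squares. The exact expression of the pump model is not reproduced in this text, so the quadratic form power = f0 + f1*flow + f2*flow^2 is an assumption made only for this example, as are the function names and the synthetic data.

```python
import numpy as np

def fit_pump_model(flow, power):
    """Least-squares fit of power = f0 + f1*flow + f2*flow**2 (ASSUMED form)."""
    flow = np.asarray(flow, dtype=float)
    power = np.asarray(power, dtype=float)
    # Design matrix with columns [1, flow, flow^2].
    design = np.column_stack([np.ones_like(flow), flow, flow ** 2])
    coeffs, *_ = np.linalg.lstsq(design, power, rcond=None)
    return coeffs

def predict_pump_power(coeffs, flow):
    """Evaluate the assumed quadratic model at a given flow."""
    f0, f1, f2 = coeffs
    return f0 + f1 * flow + f2 * flow ** 2

# Synthetic "historical" data generated from known coefficients,
# used here only to show that the fit recovers them.
flow_hist = np.linspace(50.0, 150.0, 20)
power_hist = 2.0 + 0.05 * flow_hist + 0.001 * flow_hist ** 2

coeffs = fit_pump_model(flow_hist, power_hist)
predicted = predict_pump_power(coeffs, 100.0)
```

With real plant data the measurements would be noisy, and the same call would return the coefficients that minimize the squared prediction error.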

Because all the devices of the data center cold source system are interrelated, the state quantities and the control quantities are mutually coupled, and the actual energy consumption depends strongly on the operating conditions. Generally, the data-driven model describes the relation between input and output quantities well and predicts small-range fluctuations accurately; the mechanism model describes the relation between energy consumption and the related variables well, predicts large trends more accurately, and is highly interpretable. Therefore, to improve the precision of the simulation model, a bagging operation is performed on the final energy consumption output: the outputs of the data-driven sub-model and the mechanism energy consumption sub-model are weighted and combined to obtain the final predicted chiller input power, chilled water pump input power and cooling water pump input power for the next period, namely the fused predicted chiller power, the fused predicted chilled water pump power and the fused predicted cooling water pump power. For example, a first weight of the first predicted state quantity and a second weight of the second predicted state quantity may each be determined based on the prediction errors of the data-driven sub-model and the mechanism energy consumption sub-model; the first predicted state quantity and the second predicted state quantity are then fused based on the first weight and the second weight to obtain the fused predicted state quantity. For example, a specific fusion mode may be as follows:

$$P_{ch} = \theta_1 P''_{ch} + (1 - \theta_1) P'_{ch}$$

$$P_{chp} = \theta_2 P''_{chp} + (1 - \theta_2) P'_{chp}$$

$$P_{cp} = \theta_3 P''_{cp} + (1 - \theta_3) P'_{cp}$$

where $P'_{ch}$ is the first predicted chiller power, $P'_{chp}$ is the first predicted chilled water pump power, and $P'_{cp}$ is the first predicted cooling water pump power; $P''_{ch}$ is the second predicted chiller power, $P''_{chp}$ is the second predicted chilled water pump power, and $P''_{cp}$ is the second predicted cooling water pump power.

The determination of the weight factor θ may depend on the prediction errors of two models, and a model with a smaller prediction error will be given a higher weight, specifically determined in the following manner:

where $loss_{ch'}$ and $loss_{ch''}$ are the prediction errors of the chiller power in the data-driven sub-model and the mechanism energy consumption sub-model respectively, $loss_{chp'}$ and $loss_{chp''}$ are the prediction errors of the chilled water pump power in the data-driven sub-model and the mechanism energy consumption sub-model respectively, and $loss_{cp'}$ and $loss_{cp''}$ are the prediction errors of the cooling water pump power in the data-driven sub-model and the mechanism energy consumption sub-model respectively.
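The weighted fusion can be sketched as below. The formula for the weight factor is not reproduced in this text, so the rule used here, theta = loss' / (loss' + loss''), is only one assumed way to give the model with the smaller prediction error the higher weight; the function names and numbers are hypothetical.

```python
def mechanism_weight(loss_data, loss_mech):
    """ASSUMED rule: weight of the mechanism prediction P''.

    The mechanism weight grows as the data-driven error grows, so the
    model with the smaller error receives the larger weight.
    """
    total = loss_data + loss_mech
    if total == 0:
        return 0.5  # both models are exact; split evenly
    return loss_data / total

def fuse(p_data, p_mech, theta):
    """P = theta * P'' + (1 - theta) * P', as in the fusion equations."""
    return theta * p_mech + (1.0 - theta) * p_data

# Hypothetical validation errors: the mechanism model is more accurate here.
theta_ch = mechanism_weight(loss_data=4.0, loss_mech=1.0)

# Hypothetical chiller power predictions (same unit for both models).
p_ch = fuse(p_data=100.0, p_mech=110.0, theta=theta_ch)
```

In this example the mechanism prediction receives weight 0.8, so the fused value lies closer to 110 than to 100.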

The water chilling unit power mentioned in the embodiment may refer to water chilling unit input power, the chilled water pump power may refer to chilled water pump input power, and the cooling water pump power may refer to cooling water pump input power. For example, the first predicted chiller power may refer to a first predicted chiller input power, the first predicted chilled water pump power may refer to a first predicted chilled water pump input power, and so on.

The bagging algorithm (bootstrap aggregating) is an ensemble learning algorithm in the field of machine learning. Bagging is a technique that reduces generalization error by combining several models. The main idea is to train several different models separately and then let all models vote on the output for a test sample. This is an example of a general strategy in machine learning known as model averaging. Techniques that employ this strategy are known as ensemble methods.
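For reference, generic bootstrap aggregating for a regression task can be sketched as follows, with averaging taking the place of voting. This illustrates the general technique only; it resamples one data set and fits simple linear models, whereas the fusion described above weights two different sub-models. The function name and the synthetic data are assumptions.

```python
import numpy as np

def bagged_linear_predict(x, y, x_new, n_models=25, seed=0):
    """Average the predictions of linear fits trained on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: draw len(x) indices with replacement.
        idx = rng.integers(0, len(x), size=len(x))
        slope, intercept = np.polyfit(x[idx], y[idx], deg=1)
        predictions.append(slope * x_new + intercept)
    return float(np.mean(predictions))

# Synthetic noisy data around y = 3x + 1, for illustration only.
data_rng = np.random.default_rng(42)
x_train = np.linspace(0.0, 10.0, 30)
y_train = 3.0 * x_train + 1.0 + data_rng.normal(0.0, 0.2, size=x_train.shape)

y_hat = bagged_linear_predict(x_train, y_train, x_new=4.0)
```

The averaged prediction should land near the noise-free value 13 at x = 4.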

103. And carrying out profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in a target period.

For example, specifically, the reward calculation may be performed on the predicted state quantity after fusion according to a preset reward function, so as to obtain a reward value of the cold source system in the target period; based on preset constraint conditions, penalty calculation is carried out on the fused prediction state quantity, and a penalty value of the cold source system in a target period is obtained; and calculating the benefit value of the cold source system in the target period based on the reward value and the penalty value.

The preset constraint conditions can comprise preset cooling water backwater temperature constraint, preset refrigerating capacity constraint, preset operation condition constraint and the like.

For example, based on the constructed simulation model, and considering the operating constraints and the energy consumption optimization target of the system, a reward function is designed, and the control strategy of the cold source system can be optimized with a reinforcement learning method and verified accordingly. The basic mathematical model of reinforcement learning is the Markov decision process (MDP), which is generally described by a five-tuple $M = \langle S, A, P, R, \gamma \rangle$, where S is the state space of the system (the space formed by the state quantities of the system), A is the action space (the space formed by all possible values of the control quantity), P is the state transition probability of the system (here characterized by the function f(·)), R is the reward function, and γ is the discount factor. The Markov decision process model of the energy-saving optimization problem of the data center cold source system is introduced below. The state space, the action space and the state transition probability have been described in step 102; the system constraints and the reward function are described here.

(1) Cooling water backwater temperature constraint:

In actual system operation, the cooling water return temperature $T_{cwr}$ should not be lower than the wet-bulb temperature $T_{sq}$, namely:

$$T_{cwr} \ge T_{sq}$$

When the cooling water return temperature given by the control strategy is lower than the wet-bulb temperature, it is forcibly updated to the wet-bulb temperature:

$$T_{cwr} = \max(T_{sq}, T_{cwr})$$

(2) Refrigeration capacity constraint:

The refrigerating capacity of the cold source system must be sufficient to keep the temperature in the data center machine room within a reasonable range and avoid overheating. Generally, the higher the load of the IT equipment (for example, when the cold source system is that of a data center), the more heat the IT equipment generates and the more cooling is required. The refrigerating capacity of the cold source system is given by the following formula:

$$C = c\,(T_{chwr} - T_{chws})\,m_{ch}$$

where c is the specific heat capacity of water, $T_{chwr}$ and $T_{chws}$ are the chilled water return temperature and outlet temperature respectively, and $m_{ch}$ is the chilled water flow. The refrigerating capacity should satisfy the following constraint:

$$C \ge \delta L_{IT}$$

where δ is the refrigerating capacity coefficient and $L_{IT}$ is the IT equipment load.
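A short numerical sketch of the refrigerating capacity formula and its constraint follows. The units (kJ/(kg·K), °C, kg/s, kW) and all numerical values are assumptions for illustration; the text does not specify them.

```python
# Approximate specific heat capacity of liquid water, in kJ/(kg*K).
SPECIFIC_HEAT_WATER = 4.186

def cooling_capacity(t_return, t_supply, mass_flow, c=SPECIFIC_HEAT_WATER):
    """C = c * (T_chwr - T_chws) * m_ch.

    With temperatures in degrees Celsius and flow in kg/s (assumed units),
    the result is in kW.
    """
    return c * (t_return - t_supply) * mass_flow

def capacity_ok(capacity, it_load, delta):
    """Check the constraint C >= delta * L_IT."""
    return capacity >= delta * it_load

# Hypothetical operating point: 17 C return, 12 C supply, 50 kg/s.
C = cooling_capacity(t_return=17.0, t_supply=12.0, mass_flow=50.0)

enough_for_900 = capacity_ok(C, it_load=900.0, delta=1.1)
enough_for_1000 = capacity_ok(C, it_load=1000.0, delta=1.1)
```

Under these assumed numbers the capacity is about 1046.5 kW, which covers a 900 kW IT load with coefficient 1.1 but not a 1000 kW load.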

Since the refrigerating capacity depends on several control quantities, it is difficult to impose a hard constraint directly on each control quantity. The refrigerating capacity constraint is therefore embodied in the reward function as a soft constraint: a penalty is given when the refrigerating capacity does not satisfy the inequality above, as detailed below.

(3) Operating condition constraints:

The cold source system of a data center must be operated safely, and overly aggressive control should be avoided. For this purpose, the values of all control quantities can be constrained to the range that has occurred historically, which ensures that all controls fluctuate within a safe range:

$$a^{l} \le a_t \le a^{u}$$

where $a^{l}$ and $a^{u}$ are the historical lower and upper limits of the control quantity a, respectively.

In addition, when the strategy is optimized with a reinforcement learning algorithm, the control quantities are explored, and a given combination of control quantities may never have occurred historically, so the system state may exceed its historical thresholds. Because the generalization ability of the simulation model is limited, its predicted state values may be inaccurate when the system state exceeds a historical threshold. The state of the system is therefore constrained to a certain range around the historical thresholds, namely:

$$\tau s^{l} \le s_t \le \tau s^{u}$$

where $s^{l}$ and $s^{u}$ are the historical lower and upper limits of the state quantity s, respectively, and τ is a threshold coefficient. When the system state exceeds the threshold, training is terminated and a larger penalty is given.
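The hard constraints above (the wet-bulb clamp, the control bounds and the state range check) can be sketched as follows. The function names and numbers are hypothetical, and the state check mirrors the inequality exactly as written in the text, with the same coefficient τ on both bounds.

```python
import numpy as np

def enforce_wet_bulb(t_cwr, t_wet_bulb):
    """T_cwr = max(T_sq, T_cwr): never go below the wet-bulb temperature."""
    return max(t_wet_bulb, t_cwr)

def clip_controls(a, a_low, a_high):
    """Keep every control quantity within its historical range [a^l, a^u]."""
    return np.clip(a, a_low, a_high)

def state_in_range(s, s_low, s_high, tau):
    """Check tau * s^l <= s_t <= tau * s^u element-wise, as written in the text."""
    s = np.asarray(s, dtype=float)
    lower_ok = np.all(tau * np.asarray(s_low, dtype=float) <= s)
    upper_ok = np.all(s <= tau * np.asarray(s_high, dtype=float))
    return bool(lower_ok and upper_ok)

# Hypothetical values for illustration.
t_fixed = enforce_wet_bulb(t_cwr=18.0, t_wet_bulb=20.0)
a_safe = clip_controls(np.array([5.0, 30.0]),
                       a_low=np.array([7.0, 10.0]),
                       a_high=np.array([12.0, 25.0]))
inside = state_in_range([22.0], s_low=[10.0], s_high=[20.0], tau=1.2)
outside = state_in_range([11.0], s_low=[10.0], s_high=[20.0], tau=1.2)
```

A training loop would typically apply the first two functions before calling the simulation model and use the third to decide whether to stop and apply the larger penalty.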

Then, a reward value of the cold source system in a target period can be calculated, for example, the fused predicted state quantity comprises fused predicted water chilling unit power, fused predicted chilled water pump power and fused predicted cooling water pump power, and a reward weight of the fused predicted state quantity is determined; and calculating the rewarding value of the cold source system in the target period based on the fused predicted water chilling unit power, the fused predicted chilled water pump power, the fused predicted cooling water pump power and the rewarding weight.

For example, to achieve energy saving, the reward function may be designed with the objective of reducing the energy consumption of the cold source system, as follows:

where the three power terms are, respectively, the fused predicted chiller power, the fused predicted chilled water pump power and the fused predicted cooling water pump power in period t, R is a suitable positive constant, and α is a suitable positive weight.

Then, a punishment value of the cold source system in a target period can be calculated, for example, the load of data center equipment, the refrigerating capacity coefficient of the cold source system and the refrigerating capacity of the cold source system in the target period can be obtained; and calculating a punishment value of the cold source system in a target period based on the load of the data center equipment, the refrigerating capacity coefficient and the refrigerating capacity.

For example, the refrigeration capacity constraint may be embodied as a penalty function, which is as follows:

$$r_t = \beta \max(\delta L_{IT} - C_t,\ 0)$$

where β is a suitable positive weight and $C_t$ is the refrigerating capacity in period t. Combining the reward function and the penalty function gives the single-step benefit value $R_t - r_t$.
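The single-step benefit can be sketched as below. The penalty follows the formula given above. The reward expression is not reproduced in this text, so the form R minus α times the total fused power is an assumption consistent with the description of R as a positive constant and α as a positive weight. All names and numbers are hypothetical.

```python
def reward(p_ch, p_chp, p_cp, r_const, alpha):
    """ASSUMED form: a positive constant minus the weighted total power."""
    return r_const - alpha * (p_ch + p_chp + p_cp)

def penalty(capacity, it_load, delta, beta):
    """r_t = beta * max(delta * L_IT - C_t, 0), as given in the text."""
    return beta * max(delta * it_load - capacity, 0.0)

def step_benefit(p_ch, p_chp, p_cp, capacity, it_load,
                 r_const=1000.0, alpha=1.0, delta=1.1, beta=2.0):
    """Single-step benefit R_t - r_t."""
    return (reward(p_ch, p_chp, p_cp, r_const, alpha)
            - penalty(capacity, it_load, delta, beta))

# Hypothetical period: 375 kW total power, capacity 50 kW short of 1.1 * load.
benefit_short = step_benefit(300.0, 40.0, 35.0, capacity=1050.0, it_load=1000.0)

# Same powers with sufficient capacity: no penalty applies.
benefit_ok = step_benefit(300.0, 40.0, 35.0, capacity=1200.0, it_load=1000.0)
```

The shortfall case loses 100 units of benefit relative to the unconstrained case, which is how the soft constraint steers the strategy toward sufficient refrigerating capacity.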

104. And determining the total profit value of the cold source system in a preset time period based on the profit value by adopting a preset control model, wherein the preset time period comprises at least one target time period.

The target period may be the period following the current period. For example, the target control strategy of the preset control model may be updated based on the benefit value to obtain an updated control strategy. The target period is then taken as the current period and the updated control strategy as the target control strategy of the preset control model, and prediction continues for the next period of the cold source system until the benefit value of each period in the preset time period has been obtained. The total benefit value of the cold source system in the preset time period is determined from the benefit values of the individual periods in the preset time period.

For example, the energy-saving optimization problem of the cold source system is converted into an optimization problem that maximizes the total benefit while satisfying the preset constraint conditions, which may be expressed as follows:

where d is the control strategy and γ is the discount factor, which may take a value from 0 to 1 and represents the impact of future benefits on the current decision. The total benefit value may refer to the accumulated value of the single-step benefits.
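Accumulating the single-step benefits with a discount factor can be sketched as follows. The objective formula is not reproduced in this text, so the standard discounted sum over periods, with weight γ^t on period t, is assumed here.

```python
def total_benefit(step_benefits, gamma):
    """Discounted sum of single-step benefits (ASSUMED standard form).

    step_benefits[t] is the single-step benefit R_t - r_t of period t.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is expected to lie between 0 and 1")
    total = 0.0
    for t, benefit in enumerate(step_benefits):
        total += (gamma ** t) * benefit
    return total

# Three hypothetical periods with equal single-step benefit.
G = total_benefit([10.0, 10.0, 10.0], gamma=0.9)
```

Here the total is 10 + 9 + 8.1 = 27.1; a smaller γ makes later periods count less.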

105. And when the total benefit value does not meet a preset condition, adjusting the target control strategy according to the total benefit to obtain an adjusted control strategy, and continuously predicting the predicted state quantity of the cold source system in multiple dimensions in a target period by taking the adjusted control strategy as the target control strategy.

The preset conditions may be set in various manners, for example, may be flexibly set according to the actual application requirement, or may be preset and stored in the electronic device. In addition, the preset condition may be built in the electronic device, or may be stored in a memory and transmitted to the electronic device, or the like. For example, the preset condition may be a preset reinforcement learning round number M, that is, a preset learning round number, and so on.

For example, when the total benefit value does not meet the preset condition, that is, when reinforcement learning has not yet reached the preset number of learning rounds M, the target control strategy is adjusted according to the total benefit to obtain an adjusted control strategy. The adjusted control strategy is taken as the target control strategy, and prediction of the predicted state quantities of the cold source system in multiple dimensions in the target period continues until reinforcement learning reaches the preset number of learning rounds.
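The overall loop of evaluating a strategy, adjusting it, and stopping after M rounds can be sketched on a toy problem. Everything here is an assumption for illustration: the strategy is a single flow setpoint, `evaluate_policy` is a made-up stand-in for the simulation model and reward, and the update is a simple accept-if-better random search rather than any particular reinforcement learning algorithm.

```python
import random

def evaluate_policy(flow, horizon=5, gamma=0.9):
    """Toy stand-in for simulation plus reward; NOT the model described above."""
    power = 0.002 * flow ** 2                      # hypothetical energy curve
    capacity = 4.186 * 5.0 * flow                  # c * delta_T * flow
    shortfall = max(1.1 * 1000.0 - capacity, 0.0)  # delta * L_IT - C
    step = 1000.0 - power - 2.0 * shortfall        # single-step benefit
    return sum((gamma ** t) * step for t in range(horizon))

def train(initial_flow, rounds, step_size=5.0, seed=0):
    """Adjust the strategy for a preset number of rounds M (`rounds`)."""
    rng = random.Random(seed)
    best_flow = initial_flow
    best_value = evaluate_policy(best_flow)
    for _ in range(rounds):
        # Propose a nearby strategy and keep it inside an assumed safe range.
        candidate = best_flow + rng.uniform(-step_size, step_size)
        candidate = min(max(candidate, 20.0), 150.0)
        value = evaluate_policy(candidate)
        if value > best_value:   # keep the adjustment only if it helps
            best_flow, best_value = candidate, value
    return best_flow, best_value

flow, value = train(initial_flow=120.0, rounds=200)
```

In this toy setting the search lowers the flow to cut power while the capacity penalty keeps it from dropping below what the load requires.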

106. And when the total income value meets the preset condition, outputting a trained control model for controlling the cold source system.

For example, the trained control model may be obtained when the total benefit value satisfies the preset condition, that is, when reinforcement learning has reached the preset number of learning rounds M. Then, the current state quantity of the cold source system is acquired, the current state quantity comprising the state quantities of the current period and of a preset period before the current period; a system control strategy of the cold source system is determined with the trained control model based on the current state quantity; and the cold source system is controlled according to the system control strategy. For example, the trained control model may be used to determine the system control strategy of the cold source system, the system control quantity currently adopted by the cold source system is determined from the current state quantity and the system control strategy, and the cold source system is controlled based on the system control quantity.

It should be noted that an appropriate reinforcement learning algorithm may be selected according to the actual situation, including but not limited to Q-learning, DQN, Actor-Critic, DDPG, TRPO, PPO and SAC.

Numerical experiments verify that the method provided by this scheme can achieve an energy-saving ratio of about 1% to 5% for the cold source system of a data center, and that the adjustment direction of the control strategy, as summarized from the learned strategy, is consistent with expert experience.

In order to improve the security of the energy-saving control of the cold source system, the data in the method is stored in a blockchain. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic means, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.

The blockchain underlying platform may include processing modules for user management, basic services, smart contracts, and operation monitoring. The user management module is responsible for identity information management of all blockchain participants, including maintaining public and private key generation (account management), key management, and the correspondence between a user's real identity and blockchain address (authority management), and, where authorized, supervising and auditing the transactions of certain real identities and providing rule configuration for risk control (risk control audit). The basic service module is deployed on all blockchain node devices and is used to verify the validity of service requests and to record valid requests to storage after consensus is reached on them; for a new service request, the basic service first performs interface adaptation, parsing, and authentication (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits the encrypted information completely and consistently to the shared ledger (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution; a developer can define contract logic in a programming language and publish it to the blockchain (contract registration), the logic is executed when triggered by a key call or another event according to the contract terms, and the module also provides functions for upgrading and cancelling contracts. The operation monitoring module is mainly responsible for deployment during product release, configuration modification, contract setting, cloud adaptation, and visual output of real-time status during product operation, for example: alarms, monitoring of network conditions, and monitoring of node device health status.

The platform product service layer provides basic capabilities and implementation frameworks of typical applications, and developers can complete the blockchain implementation of business logic based on the basic capabilities and the characteristics of the superposition business. The application service layer provides the application service based on the block chain scheme to the business participants for use.

As can be seen from the foregoing, the present embodiment may obtain a current state quantity of the cold source system and a target control policy of a preset control model, where the current state quantity includes a current period and a state quantity of a preset period before the current period; then, predicting the predicted state quantity of a plurality of dimensions of the cold source system in a target period according to the current state quantity and a target control strategy, and fusing the predicted state quantity of the plurality of dimensions to obtain a fused predicted state quantity; then, carrying out profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in a target period; determining the total profit value of the cold source system in a preset time period based on the profit value by adopting a preset control model, wherein the preset time period comprises at least one target time period; when the total benefit value does not meet a preset condition, adjusting the target control strategy according to the total benefit to obtain an adjusted control strategy, and continuously predicting the predicted state quantity of the cold source system in multiple dimensions in a target period by taking the adjusted control strategy as the target control strategy; and when the total income value meets the preset condition, outputting a trained control model for controlling the cold source system. The scheme can consider the energy conservation of the data center cold source system, and the energy conservation optimization method of the data center cold source system based on reinforcement learning is provided from the control of the temperature and flow of chilled water and cooling water of a water chilling unit and a cooling tower. 
First, a simulation model of the cold source system is learned and constructed from the sensor monitoring data accumulated in the data center; then, based on the simulation model, a reward function is designed by taking into account the operation constraints of the system, the energy consumption optimization target, and the like, and the control strategy is optimized with a reinforcement learning algorithm, finally achieving energy saving of the cold source system. The scheme can model and optimize a data center cold source system and, following the fluctuation of the IT load in the data center machine room, obtain more energy-efficient cold source system control parameters under an appropriate refrigeration capacity constraint, thereby reducing energy consumption. Accuracy and interpretability are improved by combining a data-driven model with a mechanism-driven model. Because a data center cold source system is large in scale and complex in structure, a system simulation model is difficult to construct. The fused data and mechanism model can, on the one hand, exploit the value of historical data and mine the associations between variables with an artificial neural network; on the other hand, it can exploit the strong interpretability of the mechanism model, so that the model is more credible. Based on the trained control model and strategy, the corresponding control amount can be output directly according to the working condition of the cold source system, achieving efficient control strategy optimization. As the system runs, its working condition characteristics may change, and the model and the strategy can be updated continuously and iteratively to achieve real-time dynamic adjustment, so that the system has good adaptability.

The method described in the previous embodiment is described in further detail below by way of example.

In this embodiment, the description takes as an example the case in which the energy-saving control device of the cold source system is integrated in an electronic device, the cold source system is the cold source system of a data center, the preset time period is T, the preset condition is a preset number of learning rounds M, the current state of the cold source system is s_t, and the target control strategy is d.

(I) First, a data-driven sub-model may be built, specifically as follows:

In order to improve the efficiency of energy-saving control of the cold source system, the data-driven sub-model may be trained in advance. The data-driven sub-model may be trained from a plurality of historical data of the cold source system. It may be trained by other equipment and then provided to the energy-saving control device of the cold source system, or the energy-saving control device of the cold source system may perform the training itself. For example, the electronic device may train the data-driven sub-model using an artificial neural network (a recurrent neural network or one of its variants), a regression tree (XGBoost), or the like.

(II) Model parameters of the mechanism energy consumption sub-model may be determined, specifically as follows:

In order to improve the efficiency of energy-saving control of the cold source system, the model parameters of the mechanism energy consumption sub-model may be determined in advance. The model parameters of the mechanism energy consumption sub-model may be determined from a plurality of historical data of the cold source system, such as the historical state quantities and the historical control quantities of the cold source system. The parameters may be determined by other equipment and then provided to the energy-saving control device of the cold source system, or the energy-saving control device of the cold source system may determine the model parameters of the mechanism energy consumption sub-model itself.
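As an illustration only, determining a mechanism model parameter from historical data might be sketched as follows. The cubic relation between pump power and flow, the function names, and the sample data are all assumptions made for this sketch; the embodiment does not specify the form of the mechanism energy consumption sub-model.

```python
def fit_cubic_coefficient(flows, powers):
    """Least-squares estimate of k for the ASSUMED model power = k * flow**3."""
    numerator = sum(p * q ** 3 for q, p in zip(flows, powers))
    denominator = sum(q ** 6 for q in flows)
    if denominator == 0:
        raise ValueError("flows must contain at least one non-zero value")
    return numerator / denominator


def predict_pump_power(k, flow):
    """Predict pump power for a given flow with the fitted coefficient."""
    return k * flow ** 3


# Hypothetical historical records (flow, measured pump power).
flows = [1.0, 2.0, 3.0]
powers = [2.0, 16.0, 54.0]

k = fit_cubic_coefficient(flows, powers)
predicted = predict_pump_power(k, 4.0)
```

A closed-form fit is used here because the assumed model has a single parameter; a mechanism model with more parameters would need a more general regression.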

(III) Energy-saving control of the cold source system may then be performed by using the data-driven sub-model and the determined model parameters of the mechanism energy consumption sub-model; for details, see fig. 2a and fig. 2b.

As shown in fig. 2a, a specific flow of the energy-saving control method of the cold source system may be as follows:

201. The electronic equipment initializes reinforcement learning parameters of a preset control model.

For example, the electronic device may initialize the reinforcement learning parameters, including the neural network or equation used to estimate the state value function, the learning rate, the number of learning rounds M, the number of steps per round T, and the like. The round count is set to m = 0, the step count within a round is set to t = 0, and the system state s_t is initialized.

202. The electronic equipment acquires the current state quantity of the cold source system and a target control strategy of a preset control model.

The current state quantity includes the state quantities of the current period and of a preset period before the current period. For example, the electronic device may obtain the input power of the chiller unit, the input power of the chilled water pump, and the input power of the cooling water pump of the cold source system, and determine the current control amount to be adopted according to the current state quantity of the cold source system and the target control strategy of the preset control model. That is, a control action a_t is selected according to the current state s_t and the strategy d; the current state s_t and the selected action a_t are input into the pre-built simulation model, namely the pre-built data-driven sub-model, mechanism energy consumption sub-model, and fusion sub-model; the simulation model then outputs the updated state s', and s_t is set to s'. For details, see steps 203 to 205 below.

203. The electronic equipment predicts a first predicted state quantity of the cold source system in the next period by using the data-driven sub-model.

For example, the electronic device may use the data-driven sub-model to predict the first predicted chiller power, the first predicted chilled water pump power, and the first predicted cooling water pump power of the cold source system in the next period; that is, the first predicted chiller power in the next period is P'_{ch}, the first predicted chilled water pump power is P'_{chp}, and the first predicted cooling water pump power is P'_{cp}.

For example, denote the system state in period t as S_t and the control amount adopted as a_t. Taking the multi-period influence of the control amount into account, the system states and the adopted control amounts of the current period and of several previous periods are taken as inputs, and the system state of the next period is output, namely:

S_{t+1} = f(S_{t-L+1}, a_{t-L+1}, …, S_t, a_t)

where L is the length of the window considered (i.e., the preset period), that is, the number of periods that can affect the next state quantity; its value can be determined according to the specific characteristics of the system.
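As an illustration only, assembling the windowed input of f from the last L periods might be sketched as follows. The function names, the flattening of states and control amounts into one vector, and the toy predictor are assumptions for this sketch; in the embodiment f would be the trained data-driven sub-model.

```python
def make_window_input(states, actions, L):
    """Flatten the last L (state, control amount) pairs into one input vector."""
    if len(states) < L or len(actions) < L:
        raise ValueError("need at least L periods of history")
    features = []
    for s, a in zip(states[-L:], actions[-L:]):
        features.extend(s)
        features.extend(a)
    return features


def predict_next_state(f, states, actions, L):
    """Evaluate S_{t+1} = f(S_{t-L+1}, a_{t-L+1}, ..., S_t, a_t)."""
    return f(make_window_input(states, actions, L))


def toy_f(features):
    """Hypothetical stand-in predictor: repeat the most recent state (2 values)."""
    return features[-3:-1]


# Each state has 2 components and each control amount has 1 (toy values).
states = [[100.0, 20.0], [110.0, 21.0], [120.0, 22.0]]
actions = [[7.0], [7.5], [8.0]]

window = make_window_input(states, actions, L=2)
next_state = predict_next_state(toy_f, states, actions, L=2)
```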

204. The electronic equipment predicts a second predicted state quantity of the cold source system in the next period by using the mechanism energy consumption sub-model.

For example, the mechanism energy consumption sub-model comprises a water chilling unit energy consumption module, a chilled water pump energy consumption module and a cooling water pump energy consumption module, and the electronic equipment can specifically predict the second predicted water chilling unit power of the cold source system in the next period by using the water chilling unit energy consumption module; predicting the second predicted chilled water pump power of the cold source system in the next period by utilizing a chilled water pump energy consumption module; and predicting the second predicted cooling water pump power of the cold source system in the next period by using a cooling water pump energy consumption module.

205. The electronic equipment fuses the first predicted state quantity and the second predicted state quantity to obtain a fused predicted state quantity.

For example, the electronic device may determine, based on the prediction errors of the data-driven sub-model and the mechanism energy consumption sub-model, a weight of the first predicted chiller power, a weight of the first predicted chilled water pump power, and a weight of the first predicted cooling water pump power, as well as a weight of the second predicted chiller power, a weight of the second predicted chilled water pump power, and a weight of the second predicted cooling water pump power; fuse the first predicted chiller power and the second predicted chiller power based on their weights to obtain the fused predicted chiller power; fuse the first predicted chilled water pump power and the second predicted chilled water pump power based on their weights to obtain the fused predicted chilled water pump power; and fuse the first predicted cooling water pump power and the second predicted cooling water pump power based on their weights to obtain the fused predicted cooling water pump power. In order to improve the fusion accuracy, the sum of the weight of the first predicted chiller power and the weight of the second predicted chiller power may be 1, the sum of the weight of the first predicted chilled water pump power and the weight of the second predicted chilled water pump power may be 1, and the sum of the weight of the first predicted cooling water pump power and the weight of the second predicted cooling water pump power may be 1.
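As an illustration only, the fusion step might be sketched as follows. Inverse-error weighting is an assumption made for this sketch; the embodiment states only that the weights are determined from the prediction errors of the two sub-models and that each pair of weights may sum to 1. All names and numbers are hypothetical.

```python
def inverse_error_weights(error_data, error_mechanism):
    """Return (w_data, w_mechanism) summing to 1; the smaller error gets the larger weight."""
    if error_data < 0 or error_mechanism < 0:
        raise ValueError("errors must be non-negative")
    total = error_data + error_mechanism
    if total == 0:
        return 0.5, 0.5
    return error_mechanism / total, error_data / total


def fuse(pred_data, pred_mechanism, errors_data, errors_mechanism):
    """Fuse the two predictions quantity by quantity with per-quantity weights."""
    fused = {}
    for name in pred_data:
        w_d, w_m = inverse_error_weights(errors_data[name], errors_mechanism[name])
        fused[name] = w_d * pred_data[name] + w_m * pred_mechanism[name]
    return fused


# Hypothetical predictions (kW) and historical prediction errors (kW).
pred_data = {"chiller": 400.0, "chilled_water_pump": 62.0, "cooling_water_pump": 34.0}
pred_mech = {"chiller": 420.0, "chilled_water_pump": 58.0, "cooling_water_pump": 36.0}
err_data = {"chiller": 10.0, "chilled_water_pump": 2.0, "cooling_water_pump": 1.0}
err_mech = {"chiller": 30.0, "chilled_water_pump": 2.0, "cooling_water_pump": 3.0}

fused = fuse(pred_data, pred_mech, err_data, err_mech)
```

In this toy data the data-driven chiller prediction has one third of the error of the mechanism prediction, so it receives a weight of 0.75.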

206. The electronic equipment performs reward calculation on the fused predicted state quantity according to a preset reward function to obtain a reward value of the cold source system in a target period.

For example, the electronic device may determine the reward weights of the fused predicted state quantities, and calculate the reward value of the cold source system in the target period based on the fused predicted chiller power, the fused predicted chilled water pump power, the fused predicted cooling water pump power, and the reward weights.
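As an illustration only, the reward calculation might be sketched as follows. The negative weighted sum and the weight values are assumptions for this sketch; the embodiment states only that the reward value is calculated from the fused predicted powers and the reward weights.

```python
def reward_value(fused_power, reward_weights):
    """Reward for one period: negative weighted sum of fused predicted powers (ASSUMED form)."""
    return -sum(reward_weights[name] * fused_power[name] for name in fused_power)


# Hypothetical fused predictions (kW) and reward weights.
fused_power = {"chiller": 405.0, "chilled_water_pump": 60.0, "cooling_water_pump": 35.0}
reward_weights = {"chiller": 1.0, "chilled_water_pump": 1.0, "cooling_water_pump": 1.0}

r = reward_value(fused_power, reward_weights)
```

With unit weights the reward is simply the negative of total predicted power, so lower energy consumption yields a higher reward.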

207. The electronic equipment performs penalty calculation on the fused predicted state quantity based on a preset constraint condition to obtain a penalty value of the cold source system in a target period.

For example, the electronic device may obtain the load of the data center equipment, the refrigeration capacity coefficient of the cold source system, and the refrigeration capacity of the cold source system in the target period, and calculate the penalty value of the cold source system in the target period based on the load of the data center equipment, the refrigeration capacity coefficient, and the refrigeration capacity.
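As an illustration only, one plausible form of the penalty calculation is sketched below. The shortfall formula, the scale factor, and all names are assumptions for this sketch; the embodiment states only that the penalty is calculated from the equipment load, the refrigeration capacity coefficient, and the refrigeration capacity.

```python
def penalty_value(it_load_kw, capacity_coefficient, cooling_capacity_kw, scale=1.0):
    """Penalty for one period, proportional to the cooling shortfall (ASSUMED form)."""
    required_kw = capacity_coefficient * it_load_kw
    shortfall_kw = max(0.0, required_kw - cooling_capacity_kw)
    return scale * shortfall_kw


# Hypothetical values: 1000 kW of IT load with a coefficient of 1.2.
insufficient = penalty_value(1000.0, 1.2, 1100.0)  # capacity below requirement
sufficient = penalty_value(1000.0, 1.2, 1300.0)    # capacity above requirement
```

No penalty is applied when the refrigeration capacity covers the required amount, so the constraint only influences learning when it is violated.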

208. The electronic equipment calculates the benefit value of the cold source system in the target period based on the reward value and the penalty value, and determines the total benefit value of the cold source system in the preset time period.

For example, the electronic device may combine the reward function and the penalty function to obtain the benefit value R_t - r_t of the next period. Then, the target control strategy of the preset control model is updated based on the benefit value to obtain an updated control strategy; the next period is taken as the current period (i.e., let t = t + 1), and the updated control strategy is taken as the target control strategy of the preset control model to continue predicting the next period of the cold source system (i.e., return to step 202) until the benefit value of each period in the preset time period is obtained (i.e., until t = T); the total benefit value of the cold source system in the preset time period is determined based on the benefit value of each period in the preset time period, and then step 209 is performed.
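As an illustration only, accumulating the per-period benefit values over the T periods of one round might be sketched as follows. The toy policy, the toy simulator, and the use of a plain sum as the total benefit are assumptions for this sketch; in the embodiment the state transition comes from the fused simulation model.

```python
def rollout(initial_state, policy, simulate_step, T):
    """Run T periods and return (total benefit, list of per-period benefits)."""
    state = initial_state
    benefits = []
    for _ in range(T):
        action = policy(state)
        state, benefit = simulate_step(state, action)
        benefits.append(benefit)
    return sum(benefits), benefits


def toy_policy(state):
    """Hypothetical fixed strategy: always choose a set point of 8.0."""
    return 8.0


def toy_step(state, action):
    """Hypothetical simulator: power falls as the set point rises; no penalty."""
    power_kw = 500.0 - 10.0 * action
    return state + 1, -power_kw


total, benefits = rollout(0, toy_policy, toy_step, T=3)
```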

209. When reinforcement learning has not reached the preset number of learning rounds, the electronic equipment adjusts the target control strategy, takes the adjusted control strategy as the target control strategy, and returns to step 202.

For example, when the reinforcement learning does not reach the preset learning round number, the electronic device adjusts the target control strategy according to the total benefit to obtain an adjusted control strategy, and uses the adjusted control strategy as the target control strategy to continuously predict the predicted state quantity of the cold source system in multiple dimensions in the next period.

For example, when reinforcement learning has not reached the preset number of learning rounds, the electronic device may adjust the target control strategy according to the total benefit (i.e., adjust the parameters of the neural network or equation used to estimate the state value function) to obtain an adjusted control strategy, and continue reinforcement learning with the adjusted control strategy as the target control strategy (i.e., return to step 202 and let m = m + 1) until reinforcement learning reaches the preset number of learning rounds (i.e., until m = M), and then execute step 210.
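As an illustration only, the outer loop over M learning rounds of T periods each might be sketched with tabular Q-learning, one of the algorithms the description lists as usable. The toy benefit function, the candidate set points, the hyperparameters, and the use of a table in place of a neural network are all assumptions for this sketch.

```python
import random

SET_POINTS = [6.0, 8.0, 10.0]  # hypothetical candidate control amounts


def benefit(action):
    """Toy benefit: negative power, minus a penalty when cooling is insufficient."""
    power_kw = 500.0 - 10.0 * action
    penalty = 200.0 if action > 9.0 else 0.0
    return -power_kw - penalty


def train(M, T, lr=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Run M rounds of T periods each, updating a Q-table after every period."""
    rng = random.Random(seed)
    q = {(t, a): 0.0 for t in range(T) for a in SET_POINTS}
    for _ in range(M):          # round count m = 0 .. M-1
        for t in range(T):      # period count t = 0 .. T-1
            if rng.random() < epsilon:
                a = rng.choice(SET_POINTS)  # explore
            else:
                a = max(SET_POINTS, key=lambda x: q[(t, x)])  # exploit
            future = 0.0 if t == T - 1 else max(q[(t + 1, x)] for x in SET_POINTS)
            q[(t, a)] += lr * (benefit(a) + gamma * future - q[(t, a)])
    return q


def greedy_policy(q, T):
    """Extract the learned strategy: best set point for each period."""
    return {t: max(SET_POINTS, key=lambda x: q[(t, x)]) for t in range(T)}


q_table = train(M=2000, T=5)
policy = greedy_policy(q_table, T=5)
```

In this toy setting the learned strategy settles on the middle set point, which saves more power than the lowest one without triggering the cooling penalty of the highest one.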

210. When reinforcement learning reaches the preset number of learning rounds, the electronic equipment outputs a trained control model and controls the cold source system based on the trained control model.

For example, the electronic device may obtain the trained control model when reinforcement learning reaches the preset number of learning rounds (i.e., when m = M). Then, the current state quantity of the cold source system is acquired, where the current state quantity includes the state quantities of the current period and of a preset period before the current period; a system control strategy of the cold source system is determined by using the trained control model based on the current state quantity; and the cold source system is controlled according to the system control strategy. For example, the trained control model may be used to determine the system control strategy of the cold source system, the system control amount currently adopted by the cold source system may be determined based on the current state quantity and the system control strategy, and the cold source system may be controlled based on the system control amount. The specific flow may be as shown in fig. 2b.
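As an illustration only, using a trained model online might be sketched as follows. The controller class, the threshold rule standing in for the trained control model, and the field names are assumptions for this sketch; the sketch only shows keeping a window of recent state quantities and mapping it to a control amount.

```python
from collections import deque


class ColdSourceController:
    """Keeps the most recent state quantities and queries the policy each period."""

    def __init__(self, policy, window_length):
        self.policy = policy
        self.history = deque(maxlen=window_length)  # current and preceding periods

    def step(self, state):
        """Record the new state quantity and return the control amount to apply."""
        self.history.append(state)
        return self.policy(list(self.history))


def toy_policy(window):
    """Hypothetical stand-in for a trained model: raise the set point at low load."""
    mean_load = sum(s["it_load_kw"] for s in window) / len(window)
    return {"chilled_water_supply_temp_c": 10.0 if mean_load < 800.0 else 8.0}


controller = ColdSourceController(toy_policy, window_length=3)
readings = [{"it_load_kw": 700.0}, {"it_load_kw": 900.0}, {"it_load_kw": 1000.0}]
commands = [controller.step(s) for s in readings]
set_points = [c["chilled_water_supply_temp_c"] for c in commands]
```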

Numerical experiments verify that this embodiment can achieve an energy saving ratio of about 1% to 5% for the cold source system of a data center, and that the direction of control strategy adjustment summarized from the learned strategy is consistent with expert experience.

In order to better implement the method, correspondingly, the embodiment of the application also provides an energy-saving control device of the cold source system, wherein the energy-saving control device of the cold source system can be integrated in electronic equipment, and the electronic equipment can be a server or a terminal and other equipment.

For example, as shown in fig. 3, the energy-saving control device of the cold source system may include an obtaining unit 301, a prediction unit 302, a calculation unit 303, a determining unit 304, an adjusting unit 305, and a control unit 306, as follows:

an obtaining unit 301, configured to obtain a current state quantity of a cold source system and a target control policy of a preset control model, where the current state quantity includes a current period and a state quantity of a preset period before the current period;

the prediction unit 302 is configured to predict predicted state amounts of multiple dimensions of the cold source system in a target period according to the current state amounts and a target control policy, and fuse the predicted state amounts of the multiple dimensions to obtain a fused predicted state amount;

the calculating unit 303 is configured to perform profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition, so as to obtain a profit value of the cold source system in a target period;

A determining unit 304, configured to determine, using a preset control model, a total benefit value of the cold source system in a preset time period based on the benefit value, where the preset time period includes at least one target time period;

the adjusting unit 305 is configured to adjust the target control policy according to the total benefit when the total benefit value does not meet a preset condition, obtain an adjusted control policy, and continuously predict the predicted state quantities of the multiple dimensions of the cold source system in the target period by using the adjusted control policy as the target control policy;

and the control unit 306 is configured to output a trained control model for controlling the cold source system when the total profit value meets a preset condition.

Optionally, in some embodiments, the prediction unit 302 includes a determination subunit, a first prediction subunit, a second prediction subunit, and a fusion subunit, as follows:

Optionally, in some embodiments, the computing unit 303 includes a first computing subunit, a second computing subunit, and a third computing subunit, as follows:

Optionally, in some embodiments, the target period is a period next to the current period, and the determining unit 304 may be specifically configured to update the target control policy of the preset control model based on the benefit value, to obtain an updated control policy; taking the target time period as the current time period, and taking the updated control strategy as the target control strategy of the preset control model to continuously predict the next time period of the cold source system until obtaining the benefit value of each time period in the preset time period; and determining the total profit value of the cold source system in the preset time period based on the profit value of each time period in the preset time period.

Optionally, in some embodiments, the control unit 306 may be specifically configured to obtain a current state quantity of the cold source system, where the current state quantity includes a current period and a state quantity of a preset period before the current period; determining a system control strategy of the cold source system by utilizing the trained control model based on the current state quantity; and controlling the cold source system according to the system control strategy.

In the implementation, each unit may be implemented as an independent entity, or may be implemented as the same entity or several entities in any combination, and the implementation of each unit may be referred to the foregoing method embodiment, which is not described herein again.

As can be seen from the foregoing, in this embodiment, the obtaining unit 301 may obtain the current state quantity of the cold source system and the target control policy of the preset control model, where the current state quantity includes the current period and the state quantity of the preset period before the current period; then, predicting the predicted state quantity of a plurality of dimensions of the cold source system in a target period by a prediction unit 302 according to the current state quantity and a target control strategy, and fusing the predicted state quantity of the plurality of dimensions to obtain a fused predicted state quantity; then, the calculation unit 303 calculates the benefits of the fused prediction state quantity according to a preset reward function and a preset constraint condition to obtain the benefits of the cold source system in the target period; determining, by the determining unit 304, a total benefit value of the cold source system in a preset time period based on the benefit value by adopting a preset control model, wherein the preset time period includes at least one target time period; when the total benefit value does not meet the preset condition, the target control strategy is adjusted by the adjusting unit 305 according to the total benefit, an adjusted control strategy is obtained, and the adjusted control strategy is used as the target control strategy to continuously predict the predicted state quantity of the cold source system in multiple dimensions in the target period; when the total profit value meets the preset condition, the control unit 306 outputs a trained control model for controlling the cold source system. 
According to this scheme, a simulation model of the cold source system can be learned and constructed from the sensor monitoring data accumulated in the data center; then, based on the simulation model, a reward function is designed by taking into account the operation constraints of the system, the energy consumption optimization target, and the like, and the control strategy is optimized with a reinforcement learning algorithm, finally achieving energy saving of the cold source system. The scheme can model and optimize a data center cold source system and, following the fluctuation of the equipment load in the data center machine room, obtain more energy-efficient cold source system control parameters under an appropriate refrigeration capacity constraint, thereby reducing energy consumption. Accuracy and interpretability are improved by combining a data-driven model with a mechanism-driven model. Based on the trained control model and strategy, the corresponding control amount can be output directly according to the working condition of the cold source system, achieving efficient control strategy optimization. As the system runs, its working condition characteristics may change, and the model and the strategy can be updated continuously and iteratively to achieve real-time dynamic adjustment, so that the system has good adaptability.

In addition, the embodiment of the application further provides an electronic device, as shown in fig. 4, which shows a schematic structural diagram of the electronic device according to the embodiment of the application, specifically:

The electronic device may include a processor 401 having one or more processing cores, a memory 402 comprising one or more computer-readable storage media, a power supply 403, an input unit 404, and other components. Those skilled in the art will appreciate that the electronic device structure shown in fig. 4 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or arrange the components differently. Wherein:

The processor 401 is the control center of the electronic device. It connects the various parts of the entire electronic device using various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, thereby monitoring the electronic device as a whole. Optionally, the processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 401.

The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by executing the software programs and modules stored in the memory 402. The memory 402 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data created according to the use of the electronic device, etc. In addition, memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.

The electronic device further comprises a power supply 403 for supplying power to the various components, preferably the power supply 403 may be logically connected to the processor 401 by a power management system, so that functions of managing charging, discharging, and power consumption are performed by the power management system. The power supply 403 may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.

The electronic device may further comprise an input unit 404, which input unit 404 may be used for receiving input digital or character information and generating keyboard, mouse, joystick, optical or trackball signal inputs in connection with user settings and function control.

Although not shown, the electronic device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 401 in the electronic device loads executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and the processor 401 executes the application programs stored in the memory 402, so as to implement various functions as follows:

acquiring a current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises a current period and a state quantity of a preset period before the current period; then, predicting the predicted state quantity of a plurality of dimensions of the cold source system in a target period according to the current state quantity and a target control strategy, and fusing the predicted state quantity of the plurality of dimensions to obtain a fused predicted state quantity; then, carrying out profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in a target period; determining the total profit value of the cold source system in a preset time period based on the profit value by adopting a preset control model, wherein the preset time period comprises at least one target time period; when the total benefit value does not meet a preset condition, adjusting the target control strategy according to the total benefit to obtain an adjusted control strategy, and continuously predicting the predicted state quantity of the cold source system in multiple dimensions in a target period by taking the adjusted control strategy as the target control strategy; and when the total income value meets the preset condition, outputting a trained control model for controlling the cold source system.

The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.

As can be seen from the foregoing, this embodiment may acquire a current state quantity of the cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises state quantities of the current period and of a preset period before the current period; then predict, according to the current state quantity and the target control strategy, predicted state quantities of the cold source system in a plurality of dimensions for a target period, and fuse the predicted state quantities of the plurality of dimensions to obtain a fused predicted state quantity; then perform profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in the target period; determine, by using the preset control model, a total profit value of the cold source system in a preset time period based on the profit value, wherein the preset time period comprises at least one target period; when the total profit value does not meet a preset condition, adjust the target control strategy according to the total profit value to obtain an adjusted control strategy, and continue to predict the predicted state quantities of the cold source system in the plurality of dimensions for the target period with the adjusted control strategy as the target control strategy; and when the total profit value meets the preset condition, output a trained control model for controlling the cold source system.
In this scheme, the sensor data accumulated by the data center can be used to learn and construct a simulation model of the cold source system. Based on the simulation model, a reward function is designed that takes into account the operating constraints and the energy consumption optimization targets of the system, and the control strategy is optimized by a reinforcement learning algorithm, ultimately achieving energy saving for the cold source system. The scheme can model and optimize a data center cold source system and, subject to an appropriate refrigerating capacity constraint, obtain more energy-efficient control parameters as the equipment load in the data center machine room fluctuates, thereby reducing energy consumption. Combining a data-driven model with a mechanism-driven model improves accuracy and interpretability. Based on the trained control model and strategy, the corresponding control quantity can be output directly according to the working condition of the cold source system, so that control strategy optimization is efficient. As the system runs, its working condition characteristics may change, and the model and strategy can be updated iteratively to achieve real-time dynamic adjustment, which gives the system good adaptability.
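By way of a non-limiting illustration, the adjust-until-satisfied training loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the toy plant in `period_profit`, the constants `CAPACITY_COEFF` and `PENALTY_SCALE`, and all function names are hypothetical, and simple random hill climbing stands in for the reinforcement learning algorithm, which this section does not specify.

```python
import random

CAPACITY_COEFF = 1.1   # assumption: required cooling margin over the equipment load
PENALTY_SCALE = 10.0   # assumption: weight applied to a cooling capacity shortfall


def period_profit(setpoint_c, load_kw):
    """Toy one-period profit: negative power minus a capacity-shortfall penalty."""
    power_kw = 200.0 - 8.0 * setpoint_c        # toy model: warmer chilled water costs less
    capacity_kw = 300.0 - 15.0 * setpoint_c    # toy model: but delivers less cooling
    shortfall = max(0.0, CAPACITY_COEFF * load_kw - capacity_kw)
    return -power_kw - PENALTY_SCALE * shortfall


def total_profit(setpoint_c, loads_kw):
    """Total profit over a preset time period made of several target periods."""
    return sum(period_profit(setpoint_c, load) for load in loads_kw)


def train(loads_kw, target_total, setpoint_c=7.0, step=0.5, max_iters=500, seed=0):
    """Adjust the strategy until the total profit meets the preset condition.

    The strategy here is a single chilled water setpoint; hill climbing is a
    stand-in for the reinforcement learning update (assumption).
    """
    rng = random.Random(seed)
    best = total_profit(setpoint_c, loads_kw)
    for _ in range(max_iters):
        if best >= target_total:               # preset condition met: stop training
            break
        candidate = setpoint_c + rng.uniform(-step, step)
        profit = total_profit(candidate, loads_kw)
        if profit > best:                      # keep only strategies that improve profit
            setpoint_c, best = candidate, profit
    return setpoint_c, best
```

Calling `train([90.0, 100.0, 95.0], target_total=-320.0)` returns the adjusted setpoint together with the total profit it achieves; the threshold plays the role of the preset condition.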

Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.

Therefore, an embodiment of the application further provides a storage medium storing a plurality of instructions that can be loaded by a processor to execute the steps of any of the energy-saving control methods of the cold source system provided by the embodiments of the application. For example, the instructions may perform the steps of:

The storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and the like.

Because the instructions stored in the storage medium can execute the steps of any of the energy-saving control methods of the cold source system provided by the embodiments of the application, they can achieve the beneficial effects of any of those methods. For details, see the previous embodiments, which are not repeated herein.

The energy-saving control method and device for a cold source system, the electronic equipment, and the storage medium provided by the embodiments of the application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the application, and the description of the above embodiments is intended only to help understand the method and core idea of the application. Meanwhile, those skilled in the art may, in light of the ideas of the application, make changes to the specific implementations and the scope of application. In summary, the content of this description should not be construed as limiting the application.

Claims

1. An energy-saving control method of a cold source system, which is characterized by being applied to the cold source system of a data center, comprising the following steps:

predicting the predicted state quantity of a plurality of dimensions of the cold source system in a target period according to the current state quantity and a target control strategy, and fusing the predicted state quantity of the plurality of dimensions to obtain a fused predicted state quantity, wherein the state quantity is predicted based on a machine learning method;

2. The method of claim 1, wherein the preset control model includes a data-driven sub-model and a mechanism energy consumption sub-model, and the predicting the predicted state quantity of the multiple dimensions of the cold source system in the target period according to the current state quantity and the target control strategy includes:

determining a target control quantity currently adopted according to the current state quantity and a target control strategy;

predicting a first predicted state quantity of the cold source system in the target period by using the data-driven sub-model, wherein the data-driven sub-model is a data model obtained by training based on historical data of the cold source system, and the historical data comprises a historical state quantity and a historical control quantity of the cold source system;

predicting a second predicted state quantity of the cold source system in the target period by using the mechanism energy consumption sub-model, wherein the mechanism energy consumption sub-model is a physical model established based on reference energy consumption of internal equipment of the cold source system;

the step of fusing the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity comprises the following steps: and fusing the first predicted state quantity and the second predicted state quantity to obtain a fused predicted state quantity.

3. The method of claim 2, wherein the mechanism energy consumption sub-model includes a chiller energy consumption module, a chilled water pump energy consumption module, and a cooling water pump energy consumption module, the second predicted state quantity includes a second predicted chiller power, a second predicted chilled water pump power, and a second predicted cooling water pump power, and the predicting, using the mechanism energy consumption sub-model, a second predicted state quantity of the cold source system during a target period includes:

predicting second predicted chiller power of the cold source system in a target period by using a chiller energy consumption module;

predicting second predicted chilled water pump power of the cold source system in a target period by utilizing a chilled water pump energy consumption module;

and predicting the second predicted cooling water pump power of the cold source system in the target period by using a cooling water pump energy consumption module.

4. The method of claim 3, wherein the current state quantity comprises a target cooling water outlet temperature and a target chilled water return temperature, the target control quantity comprises a target chilled water outlet temperature and a target chilled water flow, and the predicting, by using the chiller energy consumption module, the second predicted chiller power of the cold source system in the target period comprises:

acquiring chiller model parameters of the chiller energy consumption module;

and calculating the second predicted chiller power of the cold source system in the target period based on the target cooling water outlet temperature, the target chilled water return temperature, the target chilled water outlet temperature, the target chilled water flow, and the chiller model parameters.

5. The method of claim 3, wherein the target control amount comprises a target chilled water pump flow, the predicting, with a chilled water pump energy consumption module, a second predicted chilled water pump power of the cold source system for a target period of time, comprising:

obtaining chilled water pump model parameters of the chilled water pump energy consumption module;

and calculating the second predicted chilled water pump power of the cold source system in the target period based on the target chilled water pump flow and the chilled water pump model parameters.

6. The method of claim 3, wherein the target control amount comprises a target cooling water pump flow, the predicting, with a cooling water pump energy consumption module, a second predicted cooling water pump power of the cold source system for a target period of time, comprising:

obtaining cooling model parameters of the cooling water pump energy consumption module;

and calculating the second predicted cooling water pump power of the cold source system in the target period based on the target cooling water pump flow and the cooling model parameters.
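The claims above do not state the formulas used inside the energy consumption modules. Purely as an illustrative sketch, the following assumes a cubic polynomial in flow for pump power (in the spirit of the pump affinity laws) and a chiller power equal to the cooling load divided by a coefficient of performance (COP) that falls linearly with temperature lift. The parameter classes, field names, and functional forms are all assumptions, not the claimed models.

```python
from dataclasses import dataclass

WATER_SPECIFIC_HEAT = 4.186  # kJ/(kg*K), specific heat of liquid water


@dataclass
class PumpParams:
    """Hypothetical pump model parameters: power = a0 + a1*q + a2*q^2 + a3*q^3."""
    a0: float
    a1: float
    a2: float
    a3: float


@dataclass
class ChillerParams:
    """Hypothetical chiller model parameters for a linear COP-versus-lift model."""
    cop_base: float
    cop_per_kelvin_lift: float


def pump_power_kw(flow, params):
    """Pump power from flow and model parameters (assumed cubic polynomial)."""
    return params.a0 + params.a1 * flow + params.a2 * flow**2 + params.a3 * flow**3


def chiller_power_kw(cooling_out_c, chilled_return_c, chilled_out_c,
                     chilled_flow_kg_s, params):
    """Chiller power from the four quantities named in claim 4 (assumed form)."""
    # Cooling load removed from the chilled water loop, in kW.
    load_kw = WATER_SPECIFIC_HEAT * chilled_flow_kg_s * (chilled_return_c - chilled_out_c)
    # Temperature lift between the condenser side and the evaporator side.
    lift_k = cooling_out_c - chilled_out_c
    cop = max(params.cop_base - params.cop_per_kelvin_lift * lift_k, 1.0)
    return load_kw / cop
```

The same `pump_power_kw` function can serve both the chilled water pump and the cooling water pump, each with its own fitted `PumpParams`.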

7. The method according to claim 2, wherein fusing the first predicted state quantity and the second predicted state quantity to obtain the fused predicted state quantity includes:

determining a first weight of the first predicted state quantity based on the prediction errors of the data-driven sub-model and the mechanism energy consumption sub-model, and determining a second weight of the second predicted state quantity based on the prediction errors of the data-driven sub-model and the mechanism energy consumption sub-model;

and fusing the first predicted state quantity and the second predicted state quantity based on the first weight and the second weight to obtain a fused predicted state quantity.
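One simple way to derive both weights from the two prediction errors is inverse-error weighting, sketched below. The claim only requires that each weight depend on the prediction errors of both sub-models; the inverse-error rule, the dictionary representation of a state quantity, and the function names are assumptions made for illustration.

```python
def inverse_error_weights(err_data_driven, err_mechanism, eps=1e-9):
    """Return (first_weight, second_weight), normalized to sum to 1.

    Each weight depends on both errors: the smaller a sub-model's error,
    the larger its share. `eps` guards against division by zero.
    """
    inv_data = 1.0 / (err_data_driven + eps)
    inv_mech = 1.0 / (err_mechanism + eps)
    total = inv_data + inv_mech
    return inv_data / total, inv_mech / total


def fuse(first_pred, second_pred, first_weight, second_weight):
    """Weighted fusion of two predicted state quantities with matching keys."""
    return {
        key: first_weight * first_pred[key] + second_weight * second_pred[key]
        for key in first_pred
    }
```

For example, with a data-driven error of 1.0 and a mechanism error of 3.0, the data-driven prediction receives roughly three quarters of the weight.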

8. The method according to any one of claims 1 to 7, wherein the performing profit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a profit value of the cold source system in a target period includes:

performing reward calculation on the fused predicted state quantity according to the preset reward function to obtain a reward value of the cold source system in the target period;

performing penalty calculation on the fused predicted state quantity based on the preset constraint condition to obtain a penalty value of the cold source system in the target period;

and calculating the profit value of the cold source system in the target period based on the reward value and the penalty value.

9. The method of claim 8, wherein the fused predicted state quantity includes a fused predicted chiller power, a fused predicted chilled water pump power, and a fused predicted cooling water pump power, and the performing reward calculation on the fused predicted state quantity according to the preset reward function to obtain a reward value of the cold source system in the target period comprises:

determining a reward weight of the fused predicted state quantity;

and calculating the reward value of the cold source system in the target period based on the fused predicted chiller power, the fused predicted chilled water pump power, the fused predicted cooling water pump power, and the reward weight.

10. The method of claim 8, wherein the performing penalty calculation on the fused predicted state quantity based on the preset constraint condition to obtain a penalty value of the cold source system in the target period comprises:

acquiring a load of data center equipment, a refrigerating capacity coefficient of a cold source system and the refrigerating capacity of the cold source system in a target period;

and calculating the penalty value of the cold source system in the target period based on the load of the data center equipment, the refrigerating capacity coefficient, and the refrigerating capacity.
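The reward, penalty, and profit calculations of claims 8 to 10 can be sketched as below. The concrete forms are assumptions: the reward is taken as the negative weighted sum of the three fused powers, the constraint is taken as "refrigerating capacity is at least the coefficient times the equipment load", and the profit is the reward minus the penalty. The `penalty_scale` parameter and all function names are hypothetical.

```python
def reward_value(chiller_kw, chilled_pump_kw, cooling_pump_kw, reward_weight=1.0):
    """Reward for one target period: lower total power yields a higher reward."""
    return -reward_weight * (chiller_kw + chilled_pump_kw + cooling_pump_kw)


def penalty_value(equipment_load_kw, capacity_coeff, capacity_kw, penalty_scale=10.0):
    """Penalty for one target period, proportional to the capacity shortfall.

    Assumed constraint: capacity_kw >= capacity_coeff * equipment_load_kw.
    The penalty is zero whenever the constraint holds.
    """
    shortfall = max(0.0, capacity_coeff * equipment_load_kw - capacity_kw)
    return penalty_scale * shortfall


def profit_value(reward, penalty):
    """Profit for one target period (assumed to be reward minus penalty)."""
    return reward - penalty
```

Keeping the penalty separate from the reward makes it straightforward to tune how strongly a cooling shortfall is discouraged relative to energy savings.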

11. The method according to any one of claims 1 to 7, wherein the target period is the period next to the current period, and the determining, by using the preset control model, a total profit value of the cold source system in a preset time period based on the profit value includes:

updating the target control strategy of the preset control model based on the profit value to obtain an updated control strategy;

taking the target period as the current period, and taking the updated control strategy as the target control strategy of the preset control model, to continue predicting for the next period of the cold source system until the profit value of each period in the preset time period is obtained;

and determining the total profit value of the cold source system in the preset time period based on the profit value of each time period in the preset time period.
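The period-by-period procedure of claim 11 amounts to a rollout, which can be sketched as follows. The callables `act`, `predict`, `profit_fn`, and `update` are hypothetical placeholders for the control strategy, the fused prediction model, the profit calculation, and the strategy update; aggregating the per-period profits as a discounted sum is likewise an assumption, since the claim only says the total is based on the profit of each period.

```python
def rollout_total_profit(state, theta, act, predict, profit_fn, update,
                         n_periods, discount=1.0):
    """Roll the model forward over the preset time period.

    state     : current state quantity
    theta     : parameters of the target control strategy
    act       : (theta, state) -> control quantity
    predict   : (state, control) -> predicted state for the next period
    profit_fn : (state, control) -> profit value for that period
    update    : (theta, profit) -> updated strategy parameters
    """
    profits = []
    for _ in range(n_periods):
        control = act(theta, state)
        state = predict(state, control)      # the target period becomes the current period
        profit = profit_fn(state, control)
        profits.append(profit)
        theta = update(theta, profit)        # updated strategy is used for the next period
    total = sum(discount**t * p for t, p in enumerate(profits))
    return total, profits, theta
```

With `discount=1.0` the total is the plain sum of the per-period profits; a value below 1 weights near-term periods more heavily.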

12. The method according to any one of claims 1 to 7, further comprising, after the outputting of the trained control model when the total profit value meets the preset condition:

acquiring a current state quantity of the cold source system, wherein the current state quantity comprises state quantities of the current period and of a preset period before the current period;

determining a system control strategy of the cold source system by utilizing the trained control model based on the current state quantity;

and controlling the cold source system according to the system control strategy.
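At deployment time, the current state quantity spans the current period and a preset number of preceding periods. The following minimal sketch keeps such a sliding window and queries a trained model once the window is full. The `StateWindow` class, the flattening of readings into one list, and the `trained_policy` callable are assumptions for illustration only.

```python
from collections import deque


class StateWindow:
    """Sliding window holding the current period plus `n_history` earlier periods."""

    def __init__(self, n_history):
        self._buf = deque(maxlen=n_history + 1)

    def push(self, reading):
        """Append the sensor reading of the newest period, dropping the oldest if full."""
        self._buf.append(tuple(reading))

    def ready(self):
        """True once enough periods have been observed to form a state quantity."""
        return len(self._buf) == self._buf.maxlen

    def as_state(self):
        """Flatten the window, oldest period first, into one state vector."""
        return [value for reading in self._buf for value in reading]


def control_step(window, trained_policy):
    """Return the control quantity from the trained model, or None if not ready."""
    if not window.ready():
        return None
    return trained_policy(window.as_state())
```

In use, each new period's sensor reading is pushed into the window and `control_step` is called to obtain the control quantity to apply.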

13. An energy-saving control device of a cold source system, characterized in that the device is applied to the cold source system of a data center and comprises:

the prediction unit is used for predicting the predicted state quantity of the multiple dimensions of the cold source system in the target period according to the current state quantity and the target control strategy, and fusing the predicted state quantity of the multiple dimensions to obtain a fused predicted state quantity, wherein the state quantity is predicted based on a machine learning method;

14. A computer readable storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor to perform the steps in the energy saving control method of the cold source system according to any one of claims 1 to 12.

15. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 12 when the program is executed.