CN119205078A

CN119205078A - Agricultural-photovoltaic complementary operation and maintenance method, device, equipment and medium for photovoltaic power station in desertified land

Info

Publication number: CN119205078A
Application number: CN202411285628.0A
Authority: CN
Inventors: 梁哲铭; 丁春兴; 杜宝刚; 于洋
Original assignee: Liaoning Branch Company Huaneng Renewables Corp ltd
Current assignee: Liaoning Branch Company Huaneng Renewables Corp ltd
Priority date: 2024-09-12
Filing date: 2024-09-12
Publication date: 2024-12-27

Abstract

The invention belongs to the technical field of photovoltaic operation and maintenance, and particularly relates to an agricultural-photovoltaic complementary operation and maintenance method, device, equipment and medium for a photovoltaic power station on desertified land. The state of the photovoltaic power station, the growth state of crops and the external environmental conditions are obtained in real time, and the collected data are input into a pre-trained DQN (deep Q-network) neural network model. By analyzing the data, the DQN model outputs a Q value for each possible operation and maintenance strategy, and the strategy corresponding to the optimal Q value is determined as the current optimal operation and maintenance strategy, so that the power generation efficiency of the photovoltaic power station, the soil improvement effect and the crop yield are optimized together.

Description

Method, device, equipment and medium for agricultural and optical complementary operation and maintenance of sandy land photovoltaic power station

Technical Field

The invention belongs to the technical field of photovoltaic operation and maintenance, and particularly relates to an agricultural-photovoltaic complementary operation and maintenance method, device, equipment and medium for a photovoltaic power station on desertified land.

Background

In the field of photovoltaic power generation, operation and maintenance management strategies for land-based photovoltaic power stations are mature, but strategies for photovoltaic power stations on desertified land that involve agricultural-photovoltaic complementation and soil improvement remain to be explored. Taking the contiguous desertified land of Zhangwu, Fuxin City, Liaoning as an example, crops were commonly planted there before the photovoltaic power station was installed, while the soil gradually became desertified. Since the installation of the photovoltaic power station, the water content of the soil no longer decreases. How to combine the photovoltaic power station with soil improvement and crop planting to form agricultural-photovoltaic complementation places higher requirements on the operation and maintenance management strategy of the power station. In the prior art, operation and maintenance strategies for land-based photovoltaic power stations cannot be applied directly to agricultural-photovoltaic complementary power stations in areas such as sandy land.

Disclosure of Invention

The invention aims to provide an agricultural-photovoltaic complementary operation and maintenance method, device, equipment and medium for a photovoltaic power station on sandy land, so as to solve the problem in the prior art that operation and maintenance strategies for land-based photovoltaic power stations are not suitable for agricultural-photovoltaic complementary power stations on sandy land.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

The invention provides a method for carrying out agricultural light complementation operation and maintenance on a desertification land photovoltaic power station, which comprises the following steps:

Acquiring a state of a photovoltaic power station, a growth state of crops and external environmental conditions, wherein the photovoltaic power station is built in a sandy land area of a target area, and the image data of the target area is identified by a pre-trained image identification CNN neural network to determine the sandy land area;

Inputting the state of a photovoltaic power station, the growth state of crops and the external environmental conditions into a pre-trained DQN neural network model, wherein the DQN neural network model outputs Q values corresponding to operation and maintenance strategies;

And determining an operation and maintenance strategy corresponding to the optimal Q value as an optimal operation and maintenance strategy, wherein the operation and maintenance strategy comprises a photovoltaic power station maintenance mode, a photovoltaic module inclination angle and a crop planting mode.

Further, the sandy land area is determined by identifying image data of the target area with the pre-trained image recognition CNN neural network, wherein the image recognition CNN neural network is trained as follows:

building a sandy land image recognition training environment, and determining the characteristics and classification of the sandy land required in an image recognition algorithm;

determining the number of neurons required by a neural network and the number of layers of the neural network in an image recognition algorithm, and establishing a CNN neural network;

training the neural network on real sandy land images, wherein after each training run the training parameters are adjusted to improve the training effect;

after training, testing the trained neural network with image sets of sandy land covering different plots, sizes, colors and infrared imagery, adjusting the parameters of the neural network according to the test results, and training and testing again after adjustment, iterating until the optimal effect is achieved.

Further, the state of the photovoltaic power station, the growth state of crops and the external environmental conditions are input into a pre-trained DQN neural network model, the DQN neural network model outputs Q values corresponding to operation and maintenance strategies, and the DQN neural network model is trained according to the following mode:

defining a state space, wherein the state comprises the current month, the inclination angle of the photovoltaic module, the soil moisture content, the soil temperature and the soil pH value;

defining an action space, wherein the action space comprises all crop planting modes, photovoltaic power station maintenance modes and inclination angle adjustment strategies of a photovoltaic module, and each action corresponds to a specific crop planting decision, photovoltaic power station maintenance or inclination angle adjustment of the photovoltaic module;

designing a reward function, wherein the reward function is used for reflecting the generated energy, crop harvest and soil improvement effect;

Determining a network structure of the DQN neural network model, comprising:

An input layer for receiving as input each index in the state space;

a plurality of hidden layers are arranged to extract characteristics and learn complex nonlinear relations;

the output layer comprises neurons with the same size as the action space, and the output of each neuron represents the Q value of the corresponding action;

the training process comprises: collecting state-action-reward-new state quadruple data for subsequent training of the DQN neural network by simulating the photovoltaic power station and the crop planting system; storing the collected experience data in an experience replay buffer; randomly drawing a batch of experience samples from the buffer for training; calculating a target Q value with a target network; using the mean square error as the loss function to measure the difference between the Q value output by the DQN neural network and the target Q value; and updating the parameters of the DQN neural network with an optimization algorithm to minimize the loss function.

Further, different weights are assigned to the generated energy, crop harvest and soil improvement effect in the reward function, which weights are used to reflect the relative importance of each optimization objective in the overall benefit.

The second aspect of the invention provides a sand land photovoltaic power station agricultural light complementary operation and maintenance device, which comprises:

An input data acquisition module for acquiring the state of the photovoltaic power station, the growth state of crops and the external environmental conditions, wherein the photovoltaic power station is built in a sandy land area of a target area, and the sandy land area is determined by identifying image data of the target area with a pre-trained image recognition CNN neural network;

The strategy optimization module is used for inputting the state of the photovoltaic power station, the growth state of crops and the external environmental conditions into a pre-trained DQN neural network model, and the DQN neural network model outputs Q values corresponding to all operation and maintenance strategies;

A strategy determining module for determining the operation and maintenance strategy corresponding to the optimal Q value as the optimal operation and maintenance strategy, wherein the operation and maintenance strategy comprises a photovoltaic power station overhaul mode, a photovoltaic module inclination angle and a crop planting mode.

Further, in the input data acquisition module, the image recognition CNN neural network is trained as follows:

Further, in the policy optimization module, the DQN neural network model is trained as follows:

Determining a network structure of the DQN neural network model, comprising:

An input layer for receiving as input each index in the state space;

In a third aspect of the invention, an electronic device is provided, comprising a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the method for agricultural light complementary operation and maintenance of a sandy land photovoltaic power station as described above.

In a fourth aspect of the present invention, a computer readable storage medium is provided, where at least one instruction is stored, where the at least one instruction, when executed by a processor, implements the above-mentioned method for performing agricultural light complementary operation and maintenance on a sandy land photovoltaic power station.

Compared with the prior art, the invention has the following beneficial effects:

In the field of photovoltaic power generation, although operation and maintenance management strategies for land-based photovoltaic power stations are relatively mature, a significant technical gap remains for photovoltaic power stations on sandy land, especially for strategies that combine agricultural-photovoltaic complementation with soil improvement. To address this, the invention provides an agricultural-photovoltaic complementary operation and maintenance method for a photovoltaic power station on sandy land. A pre-trained image recognition CNN (convolutional neural network) processes image data of the target area and identifies the sandy land area, which serves as the basis for siting the photovoltaic power station. The state of the photovoltaic power station, the growth state of crops and the external environmental conditions are obtained in real time, and the collected data are input into a pre-trained DQN (deep Q-network) neural network model. By analyzing the data, the DQN model outputs a Q value for each possible operation and maintenance strategy, and the strategy corresponding to the optimal Q value is determined as the current optimal strategy, so that the power generation efficiency of the photovoltaic power station, the soil improvement effect and the crop yield are optimized together. Because the sandy land area is identified by the image recognition CNN and the construction scheme is customized to that area, the photovoltaic power station can be built to suit the special environment of sandy land. Because multivariate data such as the state of the photovoltaic power station, the growth state of crops and the external environmental conditions are input into the DQN neural network model, an optimal operation and maintenance strategy is provided for the agricultural-photovoltaic complementary system that takes into account the power generation efficiency of the power station as well as the growth requirements of crops and the soil improvement effect. This solves the problem that operation and maintenance strategies for traditional land-based photovoltaic power stations are not applicable when applied directly to desertified land. The invention also provides an agricultural-photovoltaic complementary operation and maintenance device for a photovoltaic power station on sandy land, electronic equipment and a computer-readable storage medium, which likewise solve the problems described in the background.

Through the operation and maintenance strategy, the generated energy of the photovoltaic power station, the crop yield and the soil improvement effect are all improved. Stable operation of the photovoltaic power station increases the supply of clean energy, higher crop yield drives the development of the agricultural economy, and soil improvement helps restore and protect the ecological environment. The three reinforce one another, forming a virtuous circle and improving the comprehensive benefit of the system.

Compared with the traditional manual inspection and experience judgment, the operation and maintenance strategy provided by the scheme can monitor and analyze data in real time, discover and solve problems in time, reduce the requirement of manual intervention, and further reduce operation and maintenance cost.

In conclusion, the agricultural-photovoltaic complementary operation and maintenance method for a sandy land photovoltaic power station uses intelligent identification and decision technology to solve the problem that prior-art operation and maintenance strategies for land-based photovoltaic power stations are not suitable for agricultural-photovoltaic complementary power stations on sandy land, and realizes the coordinated development of the photovoltaic power station, crop planting and soil improvement.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:

FIG. 1 is a flow chart of a method for carrying out agricultural and optical complementation operation and maintenance on a desertification land photovoltaic power station according to an embodiment of the invention;

FIG. 2 is a block diagram of a device for complementary operation and maintenance of agricultural light of a photovoltaic power station in sandy land according to an embodiment of the invention;

FIG. 3 is a block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The application will be described in detail below with reference to the drawings in connection with embodiments. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.

The following detailed description is exemplary and is intended to provide further details of the application. Unless defined otherwise, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the application.

Example 1

As shown in fig. 1, the method for performing agricultural light complementation operation and maintenance on the desertification land photovoltaic power station comprises the following steps:

S1, acquiring a state of a photovoltaic power station, a crop growth state and external environmental conditions, wherein the photovoltaic power station is built in a sandy land area of a target area, and the sandy land area is determined by identifying image data of the target area through a pre-trained image identification CNN neural network;

Specifically, the image recognition CNN neural network is trained as follows:

S2, inputting the state of the photovoltaic power station, the growth state of crops and the external environmental conditions into a pre-trained DQN neural network model, and outputting Q values corresponding to operation and maintenance strategies by the DQN neural network model;

Specifically, the DQN neural network model is trained as follows:

Determining a network structure of the DQN neural network model, comprising:

An input layer for receiving as input each index in the state space;

Preferably, different weights are assigned in the reward function to the generated energy, crop harvest and soil improvement effect, and the weights reflect the relative importance of each optimization objective in the overall benefit.

In a more specific embodiment, the aim is to maximize the three benefits of the generation capacity, crop planting and soil improvement of the desertification land photovoltaic power station during the training of the DQN (Deep Q-Networks) neural network.

The following is a detailed training method of the DQN neural network:

(1) Data preparation

State space definition: define the state space of the DQN. The state comprises several indicators, such as the current month, the inclination angle of the photovoltaic module, the soil moisture content, the soil temperature and the soil pH value, which serve as the input of the DQN neural network.

Action space definition: the action space comprises all photovoltaic inspection modes, crop planting modes and adjustment strategies for the inclination angle of the photovoltaic module. Each action corresponds to a specific photovoltaic inspection mode, crop planting decision or inclination adjustment of the photovoltaic module.

Reward function design: the reward function reflects the generated energy, crop harvest and soil improvement effect. Trade-offs may be made based on actual business requirements; for example, higher rewards may be given to state-action pairs that lead to high power generation, high crop yield and good soil improvement.

More specifically, in the scenario of agricultural-photovoltaic complementary operation and maintenance management of a sandy land photovoltaic power station, the state space contains all key factors influencing crop planting, soil improvement and photovoltaic power generation. For example:

Current month: the climate conditions of different months (such as temperature, humidity and duration of sunlight) influence the growth period of crops and the power generation efficiency of the photovoltaic module.

Inclination angle of the photovoltaic module: the inclination angle determines the angle at which the photovoltaic panel receives solar radiation, thereby affecting the power generation efficiency. The inclination angle also influences the illumination conditions and the temperature distribution of the soil below.

Soil moisture content: the moisture content of the soil is critical to crop growth, and also affects the physical properties of the soil and the stability of the photovoltaic panel foundations.

Soil temperature: soil temperature affects the growth rate of crops, the activity of microorganisms in the soil and the release of nutrients.

Soil pH value: the pH of the soil directly affects crop growth and soil fertility, and needs to be adjusted to suit the growth needs of the particular crop.

These indicators are encoded into numerical or categorical data and used as input features of the DQN neural network.

In practical applications, these data are also normalized or standardized to ensure stability and convergence of neural network training.

More specifically, the action space is defined as follows.

The action space comprises all photovoltaic inspection modes, crop planting modes and adjustment strategies for the inclination angle of the photovoltaic module. These actions may take the following forms:

Photovoltaic inspection mode: includes the inspection route, inspection time and the like.

Crop planting decision: includes which crop to plant (e.g., wheat, corn or vegetables), the planting density, the planting time and the like.

Inclination adjustment of the photovoltaic module: the inclination angle is adjusted according to seasonal changes, changes in solar altitude and the requirements of soil improvement and crop growth, so as to optimize the power generation efficiency and illumination distribution.

In DQN, each action is represented as a discrete number or code. The size of the action space depends on the type of crop available, the planting mode and the number of tilt strategies.
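As a minimal sketch of the discrete encoding described above, the three action categories can be flattened into a single list whose index serves as the action code. The option names below (routes, crops, 5-degree tilt steps) are hypothetical placeholders and are not specified in this disclosure.

```python
from typing import List, Tuple

# Hypothetical option lists; the description names the categories, not the values.
INSPECTION_MODES = ["route_A_morning", "route_B_evening"]
PLANTING_DECISIONS = ["wheat", "corn", "vegetables"]
TILT_ADJUSTMENTS = ["decrease_5_deg", "hold", "increase_5_deg"]


def build_action_space() -> List[Tuple[str, str]]:
    """Flatten the three categories into one list; the list index is the action code."""
    actions = [("inspection", m) for m in INSPECTION_MODES]
    actions += [("planting", c) for c in PLANTING_DECISIONS]
    actions += [("tilt", t) for t in TILT_ADJUSTMENTS]
    return actions


ACTIONS = build_action_space()


def decode_action(index: int) -> Tuple[str, str]:
    """Map a discrete action code back to its (category, option) pair."""
    return ACTIONS[index]
```

Under these assumptions the action space has eight entries, so the output layer of the network would have eight neurons. Adding a crop or a tilt step only lengthens the list.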

More specifically, the reward function is designed as follows.

The reward function evaluates the quality of an action based on the current state, the action taken, and the subsequent state. In the scenario of agricultural-photovoltaic complementary operation and maintenance management of a sandy land photovoltaic power station, the reward function jointly considers the generated energy, crop harvest, soil improvement effect and other aspects.

Power generation reward: a reward is given according to the power generation efficiency of the photovoltaic module. The higher the power generation, the greater the reward.

Crop harvest reward: a reward is given according to the yield and quality of the crop. The better the harvest, the greater the reward.

Soil improvement reward: a reward is given according to the degree of improvement in indicators such as soil water content, temperature and pH value. The more the soil condition improves, the greater the reward.

In addition, to encourage the agent to explore new state-action pairs, a small exploration reward term may be added to the reward function. Likewise, to keep the agent from taking actions that harm the system (e.g., damaging the soil structure or compromising the safety of the photovoltaic module), negative rewards or penalties may be given in those cases.

Finally, the design of the reward function should be weighed and adjusted according to the actual business requirements to ensure that the DQN can learn an operation and maintenance management strategy that is both economical and environmentally sustainable.
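One possible form of such a reward is a weighted sum of the three benefit terms with a penalty for harmful actions, sketched below. The weight values (0.5, 0.3, 0.2), the penalty size and the assumption that each term is pre-scaled to a comparable range are illustrative choices and are not taken from this disclosure.

```python
def reward(power_gain: float, crop_gain: float, soil_gain: float,
           harmful: bool = False,
           weights: tuple = (0.5, 0.3, 0.2),
           penalty: float = 1.0) -> float:
    """Weighted sum of three benefit terms, minus a penalty for harmful actions.

    Inputs are assumed to be pre-scaled to comparable ranges (e.g. 0..1);
    the default weights and penalty are illustrative assumptions.
    """
    w_power, w_crop, w_soil = weights
    total = w_power * power_gain + w_crop * crop_gain + w_soil * soil_gain
    if harmful:
        # Negative reward for actions that damage soil or endanger the modules.
        total -= penalty
    return total
```

For example, `reward(0.8, 0.5, 0.1)` combines a strong power generation gain with a moderate harvest gain and a small soil gain, and passing `harmful=True` lowers the same result by the penalty.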

(2) Neural network structure

Input layer: receives each indicator in the state space as input.

Hidden layers: multiple hidden layers are set up to extract features and learn complex nonlinear relationships. The number of hidden layers and the number of neurons per layer should be tailored to the specific task.

Output layer: comprises as many neurons as there are actions in the action space, and the output of each neuron represents the Q value of the corresponding action.

More specifically, the input layer is the first layer of the neural network, responsible for receiving data from the state space. In the case of a sandy land photovoltaic power plant agro-optical complementary operation and maintenance management, the input layer comprises a plurality of neurons, each neuron corresponding to an index in a state space. These indicators include:

Current month: may be input as a one-hot encoding or as a simple integer.

Inclination angle of the photovoltaic module: may be a continuous value giving the angle in degrees, or may be divided into several intervals and represented by a one-hot vector.

Soil moisture content, soil temperature and soil pH value: these are continuous values that can be used directly as input, but they need to be normalized to a range better suited to the neural network (such as between 0 and 1).

The main function of the input layer is to convert the raw data into a form the neural network can process and pass it to the next layer for further processing.
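The encoding described for the input layer can be sketched as follows: the month becomes a 12-element one-hot vector and the continuous indicators are min-max scaled to the range 0 to 1. The physical bounds used for scaling are assumptions made for this example, and the function names are hypothetical.

```python
from typing import List

# Assumed physical ranges, used only for min-max scaling in this sketch.
TILT_RANGE = (0.0, 90.0)          # degrees
MOISTURE_RANGE = (0.0, 100.0)     # percent
SOIL_TEMP_RANGE = (-30.0, 50.0)   # degrees Celsius
PH_RANGE = (0.0, 14.0)


def min_max(value: float, bounds: tuple) -> float:
    """Scale value into [0, 1], clipping anything outside the assumed bounds."""
    low, high = bounds
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))


def encode_state(month: int, tilt_deg: float, moisture_pct: float,
                 soil_temp_c: float, soil_ph: float) -> List[float]:
    """Build the input vector: 12 one-hot month entries plus 4 scaled values."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    month_one_hot = [1.0 if m == month else 0.0 for m in range(1, 13)]
    return month_one_hot + [
        min_max(tilt_deg, TILT_RANGE),
        min_max(moisture_pct, MOISTURE_RANGE),
        min_max(soil_temp_c, SOIL_TEMP_RANGE),
        min_max(soil_ph, PH_RANGE),
    ]


state = encode_state(month=6, tilt_deg=30.0, moisture_pct=12.0,
                     soil_temp_c=18.0, soil_ph=7.0)
```

With this layout the input layer would have 16 neurons. Representing the inclination angle by intervals instead would replace the single scaled value with another one-hot segment.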

More specifically, the hidden layer is a part of the neural network between the input layer and the output layer, and is responsible for extracting characteristics of input data and learning complex nonlinear relations.

In the hidden layer, an activation function (e.g., ReLU, Sigmoid, or Tanh) is used to increase the nonlinear capability of the network. The choice of activation function also affects the performance and training efficiency of the network.

More specifically, the output layer is the last layer of the neural network and is responsible for outputting the Q value of each action. In the DQN, the number of neurons in the output layer is equal to the size of the action space.

Output layer neurons: each neuron corresponds to one action, and its output value represents the expected return (i.e., the Q value) of taking that action in the current state. The Q value is learned by the network from the state features and guides the decision process of the agent.

During training, the DQN updates the weights of the network using the error between the target Q value (calculated by the target network) and the predicted Q value (calculated by the current network). This process is implemented by a back-propagation algorithm, aimed at minimizing the difference between the predicted Q value and the target Q value.

In summary, the structural design of the DQN neural network comprehensively considers the characteristics of the input data, the task requirements, the computing resources, and other factors. By adjusting parameters such as the number of hidden layers, the number of neurons at each layer, an activation function and the like, a neural network model suitable for the operation and maintenance management task of the agricultural optical complementation of the sandy land photovoltaic power station can be constructed.
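The layer structure described above can be illustrated with a small fully connected network written in plain Python. The layer sizes (16 inputs, two hidden layers of 32 neurons, 8 outputs), the ReLU activation and the weight initialization are assumptions chosen for this sketch, and a practical implementation would normally use a deep learning framework.

```python
import random
from typing import List


def init_layer(n_in: int, n_out: int, rng: random.Random) -> dict:
    """Create one fully connected layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return {"W": weights, "b": [0.0] * n_out}


def dense(layer: dict, x: List[float]) -> List[float]:
    """Compute W x + b for one layer."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(layer["W"], layer["b"])]


def relu(v: List[float]) -> List[float]:
    """Element-wise max(0, a)."""
    return [max(0.0, a) for a in v]


def q_values(layers: List[dict], state: List[float]) -> List[float]:
    """Forward pass: ReLU on hidden layers, linear output (one Q value per action)."""
    h = state
    for layer in layers[:-1]:
        h = relu(dense(layer, h))
    return dense(layers[-1], h)


def greedy_action(q: List[float]) -> int:
    """Index of the action with the largest Q value."""
    return max(range(len(q)), key=lambda i: q[i])


rng = random.Random(0)
STATE_DIM, HIDDEN, N_ACTIONS = 16, 32, 8   # illustrative sizes
network = [init_layer(STATE_DIM, HIDDEN, rng),
           init_layer(HIDDEN, HIDDEN, rng),
           init_layer(HIDDEN, N_ACTIONS, rng)]
q = q_values(network, [0.0] * STATE_DIM)
```

Selecting the operation and maintenance strategy then amounts to calling `greedy_action` on the Q values returned for the current state.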

(3) Training process

Experience collection: a large amount of state-action-reward-new state quadruple data is collected by simulating or actually running the photovoltaic power station and the crop planting system.

Experience replay: the collected experience data is stored in an experience replay buffer. During training, a batch of experience samples is randomly drawn from the buffer, which breaks the temporal correlation between data and improves training stability.

Target network: the target network has the same structure as the DQN neural network, but its parameters are updated less frequently. The target network is used to calculate the target Q value in order to stabilize training.

Loss function: the mean square error (MSE) is used as the loss function to measure the difference between the Q value output by the DQN neural network and the target Q value.

Optimization algorithm: the parameters of the DQN neural network are updated using an optimization algorithm (e.g., Adam) to minimize the loss function.

More specifically, a large amount of training data must be collected before the DQN is trained. The data take the form of quadruples (state, action, reward, new state) that record the interaction history of the agent with the environment. The data can be collected by simulating the operation of the photovoltaic power station and the crop planting system. At each time step, the system records the current state (e.g., month, inclination angle of the photovoltaic module and soil indicators), the action taken (e.g., adjusting the inclination angle or planting a crop), the reward obtained (e.g., increase in power generation or crop harvest), and the new state reached after the action is performed. All collected experience data are stored in a buffer, which is a queue of finite size: when new data arrive and the buffer is full, the earliest data are replaced. During training, the network is not updated directly with the latest experience data; instead, a batch of samples is randomly drawn from the experience replay buffer. Randomizing the samples in this way reduces the temporal correlation between them and makes the training process more stable.
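A finite-size buffer with random batch sampling, as described above, might be sketched as follows. The class name `ReplayBuffer`, its method names and the tiny capacity are hypothetical and serve only to show the mechanism.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

Transition = Tuple[list, int, float, list]  # (state, action, reward, new_state)


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # A deque with maxlen drops the oldest item automatically once full.
        self._items: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, state, action, reward, new_state) -> None:
        self._items.append((state, action, reward, new_state))

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a uniformly random batch without replacement."""
        return self._rng.sample(list(self._items), batch_size)

    def __len__(self) -> int:
        return len(self._items)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push([float(step)], step % 2, 0.1 * step, [float(step + 1)])
batch = buffer.sample(2)
```

After five pushes into a buffer of capacity three, only the three most recent transitions remain, which matches the replacement rule for a full buffer.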

The target network is used to calculate a target Q value to stabilize the training process. The structure of the target network is identical to that of the DQN neural network, but the frequency of parameter updates is different.

The parameters of the DQN neural network are updated according to the gradient descent algorithm at each time step, and the parameters of the target network are copied from the DQN neural network at regular steps (e.g., every 1000 steps). By doing so, the calculation of the target Q value can be more stable, so that the stability of the training process is facilitated.
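The periodic copy from the DQN neural network to the target network can be sketched with a small helper. The class name `TargetSync` and the dictionary representation of parameters are assumptions for illustration, and the period of 3 is used only to keep the example short.

```python
import copy


class TargetSync:
    """Keeps a lagged copy of the online parameters, refreshed every `period` steps."""

    def __init__(self, online_params: dict, period: int = 1000) -> None:
        self.period = period
        self.steps = 0
        self.target_params = copy.deepcopy(online_params)

    def step(self, online_params: dict) -> bool:
        """Count one training step; copy parameters when the period elapses."""
        self.steps += 1
        if self.steps % self.period == 0:
            self.target_params = copy.deepcopy(online_params)
            return True
        return False


online = {"w": [0.0, 0.0]}
sync = TargetSync(online, period=3)
copied = []
for _ in range(3):
    online["w"][0] += 1.0          # stand-in for a gradient update
    copied.append(sync.step(online))
```

Because the copy is deep, later updates to the online parameters leave the target parameters unchanged until the next scheduled copy.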

The DQN uses the mean square error (MSE) as the loss function to measure the difference between the predicted Q value and the target Q value. For each training sample (state, action, reward, new state), the DQN neural network outputs a predicted Q value representing the expected return of taking the action in that state. The target Q value is calculated by the target network and takes into account the rewards and state transitions available in the future. Specifically, the target Q value may be expressed as the current reward plus the discounted maximum future Q value, i.e., $r + \gamma \max_{a'} Q_{\text{target}}(s', a')$, where $r$ is the current reward, $\gamma$ is the discount factor, $s'$ is the new state, and $a'$ is an action available in the new state.

The difference between the predicted Q value and the target Q value is measured by the MSE loss, i.e., $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( Q(s_i, a_i) - Q_{\text{target}}(s_i, a_i) \right)^2$, where $N$ is the number of samples in the batch.
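The target value and the loss can be worked through on a toy batch. All numbers below, including the discount factor of 0.5, are illustrative and are not values from this disclosure; the helper names are hypothetical.

```python
from typing import Sequence


def td_target(reward: float, next_q_target: Sequence[float], gamma: float) -> float:
    """Current reward plus gamma times the largest target-network Q value in s'."""
    return reward + gamma * max(next_q_target)


def mse_loss(predicted: Sequence[float], targets: Sequence[float]) -> float:
    """Mean of squared differences over the batch."""
    n = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / n


# Toy batch of two samples; every number here is illustrative.
rewards = [1.0, 0.0]
next_q = [[0.5, 2.0], [1.0, 0.0]]        # target-network Q values in each new state
predicted = [2.0, 1.0]                   # Q(s_i, a_i) from the DQN neural network
targets = [td_target(r, q, gamma=0.5) for r, q in zip(rewards, next_q)]
loss = mse_loss(predicted, targets)
```

Here the targets are 1.0 + 0.5 × 2.0 = 2.0 and 0.0 + 0.5 × 1.0 = 0.5, so the squared errors are 0 and 0.25 and the loss is 0.125.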

Optionally, the DQN uses an optimization algorithm (e.g., Adam) to update its parameters so as to minimize the loss function.

In each training iteration, the parameters of the DQN neural network are updated as directed by the gradient descent algorithm (or, more specifically, the Adam algorithm). These updates are intended to reduce the difference between the predicted Q value and the target Q value.
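For illustration, the standard Adam update rule can be written out for a flat list of parameters and applied to a toy squared-error loss. The learning rate of 0.1, the single-parameter loss and the class layout are assumptions for this sketch; in practice the optimizer of a deep learning framework would be used.

```python
import math
from typing import List


class Adam:
    """Plain Adam update for a flat list of parameters."""

    def __init__(self, n_params: int, lr: float = 0.1,
                 beta1: float = 0.9, beta2: float = 0.999, eps: float = 1e-8) -> None:
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * n_params   # running mean of gradients
        self.v = [0.0] * n_params   # running mean of squared gradients
        self.t = 0                  # number of updates so far

    def step(self, params: List[float], grads: List[float]) -> None:
        """Update params in place from the gradients of the loss."""
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)   # bias correction
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


# Toy loss (q - target)^2 with gradient 2 * (q - target); values are illustrative.
target = 3.0
params = [0.0]
optimizer = Adam(n_params=1)
for _ in range(500):
    grads = [2.0 * (params[0] - target)]
    optimizer.step(params, grads)
```

In this toy setting the single parameter plays the role of a predicted Q value being pulled toward a fixed target Q value.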

(4) Multi-objective optimization

Specifically, the generated energy, crop harvest and soil improvement effect are assigned different weights in the reward function to reflect their relative importance in the overall benefit. And adjusting the inclination angle, the overhaul mode and the crop planting strategy of the photovoltaic module according to the output of the DQN neural network and the feedback of the rewarding function in the training process so as to realize multi-objective optimization.

Power generation weight: power generation is the main economic index of a photovoltaic power station, so its weight is usually the highest. However, the specific weight value needs to be determined according to the actual situation, since factors such as the power generation cost of the station, market demand and policy subsidies affect the relative importance of power generation.

Crop harvest weight: crop harvest is an important component of the agricultural light complementary system and reflects the contribution of the system to agricultural production. Factors such as the type of crop, market demand and price all affect the assignment of this weight.

Soil improvement effect weight: soil improvement is a long-term benefit that helps to improve the fertility and ecological environment of the land. However, the soil improvement effect tends to take a long time to appear, so its weight is relatively low.

In practice, the assignment of weights may be determined by expert consultation, market research, cost-effectiveness analysis, and the like.
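
A weighted reward of this kind can be sketched as follows. The sketch assumes that each of the three indicators has already been normalized to the range 0 to 1, and the weights 0.5, 0.3 and 0.2 are hypothetical values that merely follow the ordering described above (power generation highest, soil improvement lowest); the actual values would be determined as described in the preceding paragraph.

```python
def weighted_reward(power, harvest, soil, weights=(0.5, 0.3, 0.2)):
    """Combine three normalized indicators into a single scalar reward.

    power, harvest, soil: indicators assumed to be normalized to [0, 1].
    weights: hypothetical weights for power generation, crop harvest and
             soil improvement effect, in that order.
    """
    w_power, w_harvest, w_soil = weights
    return w_power * power + w_harvest * harvest + w_soil * soil


# Example: strong power generation, moderate harvest, weak soil improvement.
reward = weighted_reward(power=0.8, harvest=0.5, soil=0.25)
```

Changing the weights shifts the strategy that the DQN neural network learns toward the objective with the larger weight.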

Specifically, strategy adjustment is the concrete implementation of multi-objective optimization. The inclination angle of the photovoltaic module, the overhaul mode and the crop planting strategy are adjusted according to the output of the DQN neural network and the feedback of the reward function, so as to realize multi-objective optimization.

Adjusting the inclination angle of the photovoltaic module: adjusting the inclination angle optimizes the angle at which the module receives solar radiation and thereby improves the power generation efficiency. At the same time, the adjustment takes into account the requirements of crop growth and the soil improvement effect, so as to maximize the overall benefit.

Optimizing the overhaul mode: regular overhaul of the photovoltaic modules and the agricultural facilities is an important measure for keeping the system running stably. Optimizing the overhaul mode and period reduces the failure rate, improves equipment utilization and lowers the operation and maintenance cost. In addition, targeted maintenance can be carried out during overhaul in light of the condition of the soil and the crops.

Adjusting the crop planting strategy: a suitable crop planting strategy is formulated according to soil conditions, climate conditions, market demand and other factors. During planting, attention must be paid to the growth cycle, nutrient requirements, pest control and similar aspects of the crops, so as to ensure high yield and high quality. The influence of crop planting density and layout on the power generation efficiency of the photovoltaic modules must also be considered.

(5) Monitoring and evaluation

Training monitoring: tools such as TensorBoard are used to monitor how indexes such as the loss value and the average reward change during training.

It should be noted that, in the training process of the DQN, monitoring the change of each key index is important for timely finding problems and adjusting the training strategy.

Loss value monitoring: the change of the loss value during training is monitored with visualization tools such as TensorBoard. The loss value reflects the difference between the Q value predicted by the model and the target Q value, and its trend indicates how well the model is converging and training. If the loss value keeps dropping and gradually stabilizes, the model is converging and learning an effective strategy.

Average reward monitoring: in addition to the loss value, the change of the average reward may be monitored. The average reward is the mean of the total rewards obtained by the agent over a period of time and reflects the performance level of the model under the current strategy. If the average reward gradually rises and then stabilizes, the model is continually improving its strategy and achieving better performance.

Monitoring of other indexes: other related indexes, such as the exploration rate (the epsilon value in the epsilon-greedy strategy), the learning rate and the training speed, may be monitored according to the specific application scenario and requirements.
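
The average reward index described above can be tracked, for example, with a sliding window. The following is a minimal sketch; the class name `RewardMonitor` and the window size are hypothetical, and in practice the returned values could be written to a tool such as TensorBoard instead of being kept in a list.

```python
from collections import deque


class RewardMonitor:
    """Track the average episode reward over the most recent episodes."""

    def __init__(self, window=100):
        # Only the last `window` episode rewards are kept.
        self.rewards = deque(maxlen=window)

    def record(self, episode_reward):
        """Store one episode reward and return the current windowed average."""
        self.rewards.append(episode_reward)
        return sum(self.rewards) / len(self.rewards)


monitor = RewardMonitor(window=3)
averages = [monitor.record(r) for r in [1.0, 2.0, 3.0, 4.0]]
print(averages)  # [1.0, 1.5, 2.0, 3.0]
```

A rising and then flattening curve of these averages corresponds to the convergence behavior described above.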

Model evaluation: after training is finished, the trained DQN model is evaluated on an independent test data set to check whether its performance under different conditions meets the expected targets.

Test data set preparation: an independent test data set, separate from the training data set, is prepared to ensure the objectivity and accuracy of the evaluation results.

Performance index evaluation: suitable performance indexes, such as accuracy, average reward, cumulative reward and completion time, are selected for evaluation according to the specific application scenario and requirements.

(6) Practical application

The trained DQN model is deployed in the agricultural light complementary operation and maintenance management system of the photovoltaic power station, and outputs the optimal crop planting mode and photovoltaic module inclination angle adjustment strategy according to real-time environmental data.

By updating the data and retraining the model periodically, environmental changes and new business needs are accommodated.

S3: the operation and maintenance strategy corresponding to the optimal Q value is determined as the optimal operation and maintenance strategy, wherein the operation and maintenance strategy comprises a photovoltaic power station overhaul mode, a photovoltaic module inclination angle and a crop planting mode.
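
Step S3 amounts to choosing the strategy with the largest Q value. A minimal sketch is given below, assuming a small discrete strategy set: the overhaul mode names and the inclination angle values are hypothetical placeholders, while the crop names follow the examples given elsewhere in this description.

```python
from itertools import product

# Hypothetical discrete options; only the crop names come from the description.
OVERHAUL_MODES = ["routine_inspection", "fault_repair"]
TILT_ANGLES_DEG = [20, 30, 40]
CROPS = ["eggplant", "potato", "corn"]

# Each strategy is one (overhaul mode, inclination angle, crop) combination.
STRATEGIES = list(product(OVERHAUL_MODES, TILT_ANGLES_DEG, CROPS))


def select_strategy(q_values):
    """Return the strategy whose Q value is largest.

    q_values: one Q value per entry of STRATEGIES, in the same order.
    """
    if len(q_values) != len(STRATEGIES):
        raise ValueError("expected one Q value per strategy")
    best_index = max(range(len(q_values)), key=q_values.__getitem__)
    return STRATEGIES[best_index]


# Toy Q values: strategy number 5 has the highest value.
q_values = [0.0] * len(STRATEGIES)
q_values[5] = 1.0
best = select_strategy(q_values)
print(best)  # ('routine_inspection', 30, 'corn')
```

Here the Q values are supplied by hand; in the scheme they would be the outputs of the pre-trained DQN neural network for the current state.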

In an alternative embodiment, the scheme provides an agricultural light complementary operation and maintenance management method for a sandy land photovoltaic power station.

In the foundation construction stage, on-site pictures are taken by an unmanned aerial vehicle and the desertified land is accurately identified, so that occupation of land inside the red lines of basic farmland, forest land and the like is avoided. A soil moisture monitoring device is set up in each photovoltaic array, and soil indexes such as the soil moisture content are recorded for each photovoltaic array under different photovoltaic module inclination angles. After the recording is completed, crops are adaptively planted according to these soil indexes, and the neural network learns the optimal crop planting modes for different months, different photovoltaic module inclination angles and different soil indexes. Multi-objective optimization is then performed based on the learning result: the overhaul mode of the photovoltaic power station is reasonably planned, the photovoltaic modules are adjusted to a suitable angle so that illumination and other conditions favor soil improvement and crop planting, and the crops best suited to each soil condition are bred and planted. In this way, the three benefits of the sandy land photovoltaic power station, namely power generation, crop planting and soil improvement, are jointly maximized.

First, the features and classes of sandy land required by the image recognition algorithm are determined. The number of neurons and the number of layers required by the neural network in the image recognition algorithm are determined, and the neural network is established. The neural network is trained with real images of desertified land. After each training run, the training parameters are adjusted to achieve the best training effect. The trained neural network is then tested with new images. Image sets of sandy land inside and outside the red lines, with different areas, sizes and colors, may be used for this test, so that the ability of the scheme to identify sandy land inside and outside the red lines can be verified. For the image recognition CNN neural network, the adjustable training parameters include the number of layers, the number of neurons and the like.

Second, the features and classes of the different photovoltaic module inclination angles, soil indexes and crops are determined. This means that, when the neural network is trained, the different photovoltaic module inclination angles, the different soil indexes and the different crops serve as state quantities. Here, features refer to the characteristics of each state quantity, and classes refer to the types of a state quantity (for example, for planted crops: eggplants, potatoes, corn and the like). The number of neurons and the number of layers required by the neural network in the deep reinforcement learning algorithm are determined, and a deep Q-network (DQN) neural network is established.

The DQN neural network is trained with the different photovoltaic module inclination angles, soil indexes and crops as state quantities, with the different photovoltaic power station operation and maintenance modes and the drip irrigation frequencies of the crop drip irrigation belts as action quantities, and with the power generation of the sandy land photovoltaic power station, the soil improvement effect and the crop harvest as the reward function. After each training run, the training parameters are adjusted to achieve the best training effect; that is, for each combination of photovoltaic module inclination angle, soil index and planted crop, the optimal photovoltaic power station operation and maintenance mode and the optimal drip irrigation frequency of the crop drip irrigation belt are obtained, yielding the optimal power generation of the sandy land photovoltaic power station, the optimal soil improvement effect and the optimal crop harvest.

Maximizing power generation calls for the best inclination angle and the best overhaul route. The best soil improvement effect means optimal soil moisture content, which requires adjusting both the inclination angle of the photovoltaic module and the overhaul time. The best crop harvest requires drip irrigation of the soil, and the drip irrigation belts in turn affect the overhaul route. The scheme is therefore a multi-objective optimization.

The trained neural network is then tested with different photovoltaic power station operation and maintenance modes and different drip irrigation frequencies of the crop drip irrigation belts. Related parameters of the neural network, such as the learning rate and the learning step, are adjusted according to the test results, and training and testing are performed again.

Based on the tested neural network, and taking into account uncertain factors such as sudden faults together with the position of the desertified land where each photovoltaic array is located and the road conditions, an optimal overhaul and defect elimination route is planned, the photovoltaic module angles and the drip irrigation frequencies of the crop drip irrigation belts in the corresponding areas are adjusted, and the soil improvement effect and crop planting income are checked regularly, so as to maximize the power generation of the sandy land photovoltaic power station, the soil improvement effect and the crop harvest.

The uncertainty factors are embodied in the training process, and the associated uncertainty factors are included in the training set.

In other alternative embodiments, there is also provided an operation and maintenance method, including:

First, a sandy land image recognition training environment is built, and the features and classes of sandy land required by the image recognition algorithm are determined. The number of neurons and the number of layers required by the neural network in the image recognition algorithm are determined, and a CNN neural network is established. The neural network is trained with real images of desertified land. After each training run, the training parameters are adjusted to achieve the best training effect. The trained neural network is then tested with image sets of sandy land inside and outside the red lines, with different plots, sizes and colors; the parameters of the neural network are adjusted according to the test results, and training and testing are repeated after each adjustment until the best effect is achieved.

Second, a training environment for the deep reinforcement learning neural network is built. The internal facilities of the photovoltaic power station, the array positions, the road conditions, the soil indexes and the like are mathematically modeled and linearized, with a mixed integer linear programming (MILP) model as the final form. The uncertainties related to photovoltaic power station operation and maintenance, soil improvement and crop planting (illumination and temperature, soil water content, fault positions and fault times of photovoltaic modules, and the like) are analyzed and modeled: distribution curves of the uncertainties are established from historical data, and the correlations among them (the mutual influence of different uncertainties) are determined. The state space (state variables), the action space (control variables) and the reward function of the neural network are determined, and a mathematical environment that can represent the photovoltaic power station is constructed. A "smart brain" based on the deep reinforcement learning algorithm is built in Python and connected to the mathematical environment through the Gurobi optimization solver. The DQN neural network in the "smart brain" is trained with historical data, and the best training effect is achieved by adjusting the training parameters. The trained DQN is then tested with real-time data, the simulation results are verified in the real environment, and the strategy is applied in the photovoltaic power station to maximize the agricultural light complementary benefits.

Uncertainty is contained in each training set, i.e., the training environment is different at each training.
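
One way to make the training environment differ between training runs is to sample the uncertain quantities anew for each episode. The following is a minimal sketch in which the distributions, their ranges, the fault probability and the number of arrays are all placeholder assumptions; in the scheme the distribution curves are established from historical data, which is not reproduced here.

```python
import random


def sample_episode_conditions(rng, num_arrays=10, fault_probability=0.1):
    """Draw one set of uncertain conditions for a training episode.

    All distributions below are placeholders, not fitted to any data.
    """
    has_fault = rng.random() < fault_probability
    return {
        "irradiance_w_m2": rng.uniform(200.0, 1000.0),
        "temperature_c": rng.gauss(20.0, 8.0),
        "soil_moisture_pct": rng.uniform(5.0, 25.0),
        # Index of the photovoltaic array with a sudden fault, if any.
        "faulty_array": rng.randrange(num_arrays) if has_fault else None,
    }


rng = random.Random(0)  # fixed seed so that the sketch is reproducible
episodes = [sample_episode_conditions(rng) for _ in range(3)]
```

Correlations among the uncertainties, which the scheme also models, are not captured by these independent draws and would need a joint distribution.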

The operation and maintenance strategies of existing land photovoltaic power stations cannot be directly applied to agricultural light complementary photovoltaic power stations in sandy land and similar areas. Moreover, existing unmanned aerial vehicle technology cannot accurately identify whether the required land lies inside a red line, and the result has to be compared against reference maps such as the forestry and grassland land survey maps. The prior art cannot effectively combine soil improvement, crop planting and photovoltaic power station operation and maintenance, and cannot accomplish the multi-objective optimization task. With the scheme of the invention, however, the optimal power generation of the sandy land photovoltaic power station, the optimal soil improvement effect and the optimal crop harvest can be obtained.

Example 2

As shown in fig. 2, based on the same inventive concept as the above embodiment, the present invention further provides an agricultural light complementary operation and maintenance device for a sandy land photovoltaic power station, which includes:

Determining a network structure of the DQN neural network model, comprising:

An input layer for receiving as input each index in the state space;

Example 3

As shown in fig. 3, the present invention further provides an electronic device 100 for implementing the method for performing the agricultural light complementary operation and maintenance of the sanded land photovoltaic power station of embodiment 1;

the electronic device 100 comprises a memory 101, at least one processor 102, a computer program 103 stored in the memory 101 and executable on the at least one processor 102, and at least one communication bus 104.

The memory 101 may be used to store a computer program 103, and the processor 102 implements the steps of the method for agricultural light complementation operation and maintenance of the sandy land photovoltaic power plant of embodiment 1 by running or executing the computer program stored in the memory 101 and invoking data stored in the memory 101.

The memory 101 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the electronic device 100 (such as audio data), and the like. In addition, the memory 101 may include non-volatile memory, such as a hard disk, memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one disk storage device, a flash memory device, or another non-volatile solid-state storage device.

The at least one processor 102 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 102 may be a microprocessor or any conventional processor; the processor 102 is the control center of the electronic device 100 and connects the various parts of the entire electronic device 100 through various interfaces and lines.

The memory 101 in the electronic device 100 stores a plurality of instructions to implement a method for performing complementary operation and maintenance of a photovoltaic power plant for a sandy land, the processor 102 executing the plurality of instructions to implement:

Example 4

The modules/units integrated with the electronic device 100 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as a stand alone product. Based on such understanding, the present invention may implement all or part of the flow of the method of the above embodiment, or may be implemented by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the steps of each method embodiment described above may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, executable files or in some intermediate form, etc. The computer readable medium may include any entity or device capable of carrying computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, and a Read-Only Memory (ROM).

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the specific embodiments of the present invention without departing from the spirit and scope of the present invention, and any modifications and equivalents are intended to be included in the scope of the claims of the present invention.

Claims

1. The method for carrying out agricultural light complementation operation and maintenance on the sandy land photovoltaic power station is characterized by comprising the following steps of:

2. The method for agricultural light complementation operation and maintenance of a sandy land photovoltaic power plant according to claim 1, wherein the sandy land area is determined by recognizing image data of a target area with a pre-trained image recognition CNN neural network, wherein the image recognition CNN neural network is trained in the following manner:

3. The method for agricultural light complementation operation and maintenance of a sandy land photovoltaic power station according to claim 1, wherein the state of the photovoltaic power station, the growth state of crops and the external environmental conditions are input into a pre-trained DQN neural network model, the DQN neural network model outputs Q values corresponding to operation and maintenance strategies, and the DQN neural network model is trained according to the following modes:

Determining a network structure of the DQN neural network model, comprising:

An input layer for receiving as input each index in the state space;

4. The method for agricultural light complementation operation and maintenance of a sandy land photovoltaic power plant according to claim 3, characterized in that different weights are assigned in the reward function to the power generation, the crop harvest and the soil improvement effect, said weights being used to reflect the relative importance of each optimization objective in the overall benefit.

5. An agricultural light complementary operation and maintenance device for a sandy land photovoltaic power station, characterized by comprising:

6. The device of claim 5, wherein the image recognition CNN neural network is trained in the input data acquisition module in the following manner:

7. The device of claim 5, wherein in the policy optimization module, the DQN neural network model is trained as follows:

Determining a network structure of the DQN neural network model, comprising:

An input layer for receiving as input each index in the state space;

8. The device of claim 7, wherein different weights are assigned in the reward function to the power generation, the crop harvest and the soil improvement effect, said weights being used to reflect the relative importance of each optimization objective in the overall benefit.

9. An electronic device comprising a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the method for agricultural light complementation operation and maintenance of a sandy land photovoltaic power plant according to any one of claims 1 to 4.

10. A computer readable storage medium storing at least one instruction that, when executed by a processor, implements the method for agricultural light complementation operation and maintenance of a sandy land photovoltaic power plant according to any one of claims 1 to 4.