Background
In soybean cultivation in China, diseases and insect pests are the main causes of reduced yield and degraded grain quality. The susceptibility to such damage stems largely from the characteristics of the soybean plant itself.
Among soybean diseases, sheath blight, sclerotinia, gray leaf spot and root rot are the main types. Gray leaf spot is the most common soybean disease and is recognized as a worldwide disease. In China, gray leaf spot occurs mainly in the soybean-producing areas of the three northeastern provinces; once plants are infected, yield can drop severely, with losses generally ranging from 10 to 50 percent, which has a serious impact on the Chinese economy. Gray leaf spot mainly affects the leaves and seeds of soybeans, and the appearance of the leaves can change greatly once the plant is infected.
Among soybean pests, the common types include budworms, aphids and red spiders. In northeast China, a main soybean-producing area, aphids cause the most serious damage; infested leaves typically curl and carry a clear, viscous secretion. After the soybean blooms, the fullness of the pods affected by aphids is reduced, which directly lowers the soybean yield. Moreover, a heavy aphid infestation can kill the plants.
With the continuous development of electronic technology, hyperspectral remote sensing has gradually been applied to crop nutrient diagnosis, classification and identification, quality assessment and other tasks. A hyperspectral image consists of images in hundreds of bands, which makes it possible to identify or detect materials at a fine-grained level, especially materials whose features look very similar to the eye. The classification of hyperspectral images generally consists of the following steps: preprocessing (denoising), dimensionality reduction, feature extraction, and final classification. Among them, the feature extraction stage is receiving wide attention. Over the past decades, hand-crafted features have been widely used in the feature extraction stage; such methods work well when the sample size is small, but their effectiveness gradually declines as the sample size grows.
With the development of deep learning, neural networks are increasingly used for image feature extraction, but conventional convolutional neural networks (CNNs) do not perform well enough in extracting global features.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a soybean pest and disease identification method based on hyperspectral images, which uses a model with a Transformer as the backbone network and avoids over-fitting by means of curriculum learning in the training stage. The specific technical scheme is as follows:
A soybean pest and disease identification method based on hyperspectral images comprises the following steps:
step one, capturing hyperspectral images and corresponding RGB images at different heights with a hyperspectral camera and an RGB camera carried by an unmanned aerial vehicle, to obtain an acquired hyperspectral data set and a corresponding acquired RGB data set;
step two, performing data augmentation on the hyperspectral data set acquired in step one based on an open-source RGB data set;
step three, performing plant region segmentation on the images in the open-source RGB data set and in the acquired RGB data set to obtain mask images, multiplying each mask image pixel by pixel with the corresponding hyperspectral image to obtain an image containing the plant region, and then performing preprocessing and calculating an average spectral characteristic curve for each category;
step four, inputting the augmented hyperspectral data set into a soybean pest and disease identification network model, and training the model using curriculum learning and the average spectral characteristic curves of the categories obtained in step three, to obtain a trained soybean pest and disease identification network model;
step five, using the trained soybean pest and disease identification network model to predict and classify acquired input hyperspectral images, and outputting the finally predicted pest or disease type.
Further, step two specifically comprises the following substeps:
step 2.1, loading a hyperspectral image reconstruction network that has been trained on an open-source RGB data set, wherein the hyperspectral image reconstruction network adopts the MST++ algorithm;
step 2.2, inputting soybean plant RGB images collected from the internet into the loaded hyperspectral image reconstruction network, obtaining and storing the corresponding reconstructed hyperspectral images as a generated hyperspectral data set, and merging the generated hyperspectral data set into the hyperspectral data set acquired by the unmanned aerial vehicle to form a total hyperspectral data set, wherein each image in the data set is labeled with its category, and the number of distinct labels equals the number of categories of image data;
step 2.3, randomly flipping and cropping all hyperspectral images and corresponding RGB images in the total hyperspectral data set.
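The random flipping and cropping of step 2.3 can be sketched as follows. This is a minimal illustration in plain Python, not the implementation of the invention: the function name `random_flip_and_crop`, the nested-list image layout `[row][column][channel]`, the 50 percent flip probability and the crop size are all assumptions made for the example. The one point taken from the text is that each hyperspectral image has a corresponding RGB image, so the sketch applies the same flip and the same crop window to both.

```python
import random


def random_flip_and_crop(hsi, rgb, crop_h, crop_w, rng=None):
    """Apply one random flip and one random crop to an image pair.

    hsi and rgb are nested lists indexed [row][column][channel] with the
    same height and width (hypothetical layout chosen for this sketch).
    The same transform is used for both images so that their pixels stay
    in one-to-one correspondence.
    """
    rng = rng or random.Random()
    height, width = len(hsi), len(hsi[0])
    if len(rgb) != height or len(rgb[0]) != width:
        raise ValueError("hyperspectral and RGB images must have the same size")
    if crop_h > height or crop_w > width:
        raise ValueError("crop window is larger than the image")

    # Draw the random parameters once, then reuse them for both images.
    flip_horizontal = rng.random() < 0.5
    flip_vertical = rng.random() < 0.5
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)

    def transform(image):
        rows = image[::-1] if flip_vertical else image
        rows = [row[::-1] if flip_horizontal else row for row in rows]
        return [row[left:left + crop_w] for row in rows[top:top + crop_h]]

    return transform(hsi), transform(rgb)
```

Drawing the flip flags and the crop offsets before transforming either image is the design choice that matters here: transforming the two images with independent random draws would break the pixel correspondence that the later mask multiplication relies on.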
Further, step three specifically comprises the following substeps:
step 3.1, segmenting the soybean plant region in each RGB image with an existing image segmentation algorithm to obtain a Mask image of the soybean plant region;
step 3.2, multiplying the Mask image pixel by pixel with the image of each spectral band of the corresponding hyperspectral image, to obtain an image containing only the soybean plant region of the hyperspectral image;
step 3.3, normalizing the image containing only the soybean plant region according to the expression:
$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
wherein $x$ is the input image, namely the image containing only the soybean plant region; $x_{\mathrm{norm}}$ is the output normalized image; $x_{\max}$ and $x_{\min}$ are respectively the maximum pixel value and the minimum pixel value of the input image $x$; after normalization, the image size is reduced to $w \times h$, where $w$ is the width and $h$ is the height;
step 3.4, computing the spectral characteristic curves of the size-reduced images in each category, to obtain the average spectral characteristic curve of each category.
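The normalization and size reduction of step 3.3 can be illustrated with a short sketch. It assumes min-max scaling, which is what the description of an output image computed from the input image and its maximum and minimum pixel values suggests, and it assumes that the extremes are taken over the whole image cube rather than per band. The nearest-neighbour resize, the function names and the nested-list layout `[row][column][band]` are likewise assumptions for illustration; the text does not state which interpolation is used.

```python
def min_max_normalize(image):
    """Scale all values of an image cube to the range [0, 1].

    image is a nested list indexed [row][column][band]. The minimum and
    maximum are taken over the whole cube (an assumption of this sketch).
    """
    values = [v for row in image for pixel in row for v in pixel]
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A constant image has no contrast; return zeros instead of dividing by 0.
        return [[[0.0 for _ in pixel] for pixel in row] for row in image]
    return [[[(v - lowest) / span for v in pixel] for pixel in row] for row in image]


def resize_nearest(image, w, h):
    """Reduce an image to width w and height h by nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[i * src_h // h][j * src_w // w] for j in range(w)]
        for i in range(h)
    ]
```

Because the masked background is zero, the minimum of a masked image is usually 0, so the scaling is then driven by the brightest plant pixel.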
Further, step 3.4 specifically comprises: calculating the average pixel value of one spectral band of the size-reduced image, calculating the average pixel values of all remaining spectral bands in the same way, and arranging these values in band order to form the spectral characteristic curve of the image; then collecting all spectral characteristic curves within each category and averaging them, finally obtaining one average spectral characteristic curve per category.
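As a worked illustration of step 3.4, the sketch below computes the spectral characteristic curve of one image as the per-band mean pixel value and then averages the curves of all images that share a label. The function names, the label strings and the nested-list layout `[row][column][band]` are hypothetical. Following the wording of the text, the mean is taken over every pixel of the image; averaging only over plant pixels would be a variation that the text does not describe.

```python
def spectral_curve(image):
    """Return the mean pixel value of each band, in band order."""
    n_bands = len(image[0][0])
    n_pixels = len(image) * len(image[0])
    sums = [0.0] * n_bands
    for row in image:
        for pixel in row:
            for band, value in enumerate(pixel):
                sums[band] += value
    return [s / n_pixels for s in sums]


def class_average_curves(curves, labels):
    """Average the spectral curves of each category.

    Returns a dict mapping each label to its average spectral curve.
    """
    grouped = {}
    for curve, label in zip(curves, labels):
        grouped.setdefault(label, []).append(curve)
    return {
        label: [sum(band_values) / len(band_values) for band_values in zip(*group)]
        for label, group in grouped.items()
    }


# Usage with three tiny two-band images (labels are illustrative only).
images = [
    [[[1.0, 2.0], [3.0, 4.0]]],
    [[[3.0, 4.0], [5.0, 6.0]]],
    [[[0.0, 10.0]]],
]
labels = ["gray_leaf_spot", "gray_leaf_spot", "aphid"]
averages = class_average_curves([spectral_curve(im) for im in images], labels)
```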
Further, the soybean pest and disease identification network model takes a Transformer as the backbone network and comprises a spectral characteristic curve extraction module and a classification prediction module; the spectral characteristic curve extraction module extracts the spectral characteristic curve of the soybean plant region image, and a loss function is then calculated between this curve and the average spectral characteristic curve of the same category; the classification prediction module classifies the extracted spectral characteristic curves.
Further, the curriculum learning gradually increases, over the training iterations, the weight of the loss terms that are difficult to train. The overall loss is a weighted combination of a classification loss function, a negative Pearson correlation coefficient, a time-domain loss function and a frequency-domain loss function, wherein:
the weights of the combination are the curriculum learning weights, which are determined by an initial weight, the current iteration number and the total iteration number;
the negative Pearson correlation coefficient is obtained by calculating the Pearson correlation coefficient between the spectral characteristic curve of the current hyperspectral image and the average spectral characteristic curve of the corresponding category, and taking its negative;
the frequency-domain loss function is obtained by taking the Fourier transform spectrum of the spectral characteristic curve of the current hyperspectral image and of the average spectral characteristic curve of the corresponding category, and then taking the mean absolute error between the two spectra.
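The two curve-matching loss terms described above can be sketched in plain Python. The negative Pearson term follows the description directly. For the frequency-domain term the sketch assumes that "spectrum" means the magnitude of the discrete Fourier transform, which the text does not specify, and it uses a direct DFT for self-containedness where a real implementation would use an FFT routine. The function names are hypothetical.

```python
import cmath
import math


def negative_pearson(curve, reference):
    """Negative Pearson correlation between a curve and a reference curve."""
    n = len(curve)
    mean_x = sum(curve) / n
    mean_y = sum(reference) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(curve, reference))
    var_x = sum((x - mean_x) ** 2 for x in curve)
    var_y = sum((y - mean_y) ** 2 for y in reference)
    denominator = math.sqrt(var_x * var_y)
    if denominator == 0:
        # Correlation is undefined for a constant curve; 0.0 is an assumed fallback.
        return 0.0
    return -covariance / denominator


def dft_magnitude(curve):
    """Magnitude spectrum of a real sequence via a direct O(n^2) DFT."""
    n = len(curve)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(curve)))
        for k in range(n)
    ]


def frequency_loss(curve, reference):
    """Mean absolute error between the magnitude spectra of two curves."""
    spectrum_a = dft_magnitude(curve)
    spectrum_b = dft_magnitude(reference)
    return sum(abs(a - b) for a, b in zip(spectrum_a, spectrum_b)) / len(spectrum_a)
```

A curve that matches its class average in shape gives a negative Pearson value close to -1, so minimizing this term pushes the extracted curve towards the shape of the class average, while the frequency-domain term penalizes differences in the spectra of the two curves.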
A soybean pest and disease identification device based on hyperspectral images comprises one or more processors configured to implement the soybean pest and disease identification method based on hyperspectral images.
A computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the hyperspectral image-based soybean pest identification method.
Beneficial effects:
first, at the data level: images are captured by the unmanned aerial vehicle at different heights to obtain images at different scales, which improves the generalization ability of the model; in addition, data augmentation is performed through a hyperspectral image reconstruction network based on the MST++ algorithm, which reduces the cost of acquiring hyperspectral data and ensures both a sufficient data volume and balance among the categories;
second, at the model level: a structure with a Transformer as the backbone network is used, which strengthens the extraction of global features; curriculum learning is used during training, which avoids over-fitting and accelerates convergence;
finally, the loss function takes both the frequency domain and the time domain into account, so that the prediction, classification and identification accuracy of the model is higher.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments of the specification.
As shown in figures 1 and 2, the soybean pest and disease identification method based on the hyperspectral image comprises the following steps:
Step one, capturing hyperspectral images and corresponding RGB images at different heights with a hyperspectral camera and an RGB camera carried by an unmanned aerial vehicle, to obtain an acquired hyperspectral data set and a corresponding RGB data set.
In the embodiment of the invention, the flight heights of the unmanned aerial vehicle are 3 meters, 5 meters and 7 meters respectively; the captured hyperspectral image and RGB image are required to have the same size, with their pixels in one-to-one correspondence; after the hyperspectral image and the RGB image are obtained, the hyperspectral image is stored as a mat file (the standard binary data storage format of MATLAB).
Step two, performing data augmentation on the hyperspectral data set acquired in step one based on an open-source RGB data set.
Hyperspectral data is costly and time-consuming to acquire; so that the model can be fully trained, a hyperspectral image reconstruction network is first used to reconstruct additional hyperspectral images from open-source RGB images; specifically, step two comprises the following substeps:
step 2.1, loading a hyperspectral image reconstruction network (the MST++ model) that has been trained on an open-source RGB data set, wherein the hyperspectral image reconstruction network adopts the MST++ algorithm;
step 2.2, inputting soybean plant RGB images collected from the internet into the loaded MST++ model, obtaining the corresponding reconstructed hyperspectral images and storing them as mat files to form a generated hyperspectral data set, and merging the generated hyperspectral data set into the hyperspectral data set previously acquired by the unmanned aerial vehicle to form a total hyperspectral data set, wherein each image in the data set is labeled with its category, and the number of distinct labels equals the number of categories of image data;
step 2.3, randomly flipping and cropping all hyperspectral images and corresponding RGB images in the total hyperspectral data set, to further increase the data volume.
Step three, performing plant region segmentation on the images in the open-source RGB data set and in the acquired RGB data set to obtain mask images, multiplying each mask image pixel by pixel with the corresponding hyperspectral image to obtain an image containing the plant region, and then performing preprocessing and calculating an average spectral characteristic curve for each category.
Specifically, the method comprises the following substeps:
step 3.1, segmenting the soybean plant region in each RGB image with an existing image segmentation algorithm to obtain a Mask image of the soybean plant region. In the Mask image, the pixel value of the area containing soybean is 255, and the pixel value of all other areas is 0;
step 3.2, multiplying the Mask image pixel by pixel with the image of each spectral band of the corresponding hyperspectral image, to obtain an image containing only the soybean plant region of the hyperspectral image;
step 3.3, normalizing the image containing only the soybean plant region according to the expression:
$$x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
wherein $x$ is the input image, namely the image containing only the soybean plant region; $x_{\mathrm{norm}}$ is the output normalized image; $x_{\max}$ and $x_{\min}$ are respectively the maximum pixel value and the minimum pixel value of the input image $x$; after normalization, the image size is reduced to $w \times h$, where $w$ is the width and $h$ is the height;
step 3.4, computing the spectral characteristic curves of the size-reduced images in each category, specifically: calculating the average pixel value of one spectral band of the size-reduced image, calculating the average pixel values of all remaining spectral bands in the same way, and arranging these values in band order to form the spectral characteristic curve of the image; then collecting all spectral characteristic curves within each category and averaging them, finally obtaining one average spectral characteristic curve per category, as shown in fig. 3.
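The mask multiplication of step 3.2 can be sketched as below, using the 255/0 Mask convention stated in step 3.1. The function name and the nested-list layout are hypothetical. The sketch converts the Mask to 1/0 before multiplying, which is an assumption rather than something the text prescribes.

```python
def apply_plant_mask(hsi, mask):
    """Keep only the plant pixels of a hyperspectral image.

    hsi is indexed [row][column][band]; mask is indexed [row][column] and
    holds 255 for soybean plant pixels and 0 elsewhere. Every band of a
    pixel is multiplied by the same binarized mask value.
    """
    if len(mask) != len(hsi) or len(mask[0]) != len(hsi[0]):
        raise ValueError("mask and hyperspectral image must have the same size")
    return [
        [
            [value * (1 if m == 255 else 0) for value in pixel]
            for pixel, m in zip(hsi_row, mask_row)
        ]
        for hsi_row, mask_row in zip(hsi, mask)
    ]


# Usage: a 2 x 2 image with two bands; only the diagonal pixels are plant.
cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
plant_mask = [[255, 0], [0, 255]]
plant_only = apply_plant_mask(cube, plant_mask)
```

Multiplying by 255 directly would only rescale the plant pixels by a constant factor, which a subsequent min-max normalization removes; binarizing first simply keeps the plant pixels at their original values.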
Step four, inputting the augmented hyperspectral data set into a soybean pest and disease identification network model, and training the model using curriculum learning and the average spectral characteristic curves of the categories obtained in step three, to obtain the trained soybean pest and disease identification network model.
The images of the augmented hyperspectral data set are input into the soybean pest and disease identification network in batches; the main structure of the soybean pest and disease identification network comprises a spectral characteristic curve extraction module and a classification prediction module.
The spectral characteristic curve extraction module extracts the spectral characteristic curve of the soybean plant region image through a neural network, the spectral characteristic curves of soybean leaves differing between pests and diseases; the final output of this module is the spectral characteristic curve of the input image, and a loss function is then calculated between this curve and the average spectral characteristic curve of the same category, as described below.
The classification prediction module classifies the extracted spectral characteristic curves, the number of output classes being equal to the number of categories in the data set.
In the training stage, the main strategy adopted is curriculum learning: as training progresses, the weight of the loss terms that are less well trained is gradually increased, so that the model learns from easy to difficult, as in following a course of study, which accelerates model convergence; after N iterations of training, once the model has converged, the trained model is saved. The overall loss is a weighted combination of a classification loss function, a negative Pearson correlation coefficient, and a time-domain plus frequency-domain loss function, wherein:
the weights of the combination are the curriculum learning weights, which are determined by an initial weight, the current iteration number and the total iteration number;
the negative Pearson correlation coefficient is obtained by calculating the Pearson correlation coefficient between the spectral characteristic curve of the current hyperspectral image and the average spectral characteristic curve of the corresponding category, and taking its negative;
the frequency-domain loss function is obtained by first taking the Fourier transform spectrum of the spectral characteristic curve of the current hyperspectral image and of the average spectral characteristic curve of the corresponding category, and then taking the mean absolute error between the two spectra.
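To make the idea of a curriculum weight concrete, the following sketch shows one possible schedule. It is an assumption for illustration only: the text states that the weight depends on an initial weight, the current iteration number and the total iteration number and that it increases during training, but the exponential form, the `growth` factor, the default values and the way the terms are combined in `total_loss` are all hypothetical and are not taken from the invention.

```python
def curriculum_weight(initial_weight, current_iter, total_iter, growth=5.0):
    """ASSUMED schedule: the weight rises smoothly from initial_weight
    at iteration 0 to initial_weight * growth at the last iteration."""
    if total_iter <= 0:
        raise ValueError("total_iter must be positive")
    progress = min(max(current_iter / total_iter, 0.0), 1.0)
    return initial_weight * growth ** progress


def total_loss(cls_loss, neg_pearson, freq_loss, current_iter, total_iter,
               initial_weight=0.1):
    """ASSUMED combination: the classification loss has a fixed weight of 1,
    and the curve-matching terms share one curriculum weight."""
    weight = curriculum_weight(initial_weight, current_iter, total_iter)
    return cls_loss + weight * (neg_pearson + freq_loss)
```

Under this assumed schedule the classification term dominates early in training, and the curve-matching terms gain influence as the iteration count approaches the total.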
Step five, using the trained soybean pest and disease identification network model to predict and classify acquired input hyperspectral images, and outputting the finally predicted pest or disease type.
First, the model is switched to evaluation (eval) mode, and the saved model file is then loaded; in the inference stage, only hyperspectral images are input into the identification network; the images are normalized and resized before input, and are then fed into the classification prediction network to obtain the final output category.
Corresponding to the embodiment of the soybean disease and insect pest identification method based on the hyperspectral image, the invention also provides an embodiment of a soybean disease and insect pest identification device based on the hyperspectral image.
Referring to fig. 4, the soybean pest and disease identification device based on the hyperspectral image provided by the embodiment of the invention comprises one or more processors, and is used for realizing the soybean pest and disease identification method based on the hyperspectral image in the embodiment.
The soybean pest and disease identification device based on hyperspectral images can be applied to any equipment with data processing capability, such as a computer. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the device in a logical sense is formed by the processor of the equipment where it is located reading the corresponding computer program instructions from the non-volatile memory into the memory and running them. In terms of hardware, fig. 4 shows a hardware structure diagram of the equipment with data processing capability where the soybean pest and disease identification device based on hyperspectral images is located; in addition to the processor, memory, network interface and non-volatile memory shown in fig. 4, the equipment where the device is located may also include other hardware according to its actual function, which is not described again here.
The specific details of the implementation process of the functions and actions of each unit in the above device are the implementation processes of the corresponding steps in the above method, and are not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present invention. One of ordinary skill in the art can understand and implement it without inventive effort.
The embodiment of the invention also provides a computer readable storage medium, wherein a program is stored on the computer readable storage medium, and when the program is executed by a processor, the soybean pest and disease identification method based on the hyperspectral image in the embodiment is realized.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any data processing device described in any previous embodiment. The computer readable storage medium may also be an external storage device such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, a Flash memory Card (Flash Card), etc. provided on the device. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of any data processing capable device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the arbitrary data processing capable device, and may also be used for temporarily storing data that has been output or is to be output.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way. Although the foregoing has described the practice of the present invention in detail, it will be apparent to those skilled in the art that modifications may be made to the practice of the invention as described in the foregoing examples, or that certain features may be substituted in the practice of the invention. All changes, equivalents and the like which come within the spirit and principles of the invention are desired to be protected.