Disclosure of Invention
In order to solve the technical problems of high operation complexity, low identification efficiency and poor positioning accuracy in prior-art methods for identifying agricultural targets, embodiments of the present invention provide a target identification method based on a lightweight network, a target identification system based on the lightweight network, and an agricultural machine.
In order to achieve the above object, an embodiment of the present invention provides a target identification method based on a lightweight network, where the identification method includes: establishing a target database, wherein the target database comprises a target image; performing a data preprocessing operation on the target image to obtain corresponding preprocessed data; establishing a lightweight network model, and processing the preprocessed data based on the lightweight network model to obtain corresponding processed data; and acquiring frame information, and analyzing the processed data based on the frame information to obtain target identification information.
Preferably, the establishing a target database includes: acquiring a target image; performing effectiveness screening on the target image to obtain a screened image; labeling the screened image to obtain annotation information corresponding to the screened image, wherein the annotation information comprises first target information and second target information; and establishing a target database based on the screened image and the annotation information.
Preferably, the performing a data preprocessing operation on the target image to obtain corresponding preprocessed data includes: performing a first data enhancement operation on the screened image to obtain first enhancement data; executing a second data enhancement operation on the first enhancement data to obtain second enhancement data; and processing the second enhancement data according to a preset requirement to obtain the preprocessed data.
Preferably, the processing the second enhancement data according to a preset requirement to obtain the preprocessed data includes: acquiring preset picture format information; performing format processing on the second enhanced data based on the preset picture format information to obtain formatted data; and carrying out normalization processing on the formatted data to obtain the preprocessed data.
Preferably, the establishing a lightweight network model, processing the preprocessed data based on the lightweight network model, and obtaining corresponding processed data includes: acquiring a basic network architecture, wherein the basic network architecture is a network architecture based on a lightweight design; acquiring first channel information, and performing data expansion processing on the preprocessed data based on the first channel information in the basic network architecture to obtain expanded data; acquiring second channel information, and performing feature extraction processing on the expanded data based on the second channel information in the basic network architecture to obtain extracted data; and performing a fusion operation on the extracted data based on the first channel information in the basic network architecture to obtain fused data, and taking the fused data as the processed data.
Preferably, the annotation information further includes screened size information of the screened image, and the identification method further includes: after the frame information is obtained, generating corresponding size statistical information based on the screened size information; and adjusting the frame information based on the basic network architecture and the size statistical information to obtain the adjusted frame information.
Preferably, the analyzing the processed data based on the frame information to obtain target identification information includes: analyzing the processed data based on the frame information to obtain a plurality of candidate prediction frames; screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame, the screened prediction frame serving as an analysis result; and extracting classification information and positioning information of each screened prediction frame, and generating target identification information corresponding to the screened prediction frames based on the classification information and the positioning information.
Preferably, the identification method further comprises: generating loss calculation information based on the frame information, the second channel information and a preset loss calculation algorithm before analyzing the processed data based on the frame information; and optimizing the frame information based on the loss calculation information to obtain the optimized frame information.
Preferably, the optimizing the frame information based on the loss calculation information to obtain the optimized frame information includes: judging whether the loss calculation information is larger than a preset loss threshold value or not; when the loss calculation information is less than or equal to the preset loss threshold: processing the second channel information based on the loss calculation information to obtain processed second channel information; optimizing the frame information based on the processed second channel information to obtain optimized frame information; the identification method further comprises the following steps: in the case that the loss calculation information is greater than the preset loss threshold: judging whether the screened image corresponding to the loss calculation information is a qualified image; if the screened image corresponding to the loss calculation information is a qualified image, adjusting the annotation information corresponding to the screened image based on the loss calculation information to obtain adjusted annotation information; and if not, deleting the screened image corresponding to the loss calculation information.
Preferably, the screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame includes: screening the candidate prediction frames based on a non-maximum suppression algorithm to obtain first-processed prediction frames; screening the first-processed prediction frames based on a score threshold to obtain second-processed prediction frames; screening the second-processed prediction frames based on a minimum size threshold to obtain third-processed prediction frames; and judging whether there are overlapping third-processed prediction frames whose overlapping area is larger than a maximum overlap threshold; if so, acquiring target score information of the overlapping third-processed prediction frames, deleting the overlapping third-processed prediction frame with smaller target score information, and obtaining at least one screened prediction frame; otherwise, taking the third-processed prediction frames as the screened prediction frames.
Accordingly, the present invention also provides a target identification system based on a lightweight network, the identification system comprising: a library construction unit, wherein the library construction unit is used for establishing a target database, and the target database comprises a target image; a preprocessing unit, which is used for performing a data preprocessing operation on the target image to obtain corresponding preprocessed data; a lightweight network unit, which is used for establishing a lightweight network model and processing the preprocessed data based on the lightweight network model to obtain corresponding processed data; and an identification unit, which is used for acquiring frame information and analyzing the processed data based on the frame information to obtain target identification information.
Preferably, the library construction unit includes: the image acquisition module is used for acquiring a target image; the effectiveness screening module is used for carrying out effectiveness screening on the target image to obtain a screened image; the labeling module is used for labeling the screened image to obtain annotation information corresponding to the screened image, the annotation information comprising first target information and second target information; and the library establishing module is used for establishing a target database based on the screened images and the annotation information.
Preferably, the preprocessing unit includes: the first enhancement module is used for executing a first data enhancement operation on the screened image to obtain first enhancement data; the second enhancement module is used for executing a second data enhancement operation on the first enhancement data to obtain second enhancement data; and the preprocessing module is used for processing the second enhancement data according to a preset requirement to obtain the preprocessed data.
Preferably, the processing the second enhancement data according to a preset requirement to obtain the preprocessed data includes: acquiring preset picture format information; performing format processing on the second enhanced data based on the preset picture format information to obtain formatted data; and carrying out normalization processing on the formatted data to obtain the preprocessed data.
Preferably, the lightweight network unit comprises: the basic network module is used for acquiring a basic network architecture, and the basic network architecture is a network architecture based on lightweight design; the expansion module is used for acquiring first channel information, and performing data expansion processing on the preprocessed data based on the first channel information in the basic network architecture to obtain expanded data; the extraction module is used for acquiring second channel information, and performing feature extraction processing on the expanded data based on the second channel information in the basic network architecture to obtain extracted data; and the fusion module is used for executing fusion operation on the extracted data in the basic network architecture based on the first channel information to obtain fused data, and taking the fused data as the processed data.
Preferably, the annotation information further includes screened size information of the screened image, and the identification unit includes: the statistical module is used for generating corresponding size statistical information based on the screened size information after the frame information is obtained; and the adjusting module is used for adjusting the frame information based on the basic network architecture and the size statistical information to obtain the adjusted frame information.
Preferably, the identification unit includes: the analysis module is used for analyzing the processed data based on the frame information to obtain a plurality of candidate prediction frames; the screening module is used for screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame; and the identification module is used for extracting the classification information and the positioning information of each screened prediction frame and generating target identification information corresponding to the screened prediction frames based on the classification information and the positioning information.
Preferably, the identification unit further comprises: a loss calculation module, configured to generate loss calculation information based on the frame information, the second channel information, and a preset loss calculation algorithm before analyzing the processed data based on the frame information; and a frame optimization module, configured to optimize the frame information based on the loss calculation information to obtain the optimized frame information.
Preferably, the optimizing the frame information based on the loss calculation information to obtain the optimized frame information includes: judging whether the loss calculation information is larger than a preset loss threshold; when the loss calculation information is less than or equal to the preset loss threshold: processing the second channel information based on the loss calculation information to obtain processed second channel information; and optimizing the frame information based on the processed second channel information to obtain the optimized frame information; the identification system further performs the following: in the case that the loss calculation information is greater than the preset loss threshold: judging whether the screened image corresponding to the loss calculation information is a qualified image; if the screened image corresponding to the loss calculation information is a qualified image, adjusting the annotation information corresponding to the screened image based on the loss calculation information to obtain adjusted annotation information; and if not, deleting the screened image corresponding to the loss calculation information.
Preferably, the screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame includes: screening the candidate prediction frames based on a non-maximum suppression algorithm to obtain first-processed prediction frames; screening the first-processed prediction frames based on score threshold information to obtain second-processed prediction frames; screening the second-processed prediction frames based on a minimum size threshold to obtain third-processed prediction frames; and judging whether there are overlapping third-processed prediction frames whose overlapping area is larger than a maximum overlap threshold; if so, acquiring target score information of the overlapping third-processed prediction frames, deleting the overlapping third-processed prediction frame with smaller target score information, and obtaining at least one screened prediction frame; otherwise, taking the third-processed prediction frames as the screened prediction frames.
In another aspect, the present invention also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method provided by the present invention.
In another aspect, the present invention also provides an agricultural machine including an identification system provided by the present invention and/or a computer-readable storage medium provided by the present invention.
Through the technical solutions provided by the invention, the invention has at least the following technical effects:
through the above technical solutions, an agricultural target is identified and analyzed by constructing a lightweight network architecture, which greatly reduces the operation complexity and the amount of computation. Meanwhile, the image data is augmented by innovatively combining a generative adversarial network, which effectively improves the diversity and effectiveness of the data. Further, the training model is continuously optimized through the annotation information during the identification process, so that the identification accuracy is greatly improved.
Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows.
Detailed Description
In order to solve the technical problems of high operation complexity, low identification efficiency and poor positioning accuracy in prior-art methods for identifying agricultural targets, embodiments of the present invention provide a target identification method based on a lightweight network, a target identification system based on the lightweight network, and an agricultural machine.
Agricultural target recognition in the present invention refers to the recognition of objects relevant to agricultural working machinery, including weed recognition, pest recognition, crop lodging recognition, harvest object recognition, bundling object recognition, fertilization object recognition, crop growth situation recognition, and the like.
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating embodiments of the invention, are given by way of illustration and explanation only, not limitation.
The terms "system" and "network" in embodiments of the present invention may be used interchangeably. The "plurality" means two or more, and in view of this, the "plurality" may also be understood as "at least two" in the embodiments of the present invention. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" generally indicates that the preceding and following related objects are in an "or" relationship, unless otherwise specified. In addition, it should be understood that the terms first, second, etc. in the description of the embodiments of the invention are used for distinguishing between the descriptions and are not intended to indicate or imply relative importance or order to be construed.
Referring to fig. 1, an embodiment of the present invention provides a target identification method based on a lightweight network, where the identification method includes:
S10) establishing a target database, wherein the target database comprises target images;
S20) executing a data preprocessing operation on the target image to obtain corresponding preprocessed data;
S30) establishing a lightweight network model, and processing the preprocessed data based on the lightweight network model to obtain corresponding processed data;
S40) obtaining frame information, and analyzing the processed data based on the frame information to obtain target identification information.
In an embodiment of the present invention, the establishing a target database includes: acquiring a target image; performing effectiveness screening on the target image to obtain a screened image; labeling the screened image to obtain annotation information corresponding to the screened image, wherein the annotation information comprises first target information and second target information; and establishing a target database based on the screened image and the annotation information.
In a possible implementation, in the process of identifying weeds in a certain farmland or farm area, a database of agricultural targets needs to be established first; for example, the target database is a database of crop weeds. In the embodiment of the present invention, for example, a technician operates a plant protection machine to fly automatically above the farmland area to be identified and uses a camera arranged on the plant protection machine to shoot farmland images of the farmland. After the farmland images are obtained, validity screening needs to be performed on the preliminarily obtained farmland images. For example, in the embodiment of the present invention, the technician screens each obtained farmland image through manual inspection so as to delete farmland images that include non-farmland scenes, have a high degree of repetition, are blurred, or contain no target crops, so that images meeting the identification requirements or identification scenes are retained. The technician then labels each screened image, for example by marking the crops and weeds in each image with label boxes and annotating each label box with corresponding information such as size information.
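As a non-limiting illustration of how one entry in such a target database might be organized, the following Python sketch stores a screened image path together with its label boxes and target categories. The field names and values (image_path, boxes, labels, image_size) are illustrative assumptions and do not reflect a format mandated by the invention.

# Hypothetical example of one target-database record after screening and labeling.
record = {
    "image_path": "field_0001.jpg",          # screened farmland image
    "image_size": (1920, 1080),              # (width, height) in pixels
    "boxes": [                               # label boxes as (x_min, y_min, x_max, y_max)
        (104, 220, 188, 310),
        (540, 400, 602, 455),
    ],
    "labels": ["crop", "weed"],              # first target / second target information
}

def box_size(box):
    """Return (width, height) of a label box; useful later for size statistics."""
    x_min, y_min, x_max, y_max = box
    return x_max - x_min, y_max - y_min

print([box_size(b) for b in record["boxes"]])  # [(84, 90), (62, 55)]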
Referring to fig. 2, in the embodiment of the present invention, the performing a data preprocessing operation on the target image to obtain corresponding preprocessed data includes:
S201) performing a first data enhancement operation on the screened image to obtain first enhancement data;
S202) executing a second data enhancement operation on the first enhancement data to obtain second enhancement data;
S203) processing the second enhanced data according to a preset requirement to obtain the preprocessed data.
In one possible embodiment, after the creation of the agricultural target database is completed (for example, the agricultural target database is a database of lodged wheat), the preprocessing operation on the screened images in the lodged-wheat database is started. First, a preset operation probability is acquired, and whether to perform the first data enhancement operation on each screened image in the lodged-wheat database is then determined according to this preset operation probability. For example, the preset operation probability is 33% and the database includes 35 screened images; the 35 screened images are then sequentially subjected to a randomized first data enhancement operation according to the preset operation probability.
For example, for the 1st screened image, the randomly selected action is not to perform the first data enhancement operation, so the 1st screened image is skipped; for the 2nd screened image, the randomly selected action is to perform the first data enhancement operation, and an operation such as left-right flipping, cropping, brightness transformation or color transformation is further selected at random to enhance the 2nd screened image, thereby obtaining an enhanced screened image. This continues until all screened images have completed the first data enhancement operation according to the preset operation probability. Finally, the second enhancement data is processed according to a preset requirement; for example, the preset requirement is the input image size required by the lightweight network, that is, the second enhancement data is cropped to this size to obtain the final preprocessed data.
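As a non-limiting sketch of the probabilistic first data enhancement operation described above, the following Python function applies one randomly chosen basic transformation (left-right flip, crop, brightness change or color change) with a preset operation probability. The Pillow-based helpers, the crop margins and the enhancement factors are illustrative assumptions.

import random
from PIL import Image, ImageEnhance

def first_enhancement(image: Image.Image, p: float = 0.33) -> Image.Image:
    """With probability p, apply one randomly chosen basic augmentation;
    otherwise return the screened image unchanged (i.e. the image is skipped)."""
    if random.random() >= p:
        return image
    op = random.choice(["flip", "crop", "brightness", "color"])
    if op == "flip":
        return image.transpose(Image.Transpose.FLIP_LEFT_RIGHT)
    if op == "crop":
        w, h = image.size
        return image.crop((w // 10, h // 10, w * 9 // 10, h * 9 // 10))
    if op == "brightness":
        return ImageEnhance.Brightness(image).enhance(random.uniform(0.7, 1.3))
    return ImageEnhance.Color(image).enhance(random.uniform(0.7, 1.3))

# first_enhanced = [first_enhancement(img) for img in screened_images]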
In traditional agricultural target identification methods, the acquired image is usually fed directly into image recognition or deep learning to generate an identification result for the agricultural target. The input images in this approach are usually captured in a single shooting mode or acquired through similar channels, so the input data often suffers from insufficient diversity and insufficient effectiveness.
Further, in this embodiment of the present invention, the processing the second enhancement data according to a preset requirement to obtain the preprocessed data includes: acquiring preset picture format information; performing format processing on the second enhanced data based on the preset picture format information to obtain formatted data; and carrying out normalization processing on the formatted data to obtain the preprocessed data.
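A minimal sketch of this formatting and normalization step might look as follows, assuming the preset picture format is a fixed-size RGB image and the normalization maps pixel values into [0, 1]; the target size of 224 x 224 is an illustrative assumption.

import numpy as np
from PIL import Image

TARGET_SIZE = (224, 224)   # assumed input size required by the lightweight network

def format_and_normalize(image: Image.Image) -> np.ndarray:
    """Convert to the preset picture format (RGB, fixed size) and normalize to [0, 1]."""
    formatted = image.convert("RGB").resize(TARGET_SIZE)
    return np.asarray(formatted, dtype=np.float32) / 255.0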
In a possible implementation, the crop growth situation of a certain farmland or farm area is identified. In existing agricultural target identification methods, the collected image is greatly affected by the ambient illumination of the identification scene; for example, in the embodiment of the present invention, crop images collected under different ambient illumination of the current identification scene vary to some extent, which in turn affects the subsequent identification and analysis of the crop growth situation. To solve this technical problem, a further data enhancement operation is performed in step S202 on the first enhancement data obtained in step S201. First, a Generative Adversarial Network (GAN) is created, which includes a first training model and a second training model. The first training model extracts and identifies the color and form distribution in the first enhancement data according to preset identification parameters and generates new picture data from the extracted color and form distribution. The second training model then judges, from the new picture data, the probability that the picture data comes from the screened images, and the preset identification parameters of the first training model are optimized according to this probability. Through this mutual game learning between the first training model and the second training model, the data enhancement operation is further performed on the first enhancement data to obtain the second enhancement data.
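For illustration only, the following PyTorch sketch shows the mutual game between a first training model (the generator) and a second training model (the discriminator) on flattened image vectors. The layer sizes, the use of fully connected layers, the optimizers and the learning rates are illustrative assumptions and are far simpler than what a practical GAN for image augmentation would use.

import torch
import torch.nn as nn

generator = nn.Sequential(                 # "first training model": produces new picture data
    nn.Linear(100, 256), nn.ReLU(),
    nn.Linear(256, 3 * 64 * 64), nn.Tanh())
discriminator = nn.Sequential(             # "second training model": probability the data is real
    nn.Linear(3 * 64 * 64, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_batch: torch.Tensor):
    """One round of the mutual game; real_batch holds first-enhancement data flattened to vectors."""
    batch = real_batch.size(0)
    fake = generator(torch.randn(batch, 100))

    # Discriminator: distinguish screened-image data from generated data.
    opt_d.zero_grad()
    loss_d = bce(discriminator(real_batch), torch.ones(batch, 1)) \
           + bce(discriminator(fake.detach()), torch.zeros(batch, 1))
    loss_d.backward()
    opt_d.step()

    # Generator: adjust its parameters so the generated data looks like the screened images.
    opt_g.zero_grad()
    loss_g = bce(discriminator(fake), torch.ones(batch, 1))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()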
In the embodiment of the invention, a GAN is innovatively adopted to further enhance the first enhancement data. This greatly reduces the influence of color difference and shape of the input data on the identification process, effectively avoids the problems of low identification accuracy and large identification deviation caused by large shape differences of the agricultural target arising from external illumination or shooting angle, and greatly improves the accuracy and effectiveness of agricultural target identification.
Referring to fig. 3, in the embodiment of the present invention, the establishing a lightweight network model, and processing the preprocessed data based on the lightweight network model to obtain corresponding processed data includes:
S301) acquiring a basic network architecture, wherein the basic network architecture is a network architecture based on a lightweight design;
S302) acquiring first channel information, and performing data expansion processing on the preprocessed data based on the first channel information in the basic network architecture to obtain expanded data;
S303) acquiring second channel information, and performing feature extraction processing on the expanded data based on the second channel information in the basic network architecture to obtain extracted data;
S304) performing a fusion operation on the extracted data in the basic network architecture based on the first channel information to obtain fused data, and taking the fused data as the processed data.
In order to solve the above technical problems, in a possible implementation, before the input data is identified, a basic network architecture is first constructed. The basic network architecture takes the lightweight MobileNet design as its foundation; specifically, a basic network architecture based on MobileNet V2 is adopted. For an input image of size A×B×C, the number of channels of the basic network is first expanded through a 1×1 convolution kernel, and the expanded channels generate the corresponding expanded data from the preprocessed data. Feature extraction is then performed on the expanded data through a 3×3 convolution kernel, so as to analyze the patterns in the expanded data and obtain the extracted data. Finally, the extracted data in different channels is fused through a 1×1 convolution kernel, thereby obtaining the final fused data, where the fused data is the processed data.
In the embodiment of the invention, by adopting a lightweight network architecture, the preprocessed data is first processed by a 1×1 convolution kernel instead of a 3×3 convolution kernel, which greatly reduces the number of parameters for data processing, lowers the operation complexity and the difficulty of data processing, increases the nonlinear capability of the network, improves operation efficiency and accuracy, and reduces the model training loss.
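A minimal PyTorch sketch of the expand / extract / fuse structure described above is given below, following the familiar MobileNetV2 inverted-residual pattern; the channel counts and the expansion factor are illustrative assumptions rather than the exact configuration of the embodiment.

import torch
import torch.nn as nn

class InvertedResidualSketch(nn.Module):
    """1x1 expansion, 3x3 depthwise feature extraction, 1x1 channel fusion."""
    def __init__(self, in_ch: int, out_ch: int, expand: int = 6):
        super().__init__()
        mid = in_ch * expand
        self.expand = nn.Sequential(          # channel expansion (first channel information)
            nn.Conv2d(in_ch, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU6(inplace=True))
        self.extract = nn.Sequential(         # depthwise feature extraction (second channel information)
            nn.Conv2d(mid, mid, kernel_size=3, padding=1, groups=mid, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU6(inplace=True))
        self.fuse = nn.Sequential(            # fusion of information across channels
            nn.Conv2d(mid, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch))

    def forward(self, x):
        return self.fuse(self.extract(self.expand(x)))

# y = InvertedResidualSketch(3, 16)(torch.randn(1, 3, 224, 224))  # -> shape (1, 16, 224, 224)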
In an embodiment of the present invention, the annotation information further includes screened size information of the screened image, and step S40 of the identification method further includes: after the frame information is obtained, generating corresponding size statistical information based on the screened size information; and adjusting the frame information based on the basic network architecture and the size statistical information to obtain the adjusted frame information.
In traditional agricultural target identification methods, a large number of marking frames are generated during identification, and each marking frame is generated randomly rather than having a fixed size. As a result, the number of marking frames that need to be processed is large, which increases the operation complexity, slows the convergence of the identification network on agricultural target identification, and reduces the positioning accuracy of the agricultural target.
Therefore, in order to solve the above technical problem, in a possible embodiment, a harvested object of a harvester is identified; for example, the harvested object is an apple. Before the input processed data is identified as apples, statistical analysis is further performed according to the annotation information of the screened images. Specifically, statistical analysis is performed on the screened size information in the annotation information, covering the size and aspect-ratio information in all of the screened size information, and cluster analysis is then performed on the screened size information, for example using the K-Means algorithm, so as to further improve the accuracy of the analysis. The receptive field size corresponding to the apple targets at each feature layer is analyzed for the screened images, and the optimal size and the optimal aspect ratio of the frames are thereby generated, that is, the adjusted frame information is obtained.
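As a non-limiting sketch of the size statistics and cluster analysis described above, the following Python function clusters the labeled box widths and heights with plain K-Means to suggest anchor sizes. The number of clusters and the use of Euclidean distance (instead of an IoU-based distance, as is common for anchor clustering) are illustrative simplifications.

import numpy as np

def kmeans_anchor_sizes(box_wh: np.ndarray, k: int = 5, iters: int = 100) -> np.ndarray:
    """Cluster (width, height) pairs of labeled boxes into k suggested anchor sizes."""
    rng = np.random.default_rng(0)
    centers = box_wh[rng.choice(len(box_wh), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each box to the nearest center, then move each center to its cluster mean
        dists = np.linalg.norm(box_wh[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = box_wh[assign == j].mean(axis=0)
    return centers   # suggested (width, height) values for the adjusted frame information

# sizes = kmeans_anchor_sizes(np.array([[84, 90], [62, 55], [80, 85], [60, 52], [120, 130]]), k=2)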
In the embodiment of the invention, the frames used in the agricultural target identification process are further optimized. On the one hand, this further reduces the number of frames in the identification and analysis process, lowers the operation complexity, and increases the convergence rate of the identification network on agricultural target identification. At the same time, the optimized frames better match the actual characteristics of the data sets (a first target set and a second target set; for example, the first target set is a crop set and the second target set is a weed set), so that the identification accuracy and positioning accuracy of the agricultural target are further improved.
Referring to fig. 4, in the embodiment of the present invention, analyzing the processed data based on the frame information to obtain target identification information includes:
S401) analyzing the processed data based on the frame information to obtain a plurality of candidate prediction frames;
S402) screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame;
S403) extracting the classification information and the positioning information of each screened prediction frame, and generating target identification information corresponding to the screened prediction frames based on the classification information and the positioning information.
In one possible embodiment, insect pests in the current farmland or farm area are identified. After appropriate frame information has been designed, the processed data is analyzed according to the designed frame information. For example, the processed data is a plurality of images containing a first target (for example, a crop) and a second target (for example, a pest), and at least one candidate prediction frame is marked on each image according to the distribution of crops and pests in the image. The plurality of candidate prediction frames is then screened according to a preset screening rule to obtain at least one screened prediction frame, where the screened prediction frames contain the relevant information of the crops or pests in each image. The classification information and positioning information of each screened prediction frame are then extracted, and finally the target identification information of the current farmland is generated; for example, in the embodiment of the present invention, pest identification information for the current field is generated.
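As a simple illustration of how classification information and positioning information might be extracted from the screened prediction frames, the following sketch converts each frame into an identification record; the tuple layout and the field names are illustrative assumptions.

def to_identification_info(screened_frames):
    """Each input is assumed to be (x_min, y_min, x_max, y_max, class_name, score)."""
    results = []
    for x1, y1, x2, y2, cls, score in screened_frames:
        results.append({
            "classification": cls,                        # e.g. "crop" or "pest"
            "location": ((x1 + x2) / 2, (y1 + y2) / 2),   # center point of the frame
            "size": (x2 - x1, y2 - y1),
            "confidence": score,
        })
    return results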
In the embodiment of the present invention, the identification method further includes: generating loss calculation information based on the frame information, the second channel information and a preset loss calculation algorithm before analyzing the processed data based on the frame information; and optimizing the frame information based on the loss calculation information to obtain the optimized frame information.
Further, in this embodiment of the present invention, the optimizing the frame information based on the loss calculation information to obtain the optimized frame information includes: judging whether the loss calculation information is larger than a preset loss threshold value or not; when the loss calculation information is less than or equal to the preset loss threshold: processing the second channel information based on the loss calculation information to obtain processed second channel information; optimizing the frame information based on the processed second channel information to obtain optimized frame information; the identification method further comprises the following steps: in the case that the loss calculation information is greater than the preset loss threshold: judging whether the screened image corresponding to the loss calculation information is a qualified image; if the screened image corresponding to the loss calculation information is a qualified image, adjusting the annotation information corresponding to the screened image based on the loss calculation information to obtain adjusted annotation information; and if not, deleting the screened image corresponding to the loss calculation information.
In order to further optimize the above frame information so that each predicted frame is as close as possible to its marking frame, that is, so that each prediction result is as close as possible to the actual result, in the embodiment of the present invention the recognition model is further trained through deep learning before the processed data is recognized, so as to further enhance the accuracy of the prediction results.
In a possible implementation, a preset loss calculation algorithm is first obtained, and loss calculation is then performed on the plurality of screened prediction frames generated in the training process in combination with the frame information and the second channel information, so as to evaluate the loss calculation information between the prediction frames and the annotated screened size information. It is then determined whether the obtained loss calculation information is greater than a preset loss threshold. For example, the preset loss threshold is 20% and the current loss calculation information is 18%, so the currently calculated loss calculation information is determined to be less than the preset loss threshold; the second channel information is therefore optimized and adjusted according to the loss calculation information to obtain the processed second channel information, and the frame information is optimized according to the processed second channel information to obtain the optimized frame information. In the subsequent training process for identifying the agricultural target, the second channel information and/or the frame information are continuously optimized according to the loss calculation information generated between the prediction frames and the marking frames, so that the final prediction frames better match the actual marking frames.
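The branching logic of this training step can be sketched as follows; the 20% threshold mirrors the example above, and the returned strings only name the action an implementation would take next.

LOSS_THRESHOLD = 0.20   # illustrative preset loss threshold (20%)

def handle_loss(loss: float, image_is_qualified: bool) -> str:
    """Decide what to do with one screened image based on its loss calculation information."""
    if loss <= LOSS_THRESHOLD:
        # e.g. loss = 0.18: adjust the second channel information, then refine the frame information
        return "optimize second channel information and frame information"
    if image_is_qualified:
        return "adjust the annotation information of the screened image"
    return "delete the screened image"

print(handle_loss(0.18, image_is_qualified=True))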
In the embodiment of the invention, during the deep-learning training stage of the agricultural target identification method, learning proceeds continuously according to the preset loss calculation algorithm, and the second channel information and the frame information are continuously optimized. This further improves the matching degree between the prediction frames generated in the identification process and the actual marking frames, thereby improving the identification accuracy of the agricultural target. On the other hand, during this optimization of the agricultural target identification method, the annotation information can also be adjusted and optimized so that it becomes more accurate and reasonable, or unreasonable screened images can be removed, which reduces the training complexity of the agricultural target identification method and improves operation accuracy and efficiency.
Referring to fig. 5, in an embodiment of the present invention, the screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame includes:
S4021) screening the candidate prediction frames based on a non-maximum suppression algorithm to obtain first-processed prediction frames;
S4022) screening the first-processed prediction frames based on a score threshold to obtain second-processed prediction frames;
S4023) screening the second-processed prediction frames based on a minimum size threshold to obtain third-processed prediction frames;
S4024) judging whether there are overlapping third-processed prediction frames whose overlapping area is larger than a maximum overlap threshold;
S40241) if so, acquiring target score information of the overlapping third-processed prediction frames, deleting the overlapping third-processed prediction frame with smaller target score information, and obtaining at least one screened prediction frame;
S40242) if not, taking the third-processed prediction frames as the screened prediction frames.
After a plurality of candidate prediction frames for agricultural target identification on the screened image are obtained, the candidate prediction frames are further screened so as to improve the accuracy of the identification result. In one possible embodiment, the fertilization targets of the current farmland or farm area are identified; for example, the fertilization targets are corn crops. After a plurality of candidate prediction frames for identifying the corn crops are obtained, the candidate prediction frames are first screened based on a Non-Maximum Suppression (NMS) algorithm, so as to identify and eliminate redundant prediction frames among the candidates and obtain first-processed prediction frames. Score threshold information and the target score information of each first-processed prediction frame are then acquired, and the first-processed prediction frames whose target score information is lower than the score threshold information are deleted to obtain second-processed prediction frames; in the embodiment of the invention, the score threshold information can be calculated based on the overlap between the prediction frames and the ground-truth frames. Further, according to the acquired minimum size threshold information and the size information of each second-processed prediction frame, the second-processed prediction frames whose size information is smaller than the minimum size threshold information are deleted, so as to exclude prediction frames that are too small and obtain third-processed prediction frames. Finally, each third-processed prediction frame is analyzed according to the acquired maximum overlap threshold information to judge whether there are overlapping third-processed prediction frames whose overlapping area is larger than the maximum overlap threshold information; if so, the third-processed prediction frame with smaller target score information among the overlapping frames is deleted so as to avoid repeated operation. This greatly improves the accuracy of identifying and positioning the agricultural target and improves operation efficiency.
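For illustration, the following Python sketch strings the four screening stages together (non-maximum suppression, score threshold, minimum size, maximum overlap); all threshold values are illustrative assumptions.

import numpy as np

def iou(a, b):
    """Intersection over union of two frames given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def screen_predictions(frames, scores,
                       nms_iou=0.5, score_thr=0.3, min_size=8.0, max_overlap=0.8):
    """Return indices of the screened prediction frames after four screening stages."""
    order = np.argsort(scores)[::-1]
    kept = []
    for i in order:                                     # stage 1: non-maximum suppression
        if all(iou(frames[i], frames[j]) < nms_iou for j in kept):
            kept.append(i)
    kept = [i for i in kept if scores[i] >= score_thr]  # stage 2: score threshold
    kept = [i for i in kept                             # stage 3: minimum size
            if (frames[i][2] - frames[i][0]) >= min_size
            and (frames[i][3] - frames[i][1]) >= min_size]
    final = []
    for i in kept:                                      # stage 4: maximum-overlap check
        if all(iou(frames[i], frames[j]) <= max_overlap for j in final):
            final.append(i)                             # higher-scoring frames are kept first
    return final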
In the embodiment of the invention, the prediction frames generated during agricultural target identification are subjected to multi-level screening, which further improves the accuracy of agricultural target identification, reduces the amount of computation in subsequent processing, lowers the computational complexity, and improves computational efficiency.
A target identification system based on a lightweight network according to an embodiment of the present invention will be described below with reference to the drawings.
Referring to fig. 6, based on the same inventive concept, an embodiment of the present invention provides a target identification system based on a lightweight network, where the target identification system includes: a library construction unit, wherein the library construction unit is used for establishing a target database, and the target database comprises a target image; a preprocessing unit, which is used for performing a data preprocessing operation on the target image to obtain corresponding preprocessed data; a lightweight network unit, which is used for establishing a lightweight network model and processing the preprocessed data based on the lightweight network model to obtain corresponding processed data; and an identification unit, which is used for acquiring frame information and analyzing the processed data based on the frame information to obtain target identification information.
In an embodiment of the present invention, the library construction unit includes: the image acquisition module is used for acquiring a target image; the effectiveness screening module is used for carrying out effectiveness screening on the target image to obtain a screened image; the labeling module is used for labeling the screened image to obtain annotation information corresponding to the screened image, the annotation information comprising first target information and second target information; and the library establishing module is used for establishing a target database based on the screened images and the annotation information.
In an embodiment of the present invention, the preprocessing unit includes: the first enhancement module is used for executing a first data enhancement operation on the screened image to obtain first enhancement data; the second enhancement module is used for executing a second data enhancement operation on the first enhancement data to obtain second enhancement data; and the preprocessing module is used for processing the second enhancement data according to a preset requirement to obtain the preprocessed data.
In this embodiment of the present invention, the processing the second enhancement data according to a preset requirement to obtain the preprocessed data includes: acquiring preset picture format information; performing format processing on the second enhanced data based on the preset picture format information to obtain formatted data; and carrying out normalization processing on the formatted data to obtain the preprocessed data.
In an embodiment of the present invention, the lightweight network unit includes: the basic network module is used for acquiring a basic network architecture, and the basic network architecture is a network architecture based on lightweight design; the expansion module is used for acquiring first channel information, and performing data expansion processing on the preprocessed data based on the first channel information in the basic network architecture to obtain expanded data; the extraction module is used for acquiring second channel information and performing feature extraction processing on the expanded data based on the second channel information in the basic network architecture to obtain extracted data; and the fusion module is used for executing fusion operation on the extracted data in the basic network architecture based on the first channel information to obtain fused data, and taking the fused data as the processed data.
In an embodiment of the present invention, the annotation information further includes screened size information of the screened image, and the identification unit includes: the statistical module is used for generating corresponding size statistical information based on the screened size information after the frame information is obtained; and the adjusting module is used for adjusting the frame information based on the basic network architecture and the size statistical information to obtain the adjusted frame information.
In an embodiment of the present invention, the identification unit includes: the analysis module is used for analyzing the processed data based on the frame information to obtain a plurality of candidate prediction frames; the screening module is used for screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame; and the identification module is used for extracting the classification information and the positioning information of each screened prediction frame and generating target identification information corresponding to the screened prediction frames based on the classification information and the positioning information.
In an embodiment of the present invention, the identification unit further includes: a loss calculation module, configured to generate loss calculation information based on the frame information, the second channel information, and a preset loss calculation algorithm before analyzing the processed data based on the frame information; and a frame optimization module, configured to optimize the frame information based on the loss calculation information to obtain the optimized frame information.
In this embodiment of the present invention, the optimizing the frame information based on the loss calculation information to obtain the optimized frame information includes: judging whether the loss calculation information is larger than a preset loss threshold; when the loss calculation information is less than or equal to the preset loss threshold: processing the second channel information based on the loss calculation information to obtain processed second channel information; and optimizing the frame information based on the processed second channel information to obtain the optimized frame information; the identification system further performs the following: in the case that the loss calculation information is greater than the preset loss threshold: judging whether the screened image corresponding to the loss calculation information is a qualified image; if the screened image corresponding to the loss calculation information is a qualified image, adjusting the annotation information corresponding to the screened image based on the loss calculation information to obtain adjusted annotation information; and if not, deleting the screened image corresponding to the loss calculation information.
In this embodiment of the present invention, the screening the candidate prediction frames according to a preset screening rule to obtain at least one screened prediction frame includes: screening the candidate prediction frames based on a non-maximum suppression algorithm to obtain first-processed prediction frames; screening the first-processed prediction frames based on score threshold information to obtain second-processed prediction frames; screening the second-processed prediction frames based on a minimum size threshold to obtain third-processed prediction frames; and judging whether there are overlapping third-processed prediction frames whose overlapping area is larger than a maximum overlap threshold; if so, acquiring target score information of the overlapping third-processed prediction frames, deleting the overlapping third-processed prediction frame with smaller target score information, and obtaining at least one screened prediction frame; otherwise, taking the third-processed prediction frames as the screened prediction frames.
Further, an embodiment of the present invention also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method of the present invention.
Further, the embodiment of the invention also provides an agricultural machine, and the agricultural machine comprises the identification system provided by the embodiment of the invention and/or the computer-readable storage medium provided by the embodiment of the invention.
Although the embodiments of the present invention have been described in detail with reference to the accompanying drawings, the embodiments of the present invention are not limited to the details of the above embodiments, and various simple modifications can be made to the technical solutions of the embodiments of the present invention within the technical idea of the embodiments of the present invention, and the simple modifications all belong to the protection scope of the embodiments of the present invention.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. In order to avoid unnecessary repetition, the embodiments of the present invention do not describe every possible combination.
Those skilled in the art will understand that all or part of the steps in the method according to the above embodiments may be implemented by a program, which is stored in a storage medium and includes several instructions to enable a single-chip microcomputer, a chip, or a processor to execute all or part of the steps in the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In addition, any combination of various different implementation manners of the embodiments of the present invention is also possible, and the embodiments of the present invention should be considered as disclosed in the embodiments of the present invention as long as the combination does not depart from the spirit of the embodiments of the present invention.