Disclosure of Invention
The invention provides a solution to the above technical problems.
The technical scheme of the invention is as follows:
In one aspect, the present invention provides a spare part management system based on target identification and change detection, including:
A spare part detection module, comprising an image acquisition unit, an image preprocessing unit, a target detection unit, and a post-processing unit, wherein:
an image acquisition unit: configured to acquire images or videos of spare parts entering and exiting the warehouse through an image acquisition device installed in the warehouse, and to transmit the image or video information to the image preprocessing unit through a communication interface;
an image preprocessing unit: preprocessing input image or video information, including denoising, normalization and enhancement operations;
a target detection unit: configured to confirm the warehouse-in and warehouse-out information of spare parts from the input preprocessed image or video information through an image processing algorithm and a reinforcement learning model;
a post-processing unit: configured to process the output warehouse-in and warehouse-out information of the spare parts into a format recognizable by the log management module, and to output it to the log management module;
a log management module: configured to record a detailed log of system operation, including user operations, system events, and processing results;
a user interaction module: configured to query the warehouse-in and warehouse-out history of spare parts, real-time images, and result information; to select among different image acquisition devices; and to display the current system state, receive user input, and configure system parameters.
As a preferred embodiment, the image preprocessing unit preprocesses the input image or video information, specifically comprising the following steps:
The denoising process specifically comprises:
Mean filtering: for each pixel in the image, the new pixel value is the average of its neighbourhood pixel values; for a neighbourhood of size m×n, the new value I_new(x, y) of pixel (x, y) is:
I_new(x, y) = (1/(m×n)) Σ_(i,j) I(i, j)
where the sum runs over the m×n neighbourhood of (x, y), and I(i, j) represents the original pixel value at pixel point (i, j);
Median filtering: the pixel values in the neighbourhood are sorted, and the median value is taken as the new pixel value;
The normalization flow uses the following formula:
I_norm(x, y) = (I(x, y) − I_min) / (I_max − I_min)
where I(x, y) is the original pixel value, and I_min and I_max are the minimum and maximum pixel values in the image, respectively;
the specific formula of the enhancement flow is as follows:
Brightness adjustment: I_brightened(x, y) = I(x, y) + b, where b is a brightness adjustment value;
Contrast adjustment: I_contrasted(x, y) = a × I(x, y) + b, where a controls contrast and b controls brightness.
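The preprocessing steps above (mean filtering, min-max normalization, and linear brightness/contrast adjustment) can be sketched as follows. This is an illustrative NumPy implementation of the stated formulas, not the system's actual code; function names and the 3×3 default neighbourhood are assumptions.

```python
import numpy as np

def mean_filter(img, m=3, n=3):
    """Mean filtering: each pixel becomes the average of its m x n neighbourhood."""
    h, w = img.shape
    pad = np.pad(img.astype(np.float64), ((m // 2, m // 2), (n // 2, n // 2)), mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for y in range(h):
        for x in range(w):
            out[y, x] = pad[y:y + m, x:x + n].mean()
    return out

def normalize(img):
    """Min-max normalization: I_norm = (I - I_min) / (I_max - I_min)."""
    i_min, i_max = img.min(), img.max()
    return (img - i_min) / (i_max - i_min)

def adjust(img, a=1.0, b=0.0):
    """Contrast (a) and brightness (b) adjustment: a * I + b, clipped to [0, 255]."""
    return np.clip(a * img + b, 0, 255)
```

A production system would typically use vectorized or library filters (e.g. an OpenCV blur) rather than the explicit loop shown here.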
As a preferred embodiment, the target detection unit confirms the in-out information of the spare parts through an image processing algorithm and a reinforcement learning model, and includes the following steps:
Network forward propagation: the preprocessed image is processed through a multi-layer convolutional neural network to extract multi-level features. A Backbone deep neural network extracts the basic features of the image; a pyramid pooling module (PPM) captures multi-scale information; a feature pyramid network (FPN) is integrated to fuse feature maps from different levels; and adaptive spatial feature fusion (ASFF) is combined to improve the feature fusion effect. Finally, a Multi-scale Detection Head performs target detection at multiple scales so as to adapt to targets of different sizes. During training, the Mosaic data enhancement strategy is optimized and improved to enhance the generalization capability of the model, and the learning rate is adjusted dynamically to improve training efficiency and model performance;
prediction output: the output of the network is a fixed-size grid, each grid cell predicting the bounding box, confidence score and class probability of the target;
Decoding output: the prediction result output by the network is decoded into actual target bounding boxes and categories; relative coordinates are converted into actual image coordinates, bounding boxes with low confidence are filtered out according to a confidence threshold, and repeatedly overlapping bounding boxes are removed, retaining the bounding box with the highest confidence so as to avoid detecting the same target multiple times;
Post-processing: the final target detection result is subsequently processed; the detected target bounding boxes and class labels are drawn on the original image, and the detection result is output to the log management module for storage.
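As a sketch of the final post-processing step, the snippet below formats raw detections into log-ready records. The field names and the detection tuple layout are illustrative assumptions, not a schema defined by this disclosure.

```python
from datetime import datetime

def detection_to_log_entry(detections, direction):
    """Format detections into records the log management module can store.
    `detections`: list of (label, confidence, (x1, y1, x2, y2)) tuples.
    `direction`: "in" (warehouse-in) or "out" (warehouse-out).
    All field names here are hypothetical, chosen for illustration only."""
    return [
        {
            "timestamp": datetime.now().isoformat(timespec="seconds"),
            "part": label,
            "confidence": round(conf, 3),
            "bbox": bbox,
            "direction": direction,
        }
        for label, conf, bbox in detections
    ]
```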
As a preferred embodiment, the network propagates the flow forward; the method specifically comprises the following steps:
The convolution operation is given by the following formula:
I_out(x, y) = Σ_(i=−k..k) Σ_(j=−k..k) I_in(x + i, y + j) × K(i, j)
where I_in and I_out are the input and output images respectively, K(i, j) is the convolution kernel, and k depends on the convolution kernel size;
The pyramid pooling module PPM is given by the following formula:
F_pooled(d) = Pool(F, s), for each pooling scale d
where F_pooled(d) is the feature map obtained at pooling scale d, Pool denotes the pooling operation on feature map F, and s is the pooling window size;
The feature map F after feature pyramid network FPN fusion is expressed as:
F=H+L
Wherein: h is a high-level feature map, L is an up-sampled bottom-level feature map;
The adaptive spatial feature fusion network ASFF is given by the following formula:
F_ASFF = w_1·F_1 + w_2·F_2 + … + w_c·F_c
where F_c is the c-th fused feature map, and w_c is the adaptive weight obtained by the corresponding network learning;
The Multi-scale Detection Head is given by the following formula:
x = x_a + Δx, y = y_a + Δy, w = w_a + Δw, h = h_a + Δh
where x_a, y_a, w_a, h_a are the original anchor box centre coordinates, width, and height, and Δx, Δy, Δw, Δh are the predicted bounding box offsets, respectively.
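The ASFF weighted fusion and the anchor-offset decoding above reduce to two small operations, sketched below. This is a minimal illustration of the formulas only; it assumes the ASFF weights are already learned and normalised, and uses additive offsets as written in the disclosure (real detectors often apply the width/height offsets exponentially).

```python
import numpy as np

def asff_fuse(features, weights):
    """ASFF fusion: F_ASFF = w_1*F_1 + ... + w_c*F_c, with learned
    per-level weights (assumed normalised to sum to 1)."""
    return sum(w * f for w, f in zip(weights, features))

def decode_anchor(anchor, offsets):
    """Apply predicted offsets (dx, dy, dw, dh) to an anchor box
    given as (x_a, y_a, w_a, h_a), per the additive formula above."""
    return tuple(a + d for a, d in zip(anchor, offsets))
```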
As a preferred embodiment, the prediction output process specifically includes the following steps:
Coordinate conversion:
x = x_grid × stride + x_offset
y = y_grid × stride + y_offset
where x_grid, y_grid are the indices of the grid cell; stride is the step size of the cell; and x_offset, y_offset are the predicted offsets;
The confidence Softmax is calculated as follows:
P_o = exp(S_o) / Σ_(o'=1..b) exp(S_o')
where P_o is the probability of category o; S_o is the confidence score of category o; and o = 1, 2, 3, …, b.
In a preferred embodiment, in the decoding output flow, bounding boxes with low confidence are filtered out according to a confidence threshold; this is realized by a non-maximum suppression (NMS) algorithm with the following formula:
S_o = S_o, if IoU(M, b_v) < N_t; S_o = 0, if IoU(M, b_v) ≥ N_t
where S_o is a confidence score; M is the bounding box with the highest score; b_v is the bounding box to be processed; IoU(M, b_v) is the intersection-over-union of bounding boxes M and b_v; and N_t is the set IoU threshold;
then:
IoU(M, b_v) = area(M ∩ b_v) / area(M ∪ b_v)
where area(M ∩ b_v) is the intersection area of the two bounding boxes and area(M ∪ b_v) is their union area.
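The IoU computation and hard non-maximum suppression described above can be sketched as follows. A minimal illustration with boxes as (x1, y1, x2, y2) corner tuples; variable names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def nms(boxes, scores, n_t=0.5):
    """Keep the highest-scoring box M, suppress any box whose IoU with M
    reaches the threshold n_t, then repeat on the remainder."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        m = order.pop(0)
        keep.append(m)
        order = [v for v in order if iou(boxes[m], boxes[v]) < n_t]
    return keep
```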
As a preferred embodiment, the user interaction module can query the warehouse-in and warehouse-out history of spare parts together with real-time images and result information; by manually comparing the output result with the image, erroneous information can be discovered in time and fed back to the reinforcement learning model through the user interaction module.
On the other hand, the invention also provides a spare part management method based on target identification and change detection, which comprises the following steps:
step S1: collecting images or videos of the spare parts entering and exiting the warehouse through a camera arranged in the warehouse, and transmitting image information to an image processing assembly through a communication interface;
Step S2: preprocessing an input image or video, including denoising, normalization and enhancement operations;
Step S3: according to the input preprocessed image or video, confirming the warehouse-in and warehouse-out information of spare parts through an image processing algorithm and a reinforcement learning model;
step S4: processing the output warehouse-in and warehouse-out information of the spare parts into a format which can be identified by the log management module, and outputting the format to the log management module;
Step S5: the log management module records a detailed log of system operation, including user operation, system events and processing results; providing operation record and problem tracking functions, and helping system maintenance and fault investigation;
Step S6: the user interaction module can inquire the warehouse-in and warehouse-out history of spare parts, real-time images and result information; different image devices can be selected; the current state can be displayed, user input is accepted, and system parameters are configured.
In still another aspect, the present invention further provides an electronic device storing a computer program which, when executed by a processor, implements the spare part management method based on target identification and change detection according to any of the embodiments of the present invention.
In yet another aspect, the present invention also provides a computer readable medium storing one or more programs which, when executed by one or more processors, cause the one or more processors to implement the spare part management method based on target identification and change detection according to any of the embodiments of the present invention.
The invention has the following beneficial effects:
Automation and efficiency:
the camera is utilized to automatically collect images or videos of spare parts entering and exiting the warehouse, so that the tedious work of manual recording and checking is reduced, and the management efficiency is greatly improved.
The whole process, from image acquisition and preprocessing through detection, post-processing, and log recording, is fully automated, reducing the possibility of human error.
Accurate target detection:
Through an image processing algorithm and a reinforcement learning model, the warehouse-in and warehouse-out information of spare parts can be confirmed more accurately; compared with traditional methods that rely on manual judgment or simple identification, the accuracy and reliability of detection are improved.
Perfect log management:
Logging system operations, including user operations, system events, and processing results, provides powerful support for system maintenance and troubleshooting. The prior art may be deficient in the integrity and detail of the log records.
Good user interactivity:
The user interaction module allows a user to inquire the history and real-time information of the spare parts entering and exiting, and can also select different image devices and configure system parameters. This enables a user to use the system more flexibly and conveniently, meeting the personalized needs, while some existing systems may not be sufficiently friendly and flexible in terms of user interaction.
Real-time and dynamic monitoring:
the method can acquire and process the images in real time, acquire the in-out conditions of spare parts in time, realize dynamic monitoring and management, and reflect inventory changes more timely compared with the traditional periodic inventory mode.
Traceability of data:
Complete log records and clear processing of warehouse-in and warehouse-out information ensure the traceability of data, facilitating the examination and analysis of the inventory management process and its results, an aspect in which the prior art may fall short.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the step numbers used herein are for convenience of description only and are not limiting as to the order in which the steps are performed.
It is to be understood that the terminology used in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises" and "comprising" indicate the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The term "and/or" refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Embodiment one:
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solution of the present application will be clearly and completely described below with reference to fig. 1 in conjunction with a specific embodiment of the present application.
In order to solve the problems in the prior art, the invention provides a spare part management system based on target identification and change detection, which comprises:
A spare part detection module, comprising an image acquisition unit, an image preprocessing unit, a target detection unit, and a post-processing unit, wherein:
an image acquisition unit: configured to acquire images or videos of spare parts entering and exiting the warehouse through an image acquisition device installed in the warehouse, and to transmit the image or video information to the image preprocessing unit through a communication interface;
The spare parts are stored in an intelligent cabinet and laid flat layer by layer, avoiding identification blind spots caused by stacking and unclear identification caused by dense arrangement.
An image preprocessing unit: preprocessing input image or video information, including denoising, normalization and enhancement operations; the specific operation flow is as follows:
Denoising:
Mean filtering: for each pixel in the image, the new pixel value is the average of its neighbourhood pixel values; for a neighbourhood of size m×n, the new value I_new(x, y) of pixel (x, y) is:
I_new(x, y) = (1/(m×n)) Σ_(i,j) I(i, j)
where the sum runs over the m×n neighbourhood of (x, y), and I(i, j) represents the original pixel value at pixel point (i, j);
Median filtering: the pixel values in the neighbourhood are sorted, and the median value is taken as the new pixel value;
The normalization flow uses the following formula:
I_norm(x, y) = (I(x, y) − I_min) / (I_max − I_min)
where I(x, y) is the original pixel value, and I_min and I_max are the minimum and maximum pixel values in the image, respectively;
the specific formula of the enhancement flow is as follows:
Brightness adjustment: I_brightened(x, y) = I(x, y) + b, where b is a brightness adjustment value;
Contrast adjustment: I_contrasted(x, y) = a × I(x, y) + b, where a controls contrast and b controls brightness.
Target detection unit: according to the input preprocessed image or video, confirming the warehouse-in and warehouse-out information of spare parts through an image processing algorithm and a reinforcement learning model; the specific flow is as follows:
Network forward propagation: the preprocessed image is processed through a multi-layer convolutional neural network to extract multi-level features. A Backbone deep neural network extracts the basic features of the image; a pyramid pooling module (PPM) captures multi-scale information; a feature pyramid network (FPN) is integrated to fuse feature maps from different levels; and adaptive spatial feature fusion (ASFF) is combined to improve the feature fusion effect. Finally, a Multi-scale Detection Head performs target detection at multiple scales so as to adapt to targets of different sizes. During training, the Mosaic data enhancement strategy is optimized and improved to enhance the generalization capability of the model, and the learning rate is adjusted dynamically to improve training efficiency and model performance. The method specifically comprises the following steps:
The convolution operation is given by the following formula:
I_out(x, y) = Σ_(i=−k..k) Σ_(j=−k..k) I_in(x + i, y + j) × K(i, j)
where I_in and I_out are the input and output images respectively, K(i, j) is the convolution kernel, and k depends on the convolution kernel size;
The pyramid pooling module PPM is given by the following formula:
F_pooled(d) = Pool(F, s), for each pooling scale d
where F_pooled(d) is the feature map obtained at pooling scale d, Pool denotes the pooling operation on feature map F, and s is the pooling window size;
The feature map F after feature pyramid network FPN fusion is expressed as:
F=H+L
Wherein: h is a high-level feature map, L is an up-sampled bottom-level feature map;
The adaptive spatial feature fusion network ASFF is given by the following formula:
F_ASFF = w_1·F_1 + w_2·F_2 + … + w_c·F_c
where F_c is the c-th fused feature map, and w_c is the adaptive weight obtained by the corresponding network learning;
The Multi-scale Detection Head is given by the following formula:
x = x_a + Δx, y = y_a + Δy, w = w_a + Δw, h = h_a + Δh
where x_a, y_a, w_a, h_a are the original anchor box centre coordinates, width, and height, and Δx, Δy, Δw, Δh are the predicted bounding box offsets, respectively.
The Mosaic data enhancement strategy is optimized and improved to enhance the generalization capability of the model, specifically as follows:
The Mosaic data enhancement algorithm combines several pictures into one picture in a certain proportion, enabling the model to identify targets within a smaller range. First, four pictures are randomly selected and, after resizing and scaling, placed at the upper left, upper right, lower left, and lower right of a designated large picture according to reference point coordinates. Second, the picture labels are remapped according to the size transformation applied to each picture. Finally, the large image is stitched according to the designated abscissa and ordinate, and the coordinates of detection boxes that exceed the boundary are clipped. The Mosaic data enhancement algorithm effectively enhances data diversity and model robustness, and helps improve small-target detection performance.
During training, the Mosaic data enhancement strategy is continuously improved by optimizing and adjusting the hyper-parameters. First, under ideal training time and hardware conditions, setting img-size to 980×980 helps improve training results. Second, the batch-size is set to the hardware maximum to avoid the statistical offset caused by too small a batch size. Then, the number of epochs is adjusted in combination with the loss curve from the initial training, avoiding under-fitting from too few iterations and over-fitting from too many. Next, caching is enabled to accelerate training and shorten training time. Finally, a random seed is set so that the training process is repeatable and controllable.
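The four-image stitching described above can be sketched as follows. This is a deliberately simplified illustration: it places four images in the quadrants of one canvas, but omits the random reference point, the true rescaling, and the bounding-box label remapping that a full Mosaic implementation performs.

```python
import numpy as np

def mosaic(images, out_size=980):
    """Simplified Mosaic sketch: place four images at the upper-left,
    upper-right, lower-left, and lower-right quadrants of one canvas.
    Label remapping and random-reference-point scaling are omitted."""
    half = out_size // 2
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    corners = [(0, 0), (0, half), (half, 0), (half, half)]  # row, col offsets
    for img, (cy, cx) in zip(images, corners):
        patch = img[:half, :half]  # naive crop instead of a true resize
        h, w = patch.shape[:2]
        canvas[cy:cy + h, cx:cx + w] = patch
    return canvas
```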
Prediction output: the output of the network is a fixed-size grid, each grid cell predicting the bounding box, confidence score and class probability of the target; the method specifically comprises the following steps:
Coordinate conversion:
x = x_grid × stride + x_offset
y = y_grid × stride + y_offset
where x_grid, y_grid are the indices of the grid cell; stride is the step size of the cell; and x_offset, y_offset are the predicted offsets;
The confidence Softmax is calculated as follows:
P_o = exp(S_o) / Σ_(o'=1..b) exp(S_o')
where P_o is the probability of category o; S_o is the confidence score of category o; and o = 1, 2, 3, …, b.
Bounding boxes with low confidence are filtered out according to the confidence threshold; this is realized by a non-maximum suppression (NMS) algorithm with the following formula:
S_o = S_o, if IoU(M, b_v) < N_t; S_o = 0, if IoU(M, b_v) ≥ N_t
where S_o is a confidence score; M is the bounding box with the highest score; b_v is the bounding box to be processed; IoU(M, b_v) is the intersection-over-union of bounding boxes M and b_v; and N_t is the set IoU threshold;
then:
IoU(M, b_v) = area(M ∩ b_v) / area(M ∪ b_v)
where area(M ∩ b_v) is the intersection area of the two bounding boxes and area(M ∪ b_v) is their union area.
Post-processing unit: configured to process the output warehouse-in and warehouse-out information of the spare parts into a format recognizable by the log management module, and to output it to the log management module;
Log management module: configured to record a detailed log of system operation, including user operations, system events, and processing results;
User interaction module: configured to query the warehouse-in and warehouse-out history of spare parts, real-time images, and result information; to select among different image acquisition devices; and to display the current system state, receive user input, and configure system parameters.
The user interaction module is a front end part of the system and is responsible for interacting with a user and providing an operation interface. Its main functions include displaying the current state of the system, accepting user input, configuring system parameters, etc. The method comprises the following specific functions:
1. Simulation view switching: allows the user to switch between different simulated views, helping the user view, understand, and use the various functional modules of the system and their status, so as to inspect different functions or states of the system.
2. Building a webpage: the web page interface for creating and managing the system comprises layout design, content display and the like, and provides an intuitive and easy-to-operate interface for the user, so that the user can conveniently access and use the system functions.
3. A configuration panel: providing the function of user to adjust system parameters and settings. The function helps a user to configure the working mode and parameters of the system according to actual requirements, and is suitable for different application scenes and operation requirements.
4. Input source selection: allowing the user to select and specify the source of the data input, e.g. to select different image acquisition devices or data streams. The function supports the management and switching of multiple input sources, ensures that the system can acquire information from different data sources and perform corresponding processing and analysis.
5. Historical or real-time data feedback and export: through the user interaction module, the user can compare historical and real-time output results with the images, discover erroneous information in time, and feed it back to the reinforcement learning model. The log management module data can also be exported according to user requirements.
Embodiment two:
The embodiment provides a spare part management method based on target identification and change detection, which comprises the following steps:
step S1: collecting images or videos of the spare parts entering and exiting the warehouse through a camera arranged in the warehouse, and transmitting image information to an image processing assembly through a communication interface;
Step S2: preprocessing an input image or video, including denoising, normalization and enhancement operations;
Step S3: according to the input preprocessed image or video, confirming the warehouse-in and warehouse-out information of spare parts through an image processing algorithm and a reinforcement learning model;
step S4: processing the output warehouse-in and warehouse-out information of the spare parts into a format which can be identified by the log management module, and outputting the format to the log management module;
Step S5: the log management module records a detailed log of system operation, including user operation, system events and processing results; providing operation record and problem tracking functions, and helping system maintenance and fault investigation;
Step S6: the user interaction module can inquire the warehouse-in and warehouse-out history of spare parts, real-time images and result information; different image devices can be selected; the current state can be displayed, user input is accepted, and system parameters are configured.
Embodiment III:
The present embodiment provides an electronic device, on which a computer program is stored, which when executed by a processor implements a spare part management method based on object identification and change detection according to any one of the embodiments of the present invention.
Embodiment four:
The present embodiment provides a computer readable medium storing one or more programs, which when executed by the one or more processors, cause the one or more processors to implement a spare part management method based on object recognition and change detection according to any of the embodiments of the present invention.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "and/or", describes an association relation of association objects, and indicates that there may be three kinds of relations, for example, a and/or B, and may indicate that a alone exists, a and B together, and B alone exists. Wherein A, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of the following" and the like means any combination of these items, including any combination of single or plural items. For example, at least one of a, b and c may represent: a, b, c, a and b, a and c, b and c or a and b and c, wherein a, b and c can be single or multiple.
Those of ordinary skill in the art will appreciate that the various elements and algorithm steps described in the embodiments disclosed herein can be implemented as electronic hardware, computer software, or a combination of the two. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In several embodiments provided by the present application, any of the functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied, essentially or in the part contributing to the prior art, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent structures or equivalent processes or direct or indirect application in other related technical fields are included in the scope of the present invention.