
CN117253194A - Commodity damage detection method, device and storage medium - Google Patents

Commodity damage detection method, device and storage medium

Info

Publication number
CN117253194A
Authority
CN
China
Prior art keywords
area
finger
joint
video image
finger joint
Prior art date
Legal status
Granted
Application number
CN202311499128.2A
Other languages
Chinese (zh)
Other versions
CN117253194B (en)
Inventor
王欢
高伟明
冯继威
李彦君
Current Assignee
Networks Technology Co ltd
Original Assignee
Networks Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Networks Technology Co ltd
Priority to CN202311499128.2A
Publication of CN117253194A
Application granted
Publication of CN117253194B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/54Extraction of image or video features relating to texture
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a commodity damage detection method, device, and storage medium. When a computer device obtains a surveillance video clip of an unmanned store, it can identify the finger joint print areas in each frame of video image and mark the finger structures in each frame according to those areas, where the finger structures can reflect the bending state of each of the customer's fingers. The computer device can input the surveillance video clip marked with the finger structures into a hand action classification model, so that the model can accurately identify and output the customer's hand action type based on the bending states of the fingers. The computer device can then determine, based on the hand action type, whether merchandise in the unmanned store has been artificially damaged. In this way, whether the commodities in an unmanned store are artificially damaged can be detected automatically and accurately, further improving the degree of automation of the unmanned store.

Description

Commodity damage detection method, commodity damage detection device and storage medium
Technical Field
The present disclosure relates to the field of image recognition technologies, and in particular, to a method and apparatus for detecting damage to a commodity, and a storage medium.
Background
With the continuous development and optimization of software algorithms and hardware devices, unmanned technologies have been increasingly applied in daily life, such as unmanned driving, unmanned aerial vehicles, unmanned factories, and unmanned stores. An unmanned store is a store in which all or part of the in-store operation flow is handled intelligently and automatically by technical means so as to reduce manual intervention; it can automatically detect the commodities a customer selects, automatically generate orders, and automatically collect payment. Unmanned stores are easy to set up, have low labor costs and high operating efficiency, and have broad development prospects.
However, with the popularity of unmanned stores, new problems and challenges have also arisen. Since no staff are stationed on site in an unmanned store, customer behavior that damages commodities cannot be discovered in time, so the damaging behavior cannot be stopped and compensation cannot be sought promptly. It is therefore necessary to provide a solution that automatically detects whether a commodity has been artificially damaged.
Disclosure of Invention
The present application aims to solve at least one of the above technical drawbacks, in particular the inability of the prior art to automatically detect whether a commodity has been artificially damaged.
In a first aspect, an embodiment of the present application provides a method for detecting damage to a commodity, where the method includes:
under the condition that a preset detection rule is met, acquiring a monitoring video clip of the unmanned store;
identifying a finger joint print area in each frame of video image of the monitoring video clip, and marking a finger structure in the video image according to the area coordinates of each finger joint print area in the video image;
inputting the monitoring video segments marked with the finger structures into a hand motion classification model obtained by training in advance so as to obtain the hand motion types output by the hand motion classification model;
and generating a detection result based on the hand action type, wherein the detection result is used for indicating whether the goods in the unmanned store are damaged by people.
In one embodiment, for each frame of video image, marking a finger structure in the video image according to the region coordinates of each of the finger joint print regions in the video image comprises:
for each frame of video image, respectively determining the joint pattern type corresponding to each finger joint pattern area according to the pixel information of each finger joint pattern area in the video image, grouping each finger joint pattern area based on each joint pattern type, and marking a finger structure in the video image according to the grouping condition and the area coordinates of each finger joint pattern area;
Wherein, each finger joint line area belonging to the same group corresponds to the same finger, and each finger joint line area belonging to different groups corresponds to different fingers respectively.
In one embodiment, for each frame of video image, determining, according to pixel information of each finger joint print area in the video image, a joint print type corresponding to each finger joint print area includes:
for each finger joint print area of each frame of video image, generating a gray level histogram corresponding to the finger joint print area, calculating a pixel proportion corresponding to each gray level in the gray level histogram, determining texture complexity corresponding to the finger joint print area according to the pixel proportion corresponding to each gray level, and determining the joint print type corresponding to the finger joint print area based on the texture complexity.
In one embodiment, for each finger joint print area of each frame of video image, determining the texture complexity corresponding to the finger joint print area according to the pixel proportion corresponding to each gray level includes:
for each finger joint print area of each frame of video image, calculating a gray average value of a gray histogram corresponding to the finger joint print area, and calculating texture complexity corresponding to the finger joint print area by adopting the following expression:
C = \sum_{i=1}^{L} (g_i - \mu)^2 \cdot p_i

where C is the texture complexity corresponding to the finger joint print area; L is the number of gray levels corresponding to the finger joint print area; g_i is the gray value corresponding to the i-th gray level in the gray histogram corresponding to the finger joint print area; \mu is the gray average value of the gray histogram corresponding to the finger joint print area; and p_i is the pixel proportion corresponding to the i-th gray level in that gray histogram.
In one embodiment, for each finger joint print area of each frame of video image, determining, based on the texture complexity, a type of joint print corresponding to the finger joint print area includes:
for each finger joint print area of each frame of video image, if the texture complexity corresponding to the finger joint print area is greater than a preset complexity threshold, the joint print type corresponding to the finger joint print area is middle joint print; otherwise, the joint print type corresponding to the finger joint print area is distal joint print.
In one embodiment, the joint print type corresponding to each finger joint print area is middle joint print or distal joint print;
grouping the finger joint print areas based on the joint print types, for each frame of video image, includes:
For each frame of video image, determining the nearest distal joint print area corresponding to each middle joint print area in the video image and the nearest middle joint print area corresponding to each distal joint print area, respectively; wherein a finger joint print area whose joint print type is middle joint print is a middle joint print area, and a finger joint print area whose joint print type is distal joint print is a distal joint print area;
for each middle joint print area, the middle joint print area is taken as a first target area and the nearest distal joint print area corresponding to the first target area is taken as a second target area; if the nearest middle joint print area corresponding to the second target area is the first target area, the first target area and the second target area are determined to belong to the same group; otherwise, the first target area and the second target area are determined to belong to different groups.
In one embodiment, the generating a detection result based on the hand action type includes:
identifying the commodity type of the held commodity according to the monitoring video clip;
determining a damage action type set corresponding to the held commodity based on the commodity type;
And if the damage action type set comprises the hand action type, generating a damage detection result for indicating that the goods in the unmanned store are damaged artificially.
In one embodiment, the obtaining the surveillance video clip of the unmanned store under the condition that the preset detection rule is satisfied includes:
acquiring the commodity weight corresponding to each goods shelf in the unmanned store in real time;
when the weight of the commodity corresponding to any goods shelf is detected to be reduced, starting a target camera to acquire the monitoring video segment through the target camera; the target camera is a camera arranged on a target goods shelf, and the target goods shelf is a goods shelf with reduced commodity weight.
In a second aspect, embodiments of the present application provide a merchandise damage detection device, the device comprising:
the video segment acquisition module is used for acquiring the monitoring video segment of the unmanned store under the condition that the preset detection rule is met;
the finger structure labeling module is used for identifying finger joint print areas in each frame of video image of the monitoring video clip, and labeling finger structures in the video image according to the area coordinates of each finger joint print area in the video image;
The hand action type determining module is used for inputting the monitoring video segments marked with the finger structures into a hand action classification model obtained through training in advance so as to obtain the hand action types output by the hand action classification model;
and the detection result generation module is used for generating a detection result based on the hand action type, and the detection result is used for indicating whether the goods in the unmanned store are damaged by people or not.
In a third aspect, embodiments of the present application provide a storage medium having stored therein computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method for detecting damage to goods described in any of the embodiments above.
In a fourth aspect, embodiments of the present application provide a computer device, comprising: one or more processors, and memory;
the memory has stored therein computer readable instructions which, when executed by the one or more processors, perform the steps of the method for detecting damage to merchandise of any of the embodiments described above.
In the commodity damage detection method, device, and storage medium provided by some embodiments of the present application, when a surveillance video clip of an unmanned store is obtained, the computer device can identify the finger joint print areas in each frame of video image and mark the finger structures in each frame according to those areas, where the finger structures can reflect the bending state of each of the customer's fingers. The computer device can input the surveillance video clip marked with the finger structures into the hand action classification model, so that the model can accurately identify and output the customer's hand action type based on the bending states of the fingers. The computer device can then determine, based on the hand action type, whether commodities in the unmanned store have been artificially damaged. In this way, whether the commodities in the unmanned store are artificially damaged can be detected automatically and accurately, further improving the degree of automation of the unmanned store.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram illustrating an application of a method for detecting damage to a commodity according to an embodiment;
FIG. 2 is a flow chart of a method for detecting damage to an article according to one embodiment;
FIG. 3 is a flowchart illustrating a step of generating a detection result based on a hand motion type in one embodiment;
FIG. 4 is a block diagram of a merchandise damage detection device in one embodiment;
fig. 5 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
In some embodiments, the method for detecting damage to goods provided in the present application may be applied to the application environment shown in fig. 1. The computer device 104 may be a device having a data processing function, and may be, but not limited to, various personal computers, notebook computers, and servers. The camera 102 may be a device having a video capturing function and a communication function. The number of cameras 102 may be one or more, which is not particularly limited herein. Each camera 102 may be provided in the unmanned store and generate a surveillance video of the unmanned store so that in-store conditions of the unmanned store may be recorded through the surveillance video. The computer device 104 may be communicatively or electrically connected to each of the cameras 102 to obtain the surveillance video captured by the cameras 102 and determine whether the merchandise in the unmanned store is artificially damaged based thereon.
It will be appreciated that the number of cameras and the location of each camera 102 may be determined according to the area of the unmanned shop, the layout of the unmanned shop, and/or the merchandise placement of the unmanned shop, which is not specifically limited herein. In one example, the cameras 102 may include at least one first camera and at least one second camera, each of the first cameras may be disposed at a high point location of the unmanned store, and each of the second cameras may be disposed on each shelf in a one-to-one correspondence so as to collect customer behavior at a close distance and obtain surveillance video.
In one embodiment, each shelf of the unmanned store may also be provided with a weight sensor. Each weight sensor is used for collecting the commodity weight corresponding to the corresponding goods shelf, and the commodity weight can be the total weight of the commodities supported by the corresponding goods shelf. The computer device 104 may be communicatively or electrically coupled to each weight sensor to obtain the weight of the merchandise corresponding to each shelf in the unmanned store.
In one embodiment, the unmanned store may also be provided with an alarm device, and the computer device 104 may be communicatively or electrically connected to the alarm device and may actuate the alarm device to alarm if it is detected that the merchandise in the unmanned store is artificially damaged. In one example, the alarm device may be an audible and visual alarm device.
In one embodiment, the present application provides a method for detecting damage to a commodity, and the following embodiment is described by taking an example that the method is applied to a computer device shown in fig. 1. As shown in fig. 2, the commodity damage detection method of the present application may include the steps of:
s202: and under the condition that the preset detection rule is met, acquiring the monitoring video clip of the unmanned store.
The preset detection rules can be used for judging whether to trigger commodity damage detection so as to detect whether the commodities in the unmanned store are artificially damaged. It will be appreciated that the rule content of the preset detection rule may be determined according to actual factors such as computing resources of the computer device, configuration information received by the computer device, the number of cameras set in the unmanned store, and/or the setting positions of the respective cameras, which is not particularly limited herein. In one example, the merchandise damage detection may be performed at all times.
The surveillance video clip may be a video clip to be subjected to commodity damage detection, for example, a clip taken from a complete surveillance video and/or a surveillance video captured by a particular camera. Because an unmanned store involves little manual intervention, it is monitored around the clock, so the complete surveillance video is long. The computer device can therefore intercept a shorter surveillance video clip from the complete surveillance video and perform commodity damage detection based on that clip. It can be appreciated that the manner of obtaining the surveillance video clip and/or its duration can be determined according to the actual situation, which is not limited herein.
When the unmanned store is provided with a plurality of cameras, the shooting angles of the cameras are different, so that the computer equipment can obtain monitoring video clips according to the monitoring videos collected by the specific cameras, and commodity damage detection is carried out according to the monitoring video clips.
S204: and identifying a finger joint print area in each frame of video image of the monitoring video clip, and marking a finger structure in the video image according to the area coordinates of each finger joint print area in the video image.
The finger joint lines refer to the textures of the finger joints on the back of the finger, such as joint lines on metacarpophalangeal joints and joint lines on interphalangeal joints. The region coordinates of the fingerprint region may be used to reflect where the fingerprint region is located in the video image. The finger structure may be used to reflect the bending state of each finger. Further, the bending state may include whether the finger is bent, and a bending direction and a bending angle of the finger at the time of bending.
Because the finger joint prints are positioned on the back of the fingers and are easy to capture by the camera, the computer equipment can detect the finger joint prints so as to facilitate the subsequent recognition of the hand actions. Specifically, the surveillance video clip includes at least one frame of video image, and for each frame of video image, the computer device may execute steps A1 to A3 to mark the finger structure in the frame of video image:
step A1: a finger joint print region is identified in the frame video image. That is, each finger joint print is identified in the frame video image, and in the case that at least one finger joint print is identified, the area where each finger joint print is located in the frame video image is determined separately, so as to obtain each finger joint print area.
Step A3: and marking the finger structure in the frame of video image according to the region coordinates of each finger joint print region in the frame of video image. In other words, if the computer device identifies at least one finger joint print area in the frame video image, it can be determined that the monitoring video clip records the hand action of the customer, so the computer device can mark the finger structure in the frame video image according to the area coordinates of each finger joint print area to mark the bending state of each finger.
S206: and inputting the monitoring video segments marked with the finger structures into a hand motion classification model which is trained in advance so as to obtain the hand motion types output by the hand motion classification model.
The monitoring video segment marked with the finger structure can be a monitoring video segment marked with the finger structure in each frame of target video image, and the target video image can be a video image capable of identifying at least one finger joint print area.
Specifically, the computer device may input the surveillance video clip marked with the finger structures into the hand action classification model, so that the model can identify and output the hand action type based on the bending state of each finger. In one example, the hand action classification model may be pre-trained on a hand action training set, where the training set includes multiple groups of hand action samples, and each group includes a hand action video, in which a person's hand marked with finger structures acts on a commodity, and a pre-labeled action type. Further, the computer device may take 90% of the hand action samples as the training set, 5% as the validation set, and 5% as the test set, and train, validate, and test a deep learning network model accordingly to obtain the hand action classification model.
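As a concrete illustration of this 90/5/5 split, the following is a minimal sketch; the sample representation, the shuffling, and the fixed seed are assumptions for illustration, not part of the application:

```python
import random

def split_hand_action_samples(samples, seed=42):
    """Shuffle and split hand action samples 90/5/5 into train/val/test.

    The 90/5/5 ratios come from the text above; the shuffling, seed, and
    sample representation are illustrative assumptions.
    """
    rng = random.Random(seed)
    samples = list(samples)          # copy to avoid mutating the caller's list
    rng.shuffle(samples)
    n_train = int(len(samples) * 0.90)
    n_val = int(len(samples) * 0.05)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test
```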
S208: and generating a detection result based on the hand action type, wherein the detection result is used for indicating whether the goods in the unmanned store are damaged by people.
The computer device may determine whether or not the commodity in the unmanned store is artificially damaged based on the hand motion type output by the hand motion classification model. In one example, a computer device may obtain a pre-built set of action types that may include various types of damage actions that may result in damage to the merchandise, e.g., the set of action types may include damage action types such as twisting, tearing, kneading, and the like. The computer device may determine whether the hand motion type output by the hand motion classification model exists in the motion type set, and generate a detection result according to the determination result.
In one embodiment, the method for detecting damage to goods of the present application may further include: and if the detection result is a damage detection result for indicating that the goods in the unmanned store are damaged artificially, driving an alarm device arranged in the unmanned store to alarm. For example, the computer device may drive the alarm means to sound an alarm. Thus, the customer can be timely reminded and warned.
In one embodiment, the method for detecting damage to goods of the present application may further include: if the detection result is a damage detection result indicating that the goods in the unmanned store have been artificially damaged, the computer device may acquire the customer information of the customer in the store and add it to a blacklist so as to deny the customer reentry into the unmanned store. The customer information is acquired with the customer's authorization.
In one embodiment, the method for detecting damage to goods of the present application may further include: if the detection result is a damage detection result for indicating that the goods in the unmanned store are damaged by people, the computer equipment can send a reminding message to the target electronic terminal so as to inform relevant personnel of the unmanned store to arrive at the scene in time.
In the present application, when the surveillance video clip of the unmanned store is acquired, the computer device can identify the finger joint print areas in each frame of video image and mark the finger structures in each frame according to those areas, where the finger structures can reflect the bending state of each of the customer's fingers. The computer device can input the surveillance video clip marked with the finger structures into the hand action classification model, so that the model can accurately identify and output the customer's hand action type based on the bending states of the fingers. The computer device can then determine, based on the hand action type, whether the commodities in the unmanned store have been artificially damaged. In this way, whether the commodities in the unmanned store are artificially damaged can be detected automatically and accurately, further improving the degree of automation of the unmanned store.
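Putting S202 to S208 together, the overall flow can be sketched as follows; every callable here is a hypothetical stand-in for a component described in this application, not an actual API:

```python
def detect_commodity_damage(clip_frames, find_joint_print_areas,
                            mark_finger_structures, classify_hand_action,
                            damage_action_types):
    """End-to-end sketch of S202-S208 (all injected callables are
    hypothetical stand-ins for the components described above).

    clip_frames: frames of the surveillance video clip (S202).
    classify_hand_action: the pre-trained hand action classification model (S206).
    damage_action_types: set of hand action types regarded as damaging (S208).
    """
    labeled_frames = []
    for frame in clip_frames:
        areas = find_joint_print_areas(frame)                  # S204: identify areas
        labeled_frames.append(mark_finger_structures(frame, areas))  # S204: mark
    hand_action_type = classify_hand_action(labeled_frames)   # S206: classify
    return hand_action_type in damage_action_types            # S208: detect
```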
It will be appreciated that in S204, the computer device may identify, detect, and determine from any manner, the region of the fingerprint in the video image. In one embodiment, for each frame of video image, identifying a finger joint print area in the frame of video image may include:
for each frame of video image, identifying a hand region in the frame of video image, and, when the hand region is identified, cropping the hand region image from the frame according to the region coordinates of the hand region and processing the hand region image with a texture feature extraction algorithm to identify the finger joint print areas.
In this embodiment, the computer device may identify and extract a region of interest, i.e., the hand region, in the video image. For example, the computer device may detect the hand region in the video image using a deep-learning-based detection model, providing a basis for subsequent joint print recognition. The detection model can be implemented based on R-CNN, YOLO, or the like.
Under the condition that the hand area is identified, the computer equipment can divide the hand area image corresponding to the hand area from the video image according to the area coordinates of the hand area so as to avoid the interference of the background information of the video image on the subsequent identification and facilitate the subsequent processing of the hand area image.
Compared with other areas of the hand, the texture complexity of the finger joint print area is higher, so that the finger joint print area where the finger joint print is located can be extracted from the hand area image through a texture feature extraction algorithm. For example, the computer device may be implemented using texture feature extraction algorithms such as statistical methods, structural methods, transform-based methods, model-based methods, graph-based methods, learning-based methods, and/or entropy-based methods.
In this embodiment, identifying the finger joint print areas by texture analysis has the advantages of a simple algorithm, low cost, and high computation speed, and can therefore cope with real-time situations in unmanned stores.
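As one possible realization (a sketch only, since the application leaves the exact texture feature extraction algorithm open): candidate finger joint print areas can be located in a hand-region crop by measuring local gray-level variation, because knuckle skin is more textured than the surrounding skin. The local standard-deviation filter, window size, and thresholds below are illustrative assumptions:

```python
import cv2
import numpy as np

def find_joint_print_candidates(hand_crop, win=15, k=1.5, min_area=50):
    """Return bounding boxes of high-texture regions in a hand-region crop.

    Sketch only: uses a local standard-deviation filter as the texture
    measure; this is an illustrative choice, not the algorithm claimed
    by the application.
    """
    gray = cv2.cvtColor(hand_crop, cv2.COLOR_BGR2GRAY).astype(np.float32)
    mean = cv2.boxFilter(gray, -1, (win, win))
    sq_mean = cv2.boxFilter(gray * gray, -1, (win, win))
    local_std = np.sqrt(np.maximum(sq_mean - mean * mean, 0.0))
    # Keep pixels noticeably more textured than the crop's average.
    mask = (local_std > k * local_std.mean()).astype(np.uint8) * 255
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]
```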
In one embodiment, for each frame of video image, marking a finger structure in the video image according to the region coordinates of each of the finger joint print regions in the video image comprises:
for each frame of video image, respectively determining the joint pattern type corresponding to each finger joint pattern area according to the pixel information of each finger joint pattern area in the video image, grouping each finger joint pattern area based on each joint pattern type, and marking a finger structure in the video image according to the grouping condition and the area coordinates of each finger joint pattern area;
Wherein, each finger joint line area belonging to the same group corresponds to the same finger, and each finger joint line area belonging to different groups corresponds to different fingers respectively.
In this embodiment, the pixel information of each finger joint print area may be used to reflect the pixel distribution of the finger joint print area, and the type of the joint print corresponding to each finger joint print area may be used to reflect the distance relationship between the finger joint print area and the palm.
Specifically, joint prints may be divided into proximal joint prints, middle joint prints, and distal joint prints, where proximal joint prints are located at the metacarpophalangeal joints, distal joint prints are located at the interphalangeal joints near the fingertips, and middle joint prints are located at the interphalangeal joints near the palm. The joint prints of the thumb include a proximal joint print and a distal joint print, while the joint prints of the other four fingers each include a proximal joint print, a middle joint print, and a distal joint print. In one example, the joint print type corresponding to each finger joint print area may be proximal joint print, middle joint print, or distal joint print.
For each frame of video image of the surveillance video clip, the computer device may execute steps B1 to B5 to mark the finger structure in the frame of video image:
Step B1: and respectively determining the type of the joint print corresponding to each finger joint print area according to the pixel information of each finger joint print area in the frame video image so as to respectively determine the distance relation between each finger joint print area and the palm.
Step B3: grouping the finger joint print areas based on the joint print types. Specifically, the computer device may determine, according to the distance relationship between each finger joint print area and the palm in the frame of video image, whether any two finger joint print areas correspond to two joint prints on the same finger; if so, the two areas may be assigned to the same group, and otherwise to different groups.
Step B5: and marking a finger structure in the video image according to the grouping condition and the region coordinates of each finger joint print region.
In one example, the computer device may identify palm coordinates in the frame of video image. For each group, if the group includes a plurality of fingerprint areas, the computer device may construct a link according to the palm coordinates, the type of the fingerprint corresponding to each fingerprint area in the group, and the area coordinates of each fingerprint area. For example, the computer device may sequentially connect the palm coordinates, the region coordinates of the intermediate joint print region, and the region coordinates of the distal joint print region. The middle joint pattern area is a finger joint pattern area with joint pattern type being middle joint pattern, and the distal joint pattern area is a finger joint pattern area with joint pattern type being distal joint pattern. For each group, if the group includes a finger joint print area, the computer device may link the palm coordinates to the area coordinates of the finger joint print area included in the group to construct a link. Thus, the complete finger structure can be reflected by each connection line.
Fingers generally do not overlap one another during commodity damage behavior, and the camera's shooting angle rarely makes them appear to. Hence, in one example, for each frame of video image, if at least two of the connecting lines in the frame intersect, the computer device can determine that the current grouping is erroneous and regroup based on the joint print types, thereby improving grouping accuracy.
In this embodiment, the computer device may determine the joint print type corresponding to each finger joint print area and group the areas according to the joint print types, so that the grouping reflects the correspondence between each finger joint print area and a finger. The computer device can thus mark the finger structures accurately according to the grouping, improving the accuracy of the subsequent hand action classification.
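For illustration, the marking in step B5 under the grouping just described might look like the following sketch; the coordinate conventions and the dictionary layout of a group are assumptions, and OpenCV line drawing stands in for whatever marking format the classification model consumes:

```python
import cv2

def mark_finger_structures(image, palm_xy, groups):
    """Draw one polyline per finger: palm -> middle joint print -> distal
    joint print (or palm -> single joint print for a thumb-like group).

    `groups` is assumed to be a list of dicts such as
    {"middle": (x, y), "distal": (x, y)}, with either key optional.
    """
    for group in groups:
        points = [palm_xy]
        if "middle" in group:
            points.append(group["middle"])
        if "distal" in group:
            points.append(group["distal"])
        for a, b in zip(points, points[1:]):
            cv2.line(image, a, b, color=(0, 255, 0), thickness=2)
    return image
```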
In one embodiment, for each frame of video image, determining, according to pixel information of each finger joint print area in the video image, a joint print type corresponding to each finger joint print area includes:
for each finger joint print area of each frame of video image, generating a gray level histogram corresponding to the finger joint print area, calculating a pixel proportion corresponding to each gray level in the gray level histogram, determining texture complexity corresponding to the finger joint print area according to the pixel proportion corresponding to each gray level, and determining the joint print type corresponding to the finger joint print area based on the texture complexity.
Specifically, finger joint prints accommodate the flexion of the finger joints: the larger a joint's range of flexion, the more complex its joint print texture. Thus, the computer device may determine the joint print type corresponding to a finger joint print area based on the texture complexity corresponding to that area.
In this embodiment, when determining the joint print type corresponding to each finger joint print area, the computer device may compute the gray histogram of the finger joint print area from the pixel values of its pixels, and compute the pixel proportion corresponding to each gray level in the gray histogram. The pixel proportion corresponding to a gray level may be the ratio of the number of pixels at that gray level to the total number of pixels in the gray histogram. The computer device may determine the texture complexity corresponding to the finger joint print area from these pixel proportions, and then determine the joint print type from the texture complexity. This simplifies the computation of texture complexity and saves computing resources.
In one embodiment, for each finger joint print area of each frame of video image, determining the texture complexity corresponding to the finger joint print area according to the pixel proportion corresponding to each gray level includes:
For each finger joint print area of each frame of video image, calculating a gray average value of a gray histogram corresponding to the finger joint print area, and calculating texture complexity corresponding to the finger joint print area by adopting the following expression:
C = \sum_{i=1}^{L} (g_i - \mu)^2 \cdot p_i

where C is the texture complexity corresponding to the finger joint print area; L is the number of gray levels corresponding to the finger joint print area; g_i is the gray value corresponding to the i-th gray level in the gray histogram corresponding to the finger joint print area; \mu is the gray average value of the gray histogram corresponding to the finger joint print area; and p_i is the pixel proportion corresponding to the i-th gray level in that gray histogram.
In this embodiment, the computer device may calculate the texture complexity corresponding to each finger joint print area by using the above formula, which is simple to implement and has the advantages of low cost and fast calculation speed.
In one embodiment, for each finger joint print area of each frame of video image, the computer device may calculate the gray average value corresponding to that finger joint print area using the following expression:
\mu = \sum_{i=1}^{L} g_i \cdot p_i

where \mu is the gray average value of the gray histogram corresponding to the finger joint print area; L is the number of gray levels corresponding to the finger joint print area; g_i is the gray value corresponding to the i-th gray level in the gray histogram; and p_i is the pixel proportion corresponding to the i-th gray level in the gray histogram.
In one embodiment, for each finger joint print area of each frame of video image, determining the joint print type corresponding to the finger joint print area based on the texture complexity includes:
for each finger joint print area of each frame of video image, if the texture complexity corresponding to the finger joint print area is greater than a preset complexity threshold, the joint print type corresponding to the finger joint print area is middle joint print; otherwise, the joint print type corresponding to the finger joint print area is distal joint print.
Specifically, the larger a finger joint's range of flexion, the more complex its joint print texture, so the texture complexity of a middle joint print is higher than that of a distal joint print. In this embodiment, once the texture complexity corresponding to a finger joint print area is determined, the computer device may compare it with a preset complexity threshold and determine the joint print type from the comparison. When the texture complexity corresponding to the finger joint print area is greater than the preset complexity threshold, the joint print type corresponding to the area is middle joint print. When the texture complexity is less than or equal to the preset complexity threshold, the joint print type is distal joint print.
It will be appreciated that the specific value of the preset complexity threshold may be determined according to the actual situation, and this is not particularly limited herein.
This embodiment determines the joint print type through a simple numerical comparison, which is easy to implement, low in cost, and fast to compute.
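A minimal sketch of the histogram statistics and threshold rule in the preceding embodiments follows; the computations mirror the expressions above, while the concrete threshold value is an illustrative assumption:

```python
import numpy as np

def joint_print_type(region_gray, complexity_threshold=500.0):
    """Classify a finger joint print area as 'middle' or 'distal'.

    region_gray: 2-D uint8 array, the grayscale finger joint print area.
    Computes p_i (pixel proportions), mu = sum(g_i * p_i), and the texture
    complexity C = sum((g_i - mu)^2 * p_i) as in the expressions above.
    The threshold value 500.0 is an illustrative assumption.
    """
    hist = np.bincount(region_gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()                  # pixel proportion per gray level
    g = np.arange(256, dtype=np.float64)   # gray value per gray level
    mu = float(np.sum(g * p))              # gray average of the histogram
    complexity = float(np.sum((g - mu) ** 2 * p))
    return "middle" if complexity > complexity_threshold else "distal"
```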
In one embodiment, the joint print type corresponding to each finger joint print area is middle joint print or distal joint print. For a specific description of middle joint prints and distal joint prints, reference may be made to the above embodiments, which are not repeated here.
Grouping the finger joint print areas based on the joint print types, for each frame of video image, includes:
for each frame of video image, determining the nearest distal joint print area corresponding to each middle joint print area in the video image and the nearest middle joint print area corresponding to each distal joint print area, respectively; wherein a finger joint print area whose joint print type is middle joint print is a middle joint print area, and a finger joint print area whose joint print type is distal joint print is a distal joint print area;
for each middle joint print area, the middle joint print area is taken as a first target area and the nearest distal joint print area corresponding to the first target area is taken as a second target area; if the nearest middle joint print area corresponding to the second target area is the first target area, the first target area and the second target area are determined to belong to the same group; otherwise, the first target area and the second target area are determined to belong to different groups.
The nearest distal joint print region corresponding to the middle joint print region may be a distal joint print region closest to the middle joint print region in the same video image. The closest intermediate joint print region corresponding to the distal joint print region may be the intermediate joint print region closest to the distal joint print region in the same video image.
For each frame of video image, the computer device may group the middle joint print areas and distal joint print areas according to the nearest distal joint print area corresponding to each middle joint print area and the nearest middle joint print area corresponding to each distal joint print area. Specifically, the computer device may match the middle joint print areas and distal joint print areas in the frame in pairs. During pairwise matching, the computer device may select one middle joint print area and one distal joint print area; if the nearest distal joint print area corresponding to the selected middle joint print area is the selected distal joint print area, and the nearest middle joint print area corresponding to the selected distal joint print area is the selected middle joint print area, the two selected areas may be assigned to the same group. If either condition fails, the selected middle joint print area and the selected distal joint print area are assigned to different groups.
In this way, grouping accuracy can be improved, and it can be ensured that the finger joint print areas belonging to the same group correspond to the same finger, which in turn improves the accuracy of finger structure marking and hence of hand action classification.
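A sketch of this mutual-nearest-neighbor pairing is given below; representing each area by its center point and using Euclidean distance are illustrative assumptions:

```python
import math

def pair_joint_print_areas(middles, distals):
    """Pair middle and distal joint print areas by mutual nearest neighbor.

    middles, distals: lists of (x, y) area centers in one video frame.
    Returns (middle_index, distal_index) pairs judged to lie on the same
    finger; unpaired areas belong to separate groups. Euclidean distance
    on area centers is an illustrative assumption.
    """
    def nearest(point, candidates):
        return min(range(len(candidates)),
                   key=lambda j: math.dist(point, candidates[j]))

    pairs = []
    if not middles or not distals:
        return pairs
    for i, middle in enumerate(middles):
        j = nearest(middle, distals)            # nearest distal to this middle
        if nearest(distals[j], middles) == i:   # and it points back again
            pairs.append((i, j))
    return pairs
```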
In one embodiment, as shown in fig. 3, the generating a detection result based on the hand action type includes:
S302: identifying the commodity type of the held commodity according to the surveillance video clip;
S304: determining a set of damage action types corresponding to the held commodity based on the commodity type;
S306: if the set of damage action types includes the hand action type, generating a damage detection result indicating that the goods in the unmanned store have been artificially damaged.
Wherein the held commodity may be a commodity held in a hand.
In particular, different kinds of merchandise may correspond to different merchandise damage behaviors. For example, if the product is a beverage, this may mean that the beverage bottle cap is unscrewed when the consumer makes a twisting action on the product, which in turn results in the product being damaged. As another example, if the product is a pouch food, the act of tearing the product by the consumer may mean that the product package is torn open, thereby causing the product to be damaged. For another example, if the commodity is puffed food, the food may be broken when the customer performs kneading, squeezing, etc. on the commodity, thereby damaging the commodity.
In this embodiment, the computer device may identify the commodity type of the held commodity from the surveillance video clip and, based on the commodity type, determine the set of damage action types corresponding to the held commodity, where the set may include each target hand action type that could cause damage to that commodity. The computer device may combine the set of damage action types with the customer's hand action type to determine whether the commodity has been artificially damaged: when the hand action type output by the hand action classification model is one of the target hand action types, the computer device can determine that the commodity in the unmanned store has been artificially damaged and generate a damage detection result.
In this embodiment, determining the set of damage action types according to the commodity type, and jointly judging from that set and the hand action type whether the commodity has been artificially damaged, further improves the accuracy of commodity damage detection.
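For illustration, this commodity-specific check might look like the following sketch; the mapping entries and type names are assumptions, not a list claimed by the application:

```python
# Hypothetical mapping from commodity type to the hand action types that
# could damage it; every entry here is an illustrative assumption.
DAMAGE_ACTION_TYPES = {
    "bottled_beverage": {"twist"},
    "bagged_food": {"tear"},
    "puffed_food": {"knead", "squeeze"},
}

def is_artificially_damaged(commodity_type: str, hand_action_type: str) -> bool:
    """Return True if the classified hand action can damage this commodity."""
    damage_set = DAMAGE_ACTION_TYPES.get(commodity_type, set())
    return hand_action_type in damage_set
```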
It will be appreciated that any manner may be used by the present application to identify the type of merchandise being held. In one embodiment, the computer device may identify the type of merchandise using a pre-trained merchandise identification model. The commodity identification model can be obtained by training a commodity training set, wherein the commodity training set can comprise a plurality of groups of commodity training samples, and each group of commodity training samples can comprise a pre-collected commodity image and a pre-labeled commodity type. Further, each commodity image included in the commodity training set may be an image obtained by photographing a commodity placed in a store under a multi-angle and multi-illumination condition.
In one embodiment, the generating a detection result based on the hand action type includes:
identifying the commodity type of the held commodity according to the monitoring video clip;
determining a damage action type set corresponding to the held commodity based on the commodity type;
and if the set of damage action types includes the hand action type, respectively acquiring the hand coordinates and the commodity coordinates of the held commodity, and generating a detection result based on the commodity coordinates and the hand coordinates.
In this embodiment, the specific description of identifying the commodity type and determining the set of damage action types may refer to the above embodiments and is not repeated here. If the set of damage action types includes the hand action type, the hand coordinates and the commodity coordinates of the held commodity can be acquired respectively, and whether the commodity is still held in the hand is judged again based on the commodity coordinates and the hand coordinates before the detection result is generated. This further improves the accuracy of commodity damage detection.
In one embodiment, the method for detecting damage to goods of the present application may further include: if the detection result is a damage detection result indicating that the goods in the unmanned store have been artificially damaged, generating a settlement bill according to the commodity information of the held commodity. In this way, the price of the artificially damaged commodity can be added to the settlement bill.
In one embodiment, the obtaining the surveillance video clip of the unmanned store when the preset detection rule is satisfied includes:
acquiring the commodity weight corresponding to each goods shelf in the unmanned store in real time;
when the weight of the commodity corresponding to any goods shelf is detected to be reduced, starting a target camera to acquire the monitoring video segment through the target camera; the target camera is a camera arranged on a target goods shelf, and the target goods shelf is a goods shelf with reduced commodity weight.
Specifically, each shelf in the unmanned store can be provided with a weight sensor to detect in real time the commodity weight corresponding to that shelf, which may be the total weight of the commodities the shelf supports. When a commodity leaves a shelf, the commodity weight corresponding to that shelf decreases. The computer device can therefore acquire the commodity weight corresponding to each shelf in real time and, when the commodity weight corresponding to any shelf decreases, start the target camera arranged on that shelf, so that customer behavior is captured at close range and the surveillance video clip is obtained. This reduces resource consumption, so commodity damage detection can be completed with fewer resources.
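A sketch of this weight-triggered capture follows; the `read_weight` and `capture_clip` interfaces and the polling loop are hypothetical placeholders for the sensor and camera integration described above:

```python
import time

def monitor_shelves(shelves, clip_seconds=30, poll_seconds=0.5):
    """Poll shelf weight sensors; when a shelf's commodity weight drops,
    start that shelf's camera and yield the captured clip.

    `shelves` is assumed to be objects exposing hypothetical `read_weight()`
    and `camera.capture_clip(seconds=...)` methods.
    """
    last_weight = {shelf.id: shelf.read_weight() for shelf in shelves}
    while True:
        for shelf in shelves:
            weight = shelf.read_weight()
            if weight < last_weight[shelf.id]:   # a commodity left the shelf
                clip = shelf.camera.capture_clip(seconds=clip_seconds)
                yield shelf.id, clip             # hand off for damage detection
            last_weight[shelf.id] = weight
        time.sleep(poll_seconds)
```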
Further, in one embodiment, when detecting a decrease in the commodity weight corresponding to any shelf, the computer device may also acquire a surveillance video clip captured by a first camera disposed at a high point of the unmanned store, and perform commodity damage detection according to both the clip captured by the first camera and the clip captured by the target camera. In this way, the customer's behavior can be identified from multiple angles, improving recognition efficiency and accuracy.
The commodity damage detection apparatus provided in the embodiments of the present application will be described below, and the commodity damage detection apparatus described below and the commodity damage detection method described above may be referred to correspondingly to each other.
In one embodiment, an embodiment of the present application provides a merchandise damage detection device 400. As shown in fig. 4, the apparatus 400 may include:
the video clip obtaining module 410 is configured to obtain a monitoring video clip of the unmanned store if a preset detection rule is satisfied;
the finger structure labeling module 420 is configured to identify, for each frame of video image of the monitoring video clip, a finger joint print region in the video image, and label a finger structure in the video image according to region coordinates of each finger joint print region in the video image;
The hand motion type determining module 430 is configured to input the monitoring video segment with the labeled finger structure into a hand motion classification model obtained by training in advance, so as to obtain a hand motion type output by the hand motion classification model;
and a detection result generating module 440, configured to generate a detection result based on the hand action type, where the detection result is used to indicate whether the goods in the unmanned shop are damaged by human.
In one embodiment, the finger structure labeling module 420 may include a structure labeling unit. The structure labeling unit is used for respectively determining the joint pattern type corresponding to each finger joint pattern area according to the pixel information of each finger joint pattern area in each frame of video image, grouping the finger joint pattern areas based on the joint pattern type, and labeling the finger structure in the video image according to the grouping condition and the area coordinates of the finger joint pattern areas. Wherein, each finger joint line area belonging to the same group corresponds to the same finger, and each finger joint line area belonging to different groups corresponds to different fingers respectively.
In one embodiment, the structure labeling unit may include a joint print type determining unit. The joint print type determining unit is used for generating, for each finger joint print area of each frame of video image, a gray histogram corresponding to the finger joint print area, calculating the pixel proportion corresponding to each gray level in the gray histogram, determining the texture complexity corresponding to the finger joint print area according to the pixel proportions, and determining the joint print type corresponding to the finger joint print area based on the texture complexity.
In one embodiment, the joint print type determining unit may include a texture complexity calculation unit. The texture complexity calculation unit is used for calculating, for each finger joint print area of each frame of video image, the gray average value of the gray histogram corresponding to the finger joint print area, and calculating the texture complexity corresponding to the finger joint print area by the following expression:
in the method, in the process of the invention,texture complexity corresponding to the knuckle region;The number of gray levels corresponding to the finger joint print area;In the gray level histogram corresponding to the finger joint print area, the firstiGray values corresponding to the respective gray levels;a gray average value of a gray histogram corresponding to the finger joint print area;In the gray level histogram corresponding to the finger joint print area, the firstiThe proportion of pixels corresponding to the respective gray levels.
In one embodiment, the joint print type determining unit may include a texture complexity comparison unit. For each finger joint print area of each frame of video image, if the texture complexity corresponding to the area is greater than a preset complexity threshold, the texture complexity comparison unit determines the joint print type of the area to be a middle joint print; otherwise, it determines the joint print type of the area to be a distal joint print.
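As a concrete illustration, the sketch below computes the texture complexity of a cropped finger joint print area and applies the threshold test above. The weighted variance computation follows the expression given earlier; the use of 256 gray levels and the threshold value of 900 are assumptions chosen for the example, not values fixed by the application.

```python
import numpy as np

def texture_complexity(area: np.ndarray) -> float:
    """Weighted gray-level variance of an 8-bit grayscale joint print area."""
    hist, _ = np.histogram(area, bins=256, range=(0, 256))
    p = hist / hist.sum()              # pixel proportion p_i of each gray level
    g = np.arange(256, dtype=float)    # gray value g_i of each gray level
    mean = float((g * p).sum())        # gray average value of the histogram
    return float((((g - mean) ** 2) * p).sum())

def joint_print_type(area: np.ndarray, threshold: float = 900.0) -> str:
    """Middle joint prints carry denser creases, hence higher complexity."""
    return "middle" if texture_complexity(area) > threshold else "distal"

# Usage with a random stand-in for a cropped joint print area:
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
print(joint_print_type(patch))
```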
In one embodiment, the joint print type corresponding to each finger joint print area is either a middle joint print or a distal joint print. The structure labeling unit may include a nearest region determining unit and a grouping unit. For each frame of video image, the nearest region determining unit determines, for each middle joint print area in the video image, its nearest distal joint print area, and, for each distal joint print area, its nearest middle joint print area; here, a finger joint print area whose joint print type is the middle joint print is a middle joint print area, and a finger joint print area whose joint print type is the distal joint print is a distal joint print area. The grouping unit takes each middle joint print area as a first target area and its nearest distal joint print area as a second target area; if the nearest middle joint print area corresponding to the second target area is the first target area, the first target area and the second target area belong to the same group, otherwise they belong to different groups.
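A minimal sketch of this mutual nearest neighbor grouping is shown below; representing each area by its center coordinate and using Euclidean distance are assumptions of the example, since the application does not fix a distance metric.

```python
import numpy as np

def group_joint_print_areas(middle: np.ndarray, distal: np.ndarray):
    """Pair middle and distal joint print centers that are mutually nearest.

    middle: (M, 2) array of middle joint print area centers (x, y).
    distal: (D, 2) array of distal joint print area centers (x, y).
    Returns (middle_index, distal_index) pairs, one group per finger.
    """
    pairs = []
    for i, m in enumerate(middle):
        # nearest distal area to the first target area (this middle area)
        j = int(np.argmin(np.linalg.norm(distal - m, axis=1)))
        # nearest middle area back from that second target area
        back = int(np.argmin(np.linalg.norm(middle - distal[j], axis=1)))
        if back == i:              # mutual nearest -> same group (same finger)
            pairs.append((i, j))
    return pairs

middle = np.array([[10.0, 40.0], [30.0, 42.0], [50.0, 41.0]])
distal = np.array([[11.0, 20.0], [29.0, 22.0], [52.0, 21.0]])
print(group_joint_print_areas(middle, distal))  # [(0, 0), (1, 1), (2, 2)]
```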
In one embodiment, the detection result generation module 440 may include a commodity type identification unit, a set determination unit, and a result generation unit. The commodity type identification unit is configured to identify the commodity type of the held commodity according to the monitoring video clip. The set determination unit is configured to determine, based on the commodity type, a damage action type set corresponding to the held commodity. The result generation unit is configured to generate, if the damage action type set includes the hand action type, a damage detection result indicating that a commodity in the unmanned store has been damaged by human action.
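For illustration, the lookup performed by the set determination unit and the result generation unit might be sketched as follows; every commodity type and action label below is an invented placeholder, as the application does not enumerate concrete categories.

```python
# Hypothetical mapping from commodity type to its damage action type set.
DAMAGE_ACTION_SETS = {
    "beverage":  {"squeeze", "shake_violently"},
    "snack":     {"squeeze", "tear_open"},
    "glassware": {"knock", "drop"},
}

def is_damaged(commodity_type: str, hand_action_type: str) -> bool:
    """True if the classified hand action is a damage action for this commodity."""
    return hand_action_type in DAMAGE_ACTION_SETS.get(commodity_type, set())

print(is_damaged("snack", "tear_open"))     # True -> generate damage detection result
print(is_damaged("beverage", "tear_open"))  # False
```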
In one embodiment, the video clip acquisition module 410 may include a weight acquisition unit and a camera activation unit. The weight acquisition unit is configured to acquire, in real time, the commodity weight corresponding to each shelf in the unmanned store. The camera activation unit is configured to activate a target camera when a decrease in the commodity weight corresponding to any shelf is detected, so that the monitoring video clip is obtained through the target camera; the target camera is a camera disposed on a target shelf, and the target shelf is the shelf whose commodity weight decreased.
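The trigger logic might look like the following sketch, assuming a polling loop; the read_weight callable, the shelf identifiers, and the start_camera hook are hypothetical stand-ins for the store's weight sensors and camera control.

```python
from typing import Callable, Dict

def watch_shelves(read_weight: Callable[[str], float],
                  last_weights: Dict[str, float],
                  start_camera: Callable[[str], None]) -> None:
    """Activate the shelf-mounted target camera whenever a shelf's weight drops."""
    for shelf_id, previous in last_weights.items():
        current = read_weight(shelf_id)
        if current < previous:        # a commodity was picked up from this shelf
            start_camera(shelf_id)    # begin recording the monitoring video clip
        last_weights[shelf_id] = current
```

In practice this loop would run at the sensors' sampling rate, and a small tolerance could be added to the comparison to avoid triggering on sensor noise.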
In one embodiment, the present application further provides a storage medium having stored therein computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the commodity damage detection method of any of the embodiments described above.
In one embodiment, the present application further provides a computer device having stored therein computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the commodity damage detection method of any of the embodiments described above.
Illustratively, fig. 5 is a schematic diagram of the internal structure of a computer device provided in an embodiment of the present application; in one example, the computer device may be a server. Referring to fig. 5, the computer device 900 includes a processing component 902, which in turn includes one or more processors, and memory resources represented by a memory 901 for storing instructions executable by the processing component 902, such as an application program. The application program stored in the memory 901 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 902 is configured to execute the instructions to perform the steps of the commodity damage detection method of any of the embodiments described above.
The computer device 900 may also include a power component 903 configured to perform power management of the computer device 900, a wired or wireless network interface 904 configured to connect the computer device 900 to a network, and an input/output (I/O) interface 905. The computer device 900 may operate based on an operating system stored in the memory 901, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
It will be appreciated by those skilled in the art that the internal structure of the computer device shown in the present application is merely a block diagram of some of the structures related to the aspects of the present application and does not limit the computer devices to which the aspects of the present application apply; a particular computer device may include more or fewer components than those shown in the figures, combine some of the components, or have a different arrangement of components.
Finally, it is further noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises," "comprising," and any variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element. Herein, "a," "an," and "the" may also include plural forms unless the context clearly indicates otherwise. "Plural" means at least two, for example 2, 3, 5, or 8. "And/or" includes any and all combinations of the associated listed items.
In this specification, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the others; the embodiments may be combined as needed, and for identical or similar parts, reference may be made to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of detecting damage to a commodity, the method comprising:
under the condition that a preset detection rule is met, acquiring a monitoring video clip of the unmanned store;
identifying a finger joint print area in each frame of video image of the monitoring video clip, and marking a finger structure in the video image according to the area coordinates of each finger joint print area in the video image;
inputting the monitoring video clip marked with the finger structures into a hand action classification model trained in advance, so as to obtain a hand action type output by the hand action classification model;
and generating a detection result based on the hand action type, wherein the detection result indicates whether a commodity in the unmanned store has been damaged by human action.
2. The method of claim 1, wherein for each frame of video image, marking finger structures in the video image according to the region coordinates of each of the finger joint print regions in the video image comprises:
for each frame of video image, respectively determining the joint print type corresponding to each finger joint print area according to the pixel information of each finger joint print area in the video image, grouping the finger joint print areas based on the joint print types, and marking the finger structures in the video image according to the grouping result and the area coordinates of each finger joint print area;
wherein finger joint print areas belonging to the same group correspond to the same finger, and finger joint print areas belonging to different groups correspond to different fingers.
3. The method according to claim 2, wherein, for each frame of video image, respectively determining the joint print type corresponding to each of the finger joint print areas according to the pixel information of each of the finger joint print areas in the video image comprises:
for each finger joint print area of each frame of video image, generating a gray level histogram corresponding to the finger joint print area, calculating a pixel proportion corresponding to each gray level in the gray level histogram, determining texture complexity corresponding to the finger joint print area according to the pixel proportion corresponding to each gray level, and determining the joint print type corresponding to the finger joint print area based on the texture complexity.
4. The method according to claim 3, wherein, for each finger joint print area of each frame of video image, determining the texture complexity corresponding to the finger joint print area according to the pixel proportion corresponding to each gray level comprises:
for each finger joint print area of each frame of video image, calculating a gray average value of a gray histogram corresponding to the finger joint print area, and calculating texture complexity corresponding to the finger joint print area by adopting the following expression:
$$C=\sum_{i=1}^{N}\left(g_{i}-\bar{g}\right)^{2}p_{i}$$
where $C$ is the texture complexity corresponding to the finger joint print area; $N$ is the number of gray levels corresponding to the finger joint print area; $g_{i}$ is the gray value of the $i$-th gray level in the gray histogram corresponding to the finger joint print area; $\bar{g}$ is the gray average value of that gray histogram; and $p_{i}$ is the pixel proportion of the $i$-th gray level in that gray histogram.
5. The method according to claim 3, wherein, for each finger joint print area of each frame of video image, determining the joint print type corresponding to the finger joint print area based on the texture complexity comprises:
for each finger joint print area of each frame of video image, if the texture complexity corresponding to the finger joint print area is greater than a preset complexity threshold, determining that the joint print type corresponding to the finger joint print area is a middle joint print; otherwise, determining that the joint print type corresponding to the finger joint print area is a distal joint print.
6. The method of claim 2, wherein the joint print type corresponding to each of the finger joint print areas is either a middle joint print or a distal joint print;
and wherein grouping the finger joint print areas based on the joint print types for each frame of video image comprises:
for each frame of video image, determining the nearest distal joint print area corresponding to each middle joint print area in the video image, and the nearest middle joint print area corresponding to each distal joint print area; wherein a finger joint print area whose joint print type is the middle joint print is a middle joint print area, and a finger joint print area whose joint print type is the distal joint print is a distal joint print area;
for each middle joint print area, taking the middle joint print area as a first target area and the nearest distal joint print area corresponding to the first target area as a second target area; if the nearest middle joint print area corresponding to the second target area is the first target area, determining that the first target area and the second target area belong to the same group; otherwise, determining that the first target area and the second target area belong to different groups.
7. The method of any one of claims 1 to 6, wherein generating a detection result based on the hand action type comprises:
identifying the commodity type of the held commodity according to the monitoring video clip;
determining a damage action type set corresponding to the held commodity based on the commodity type;
and if the damage action type set includes the hand action type, generating a damage detection result indicating that a commodity in the unmanned store has been damaged by human action.
8. The method according to any one of claims 1 to 6, wherein the acquiring the surveillance video clip of the unmanned store if the preset detection rule is satisfied comprises:
acquiring, in real time, the commodity weight corresponding to each shelf in the unmanned store;
when a decrease in the commodity weight corresponding to any shelf is detected, activating a target camera to acquire the monitoring video clip through the target camera; wherein the target camera is a camera disposed on a target shelf, and the target shelf is the shelf whose commodity weight decreased.
9. A merchandise damage detection device, the device comprising:
the video segment acquisition module is used for acquiring the monitoring video segment of the unmanned store under the condition that the preset detection rule is met;
the finger structure labeling module is used for identifying finger joint print areas in each frame of video image of the monitoring video clip, and labeling finger structures in the video image according to the area coordinates of each finger joint print area in the video image;
the hand action type determining module is used for inputting the monitoring video clip marked with the finger structures into a hand action classification model trained in advance, so as to obtain the hand action type output by the hand action classification model;
and the detection result generation module is used for generating a detection result based on the hand action type, wherein the detection result indicates whether a commodity in the unmanned store has been damaged by human action.
10. A storage medium having stored therein computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the commodity damage detection method of any one of claims 1 to 8.
CN202311499128.2A 2023-11-13 2023-11-13 Commodity damage detection method, commodity damage detection device and storage medium Active CN117253194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311499128.2A CN117253194B (en) 2023-11-13 2023-11-13 Commodity damage detection method, commodity damage detection device and storage medium


Publications (2)

Publication Number Publication Date
CN117253194A true CN117253194A (en) 2023-12-19
CN117253194B CN117253194B (en) 2024-03-19

Family

ID=89126641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311499128.2A Active CN117253194B (en) 2023-11-13 2023-11-13 Commodity damage detection method, commodity damage detection device and storage medium

Country Status (1)

Country Link
CN (1) CN117253194B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102881100A (en) * 2012-08-24 2013-01-16 济南纳维信息技术有限公司 Video-analysis-based antitheft monitoring method for physical store
WO2019033635A1 (en) * 2017-08-16 2019-02-21 图灵通诺(北京)科技有限公司 Purchase settlement method, device, and system
CN109360331A (en) * 2017-12-29 2019-02-19 广州Tcl智能家居科技有限公司 A kind of automatic vending method and automatic vending machine based on article identification
CN109977896A (en) * 2019-04-03 2019-07-05 上海海事大学 A kind of supermarket's intelligence vending system
CN115993887A (en) * 2022-11-23 2023-04-21 湖南提奥医疗科技有限公司 A gesture interaction control method, device, equipment, and storage medium
CN116720899A (en) * 2023-08-09 2023-09-08 山东商软信息科技有限公司 Super-intelligent business monitoring management method, device, electronic equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YIN Min et al.: "Human Detection of Vending-Cabinet Damage Behavior Based on Improved YOLOv4-Tiny", Computer Engineering and Applications, pages 1-10 *

Also Published As

Publication number Publication date
CN117253194B (en) 2024-03-19

Similar Documents

Publication Publication Date Title
US10242267B2 (en) Systems and methods for false alarm reduction during event detection
CN103119607B (en) Optimization determined based on the human activity of the video
US8761451B2 (en) Sequential event detection from video
CN111626201A (en) Commodity detection method and device and readable storage medium
CN108320404A (en) Commodity recognition method, device, self-service cashier based on neural network
CN107679475B (en) Store monitoring and evaluating method and device and storage medium
CN112100425B (en) Label labeling method and device based on artificial intelligence, electronic equipment and medium
RU2695056C1 (en) System and method for detecting potential fraud on the part of a cashier, as well as a method of forming a sampling of images of goods for training an artificial neural network
CN106355367A (en) Warehouse monitoring management device
CN110348293B (en) Commodity identification method and system
CN114359819A (en) Image processing method, apparatus, device, storage medium and computer program product
CN110659588A (en) Passenger flow volume statistical method and device and computer readable storage medium
CN108154115A (en) Object identifying method and device, computing device based on camera scene
JP2016157165A (en) Person identification system
CN113469138B (en) Object detection method and device, storage medium and electronic device
CN113627426A (en) Method, server, equipment and storage medium for detecting specific materials of store
CN117789007A (en) Product image information collection system, method and device
CN117253194B (en) Commodity damage detection method, commodity damage detection device and storage medium
CN111461104B (en) Visual recognition method, device, equipment and storage medium
CN112364702A (en) Article verification method and device
CN104484745A (en) City supermarket information publishing system
WO2025139962A1 (en) Image processing model training method, and image processing method and apparatus
CN116579709A (en) Data processing method, device, electronic device, and computer-readable storage medium
CN119723125B (en) Commodity image recognition result de-duplication method and system
CN204204027U (en) City supermarket information delivery system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Product damage detection methods, devices, and storage media

Granted publication date: 20240319

Pledgee: Bank of China Limited by Share Ltd. Guangzhou Panyu branch

Pledgor: Networks Technology Co.,Ltd.

Registration number: Y2024980022049