Disclosure of Invention
The invention mainly aims to provide a graph relation network people counting method and related equipment, and aims to solve the problem of low accuracy of a people counting method based on video monitoring in the prior art.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A graph relationship network people counting method, comprising the following steps:
Predicting an input picture sequence by using a detection network to obtain the positions, the boundary boxes and the categories of key points of all detection objects in the picture sequence and the boundary box sizes corresponding to the key points, and obtaining boundary box images of the picture sequence according to the positions, the categories and the boundary box sizes of the key points;
Constructing a corresponding first relationship sub-graph according to the bounding box of each detection object; after the first relationship sub-graph set of all the detection objects is obtained, initializing a second relationship sub-graph set, querying the first relationship sub-graphs, and directly adding the first relationship sub-graph that appears first into the second relationship sub-graph set;
Respectively carrying out normalization processing and iterative calculation on correlations between adjacent nodes in the first relational subgraph and the second relational subgraph and correlations between adjacent edges in the first relational subgraph and the second relational subgraph in sequence to obtain graph similarity between the first relational subgraph and the second relational subgraph;
Respectively calculating the histogram similarity of a first boundary frame image and a second boundary frame image on three channels, and calculating the average histogram similarity of the first boundary frame image and the second boundary frame image;
calculating to obtain the comprehensive similarity of the first relation subgraph and the second relation subgraph according to the graph similarity and the average histogram similarity;
Calculating the comprehensive similarity of each first relationship sub-graph and all second relationship sub-graphs respectively, comparing the comprehensive similarity with a preset threshold, and if the comprehensive similarity is not smaller than the preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph, otherwise, directly adding the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set;
and after all the first relationship sub-graphs have been processed, counting the second relationship sub-graphs in the updated second relationship sub-graph set to obtain the number of different detection objects in the picture sequence.
In the graph relation network people counting method, the step of predicting an input picture sequence by using a detection network to obtain the positions of key points, boundary boxes and categories of all detection objects in the picture sequence and the boundary box sizes corresponding to the key points, and obtaining boundary box images of the picture sequence according to the positions of the key points, the categories and the boundary box sizes specifically comprises the following steps:
Compressing the picture sequence by taking the step length as a unit to obtain a compressed image, predicting the key point positions and the categories of all the detection objects in the compressed image by using the detection network, and simultaneously obtaining a boundary frame;
Predicting a corresponding key point heat map according to each category, splicing all the key point heat maps to obtain a predicted heat map, and calculating a loss value according to the predicted heat map and a reference heat map of the category to obtain the size of the boundary frame;
and restoring the boundary box images of the picture sequence according to the key point positions, the categories and the boundary box sizes of the compressed images.
In the graph relationship network people counting method, the steps of constructing a corresponding first relationship sub-graph according to the bounding box of the detection object, initializing a second relationship sub-graph set and inquiring the first relationship sub-graph after obtaining the first relationship sub-graph set of all the detection objects, and directly adding the first relationship sub-graph appearing for the first time into the second relationship sub-graph set specifically comprise:
the boundary frame of the detection object is taken as a central node and other adjacent detection objects are taken as neighbor nodes to jointly construct a relation diagram;
refining the relation graph, and constructing a relation subgraph according to the refined relation graph to obtain the first relation subgraph set of all the detection objects;
After initializing an empty second relationship sub-graph set, querying the first relationship sub-graph in the first relationship sub-graph set, and directly adding the first relationship sub-graph appearing for the first time into the second relationship sub-graph set.
In the graph relationship network people counting method, the steps of respectively carrying out normalization processing and iterative calculation on the correlations between adjacent nodes in the first relationship sub-graph and the second relationship sub-graph and the correlations between adjacent edges in the first relationship sub-graph and the second relationship sub-graph to obtain the graph similarity between the first relationship sub-graph and the second relationship sub-graph specifically comprise:
Respectively calculating the correlation between adjacent nodes in the first relational subgraph and the correlation between adjacent nodes in the second relational subgraph, and then carrying out normalization processing to obtain node similarity between the first relational subgraph and the second relational subgraph, wherein the second relational subgraph is the first relational subgraph added into the second relational subgraph set;
Respectively calculating the correlation between adjacent edges in the first relational subgraph and the correlation between adjacent edges in the second relational subgraph, and then carrying out normalization processing to obtain the edge similarity of the adjacent edges in the first relational subgraph and the adjacent edges in the second relational subgraph;
Performing iterative computation on the node similarity and the edge similarity to obtain a node similarity matrix and an edge similarity matrix;
and comparing corresponding positions in the node similarity matrix to obtain the graph similarity of the first relationship sub-graph and the second relationship sub-graph.
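The text does not give the iteration itself. One known scheme that alternates between node scores and edge scores and normalizes after every pass is the coupled node-edge scoring of Zager and Verghese; whether the method uses exactly this update is an assumption. The sketch below is a minimal NumPy version of that scheme, and the function names, the iteration count and the edge-list input format are hypothetical.

```python
import numpy as np

def incidence(num_nodes, edges):
    """Build node-by-edge start (A_s) and end (A_t) incidence matrices."""
    a_s = np.zeros((num_nodes, len(edges)))
    a_t = np.zeros((num_nodes, len(edges)))
    for j, (src, dst) in enumerate(edges):
        a_s[src, j] = 1.0
        a_t[dst, j] = 1.0
    return a_s, a_t

def coupled_similarity(edges_a, n_a, edges_b, n_b, iters=20):
    """Return (node similarity matrix, edge similarity matrix) for graphs A and B.

    Assumed update (Zager-Verghese style), normalized by the Frobenius norm:
      Y <- B_s^T X A_s + B_t^T X A_t
      X <- B_s Y A_s^T + B_t Y A_t^T
    """
    a_s, a_t = incidence(n_a, edges_a)
    b_s, b_t = incidence(n_b, edges_b)
    x = np.ones((n_b, n_a))                     # node-to-node scores
    y = np.ones((len(edges_b), len(edges_a)))   # edge-to-edge scores
    for _ in range(iters):
        y_new = b_s.T @ x @ a_s + b_t.T @ x @ a_t
        x_new = b_s @ y @ a_s.T + b_t @ y @ a_t.T
        x = x_new / (np.linalg.norm(x_new) or 1.0)
        y = y_new / (np.linalg.norm(y_new) or 1.0)
    return x, y
```

How the node similarity matrix is reduced to a single graph similarity value is not specified in the text, so the sketch stops at the two matrices.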
In the graph relation network people counting method, the steps of respectively calculating the histogram similarity of the first boundary frame image and the second boundary frame image on three channels and calculating the average histogram similarity of the first boundary frame image and the second boundary frame image comprise the following steps:
The first boundary frame image corresponding to the first relation sub-graph and the second boundary frame image corresponding to the second relation sub-graph are adjusted to be of preset sizes, wherein the first boundary frame image is the boundary frame image corresponding to the first relation sub-graph, and the second boundary frame image is the boundary frame image corresponding to the second relation sub-graph;
separating the adjusted first bounding box image and the adjusted second bounding box image into three channels respectively, and calculating the corresponding histogram similarity in each of the three channels;
And solving the average value of the histogram similarity of the three channels to obtain the average histogram similarity of the first boundary frame image and the second boundary frame image.
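The three sub-steps above can be sketched as follows. The text does not name the histogram comparison metric, the preset size or the number of bins, so histogram intersection, a 64 x 32 target size and 32 bins are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def resize_nearest(img, size):
    """Nearest-neighbour resize of an (H, W, 3) image to size = (rows, cols)."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]

def channel_hist(channel, bins=32):
    """Normalized intensity histogram of one channel (values in 0..255)."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def average_hist_similarity(img_a, img_b, size=(64, 32), bins=32):
    """Mean over the three channels of the histogram intersection (assumed metric)."""
    a = resize_nearest(img_a, size)
    b = resize_nearest(img_b, size)
    sims = []
    for ch in range(3):
        ha = channel_hist(a[..., ch], bins)
        hb = channel_hist(b[..., ch], bins)
        sims.append(np.minimum(ha, hb).sum())  # lies in [0, 1]
    return float(np.mean(sims))
```

With this metric, two identical crops score 1 and two crops with no shared intensity bins score 0; any other histogram distance could be substituted without changing the averaging step.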
In the graph relationship network people counting method, the step of calculating the comprehensive similarity of the first relationship sub-graph and the second relationship sub-graph according to the graph similarity and the average histogram similarity specifically includes:
Substituting a preset proportion into the graph similarity and the average histogram similarity to calculate so as to obtain the comprehensive similarity of the first relationship subgraph and the second relationship subgraph.
In the graph relationship network people counting method, the step of respectively calculating the comprehensive similarity of each first relationship sub-graph and all second relationship sub-graphs and comparing the comprehensive similarity with a preset threshold, if the comprehensive similarity is not smaller than the preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph, otherwise, directly adding the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set specifically comprises the following steps:
Respectively calculating the comprehensive similarity of each first relation sub-graph and all second relation sub-graphs, and comparing each time with the preset threshold value;
And if the comprehensive similarity is greater than or equal to the preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph, and if the comprehensive similarity is less than the preset threshold, directly adding the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set.
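The weighting and threshold steps can be sketched as below. Several details are assumptions rather than statements of the method: the preset proportion is treated as a convex weight, the candidate is compared against the best-matching stored sub-graph, and a candidate is added only when no stored sub-graph reaches the threshold. The names and default values are hypothetical.

```python
def combined_similarity(graph_sim, hist_sim, weight=0.5):
    """Blend the two similarities with a preset proportion (assumed convex)."""
    return weight * graph_sim + (1.0 - weight) * hist_sim

def update_global_set(first_set, similarity, threshold=0.5):
    """Fold the first relationship sub-graphs into the second (global) set.

    similarity(candidate, known) must return the comprehensive similarity.
    The length of the returned list is the number of distinct objects.
    """
    global_set = []
    for candidate in first_set:
        best_idx, best_sim = None, float("-inf")
        for idx, known in enumerate(global_set):
            sim = similarity(candidate, known)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx is not None and best_sim >= threshold:
            global_set[best_idx] = candidate   # same object: keep the newest view
        else:
            global_set.append(candidate)       # first appearance or a new object
    return global_set
```

Because the set starts empty, the first sub-graph queried is always appended, which matches the initialization described earlier.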
A graph relationship network people counting system, comprising:
the boundary frame image generation module is used for predicting an input picture sequence by utilizing a detection network to obtain the positions of key points, boundary frames and categories of all detection objects in the picture sequence and the sizes of boundary frames corresponding to the key points, and obtaining boundary frame images of the picture sequence according to the positions of the key points, the categories and the sizes of the boundary frames;
The second relation sub-graph set generation module is used for constructing a corresponding first relation sub-graph according to the boundary boxes of the detection objects, initializing the second relation sub-graph set and inquiring the first relation sub-graph after the first relation sub-graph sets of all the detection objects are obtained, and directly adding the first relation sub-graph appearing for the first time into the second relation sub-graph set;
The graph similarity calculation module is used for respectively carrying out normalization processing and iterative calculation on the correlations between adjacent nodes in the first relational subgraph and the second relational subgraph and the correlations between adjacent edges in the first relational subgraph and the second relational subgraph in sequence to obtain graph similarity between the first relational subgraph and the second relational subgraph;
The average histogram similarity calculation module is used for calculating the histogram similarity of the first boundary frame image and the second boundary frame image on three channels respectively and calculating the average histogram similarity of the first boundary frame image and the second boundary frame image;
The comprehensive similarity solving module is used for calculating and obtaining the comprehensive similarity of the first relational subgraph and the second relational subgraph according to the graph similarity and the average histogram similarity;
The preset threshold comparison module is used for respectively calculating the comprehensive similarity of each first relationship sub-graph and all second relationship sub-graphs and comparing the comprehensive similarity with a preset threshold, if the comprehensive similarity is not smaller than the preset threshold, the first relationship sub-graph is used for replacing the second relationship sub-graph, otherwise, the first relationship sub-graph is directly added into the second relationship sub-graph set to update the second relationship sub-graph set;
And the number generation module is used for counting the second relationship sub-graphs in the updated second relationship sub-graph set after all the first relationship sub-graphs have been processed, so as to obtain the number of different detection objects in the picture sequence.
A terminal comprising a memory, a processor and a graph relationship network people counting program stored on the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the graph relationship network people counting method described above.
A computer readable storage medium storing a graph relationship network people counting program which, when executed by a processor, implements the steps of the graph relationship network people counting method described above.
Compared with the prior art, the graph relationship network people counting method and related equipment provided by the invention comprise the following steps: predicting an input picture sequence by using a detection network to obtain the key point positions, bounding boxes and categories of all detection objects in the picture sequence and the bounding box sizes corresponding to the key points, and obtaining bounding box images of the picture sequence according to the key point positions, the categories and the bounding box sizes; constructing a corresponding first relationship sub-graph according to the bounding box of each detection object and, after the first relationship sub-graph set of all detection objects is obtained, initializing a second relationship sub-graph set, querying the first relationship sub-graphs, and directly adding the first relationship sub-graph that appears first into the second relationship sub-graph set; carrying out normalization processing and iterative calculation in sequence on the correlations between adjacent nodes and the correlations between adjacent edges in the first relationship sub-graph and the second relationship sub-graph to obtain the graph similarity between them; calculating the histogram similarity of a first bounding box image and a second bounding box image on three channels, and then their average histogram similarity; calculating the comprehensive similarity of the first relationship sub-graph and the second relationship sub-graph according to the graph similarity and the average histogram similarity; calculating the comprehensive similarity of each first relationship sub-graph with all second relationship sub-graphs and comparing it with a preset threshold, and, if the comprehensive similarity is not smaller than the preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph, otherwise directly adding the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set; and, after all the first relationship sub-graphs have been processed, counting the second relationship sub-graphs in the updated second relationship sub-graph set to obtain the number of different detection objects in the picture sequence.
In other words, the graph similarity of a first relationship sub-graph and a second relationship sub-graph and the average histogram similarity of their corresponding bounding box images are combined into a comprehensive similarity, the comprehensive similarity is compared with a preset threshold, and the second relationship sub-graph set is updated according to the comparison result, so that the number of different detection objects in the picture sequence can be counted efficiently and accurately.
Detailed Description
In order to make the objects, technical solutions and effects of the present invention clearer and more specific, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention provides a graph relationship network people counting method and related equipment. In the method, the node similarity between a first relationship sub-graph and a second relationship sub-graph is calculated from the correlations between adjacent nodes in the two sub-graphs, and iterative calculation then gives their graph similarity. Next, the average histogram similarity of the first bounding box image corresponding to the first relationship sub-graph and the second bounding box image corresponding to the second relationship sub-graph is calculated. The comprehensive similarity is then obtained from the graph similarity and the average histogram similarity and compared with a preset threshold, and the second relationship sub-graph set is updated according to the comparison result to obtain the number of different detection objects in the picture sequence, so that this number can be counted efficiently and accurately.
The design of the graph relationship network people counting method is described below through specific exemplary embodiments. It should be noted that the following embodiments only explain the technical scheme of the invention and do not limit it:
Referring to fig. 1, the graph relationship network people counting method provided by the invention comprises the following steps:
S100, predicting an input picture sequence by using a detection network to obtain key point positions, boundary boxes and categories of all detection objects in the picture sequence and boundary box sizes corresponding to the key points, and obtaining boundary box images of the picture sequence according to the key point positions, the categories and the boundary box sizes;
Specifically, the video shot by the unmanned aerial vehicle is first divided into the picture sequence, where W is the image width, H is the image height and k is the number of pictures; each picture can be represented by a matrix of size W × H × 3. The pictures are then input into the detection network for prediction so as to detect all the detection objects in each picture. In this embodiment, taking people as an example, the bounding box corresponding to each detection object (the bounding box contains the key point position and category information of the detection object) and the bounding box size corresponding to each key point are obtained, where R is the step length. The bounding box images of the picture sequence are then obtained from the key point positions, the categories and the bounding box sizes, where k is the number of bounding box images, as shown in fig. 2. The detection network is based on the CenterNet network, which directly predicts the center coordinates (key points), the length and width, and the category c of each bounding box. For example, referring to the left part of fig. 3, for N input pictures, each detection object obtained through the CenterNet network includes the center coordinates (key point position), the length and width of the detection box (bounding box), and the category of the detection object, from which the corresponding bounding box images and relationship sub-graphs are obtained.
Still further, referring to fig. 4, the step S100 of predicting an input picture sequence by using a detection network to obtain the positions of key points, bounding boxes and categories of all detection objects in the picture sequence, and the bounding box sizes corresponding to the key points, and obtaining the bounding box images of the picture sequence according to the positions of the key points, the categories and the bounding box sizes specifically includes:
S110, compressing the picture sequence by taking a step length as a unit to obtain a compressed image, predicting the key point positions and the categories of all the detection objects in the compressed image by using the detection network, and simultaneously obtaining a boundary box;
S120, predicting corresponding key point heat maps according to each category, splicing all the key point heat maps to obtain predicted heat maps, and calculating a loss value according to the predicted heat maps and the reference heat maps of the categories to obtain the size of the boundary frame;
S130, restoring according to the key point positions, the categories and the bounding box sizes of the compressed images to obtain bounding box images of the picture sequence.
Specifically, each picture in the picture sequence is compressed by the step length r to obtain a compressed map. The compressed map is then used as input to the detection network, and a key point heat map (in the form of a matrix) is predicted for each category c by a neural network, i.e. every location in the compressed map is predicted, where $\hat{Y}_{x,y,c} = 1$ indicates that position (x, y) in the compressed map belongs to category c; the bounding boxes are obtained at the same time. The key point heat maps of all categories c are then spliced to obtain the predicted heat map $\hat{Y} \in [0,1]^{\frac{W}{r} \times \frac{H}{r} \times C}$, where W, H, C and r are the width, the height, the number of key point categories and the step length, respectively, e.g. W = 256, H = 256, C = 2, r = 4;
For each category c in the picture there is also one reference heat map, which is generated by a Gaussian function (the reference heat map takes its maximum value at the key points, and only there does it take the value 1). The function expression is given by equation (1),
where equation (1) means that one map $Y_{x,y,c}$ is calculated for each category in each image, so that c categories give c such maps, which are then spliced; $\sigma_p$ is the adaptive standard deviation, here set to 1; a further hyper-parameter takes a constant value; C is the category set; $Y_{x,y,c}$ is the heat value of category c at abscissa x and ordinate y; and the key point coordinate in the expression is the position, in the compressed map, of each object of that category.
Take one category, such as people, as an example. If an image contains several objects of the same category, several Gaussian functions overlap, and for each position (x, y) in the compressed map the largest of their values is selected. The reference heat map is finally obtained, in which $Y_{x,y,c}$ is the heat map value of category c at coordinates (x, y); the closer a position in the compressed map is to an object of the corresponding category, the larger its value.
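Equation (1) itself is not reproduced in the text. The sketch below assumes the Gaussian used by the published CenterNet design, exp(-((x - px)^2 + (y - py)^2) / (2 * sigma^2)), with sigma fixed to 1 as stated above, and it takes the element-wise maximum where several objects of one category overlap. The function name and the (x, y, category) input format are hypothetical.

```python
import numpy as np

def reference_heatmap(centers, width, height, num_classes, sigma=1.0):
    """Build a reference heat map of shape (height, width, num_classes).

    centers: iterable of (x, y, category) key points in compressed-map
    coordinates. The array is indexed as heat[y, x, c].
    """
    heat = np.zeros((height, width, num_classes))
    ys, xs = np.mgrid[0:height, 0:width]
    for cx, cy, c in centers:
        gauss = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        # Overlapping objects of the same category keep the larger value.
        heat[..., c] = np.maximum(heat[..., c], gauss)
    return heat
```

The value is exactly 1 at each key point and decays with distance, which is the behaviour the paragraph above describes.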
Second, the predicted coordinate loss value is calculated from the predicted heat map (predicted key point positions) and the reference heat map $Y_{x,y,c}$ (real key point positions) obtained by formula (1). The predicted coordinate loss is a penalty-reduced pixel-wise logistic regression with focal loss: the reference heat map is used to measure whether the predicted key point heat map is accurate, and the more similar the two are, the smaller the loss. The function expression of the predicted coordinate loss value is given by formula (2),
where α and β are hyper-parameters of the function, here α = 2 and β = 4, and N is the number of key points (i.e. the total number of all detection objects) in each image; $\hat{Y}_{xyc}$ represents the predicted key point positions and $Y_{xyc}$ represents the real key point positions. Based on formula (2), the positions and categories of the key points can be obtained, i.e. position (x, y) of the compressed map is considered to belong to category c.
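Formula (2) is not reproduced in the text. The sketch below assumes the penalty-reduced focal loss of the published CenterNet design, which uses the same α = 2 and β = 4: positions where the reference map equals 1 contribute (1 - p)^α log(p), and all other positions contribute (1 - y)^β p^α log(1 - p), with the sum divided by -N. The function name and the clipping constant are hypothetical.

```python
import numpy as np

def keypoint_focal_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-12):
    """Penalty-reduced focal loss between a predicted and a reference heat map.

    pred and target have the same shape; target equals 1 exactly at key points.
    """
    pred = np.clip(pred, eps, 1.0 - eps)       # keep the logarithms finite
    pos = target == 1.0
    num_keypoints = max(int(pos.sum()), 1)     # N, guarded against zero
    pos_term = ((1.0 - pred[pos]) ** alpha * np.log(pred[pos])).sum()
    neg_term = ((1.0 - target[~pos]) ** beta
                * pred[~pos] ** alpha
                * np.log(1.0 - pred[~pos])).sum()
    return float(-(pos_term + neg_term) / num_keypoints)
```

The (1 - y)^β factor is what reduces the penalty for positions close to a real key point, where the reference value is near 1.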
However, because the prediction is made on the compressed map, the key point positions may be shifted. To eliminate the positional shift caused by compression, an offset is predicted for each key point; only the original-image position and the compressed-map position are used for training. The position offset loss function is given by formula (3),
where the compressed coordinate is the coordinate of the key point in the compressed map (real coordinates), p is the position of the key point after the compressed map is expanded back to the original size (original picture) (predicted coordinates), r is the compression step length, and N is the number of key points. Based on formulas (2) and (3), the position (center point) and category of each key point in the original picture can be obtained: the predicted position is combined with the position offset and expanded back to the original picture according to the step length r.
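Formula (3) is not reproduced in the text. The sketch below assumes the offset loss of the published CenterNet design: the compressed coordinate is floor(p / r), the true offset is p / r minus that coordinate, and the loss is the mean absolute error of the predicted offset. It also shows the expansion back to the original picture. All function names are hypothetical.

```python
import numpy as np

def compress_keypoints(points, r):
    """Return compressed integer coordinates and the fractional offsets lost."""
    scaled = np.asarray(points, dtype=float) / r
    p_tilde = np.floor(scaled)
    return p_tilde, scaled - p_tilde

def offset_loss(pred_offsets, points, r):
    """L1 offset loss averaged over the N key points (assumed form of formula (3))."""
    _, true_offsets = compress_keypoints(points, r)
    return float(np.abs(np.asarray(pred_offsets) - true_offsets).sum() / len(points))

def restore_keypoints(p_tilde, offsets, r):
    """Expand compressed positions plus offsets back to the original picture."""
    return (np.asarray(p_tilde) + np.asarray(offsets)) * r
```

For example, with r = 4 an original key point at (101, 54) maps to the compressed position (25, 13) with offset (0.25, 0.5), and restoring gives (101, 54) again.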
Furthermore, all nodes whose heat values are greater than those of their 8 surrounding nodes are screened out (each position of the compressed map is one node, and a node that is greater than its 8 surrounding nodes is greater than every node adjacent to it), and for each category the top 100 peak nodes (the 100 key points with the largest heat values) are retained for training the bounding box model. For category c, the set of retained peak nodes and its size are recorded, where c denotes the c-th category; e.g. the number of categories is 2 (people and machines). A bounding box size is predicted for each predicted key point, and the loss function of the bounding box size, i.e. the difference between the predicted bounding box size and the real bounding box size, is given by formula (4),
where $p_k$ represents the k-th key point, $\hat{S}_{p_k}$ represents the predicted bounding box size of the k-th key point, $S_{p_k}$ is the real bounding box size of the k-th key point, and N is the number of key points, e.g. 100 here. Each bounding box size is expressed in the form (width, height); for example, $S_{p_k} = (4, 6)$ means that the real bounding box of the k-th key point has a width of 4 and a height of 6.
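The peak screening described above (keep nodes that exceed all 8 neighbours, then retain the top 100 per category) can be sketched for a single-category heat map as follows. Strict comparison is used so that flat regions are not reported as peaks, which is an assumption, and the function name and return format are hypothetical.

```python
import numpy as np

def top_peaks(heat, k=100):
    """Return up to k (x, y, score) peaks of a 2-D heat map, best first."""
    h, w = heat.shape
    padded = np.pad(heat, 1, mode="constant", constant_values=-np.inf)
    # Stack the 8 shifted copies of the map, one per neighbour direction.
    neighbours = np.stack([
        padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    ])
    is_peak = heat > neighbours.max(axis=0)
    ys, xs = np.nonzero(is_peak)
    order = np.argsort(-heat[ys, xs])[:k]
    return [(int(xs[i]), int(ys[i]), float(heat[ys[i], xs[i]])) for i in order]
```

Running this once per category channel of the predicted heat map yields the candidate key points for which bounding box sizes are then predicted.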
Combining formulas (2), (3) and (4) gives the overall loss function of the detection network:
$L_{det} = L_k + \lambda_o L_o + \lambda_s L_s$ (5);
where formula (5) is a weighted combination (the coefficients need not sum to 1) of formula (2) (predicted coordinate loss), formula (3) (position offset loss) and formula (4) (predicted bounding box size loss), and $\lambda_o$ and $\lambda_s$ are weight hyper-parameters that may be set to 1 and 0.1, respectively.
The overall loss function thus combines the loss between the predicted heat map and the reference heat map $Y_{x,y,c}$ (measuring whether the predicted positions are accurate), the position offset loss (one position offset is generated for each category), and the bounding box loss (measuring the predicted bounding box sizes).
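As a small illustration of formulas (4) and (5), the sketch below computes an L1 size loss and the weighted total. The L1 form of the size loss is assumed from the published CenterNet design, since formula (4) is not reproduced in the text; the function names are hypothetical, and the default weights are the values 1 and 0.1 given above.

```python
import numpy as np

def size_loss(pred_sizes, true_sizes):
    """Mean absolute difference between predicted and real (width, height) pairs."""
    pred = np.asarray(pred_sizes, dtype=float)
    true = np.asarray(true_sizes, dtype=float)
    return float(np.abs(pred - true).sum() / len(true))

def detection_loss(l_k, l_o, l_s, lambda_o=1.0, lambda_s=0.1):
    """Overall loss of formula (5): L_det = L_k + lambda_o * L_o + lambda_s * L_s."""
    return l_k + lambda_o * l_o + lambda_s * l_s
```

The small weight on the size term keeps box sizes, which are measured in pixels, from dominating the heat map and offset terms.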
Finally, for each picture in the picture sequence, the key point positions, categories and bounding box sizes in the compressed map are used to derive the key point positions, categories and bounding box sizes in the original picture, giving the corresponding detection results (several detection objects are identified). Each detection result (each object) has a corresponding bounding box, and the part of the picture framed by the bounding box is stored to obtain a bounding box image.
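Storing the framed part of a picture can be sketched as a simple crop from the center point and the (width, height) size. The detection tuple format, the rounding and the clipping at the picture border are assumptions made for illustration.

```python
import numpy as np

def crop_boxes(image, detections):
    """Cut one bounding box image per detection out of an (H, W, 3) picture.

    detections: iterable of (center_x, center_y, box_width, box_height)
    in original-picture pixels. Boxes are clipped to the picture, and
    boxes that fall entirely outside it are skipped.
    """
    height, width = image.shape[:2]
    crops = []
    for cx, cy, box_w, box_h in detections:
        x0 = max(int(round(cx - box_w / 2)), 0)
        y0 = max(int(round(cy - box_h / 2)), 0)
        x1 = min(int(round(cx + box_w / 2)), width)
        y1 = min(int(round(cy + box_h / 2)), height)
        if x1 > x0 and y1 > y0:
            crops.append(image[y0:y1, x0:x1].copy())
    return crops
```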
Further, referring to fig. 1, S200, constructing a corresponding first relationship sub-graph according to the bounding box of each detection object; after the first relationship sub-graph set of all the detection objects is obtained, initializing a second relationship sub-graph set, querying the first relationship sub-graphs, and directly adding the first relationship sub-graph that appears first into the second relationship sub-graph set;
Specifically, after the bounding boxes of the detection objects are obtained through the detection network prediction, a corresponding relationship graph is constructed from the bounding boxes. The relationship graph may also be called a graph sequence or a directed graph, whose edges point from nodes of the category person to other nodes. The directed graph consists of a finite node set (where l is the number of nodes), an adjacency matrix $A_i$, a node feature matrix (where d is the extracted feature dimension) and an edge feature matrix $\varepsilon_i$. The nodes are the key nodes identified by the detection network (i.e. nodes carrying a category, person or machine), the edge feature matrix $\varepsilon_i$ is generated from the distances between the nodes, and the node features use the node labels.
Then, the first relationship sub-graphs of all detection objects are constructed from the relationship graph, where $a_j$, $f_j$ and $e_j$ have the same meanings as $A_i$, the node feature matrix and $\varepsilon_i$, respectively. The first relationship sub-graph set $Z_i$ of all detection objects is obtained at the same time, where m is the number of people in the sub-graph.
Next, in order to finally calculate the accurate number of different detection objects, after the first relationship sub-graph set of all detection objects is constructed from the bounding boxes, an empty second relationship sub-graph set Z is initialized. All first relationship sub-graphs in the first relationship sub-graph set are then queried, and the first relationship sub-graph that appears first is directly added into the second relationship sub-graph set as the initial second relationship sub-graph in Z. The second relationship sub-graph set is the global sub-graph set Z, and $Z_0$ is the initial global sub-graph set.
Still further, referring to fig. 5, the step S200 of constructing a corresponding first relationship sub-graph according to the bounding box of the detected object, initializing a second relationship sub-graph set and querying the first relationship sub-graph after obtaining the first relationship sub-graph set of all the detected objects, and directly adding the first relationship sub-graph appearing first time to the second relationship sub-graph set specifically includes:
S210, constructing a relation diagram by taking the boundary frame of the detection object as a central node and other adjacent detection objects as neighbor nodes;
S220, refining the relation graph, and constructing a relation subgraph according to the refined relation graph to obtain the first relation subgraph set of all the detection objects;
S230, after initializing an empty second relation sub-graph set, inquiring the first relation sub-graph in the first relation sub-graph set, and directly adding the first relation sub-graph appearing for the first time into the second relation sub-graph set.
Specifically, after the picture sequence is input to the detection network to obtain the bounding boxes of the detection objects, the relationship graph is constructed by taking the detection box as the central node and the other adjacent detection objects as neighbor nodes. The topology of the relationship graph is refined by using a source node-edge adjacency matrix $A_s$ and a terminal node-edge adjacency matrix $A_t$. Both $A_s$ and $A_t$ have one row per node and one column per edge, with i and j denoting the i-th row and the j-th column, respectively, and every element is either 0 or 1, indicating whether the node is an endpoint of the edge. An element of $A_s$ equal to 1 in row i and column j indicates that node i is the start of edge j, and an element of $A_t$ equal to 1 in row i and column j indicates that node i is the end point of edge j. In particular, if node $v_i$ is the terminal of edge $e_j$ (i.e., $t(e_j) = v_i$), the corresponding element of $A_t$ is 1; if node $v_i$ is the start of edge $e_j$ (i.e., $s(e_j) = v_i$), the corresponding element of $A_s$ is 1. A special kind of node-edge adjacency matrix $A = A_s A_t^T$ is also given, together with a second matrix whose elements mark the condition $s(e_i) = t(e_j)$, which indicates that the start point of edge i is the end point of edge j.
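The role of the two node-edge matrices can be illustrated with a minimal sketch. The function names `incidence_matrices` and `adjacency_from_incidence` and the representation of edges as `(start, end)` index pairs are assumptions made for illustration only; they do not come from the embodiment.

```python
def incidence_matrices(num_nodes, edges):
    """Build the source and terminal node-edge matrices of a directed graph.

    edges is a list of (start, end) node-index pairs; edge j is edges[j].
    This representation is an assumption made for illustration.
    """
    a_s = [[0] * len(edges) for _ in range(num_nodes)]
    a_t = [[0] * len(edges) for _ in range(num_nodes)]
    for j, (start, end) in enumerate(edges):
        a_s[start][j] = 1  # node `start` is the start of edge j
        a_t[end][j] = 1    # node `end` is the terminal of edge j
    return a_s, a_t


def adjacency_from_incidence(a_s, a_t):
    """Compute A = A_s * A_t^T, the node-to-node adjacency matrix."""
    n = len(a_s)
    m = len(a_s[0]) if n else 0
    return [
        [sum(a_s[i][e] * a_t[k][e] for e in range(m)) for k in range(n)]
        for i in range(n)
    ]


# A person (node 0) pointing at two neighbouring objects (nodes 1 and 2).
A_s, A_t = incidence_matrices(3, [(0, 1), (0, 2)])
A = adjacency_from_incidence(A_s, A_t)
```

In this example the product $A_s A_t^T$ recovers the ordinary adjacency matrix of the star-shaped graph, with non-zero entries only in the row of the central person node.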
Then, relational subgraphs are constructed according to the refined relationship graph, where n is the number of relational subgraphs. The construction process is shown in fig. 6, which illustrates how the relational subgraph of a detection target K is constructed from the relationship graph of the detection target K. In this way each first relational subgraph is obtained, and the first relational subgraph set $Z_i$ of all the detection objects is obtained at the same time.
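Steps S210 to S220 can be sketched as follows, as one possible reading rather than the embodiment itself. The box format `(x, y, width, height)`, the category string `"person"`, the `max_neighbors` cut-off and the dictionary layout are all hypothetical choices; the text above only states that a person is the central node, that adjacent detections are neighbor nodes, and that edge features come from distances between nodes.

```python
import math


def box_center(box):
    """Return the centre of a box given as (x, y, width, height) (assumed format)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def build_person_subgraphs(detections, max_neighbors=3):
    """Build one star-shaped directed subgraph per detected person.

    detections: list of (category, box) pairs for one picture.
    max_neighbors is a hypothetical limit on how many adjacent
    detections are kept as neighbour nodes.
    """
    subgraphs = []
    for idx, (category, box) in enumerate(detections):
        if category != "person":
            continue
        cx, cy = box_center(box)
        others = []
        for other_idx, (_, other_box) in enumerate(detections):
            if other_idx == idx:
                continue
            ox, oy = box_center(other_box)
            others.append((math.hypot(ox - cx, oy - cy), other_idx))
        nearest = sorted(others)[:max_neighbors]
        subgraphs.append({
            "center": idx,
            "nodes": [idx] + [j for _, j in nearest],
            "edges": [(idx, j) for _, j in nearest],    # directed: person -> neighbour
            "lengths": [dist for dist, _ in nearest],   # edge feature: centre distance
        })
    return subgraphs


detections = [
    ("person", (0, 0, 2, 2)),
    ("machine", (3, 0, 2, 2)),
    ("person", (0, 4, 2, 2)),
]
first_subgraph_set = build_person_subgraphs(detections)
```

Each person yields exactly one subgraph, so the length of `first_subgraph_set` equals the number of person detections in the picture.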
Further, with continued reference to fig. 1: S300, performing normalization processing and iterative calculation, in sequence, on the correlations between adjacent nodes in the first relational subgraph and the second relational subgraph and on the correlations between adjacent edges in the first relational subgraph and the second relational subgraph, so as to obtain the graph similarity between the first relational subgraph and the second relational subgraph;
Specifically, firstly, calculating the correlation between adjacent nodes in the first relational subgraph and the correlation between adjacent nodes in the second relational subgraph, so as to calculate the node similarity between the first relational subgraph and the second relational subgraph, and then, performing iterative calculation on the node similarity to obtain the graph similarity between the first relational subgraph and the second relational subgraph.
Still further, referring to fig. 7, the step S300 of performing normalization processing and iterative computation on correlations between adjacent nodes in the first relational subgraph and the second relational subgraph and correlations between adjacent edges in the first relational subgraph and the second relational subgraph in sequence to obtain graph similarity between the first relational subgraph and the second relational subgraph specifically includes:
S310, respectively calculating the correlation between adjacent nodes in the first relational subgraph and the correlation between adjacent nodes in the second relational subgraph, and then carrying out normalization processing to obtain node similarity between the first relational subgraph and the second relational subgraph, wherein the second relational subgraph is the first relational subgraph added into the second relational subgraph set;
S320, respectively calculating the correlation between the adjacent edges in the first relational subgraph and the correlation between the adjacent edges in the second relational subgraph, and then carrying out normalization processing to obtain the edge similarity of the adjacent edges in the first relational subgraph and the adjacent edges in the second relational subgraph;
S330, performing iterative computation on the node similarity and the edge similarity to obtain the node similarity matrix and the edge similarity matrix;
S340, comparing corresponding positions in the node similarity matrix to obtain graph similarity of the first relationship subgraph and the second relationship subgraph.
Specifically, referring to fig. 8, taking a neighboring node u, i in one of the first relational subgraphs and a neighboring node v, j in one of the second relational subgraphs as an example, the respective correlations r ui and r vj of the neighboring node u, i and the neighboring node v, j are defined as follows:
Wherein $d_u$, $d_i$, $d_v$ and $d_j$ are the labels of node u, node i, node v and node j, respectively, $c_{ui}$ is the distance between node u and node i, and $c_{vj}$ is the distance between node v and node j. A lower label value may be defined for a target class that moves easily (such as a person), and a higher label value for a target class that is difficult to move (such as a machine).
Then, after obtaining the correlation between the adjacent nodes, carrying out normalization processing to obtain the node similarity between the adjacent nodes in the first relational subgraph and the adjacent nodes in the second relational subgraph, namely obtaining the node similarity between the adjacent nodes in different directed graphs, wherein the formula is as follows:
Wherein h ij represents the node similarity of node i in the first relationship graph and node j in the second relationship graph;
Similarly, the edge similarity is calculated in the same way as the node similarity. Referring to fig. 9, for the adjacent edges $e_u$ and $e_i$ of the neighboring nodes u, i in the first relational subgraph and the adjacent edges $e_v$ and $e_j$ of the neighboring nodes v, j in the second relational subgraph, the correlation of the adjacent edges is defined as follows:
Where d iu is the node class connecting edge e i and edge e u, d jv is the node class connecting edge e j and edge e v, and c i and c u are the lengths of edge e i and edge e u.
And then, after obtaining the correlation of the adjacent edges, carrying out normalization processing to obtain the edge similarity between the adjacent edges in the first relational subgraph and the adjacent edges in the second relational subgraph, namely obtaining the edge similarity between the adjacent edges in different directed graphs, wherein the formula is as follows:
Next, performing iterative computation on the node similarity and the edge similarity to obtain a node similarity matrix M N between the first relationship sub-graph and the second relationship sub-graph, and an edge similarity matrix M E between the first relationship sub-graph and the second relationship sub-graph, where the specific formulas are as follows:
Wherein, in the above formula (10), calculating the k-th generation node similarity matrix and the k-th generation edge similarity matrix requires both the (k-1)-th generation edge similarity matrix and the (k-1)-th generation node similarity matrix. A is the adjacency matrix of directed graph A and B is the adjacency matrix of directed graph B. The initial node similarity matrix has $h_{ij}$ as its corresponding elements, and the initial edge similarity matrix has $h_{i'j}$ as its corresponding elements. k is the iteration number, $A_s$ (or $A_t$) and $B_s$ (or $B_t$) are node-edge adjacency matrices, $A^T$ is the transpose of the adjacency matrix of directed graph A, and $(A_s)^T$ is the transpose of the node-edge adjacency matrix.
Finally, the positions corresponding to the same detection object in the node similarity matrix between the first relationship sub-graph and the second relationship sub-graph are compared to obtain the graph similarity $S_g$ between the first relationship sub-graph and the second relationship sub-graph. In other words, the positions corresponding to the same person in the node similarity matrix between two directed graphs are compared to obtain the graph similarity $S_g$ between the directed graphs represented by the two persons. For example, the number of rows of the node similarity matrix is the number of nodes of A and the number of columns is the number of nodes of B, and the element in the v-th row and the u-th column represents the similarity between node u of one graph and node v of the other.
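Because the update rule of formula (10) is not legible in the text above, the following sketch does not reproduce it. It instead assumes a commonly used coupled node-edge scoring scheme in which node scores and edge scores are updated from each other through the node-edge matrices and renormalized at every iteration. The function names, the Frobenius normalization, the all-ones default initialization and the iteration count are assumptions; in the embodiment the initial matrices hold the normalized similarities $h_{ij}$.

```python
import math


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def add(x, y):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, y)]


def normalize(x):
    """Scale a matrix to unit Frobenius norm (assumed normalization)."""
    norm = math.sqrt(sum(v * v for row in x for v in row))
    return [[v / norm for v in row] for row in x] if norm else x


def coupled_similarity(a_s, a_t, b_s, b_t, node_init=None, edge_init=None,
                       iterations=10):
    """Iterate node and edge similarity between directed graphs A and B.

    Returns (node_sim, edge_sim); node_sim has one row per node of A and
    one column per node of B.
    """
    n_a, m_a = len(a_s), len(a_s[0])
    n_b, m_b = len(b_s), len(b_s[0])
    node_sim = node_init or [[1.0] * n_b for _ in range(n_a)]
    edge_sim = edge_init or [[1.0] * m_b for _ in range(m_a)]
    for _ in range(iterations):
        # Edges are similar when their start nodes and end nodes are similar.
        new_edge = add(matmul(matmul(transpose(a_s), node_sim), b_s),
                       matmul(matmul(transpose(a_t), node_sim), b_t))
        # Nodes are similar when their outgoing and incoming edges are similar.
        new_node = add(matmul(matmul(a_s, edge_sim), transpose(b_s)),
                       matmul(matmul(a_t, edge_sim), transpose(b_t)))
        node_sim, edge_sim = normalize(new_node), normalize(new_edge)
    return node_sim, edge_sim


# Two identical single-edge graphs: node 0 -> node 1.
s, t = [[1], [0]], [[0], [1]]
node_sim, edge_sim = coupled_similarity(s, t, s, t)
```

Under this assumed scheme, the graph similarity $S_g$ of two persons would be read from the entry of `node_sim` that pairs the two central person nodes.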
Further, with continued reference to fig. 1: S400, respectively calculating the histogram similarity of the first bounding box image and the second bounding box image on three channels, and calculating the average histogram similarity of the first bounding box image and the second bounding box image;
Specifically, the histogram similarity of the first boundary frame image corresponding to the first relation sub-graph and the second boundary frame image corresponding to the second relation sub-graph on the three channels of R, G, B is calculated respectively, so that the average histogram similarity between the first boundary frame image and the second boundary frame image is calculated.
Still further, referring to fig. 10, the step of S400 of calculating the histogram similarity of the first and second boundary frame images on three channels, respectively, and calculating the average histogram similarity of the first and second boundary frame images includes:
S410, adjusting the first boundary frame image corresponding to the first relation sub-graph and the second boundary frame image corresponding to the second relation sub-graph to a preset size;
S420, separating the adjusted first boundary frame image and the adjusted second boundary frame image into three channels respectively, and calculating the corresponding histogram similarity in each of the three channels;
S430, solving the average value of the histogram similarity of the three channels to obtain the average histogram similarity of the first boundary frame image and the second boundary frame image.
Specifically, the first boundary frame image corresponding to the first relational subgraph and the second boundary frame image corresponding to the second relational subgraph are taken as an example.
First, the first boundary frame image and the second boundary frame image are adjusted to a preset size, for example 256×256, so that each color picture can be expressed as a 256×256×3 matrix. The adjusted images are then separated on the three channels R, G and B, and the pixel channels of the adjusted first boundary frame image and the adjusted second boundary frame image are extracted respectively. Next, the histogram similarity between the adjusted first boundary frame image and the adjusted second boundary frame image is calculated separately on each of the R, G and B channels. Finally, the average value of the histogram similarity of the three channels is calculated to obtain the average histogram similarity of the first boundary frame image and the second boundary frame image.
The algorithm for the average histogram similarity is expressed as follows:
Algorithm 1:
Input: first boundary frame image and second boundary frame image
Output: S_h, the average histogram similarity of the two boundary frame images
1: Adjust both boundary frame images to a size of 256×256
2: Separate both boundary frame images into the three channels R, G, B
3: Extract the pixel channels of both boundary frame images
4: for p in (r, g, b) do
5:   Initialize d = 0 // d is the histogram coincidence degree
6:   for q = 0; q < 256×256 do // q is the pixel position
7:     if the pixel values of the two boundary frame images at position q in channel p are equal
8:
9:     else
10:      d = d + 1
11:    q = q + 1
12:  end for
13:
14: end for
15:
16: return S_h
The process of algorithm 1 is described as follows:
The input first bounding box image and second bounding box image can each be decomposed into three single-channel matrices of 256×256×1, and d = 0 is initialized for the matrix of each channel. For each position in the matrix, the pixel value of the first bounding box image at position q in channel p is compared with the pixel value of the second bounding box image at position q in channel p to measure the similarity between them, which is 1 if and only if the pixel values at the corresponding positions are equal. The 256×256 position similarities together represent the similarity of the channel matrix, i.e., d is the sum of the similarities of the 256×256 positions. Finally, the average of the similarities of the three channel matrices is taken as the average histogram similarity S_h.
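The description of Algorithm 1 can be sketched in a few lines. The sketch follows the prose above (a position contributes 1 when the pixel values are equal) and assumes that each channel score is d divided by the number of positions, since the normalization steps of the listing are not legible. It also assumes the images have already been resized and are supplied as flat lists of `(r, g, b)` tuples; the function names are hypothetical.

```python
def channel_coincidence(channel_a, channel_b):
    """Fraction of positions whose pixel values are equal in two flat channels.

    Dividing by the number of positions is an assumed normalization.
    """
    if len(channel_a) != len(channel_b) or not channel_a:
        raise ValueError("channels must be non-empty and of equal size")
    d = sum(1 for pa, pb in zip(channel_a, channel_b) if pa == pb)
    return d / len(channel_a)


def average_histogram_similarity(image_a, image_b):
    """Average the per-channel coincidence over the R, G and B channels.

    image_a and image_b are flat lists of (r, g, b) pixels that were
    already resized to the same preset size (for example 256 x 256).
    """
    sims = []
    for p in range(3):  # p = 0, 1, 2 selects the R, G, B channel
        chan_a = [px[p] for px in image_a]
        chan_b = [px[p] for px in image_b]
        sims.append(channel_coincidence(chan_a, chan_b))
    return sum(sims) / 3.0


img_a = [(10, 20, 30), (40, 50, 60)]
img_b = [(10, 20, 31), (40, 0, 60)]
s_h = average_histogram_similarity(img_a, img_b)
```

For the two-pixel example, the R channel matches at both positions and the G and B channels at one position each, so `s_h` is the mean of 1.0, 0.5 and 0.5.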
Further, with continued reference to fig. 1: S500, calculating the comprehensive similarity of the first relationship sub-graph and the second relationship sub-graph according to the graph similarity and the average histogram similarity;
Specifically, after the average histogram similarity S h and the graph similarity S g between the first bounding box image and the second bounding box image are calculated, the integrated similarity between the first relational subgraph and the second relational subgraph may be further calculated.
Further, the step S500 of calculating the integrated similarity of the first relational subgraph and the second relational subgraph according to the graph similarity and the average histogram similarity includes:
S510, substituting a preset proportion into the graph similarity and the average histogram similarity to calculate so as to obtain the comprehensive similarity of the first relationship subgraph and the second relationship subgraph, wherein the preset proportion of the graph similarity and the preset proportion of the average histogram similarity are added to be equal to 1.
Specifically, after obtaining the average histogram similarity S h and the graph similarity S g between the first bounding box image and the second bounding box image, the comprehensive similarity between the person i in the first relational subgraph and the person j in the second relational subgraph may be obtained by calculating according to a preset ratio, for example, please refer to the right part of fig. 3, and for the obtained first relational subgraph, the calculation formula of the comprehensive similarity is as follows:
$S(i,j) = \lambda_h S_h(i,j) + \lambda_g S_g(i,j)$ (11);
Where $\lambda_h$ is the weight of the average histogram similarity $S_h$ and $\lambda_g$ is the weight of the graph similarity $S_g$. In the embodiment of the present invention, $\lambda_h = 0.5$ and $\lambda_g = 0.5$ are set. In practice, $\lambda_h$ and $\lambda_g$ may be set arbitrarily, provided that $\lambda_h + \lambda_g = 1$ is satisfied, that is, the preset proportion of the graph similarity and the preset proportion of the average histogram similarity add up to 1.
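Formula (11) and its weight constraint can be written as a short function. The function name and the tolerance used to check that the weights sum to 1 are illustrative assumptions; the default weights of 0.5 are the values given for the embodiment.

```python
def comprehensive_similarity(s_h, s_g, lambda_h=0.5, lambda_g=0.5):
    """Weighted sum of average histogram similarity and graph similarity.

    Implements S = lambda_h * S_h + lambda_g * S_g and rejects weights
    that do not add up to 1 (the tolerance is an assumed choice).
    """
    if abs(lambda_h + lambda_g - 1.0) > 1e-9:
        raise ValueError("lambda_h + lambda_g must equal 1")
    return lambda_h * s_h + lambda_g * s_g


score = comprehensive_similarity(0.8, 1.0)
```

With the default weights, an average histogram similarity of 0.8 and a graph similarity of 1.0 give a comprehensive similarity of about 0.9, which sits at the example threshold used in step S600.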
Further, with continued reference to fig. 1: S600, respectively calculating the comprehensive similarity of each first relationship sub-graph and all second relationship sub-graphs, and comparing it with a preset threshold; if the comprehensive similarity is not smaller than the preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph; otherwise, directly adding the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set;
Specifically, after initializing one second relationship sub-graph set and directly adding the first relationship sub-graph appearing first to the second relationship sub-graph set, calculating the integrated similarity of each first relationship sub-graph and all second relationship sub-graphs by using the method for calculating the integrated similarity, comparing the integrated similarity with a preset threshold, and if the integrated similarity is not smaller than the preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph, otherwise, directly adding the first relationship sub-graph to the second relationship sub-graph set to update the second relationship sub-graph set.
Further, referring to fig. 11, the step S600 of calculating the comprehensive similarity of each first relationship sub-graph and all second relationship sub-graphs, comparing it with a preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph if the comprehensive similarity is not smaller than the preset threshold, and otherwise directly adding the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set specifically includes:
S610, respectively calculating the comprehensive similarity of each first relation sub-graph and all second relation sub-graphs, and comparing each time with the preset threshold value;
S620, if the integrated similarity is greater than or equal to the preset threshold, replacing the second relationship sub-graph with the first relationship sub-graph, and if the integrated similarity is less than the preset threshold, directly adding the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set.
Specifically, after initializing one second relationship sub-graph set and directly adding the first relationship sub-graph appearing first to the second relationship sub-graph set, traversing and querying the second relationship sub-graphs in the second relationship sub-graph set, sequentially calculating the comprehensive similarity between the first relationship sub-graph in the remaining first relationship sub-graph set Z i and all the second relationship sub-graphs in the global sub-graph set Z (the second relationship sub-graph set), and comparing each time with the preset threshold (for example, 0.9):
If the comprehensive similarity is greater than or equal to the preset threshold, the second relationship subgraph is replaced with the first relationship subgraph; if the comprehensive similarity is less than the preset threshold, the first relationship subgraph is directly added into the second relationship subgraph set, thereby updating the second relationship subgraph set. In other words, a relationship subgraph with high similarity (a comprehensive similarity of at least the threshold, for example 0.9) is regarded as a repeated person and replaces the stored subgraph. The replacement updates the stored subgraph to the latest state of that person, which prevents the similarity of the same person from decreasing over time. A relationship subgraph with low similarity is regarded as a new person and is added to the global relationship subgraph set. In this way the same detection object keeps a high similarity to its entry in the global relationship subgraph set in real time, which preserves the accuracy of counting the different detection objects in the global relationship subgraph set.
Wherein, the algorithm for updating the second relation sub-graph set and carrying out the statistics of the number of people is expressed as follows:
Algorithm 2:
Input: a sequence of pictures
Output: total number of people N
1: Detect all targets in I_t by using the detection network to obtain the bounding boxes and the corresponding bounding box images
2: Construct the relationship graph G_i according to the bounding boxes
3: Construct the first relationship subgraph set Z_i of all people according to the relationship graph G_i, and obtain the corresponding bounding box images at the same time, where m is the number of people in the relationship graph
4: Take Z_0 as the initialized global subgraph set Z (the second relationship subgraph set)
5: for p = 1; p <= k do // p traverses 1 to k, where k is the number of bounding box images
6:   i_x = number of subgraphs in Z
7:   i_y = number of subgraphs in Z_p
8:   S_h = 0, S_g = 0
9:   for q = 0; q <= i_x do
10:    for w = 0; w <= i_y do
11:      Use Algorithm 1 to calculate the average histogram similarity between the first bounding box image and the second bounding box image B_q
12:      Calculate, according to formulas (6)-(8), the graph similarity between the first relationship subgraph and the second relationship subgraph g_q
13:      if the comprehensive similarity is not smaller than t // t is a preset threshold, e.g. t = 0.9
14:        Delete g_q from Z and add the first relationship subgraph to Z
15:        continue
16:      else
17:        Add the first relationship subgraph to Z
18:    end for
19:  end for
20:  Update Z
21: end for
22: Count the number N of relationship subgraphs in the set Z
23: return N
The process of algorithm 2 is described as follows:
An empty set Z is initialized, and similarity analysis is performed for each person i in the picture sequence, using the corresponding bounding box image and the relationship subgraph extracted with that person as the center. If a person j in Z is highly similar to i (the value of the comprehensive similarity is greater than 0.9), j is removed from Z and i is added to Z; if no highly similar person exists, i is added to Z directly. The number of relationship subgraphs in Z is the total number of people.
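The update of the global set Z described above can be sketched as follows. The sketch follows the prose description rather than the exact loop structure of the listing: a candidate is added only after it has been compared with every stored entry and no match was found. The function name, the pluggable `similarity` callable and the choice to stop at the first matching entry are assumptions; the toy similarity on plain numbers only stands in for the comprehensive similarity of formula (11).

```python
def count_distinct(candidates, similarity, threshold=0.9):
    """Count distinct persons by maintaining a global set of subgraph records.

    candidates: first relationship subgraph records in arrival order.
    similarity: callable(candidate, stored) returning a value in [0, 1].
    Returns (count, global_set).
    """
    global_set = []
    for cand in candidates:
        match = None
        for idx, stored in enumerate(global_set):
            if similarity(cand, stored) >= threshold:
                match = idx  # assumed: stop at the first matching entry
                break
        if match is None:
            global_set.append(cand)    # no similar entry: a new person
        else:
            global_set[match] = cand   # same person: keep the latest state
    return len(global_set), global_set


def toy_similarity(a, b):
    """Stand-in similarity on numbers, for illustration only."""
    return 1.0 - abs(a - b)


total, final_set = count_distinct([0.0, 0.05, 1.0, 0.98, 0.5], toy_similarity)
```

In the toy run the second and fourth records replace earlier entries and the remaining ones are added as new, so three distinct entries remain. The first record is always added directly because the set is empty at that point, which corresponds to the initialization of Z with the first relationship subgraph appearing for the first time.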
With continued reference to fig. 1: S700, after all the first relationship sub-graphs have been calculated, counting the number of second relationship sub-graphs in the updated second relationship sub-graph set to obtain the number of different detection objects in the picture sequence.
Specifically, the comprehensive similarity between each first relationship sub-graph in the first relationship sub-graph set Z_i and all the second relationship sub-graphs in the global sub-graph set Z (the second relationship sub-graph set) is calculated in turn and compared with the preset threshold each time, and the second relationship sub-graph set is updated according to the comparison result. Once all the first relationship sub-graphs have been processed, the number of second relationship sub-graphs in the updated second relationship sub-graph set is counted. This number is the number of different detection objects in the picture sequence, i.e., the accurate total number of people.
Further, referring to fig. 12, the present invention provides a graph relationship network demographics system, the graph relationship network demographics system comprising:
the boundary block image generation module 100, the second relation sub-graph set generation module 200, the graph similarity calculation module 300, the average histogram similarity calculation module 400, the comprehensive similarity solving module 500, the preset threshold comparison module 600 and the detection object number generation module 700;
The boundary block image generation module 100 is configured to predict an input picture sequence by using a detection network to obtain the key point positions, bounding boxes and categories of all detection objects in the picture sequence and the bounding box sizes corresponding to the key points, and to obtain bounding box images of the picture sequence according to the key point positions, the categories and the bounding box sizes. The second relation sub-graph set generation module 200 is configured to construct corresponding first relationship sub-graphs according to the bounding boxes of the detection objects and, after the first relationship sub-graph set of all the detection objects is obtained, to initialize the second relationship sub-graph set, query the first relationship sub-graphs, and directly add the first relationship sub-graph appearing for the first time into the second relationship sub-graph set. The graph similarity calculation module 300 is configured to perform normalization processing and iterative calculation, in sequence, on the correlations between adjacent nodes and the correlations between adjacent edges in the first relationship sub-graph and the second relationship sub-graph, so as to obtain the graph similarity between the first relationship sub-graph and the second relationship sub-graph. The average histogram similarity calculation module 400 is configured to calculate the histogram similarity of the first bounding box image and the second bounding box image on three channels and to calculate the average histogram similarity of the first bounding box image and the second bounding box image. The comprehensive similarity solving module 500 is configured to calculate the comprehensive similarity of the first relationship sub-graph and the second relationship sub-graph according to the graph similarity and the average histogram similarity. The preset threshold comparison module 600 is configured to calculate the comprehensive similarity of each first relationship sub-graph and all second relationship sub-graphs and compare it with a preset threshold, to replace the second relationship sub-graph with the first relationship sub-graph if the comprehensive similarity is not smaller than the preset threshold, and otherwise to directly add the first relationship sub-graph into the second relationship sub-graph set to update the second relationship sub-graph set. The detection object number generation module 700 is configured to count, after all the first relationship sub-graphs have been calculated, the number of second relationship sub-graphs in the updated second relationship sub-graph set, so as to obtain the number of different detection objects in the picture sequence.
Specifically, the input picture sequence is first predicted by the detection network (a CenterNet network) to detect all the detection objects in each picture, which yields the bounding box corresponding to each detection object (c is the category; the bounding box contains the key point position and the category information of the detection object) and the bounding box size corresponding to each key point. The bounding box images of the picture sequence are obtained according to the key point positions, the categories and the bounding box sizes. Then, the corresponding relationship graph is constructed according to the bounding boxes, and the first relationship sub-graphs of all detection objects are constructed according to the relationship graph, which gives the first relationship sub-graph set Z_i of all detection objects. A second relationship sub-graph set Z is then initialized, all first relationship sub-graphs in the first relationship sub-graph set are queried, and the first relationship sub-graph appearing for the first time is directly added into the second relationship sub-graph set as the initial second relationship sub-graph in Z. Next, the correlation between adjacent nodes in the first relationship sub-graph and the correlation between adjacent nodes in the second relationship sub-graph are calculated to obtain the node similarity between the first relationship sub-graph and the second relationship sub-graph, and iterative computation is performed on the node similarity to obtain the graph similarity S_g between the first relationship sub-graph and the second relationship sub-graph.
Further, the histogram similarities S_r, S_g and S_b of the first bounding box image corresponding to the first relationship sub-graph and the second bounding box image corresponding to the second relationship sub-graph on the three channels R, G and B are calculated respectively, and the average histogram similarity is obtained from them. Finally, the comprehensive similarity of each first relationship sub-graph and all second relationship sub-graphs is calculated and compared with a preset threshold. If the comprehensive similarity is not smaller than the preset threshold, the second relationship sub-graph is replaced with the first relationship sub-graph; otherwise, the first relationship sub-graph is directly added into the second relationship sub-graph set. Once the comprehensive similarity of all first relationship sub-graphs and all second relationship sub-graphs has been calculated, the number of second relationship sub-graphs in the updated second relationship sub-graph set is counted to obtain the number of different detection objects in the picture sequence.
According to the invention, the node similarity between the first relational subgraph and the second relational subgraph is first calculated from the correlations between adjacent nodes in the two subgraphs, and iterative calculation then yields the graph similarity of the first relational subgraph and the second relational subgraph. Next, the average histogram similarity of the first boundary frame image corresponding to the first relational subgraph and the second boundary frame image corresponding to the second relational subgraph is calculated. The comprehensive similarity is then solved from the graph similarity and the average histogram similarity and compared with a preset threshold, and the second relational subgraph set is updated according to the comparison result to obtain the number of different detection objects in the picture sequence. In this way the number of different detection objects in the pictures is counted efficiently and accurately.
Still further, referring to fig. 13, the present invention also provides a terminal, which includes a processor 10, a memory 20 and a display 30. Fig. 13 shows only some of the components of the terminal, but it should be understood that not all of the illustrated components are required to be implemented and that more or fewer components may alternatively be implemented.
The memory 20 may in some embodiments be an internal storage unit of the terminal, such as a hard disk or a memory of the terminal. The memory 20 may in other embodiments also be an external storage device of the terminal, such as a plug-in hard disk provided on the terminal, a smart memory card (SMART MEDIA CARD, SMC), a Secure Digital (SD) card, a flash memory card (FLASH CARD), etc. Further, the memory 20 may also include both an internal storage unit and an external storage device of the terminal. The memory 20 is used for storing application software and various data installed in the terminal. The memory 20 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 20 has stored thereon a graph relationship network demographics program 40, the graph relationship network demographics program 40 being executable by the processor 10 to implement the soft constraint-based defense methodology of the present invention for a population of unmanned aerial vehicles.
The processor 10 may in some embodiments be a central processing unit (Central Processing Unit, CPU), microprocessor or other data processing chip for running program code or processing data stored in the memory 20, for example performing the soft constraint based defense method for the drone group, etc.
The display 30 may in some embodiments be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display 30 is used for displaying information processed in the terminal and for displaying a visual user interface. The components 10-30 of the terminal communicate with each other via a system bus.
In one embodiment, the steps of the graph relationship network demographics method described above are implemented when the processor 10 executes the graph relationship network demographics program 40 in the memory 20.
Further, the invention also provides a computer readable storage medium storing a graph relationship network people counting program which, when executed by a processor, implements the steps of the graph relationship network people counting method. These steps are described in detail above and are not repeated here.
In summary, the graph relationship network people counting method and related equipment provided by the invention comprise: predicting an input picture sequence by using a detection network to obtain the key point positions, boundary boxes and categories of all detection objects in the picture sequence and the boundary box sizes corresponding to the key points, and obtaining boundary frame images of the picture sequence according to the key point positions, the categories and the boundary box sizes; constructing a corresponding first relationship subgraph according to the boundary box of each detection object and, after the first relationship subgraphs of all detection objects are obtained, initializing a second relationship subgraph set, querying the first relationship subgraphs, and directly adding the first relationship subgraphs that appear for the first time into the second relationship subgraph set; carrying out normalization processing and iterative calculation in sequence on the correlations between adjacent nodes and on the correlations between adjacent edges in the first relationship subgraph and the second relationship subgraph to obtain the graph similarity between the first relationship subgraph and the second relationship subgraph; calculating the histogram similarity of the first boundary frame image and the second boundary frame image on each of three channels, and calculating the average histogram similarity of the two images; calculating the comprehensive similarity of the first relationship subgraph and the second relationship subgraph according to the graph similarity and the average histogram similarity; comparing the comprehensive similarity with a preset threshold and, according to the comparison result, either replacing the second relationship subgraph with the first relationship subgraph or otherwise directly adding the first relationship subgraph into the second relationship subgraph set, so as to update the second relationship subgraph set; and, after the calculation of all the first relationship subgraphs is completed, counting the second relationship subgraphs in the updated second relationship subgraph set to obtain the number of different detection objects in the picture sequence. By calculating the graph similarity of the first relationship subgraph and the second relationship subgraph and the average histogram similarity of the corresponding first and second boundary frame images, deriving the comprehensive similarity from the two, comparing the comprehensive similarity with the preset threshold, and updating the second relationship subgraph set according to the comparison result, the invention counts the number of different detection objects in the picture sequence efficiently and accurately.
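As an illustration of the three-channel average histogram similarity and the comprehensive similarity summarized above, a minimal sketch follows. The per-channel similarity measure (histogram intersection), the bin count, the weighted-sum combination and the weight `alpha` are all assumptions made for this sketch, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # one value per channel, each in 0..255


def channel_histogram(pixels: Sequence[Pixel], channel: int, bins: int = 8) -> List[float]:
    """Normalized histogram of one channel of a boundary frame image."""
    counts = [0] * bins
    for p in pixels:
        counts[p[channel] * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts] if total else [0.0] * bins


def histogram_similarity(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Histogram intersection in [0, 1] (an assumed similarity measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def average_histogram_similarity(
    img1: Sequence[Pixel], img2: Sequence[Pixel], bins: int = 8
) -> float:
    """Mean of the per-channel similarities over the three channels."""
    sims = [
        histogram_similarity(
            channel_histogram(img1, c, bins), channel_histogram(img2, c, bins)
        )
        for c in range(3)
    ]
    return sum(sims) / 3.0


def comprehensive_similarity(graph_sim: float, hist_sim: float, alpha: float = 0.5) -> float:
    """Weighted sum of the two similarities (the weighting is an assumption)."""
    return alpha * graph_sim + (1.0 - alpha) * hist_sim


if __name__ == "__main__":
    dark = [(0, 0, 0), (10, 20, 30)]
    bright = [(255, 255, 255), (250, 240, 230)]
    print(average_histogram_similarity(dark, dark))
    print(average_histogram_similarity(dark, bright))
```

Images are represented here as flat sequences of pixel tuples to keep the sketch free of external dependencies; an implementation working on real boundary frame images would more likely rely on an image library.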
It will be understood that equivalents and modifications will occur to those skilled in the art in light of the present invention and its spirit, and all such modifications and substitutions are intended to be included within the scope of the present invention as defined in the following claims.