CN118692146B - Intelligent auxiliary training method, system, equipment and medium for badminton - Google Patents
- Publication number: CN118692146B
- Application number: CN202410788636.0A
- Authority: CN (China)
- Prior art keywords: target, gesture, target image, image, pixel
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V40/23: Recognition of whole body movements, e.g. for sport training
- G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- G06V10/56: Extraction of image or video features relating to colour
- G06V10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V20/42: Higher-level, semantic clustering, classification or understanding of video scenes of sport video content
Abstract
The invention discloses an intelligent auxiliary training method, system, equipment and medium for badminton, belonging to the technical field of computer vision. The method comprises: setting a plurality of cameras and selecting one camera as a default camera; acquiring a first target image shot by the default camera and target pixels of a target object, and forming a target area based on the target pixels; acquiring a first gesture of the target object in the target area, comparing the first gesture with the second gestures in a first database, and calculating the highest similarity of the first gesture; acquiring a second target image, which is the K-th frame image shot by the default camera, and judging whether the target object in the second target image is blocked; if it is blocked, acquiring and analyzing a supplementary video image, otherwise acquiring the first gesture from the second target image; and averaging the highest similarities of all the first gestures to obtain the motion score of the target object. The method and the device improve the accuracy of gesture recognition in badminton training.
Description
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an intelligent auxiliary training method, system, equipment and medium for badminton.
Background
In badminton training, the accuracy of a player's posture and actions is critical to improving performance and assisting training. With the rapid development of computer vision, obtaining the human posture of a player through video analysis and generating auxiliary training data has become a new training means.
In an existing intelligent auxiliary training method for badminton, for example that disclosed in the Chinese patent document with publication number CN116958872A, two-dimensional ball-path detection and tracking outputs the position and time-sequence information of the shuttlecock, and the ball trajectory is reconstructed and optimized in three-dimensional space. This realizes an athlete technical-action acquisition method based on binocular and monocular viewing angles, introduces additional condition constraints from angles such as time sequence, field and competition rules, and attempts to obtain a more accurate estimate of the three-dimensional human posture of a badminton player under a monocular viewing angle.
However, the above prior-art method needs to introduce additional condition constraints from time sequence, field, competition rules and other angles. Under certain special conditions, such as a non-standard field or a change of competition rules, the accuracy and effectiveness of the system may be affected; limitations of the monocular viewing angle, such as viewing-angle occlusion and illumination changes, may also affect the accuracy of human posture estimation. An intelligent auxiliary training method for badminton that provides higher accuracy in human posture estimation is therefore required.
Disclosure of Invention
In order to solve the problems, the invention provides an intelligent auxiliary training method, system, equipment and medium for badminton, which are used for solving the problems in the prior art.
In order to achieve the above object, the present invention provides an intelligent training aid method for badminton, comprising:
S1, setting a plurality of cameras, selecting one camera as a default camera, and simultaneously shooting supplementary video images by the other cameras; obtaining a first target image shot by the default camera, wherein the first target image is the first frame image comprising a target object, and obtaining target pixels of the target object based on the first target image;
S2, forming a target area based on the target pixels, acquiring a first gesture of the target object in the target area, establishing a first database, comparing the first gesture with the second gestures in the first database, calculating the highest similarity of the first gesture, and storing the first gesture and the highest similarity of the first target image into the first database;
S3, acquiring a second target image, wherein the second target image is the K-th frame image shot by the default camera after the first target image; locating the target area in the second target image based on the target pixels and judging whether the target object is blocked; if the target object is blocked, acquiring the supplementary video images and analyzing them to acquire the first gesture of the target object at that moment; if not, acquiring the first gesture from the second target image; and storing the first gesture into the first database;
And S4, adding the highest similarities of all the first gestures in the first database, calculating the average value, and taking the average value as the motion score corresponding to the target object.
Further, locating the target region in the second target image comprises the steps of:
In the first target image, setting the size of a target area based on the target object; acquiring the colors of the pixel points included in the target area of the first target image and defining them as target pixel points; calculating a first number of target pixel points of each color in the target area; locating candidate positions where the target pixel points appear in the second target image and generating candidate areas at the candidate positions, wherein the size of each candidate area is the same as that of the target area; calculating a second number of pixel points of each color in each candidate area; calculating a difference value of the pixel-point numbers of each color based on the first number and the second number; and setting the candidate area with the smallest sum of the difference values as the target area.
Further, acquiring the first pose of the target object within the target area comprises the steps of:
Acquiring a first pixel, wherein the first pixel is any pixel in the target area; calculating a first difference value between the first pixel and each surrounding pixel; defining the first pixel as a boundary pixel of the target object if the first difference value is greater than or equal to a first threshold value and smaller than a second threshold value; and sequentially connecting the boundary pixels to obtain a boundary contour, the boundary contour being defined as the first gesture of the target object.
Further, calculating the highest similarity of the first pose comprises the steps of:
Setting a plurality of first detection points on the boundary contour of the first gesture, and setting a plurality of second detection points on the boundary contour of each second gesture in the first database; setting a reference position and placing the first gesture and the second gesture based on the reference position; and calculating the similarity α of the first gesture to each second gesture based on a first formula, wherein the first formula is:

α = 1 / [ (1/N) Σ_{q_I ∈ Q_1} min_{q_J ∈ Q_2} ||q_I − q_J||² ]

wherein N is the number of first detection points, q_I is the three-dimensional spatial position coordinate of the I-th first detection point, q_J is the three-dimensional spatial position coordinate of the J-th second detection point, Q_1 is the set of first detection points, Q_2 is the set of second detection points, min is the minimum function obtaining the second detection point closest to the first detection point, and ||q_I − q_J||² is the square of the distance between the first detection point q_I and the second detection point q_J; the highest similarity of the first gesture is obtained by comparing the magnitudes of the similarities.
Further, determining whether the target object is occluded comprises the steps of:
Obtaining the (K-2)-th and (K-1)-th frame images shot by the default camera and defining them as a third target image and a fourth target image respectively; identifying the third target image and the fourth target image to obtain a first skeleton structure and a second skeleton structure, wherein each skeleton structure comprises a plurality of skeleton nodes; marking key nodes among the skeleton nodes; and calculating a first distance between each skeleton node and the default camera based on a monocular ranging algorithm;
And judging the movement trend of the skeleton nodes based on the time sequence of the third target image and the fourth target image, generating a third skeleton structure for the second target image based on the movement trend and the second skeleton structure, and generating a human body model based on the skeleton nodes in the third skeleton structure; if an overlapping area exists in the human body model, defining the skeleton nodes appearing in the overlapping area as target nodes, and if the target nodes comprise the key nodes and the first distance of a key node is larger than that of the remaining target nodes, judging that the target object is blocked.
Further, storing the first gesture to the first database may further comprise:
And acquiring the position information of each key node in each first gesture, generating rotation-angle data corresponding to each first gesture, calculating a second difference value between the rotation-angle data of different first gestures, and eliminating the first gestures whose second difference value is smaller than a third threshold value.
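One possible reading of this filtering step can be sketched as follows, assuming the rotation-angle data of each gesture can be reduced to a single comparable value; the function name and the compare-against-last-kept strategy are illustrative, not mandated by the claim:

```python
def filter_small_changes(gestures, angles, third_threshold):
    """Keep a first gesture only when its rotation-angle data differs from
    the previously kept gesture by at least the third threshold; gestures
    whose second difference value is smaller are eliminated."""
    kept, last_angle = [], None
    for gesture, angle in zip(gestures, angles):
        # second difference value: change in rotation angle vs. the last kept gesture
        if last_angle is None or abs(angle - last_angle) >= third_threshold:
            kept.append(gesture)
            last_angle = angle
    return kept
```

This keeps the database compact by discarding frames whose pose barely changed, as the description of the first database suggests.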
The invention also provides an intelligent auxiliary training system for badminton, which is used for realizing the method, and mainly comprises the following steps:
The acquisition module is used for setting a plurality of cameras, selecting one camera as a default camera, the other cameras simultaneously shooting supplementary video images, and acquiring a first target image shot by the default camera, wherein the first target image is the first frame image comprising a target object, and acquiring target pixels of the target object based on the first target image;
The first generation module forms a target area based on the target pixel, acquires a first gesture of the target object in the target area, establishes a first database, compares the first gesture with a second gesture in the first database, acquires the second gesture with the highest similarity with the first gesture, and stores the first gesture and the similarity of the first target image into the first database;
The second generation module is used for acquiring a second target image, wherein the second target image is a K-th frame image shot by the default camera and behind the first target image, positioning the target area in the second target image based on the target pixels, judging whether the target object is blocked, acquiring the supplementary video image if the target object is blocked, analyzing the supplementary video image to acquire the first gesture of the target object at the moment, acquiring the first gesture from the second target image if the target object is not blocked, and storing the first gesture into the first database;
and the scoring module is used for adding the similarity of all the first gestures in the first database, calculating an average value, and taking the average value as a motion score corresponding to the target object.
The invention also provides equipment for realizing the method, which comprises a processor, a memory and a communication bus, wherein the processor and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
The processor being adapted to execute a program stored in the memory for implementing the method of any one of claims 1-6.
The invention also provides a computer medium, wherein the computer medium stores program instructions, and the program instructions control equipment where the computer medium is located to execute the method when running.
Compared with the prior art, the invention has the following beneficial effects:
According to the method and the device, a target area is formed based on target pixels in a first target image under a default camera, and gesture identification is carried out within the target area; this clarifies the range of image processing and effectively reduces the interference of background information with the identification result. By comparing the first gesture with the known second gestures in a database, the gesture with the highest similarity to the target gesture is found, which improves the accuracy of gesture identification and reduces the possibility of identification errors. The first gesture of the target object is determined using cameras at multiple viewing angles, so gesture information can be identified in real time even if the target object is blocked, improving the robustness of gesture identification. Finally, by storing the first gestures and the similarity values in the first database, a motion score corresponding to the target object is obtained, providing visual data presentation and quantified information feedback for athletes or trainers.
Drawings
FIG. 1 is a flow chart of the steps of the intelligent auxiliary training method for badminton;
FIG. 2 is a block diagram of an intelligent training aid system for badminton.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It will be understood that the terms "first," "second," and the like, as used herein, may be used to describe various elements, but these elements are not limited by these terms unless otherwise specified. These terms are only used to distinguish one element from another element. For example, a first xx script may be referred to as a second xx script, and similarly, a second xx script may be referred to as a first xx script, without departing from the scope of this disclosure.
As shown in fig. 1, an intelligent auxiliary training method for badminton, includes:
S1, setting a plurality of cameras, selecting one camera as a default camera, simultaneously shooting supplementary video images by the other cameras, and acquiring a first target image shot by the default camera, wherein the first target image is the first frame image comprising a target object, and acquiring target pixels of the target object based on the first target image.
Specifically, in this embodiment, in order to better identify the human posture of the athlete or trainer during badminton training, a plurality of cameras at different viewing angles are set up for video shooting. Factors such as viewing angle, image definition, frame rate, stability and illumination conditions are considered comprehensively, and the camera that can best capture the target object is selected as the default camera. The first target image is the first frame image shot by the default camera; this image may include a plurality of moving objects, for example several athletes or trainers, shuttlecocks and badminton rackets, and the target object is the athlete A designated for observation.
And S2, forming a target area based on target pixels, acquiring a first gesture of a target object in the target area, establishing a first database, comparing the first gesture with a second gesture in the first database, calculating to obtain the highest similarity of the first gesture, and storing the first gesture and the highest similarity of the first target image into the first database.
Specifically, in this embodiment, the target area is first determined according to the pixel proportion of the target pixels of the target object. The target area is a rectangular or square area that includes athlete A but may also include other moving objects, for example a shuttlecock. The boundary pixels in the target area are identified to form a first gesture of the target object (described in detail below), where the first gesture refers to the gesture profile of the target object in the first target image. The first gesture is compared with the second gestures stored in advance in a first database, the second gestures being the standard gestures of an athlete in each state of badminton training, and the second gesture with the highest similarity to the first gesture is acquired (the similarity calculation is described in detail below). The highest similarity and the information of the first gesture are stored in the first database, which stores the gesture information of each target image (the first, second, third, ..., N-th target images); images producing only small gesture changes can be filtered out according to the magnitude of the gesture change (described in detail below), saving storage space in the first database. By determining the target area, the invention reduces the processing range of the first target image to a certain extent and improves the efficiency of gesture recognition of the target object, and storing all gesture and similarity information in the first database facilitates the display and query of subsequent data.
And S3, acquiring a second target image, wherein the second target image is a K-th frame image shot by a default camera and behind the first target image, positioning a target area in the second target image based on target pixels, judging whether the target object is blocked, acquiring a supplementary video image if the target object is blocked, analyzing the supplementary video image to acquire a first posture of the target object at the moment, acquiring the first posture from the second target image if the target object is not blocked, and storing the first posture into a first database.
Specifically, in this embodiment, during training, athlete A needs to frequently move, jump and strike according to the flight path of the shuttlecock, and the default camera captures multiple frames of images in this process; the second target image is the K-th frame image after the first target image. Athlete A in the second target image is identified and located within a target area (described in detail below) according to the pixel proportion of athlete A. Because of this movement, athlete A may be partially blocked by other athletes or by moving objects such as the shuttlecock, so that the first gesture of athlete A cannot be seen completely from the viewing angle of the default camera; in that case, the video images shot by the cameras at the other viewing angles are acquired and the first gesture of the target object is acquired from them. If no blocking occurs, the first gesture is acquired directly from the second target image. The first gesture is then stored in the first database.
And S4, adding the highest similarity of all the first gestures in the first database, calculating an average value, and taking the average value as a motion score corresponding to the target object.
Specifically, in this embodiment, the highest similarities corresponding to all the first gestures in the first database are added and averaged to obtain the motion score of athlete A for the whole badminton training process. The higher the average value, the higher the athlete's badminton training level and the more standard the first gestures; where the similarity is low, training can be adjusted with reference to the second gesture in the corresponding state.
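Under these assumptions, step S4 reduces to a simple average; `motion_score` is a hypothetical name used only for this sketch:

```python
def motion_score(highest_similarities):
    """S4: add the highest similarity of every first gesture in the first
    database and take the average as the target object's motion score."""
    if not highest_similarities:
        raise ValueError("the first database contains no gestures")
    return sum(highest_similarities) / len(highest_similarities)
```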
It is particularly noted that, by the above technical solution, locating the target area in the second target image comprises the following steps:
In the first target image, the size of a target area is set based on the target object, and the colors of the pixel points included in the target area of the first target image are acquired and defined as target pixel points. The first number of target pixel points of each color in the target area is calculated. Candidate positions where the target pixel points appear in the second target image are located, and candidate areas of the same size as the target area are generated at the candidate positions. The second number of pixel points of each color in each candidate area is calculated, the difference of the pixel-point numbers of each color is calculated based on the first number and the second number, and the candidate area with the smallest sum of differences is set as the target area.
Specifically, in this embodiment, the pixels of the target area in the first target image are defined as target pixels; the target pixels include the pixels of athlete A, for example blue, black, white and red. Suppose the first numbers of the four colors of target pixels of athlete A are 30, 30, 20 and 10, giving a total of 90 pixels; according to the total pixel value and the pixel position distribution, the size of the target area in the first target image is set to 12×12, and target areas of the same 12×12 size are used in the second target image. Because the posture of athlete A changes during training, one candidate area may contain the four colors of target pixels of athlete A with second numbers of, for example, 29, 31, 20 and 9, while another candidate area contains the target pixels of shuttlecock B, whose colors include white and black with second numbers of 2 and 6. Two candidate areas are thus generated for the two moving objects, athlete A and shuttlecock B; the sum of the color-count differences against the first numbers is computed for each candidate area, and the candidate area with the smallest sum, here the one containing athlete A, is set as the target area.
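The color-count matching above can be sketched as follows, assuming each region is given as a flat list of color labels; the function names are illustrative:

```python
from collections import Counter

def color_counts(region_pixels):
    """First/second number: how many pixels of each color a region contains."""
    return Counter(region_pixels)

def pick_target_area(first_counts, candidate_regions):
    """Return the candidate area whose per-color pixel counts differ least
    (smallest sum of absolute differences) from the counts observed in the
    first target image's target area."""
    def total_difference(counts):
        colors = set(first_counts) | set(counts)
        return sum(abs(first_counts.get(c, 0) - counts.get(c, 0)) for c in colors)
    return min(candidate_regions, key=lambda r: total_difference(color_counts(r)))

# Worked example using the counts from the embodiment (30, 30, 20, 10):
first = Counter({"blue": 30, "black": 30, "white": 20, "red": 10})
athlete = ["blue"] * 29 + ["black"] * 31 + ["white"] * 20 + ["red"] * 9
shuttle = ["white"] * 2 + ["black"] * 6
```

With these numbers the athlete region has a difference sum of 3 and the shuttlecock region of 82, so the athlete region is selected.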
Acquiring the first pose of the target object within the target region comprises the steps of:
Acquiring a first pixel, wherein the first pixel is any pixel in the target area; calculating a first difference value between the first pixel and each surrounding pixel; defining the first pixel as a boundary pixel of the target object if the first difference value is greater than or equal to a first threshold value and smaller than a second threshold value; and sequentially connecting the boundary pixels to obtain a boundary contour, the boundary contour being defined as the first gesture of the target object.
Specifically, in this embodiment, any pixel in the target area is defined as a first pixel. Since the target pixels usually exist together as a connected whole, the first difference value between any first pixel and each of its surrounding pixels is calculated; a first difference value that is greater than or equal to the first threshold value and smaller than the second threshold value indicates a transition between the background pixels of the target area and the target pixels of athlete A. The magnitudes of the first threshold value and the second threshold value are set so as to identify pixels where the pixel value jumps, with the jump falling within the preset range; such pixels are defined as boundary pixels, and the contour information of the first gesture of the target object is formed from the boundary pixels.
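A minimal sketch of this boundary-pixel test on a grayscale grid; the 8-neighborhood, single-channel values and names are assumptions for illustration:

```python
def boundary_pixels(gray, first_threshold, second_threshold):
    """Return (row, col) positions whose absolute difference to at least one
    8-neighbor falls into [first_threshold, second_threshold)."""
    h, w = len(gray), len(gray[0])
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]
    result = []
    for y in range(h):
        for x in range(w):
            # boundary pixel: the pixel value jumps, but within the preset range
            if any(first_threshold <= abs(gray[y][x] - gray[y + dy][x + dx]) < second_threshold
                   for dy, dx in neighbours
                   if 0 <= y + dy < h and 0 <= x + dx < w):
                result.append((y, x))
    return result
```

A jump of 100 is ignored with thresholds (5, 50) but detected with thresholds (5, 200), matching the two-threshold condition in the claim.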
Calculating the highest similarity for the first pose comprises the steps of:
Setting a plurality of first detection points on the boundary contour of the first gesture, setting a plurality of second detection points on the boundary contour of each second gesture in the first database, setting a reference position, placing the first gesture and the second gesture based on the reference position, and calculating the similarity α of the first gesture to each second gesture based on the first formula α = 1 / [ (1/N) Σ_{q_I ∈ Q_1} min_{q_J ∈ Q_2} ||q_I − q_J||² ], wherein N is the number of first detection points, q_I and q_J are the three-dimensional spatial position coordinates of the I-th first detection point and the J-th second detection point, Q_1 and Q_2 are the sets of first and second detection points, min obtains the second detection point closest to the first detection point, and ||q_I − q_J||² is the square of the distance between them; the highest similarity of the first gesture is obtained by comparing the similarities.
Specifically, in this embodiment, for example, 50 first detection points are set on the boundary contour of the first gesture, and 50 second detection points are set on the boundary contour of the corresponding second gesture in the first database. Several reference positions are set, for example positions important to the gesture such as the head, left shoulder, right shoulder, elbow, wrist, fingers and waist; using these reference positions as reference points, the first gesture and the second gesture are placed on the same spatial plane, and the similarity α of the first gesture to each second gesture is calculated with the first formula. According to min ||q_I − q_J||², the distance from each first detection point to its closest second detection point is acquired, where q_I and q_J are the three-dimensional spatial position coordinates of the I-th first detection point and the J-th second detection point respectively. The reciprocal of the average of these minimum squared distances over all first detection points is taken as the similarity of the first gesture to that second gesture; by comparing the similarities, the highest similarity of the first gesture is obtained and stored in the first database.
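The first formula can be sketched directly as follows; points are assumed to be (x, y, z) tuples and the function names are illustrative:

```python
def similarity(first_points, second_points):
    """alpha = 1 / ((1/N) * sum over q_I in Q_1 of min over q_J in Q_2 of
    ||q_I - q_J||^2): the reciprocal of the mean squared distance from each
    first detection point to its nearest second detection point."""
    n = len(first_points)
    total = sum(
        min(sum((a - b) ** 2 for a, b in zip(q_i, q_j)) for q_j in second_points)
        for q_i in first_points
    )
    if total == 0:  # identical point sets: treat as maximal similarity
        return float("inf")
    return n / total

def highest_similarity(first_points, standard_gestures):
    """Compare the first gesture with every second gesture and keep the best."""
    return max(similarity(first_points, g) for g in standard_gestures)
```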
Determining whether the target object is occluded comprises the steps of:
The method comprises the steps of obtaining the K-2 and K-1 frame images shot by the default camera, defining them as a third target image and a fourth target image respectively, identifying the third target image and the fourth target image to obtain a first skeleton structure and a second skeleton structure, each comprising a plurality of skeleton nodes, marking key nodes among the skeleton nodes, and calculating a first distance between each skeleton node and the default camera based on a monocular ranging algorithm.
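The patent does not fix a particular monocular ranging scheme; one common pinhole-model approach estimates distance from the known real-world length of a body segment and its apparent length in pixels. A sketch under that assumption (names and the choice of method are illustrative):

```python
def monocular_distance(focal_length_px: float,
                       real_length_m: float,
                       pixel_length: float) -> float:
    """Pinhole-camera distance estimate: Z = f * L / l.

    focal_length_px: camera focal length expressed in pixels.
    real_length_m: known real-world length of the reference segment
                   (e.g. an assumed torso length for the athlete).
    pixel_length: apparent length of that segment in the image, in pixels.
    """
    return focal_length_px * real_length_m / pixel_length
```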
The movement trend of the skeleton nodes is judged based on the time sequence of the third target image and the fourth target image, a third skeleton structure of the second target image is generated based on the movement trend and the second skeleton structure, and a human body model is generated based on the skeleton nodes in the third skeleton structure. If the human body model has an overlapping area, the skeleton nodes appearing in the overlapping area are defined as target nodes; if the target nodes include a key node and the first distance of the key node is larger than that of the other target nodes, it is judged that the target object is occluded.
Specifically, in the present embodiment, three-dimensional bone-node information of athlete A is detected from the third target image and the fourth target image (the K-2 and K-1 frame images taken by the default camera) using a tool such as OpenPose, and a bone structure is formed from this information. The bone structure may include the head, neck, left and right shoulders, left and right elbows, left and right wrists, left and right hips, left and right knees, femoral heads, left and right ankles, left and right feet, the spine and the pelvis. From these, the bone nodes important for recognizing badminton training postures are selected as key nodes, such as the head, neck, left and right shoulders, left and right elbows, left and right wrists, left and right hips, left and right knees, left and right ankles, left and right feet, and left and right hands. The first distance of each bone node from the default camera is then calculated based on a monocular ranging algorithm, a computer-vision technique for estimating the distance between an object and a single camera.
Then, the movement trend of a bone node is judged from the difference of its positions in the third target image and the fourth target image. For example, if the positions of bone node Q in the third and fourth target images (the K-2 and K-1 frame images) are P_{k-2} and P_{k-1} respectively, different weights W_1 and W_2 can be set, and the position P_k of bone node Q in the second target image is obtained according to the formula P_k = W_1·P_{k-1} + W_2·P_{k-2}. The third bone structure of the second target image is then generated from the predicted node positions and the distances between the bone nodes in the second bone structure.
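The weighted prediction P_k = W_1·P_{k-1} + W_2·P_{k-2} can be sketched as below. The weights are not fixed by the text; the defaults shown (W_1 = 2, W_2 = −1) are an illustrative choice that reduces to constant-velocity extrapolation:

```python
import numpy as np

def predict_node_position(p_k_minus_1, p_k_minus_2,
                          w1: float = 2.0, w2: float = -1.0) -> np.ndarray:
    """Predict a bone node's position in frame K from frames K-1 and K-2.

    Implements P_k = w1 * P_{k-1} + w2 * P_{k-2}. With w1=2, w2=-1 this
    equals P_{k-1} + (P_{k-1} - P_{k-2}), i.e. the node keeps moving
    with the velocity observed between the two previous frames.
    """
    return (w1 * np.asarray(p_k_minus_1, dtype=float)
            + w2 * np.asarray(p_k_minus_2, dtype=float))
```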
The human body model is generated based on the positions and distances of the skeleton nodes in the third skeleton structure. Because the hitting postures of athlete A differ while receiving the shuttlecock, the human body model may contain an overlapping area. For example, during a jump (assuming athlete A hits the ball with the right hand), the left-hand position and the left-thigh position (excluding the left knee) may overlap, forming an overlapping area; the nodes within it, including the left hand, are defined as target nodes. The left hand is a key node set in advance. Its first distance to the default camera is compared with the first distances of the other target nodes in the left-thigh overlap area, such as the femoral head, greater trochanter and lesser trochanter. If the first distance of the key node is greater than that of the other target nodes, the left hand serving as the key node is occluded, its specific position cannot be identified from the default camera, and posture recognition must be performed through a camera at another viewing angle.
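The occlusion test described above — a pre-marked key node inside the overlap region that is farther from the default camera than every other overlapping node — can be sketched as follows (the data layout is an assumption; the patent does not prescribe one):

```python
def is_occluded(target_nodes, key_nodes, first_distance):
    """Judge occlusion for skeleton nodes found in a model overlap region.

    target_nodes: names of the skeleton nodes inside the overlap area.
    key_nodes: set of pre-marked key node names.
    first_distance: mapping node name -> distance to the default camera.
    """
    for node in target_nodes:
        if node not in key_nodes:
            continue
        others = [n for n in target_nodes if n != node]
        # A key node farther away than all other overlapping nodes is
        # hidden behind them from the default camera's viewpoint.
        if others and all(first_distance[node] > first_distance[n]
                          for n in others):
            return True
    return False
```

In the example above, the left hand (key node) overlapping the thigh nodes at a greater camera distance would trigger a switch to another camera view.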
Storing the first pose in the first database further comprises the following operations:
Position information of each key node in each first gesture is obtained, rotation angle data corresponding to each first gesture are generated, second difference values between the rotation angle data of different first gestures are calculated, and first gestures whose second difference value is smaller than a third threshold value are eliminated.
Specifically, in this embodiment, the position information of the key nodes in each first gesture stored in the first database is obtained, and the rotation angle of each key node (joint angle together with joint distance) is derived from the distances between skeleton nodes in the skeleton model using the principles of inverse and forward kinematics. A second difference value of the rotation angles at the same key node across multiple first gestures is calculated; first gestures whose second difference value is smaller than a third threshold value (which can be set to 1), i.e. gestures with little change, are eliminated. Only the first gestures representing distinct movement states remain, such as the preparation, serving, receiving, moving (selected differently for different actions), hitting and jump-smash states. This reduces redundant data, optimizes the storage space of the first database, improves the accuracy of badminton posture analysis, and speeds up data processing.
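The redundancy-elimination step can be sketched as a sequential filter over the stored first gestures. The comparison order and the per-node angle representation are assumptions; the threshold of 1 degree follows the example above:

```python
def prune_similar_gestures(gestures, key_nodes, third_threshold=1.0):
    """Drop first gestures whose key-node rotation angles barely change.

    gestures: list of dicts mapping key-node name -> rotation angle
              (degrees), in temporal order.
    A gesture is eliminated when, for every key node, its absolute angle
    difference to the last kept gesture is below the threshold, so only
    gestures representing distinct movement states remain.
    """
    kept = []
    for g in gestures:
        if kept and all(abs(g[k] - kept[-1][k]) < third_threshold
                        for k in key_nodes):
            continue  # second difference too small: redundant pose
        kept.append(g)
    return kept
```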
As shown in fig. 2, the present invention further provides an intelligent training aid system for badminton, for implementing the above method, which mainly includes:
The acquisition module is used for setting a plurality of cameras, selecting one camera as a default camera, with the other cameras simultaneously shooting supplementary video images, acquiring a first target image shot by the default camera, wherein the first target image is the first frame image that includes a target object, and acquiring target pixels of the target object based on the first target image.
The first generation module is used for forming a target area based on the target pixels, acquiring a first gesture of the target object in the target area, establishing a first database, comparing the first gesture with the second gestures in the first database, calculating the highest similarity of the first gesture, and storing the first gesture of the first target image and its highest similarity in the first database.
The second generation module is used for acquiring a second target image, wherein the second target image is the K-th frame image shot by the default camera after the first target image, positioning the target area in the second target image based on the target pixels, and judging whether the target object is occluded; if occluded, acquiring the supplementary video images and analyzing them to obtain the first gesture of the target object at that moment; if not occluded, acquiring the first gesture from the second target image; and storing the first gesture in the first database.
And the scoring module is used for adding the highest similarity of all the first gestures in the first database, calculating an average value, and taking the average value as a motion score corresponding to the target object.
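The scoring step is a plain average of the stored highest similarities; a one-line sketch (names illustrative):

```python
def motion_score(highest_similarities):
    """Average of all per-gesture highest similarities in the database."""
    return sum(highest_similarities) / len(highest_similarities)
```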
The invention also provides equipment for realizing the method, which comprises a processor, a memory and a communication bus, wherein the processor and the memory are communicated with each other through the communication bus.
And a memory for storing a computer program.
A processor for executing a program stored in a memory, implementing the method of any one of claims 1-6.
The invention also provides a computer medium, wherein the computer medium stores program instructions, and the program instructions control equipment where the computer medium is located to execute the method when running.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments, and these sub-steps or stages are not necessarily executed sequentially but may be performed in turn or alternately with at least some of the sub-steps or stages of other steps.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program, which may be stored on a non-transitory computer-readable storage medium and which, when executed, may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the foregoing embodiments may be arbitrarily combined, and for brevity, all of the possible combinations of the technical features of the foregoing embodiments are not described, however, they should be considered as the scope of the disclosure as long as there is no contradiction between the combinations of the technical features.
The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.
Claims (5)
1. An intelligent auxiliary training method for badminton, which is characterized by comprising the following steps:
S1, setting a plurality of cameras, selecting one camera as a default camera, with the other cameras simultaneously shooting supplementary video images; obtaining a first target image shot by the default camera, wherein the first target image is the first frame image that includes a target object, and obtaining target pixels of the target object based on the first target image;
S2, forming a target area based on the target pixel, acquiring a first gesture of the target object in the target area, comparing the first gesture with a second gesture in a database, calculating to obtain the highest similarity of the first gesture, and storing the first gesture and the highest similarity of the first target image into the database;
Wherein acquiring the first pose of the target object within the target area comprises the steps of:
Acquiring a first pixel, wherein the first pixel is any pixel in a target area, calculating a first difference value of the first pixel and each surrounding pixel, defining the first pixel as a boundary pixel of the target object if the first difference value is greater than or equal to a first threshold value and smaller than a second threshold value, sequentially connecting the boundary pixels to obtain a boundary contour, and defining the boundary contour as the first gesture of the target object;
Calculating the highest similarity of the first pose comprises the following steps:
Setting a plurality of first detection points on the boundary contour, setting a plurality of second detection points on a boundary contour of a second gesture in the database, setting a reference position, placing the first gesture and the second gesture based on the reference position, and calculating the similarity α of the first gesture and each second gesture based on a first formula, the first formula being:

α = 1 / [ (1/N) · Σ_{q_I ∈ Q_1} min_{q_J ∈ Q_2} ‖q_I − q_J‖² ]

wherein N is the number of said first detection points, q_I is the three-dimensional spatial position coordinate of the I-th first detection point, q_J is the three-dimensional spatial position coordinate of the J-th second detection point, Q_1 is the set of said first detection points, Q_2 is the set of said second detection points, min is the minimum function that obtains the second detection point nearest to the first detection point, and ‖q_I − q_J‖² is the square of the distance between the first detection point q_I and the second detection point q_J; the similarities are compared with each other to obtain the highest similarity of the first gesture;
S3, acquiring a second target image, wherein the second target image is a K-th frame image shot by the default camera and behind the first target image, positioning the target area in the second target image based on the target pixels, judging whether the target object is blocked, acquiring the supplementary video image if the target object is blocked, analyzing the supplementary video image to acquire a first gesture of the target object at the moment, acquiring the first gesture from the second target image if the target object is not blocked, and storing the first gesture into a database;
Wherein, judging whether the target object is blocked comprises the following steps:
Obtaining K-2 and K-1 frame images shot by the default camera, defining the K-2 and K-1 frame images as a third target image and a fourth target image respectively, identifying the third target image and the fourth target image, obtaining a first skeleton structure and a second skeleton structure, wherein the skeleton structure comprises a plurality of skeleton nodes, marking key nodes in the skeleton nodes, and respectively calculating a first distance between each skeleton node and the default camera based on a monocular ranging algorithm;
Judging the movement trend of the skeleton nodes based on the time sequence of the third target image and the fourth target image, generating a third skeleton structure of the second target image based on the movement trend and the second skeleton structure, generating a human body model based on the skeleton nodes in the third skeleton structure, defining the skeleton nodes appearing in an overlapping area as target nodes if the human body model has an overlapping area, and judging that the target object is occluded if the target nodes include the key nodes and the first distance of the key nodes is larger than that of the rest of the target nodes;
the method further comprises the following operations after the first gesture is acquired from the second target image:
acquiring position information of each key node in each first gesture, generating rotation angle data corresponding to each first gesture, calculating a second difference value of the rotation angle data of different first gestures, and eliminating the first gestures with the second difference value smaller than a third threshold value;
And S4, adding the highest similarity of all the first gestures in the database, calculating an average value, and taking the average value as a motion score corresponding to the target object.
2. The method of claim 1, wherein locating the target region in the second target image comprises the steps of:
In the first target image, setting the size of the target area based on the target object, acquiring the colors of the pixel points included in the target area of the first target image and defining those pixel points as target pixel points, calculating a first number of target pixel points of each different color in the target area, locating alternative positions where the target pixel points appear in the second target image, generating alternative areas at the alternative positions, wherein the size of each alternative area is the same as that of the target area, calculating a second number of pixel points of each different color in each alternative area, calculating difference values of the numbers of pixel points of the different colors based on the first number and the second number, and setting the alternative area with the smallest sum of the difference values as the target area.
3. An intelligent training aid system for badminton, for implementing the method according to any one of claims 1-2, characterized in that it comprises the following modules:
The acquisition module is used for setting a plurality of cameras, selecting one camera as a default camera, simultaneously shooting supplementary video images by other cameras, acquiring a first target image shot by the default camera, wherein the first target image is a first frame image comprising a target object, and acquiring target pixels of the target object based on the first target image;
A first generation module, for forming a target area based on the target pixel, acquiring a first gesture of the target object in the target area, and comparing the first gesture with second gestures in a database to calculate the highest similarity of the first gesture, wherein acquiring the first gesture of the target object in the target area comprises the steps of: acquiring a first pixel, which is any pixel in the target area; calculating a first difference value between the first pixel and each surrounding pixel; defining the first pixel as a boundary pixel of the target object if the first difference value is greater than or equal to a first threshold and smaller than a second threshold; sequentially connecting the boundary pixels to obtain a boundary contour; and defining the boundary contour as the first gesture of the target object; and wherein calculating the highest similarity of the first gesture comprises the steps of: setting a plurality of first detection points on the boundary contour, setting a plurality of second detection points on the boundary contour of a second gesture in the database, setting a reference position, placing the first gesture and the second gesture based on the reference position, and calculating the similarity α of the first gesture and each second gesture based on a first formula, the first formula being:

α = 1 / [ (1/N) · Σ_{q_I ∈ Q_1} min_{q_J ∈ Q_2} ‖q_I − q_J‖² ]

wherein N is the number of said first detection points, q_I is the three-dimensional spatial position coordinate of the I-th first detection point, q_J is the three-dimensional spatial position coordinate of the J-th second detection point, Q_1 is the set of said first detection points, Q_2 is the set of said second detection points, min is the minimum function that obtains the second detection point nearest to the first detection point, and ‖q_I − q_J‖² is the square of the distance between the first detection point q_I and the second detection point q_J; the similarities are compared with each other to obtain the highest similarity of the first gesture;
The second generation module acquires a second target image, wherein the second target image is the K-th frame image shot by the default camera after the first target image, positions the target area in the second target image based on the target pixels, and judges whether the target object is occluded; if occluded, it acquires the supplementary video image and analyzes it to obtain the first gesture of the target object at that moment; if not occluded, it acquires the first gesture from the second target image; wherein judging whether the target object is occluded comprises the following steps: obtaining the K-2 and K-1 frame images shot by the default camera, respectively defining them as a third target image and a fourth target image, identifying the third target image and the fourth target image to obtain a first bone structure and a second bone structure, wherein each bone structure comprises a plurality of bone nodes, marking key nodes among the bone nodes, respectively calculating a first distance between each bone node and the default camera based on a monocular ranging algorithm, judging a movement trend of the bone nodes based on the time sequence of the third target image and the fourth target image, generating a third bone structure of the second target image based on the movement trend and the second bone structure, generating a human body model based on the bone nodes in the third bone structure, defining the bone nodes appearing in an overlapping area as target nodes if the human body model has an overlapping area, and judging that the target object is occluded if the target nodes include the key nodes and the first distance of the key nodes is larger than that of the rest of the target nodes; and acquiring position information of each key node in each first gesture, generating rotation angle data corresponding to each first gesture, calculating a second difference value of the rotation angle data of different first gestures, and eliminating the first gestures with the second difference value smaller than a third threshold value;
And the scoring module is used for adding the highest similarity of all the first gestures, calculating an average value, and taking the average value as a motion score corresponding to the target object.
4. An intelligent training aid for badminton, characterized in that the aid comprises a processor and a memory for storing at least one section of a computer program, which is loaded by the processor and which carries out the method according to any of claims 1 to 2.
5. A computer medium, characterized in that the computer medium stores program instructions, wherein the program instructions, when run, control a device in which the computer medium is located to perform the method of any of claims 1-2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410788636.0A CN118692146B (en) | 2024-06-19 | 2024-06-19 | Intelligent auxiliary training method, system, equipment and medium for badminton |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118692146A CN118692146A (en) | 2024-09-24 |
CN118692146B true CN118692146B (en) | 2025-01-17 |
Family
ID=92769080
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410788636.0A Active CN118692146B (en) | 2024-06-19 | 2024-06-19 | Intelligent auxiliary training method, system, equipment and medium for badminton |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118692146B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110472462A (en) * | 2018-05-11 | 2019-11-19 | 北京三星通信技术研究有限公司 | Attitude estimation method, the processing method based on Attitude estimation and electronic equipment |
CN111931701A (en) * | 2020-09-11 | 2020-11-13 | 平安国际智慧城市科技股份有限公司 | Gesture recognition method and device based on artificial intelligence, terminal and storage medium |
CN116682268A (en) * | 2023-03-13 | 2023-09-01 | 沈阳航空航天大学 | Portable urban road vehicle violation inspection system and method based on machine vision |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101231755B (en) * | 2007-01-25 | 2013-03-06 | 上海遥薇(集团)有限公司 | Moving target tracking and quantity statistics method |
JP7571461B2 (en) * | 2020-10-26 | 2024-10-23 | セイコーエプソン株式会社 | Identification method, image display method, identification system, image display system, and program |
WO2023108842A1 (en) * | 2021-12-14 | 2023-06-22 | 成都拟合未来科技有限公司 | Motion evaluation method and system based on fitness teaching training |
CN116958872A (en) * | 2023-07-26 | 2023-10-27 | 浙江大学 | Intelligent auxiliary training method and system for badminton |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||