Disclosure of Invention
In view of the foregoing, there is a need for an identity information association system and method, a computer storage medium, and a user device that are based on user behavior analysis and do not require pre-registration.
An identity information association method is applied to an identity information association device, and comprises the following steps:
identifying individuals within a scene;
recording sound information within the scene;
identifying individual sounds in the sound information;
judging whether a target individual has a response action to a trigger sound in the individual sounds;
recording the trigger sound; and
associating the trigger sound with the target individual.
Further, prior to the associating of the trigger sound with the target individual, the method further comprises the following steps:
analyzing semantics of a plurality of trigger sounds; and
judging whether the number of identical semantics among the plurality of trigger sounds exceeds a preset number.
Further, the judging whether a target individual has a response action to a trigger sound in the individual sounds comprises:
judging whether the target individual has a body action after the trigger sound;
judging whether the amplitude of the body action exceeds a predetermined amplitude; and
judging whether a plurality of individuals have the body action simultaneously.
Further, the body action comprises at least one of a head action, a face action and a hand action, the head action comprises raising or turning the head, the face action comprises a specific mouth action or eye action, and the hand action comprises a hand raising response action.
An identity information association system comprising:
the video monitoring module is used for identifying individuals in a scene;
the sound monitoring module is used for recording sound information in the scene;
the sound identification module is used for identifying the individual sounds in the sound information;
the response judging module is used for judging whether a target individual has a response action to a trigger sound in the individual sounds;
the trigger recording module is used for recording the trigger sound; and
the identity association module is used for associating the trigger sound with the target individual.
Further, the identity information association system further comprises a semantic analysis module, a sound conversion module, and a semantic judgment module. The semantic analysis module is used for analyzing the semantics of a plurality of trigger sounds; the semantic judgment module is used for judging whether the number of identical semantics among the plurality of trigger sounds exceeds a preset number; and the sound conversion module is used for converting the trigger sound into text to be associated with the target individual when the number of identical semantics among the trigger sounds exceeds the preset number.
Further, the response judging module is further configured to judge whether the target individual has a body action after the trigger sound, judge whether the amplitude of the body action exceeds a predetermined amplitude, and judge whether a plurality of individuals have the body action at the same time.
Further, the body action comprises at least one of a head action, a face action and a hand action, the head action comprises raising or turning the head, the face action comprises a specific mouth action or eye action, and the hand action comprises a hand raising response action.
A computer storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the above identity information association method.
A user equipment, comprising:
a processor configured to execute one or more instructions; and
a computer storage medium storing a plurality of instructions adapted to be loaded by the processor to perform the above identity information association method.
According to the identity information association system and method described above, the target individual is stored in association with the text converted from the corresponding appellation. When identity information such as a user's name is needed, only individual features such as physical signs and posture need to be recognized; manual entry is not required.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that when an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. When an element is referred to as being "disposed on" another element, it can be directly on the other element or intervening elements may also be present.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Referring to fig. 1, an embodiment of the present invention provides an identity information association method, which can obtain and record identity information of an individual through behavior analysis. The method comprises the following steps:
step S101: individuals within the scene are identified. The scene is an activity space with a fixed area that is monitored by a video monitoring device, such as a conference room, supermarket, laboratory, classroom, restaurant, or mall. The individual may be a human body, an animal, or a man-made object such as an artificial-intelligence robot. In one embodiment, the video monitoring device is a camera. The video monitoring device can determine and track each individual through physical-sign recognition (such as face recognition) and posture recognition.
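By way of a non-limiting illustration only, step S101 could be realized roughly as in the following Python sketch, which assumes OpenCV and its bundled Haar cascade face detector; the camera index, detector choice, and variable names are assumptions, not part of the disclosure.

```python
# Minimal sketch of step S101 (assumed detector and camera index; a real
# system would use a stronger detector plus a multi-object tracker).
import cv2

cap = cv2.VideoCapture(0)  # camera index 0 is a placeholder
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

ok, frame = cap.read()
if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Each detected face box stands in for one identified individual.
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    individuals = [tuple(box) for box in faces]
cap.release()
```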
Step S102: recording sound information within the scene. Recording the sound in the target scene through a sound monitoring device; in one embodiment, the sound monitoring device is a microphone. Sounds in a scene may include sounds made by individuals and other sounds.
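As an equally non-limiting sketch of step S102, a short clip could be captured with the third-party sounddevice package; the sample rate and clip length below are assumed values.

```python
# Minimal sketch of step S102 (assumed sample rate and duration).
import sounddevice as sd

SAMPLE_RATE = 16000  # Hz, illustrative
DURATION = 5         # seconds, illustrative

# Record a mono clip from the scene's microphone into a NumPy array.
audio = sd.rec(int(DURATION * SAMPLE_RATE),
               samplerate=SAMPLE_RATE, channels=1)
sd.wait()  # block until the recording finishes
```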
Step S103: individual sounds in the sound information are identified. The individual sounds may be identified by sound frequency, by combining with changes in the individual's posture in the video (such as mouth-opening actions), or by recognizing semantics.
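The frequency-based variant of step S103 might look like the sketch below, which treats an audio frame as an individual sound when its dominant frequency falls in a typical human-voice band; the band limits are illustrative assumptions.

```python
# Minimal sketch of the frequency-based option in step S103.
import numpy as np

def is_individual_sound(frame: np.ndarray, sample_rate: int,
                        low_hz: float = 85.0, high_hz: float = 255.0) -> bool:
    """True if the frame's dominant frequency lies in an assumed voice band."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return low_hz <= freqs[np.argmax(spectrum)] <= high_hz
```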
Step S104: judging whether a target individual has a response action to a trigger sound in the individual sounds; if so, executing step S105, and if not, returning to step S101. The response action comprises at least one of a head action, a face action and a hand action; the head action comprises raising or turning the head, the face action comprises a specific mouth action or eye action, and the hand action comprises a hand-raising response. The trigger sound includes an appellation of the target individual, such as a name, an alias, or a nickname.
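The timing side of step S104 can be pictured as follows; the response window is an assumed parameter, since the disclosure does not specify one.

```python
# Minimal sketch of the timing check in step S104.
RESPONSE_WINDOW = 1.5  # seconds after the trigger sound, assumed

def has_response_action(trigger_time: float,
                        action_times: list[float]) -> bool:
    """True if any observed body action starts shortly after the trigger."""
    return any(0.0 < t - trigger_time <= RESPONSE_WINDOW
               for t in action_times)
```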
Step S105: recording the trigger sound.
Step S106: the semantics of a plurality of trigger sounds are analyzed.
Step S107: judging whether the number of identical semantics among the plurality of trigger sounds exceeds a preset number; if so, executing step S108, and if not, returning to step S104.
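Steps S106 and S107 can be illustrated by the sketch below, which approximates "same semantics" by identical transcripts of the trigger sounds; that simplification, and the helper name, are assumptions.

```python
# Minimal sketch of steps S106-S107: count repeated transcripts.
from collections import Counter

PRESET_NUMBER = 2  # the disclosure only requires this to be >= 2

def frequent_semantics(transcripts: list[str]) -> list[str]:
    """Return transcripts whose occurrence count exceeds the preset number."""
    return [text for text, n in Counter(transcripts).items()
            if n > PRESET_NUMBER]
```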
Step S108: associating the trigger sound with the target individual.
Step S109: converting the trigger sound into text to be associated with the target individual.
After the trigger sound is converted into text associated with the target individual, the target individual can be signed in directly through physical-sign recognition (such as face recognition) or posture recognition, without manual registration. Through the target individual and the associated text, other personal data corresponding to the text, such as career history, diagnostic records, and health condition, can be linked in a database through big-data analysis. After physical-sign or posture recognition, the corresponding appellation and related personal data can be retrieved.
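The association and later sign-in described above can be pictured as a simple lookup table keyed by a recognized-feature identifier; the identifier format and function names below are hypothetical.

```python
# Minimal sketch of steps S108-S109 plus sign-in (hypothetical identifiers).
identity_table: dict[str, str] = {}

def associate(individual_id: str, trigger_text: str) -> None:
    """Store the text converted from the trigger sound for this individual."""
    identity_table[individual_id] = trigger_text

def sign_in(individual_id: str) -> str | None:
    """Return the stored appellation for a recognized individual, if any."""
    return identity_table.get(individual_id)
```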
Referring to fig. 2, step S104 includes:
step S201: judging whether the target individual has a body action after the trigger sound; if so, executing step S202, and if not, repeating step S201.
Step S202: judging whether the amplitude of the body action exceeds a predetermined amplitude; if so, executing step S203, and if not, returning to step S201. The body action comprises at least one of a head action, a face action and a hand action; the head action comprises raising or turning the head, the face action comprises a specific mouth action or eye action, and the hand action comprises a hand-raising response.
Step S203: judging whether a plurality of individuals have the body action at the same time; if not, executing step S204, and if so, returning to step S201.
Step S204: the physical action is recorded as a response action.
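Taken together, steps S201 to S204 amount to the combined check sketched below; the amplitude threshold is an assumed value, since the disclosure leaves the predetermined amplitude open.

```python
# Combined sketch of steps S201-S204. An amplitude of zero covers the
# "no body action" case of step S201.
PREDETERMINED_AMPLITUDE = 0.2  # normalized motion amplitude, assumed

def is_response(amplitude: float, others_moving: bool) -> bool:
    if amplitude <= PREDETERMINED_AMPLITUDE:  # S202: motion too small
        return False
    if others_moving:                         # S203: many individuals move
        return False
    return True                               # S204: record as a response
```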
Referring to fig. 3, an identity information association system according to an embodiment of the present invention includes:
the video monitoring module 31 is used for identifying individuals in the scene. The scene is an activity space with a fixed area, such as a conference room, supermarket, laboratory, classroom, restaurant, or mall. The individual may be a human body, an animal, or a man-made object such as an artificial-intelligence robot. In one embodiment, the video monitoring module 31 comprises a camera. The video monitoring module 31 can determine and track each individual through physical-sign recognition (such as face recognition) and posture recognition.
The sound monitoring module 32 is used for recording sound information in the scene. The sound is recorded by a sound monitoring device installed in the target scene; in one embodiment, the sound monitoring device is a microphone. Sounds in a scene may include sounds made by individuals and other sounds.
The sound identification module 33 is used for identifying the individual sounds in the sound information. The individual sounds may be identified by sound frequency, by combining with changes in the individual's posture in the video (such as mouth-opening actions), or by recognizing semantics.
The response judging module 34 is used for judging whether a target individual has a response action to a trigger sound in the individual sounds. The response action comprises at least one of a head action, a face action and a hand action; the head action comprises raising or turning the head, the face action comprises a specific mouth action or eye action, and the hand action comprises a hand-raising response.
The response judging module 34 judges whether there is a response by determining whether the target individual has a body action after the trigger sound, whether the amplitude of the body action exceeds a predetermined amplitude, and whether a plurality of individuals have the body action at the same time. When the amplitude of the body action is too small, or when many individuals move at the same time, the body action is not regarded as a response action.
The trigger recording module 35 is used for recording the trigger sound that triggers the response action.
The semantic analysis module 38 is used for analyzing the semantics of the plurality of trigger sounds.
The semantic judgment module 39 is used for judging whether the number of identical semantics among the plurality of trigger sounds exceeds a preset number. The preset number is greater than or equal to two.
The identity association module 36 is used for associating the trigger sound with the target individual when the number of identical semantics among the plurality of trigger sounds exceeds the preset number.
The sound conversion module 37 is used for converting the trigger sound into text to be associated with the target individual.
After the trigger sound is converted into text associated with the target individual, the target individual can be signed in directly through physical-sign recognition (such as face recognition) or posture recognition, without manual registration. Through the target individual and the associated text, other personal data corresponding to the text, such as career history, diagnostic records, and health condition, can be linked in a database through big-data analysis. After physical-sign or posture recognition, the corresponding appellation and related personal data can be retrieved.
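By way of illustration only, the cooperation of modules 35 to 39 might be wired as in the sketch below; the class name, method names, and data layout are assumptions rather than part of the disclosure.

```python
# Structural sketch of modules 35-39 (all names are assumptions).
class IdentityAssociationSystem:
    def __init__(self, preset_number: int = 2):
        self.preset_number = preset_number
        self.trigger_log: dict[str, list[str]] = {}  # individual -> transcripts
        self.associations: dict[str, str] = {}       # individual -> name text

    def on_trigger(self, individual_id: str, transcript: str) -> None:
        # Trigger recording module 35: log the transcribed trigger sound.
        log = self.trigger_log.setdefault(individual_id, [])
        log.append(transcript)
        # Semantic judgment module 39: same transcript seen often enough?
        if log.count(transcript) > self.preset_number:
            # Identity association module 36 / sound conversion module 37.
            self.associations[individual_id] = transcript
```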
Referring to fig. 4, the present invention also discloses a user equipment, which may include at least one processor 71 (one processor 71 is shown as an example) and a computer storage medium 73. Processor 71 may invoke logic instructions in computer storage medium 73 to perform the methods in the embodiments described above. In one embodiment, the user equipment is a server.
Furthermore, the logic instructions in the computer storage medium 73 can be implemented in the form of software functional units and stored in a computer storage medium when sold or used as a stand-alone product.
The computer storage medium 73 may be configured to store software programs, computer-executable programs, such as program instructions or modules corresponding to the methods in the embodiments of the present disclosure. The processor 71 executes functional applications and data processing, i.e. implements the methods in the above-described embodiments, by running software programs, instructions or modules stored in the computer storage medium 73.
The computer storage medium 73 may include a storage program area and a storage data area. The storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. Further, the computer storage medium 73 may include a high-speed random access storage medium and may also include a non-volatile storage medium, for example, any of a variety of media that can store program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk; it may also be a transient storage medium.
The processor 71 loads and executes one or more instructions stored in the computer storage medium 73 to implement the corresponding steps of the method flows shown in fig. 1-2; in a specific implementation, one or more instructions in a computer storage medium are loaded by a processor and perform the following steps:
step S101: individuals within the scene are identified. The scene is an activity space with a fixed area that is monitored by a video monitoring device, such as a conference room, supermarket, laboratory, classroom, restaurant, or mall. The individual may be a human body, an animal, or a man-made object such as an artificial-intelligence robot. In one embodiment, the video monitoring device is a camera. The video monitoring device can determine and track each individual through physical-sign recognition (such as face recognition) and posture recognition.
Step S102: recording sound information within the scene. Recording the sound in the target scene through a sound monitoring device; in one embodiment, the sound monitoring device is a microphone. Sounds in a scene may include sounds made by individuals and other sounds.
Step S103: individual sounds in the sound information are identified. The individual sounds may be identified by sound frequency, by combining with changes in the individual's posture in the video (such as mouth-opening actions), or by recognizing semantics.
Step S104: judging whether a target individual has a response action to a trigger sound in the individual sounds; if so, executing step S105, and if not, returning to step S101. The response action comprises at least one of a head action, a face action and a hand action; the head action comprises raising or turning the head, the face action comprises a specific mouth action or eye action, and the hand action comprises a hand-raising response. The trigger sound includes an appellation of the target individual, such as a name, an alias, or a nickname.
Step S105: recording the trigger sound.
Step S106: the semantics of a plurality of trigger sounds are analyzed.
Step S107: judging whether the number of identical semantics among the plurality of trigger sounds exceeds a preset number; if so, executing step S108, and if not, returning to step S104.
Step S108: associating the trigger sound with the target individual.
Step S109: converting the trigger sound into text to be associated with the target individual.
After the trigger sound is converted into text associated with the target individual, the target individual can be signed in directly through physical-sign recognition (such as face recognition) or posture recognition, without manual registration. Through the target individual and the associated text, other personal data corresponding to the text, such as career history, diagnostic records, and health condition, can be linked in a database through big-data analysis. After physical-sign or posture recognition, the corresponding appellation and related personal data can be retrieved.
Step S201: judging whether the target individual has a body action after the trigger sound; if so, executing step S202, and if not, repeating step S201.
Step S202: judging whether the amplitude of the body action exceeds a predetermined amplitude; if so, executing step S203, and if not, returning to step S201. The body action comprises at least one of a head action, a face action and a hand action; the head action comprises raising or turning the head, the face action comprises a specific mouth action or eye action, and the hand action comprises a hand-raising response.
Step S203: judging whether a plurality of individuals have the body action at the same time; if not, executing step S204, and if so, returning to step S201.
Step S204: the physical action is recorded as a response action.
In addition, other modifications within the spirit of the invention will occur to those skilled in the art, and it is understood that such modifications are included within the scope of the invention as claimed.