
CN111147806A - Video content risk detection method, device and system - Google Patents


Info

Publication number
CN111147806A
CN111147806A (application CN201811311966.1A)
Authority
CN
China
Prior art keywords
preset
video data
analyzed
user
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811311966.1A
Other languages
Chinese (zh)
Inventor
李东声
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tendyron Corp
Original Assignee
Tendyron Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tendyron Corp filed Critical Tendyron Corp
Priority to CN201811311966.1A priority Critical patent/CN111147806A/en
Publication of CN111147806A publication Critical patent/CN111147806A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00Burglar, theft or intruder alarms
    • G08B13/18Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems
    • G08B13/194Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems
    • G08B13/196Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80Camera processing pipelines; Components thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Alarm Systems (AREA)

Abstract

The invention discloses a video content risk detection method, a device and a system, wherein the method comprises the following steps: the first camera sends the first video data to the monitoring device; the second camera sends the second video data to the monitoring device; the monitoring device determines a user to be analyzed; extracting a first background feature when a user to be analyzed is located at a first preset position, extracting a second background feature when the user to be analyzed is located at a second preset position, and calculating to obtain the matching degree between the background feature to be analyzed and a preset background collaborative model; extracting a first moment corresponding to the user to be analyzed when the first preset position appears from the first time information, extracting a second moment corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculating the time interval between the first moment and the second moment; and if the matching degree is lower than a preset background threshold value or the time interval is greater than a preset time threshold value, generating a first comparison result and determining that a preset risk exists.

Description

Video content risk detection method, device and system
Technical Field
The invention relates to the field of video monitoring, in particular to a video content risk detection method, device and system.
Background
An existing Automatic Teller Machine (ATM for short) is generally arranged in a self-service bank; after a bank card is inserted, bank counter services such as withdrawal, deposit and transfer can be performed on the ATM. Because of the public, convenient, and environmentally particular nature of self-service banks and automated teller machines, criminal activity targeting them has increased in recent years.
However, a conventional ATM video monitoring system mainly records video; after an incident occurs, the recording is used for after-the-fact evidence collection, which helps resolve disputes and solve cases. Such a mechanism, however, provides only post-event forensics and cannot offer real-time detection or early warning.
Disclosure of Invention
The present invention aims to solve one of the above problems.
The invention mainly aims to provide a video content risk detection method, device and system.
To achieve the above objects, the technical solutions of the invention are realized as follows:
one aspect of the present invention provides a video content risk detection method, including: the method comprises the steps that a first camera carries out video acquisition on a first preset position in an environment to be detected to obtain first video data, and the first video data are sent to a monitoring device, wherein the first video data at least comprise first time information, and the first time information cannot be changed; the second camera carries out video acquisition on a second preset position in the environment to be detected to obtain second video data, and sends the second video data to the monitoring device, wherein the second video data at least comprises second time information which cannot be changed, the first camera and the second camera are arranged at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing; the monitoring device receives the first video data and the second video data, identifies the face corresponding to the user to be analyzed in the first video data and the second video data, and determines the user to be analyzed; the monitoring device extracts a first background feature of a user to be analyzed when the user to be analyzed is located at a first preset position from the first video data, extracts a second background feature of the user to be analyzed when the user to be analyzed is located at a second preset position from the second video data, and calculates to obtain a matching degree between the background feature to be analyzed and a preset background collaborative model, wherein the background feature to be analyzed comprises the first background feature and the second background feature; the monitoring device extracts a first moment corresponding to the user to be analyzed when the first preset position appears from the first time information, extracts a second moment corresponding to the user to be 
analyzed when the second preset position appears from the second time information, and calculates the time interval between the first moment and the second moment; the monitoring device compares the matching degree with a preset background threshold and the time interval with a preset time threshold, and if the matching degree is lower than the preset background threshold or the time interval is greater than the preset time threshold, it generates a first comparison result and determines that a preset risk exists.
Wherein, the method further comprises: and when the matching degree is not lower than a preset background threshold value and the time interval is not greater than a preset time threshold value, the monitoring device generates a second comparison result and determines that no preset risk exists.
Wherein, the method further comprises: the monitoring device receives training video data acquired by the first camera and the second camera in advance; the monitoring device respectively extracts training elements from the training video data, and obtains a preset background collaborative model and a preset time threshold value according to training of the training elements.
Wherein, the method further comprises: and the monitoring device executes alarm operation after determining that the user to be analyzed has the preset risk.
In another aspect, the present invention provides a video content risk detection system, including: the monitoring device comprises a first camera, a second camera and a monitoring device, wherein the first camera is used for carrying out video acquisition on a first preset position in an environment to be detected to obtain first video data and sending the first video data to the monitoring device, and the first video data at least comprises first time information which cannot be changed; the second camera is used for carrying out video acquisition on a second preset position in the environment to be detected, obtaining second video data and sending the second video data to the monitoring device, wherein the second video data at least comprise second time information which cannot be changed, the first camera and the second camera are arranged at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing; the monitoring device is used for receiving the first video data and the second video data, identifying the face corresponding to the user to be analyzed in the first video data and the second video data, and determining the user to be analyzed; extracting a first background feature of a user to be analyzed when the user to be analyzed is located at a first preset position from the first video data, extracting a second background feature of the user to be analyzed when the user to be analyzed is located at a second preset position from the second video data, and calculating to obtain a matching degree between the background feature to be analyzed and a preset background collaborative model, wherein the background feature to be analyzed comprises the first background feature and the second background feature; extracting a first moment corresponding to the user to be analyzed when the first preset position appears from the first time information, extracting a 
second moment corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculating the time interval between the first moment and the second moment; and comparing the matching degree with a preset background threshold and the time interval with a preset time threshold, and if the matching degree is lower than the preset background threshold or the time interval is greater than the preset time threshold, generating a first comparison result and determining that a preset risk exists.
And the monitoring device is further used for generating a second comparison result when the matching degree is not lower than a preset background threshold and the time interval is not greater than a preset time threshold, and determining that no preset risk exists.
The monitoring device is also used for receiving training video data acquired by the first camera and the second camera in advance; and respectively extracting training elements from the training video data, and training according to the training elements to obtain a preset background collaborative model and a preset time threshold.
The monitoring device is also used for executing alarm operation after determining that the user to be analyzed has the preset risk.
In another aspect, the present invention provides a video content risk detection apparatus, including: the receiving module is used for receiving first video data obtained by a first camera performing video acquisition on a first preset position in an environment to be detected, wherein the first video data at least comprises first time information which cannot be changed, and receiving second video data obtained by a second camera performing video acquisition on a second preset position in the environment to be detected, wherein the second video data at least comprises second time information which cannot be changed, the first camera and the second camera are arranged at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing; the identification module is used for identifying the faces corresponding to the users to be analyzed in the first video data and the second video data and determining the users to be analyzed; the computing module is used for extracting a first background feature of a user to be analyzed when the user to be analyzed is located at a first preset position from the first video data, extracting a second background feature of the user to be analyzed when the user to be analyzed is located at a second preset position from the second video data, and computing to obtain a matching degree between the background feature to be analyzed and a preset background collaborative model, wherein the background feature to be analyzed comprises the first background feature and the second background feature; extracting a first moment corresponding to the user to be analyzed when the first preset position appears from the first time information, extracting a second moment corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculating the time interval between the first moment and the 
second moment; and the judging module is used for comparing the matching degree with a preset background threshold and the time interval with a preset time threshold, and for generating a first comparison result and determining that a preset risk exists if the matching degree is lower than the preset background threshold or the time interval is greater than the preset time threshold.
The judging module is further configured to generate a second comparison result when the matching degree is not lower than a preset background threshold and the time interval is not greater than a preset time threshold, and it is determined that no preset risk exists.
Therefore, with the video content risk detection method, device and system provided by the embodiments of the invention, at least two cameras are arranged at different positions: the time the user to be analyzed takes to pass from the first preset position captured by the first camera to the second preset position captured by the second camera is judged, and the background features as the user passes the two positions are analyzed, so that a preset risk (such as criminal intent) can be discovered in real time. This overcomes the inability of previous single-camera monitoring to guard in advance against counterfeiting, fraud and other criminal behaviors.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a flowchart of a video content risk detection method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a video content risk detection system according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a video content risk detection apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or quantity or location.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
Fig. 1 shows a flowchart of a video content risk detection method provided by an embodiment of the present invention, and referring to fig. 1, the video content risk detection method provided by the embodiment of the present invention includes:
s101, a first camera carries out video acquisition on a first preset position in an environment to be detected to obtain first video data, and the first video data are sent to a monitoring device, wherein the first video data at least comprise first time information, and the first time information cannot be changed; the second camera carries out video acquisition on a second preset position in the environment to be detected, second video data are obtained, and the second video data are sent to the monitoring device, wherein the second video data at least comprise second time information, the second time information cannot be changed, the first camera and the second camera are arranged at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing.
Specifically, the first camera and the second camera are cameras arranged at different positions in the environment to be detected. For example, when the environment to be detected is a self-service bank, the first camera may be a camera mounted on the ATM, and the second camera may be an environment camera arranged elsewhere in the self-service bank. Of course, in practical applications of the invention more than two cameras may be provided; the invention does not limit this.
The first camera performs video acquisition on the first preset position, and the second camera on the second preset position; since acquisition takes place from different positions, the two cameras capture different background features. The first preset position and the second preset position are positions a user inevitably passes when entering the environment to be detected to handle business, and in the embodiment of the invention both can be preset.
For example, the first camera acquires video at the entrance and exit of the self-service bank, and the second camera acquires video of the area in front of the ATM. By acquiring video at at least two preset positions, whether the user's behavior is abnormal can be analyzed from the time difference between the user's arrivals at the two positions.
As an optional implementation of the embodiment of the invention, when the first camera and the second camera perform video acquisition, current time information is added to the video data in a way that cannot be changed: for example, the current time information may be encrypted by an encryption method to obtain encrypted current time information, or signed by a signature method, ensuring that it cannot be altered.
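As an illustration of this tamper-evidence idea (the patent leaves the concrete encryption or signature scheme open), the following is a minimal Python sketch that tags each timestamp with an HMAC; the shared key and the timestamp format are hypothetical:

```python
import hashlib
import hmac

# Hypothetical key provisioned to the cameras and the monitoring device.
SECRET_KEY = b"camera-shared-secret"

def sign_timestamp(timestamp: str, key: bytes = SECRET_KEY) -> str:
    """Attach an HMAC tag so any later change to the timestamp is detectable."""
    return hmac.new(key, timestamp.encode(), hashlib.sha256).hexdigest()

def verify_timestamp(timestamp: str, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_timestamp(timestamp, key), tag)
```

A camera would attach the tag alongside each timestamp it embeds; the monitoring device rejects any frame whose tag fails verification.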
It should be noted that the first time information and the second time information are the time information contained in the video data; that is, each is a continuous stream of time instants, for example recorded continuously in units of seconds.
The first time information and the second time information are obtained from the same reference clock, both using, for example, Internet time; that is, at any given instant the first time information and the second time information read the same.
The first video data acquired by the first camera and the second video data acquired by the second camera are sent to the monitoring device in real time, or the acquired video data are sent to the monitoring device at regular time according to a preset period.
S102, the monitoring device receives the first video data and the second video data, identifies faces corresponding to users to be analyzed in the first video data and the second video data, and determines the users to be analyzed.
Specifically, the monitoring device may be arranged near the cameras or at the back end; for example, in a self-service banking environment it may be arranged in the ATM or in the bank's monitoring back office, which the invention does not limit. After the monitoring device receives the first video data and the second video data, it identifies a user from the first video data and a user from the second video data using face recognition technology, and when the two are determined to be the same user, that user is determined to be the user to be analyzed.
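As a sketch of this cross-stream matching step (the patent does not specify a face recognition algorithm), the following compares hypothetical face embedding vectors from the two streams by cosine similarity; the embeddings and the 0.8 threshold are illustrative assumptions, not the patent's method:

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two feature vectors, in [-1, 1]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_user(embedding_cam1, embedding_cam2, threshold: float = 0.8) -> bool:
    """Treat detections from the two video streams as the same person
    when their face embeddings are sufficiently similar."""
    return cosine_similarity(embedding_cam1, embedding_cam2) >= threshold
```

In practice the embeddings would come from a face recognition model applied to each stream; only when `same_user` holds does the person become the user to be analyzed.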
S103, the monitoring device extracts a first background feature of the user to be analyzed when the user to be analyzed is located at a first preset position from the first video data, extracts a second background feature of the user to be analyzed when the user to be analyzed is located at a second preset position from the second video data, and calculates to obtain a matching degree between the background feature to be analyzed and a preset background collaborative model, wherein the background feature to be analyzed comprises the first background feature and the second background feature; and the monitoring device extracts a first moment corresponding to the user to be analyzed when the first preset position appears from the first time information, extracts a second moment corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculates the time interval between the first moment and the second moment.
Specifically, a background feature may comprise any feature of a background marker in the environment, or any combination of such features, serving as an identifier of the background. For example, it may include position information of static objects, shape information of static objects, quantity information of static objects, and motion patterns of dynamic objects.
Specifically, a background collaborative model is preset in the monitoring device for analyzing the background features. As an optional implementation of the embodiment of the invention, the monitoring device receives training video data acquired in advance by the first camera and the second camera, extracts training elements from the training video data, and trains on the training elements to obtain the preset background collaborative model. By analyzing the background markers within each camera's shooting range to generate the background collaborative model, and setting a reasonable background threshold range according to how the first and second preset positions appear along the movement tracks of normal users, the intelligence and accuracy of the judgment are improved.
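One minimal, hypothetical way to realize such a trained model (a toy stand-in, not the patent's actual method) is to store the mean of the concatenated (first camera, second camera) background feature vectors seen in training, and score new features by their distance to that mean:

```python
import numpy as np

class BackgroundModel:
    """Toy 'background collaborative model': the mean of the training
    background feature vectors from both cameras, concatenated."""

    def __init__(self):
        self.mean = None

    def fit(self, training_features):
        """training_features: list of concatenated feature vectors
        extracted from the training video data."""
        self.mean = np.mean(np.asarray(training_features, dtype=float), axis=0)
        return self

    def matching_degree(self, features) -> float:
        """Similarity of new features to the trained background as a
        percentage: 100 means identical to the training mean."""
        f = np.asarray(features, dtype=float)
        dist = float(np.linalg.norm(f - self.mean))
        scale = float(np.linalg.norm(self.mean)) + 1e-9
        return max(0.0, 1.0 - dist / scale) * 100.0
```

Any model that maps background features to a single matching-degree value would slot into the method's comparison step the same way.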
The extracted background features are input into the preset background collaborative model, and the matching degree between them and the model is calculated; the matching degree is a numerical value, for example a percentage.
The monitoring device obtains the first time information from the first video data and determines from it the first moment at which the user to be analyzed is at the first preset position; likewise, it obtains the second time information from the second video data and determines the second moment at which the user to be analyzed is at the second preset position.
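The two-moment extraction and interval computation can be sketched as follows; the `(timestamp, user_id)` stream format and the timestamp layout are illustrative assumptions about how the time information might be represented:

```python
from datetime import datetime

def first_appearance(time_stream, user_id):
    """time_stream: time-ordered (timestamp, user_id) pairs taken from
    one video's time information; returns the first moment the given
    user is seen, or None if the user never appears."""
    for ts, uid in time_stream:
        if uid == user_id:
            return ts
    return None

def interval_seconds(t1: str, t2: str) -> float:
    """Absolute interval between the two moments, in seconds."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(t2, fmt) - datetime.strptime(t1, fmt)
    return abs(delta.total_seconds())
```

Because both streams share the same reference clock, subtracting the two moments directly yields the walk time between the preset positions.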
S104, the monitoring device compares the matching degree with a preset background threshold and the time interval with a preset time threshold; if the matching degree is lower than the preset background threshold or the time interval is greater than the preset time threshold, the monitoring device generates a first comparison result and determines that a preset risk exists.
Specifically, when the matching degree is lower than the preset background threshold, the background features are considered not to match the background collaborative model, and a preset risk may exist, for example: the video from which the background features were extracted is at risk, or the user to be analyzed poses a risk, such as tampering with the video, hijacking a camera, or interfering with a camera's normal acquisition.
When the time the user to be analyzed takes to walk from the first preset position to the second preset position exceeds the preset time threshold, the behavior of the user to be analyzed can be judged abnormal, not conforming to normal behavior, and the existence of the preset risk can therefore be determined.
As an optional implementation of the embodiment of the invention, the video content risk detection method further includes: the monitoring device receives training video data acquired in advance by the first camera and the second camera, extracts training elements from the training video data, and trains on the training elements to obtain the preset time threshold. Judging with a trained preset time threshold improves the intelligence and accuracy of the judgment.
As an optional implementation of the embodiment of the invention, the preset time threshold may be a single value or a value interval. When the preset time threshold is a value interval, a detection result can be generated and the existence of the preset risk determined if the time interval is greater than the maximum of the interval or smaller than its minimum. In this way, a walk time from the first preset position to the second preset position that is either too long or too short can be judged abnormal, and the existence of the preset risk determined.
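Treating the preset time threshold as a value interval, the overall decision of steps S103 and S104 can be sketched as follows; the 90.0 background threshold and the (5.0, 120.0) second interval are hypothetical values standing in for thresholds learned from training video:

```python
def detect_preset_risk(matching_degree: float,
                       interval_s: float,
                       bg_threshold: float = 90.0,
                       interval_range: tuple = (5.0, 120.0)) -> bool:
    """Preset risk exists when the background match is too low, or when
    the walk time between the two preset positions falls outside the
    normal range (the preset time threshold as a value interval)."""
    low, high = interval_range
    return matching_degree < bg_threshold or not (low <= interval_s <= high)
```

A `True` result corresponds to the first comparison result (risk exists); `False` corresponds to the second comparison result (no preset risk).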
It is worth mentioning that, in the video content risk detection method, the monitoring device receives training video data acquired in advance by the first camera and the second camera and extracts training elements from it; the preset background collaborative model and the preset time threshold trained from these elements may be obtained in the same training run or in separate runs, which the invention does not limit.
As an optional embodiment of the present invention, when the matching degree is not lower than the preset background threshold and the time interval is not greater than the preset time threshold, the monitoring apparatus generates a second comparison result to determine that no preset risk exists. Because the matching degree between the background features and the background collaborative model is sufficiently high, and the time interval does not exceed the preset time threshold and therefore conforms to normal behavior, it can be considered that no risk exists, for example: the video is not at risk and the user to be analyzed is not at risk.
As an optional embodiment of the present invention, the monitoring device performs an alarm operation after determining that the user to be analyzed has a preset risk. The alarm operation may be an alarm device in the environment to be detected raising an alarm, for example by sound and light; an alarm device in the monitoring room of background monitoring personnel raising an alarm, for example by displaying on a monitoring screen or sounding; or a short message sent to monitoring personnel or the police, and the like. Alarming when a risk occurs further improves the efficiency of risk handling for self-service banking and ATMs.
Therefore, according to the video content risk detection method provided by the embodiment of the invention, at least two cameras are arranged at different positions, so that the time the user to be analyzed takes to pass from the first preset position shot by the first camera to the second preset position shot by the second camera can be judged, and the background features of the user to be analyzed at the first preset position and the second preset position can be analyzed; a preset risk (such as illegal criminal intent) can thereby be found in real time, overcoming the shortcoming that single-camera surveillance in the past could not guard against deliberately forged, faked and other criminal behavior.
As an optional embodiment of the present invention, first video data collected by the first camera is encrypted by a security chip disposed in the first camera, second video data collected by the second camera is encrypted by a security chip disposed in the second camera, the first camera sends the encrypted first video data to the monitoring device, and the second camera sends the encrypted second video data to the monitoring device; after receiving the encrypted first video data and the encrypted second video data, the monitoring device decrypts them to obtain the first video data and the second video data. Encrypted transmission improves the security of video data transmission and prevents the video data from being cracked and tampered with.
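The patent does not specify the cipher used by the security chip. As a purely illustrative sketch of the encrypt-then-decrypt flow, the toy code below XORs the video bytes with a hash-derived keystream; this is a stand-in only, not a secure cipher, and a real security chip would use a vetted algorithm such as AES:

```python
import hashlib


def _keystream(key: bytes, n: int) -> bytes:
    # Derive n pseudo-random bytes from the key (illustration only).
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def encrypt_video_data(video_bytes: bytes, chip_key: bytes) -> bytes:
    # Camera side: the security chip encrypts before transmission.
    ks = _keystream(chip_key, len(video_bytes))
    return bytes(a ^ b for a, b in zip(video_bytes, ks))


def decrypt_video_data(cipher_bytes: bytes, chip_key: bytes) -> bytes:
    # Monitoring-device side: XOR is its own inverse, so decryption
    # reuses the same keystream.
    return encrypt_video_data(cipher_bytes, chip_key)
```

The key point the sketch shows is the flow: each camera encrypts its own stream, and the monitoring device decrypts both before any analysis.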
As another optional embodiment, first video data collected by the first camera is signed by a security chip disposed in the first camera to obtain first signature data, and second video data collected by the second camera is signed by a security chip disposed in the second camera to obtain second signature data; the first camera sends the first video data and the first signature data to the monitoring device, and the second camera sends the second video data and the second signature data to the monitoring device; after receiving the first video data, the first signature data, the second video data and the second signature data, the monitoring device verifies the first signature data and the second signature data, and performs subsequent analysis using the first video data and the second video data only after the verification passes. Signing the video data ensures the authenticity of the video data source and prevents the video data from being tampered with.
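The sign-then-verify flow above can be sketched with Python's standard `hmac` module standing in for the security chip; a real chip would likely use an asymmetric signature scheme, and the key and function names here are hypothetical:

```python
import hashlib
import hmac


def sign_video_data(video_bytes: bytes, chip_key: bytes) -> bytes:
    # Camera side: the security chip produces the signature data.
    return hmac.new(chip_key, video_bytes, hashlib.sha256).digest()


def verify_video_data(video_bytes: bytes, signature: bytes,
                      chip_key: bytes) -> bool:
    # Monitoring-device side: check the signature before any analysis.
    expected = hmac.new(chip_key, video_bytes, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)
```

If verification fails, the monitoring device would discard the video data instead of feeding it into the risk analysis.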
Fig. 2 is a schematic structural diagram of a video content risk detection system according to an embodiment of the present invention; the system applies the above method. Only the structure of the system is briefly described below; for matters not detailed here, refer to the related description of the video content risk detection method above. Referring to Fig. 2, the video content risk detection system according to the embodiment of the present invention includes:
the first camera 201 is configured to perform video acquisition on a first preset position in an environment to be detected, obtain first video data, and send the first video data to the monitoring device, where the first video data at least includes first time information, and the first time information is unchangeable;
the second camera 202 is configured to perform video acquisition on a second preset position in the environment to be detected, obtain second video data, and send the second video data to the monitoring device, where the second video data at least includes second time information, the second time information is not changeable, the first camera and the second camera are disposed at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing;
the monitoring device 203 is used for receiving the first video data and the second video data, identifying faces corresponding to users to be analyzed in the first video data and the second video data, and determining the user to be analyzed; extracting a first background feature of the user to be analyzed when located at the first preset position from the first video data, extracting a second background feature of the user to be analyzed when located at the second preset position from the second video data, and calculating the matching degree between the background feature to be analyzed and a preset background collaborative model, wherein the background feature to be analyzed comprises the first background feature and the second background feature; extracting from the first time information a first moment corresponding to the user to be analyzed appearing at the first preset position, extracting from the second time information a second moment corresponding to the user to be analyzed appearing at the second preset position, and calculating the time interval between the first moment and the second moment; and comparing the matching degree with a preset background threshold and the time interval with a preset time threshold: if the matching degree is lower than the preset background threshold, or the time interval is greater than the preset time threshold, generating a first comparison result and determining that a preset risk exists.
Therefore, according to the video content risk detection system provided by the embodiment of the invention, at least two cameras are arranged at different positions, so that the time the user to be analyzed takes to pass from the first preset position shot by the first camera to the second preset position shot by the second camera can be judged, and the background features of the user to be analyzed at the first preset position and the second preset position can be analyzed; a preset risk (such as illegal criminal intent) can thereby be found in real time, overcoming the shortcoming that single-camera surveillance in the past could not guard against deliberately forged, faked and other criminal behavior.
As an optional embodiment of the present invention, the monitoring device 203 is further configured to generate a second comparison result when the matching degree is not lower than the preset background threshold and the time interval is not greater than the preset time threshold, so as to determine that no preset risk exists. Because the matching degree between the background features and the background collaborative model is sufficiently high, and the time interval does not exceed the preset time threshold and therefore conforms to normal behavior, it can be considered that no risk exists, for example: the video is not at risk and the user to be analyzed is not at risk.
As an optional embodiment of the present invention, the monitoring device 203 is further configured to receive, in advance, training video data acquired by the first camera and the second camera; extract training elements from the training video data respectively; and train on the training elements to obtain the preset background collaborative model and the preset time threshold. The background markers within each camera's shooting range are analyzed to generate the background collaborative model, a reasonable background threshold range is set for judgment according to the first and second preset positions along the different movement tracks of normal users, and judging with the trained preset time threshold improves the intelligence and accuracy of the judgment.
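The patent leaves the training procedure open. One simple way the preset time threshold could be derived from training video — an assumption for illustration, not the claimed training method — is to take the walking times observed for normal users and keep a mean plus-or-minus k standard deviations band as the (min, max) threshold interval:

```python
from statistics import mean, stdev


def train_time_threshold(walk_times, k=3.0):
    """Derive a (min, max) preset time threshold from walking times
    observed in training video of normal users.

    The mean +/- k*stdev band is an illustrative choice; a deployed
    system might fit a richer model to the training elements.
    """
    m, s = mean(walk_times), stdev(walk_times)
    return (max(0.0, m - k * s), m + k * s)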
As an optional embodiment of the present invention, the monitoring device 203 is further configured to perform an alarm operation after determining that the user to be analyzed has a preset risk. Alarming when a risk occurs further improves the efficiency of risk handling for self-service banking and ATMs.
As an optional embodiment of the present invention, first video data collected by the first camera 201 is encrypted by a security chip disposed in the first camera, second video data collected by the second camera 202 is encrypted by a security chip disposed in the second camera, the first camera 201 sends the encrypted first video data to the monitoring device 203, and the second camera 202 sends the encrypted second video data to the monitoring device 203; after receiving the encrypted first video data and the encrypted second video data, the monitoring device 203 decrypts them to obtain the first video data and the second video data. Encrypted transmission improves the security of video data transmission and prevents the video data from being cracked and tampered with.
First video data collected by the first camera 201 is signed by a security chip disposed in the first camera to obtain first signature data, and second video data collected by the second camera 202 is signed by a security chip disposed in the second camera to obtain second signature data; the first camera 201 sends the first video data and the first signature data to the monitoring device 203, and the second camera 202 sends the second video data and the second signature data to the monitoring device 203; after receiving them, the monitoring device 203 verifies the first signature data and the second signature data, and performs subsequent analysis using the first video data and the second video data after the verification passes. Signing the video data ensures the authenticity of the video data source and prevents the video data from being tampered with.
On the basis of Fig. 2, Fig. 3 shows a schematic structural diagram of a video content risk detection apparatus provided in an embodiment of the present invention; the apparatus is the monitoring apparatus in the system shown in Fig. 2 and applies the above system and method. Only the structure of the apparatus is briefly described below; for matters not detailed here, refer to the related description of the video content risk detection system and method above. Referring to Fig. 3, the video content risk detection apparatus provided in an embodiment of the present invention includes:
a receiving module 2031, configured to receive first video data obtained by a first camera performing video acquisition on a first preset position in an environment to be detected, where the first video data at least includes first time information that is not changeable, and receive second video data obtained by a second camera performing video acquisition on a second preset position in the environment to be detected, where the second video data at least includes second time information that is not changeable, the first camera and the second camera are disposed at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing;
the identifying module 2032 is configured to identify faces corresponding to users to be analyzed in the first video data and the second video data, and determine the users to be analyzed;
a calculating module 2033, configured to extract, from the first video data, a first background feature when the user to be analyzed is located at a first preset position, extract, from the second video data, a second background feature when the user to be analyzed is located at a second preset position, and calculate to obtain a matching degree between the background feature to be analyzed and the preset background collaborative model, where the background feature to be analyzed includes the first background feature and the second background feature; extracting a first moment corresponding to the user to be analyzed when the first preset position appears from the first time information, extracting a second moment corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculating the time interval between the first moment and the second moment;
the determining module 2034 is configured to compare the matching degree with a preset background threshold and the time interval with a preset time threshold: if the matching degree is lower than the preset background threshold, or the time interval is greater than the preset time threshold, generate a first comparison result and determine that a preset risk exists.
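Putting the two comparisons together, the determining module's decision rule can be sketched as follows; the function and result strings are illustrative names, not taken from the patent:

```python
def detect_preset_risk(matching_degree, time_interval,
                       background_threshold, time_threshold):
    """Judging-module sketch: a preset risk is flagged when the
    background matching degree is too low OR the walking time between
    the two preset positions exceeds the preset time threshold."""
    if matching_degree < background_threshold or time_interval > time_threshold:
        # First comparison result: either condition alone suffices.
        return "first comparison result: preset risk exists"
    # Second comparison result: both checks pass, behavior is normal.
    return "second comparison result: no preset risk"
```

Note the asymmetry: risk is an OR of the two abnormal conditions, while the no-risk outcome requires both the background match and the time interval to be within bounds.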
Therefore, according to the video content risk detection device provided by the embodiment of the invention, at least two cameras are arranged at different positions, so that the time the user to be analyzed takes to pass from the first preset position shot by the first camera to the second preset position shot by the second camera can be judged, and the background features of the user to be analyzed at the first preset position and the second preset position can be analyzed; a preset risk (such as illegal criminal intent) can thereby be found in real time, overcoming the shortcoming that single-camera surveillance in the past could not guard against deliberately forged, faked and other criminal behavior.
As an optional implementation manner of the embodiment of the present invention, the determining module 2034 is further configured to generate a second comparison result when the matching degree is not lower than the preset background threshold and the time interval is not greater than the preset time threshold, so as to determine that no preset risk exists. Because the matching degree between the background features and the background collaborative model is sufficiently high, and the time interval does not exceed the preset time threshold and therefore conforms to normal behavior, it can be considered that no risk exists, for example: the video is not at risk and the user to be analyzed is not at risk.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code that include one or more executable instructions for implementing specific logical functions or steps of the process. Alternate implementations are included within the scope of the preferred embodiments of the present invention, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art to which the present invention pertains.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware that is related to instructions of a program, and the program may be stored in a computer-readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. A video content risk detection method is characterized by comprising the following steps:
the method comprises the steps that a first camera carries out video acquisition on a first preset position in an environment to be detected to obtain first video data, and the first video data are sent to a monitoring device, wherein the first video data at least comprise first time information, and the first time information cannot be changed;
the method comprises the steps that a second camera carries out video acquisition on a second preset position in an environment to be detected to obtain second video data, and the second video data are sent to a monitoring device, wherein the second video data at least comprise second time information which cannot be changed, the first camera and the second camera are arranged at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing;
the monitoring device receives the first video data and the second video data, identifies a face corresponding to a user to be analyzed in the first video data and the second video data, and determines the user to be analyzed;
the monitoring device extracts a first background feature of the user to be analyzed when the user to be analyzed is located at the first preset position from the first video data, extracts a second background feature of the user to be analyzed when the user to be analyzed is located at the second preset position from the second video data, and calculates to obtain a matching degree between the background feature to be analyzed and a preset background collaborative model, wherein the background feature to be analyzed comprises the first background feature and the second background feature;
and
the monitoring device extracts a first moment corresponding to the user to be analyzed when the first preset position appears from the first time information, extracts a second moment corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculates a time interval between the first moment and the second moment;
and the monitoring device compares the matching degree with a preset background threshold and the time interval with a preset time threshold; if the matching degree is lower than the preset background threshold, or the time interval is greater than the preset time threshold, a first comparison result is generated to determine that a preset risk exists.
2. The method of claim 1, further comprising:
and when the matching degree is not lower than the preset background threshold value and the time interval is not greater than the preset time threshold value, the monitoring device generates a second comparison result and determines that no preset risk exists.
3. The method of claim 1 or 2, further comprising:
the monitoring device receives training video data acquired by the first camera and the second camera in advance;
the monitoring device respectively extracts training elements from the training video data, and trains according to the training elements to obtain the preset background collaborative model and the preset time threshold.
4. The method of claim 1 or 2, further comprising:
and the monitoring device executes alarm operation after determining that the user to be analyzed has the preset risk.
5. A video content risk detection system, comprising:
the monitoring device comprises a first camera, a second camera and a monitoring device, wherein the first camera is used for carrying out video acquisition on a first preset position in an environment to be detected to obtain first video data and sending the first video data to the monitoring device, and the first video data at least comprises first time information which cannot be changed;
the second camera is used for carrying out video acquisition on a second preset position in the environment to be detected, obtaining second video data and sending the second video data to the monitoring device, wherein the second video data at least comprise second time information which cannot be changed, the first camera and the second camera are arranged at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing;
the monitoring device is used for receiving the first video data and the second video data, identifying faces corresponding to users to be analyzed in the first video data and the second video data, and determining the users to be analyzed; extracting a first background feature of the user to be analyzed when the user to be analyzed is located at the first preset position from the first video data, extracting a second background feature of the user to be analyzed when the user to be analyzed is located at the second preset position from the second video data, and calculating to obtain a matching degree between the background feature to be analyzed and a preset background collaborative model, wherein the background feature to be analyzed comprises the first background feature and the second background feature; extracting a first time corresponding to the user to be analyzed when the first preset position appears from the first time information, extracting a second time corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculating a time interval between the first time and the second time; and comparing the matching degree with a preset background threshold and the time interval with a preset time threshold, and if the matching degree is lower than the preset background threshold, or the time interval is greater than the preset time threshold, generating a first comparison result and determining that a preset risk exists.
6. The system according to claim 5, wherein the monitoring device is further configured to generate a second comparison result when the matching degree is not lower than the preset background threshold and the time interval is not greater than the preset time threshold, so as to determine that there is no preset risk.
7. The system according to claim 5 or 6, wherein the monitoring device is further configured to receive training video data acquired by the first camera and the second camera in advance; and respectively extracting training elements from the training video data, and training according to the training elements to obtain the preset background collaborative model and the preset time threshold.
8. The system according to claim 5 or 6, wherein the monitoring device is further configured to perform an alarm operation after determining that the user to be analyzed has a preset risk.
9. A video content risk detection apparatus, comprising:
the receiving module is used for receiving first video data obtained by a first camera performing video acquisition on a first preset position in an environment to be detected, wherein the first video data at least comprises first time information which cannot be changed, and receiving second video data obtained by a second camera performing video acquisition on a second preset position in the environment to be detected, wherein the second video data at least comprises second time information which cannot be changed, the first camera and the second camera are arranged at different positions in the environment to be detected, and the first time information and the second time information are obtained based on the same reference timing;
the identification module is used for identifying the faces corresponding to the users to be analyzed in the first video data and the second video data and determining the users to be analyzed;
a calculating module, configured to extract, from the first video data, a first background feature when the user to be analyzed is located at the first preset position, extract, from the second video data, a second background feature when the user to be analyzed is located at the second preset position, and calculate a matching degree between the background feature to be analyzed and a preset background collaborative model, where the background feature to be analyzed includes the first background feature and the second background feature; extracting a first time corresponding to the user to be analyzed when the first preset position appears from the first time information, extracting a second time corresponding to the user to be analyzed when the second preset position appears from the second time information, and calculating a time interval between the first time and the second time;
and the judging module is used for comparing the matching degree with a preset background threshold and the time interval with a preset time threshold, and if the matching degree is lower than the preset background threshold, or the time interval is greater than the preset time threshold, generating a first comparison result and determining that a preset risk exists.
10. The apparatus according to claim 9, wherein the determining module is further configured to generate a second comparison result when the matching degree is not lower than the preset background threshold and the time interval is not greater than the preset time threshold, so as to determine that there is no preset risk.
CN201811311966.1A 2018-11-06 2018-11-06 Video content risk detection method, device and system Pending CN111147806A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811311966.1A CN111147806A (en) 2018-11-06 2018-11-06 Video content risk detection method, device and system


Publications (1)

Publication Number Publication Date
CN111147806A true CN111147806A (en) 2020-05-12

Family

ID=70515861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811311966.1A Pending CN111147806A (en) 2018-11-06 2018-11-06 Video content risk detection method, device and system

Country Status (1)

Country Link
CN (1) CN111147806A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1529506A (en) * 2003-09-29 2004-09-15 Shanghai Jiao Tong University Video Object Segmentation Method Based on Motion Detection
CN101246547A (en) * 2008-03-03 2008-08-20 北京航空航天大学 A Method for Detecting Moving Objects in Video Based on Scene Change Features
CN101426080A (en) * 2007-10-29 2009-05-06 三星电子株式会社 Device and method for detecting and suppressing influence generated by camera moving in monitoring system
CN102096977A (en) * 2010-11-26 2011-06-15 上海电力带电作业技术开发有限公司 Method for video monitoring and prewarning of intrusive external force
CN103841367A (en) * 2012-11-21 2014-06-04 深圳市赛格导航科技股份有限公司 Monitoring system
CN105761261A (en) * 2016-02-17 2016-07-13 南京工程学院 Method for detecting artificial malicious damage to camera
CN106652376A (en) * 2016-12-02 2017-05-10 重庆软汇科技股份有限公司 Positioning monitor method and system
CN108154518A (en) * 2017-12-11 2018-06-12 广州华多网络科技有限公司 A kind of method, apparatus of image procossing, storage medium and electronic equipment


Similar Documents

Publication Publication Date Title
US10477156B2 (en) Video analytics system
CN101266704B (en) ATM secure authentication and pre-alarming method based on face recognition
TWI776796B (en) Financial terminal security system and financial terminal security method
CN102004904B (en) Automatic teller machine-based safe monitoring device and method and automatic teller machine
CN103714631A (en) ATM intelligent monitoring system based on human face recognition
CN101609581A (en) The anomalous video warning device of ATM
CN101201955A (en) Intelligent Monitoring System and Method for ATM
CN109961587A (en) Self-service bank monitoring system
US7552865B2 (en) System and method for deep interaction modeling for fraud detection
US20230252803A1 (en) Hand action tracking for card slot tampering
CN101692281A (en) Safety monitoring method, safety monitoring device and automatic teller machine (ATM) system
CN111144181A (en) Risk detection method, device and system based on background collaboration
TWI671701B (en) System and method for detecting trading behavior
JP6935919B2 (en) Automatic teller machine, terminal device, control method, and control program
CN113537034A (en) Cash receiving loss prevention method and system
CN111145456A (en) Risk detection method, device and system based on background collaboration
CN111147806A (en) Video content risk detection method, device and system
US7451919B2 (en) Self-service terminal
CN117893972A (en) Abnormal behavior detection method based on video recording
CN111144180B (en) Risk detection method and system for monitoring video
CN111144182B (en) Method and system for detecting face risk in video
CN111147807A (en) Risk detection method, device and system based on information synchronization
CN111144183B (en) Risk detection method, device and system based on face concave-convex degree
CN109961588A (en) Monitoring system
CN111145455A (en) Method and system for detecting face risk in surveillance video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2020-05-12)