WO2018157746A1 - Recommendation method and apparatus for video data
- Publication number
- WO2018157746A1 (PCT/CN2018/076784; priority application CN2018076784W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video data
- feature information
- target
- quality
- quality feature
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/466—Learning process for intelligent management, e.g. learning user preferences for recommending movies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/466—Learning process for intelligent management, e.g. learning user preferences for recommending movies
- H04N21/4668—Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
Definitions
- The present invention relates to the field of data processing technologies, and in particular to a method for recommending video data, a device for recommending video data, a method for generating a video data detection model, a device for generating a video data detection model, a method for identifying video data, and a device for identifying video data.
- E-commerce websites have begun to use video content for shopping guidance and marketing: corresponding text information is input according to operational needs, appropriate video frames are selected from the video library, and, according to the text semantics, a video of a suitable scene is constructed from those frames and recommended to the target user.
- Embodiments of the present application are provided to overcome the above problems, or at least partially solve them, by providing a video data recommendation method, a video data recommendation device, a method for generating a video data detection model, a device for generating a video data detection model, a method for identifying video data, and a corresponding device for identifying video data.
- the present application discloses a method for recommending video data, including:
- the target video data is recommended to the user.
- the preset video data detection model is generated by:
- Training is performed by using quality feature information of the plurality of forward sample video data and negative sample video data to generate a video data detection model.
- the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or image rotation operator feature information.
- the step of separately extracting quality feature information of the plurality of sample video data includes:
- the pixel information is separately subjected to convolution operation and pooling processing to obtain image pixel feature information.
- the step of separately extracting quality feature information of the plurality of sample video data includes:
- the number and frequency of occurrences of the objects in the adjacent two frames of images are respectively determined to obtain continuous frame image object migration feature information.
- the step of separately extracting quality feature information of the plurality of sample video data includes:
- the geometric parameters of the shape features of the motion objects in the adjacent two frames of images are respectively determined to obtain continuous frame image motion feature information.
- the step of separately extracting quality feature information of the plurality of sample video data includes:
- the amplitude difference and the phase difference of the adjacent two frames of images are respectively determined to obtain different frequency domain feature information of the image frame.
- the step of separately extracting quality feature information of the plurality of sample video data includes:
- the change values of the wavelet coefficients of the adjacent two frames of images are respectively determined to obtain image frame wavelet transform feature information.
- the step of separately extracting quality feature information of the plurality of sample video data includes:
- the change values of the rotation operators of the adjacent two frames of images are respectively determined to obtain image rotation operator feature information.
- the step of training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate the video data detection model includes:
- the target quality feature information is used to train the neural network model to generate a video data detection model.
- the step of identifying target quality feature information from the normalized quality feature information includes:
- quality feature information whose information entropy exceeds the first preset threshold is identified as the target quality feature information.
- it also includes:
- the plurality of users are clustered into a plurality of user groups according to the attribute information, and the user groups have corresponding user labels.
- the step of identifying the quality feature information by using a preset video data detection model to obtain target video data includes:
- the video data whose quality score exceeds the second preset threshold is extracted as target video data.
- the step of recommending the target video data to a user includes:
- the target video data is recommended to the target user group.
- the target video data has a corresponding video tag
- the step of determining a target user group among the multiple user groups includes:
- the present application discloses a method for generating a video data detection model, including:
- Training is performed by using quality feature information of the plurality of forward sample video data and negative sample video data to generate a video data detection model.
- the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or image rotation operator feature information.
- the present application discloses a method for identifying video data, including:
- a recommendation device for video data including:
- An obtaining module configured to acquire one or more video data to be detected
- An extraction module configured to separately extract quality feature information of each video data to be detected
- An identification module configured to identify the quality feature information by using a preset video data detection model to obtain target video data
- a recommendation module for recommending the target video data to a user.
- the preset video data detection model is generated by calling the following module:
- a quality feature information extraction module configured to separately extract quality feature information of the plurality of sample video data, where the plurality of sample video data includes a plurality of forward sample video data and negative sample video data;
- the video data detection model generating module is configured to perform training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate a video data detection model.
- the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or image rotation operator feature information.
- the quality feature information extraction module includes:
- a pixel information extraction submodule configured to extract pixel information of each frame image of each sample video data
- the pixel information processing sub-module is configured to perform convolution operation and pooling processing on the pixel information to obtain image pixel feature information.
- the quality feature information extraction module further includes:
- an object recognition sub-module for identifying the objects in each frame image of each sample video data;
- the object processing sub-module is configured to respectively determine the number and frequency of occurrences of the objects in the adjacent two frames of images to obtain continuous frame image object migration feature information.
- the quality feature information extraction module further includes:
- a motion object recognition submodule configured to identify a shape feature of the motion object in each frame image of each sample video data
- the action object processing sub-module is configured to respectively determine geometric parameters of the shape features of the action objects in the adjacent two frames of images to obtain continuous frame image action feature information.
- the quality feature information extraction module further includes:
- An amplitude and phase determination sub-module for determining a magnitude and a phase of each frame image of each sample video data
- the amplitude and phase processing sub-module is configured to respectively determine amplitude difference and phase difference of adjacent two frames of images to obtain different frequency domain feature information of the image frame.
- the quality feature information extraction module further includes:
- a wavelet coefficient determining submodule for determining a wavelet coefficient of each frame image of each sample video data
- the wavelet coefficient processing sub-module is configured to respectively determine the variation values of the wavelet coefficients of the adjacent two frames of images to obtain image frame wavelet transform feature information.
- the quality feature information extraction module further includes:
- a rotation operator determining sub-module for determining a rotation operator of each frame image of each sample video data
- the rotation operator processing sub-module is configured to respectively determine a variation value of a rotation operator of the adjacent two frames of images to obtain image rotation operator feature information.
- the video data detection model generating module includes:
- a normalization processing sub-module configured to normalize the quality feature information of the plurality of forward sample video data and negative sample video data to obtain normalized quality feature information;
- a target quality feature information identifying submodule configured to identify target quality feature information from the normalized quality feature information
- the video data detection model generation submodule is configured to perform neural network model training by using the target quality feature information, and generate a video data detection model.
- the target quality feature information identifying submodule includes:
- An information entropy determining unit configured to determine an information entropy of the normalized quality feature information
- the target quality feature information identifying unit is configured to identify the quality feature information that the information entropy exceeds the first preset threshold as the target quality feature information.
- generating the preset video data detection model further invokes the following modules:
- An attribute information obtaining module configured to acquire attribute information of multiple users
- the user group clustering module is configured to cluster the plurality of users into a plurality of user groups according to the attribute information, where the user group has a corresponding user label.
- the identifying module includes:
- a quality feature information identifying sub-module configured to identify, by using a preset video data detection model, the quality feature information of the one or more video data to be detected, respectively, to obtain a quality score of each video data to be detected;
- the target video data extraction sub-module is configured to extract video data whose quality score exceeds a second preset threshold as target video data.
- the recommendation module includes:
- a target user group determining submodule configured to determine a target user group among the plurality of user groups
- the target video data recommendation submodule is configured to recommend the target video data to the target user group.
- the target video data has a corresponding video tag
- the target user group determining submodule includes:
- the target user group determining unit is configured to determine, as the target user group, a user group whose user tag is the same as the video tag of the target video data.
- a device for generating a video data detection model including:
- a quality feature information extraction module configured to separately extract quality feature information of the plurality of sample video data, where the plurality of sample video data includes a plurality of forward sample video data and negative sample video data;
- the video data detection model generating module is configured to perform training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate a video data detection model.
- the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or image rotation operator feature information.
- an apparatus for identifying video data including:
- An obtaining module configured to acquire one or more video data to be detected
- a sending module configured to send the one or more video data to be detected to a server, where the server is configured to separately identify the one or more video data to be detected to obtain a recognition result, where the recognition result includes one or more candidate video data;
- a receiving module configured to receive the one or more candidate video data returned by the server
- a determining module configured to determine target video data in the one or more candidate video data
- a presentation module for presenting the target video data.
- the embodiments of the present application include the following advantages:
- One or more video data to be detected are acquired, quality feature information of each video data to be detected is extracted, and the quality feature information is then identified using a preset video data detection model to obtain target video data, which is recommended to the user; in this way, high-quality video data can be quickly selected using a deep learning model.
- The embodiments of the present application solve the problem that the prior art can only rely on manual identification of video segments to recommend to users, improving both the recognition efficiency of video data and the accuracy of recommendation.
- FIG. 1 is a flow chart showing the steps of Embodiment 1 of a method for recommending video data according to the present application;
- FIG. 2 is a flow chart of steps of a second embodiment of a method for recommending video data according to the present application
- FIG. 3 is a schematic block diagram of a method for recommending video data according to the present application.
- FIG. 4 is a flow chart showing the steps of an embodiment of a method for generating a video data detection model according to the present application
- FIG. 5 is a flow chart showing the steps of an embodiment of a method for identifying video data according to the present application
- FIG. 6 is a structural block diagram of an embodiment of a device for recommending video data according to the present application.
- FIG. 7 is a structural block diagram of an embodiment of a device for generating a video data detection model according to the present application.
- FIG. 8 is a structural block diagram of an embodiment of an apparatus for identifying video data according to the present application.
- Referring to FIG. 1, a flow chart of a first embodiment of a method for recommending video data according to the present application is shown. Specifically, the method may include the following steps:
- Step 101 Acquire one or more video data to be detected
- the video data to be detected may be a ready-made video segment obtained from various sources, or may be a video segment synthesized in real time by extracting multiple video frames from a video library according to a certain rule.
- The embodiment of the present application does not limit the specific source and type of the video data.
- Step 102 Extract quality characteristic information of each video data to be detected, respectively.
- the quality feature information of the video data may be feature information for identifying the quality of the video data, for example, image pixels of the video data, content displayed by the image, and the like. By identifying the quality characteristic information of the video data, it is possible to check the fluency, consistency, and the like of the video clip.
- The type of quality feature information to be extracted and the manner of extraction may be determined by a person skilled in the art according to actual needs, and are not limited in the embodiment of the present application.
- Step 103 Identify the quality feature information by using a preset video data detection model to obtain target video data.
- the preset video data detection model may be generated by training a plurality of sample video data in the training sample set, so that each quality feature information of the video data to be detected may be identified.
- the plurality of sample video data in the training sample set may include a plurality of forward sample video data and a plurality of negative sample video data
- the forward sample video data may be a video segment with better video quality, for example, a video clip with good fluency and coherence and a relatively uniform overall style between video frames.
- the forward sample video data can be obtained by manual marking or web crawling; contrary to the forward sample video data,
- the negative sample video data is a video segment with poor fluency, coherence, and overall style consistency between video frames.
- such negative sample video data can be obtained by randomly synthesizing multiple video frames.
- the source and the identification manner of the forward sample video data and the negative sample video data are not limited in the embodiment of the present application.
- the quality feature information of the forward sample video data and the negative sample video data may be separately extracted and used for model training to generate a video data detection model; then, after the quality feature information of the video data to be detected is extracted, the video data detection model is used to identify that quality feature information to obtain target video data.
- the target video data may be a video clip of good quality obtained after being identified by the video data detection model.
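The detect-and-score flow described in steps 101 to 103 amounts to training a binary classifier on positive (forward) and negative samples and then scoring unseen clips. As a rough illustration only — the embodiment's actual model is a neural network, and the four-dimensional feature vectors and cluster locations below are synthetic stand-ins — a minimal logistic-regression "detection model" might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic quality-feature vectors: forward samples cluster high, negative low.
pos = rng.normal(loc=1.0, scale=0.3, size=(50, 4))   # "good" clips
neg = rng.normal(loc=-1.0, scale=0.3, size=(50, 4))  # "bad" spliced clips
X = np.vstack([pos, neg])
y = np.array([1] * 50 + [0] * 50)

# Logistic regression by gradient descent as a stand-in detection model.
w, b = np.zeros(4), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

def quality_score(clip_features):
    """Score a clip's feature vector in [0, 1]; higher means better quality."""
    return 1.0 / (1.0 + np.exp(-(clip_features @ w + b)))

good = quality_score(np.array([1.1, 0.9, 1.0, 1.2]))
bad = quality_score(np.array([-1.0, -0.8, -1.1, -0.9]))
print(good > 0.5, bad < 0.5)
```

Clips whose score exceeds a preset threshold would then be kept as target video data, mirroring the second-preset-threshold extraction the claims describe.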
- Step 104 recommending the target video data to a user.
- when the target video data is recommended to the user, the target video segment may be played in the user interface, or the target video segment may be pushed to the user.
- the specific manner of recommending the target video data is not limited in this embodiment of the present application.
- In the embodiment of the present application, one or more video data to be detected are acquired, quality feature information of each video data to be detected is extracted, and the quality feature information is then identified using a preset video data detection model to obtain target video data, which is recommended to the user.
- The deep learning model in the embodiment of the present application can quickly screen out high-quality video data, solving the problem that the prior art can only rely on manual identification of video segments to recommend to users, and improving both the recognition efficiency of video data and the accuracy of recommendation.
- the method may include the following steps:
- Step 201 Extract quality feature information of a plurality of sample video data, where the plurality of sample video data includes a plurality of forward sample video data and negative direction sample video data;
- Referring to FIG. 3, a functional block diagram of a method for recommending video data of the present application is shown.
- The embodiment of the present application performs feature extraction on the training sample set, performs deep learning modeling, and then uses the trained model to evaluate the video data to be detected and output corresponding quality scores; in the modeling process, users are simultaneously clustered into user groups according to their attribute information, so that videos can be recommended to user communities.
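The user-group clustering mentioned above could, for instance, be a simple k-means over user attribute vectors; the patent does not name a specific clustering algorithm, and the attributes below (age, daily watch minutes) are invented for illustration:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Toy k-means: assign users to nearest centers, then recompute centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each user to the nearest cluster center.
        labels = np.argmin(((points[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

# Hypothetical user attribute vectors: (age, daily watch minutes).
users = np.array([[18, 120], [22, 110], [20, 130],   # young heavy viewers
                  [55, 20],  [60, 15],  [58, 25]], float)
labels, _ = kmeans(users, k=2)
print(labels)
```

Each resulting group would then carry a user label, against which target videos' tags can later be matched for recommendation.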
- The forward sample video data may be a video segment with better video quality, for example, one with good fluency and coherence and a uniform overall style between video frames. Such forward sample video data can usually be obtained by manual marking: an operator checks the fluency and coherence of a video segment and the overall style between its video frames, and marks segments that score well on these criteria as forward sample video data. It can also be obtained through web crawling, that is, by capturing high-quality videos with high click-through rates and many likes from video websites as forward sample video data.
- Contrary to the forward sample video data, the negative sample video data is a video segment with poor fluency, coherence, and overall style consistency between video frames. Such negative sample video data can be obtained by randomly synthesizing multiple video frames. For example, scattered video frame segments can be randomly extracted from multiple categories (such as travel, religion, and electronic products) and then randomly combined and spliced; the resulting segments contain many incoherent and semantically inconsistent transitions, so such spliced video segments can be used as negative sample video data.
- the obtained forward sample video data and negative sample video data can then be used as a training sample set for subsequent model training.
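The random-splicing construction of negative samples can be sketched as follows; the category names and frame labels are placeholders standing in for real video frames:

```python
import random

def synthesize_negative_sample(category_clips, n_segments=4, seed=7):
    """Randomly splice short segments from clips of unrelated categories
    (e.g. travel, religion, electronics) into one incoherent 'video'."""
    rng = random.Random(seed)
    spliced = []
    for _ in range(n_segments):
        category = rng.choice(list(category_clips))
        clip = rng.choice(category_clips[category])
        start = rng.randrange(len(clip))
        spliced.extend(clip[start:start + 2])   # grab a short segment
    return spliced

clips = {
    "travel":      [["beach1", "beach2", "beach3"]],
    "religion":    [["temple1", "temple2"]],
    "electronics": [["phone1", "phone2", "phone3"]],
}
negative = synthesize_negative_sample(clips)
print(negative)  # frames jump across unrelated categories
```

The incoherent jumps between categories are exactly what the quality features below are designed to detect.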
- the quality feature information of the plurality of sample video data in the training sample set may be separately extracted first.
- the quality feature information may include image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or image rotation operator feature information.
- the following describes a method for extracting the above six kinds of feature information one by one.
- For the image pixel feature information, pixel information of each frame image of each sample video data may be extracted, and then subjected to a convolution operation and pooling to obtain the image pixel feature information.
- Specifically, each frame of a video segment is captured as an image, so the pixel information in each frame image can be extracted separately as a feature set to be processed. The pixel information in the feature set is then convolved, and the feature set obtained after the convolution operation is further max-pooled, thereby obtaining the image pixel feature information.
- In this way, the most significant description of the pixel information can be obtained: the resulting features not only have reduced dimensionality but can still express the original semantics of the image.
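The convolution-plus-max-pooling step above can be sketched in a few lines of numpy; the 6x6 frame and the edge kernel are toy stand-ins, not the model's learned filters:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (cross-correlation) over a grayscale frame."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max-pooling, halving each spatial dimension."""
    h, w = fmap.shape[0] // size * size, fmap.shape[1] // size * size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

frame = np.arange(36, dtype=float).reshape(6, 6)   # stand-in 6x6 frame
edge_kernel = np.array([[1.0, 0.0, -1.0]] * 3)     # simple vertical-edge filter
features = max_pool(conv2d(frame, edge_kernel))
print(features.shape)  # (2, 2): reduced-dimension pixel features
```

Note how pooling shrinks the 4x4 convolution output to 2x2 while keeping the strongest responses, which is the dimensionality reduction the text refers to.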
- For the continuous frame image object migration feature information, the objects in each frame image of each sample video data may be identified, and then the number and frequency of occurrences of the objects in adjacent frames determined to obtain the continuous frame image object migration feature information.
- Specifically, each frame image may be analyzed separately, the objects in each frame image identified and extracted, and the frames sorted in chronological order, so that the changes in the objects between adjacent frames can be determined.
- Alternatively, only some of the adjacent image frame pairs may be selected for comparison according to actual needs; the number of adjacent image frames selected is not limited in this embodiment of the present application.
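The object-migration statistic — how many objects persist, appear, or vanish between adjacent frames — might be computed along these lines, assuming an upstream object detector has already labeled each frame (the labels here are illustrative):

```python
from collections import Counter

def object_migration_features(frames_objects):
    """For each adjacent frame pair, count objects that persist, appear,
    or vanish; `frames_objects` is a list of detected labels per frame."""
    features = []
    for prev, curr in zip(frames_objects, frames_objects[1:]):
        p, c = Counter(prev), Counter(curr)
        persisted = sum((p & c).values())   # objects present in both frames
        appeared = sum((c - p).values())    # newly appearing objects
        vanished = sum((p - c).values())    # objects that disappeared
        features.append((persisted, appeared, vanished))
    return features

frames = [["person", "car"], ["person", "car", "dog"], ["dog"]]
print(object_migration_features(frames))
```

Smooth footage keeps the persisted count high, while randomly spliced segments spike the appear/vanish counts — the signal the detection model can learn from.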
- For the continuous frame image motion feature information, the embodiment of the present application can identify the shape features of the moving objects in each frame image of each sample video data, and then determine the geometric parameters of those shape features in adjacent frames to obtain the continuous frame image motion feature information.
- Specifically, the moving object in each frame image can be identified and its geometric boundary determined; the geometric boundary in each frame is then compared with that in the previous frame, the geometric parameters of the object's shape features are calculated according to a geometric affine transformation, and these geometric parameters are used as the continuous frame image motion feature information.
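Recovering the affine geometric parameters from matched boundary points of a moving object can be sketched with a least-squares fit; the rectangular boundary coordinates below are hypothetical:

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2-D affine transform mapping boundary points src -> dst.
    Returns the 2x3 parameter matrix [A | t]."""
    n = len(src)
    # Linear system for x' = a*x + b*y + tx and y' = c*x + d*y + ty.
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2] = src
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = src
    A[1::2, 5] = 1.0
    rhs = dst.reshape(-1)
    params, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    a, b, c, d, tx, ty = params
    return np.array([[a, b, tx], [c, d, ty]])

# Hypothetical object boundary in frame t, shifted by (3, 1) in frame t+1.
prev_shape = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 2.0], [0.0, 2.0]])
next_shape = prev_shape + np.array([3.0, 1.0])
M = estimate_affine(prev_shape, next_shape)
print(np.round(M, 3))  # translation column ~ (3, 1), linear part ~ identity
```

The recovered parameters (rotation/scale in the 2x2 block, translation in the last column) are exactly the kind of geometric parameters the text proposes as motion features.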
- For the different frequency domain feature information of the image frame, the amplitude and phase of each frame image of each sample video data may be determined, and then the amplitude difference and phase difference of adjacent frames determined to obtain the different frequency domain feature information of the image frame.
- Specifically, a Fourier transform of each frame image may first be performed and its spectral features extracted; the amplitude and phase of each of a plurality of different spectral components are taken as the feature set of each frame image, and the amplitude difference and phase difference between adjacent frames are then calculated to obtain the different frequency domain feature information of the image frame.
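A sketch of this frequency-domain feature using numpy's FFT; the intuition is that adjacent frames of a fluent video differ less in their spectra than frames across a splice (the 8x8 random frames are synthetic stand-ins):

```python
import numpy as np

def freq_features(frame_a, frame_b):
    """Mean amplitude and phase differences between two frames' 2-D spectra."""
    fa, fb = np.fft.fft2(frame_a), np.fft.fft2(frame_b)
    amp_diff = np.abs(np.abs(fa) - np.abs(fb)).mean()
    phase_diff = np.abs(np.angle(fa) - np.angle(fb)).mean()
    return amp_diff, phase_diff

rng = np.random.default_rng(1)
frame = rng.random((8, 8))
similar = frame + 0.01 * rng.random((8, 8))   # smooth transition
unrelated = rng.random((8, 8))                # abrupt splice

amp_s, _ = freq_features(frame, similar)
amp_u, _ = freq_features(frame, unrelated)
print(amp_s < amp_u)  # fluent adjacent frames differ less in spectrum
```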
- the embodiment of the present application may determine the wavelet coefficients of each frame image of each sample video data, and then determine the change in the wavelet coefficients between each pair of adjacent frames to obtain the image frame wavelet transform feature information.
- specifically, wavelet transform processing may be performed on each frame image to obtain the corresponding wavelet coefficients; the frames are then sorted in time order, the wavelet coefficients of each pair of adjacent frames are compared, and the difference in the coefficients is used as the wavelet transform feature information.
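The wavelet-coefficient change above can be sketched with a single-level Haar transform (chosen here only for illustration; the embodiment does not prescribe a particular wavelet), applied to a 1-D row of pixel intensities per frame:

```python
def haar_coeffs(signal):
    """Single-level Haar transform: (approximation, detail) coefficients."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_change(frame_a, frame_b):
    """Element-wise change in Haar coefficients between adjacent frames."""
    (a_app, a_det), (b_app, b_det) = haar_coeffs(frame_a), haar_coeffs(frame_b)
    return ([y - x for x, y in zip(a_app, b_app)],
            [y - x for x, y in zip(a_det, b_det)])

# Hypothetical pixel rows from two adjacent frames
app_change, det_change = wavelet_change([4, 2, 6, 6], [8, 2, 6, 6])
print(app_change, det_change)  # [2.0, 0.0] [2.0, 0.0]
```

A sudden jump in these coefficient differences between frames would indicate abrupt content change, which the detection model can learn to weigh.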
- for the image rotation operator feature information, the rotation operator of each frame image of each sample video data may be first determined, and then the change value of the rotation operator between each pair of adjacent frames determined, to obtain the image rotation operator feature information.
- specifically, the rotation operator of each frame image may be first calculated, the frames sorted in time order, and the change value of the rotation operator between adjacent frames determined, to obtain the image rotation operator feature information.
- the rotation operator of each frame image may be calculated using the SIFT (Scale-Invariant Feature Transform) algorithm, an algorithm for detecting local features. It obtains features by finding the key points (feature points) of an image together with their scale and orientation descriptors, and uses them for image feature-point matching; its essence is to find key points in different scale spaces and calculate their orientations.
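The orientation-assignment step of SIFT mentioned above can be illustrated in miniature: around a pixel of interest, SIFT builds a histogram of gradient orientations and takes the dominant bin as the key point's direction. The sketch below is a deliberate simplification (no scale space, Gaussian weighting, or interpolation, and the image values are hypothetical), showing only that orientation-histogram idea:

```python
import math

def dominant_orientation(image, x, y, radius=1, bins=8):
    """Histogram of gradient orientations around (x, y); returns the
    dominant bin's angle in degrees. A toy version of SIFT's orientation
    assignment, not the full SIFT pipeline."""
    hist = [0.0] * bins
    for j in range(y - radius, y + radius + 1):
        for i in range(x - radius, x + radius + 1):
            dx = image[j][i + 1] - image[j][i - 1]   # horizontal gradient
            dy = image[j + 1][i] - image[j - 1][i]   # vertical gradient
            mag = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360
            hist[int(angle // (360 / bins)) % bins] += mag
    return hist.index(max(hist)) * (360 / bins)

# Toy image whose intensity increases left to right -> gradients along +x
img = [[c * 10 for c in range(6)] for _ in range(6)]
angle = dominant_orientation(img, 2, 2)
print(angle)  # 0.0 (dominant gradient direction along the x axis)
```

Comparing such orientations between adjacent frames gives the rotation-operator change value described above.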
- Step 202 Perform training by using quality feature information of the plurality of forward sample video data and negative sample video data to generate a video data detection model.
- the quality feature information may be used for model training to generate a video data detection model.
- in this embodiment, the quality feature information of the plurality of forward sample video data and negative sample video data may be normalized to obtain normalized quality feature information, and missing values of the quality feature information may be complemented during normalization. The target quality feature information is then identified from the normalized quality feature information and used for neural network model training to generate the video data detection model.
- identifying the target quality feature information may consist of screening out highly discriminative feature information.
- specifically, the information entropy of the normalized quality feature information may first be determined. The larger the information entropy, the richer the information carried by the feature, the greater its importance, and the more it should be retained. Therefore, quality feature information whose information entropy exceeds a first preset threshold can be identified as the target quality feature information.
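A minimal sketch of the entropy-based screening above (feature values are assumed already normalized to [0, 1] and are discretized into bins before computing entropy; the threshold and sample data are hypothetical):

```python
import math
from collections import Counter

def information_entropy(values, bins=4):
    """Shannon entropy of a normalized feature column, after binning."""
    binned = [min(int(v * bins), bins - 1) for v in values]
    total = len(binned)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(binned).values())

def select_target_features(features, threshold):
    """Keep features whose entropy exceeds the first preset threshold."""
    return {name: col for name, col in features.items()
            if information_entropy(col) > threshold}

features = {
    "pixel": [0.1, 0.4, 0.6, 0.9],    # spread across bins -> high entropy
    "constant": [0.5, 0.5, 0.5, 0.5]  # single bin -> zero entropy
}
selected = select_target_features(features, threshold=1.0)
print(sorted(selected))  # ['pixel']
```

Only the surviving columns are fed to the neural network training step.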
- in this embodiment, the personalized feature information of the user may also be integrated, so that when the video data to be detected is identified, the evaluation of the video data can be combined with user attributes, improving the relevance and effectiveness of the recommended video data.
- specifically, attribute information of multiple users may be acquired, and the multiple users clustered into multiple user groups according to the attribute information, each user group having a corresponding user label, so that the attribute information of the users can be effectively integrated when the video data in the training sample set is used for model training.
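As an illustration of grouping users into labelled user groups (the embodiment does not prescribe a clustering algorithm; the single-attribute grouping and the attribute values below are hypothetical simplifications of a real clustering step such as k-means):

```python
from collections import defaultdict

def cluster_users(users, attribute):
    """Group users by one attribute; the attribute value doubles as the
    user label of each group. Real deployments would cluster on several
    attributes at once, but the label bookkeeping is the same."""
    groups = defaultdict(list)
    for user in users:
        groups[user[attribute]].append(user["id"])
    return dict(groups)

users = [
    {"id": "u1", "interest": "sports"},
    {"id": "u2", "interest": "fashion"},
    {"id": "u3", "interest": "sports"},
]
groups = cluster_users(users, "interest")
print(groups["sports"])  # ['u1', 'u3']
```

Each group's label is later compared against the video tags of the target video data when choosing a target user group.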
- Step 203 Acquire one or more video data to be detected.
- the video data to be detected may be a video segment synthesized in real time by extracting a plurality of video frames according to a certain rule in a video library.
- for example, when an e-commerce website uses video content for shopping guidance and marketing, multiple video frames matching the input text content may be extracted from a massive video library, and the multiple video frames are then combined into video clips according to certain rules.
- the video data to be detected may be determined by other methods in the art.
- the video data to be detected may also be an off-the-shelf video segment obtained from various paths, which is not limited in this embodiment of the present application.
- Step 204 Extract quality feature information of each video data to be detected, respectively.
- the quality feature information of the video data to be detected may also include image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, frequency domain feature information of the image frames, image frame wavelet transform feature information, and/or image rotation operator feature information.
- for the method of extracting the foregoing quality feature information, refer to step 201; it is not described again in this step.
- Step 205 Identify, by using a preset video data detection model, the quality feature information of the one or more video data to be detected to obtain a quality score of the one or more video data to be detected.
- in this embodiment, the quality feature information may be identified by using the trained video data detection model, each video data to be detected is scored based on the recognition result, and a corresponding quality score is output.
- Step 206 Extract video data whose quality score exceeds a second preset threshold as target video data.
- video data whose quality score exceeds the second preset threshold can be extracted as target video data.
- a person skilled in the art can determine the size of the second preset threshold according to actual needs, which is not limited by the embodiment of the present application.
- the video data with the highest quality score can be directly selected as the target video data, which is not limited in this embodiment of the present application.
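Steps 205–206 above reduce to a simple filter once each candidate carries a quality score (the scores and the threshold below are hypothetical):

```python
def select_target_videos(scored_videos, second_threshold, top_only=False):
    """scored_videos: {video_id: quality score from the detection model}.
    Either keep every video above the second preset threshold, or take
    only the single highest-scoring video."""
    if top_only:
        return [max(scored_videos, key=scored_videos.get)]
    return [vid for vid, score in scored_videos.items()
            if score > second_threshold]

scores = {"clip_a": 0.92, "clip_b": 0.55, "clip_c": 0.81}
print(sorted(select_target_videos(scores, second_threshold=0.8)))
print(select_target_videos(scores, second_threshold=0.8, top_only=True))
```

Both selection modes match the two alternatives the text describes: a threshold screen or a single best candidate.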
- Step 207 Determine a target user group among the plurality of user groups
- the identified target video data may include a corresponding video tag to reflect the classification or other information of the video data.
- the target user group for which the target video data is intended may be identified by comparing the video tag with the user tags of the user groups. For example, the user group whose user tag is the same as the video tag of the target video data may be determined to be the target user group.
- a person skilled in the art may also determine the target user group in other manners, which is not limited by the embodiment of the present application.
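The tag comparison in step 207 can be sketched as matching each target video's tags against the user labels of the groups (the tags and groups below are hypothetical):

```python
def target_user_groups(video_tags, user_groups):
    """video_tags: set of tags attached to the target video data.
    user_groups: {user label: list of user ids}.
    A group is a target group when its label matches a video tag."""
    return {label: members for label, members in user_groups.items()
            if label in video_tags}

groups = {"sports": ["u1", "u3"], "fashion": ["u2"]}
targets = target_user_groups(video_tags={"sports"}, user_groups=groups)
print(targets)  # {'sports': ['u1', 'u3']}
```

The target video data is then recommended to the members of the matching groups (step 208).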
- Step 208 Recommend the target video data to the target user group.
- the target video data may be recommended to the target user group.
- the video clip can be recommended to a potential consumer group, improving the user service experience and improving the user conversion rate.
- referring to FIG. 4, a flow chart of the steps of a method for generating a video data detection model of the present application is shown; the method may specifically include the following steps:
- Step 401 Extract quality feature information of a plurality of sample video data, where the plurality of sample video data includes a plurality of forward sample video data and negative direction sample video data;
- Step 402 Perform training by using quality feature information of the plurality of forward sample video data and negative sample video data to generate a video data detection model.
- the quality feature information may include image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, and image frame wavelet transform feature information. And/or image rotation operator feature information.
- the method for generating the video data detection model in the step 401 to the step 402 of the present embodiment is similar to the step 201 to the step 202 in the second embodiment of the video data recommendation method, and can be referred to each other.
- referring to FIG. 5, a flow chart of the steps of an embodiment of a method for identifying video data according to the present application is shown; the method may specifically include the following steps:
- Step 501 Acquire one or more video data to be detected.
- a user interface may be provided.
- an interactive interface is displayed on the display screen of the terminal, and the user may submit a detection request for one or more video data through the interaction interface.
- the video data may be an off-the-shelf video segment obtained from various channels, or may be a video segment that is synthesized in real time by extracting a plurality of video frames according to a certain rule in the video library.
- the specific source and type of the video data are not limited in this embodiment of the present application.
- Step 502 Send the one or more video data to be detected to a server, where the server is configured to separately identify the one or more video data to be detected to obtain a recognition result, where the identification result includes One or more candidate video data;
- the terminal may send one or more video data to be detected to the server, and the server completes the identification of the video data to obtain a corresponding recognition result.
- the identification result may include one or more candidate video data, and each candidate video data includes a corresponding quality score.
- the process of identifying the one or more video data to be detected by the server is similar to the step 201 to step 205 in the foregoing embodiment, and may be referred to each other.
- Step 503 Receive the one or more candidate video data returned by the server.
- the server may return one or more candidate video data included in the identification result to the terminal.
- Step 504 Determine target video data in the one or more candidate video data.
- since each candidate video data has a corresponding quality score, the target video data may be determined according to the level of the quality score.
- generally, the higher the quality score, the better the quality of the corresponding video data can be considered to be. Therefore, the video data with the highest quality score can be used as the target video data; alternatively, candidate video data whose quality score exceeds a certain threshold can be taken as a screening range, and the target video data then determined from that range according to the actual requirements of the service. The specific manner of determining the target video data is not limited in this embodiment of the present application. Of course, there may be more than one target video data, and this application does not limit this.
- the target video data may also be determined by the terminal according to information input by the user, for example selected by the user from the multiple candidate video data, which is not limited in this embodiment of the present application.
- Step 505 Present the target video data.
- the terminal may display the target video data on the interaction interface, for example, the specific information of the target video data may be displayed, or the target video data may be directly played, which is not limited in this embodiment of the present application.
- in this embodiment, the user can directly submit an identification request for video data through the interaction interface, and the server identifies the video data targeted by the request, so that the user can complete the detection of the video data according to actual needs, improving the convenience of judging the quality of video data.
- referring to FIG. 6, a structural block diagram of a device for recommending video data of the present application is shown; the device may specifically include the following modules:
- the obtaining module 601 is configured to acquire one or more video data to be detected
- the extracting module 602 is configured to separately extract quality feature information of each video data to be detected
- the identification module 603 is configured to identify the quality feature information by using a preset video data detection model to obtain target video data.
- the recommendation module 604 is configured to recommend the target video data to the user.
- the preset video data detection model may be generated by calling the following module:
- a quality feature information extraction module configured to separately extract quality feature information of the plurality of sample video data, where the plurality of sample video data may include a plurality of forward sample video data and negative direction sample video data;
- the video data detection model generating module is configured to perform training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate a video data detection model.
- the quality feature information may include image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, and image frame wavelet transform feature information. And/or image rotation operator feature information.
- the quality feature information extraction module may specifically include the following submodules:
- a pixel information extraction submodule configured to extract pixel information of each frame image of each sample video data
- the pixel information processing sub-module is configured to perform convolution operation and pooling processing on the pixel information to obtain image pixel feature information.
- the quality feature information extraction module may further include the following sub-modules:
- An object recognition sub-module for identifying the objects in each frame image of each sample video data
- the object processing sub-module is configured to respectively determine the number and frequency of occurrences of the objects in the adjacent two frames of images to obtain continuous frame image object migration feature information.
- the quality feature information extraction module may further include the following sub-modules:
- a motion object recognition submodule configured to identify a shape feature of the motion object in each frame image of each sample video data
- the action object processing sub-module is configured to respectively determine geometric parameters of the shape features of the action objects in the adjacent two frames of images to obtain continuous frame image action feature information.
- the quality feature information extraction module may further include the following sub-modules:
- An amplitude and phase determination sub-module for determining a magnitude and a phase of each frame image of each sample video data
- the amplitude and phase processing sub-module is configured to respectively determine the amplitude difference and the phase difference of the adjacent two frames of images to obtain different frequency domain characteristic information of the image frame.
- the quality feature information extraction module may further include the following sub-modules:
- a wavelet coefficient determining submodule for determining a wavelet coefficient of each frame image of each sample video data
- the wavelet coefficient processing sub-module is configured to respectively determine the variation values of the wavelet coefficients of the adjacent two frames of images to obtain image frame wavelet transform feature information.
- the quality feature information extraction module may further include the following sub-modules:
- a rotation operator determining sub-module for determining a rotation operator of each frame image of each sample video data
- the rotation operator processing sub-module is configured to respectively determine a variation value of a rotation operator of the adjacent two frames of images to obtain image rotation operator feature information.
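The pixel-feature submodules above (convolution followed by pooling over each frame's pixel information) can be sketched in pure Python on a tiny grayscale frame (the kernel and pixel values are hypothetical; a real model would learn many kernels):

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as in most CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)] for y in range(out_h)]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling."""
    return [[max(feature_map[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, len(feature_map[0]) - size + 1, size)]
            for y in range(0, len(feature_map) - size + 1, size)]

frame = [[1, 2, 0, 1],
         [3, 1, 1, 0],
         [0, 2, 2, 1],
         [1, 0, 1, 3]]
edge_kernel = [[1, -1]]  # horizontal intensity difference
pooled = max_pool(conv2d(frame, edge_kernel))
print(pooled)  # [[2], [1]]
```

The pooled map is the (much compressed) image pixel feature information that the detection model consumes.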
- the video data detection model generating module may specifically include the following submodules:
- a normalization processing sub-module configured to normalize quality characteristic information of the plurality of forward sample video data and negative-direction sample video data to obtain normalized quality feature information
- a target quality feature information identifying submodule configured to identify target quality feature information from the normalized quality feature information
- the video data detection model generation submodule is configured to perform neural network model training by using the target quality feature information, and generate a video data detection model.
- the target quality feature information identifying submodule may specifically include the following units:
- An information entropy determining unit configured to determine an information entropy of the normalized quality feature information
- the target quality feature information identifying unit is configured to identify the quality feature information that the information entropy exceeds the first preset threshold as the target quality feature information.
- generating the preset video data detection model may also invoke the following modules:
- An attribute information obtaining module configured to acquire attribute information of multiple users
- the user group clustering module is configured to cluster the plurality of users into a plurality of user groups according to the attribute information, where the user group has a corresponding user label.
- the identification module 603 may specifically include the following sub-modules:
- a quality feature information identifying sub-module configured to identify, by using a preset video data detection model, the quality feature information of the one or more video data to be detected, respectively, to obtain a quality score of the one or more video data to be detected
- the target video data extraction sub-module is configured to extract video data whose quality score exceeds a second preset threshold as target video data.
- the recommendation module 604 may specifically include the following submodules:
- a target user group determining submodule configured to determine a target user group among the plurality of user groups
- the target video data recommendation submodule is configured to recommend the target video data to the target user group.
- the target video data may have a corresponding video label
- the target user group determining sub-module may specifically include the following units:
- the target user group determining unit is configured to determine a user group whose user tag is the same as the video tag of the target video data as the target user group.
- referring to FIG. 7, a structural block diagram of an embodiment of a device for generating a video data detection model of the present application is shown; the device may specifically include the following modules:
- the quality feature information extraction module 701 is configured to separately extract quality feature information of the plurality of sample video data, where the plurality of sample video data may include a plurality of forward sample video data and negative direction sample video data;
- the video data detection model generating module 702 is configured to perform training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate a video data detection model.
- the quality feature information may include image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, and image frame wavelet transform feature information. And/or image rotation operator feature information.
- referring to FIG. 8, a structural block diagram of an embodiment of an apparatus for identifying video data according to the present application is shown; the apparatus may specifically include the following modules:
- the obtaining module 801 is configured to acquire one or more video data to be detected
- a sending module 802 configured to send the one or more video data to be detected to a server, where the server is configured to separately identify the one or more video data to be detected to obtain a recognition result, where
- the recognition result may include one or more candidate video data;
- the receiving module 803 is configured to receive the one or more candidate video data returned by the server;
- a determining module 804 configured to determine target video data in the one or more candidate video data
- a presentation module 805 is configured to present the target video data.
- for the device embodiments, the description is relatively simple, and for relevant parts, reference may be made to the description of the method embodiments.
- embodiments of the embodiments of the present application can be provided as a method, apparatus, or computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
- the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
- the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
- Memory is an example of a computer readable medium.
- Computer readable media includes both permanent and non-persistent, removable and non-removable media.
- Information storage can be implemented by any method or technology. The information can be computer readable instructions, data structures, modules of programs, or other data.
- Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device.
- As defined herein, computer readable media does not include transitory computer readable media, such as modulated data signals and carrier waves.
- Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions.
- These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
- the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising the instruction device.
- the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
- The method for recommending video data, the device for recommending video data, the method for generating a video data detection model, the device for generating a video data detection model, the method for identifying video data, and the device for identifying video data provided by the present application have been described in detail above.
- The principles and implementations of the present application are described herein using specific examples; the description of the above embodiments is only intended to help understand the method and core idea of the present application. Meanwhile, the content of this specification should not be construed as limiting the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Abstract
Provided are a recommendation method and apparatus for video data. The recommendation method comprises: acquiring one or more pieces of video data to be detected; respectively extracting quality feature information about each piece of video data to be detected; recognising the quality feature information using a pre-set video data detection model, so as to obtain target video data; and recommending the target video data to a user. In the embodiments of the present application, video data of high quality can be quickly screened out using a deep learning model. The embodiments of the present application solve the problem in the art that a video clip can only be recommended to a user by relying on artificial recognition, thereby improving the recognition efficiency for video data and the accuracy rate of recommendation.
Description
The present application claims priority to Chinese Patent Application No. 201710113741.4, filed on February 28, 2017, and entitled "Recommendation Method and Apparatus for Video Data", the entire contents of which are incorporated herein by reference.
The present application relates to the field of data processing technologies, and in particular to a method for recommending video data, a device for recommending video data, a method for generating a video data detection model, a device for generating a video data detection model, a method for identifying video data, and a device for identifying video data.
The development of e-commerce has significantly improved the convenience of people's daily life. Through e-commerce websites, people can easily purchase goods and complete payment, saving shopping time.
In order to better help users understand the characteristics of target products, e-commerce websites have begun to use video content for shopping guidance and marketing: corresponding text information is input according to operational needs, suitable video frames are selected from a video library, a video of an appropriate scene is constructed from those frames according to the text semantics, and the video is recommended to target users.
However, in practical applications, after massive video content is extracted and synthesized into videos, the quality of the synthesized videos still needs to be detected and evaluated so that the best videos can be selected for delivery to target users. In the prior art, detecting and evaluating video quality mainly relies on manual review by operators; this not only consumes a large amount of operational resources, but in most cases manual review also cannot process the synthesized videos in real time.
Summary of the Invention
In view of the above problems, embodiments of the present application are provided in order to offer a method for recommending video data, a device for recommending video data, a method for generating a video data detection model, a device for generating a video data detection model, a method for identifying video data, and a corresponding device for identifying video data that overcome the above problems or at least partially solve them.
In order to solve the above problems, the present application discloses a method for recommending video data, including:
acquiring one or more video data to be detected;
extracting quality feature information of each video data to be detected, respectively;
identifying the quality feature information by using a preset video data detection model to obtain target video data; and
recommending the target video data to the user.
Optionally, the preset video data detection model is generated as follows:
extracting quality feature information of a plurality of sample video data, respectively, the plurality of sample video data including a plurality of forward sample video data and negative sample video data; and
performing training by using the quality feature information of the plurality of forward sample video data and negative sample video data to generate the video data detection model.
Optionally, the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, frequency domain feature information of the image frames, image frame wavelet transform feature information, and/or image rotation operator feature information.
Optionally, the step of separately extracting the quality feature information of the plurality of sample video data includes:
extracting pixel information of each frame image of each sample video data; and
performing convolution and pooling operations on the pixel information to obtain image pixel feature information.
可选地,所述分别提取多个样本视频数据的质量特征信息的步骤包括:Optionally, the step of separately extracting quality feature information of the plurality of sample video data includes:
识别每个样本视频数据的每一帧图像中的物体对象;Identifying an object object in each frame of image of each sample video data;
分别确定相邻两帧图像中的物体对象出现的次数和频率,以获得连续帧图像物体迁移特征信息。The number and frequency of occurrences of the object objects in the adjacent two frames of images are respectively determined to obtain continuous frame image object migration feature information.
可选地,所述分别提取多个样本视频数据的质量特征信息的步骤包括:Optionally, the step of separately extracting quality feature information of the plurality of sample video data includes:
识别每个样本视频数据的每一帧图像中的动作对象的形状特征;Identifying a shape feature of the action object in each frame image of each sample video data;
分别确定相邻两帧图像中的动作对象的形状特征的几何参数,以获得连续帧图像动作特征信息。The geometric parameters of the shape features of the motion objects in the adjacent two frames of images are respectively determined to obtain continuous frame image motion feature information.
可选地,所述分别提取多个样本视频数据的质量特征信息的步骤包括:Optionally, the step of separately extracting quality feature information of the plurality of sample video data includes:
确定每个样本视频数据的每一帧图像的幅值和相位;Determining the amplitude and phase of each frame of image of each sample video data;
分别确定相邻两帧图像的幅值差和相位差,以获得图像帧不同的频域特征信息。The amplitude difference and the phase difference of the adjacent two frames of images are respectively determined to obtain different frequency domain feature information of the image frame.
可选地,所述分别提取多个样本视频数据的质量特征信息的步骤包括:Optionally, the step of separately extracting quality feature information of the plurality of sample video data includes:
确定每个样本视频数据的每一帧图像的小波系数;Determining a wavelet coefficient of each frame image of each sample video data;
分别确定相邻两帧图像的小波系数的变化值,以获得图像帧小波变换特征信息。The change values of the wavelet coefficients of the adjacent two frames of images are respectively determined to obtain image frame wavelet transform feature information.
可选地,所述分别提取多个样本视频数据的质量特征信息的步骤包括:Optionally, the step of separately extracting quality feature information of the plurality of sample video data includes:
确定每个样本视频数据的每一帧图像的旋转算子;Determining a rotation operator for each frame of image of each sample video data;
分别确定相邻两帧图像的旋转算子的变化值,以获得图像旋转算子特征信息。The change values of the rotation operators of the adjacent two frames of images are respectively determined to obtain image rotation operator feature information.
可选地,所述采用所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行训练,生成视频数据检测模型的步骤包括:Optionally, the step of training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate the video data detection model includes:
对所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行归一化处理,以获得归一化的质量特征信息;Normalizing the quality feature information of the plurality of forward sample video data and negative sample video data to obtain normalized quality feature information;
补全所述归一化的质量特征信息的缺失值;Completing the missing value of the normalized quality feature information;
从所述归一化的质量特征信息中识别出目标质量特征信息;Identifying target quality feature information from the normalized quality feature information;
采用所述目标质量特征信息进行神经网络模型训练,生成视频数据检测模型。The target quality feature information is used to train the neural network model to generate a video data detection model.
可选地,所述从所述归一化的质量特征信息中识别出目标质量特征信息的步骤包括:Optionally, the step of identifying target quality feature information from the normalized quality feature information includes:
确定所述归一化的质量特征信息的信息熵;Determining an information entropy of the normalized quality feature information;
识别所述信息熵超过第一预设阈值的质量特征信息为目标质量特征信息。The quality feature information identifying that the information entropy exceeds the first preset threshold is the target quality feature information.
可选地,还包括:Optionally, it also includes:
获取多个用户的属性信息;Obtain attribute information of multiple users;
根据所述属性信息,将所述多个用户聚类为多个用户群体,所述用户群体具有相应的用户标签。And the plurality of users are clustered into a plurality of user groups according to the attribute information, and the user groups have corresponding user labels.
可选地,所述采用预设的视频数据检测模型对所述质量特征信息进行识别,以获得目标视频数据的步骤包括:Optionally, the step of identifying the quality feature information by using a preset video data detection model to obtain target video data includes:
采用预设的视频数据检测模型分别对所述一个或多个待检测的视频数据的质量特征信息进行识别,以获得所述一个或多个待检测的视频数据的质量分值;Determining quality characteristic information of the one or more video data to be detected by using a preset video data detection model to obtain a quality score of the one or more video data to be detected;
提取所述质量分值超过第二预设阈值的视频数据为目标视频数据。The video data whose quality score exceeds the second preset threshold is extracted as target video data.
可选地,所述向用户推荐所述目标视频数据的步骤包括:Optionally, the step of recommending the target video data to a user includes:
在所述多个用户群体中确定目标用户群体;Determining a target user group among the plurality of user groups;
向所述目标用户群体推荐所述目标视频数据。The target video data is recommended to the target user group.
可选地,所述目标视频数据具有相应的视频标签,所述在所述多个用户群体中确定目标用户群体的步骤包括:Optionally, the target video data has a corresponding video tag, and the step of determining a target user group among the multiple user groups includes:
确定与所述目标视频数据的视频标签相同的用户标签所对应的用户群体为目标用户群体。Determining a user group corresponding to the same user tag of the video tag of the target video data as a target user group.
为了解决上述问题,本申请公开了一种视频数据检测模型的生成方法,包括:In order to solve the above problem, the present application discloses a method for generating a video data detection model, including:
分别提取多个样本视频数据的质量特征信息,所述多个样本视频数据包括多个正向样本视频数据和负向样本视频数据;Extracting quality feature information of the plurality of sample video data, the plurality of sample video data including a plurality of forward sample video data and negative direction sample video data;
采用所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行训练,生成视频数据检测模型。Training is performed by using quality feature information of the plurality of forward sample video data and negative sample video data to generate a video data detection model.
可选地,所述质量特征信息包括图像像素特征信息,连续帧图像物体迁移特征信息,连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息,和/或,图像旋转算子特征信息。Optionally, the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or, Image rotation operator feature information.
为了解决上述问题,本申请公开了一种视频数据的识别方法,包括:In order to solve the above problem, the present application discloses a method for identifying video data, including:
获取一个或多个待检测的视频数据;Obtaining one or more video data to be detected;
将所述一个或多个待检测的视频数据发送至服务器,所述服务器用于分别对所述一个或多个待检测的视频数据进行识别,以获得识别结果,所述识别结果包括一个或多个候选视频数据;Sending the one or more video data to be detected to a server, where the server is configured to separately identify the one or more video data to be detected to obtain a recognition result, where the recognition result includes one or more Candidate video data;
接收所述服务器返回的所述一个或多个候选视频数据;Receiving the one or more candidate video data returned by the server;
在所述一个或多个候选视频数据中确定目标视频数据;Determining target video data in the one or more candidate video data;
展现所述目标视频数据。Presenting the target video data.
为了解决上述问题,本申请公开了一种视频数据的推荐装置,包括:In order to solve the above problem, the present application discloses a recommendation device for video data, including:
获取模块,用于获取一个或多个待检测的视频数据;An obtaining module, configured to acquire one or more video data to be detected;
提取模块,用于分别提取每个待检测的视频数据的质量特征信息;An extraction module, configured to separately extract quality feature information of each video data to be detected;
识别模块,用于采用预设的视频数据检测模型对所述质量特征信息进行识别,以获得目标视频数据;An identification module, configured to identify the quality feature information by using a preset video data detection model to obtain target video data;
推荐模块,用于向用户推荐所述目标视频数据。a recommendation module for recommending the target video data to a user.
可选地,所述预设的视频数据检测模型通过调用如下模块生成:Optionally, the preset video data detection model is generated by calling the following module:
质量特征信息提取模块,用于分别提取多个样本视频数据的质量特征信息,所述多个样本视频数据包括多个正向样本视频数据和负向样本视频数据;a quality feature information extraction module, configured to separately extract quality feature information of the plurality of sample video data, where the plurality of sample video data includes a plurality of forward sample video data and negative sample video data;
视频数据检测模型生成模块,用于采用所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行训练,生成视频数据检测模型。The video data detection model generating module is configured to perform training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate a video data detection model.
可选地,所述质量特征信息包括图像像素特征信息,连续帧图像物体迁移特征信息,连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息,和/或,图像旋转算子特征信息。Optionally, the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or, Image rotation operator feature information.
可选地,所述质量特征信息提取模块包括:Optionally, the quality feature information extraction module includes:
像素信息提取子模块,用于提取每个样本视频数据的每一帧图像的像素信息;a pixel information extraction submodule, configured to extract pixel information of each frame image of each sample video data;
像素信息处理子模块,用于分别对所述像素信息进行卷积运算和池化处理,以获得图像像素特征信息。The pixel information processing sub-module is configured to perform convolution operation and pooling processing on the pixel information to obtain image pixel feature information.
可选地,所述质量特征信息提取模块还包括:Optionally, the quality feature information extraction module further includes:
物体对象识别子模块,用于识别每个样本视频数据的每一帧图像中的物体对象;An object object recognition sub-module for identifying an object object in each frame image of each sample video data;
物体对象处理子模块,用于分别确定相邻两帧图像中的物体对象出现的次数和频率,以获得连续帧图像物体迁移特征信息。The object object processing sub-module is configured to respectively determine the number and frequency of occurrences of the object objects in the adjacent two frames of images to obtain continuous frame image object migration feature information.
可选地,所述质量特征信息提取模块还包括:Optionally, the quality feature information extraction module further includes:
动作对象识别子模块,用于识别每个样本视频数据的每一帧图像中的动作对象的形状特征;a motion object recognition submodule, configured to identify a shape feature of the motion object in each frame image of each sample video data;
动作对象处理子模块,用于分别确定相邻两帧图像中的动作对象的形状特征的几何参数,以获得连续帧图像动作特征信息。The action object processing sub-module is configured to respectively determine geometric parameters of the shape features of the action objects in the adjacent two frames of images to obtain continuous frame image action feature information.
可选地,所述质量特征信息提取模块还包括:Optionally, the quality feature information extraction module further includes:
幅值和相位确定子模块,用于确定每个样本视频数据的每一帧图像的幅值和相位;An amplitude and phase determination sub-module for determining a magnitude and a phase of each frame image of each sample video data;
幅值和相位处理子模块,用于分别确定相邻两帧图像的幅值差和相位差,以获得图像帧不同的频域特征信息。The amplitude and phase processing sub-module is configured to respectively determine amplitude difference and phase difference of adjacent two frames of images to obtain different frequency domain feature information of the image frame.
可选地,所述质量特征信息提取模块还包括:Optionally, the quality feature information extraction module further includes:
小波系数确定子模块,用于确定每个样本视频数据的每一帧图像的小波系数;a wavelet coefficient determining submodule for determining a wavelet coefficient of each frame image of each sample video data;
小波系数处理子模块,用于分别确定相邻两帧图像的小波系数的变化值,以获得图像帧小波变换特征信息。The wavelet coefficient processing sub-module is configured to respectively determine the variation values of the wavelet coefficients of the adjacent two frames of images to obtain image frame wavelet transform feature information.
可选地,所述质量特征信息提取模块还包括:Optionally, the quality feature information extraction module further includes:
旋转算子确定子模块,用于确定每个样本视频数据的每一帧图像的旋转算子;a rotation operator determining sub-module for determining a rotation operator of each frame image of each sample video data;
旋转算子处理子模块,用于分别确定相邻两帧图像的旋转算子的变化值,以获得图像旋转算子特征信息。The rotation operator processing sub-module is configured to respectively determine a variation value of a rotation operator of the adjacent two frames of images to obtain image rotation operator feature information.
可选地,所述视频数据检测模型生成模块包括:Optionally, the video data detection model generating module includes:
归一化处理子模块,用于对所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行归一化处理,以获得归一化的质量特征信息;a normalization processing sub-module, configured to normalize quality characteristic information of the plurality of forward sample video data and negative-direction sample video data to obtain normalized quality feature information;
缺失值补全子模块,用于补全所述归一化的质量特征信息的缺失值;a missing value completion sub-module for complementing the missing value of the normalized quality feature information;
目标质量特征信息识别子模块,用于从所述归一化的质量特征信息中识别出目标质量特征信息;a target quality feature information identifying submodule, configured to identify target quality feature information from the normalized quality feature information;
视频数据检测模型生成子模块,用于采用所述目标质量特征信息进行神经网络模型训练,生成视频数据检测模型。The video data detection model generation submodule is configured to perform neural network model training by using the target quality feature information, and generate a video data detection model.
可选地,所述目标质量特征信息识别子模块包括:Optionally, the target quality feature information identifying submodule includes:
信息熵确定单元,用于确定所述归一化的质量特征信息的信息熵;An information entropy determining unit, configured to determine an information entropy of the normalized quality feature information;
目标质量特征信息识别单元,用于识别所述信息熵超过第一预设阈值的质量特征信息为目标质量特征信息。The target quality feature information identifying unit is configured to identify the quality feature information that the information entropy exceeds the first preset threshold as the target quality feature information.
可选地,生成所述预设的视频数据检测模型还调用如下模块:Optionally, generating the preset video data detection model further invokes the following modules:
属性信息获取模块,用于获取多个用户的属性信息;An attribute information obtaining module, configured to acquire attribute information of multiple users;
用户群体聚类模块,用于根据所述属性信息,将所述多个用户聚类为多个用户群体,所述用户群体具有相应的用户标签。The user group clustering module is configured to cluster the plurality of users into a plurality of user groups according to the attribute information, where the user group has a corresponding user label.
可选地,所述识别模块包括:Optionally, the identifying module includes:
质量特征信息识别子模块,用于采用预设的视频数据检测模型分别对所述一个或多个待检测的视频数据的质量特征信息进行识别,以获得所述一个或多个待检测的视频数据的质量分值;a quality feature information identifying sub-module, configured to identify, by using a preset video data detection model, quality characteristic information of the one or more video data to be detected, respectively, to obtain the one or more video data to be detected. Quality score
目标视频数据提取子模块,用于提取所述质量分值超过第二预设阈值的视频数据为目标视频数据。The target video data extraction sub-module is configured to extract video data whose quality score exceeds a second preset threshold as target video data.
可选地,所述推荐模块包括:Optionally, the recommendation module includes:
目标用户群体确定子模块,用于在所述多个用户群体中确定目标用户群体;a target user group determining submodule, configured to determine a target user group among the plurality of user groups;
目标视频数据推荐子模块,用于向所述目标用户群体推荐所述目标视频数据。The target video data recommendation submodule is configured to recommend the target video data to the target user group.
可选地,所述目标视频数据具有相应的视频标签,所述目标用户群体确定子模块包括:Optionally, the target video data has a corresponding video tag, and the target user group determining submodule includes:
目标用户群体确定单元,用于确定与所述目标视频数据的视频标签相同的用户标签所对应的用户群体为目标用户群体。The target user group determining unit is configured to determine a user group corresponding to the same user tag of the video tag of the target video data as a target user group.
为了解决上述问题,本申请公开了一种视频数据检测模型的生成装置,包括:In order to solve the above problem, the present application discloses a device for generating a video data detection model, including:
质量特征信息提取模块,用于分别提取多个样本视频数据的质量特征信息,所述多个样本视频数据包括多个正向样本视频数据和负向样本视频数据;a quality feature information extraction module, configured to separately extract quality feature information of the plurality of sample video data, where the plurality of sample video data includes a plurality of forward sample video data and negative sample video data;
视频数据检测模型生成模块,用于采用所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行训练,生成视频数据检测模型。The video data detection model generating module is configured to perform training by using the quality feature information of the plurality of forward sample video data and the negative sample video data to generate a video data detection model.
可选地,所述质量特征信息包括图像像素特征信息,连续帧图像物体迁移特征信息, 连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息,和/或,图像旋转算子特征信息。Optionally, the quality feature information includes image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, image frame wavelet transform feature information, and/or, Image rotation operator feature information.
为了解决上述问题,本申请公开了一种视频数据的识别装置,包括:In order to solve the above problem, the present application discloses an apparatus for identifying video data, including:
获取模块,用于获取一个或多个待检测的视频数据;An obtaining module, configured to acquire one or more video data to be detected;
发送模块,用于将所述一个或多个待检测的视频数据发送至服务器,所述服务器用于分别对所述一个或多个待检测的视频数据进行识别,以获得识别结果,所述识别结果包括一个或多个候选视频数据;a sending module, configured to send the one or more video data to be detected to a server, where the server is configured to separately identify the one or more video data to be detected to obtain a recognition result, and the identifying The result includes one or more candidate video data;
接收模块,用于接收所述服务器返回的所述一个或多个候选视频数据;a receiving module, configured to receive the one or more candidate video data returned by the server;
确定模块,用于在所述一个或多个候选视频数据中确定目标视频数据;a determining module, configured to determine target video data in the one or more candidate video data;
展现模块,用于展现所述目标视频数据。a presentation module for presenting the target video data.
与背景技术相比,本申请实施例包括以下优点:Compared with the background art, the embodiments of the present application include the following advantages:
本申请实施例,通过获取一个或多个待检测的视频数据,并分别提取每个待检测的视频数据的质量特征信息,然后采用预设的视频数据检测模型对所述质量特征信息进行识别,以获得目标视频数据,进而向用户推荐所述目标视频数据,通过采用深度学习模型能够迅速筛选出优质的视频数据,本申请实施例解决了现有技术中只能依靠人工识别并向用户推荐视频片段的问题,提高了对视频数据的识别效率以及推荐的准确率。In the embodiment of the present application, one or more video data to be detected are acquired, and quality characteristic information of each video data to be detected is separately extracted, and then the quality feature information is identified by using a preset video data detection model. The target video data is obtained, and the target video data is recommended to the user, and the high-quality video data can be quickly selected by using the deep learning model. The embodiment of the present application solves the problem that the prior art can only rely on manual identification and recommend the video to the user. The problem of the segment improves the recognition efficiency of the video data and the accuracy of the recommendation.
图1是本申请的一种视频数据的推荐方法实施例一的步骤流程图;1 is a flow chart showing the steps of Embodiment 1 of a method for recommending video data according to the present application;
图2是本申请的一种视频数据的推荐方法实施例二的步骤流程图;2 is a flow chart of steps of a second embodiment of a method for recommending video data according to the present application;
图3是本申请的一种视频数据的推荐方法的原理框图;3 is a schematic block diagram of a method for recommending video data according to the present application;
图4是本申请的一种视频数据检测模型的生成方法实施例的步骤流程图;4 is a flow chart showing the steps of an embodiment of a method for generating a video data detection model according to the present application;
图5是本申请的一种视频数据的识别方法实施例的步骤流程图;5 is a flow chart showing the steps of an embodiment of a method for identifying video data according to the present application;
图6是本申请的一种视频数据的推荐装置实施例的结构框图;6 is a structural block diagram of an embodiment of a device for recommending video data according to the present application;
图7是本申请的一种视频数据检测模型的生成装置实施例的结构框图;7 is a structural block diagram of an embodiment of a device for generating a video data detection model according to the present application;
图8是本申请的一种视频数据的识别装置实施例的结构框图。FIG. 8 is a structural block diagram of an embodiment of an apparatus for identifying video data according to the present application.
为使本申请的上述目的、特征和优点能够更加明显易懂,下面结合附图和具体实施方式对本申请作进一步详细的说明。The above described objects, features and advantages of the present application will become more apparent and understood.
参照图1,示出了本申请的一种视频数据的推荐方法实施例一的步骤流程图,具体可以包括如下步骤:Referring to FIG. 1 , a flow chart of a first embodiment of a method for recommending video data according to the present application is shown. Specifically, the method may include the following steps:
步骤101,获取一个或多个待检测的视频数据;Step 101: Acquire one or more video data to be detected;
在本申请实施例中,所述待检测的视频数据可以是从各种途径获取的现成的视频片段,也可以是在视频库中根据某种规则提取多个视频帧实时合成的视频片段,本申请实施例对视频数据的具体来源和类型不作限定。In the embodiment of the present application, the video data to be detected may be an off-the-shelf video segment obtained from various ways, or may be a video segment that is synthesized in real time by extracting multiple video frames according to a certain rule in the video library. The application embodiment does not limit the specific source and type of video data.
步骤102,分别提取每个待检测的视频数据的质量特征信息;Step 102: Extract quality characteristic information of each video data to be detected, respectively.
在本申请实施例中,视频数据的质量特征信息可以是用于识别所述视频数据的质量的特征信息,例如,视频数据的图像像素、图像所展示的内容等特征信息。通过对视频数据的质量特征信息进行识别,能够对视频片段的流畅度、连贯性等进行检验。In the embodiment of the present application, the quality feature information of the video data may be feature information for identifying the quality of the video data, for example, image pixels of the video data, content displayed by the image, and the like. By identifying the quality characteristic information of the video data, it is possible to check the fluency, consistency, and the like of the video clip.
当然,本领域技术人员可以根据实际需要,具体确定所要提取的质量特征信息的类型及提取方式,本申请实施例对此不作限定。Certainly, the type of the quality feature information to be extracted and the manner of the extraction are determined by a person skilled in the art according to actual needs, which is not limited by the embodiment of the present application.
步骤103,采用预设的视频数据检测模型对所述质量特征信息进行识别,以获得目标视频数据;Step 103: Identify the quality feature information by using a preset video data detection model to obtain target video data.
在本申请实施例中,预设的视频数据检测模型可以通过对训练样本集中的多个样本视频数据进行训练生成,从而可以用于对待检测的视频数据的各个质量特征信息进行识别。In the embodiment of the present application, the preset video data detection model may be generated by training a plurality of sample video data in the training sample set, so that each quality feature information of the video data to be detected may be identified.
在具体实现中,训练样本集中的多个样本视频数据可以包括多个正向样本视频数据和多个负向样本视频数据,所述正向样本视频数据可以是视频质量较好的视频片段,例如,流畅度和连贯性较好、各个视频帧之间的整体风格较一致的视频片段,通常此类正向样本视频数据可以通过人工打标或者网络爬取获得;与正向样本视频数据相反,所述负向样本视频数据则是流畅度、连贯性以及各个视频帧之间的整体风格一致性较差的视频片段,通常此类负向样本视频数据可以通过对多个视频帧进行随机合成获得,本申请实施例对正向样本视频数据和负向样本视频数据的来源和识别方式不作限定。In a specific implementation, the plurality of sample video data in the training sample set may include a plurality of forward sample video data and a plurality of negative sample video data, and the forward sample video data may be a video segment with better video quality, for example, A video clip with better fluency and coherence and a more uniform overall style between video frames. Usually such forward sample video data can be obtained by manual marking or web crawling; contrary to the forward sample video data, The negative sample video data is a video segment with poor fluency, coherence, and overall style consistency between video frames. Generally, such negative sample video data can be obtained by randomly synthesizing multiple video frames. The source and the identification manner of the forward sample video data and the negative sample video data are not limited in the embodiment of the present application.
在集合多个正向样本视频数据和负向样本视频数据形成训练样本集后,可以分别提取所述正向样本视频数据和负向样本视频数据的质量特征信息,并进行模型训练,从而生成视频数据检测模型;进而可以在提取待检测的视频数据的质量特征信息后,采用所 述视频数据检测模型对所述质量特征信息进行识别,获得目标视频数据。After the plurality of forward sample video data and the negative sample video data are aggregated to form a training sample set, the quality feature information of the forward sample video data and the negative sample video data may be respectively extracted, and model training is performed to generate a video. The data detection model is further configured to: after extracting the quality feature information of the video data to be detected, use the video data detection model to identify the quality feature information to obtain target video data.
在本申请实施例中,所述目标视频数据可以是经视频数据检测模型识别后获得的质量较好的视频片段。In the embodiment of the present application, the target video data may be a video clip of good quality obtained after being identified by the video data detection model.
步骤104,向用户推荐所述目标视频数据。 Step 104, recommending the target video data to a user.
在具体实现中,向用户推荐目标视频数据可以是在用户界面播放所述目标视频片段,也可以是将所述目标视频片段推送给用户,本申请实施例对推荐目标视频数据的具体方式不作限定。In a specific implementation, the target video data is recommended to the user, and the target video segment may be played in the user interface, or the target video segment may be pushed to the user. The specific manner of recommending the target video data is not limited in this embodiment of the present application. .
在本申请实施例中,通过获取一个或多个待检测的视频数据,并分别提取每个待检测的视频数据的质量特征信息,然后采用预设的视频数据检测模型对所述质量特征信息进行识别,以获得目标视频数据,进而向用户推荐所述目标视频数据,本申请实施例采用深度学习模型能够迅速筛选出优质的视频数据,解决了现有技术中只能依靠人工识别并向用户推荐视频片段的问题,提高了对视频数据的识别效率以及推荐的准确率。In the embodiment of the present application, one or more video data to be detected are acquired, and quality characteristic information of each video data to be detected is separately extracted, and then the quality feature information is performed by using a preset video data detection model. The target video data is obtained to obtain the target video data, and the target video data is recommended to the user. The deep learning model in the embodiment of the present application can quickly screen out high-quality video data, and solves the problem that the prior art can only rely on manual identification and recommend to the user. The problem of video clips improves the efficiency of recognition of video data and the accuracy of recommendations.
参照图2,示出了本申请的一种视频数据的推荐方法实施例二的步骤流程图,具体可以包括如下步骤:Referring to FIG. 2, a flow chart of the steps of the second embodiment of the method for recommending video data of the present application is shown. Specifically, the method may include the following steps:
步骤201,分别提取多个样本视频数据的质量特征信息,所述多个样本视频数据包括多个正向样本视频数据和负向样本视频数据;Step 201: Extract quality feature information of a plurality of sample video data, where the plurality of sample video data includes a plurality of forward sample video data and negative direction sample video data;
如图3所示,是本申请的一种视频数据的推荐方法的原理框图。本申请实施例通过对训练样本集进行特征抽取,进而进行深度学习建模,然后采用训练好的模型对待检测视频数据进行评估,输出相应的质量分值,同时,在建模过程中通过融合用户属性信息对用户群体进行聚类,从而实现向用户群体的视频推荐。As shown in FIG. 3, it is a functional block diagram of a method for recommending video data of the present application. The embodiment of the present application performs feature extraction on the training sample set, and then performs deep learning modeling, and then uses the trained model to evaluate the detected video data, outputs corresponding quality scores, and simultaneously integrates users in the modeling process. The attribute information clusters the user groups to implement video recommendations to the user community.
在本申请实施例中,所述正向样本视频数据可以是视频质量较好的视频片段,例如,流畅度和连贯性较好、各个视频帧之间的整体风格较一致的视频片段,通常此类正向样本视频数据可以通过人工打标获得,由运营人员对视频片段的流畅度、连贯性以及各个视频帧之间的整体风格进行检验,从而将流畅度和连贯性较好、各个视频帧之间的整体风格较一致的视频片段标记为正向样本视频数据,还可以通过网络爬取获得,即通过从视频网站上截取一些点击率高,点赞数多的优质视频,作为网络爬取的正向样本视频数据。In the embodiment of the present application, the forward sample video data may be a video segment with better video quality, for example, a video segment with better fluency and coherence and a uniform overall style between video frames, usually The class forward sample video data can be obtained by manual marking. The operator checks the fluency and consistency of the video segment and the overall style between the video frames, so that the fluency and coherence are better, and the video frames are better. The video clips with more consistent overall style are marked as forward sample video data, and can also be obtained through web crawling, that is, by capturing some high-quality videos with high click-through rate and many praises from the video website, as a network crawling Forward sample video data.
与正向样本视频数据相反,所述负向样本视频数据则是流畅度、连贯性以及各个视频帧之间的整体风格一致性较差的视频片段,通常此类负向样本视频数据可以通过对多 个视频帧进行随机合成获得。例如,可以从多个类别中(比如旅游、宗教、电子产品中)随机分别抽取一些零散的视频帧片段,然后将抽取出的视频帧片段随意组合拼接,这些随意组合拼接而成的视频片段必然存在大量的不连贯和语义不一致,从而可以将此类拼接而成的视频片段作为负向样本视频数据。Contrary to the forward sample video data, the negative sample video data is a video segment with poor fluency, coherence, and overall style consistency between video frames. Usually such negative sample video data can pass through Multiple video frames are obtained by random synthesis. For example, some scattered video frame segments can be randomly extracted from multiple categories (such as travel, religion, and electronic products), and then the extracted video frame segments can be randomly combined and spliced. There are a large number of inconsistencies and semantic inconsistencies, so that such spliced video segments can be used as negative sample video data.
当然,本领域技术人员还可以按照其他方式获取正向样本视频数据和负向样本视频数据,本申请实施例对此不作限定。Of course, those skilled in the art can also obtain the forward sample video data and the negative sample video data in other manners, which is not limited in this embodiment of the present application.
然后,可以将获得的正向样本视频数据和负向样本视频数据作为训练样本集,供后续的模型训练使用。The obtained forward sample video data and negative sample video data can then be used as a training sample set for subsequent model training.
在具体实现中,可以首先分别提取训练样本集中的多个样本视频数据的质量特征信息。In a specific implementation, the quality feature information of the plurality of sample video data in the training sample set may be separately extracted first.
作为本申请实施例的一种示例,所述质量特征信息可以包括图像像素特征信息,连续帧图像物体迁移特征信息,连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息,和/或,图像旋转算子特征信息。As an example of the embodiment of the present application, the quality feature information may include image pixel feature information, continuous frame image object migration feature information, continuous frame image motion feature information, different frequency domain feature information of the image frame, and image frame wavelet transform. Feature information, and/or image rotation operator feature information.
下面逐一对上述六种特征信息的提取方式作一说明。The following describes a method for extracting the above six kinds of feature information one by one.
在本申请实施例中,对于图像像素特征信息,可以提取每个样本视频数据的每一帧图像的像素信息,然后分别对所述像素信息进行卷积运算和池化处理,以获得图像像素特征信息。In the embodiment of the present application, for image pixel feature information, pixel information of each frame image of each sample video data may be extracted, and then the pixel information is separately subjected to convolution operation and pooling processing to obtain image pixel features. information.
通常,图像是通过截取视频片段的每一帧获得的,因此,可以分别抽取每一帧图像中的像素信息,作为待处理的特征集合,然后对所述特征集合中的像素信息进行卷积运算,并对卷积运算后获得的特征集合进一步进行池化处理(max-pooling),从而获得图像像素特征信息。Generally, an image is obtained by intercepting each frame of a video segment. Therefore, pixel information in each frame image can be extracted separately as a feature set to be processed, and then the pixel information in the feature set is convoluted. And further performing pooling processing (max-pooling) on the feature set obtained after the convolution operation, thereby obtaining image pixel feature information.
本申请实施例对图像像素进行处理后,可以获得像素信息的最显著描述,在处理之后,相应的特征不仅维度降低了,而且更能表达图像原有的语义含义。After the image pixels are processed in the embodiment of the present application, the most salient description of the pixel information can be obtained; after the processing, the corresponding features not only have a reduced dimension, but also better express the original semantic meaning of the image.
在本申请实施例中,可以通过识别每个样本视频数据的每一帧图像中的物体对象,然后分别确定相邻两帧图像中的物体对象出现的次数和频率,以获得连续帧图像物体迁移特征信息。In the embodiment of the present application, the objects in each frame image of each sample video data may be identified, and then the number and frequency of occurrences of the objects in two adjacent frames of images may be respectively determined to obtain the continuous-frame image object migration feature information.
在具体实现中,可以分别对每一帧图像进行序列分析,对各帧图像中的物体对象进行识别抽取,然后按照各帧的时间先后顺序进行排序,进而确定出相邻两帧图像中的物体对象出现的次数、频率,以及物体对象之间关联出现的次数、关联出现的概率等信息,作为连续帧图像物体迁移特征信息。In a specific implementation, sequence analysis may be performed on each frame image separately, the objects in each frame image are identified and extracted, and the frames are then sorted in chronological order; the number of times and the frequency with which an object appears in two adjacent frames of images, as well as information such as the number of times objects appear in association and the probability of such associated appearances, are then determined as the continuous-frame image object migration feature information.
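The counting of object occurrences across adjacent frames might be sketched as follows. The per-frame detections here are hypothetical labels, and the actual object detector and the exact statistics used by the application are not specified; this sketch only shows how persistence counts and frequencies over adjacent frame pairs could be derived.

```python
from collections import Counter

def migration_features(frame_objects):
    """Count how often each detected object persists between adjacent frames,
    and turn the counts into frequencies over all adjacent frame pairs."""
    pairs = list(zip(frame_objects, frame_objects[1:]))
    persist = Counter()
    for prev, cur in pairs:
        for obj in set(prev) & set(cur):  # object appears in both adjacent frames
            persist[obj] += 1
    n = max(len(pairs), 1)
    return {obj: cnt / n for obj, cnt in persist.items()}

# Hypothetical detections for four consecutive frames.
frames = [{"cat", "sofa"}, {"cat", "sofa"}, {"cat"}, {"dog"}]
feats = migration_features(frames)
```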
需要说明的是,在确定相邻两帧图像中的物体对象出现的次数、频率等信息时,可以根据实际需要选择部分的相邻图像帧,本申请实施例对选择的相邻图像帧的数量不作限定。It should be noted that, when determining information such as the number of times and the frequency with which an object appears in two adjacent frames of images, some of the adjacent image frames may be selected according to actual needs; the number of adjacent image frames selected is not limited in this embodiment of the present application.
与连续帧图像物体迁移特征信息的提取方式类似,本申请实施例在提取连续帧图像动作特征信息时,可以通过识别每个样本视频数据的每一帧图像中的动作对象的形状特征,然后分别确定相邻两帧图像中的动作对象的形状特征的几何参数,以获得连续帧图像动作特征信息。Similar to the extraction of the continuous-frame image object migration feature information, when extracting the continuous-frame image motion feature information, the embodiment of the present application may identify the shape feature of the action object in each frame image of each sample video data, and then respectively determine the geometric parameters of the shape features of the action objects in two adjacent frames of images to obtain the continuous-frame image motion feature information.
例如,可以分别对每一帧图像中的动作对象进行识别,并确定出该动作对象的几何形状边界,然后将每一帧图像中的动作的几何形状边界与前一帧图像中的动作的几何形状边界进行比较,按照几何仿射变换计算动作对象的形状特征的几何参数,并将该几何参数作为连续帧图像动作特征信息。For example, the action object in each frame image may be separately identified and its geometric shape boundary determined; the geometric shape boundary of the action in each frame image is then compared with that in the previous frame image, the geometric parameters of the shape feature of the action object are calculated according to a geometric affine transformation, and these geometric parameters are used as the continuous-frame image motion feature information.
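One way to realize the affine comparison of shape boundaries is a least-squares fit of the transform mapping one frame's boundary points onto the next. The boundary points below are hypothetical, and this is only a sketch of the idea, not the application's claimed computation.

```python
import numpy as np

def estimate_affine(prev_pts, cur_pts):
    """Least-squares estimate of the 2x3 affine transform mapping the object's
    boundary points in the previous frame onto those in the current frame."""
    ones = np.ones((len(prev_pts), 1))
    A = np.hstack([prev_pts, ones])               # (n, 3) homogeneous coordinates
    params, *_ = np.linalg.lstsq(A, cur_pts, rcond=None)
    return params.T                               # rows: [a, b, tx], [c, d, ty]

# Hypothetical boundary points of an action object, shifted by (2, -1) between frames.
prev_pts = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
cur_pts = prev_pts + np.array([2., -1.])
M = estimate_affine(prev_pts, cur_pts)            # geometric parameters as features
```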
在本申请实施例中,对于图像帧不同的频域特征信息,可以通过确定每个样本视频数据的每一帧图像的幅值和相位,然后分别确定相邻两帧图像的幅值差和相位差,以获得图像帧不同的频域特征信息。In the embodiment of the present application, for the frequency domain difference feature information of image frames, the amplitude and phase of each frame image of each sample video data may be determined, and then the amplitude difference and the phase difference between two adjacent frames of images may be respectively determined to obtain the frequency domain difference feature information of the image frames.
在具体实现中,可以首先对每一帧图像做傅里叶变换并抽取频谱系特征,然后抽取各个多个不同频谱系的幅值,相位特征,将这些特征都作为每一帧图像的特征集合,然后对于相邻两帧的幅值差和相位差异性进行计算,得到相邻两帧图像的幅值差和相位差。In a specific implementation, a Fourier transform may first be performed on each frame image and the spectral features extracted; the amplitude and phase features of each of a plurality of different spectral bands are then extracted and taken together as the feature set of each frame image, and the amplitude difference and phase difference between two adjacent frames of images are then calculated.
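The amplitude and phase differences of adjacent frames' Fourier transforms could be computed along these lines. This is a minimal NumPy sketch: the grouping into separate spectral bands described above is omitted, and the test frames are hypothetical.

```python
import numpy as np

def freq_features(prev_frame, cur_frame):
    """Per-coefficient amplitude and phase differences of adjacent frames' 2-D FFTs."""
    f1, f2 = np.fft.fft2(prev_frame), np.fft.fft2(cur_frame)
    amp_diff = np.abs(f2) - np.abs(f1)        # amplitude difference
    phase_diff = np.angle(f2) - np.angle(f1)  # phase difference
    return amp_diff, phase_diff

# Two identical hypothetical frames: both differences should vanish.
frame = np.random.default_rng(0).random((8, 8))
amp_d, phase_d = freq_features(frame, frame)
```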
对于小波变换特征信息,本申请实施例可以通过确定每个样本视频数据的每一帧图像的小波系数,然后分别确定相邻两帧图像的小波系数的变化值,以获得图像帧小波变换特征信息。For the wavelet transform feature information, the embodiment of the present application may determine the wavelet coefficients of each frame image of each sample video data, and then respectively determine the change values of the wavelet coefficients between two adjacent frames of images to obtain the image frame wavelet transform feature information.
具体地,可以对每一帧图像做小波变换处理,获得相应的小波系数,然后将各帧图像按照时间先后进行排序,分别计算相邻两帧图像之间的小波系数的变化情况,抽取小波系数变化的差值作为小波变换特征信息。Specifically, wavelet transform processing may be performed on each frame image to obtain the corresponding wavelet coefficients; the frames are then sorted in chronological order, the changes of the wavelet coefficients between two adjacent frames of images are respectively calculated, and the differences of the wavelet coefficient changes are extracted as the wavelet transform feature information.
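A minimal sketch of the wavelet-coefficient difference, using a hand-rolled single-level Haar transform in place of whichever wavelet basis the application actually uses (which it does not specify); the two frames are hypothetical.

```python
import numpy as np

def haar_coeffs(frame):
    """Single-level 1-D Haar transform along rows: average band + detail band."""
    even, odd = frame[:, ::2], frame[:, 1::2]
    return np.hstack([(even + odd) / 2.0, (even - odd) / 2.0])

def wavelet_change(prev_frame, cur_frame):
    """Difference of adjacent frames' wavelet coefficients, used as a feature."""
    return haar_coeffs(cur_frame) - haar_coeffs(prev_frame)

# Hypothetical adjacent frames: all-zero followed by all-one.
prev, cur = np.zeros((4, 4)), np.ones((4, 4))
delta = wavelet_change(prev, cur)  # average band changes, detail band does not
```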
在本申请实施例中,对于图像旋转算子特征信息,可以首先确定每个样本视频数据的每一帧图像的旋转算子,然后分别确定相邻两帧图像的旋转算子的变化值,获得图像旋转算子特征信息。In the embodiment of the present application, for the image rotation operator feature information, the rotation operator of each frame image of each sample video data may first be determined, and then the change values of the rotation operators between two adjacent frames of images may be respectively determined to obtain the image rotation operator feature information.
具体地,可以首先计算每一帧图像的旋转算子,然后将各帧图像按照时间先后进行排序,确定相邻两帧图像之间的旋转算子的变化值,得到图像旋转算子特征信息。Specifically, the rotation operator of each frame image may be first calculated, and then each frame image is sorted in time series, and the change value of the rotation operator between the adjacent two frames of images is determined to obtain image rotation operator feature information.
在具体实现中,计算每一帧图像的旋转算子可以采用SIFT(Scale-invariant feature transform,尺度不变特征转换)算法,该算法是一种检测局部特征的算法,通过求一幅图中的特征点及其尺度和方向描述子得到特征并进行图像特征点匹配,其实质是在不同的尺度空间上查找关键点(特征点),并计算出关键点的方向。In a specific implementation, the rotation operator of each frame image may be calculated by using the SIFT (Scale-invariant feature transform) algorithm, an algorithm for detecting local features that obtains features from the feature points of an image together with their scale and direction descriptors and performs image feature point matching; its essence is to find key points (feature points) in different scale spaces and calculate the directions of the key points.
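Full SIFT is well beyond a short sketch. As a rough stand-in for a per-frame rotation descriptor, the following computes the dominant gradient orientation of a frame, which echoes SIFT's orientation-assignment step; it is a simplification for illustration, not the SIFT algorithm itself, and the test frame is hypothetical.

```python
import numpy as np

def dominant_orientation(frame):
    """Rough stand-in for a rotation descriptor: magnitude-weighted histogram of
    gradient directions, returning the lower edge of the dominant bin (degrees)."""
    gy, gx = np.gradient(frame.astype(float))   # derivatives along rows, columns
    angles = np.degrees(np.arctan2(gy, gx)) % 360.0
    hist, edges = np.histogram(angles, bins=36, range=(0, 360),
                               weights=np.hypot(gx, gy))
    return edges[np.argmax(hist)]

# A horizontal ramp: gradients point along +x, i.e. 0 degrees.
ramp = np.tile(np.arange(8, dtype=float), (8, 1))
angle = dominant_orientation(ramp)
```

The per-frame angles could then be differenced between adjacent frames, as described above for the rotation operator.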
以上对如何提取视频数据的图像像素特征信息,连续帧图像物体迁移特征信息,连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息和图像旋转算子特征信息进行了介绍,本领域技术人员还可以采用其他方式抽取上述特征信息,本申请实施例对此不作限定。The foregoing describes how to extract the image pixel feature information, continuous-frame image object migration feature information, continuous-frame image motion feature information, frequency domain difference feature information of image frames, image frame wavelet transform feature information, and image rotation operator feature information of the video data; those skilled in the art may also extract the above feature information in other manners, which is not limited in this embodiment of the present application.
步骤202,采用所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行训练,生成视频数据检测模型;Step 202: Perform training by using the quality feature information of the plurality of positive sample video data and negative sample video data to generate a video data detection model.
在分别获得样本视频数据的多种类型的质量特征信息后,可以采用所述质量特征信息进行模型训练,从而生成视频数据检测模型。After obtaining the plurality of types of quality feature information of the sample video data respectively, the quality feature information may be used for model training to generate a video data detection model.
在具体实现中,可以首先对所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行归一化处理,获得归一化的质量特征信息,并补全所述归一化的质量特征信息的缺失值,然后从所述归一化的质量特征信息中识别出目标质量特征信息,进而采用所述目标质量特征信息进行神经网络模型训练,生成视频数据检测模型。In a specific implementation, the quality feature information of the plurality of positive sample video data and negative sample video data may first be normalized to obtain normalized quality feature information, and the missing values of the normalized quality feature information may be filled in; the target quality feature information is then identified from the normalized quality feature information, and neural network model training is performed by using the target quality feature information to generate the video data detection model.
在本申请实施例中,识别目标质量特征信息可以是筛选出高判别性的特征信息,具体地,可以首先确定所述归一化的质量特征信息的信息熵。由于信息熵越大的特征,蕴含的信息也越丰富,进而特征的重要性也越大,越应该保留,因此,可以识别所述信息熵超过第一预设阈值的质量特征信息为目标质量特征信息。In the embodiment of the present application, identifying the target quality feature information may be screening out highly discriminative feature information. Specifically, the information entropy of the normalized quality feature information may first be determined. The larger the information entropy of a feature, the richer the information it carries and thus the greater its importance, so the more it should be retained; therefore, the quality feature information whose information entropy exceeds a first preset threshold may be identified as the target quality feature information.
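The entropy-based screening of normalized features might look like the following sketch. The histogram binning, the entropy threshold, and the feature matrix are all assumptions for illustration; the application does not fix any of them.

```python
import numpy as np

def select_by_entropy(features, threshold):
    """Keep feature columns whose histogram-based entropy exceeds the threshold.

    `features`: (n_samples, n_features) matrix, already normalized to [0, 1].
    """
    keep = []
    for j in range(features.shape[1]):
        hist, _ = np.histogram(features[:, j], bins=10, range=(0, 1))
        p = hist / hist.sum()
        p = p[p > 0]
        entropy = -np.sum(p * np.log2(p))  # larger entropy -> richer information
        if entropy > threshold:
            keep.append(j)
    return keep

rng = np.random.default_rng(1)
X = np.column_stack([rng.random(100),     # spread-out feature -> high entropy
                     np.full(100, 0.5)])  # constant feature   -> zero entropy
cols = select_by_entropy(X, threshold=1.0)
```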
在本申请实施例中,在生成视频数据检测模型时,还可以融合进用户的个性化的特征信息,从而使对待检测的视频数据进行识别时,能够对视频数据的评价与用户属性相结合,提高推荐视频数据的针对性和有效性。In the embodiment of the present application, when the video data detection model is generated, the personalized feature information of users may also be integrated, so that when the video data to be detected is identified, the evaluation of the video data can be combined with the user attributes, improving the relevance and effectiveness of the recommended video data.
在具体实现中,可以获取多个用户的属性信息,然后根据所述属性信息,将所述多个用户聚类为多个用户群体,所述用户群体具有相应的用户标签,从而在对训练样本集中的视频数据进行模型训练时,可以有效融合用户的属性信息。In a specific implementation, the attribute information of multiple users may be acquired, and the multiple users may then be clustered into multiple user groups according to the attribute information, where each user group has a corresponding user label, so that the attribute information of the users can be effectively integrated when model training is performed on the video data in the training sample set.
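The clustering of users by attribute information could be sketched with a minimal k-means. The attribute vectors, the choice of k, and k-means itself are assumptions; the application does not name a clustering algorithm.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: cluster users' numeric attribute vectors into k groups."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each user to the nearest center, then recompute the centers.
        labels = np.argmin(((points[:, None] - centers) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

# Hypothetical 2-D user attributes (e.g. age, purchase frequency), two clumps.
users = np.array([[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [7.8, 8.1]])
groups = kmeans(users, k=2)  # each resulting group would get a user label
```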
步骤203,获取一个或多个待检测的视频数据;Step 203: Acquire one or more video data to be detected.
在本申请实施例中,所述待检测的视频数据可以是在视频库中根据某种规则提取多个视频帧实时合成的视频片段。例如,在电子商务网站使用视频内容进行导购及营销时,可以根据输入的文本内容,从海量的视频库中提取出与所述文本内容相匹配的多个视频帧,然后将所述多个视频帧按照一定规则组合成视频片段。当然,本领域技术人员还可以采用其他方式确定待检测的视频数据,例如,所述待检测的视频数据也可以是从各种途径获取的现成的视频片段,本申请实施例对此不作限定。In the embodiment of the present application, the video data to be detected may be a video segment synthesized in real time by extracting a plurality of video frames from a video library according to a certain rule. For example, when an e-commerce website uses video content for shopping guidance and marketing, a plurality of video frames matching input text content may be extracted from a massive video library according to the text content, and the plurality of video frames are then combined into a video segment according to certain rules. Of course, those skilled in the art may also determine the video data to be detected in other manners; for example, the video data to be detected may also be an off-the-shelf video segment obtained through various channels, which is not limited in this embodiment of the present application.
步骤204,分别提取每个待检测的视频数据的质量特征信息;Step 204: Extract quality feature information of each video data to be detected, respectively.
与样本视频数据类似,待检测的视频数据的质量特征信息也可以包括图像像素特征信息,连续帧图像物体迁移特征信息,连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息,和/或,图像旋转算子特征信息。Similar to the sample video data, the quality feature information of the video data to be detected may also include image pixel feature information, continuous-frame image object migration feature information, continuous-frame image motion feature information, frequency domain difference feature information of image frames, image frame wavelet transform feature information, and/or image rotation operator feature information.
对于上述质量特征信息的提取方法可以参见步骤201,本步骤对此不再赘述。For the method for extracting the foregoing quality feature information, refer to step 201, which is not described in this step.
步骤205,采用预设的视频数据检测模型分别对所述一个或多个待检测的视频数据的质量特征信息进行识别,以获得所述一个或多个待检测的视频数据的质量分值;Step 205: Identify, by using a preset video data detection model, the quality feature information of the one or more video data to be detected to obtain a quality score of the one or more video data to be detected.
在具体实现中,在完成视频检测模型的构建,以及待检测视频数据的质量特征信息的提取后,便可以采用已经训练好的视频检测模型对所述质量特征信息进行识别,并依据识别结果对每一个待检测的视频数据进行评分,输出相应的质量分值。In a specific implementation, after the construction of the video data detection model and the extraction of the quality feature information of the video data to be detected are completed, the trained video data detection model may be used to identify the quality feature information, each video data to be detected is scored according to the recognition result, and a corresponding quality score is output.
步骤206,提取所述质量分值超过第二预设阈值的视频数据为目标视频数据;Step 206: Extract video data whose quality score exceeds a second preset threshold as target video data.
通常,质量分值越高,其对应的视频数据的质量越好,该视频数据的流畅度和连贯性也较好、各个视频帧之间的整体风格也会相对较一致。因此,可以将质量分值超过第二预设阈值的视频数据提取为目标视频数据。本领域技术人员可以根据实际需要确定第二预设阈值的大小,本申请实施例对此不作限定。当然,还可以直接选择质量分值最高的视频数据作为目标视频数据,本申请实施例对此亦不作限定。Generally, the higher the quality score, the better the quality of the corresponding video data: its fluency and coherence are better, and the overall style across its video frames is relatively more consistent. Therefore, the video data whose quality score exceeds the second preset threshold may be extracted as the target video data. Those skilled in the art may determine the size of the second preset threshold according to actual needs, which is not limited in this embodiment of the present application. Of course, the video data with the highest quality score may also be directly selected as the target video data, which is likewise not limited in this embodiment of the present application.
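The threshold-based selection of target video data reduces to a simple filter over the model's scores. The video ids, scores, and threshold below are hypothetical.

```python
def pick_targets(scores, threshold):
    """Select the ids of videos whose model quality score exceeds the threshold."""
    return [vid for vid, s in scores.items() if s > threshold]

# Hypothetical quality scores produced by the detection model.
scores = {"v1": 0.91, "v2": 0.42, "v3": 0.77}
targets = pick_targets(scores, threshold=0.75)
```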
步骤207,在所述多个用户群体中确定目标用户群体;Step 207: Determine a target user group among the plurality of user groups;
在本申请实施例中,由于在构建视频数据检测模型的过程中加入了用户的属性信息,因此,识别出的目标视频数据可以包括有相应的视频标签,以体现该视频数据的分类或其他信息。In the embodiment of the present application, since the attribute information of users is added in the process of constructing the video data detection model, the identified target video data may include a corresponding video tag to reflect the classification or other information of the video data.
在具体实现中,可以根据视频标签与用户群体的用户标签的比对,识别出该目标视频数据所针对的目标用户群体。例如,可以确定与所述目标视频数据的视频标签相同的用户标签所对应的用户群体为目标用户群体。当然,本领域技术人员还可以采用其他方式确定目标用户群体,本申请实施例对此不作限定。In a specific implementation, the target user group at which the target video data is aimed may be identified by comparing the video tag with the user tags of the user groups. For example, the user group whose user tag is the same as the video tag of the target video data may be determined as the target user group. Of course, those skilled in the art may also determine the target user group in other manners, which is not limited in this embodiment of the present application.
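Matching a target video's tags against the user groups' labels might be sketched as follows; the tags and group ids are hypothetical examples of the comparison described above.

```python
def match_groups(video_tags, group_tags):
    """Return the user groups whose label matches any of the video's tags."""
    return [g for g, tag in group_tags.items() if tag in video_tags]

# Hypothetical labels attached by the model and by the user clustering.
video_tags = {"sports", "outdoor"}
group_tags = {"g1": "sports", "g2": "cooking", "g3": "outdoor"}
targets = match_groups(video_tags, group_tags)
```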
步骤208,向所述目标用户群体推荐所述目标视频数据。 Step 208, recommend the target video data to the target user group.
在本申请实施例中,在分别确定目标视频数据和目标用户群体后,便可以将所述目标视频数据推荐给目标用户群体。In the embodiment of the present application, after the target video data and the target user group are separately determined, the target video data may be recommended to the target user group.
例如,对于电子商务网站的视频导购,可以在确定出优质的导购视频片段后,将该视频片段推荐给潜在的消费群体,提升用户服务体验,提高用户转化率。For example, for a video shopping guide of an e-commerce website, after determining a high-quality shopping guide video clip, the video clip can be recommended to a potential consumer group, improving the user service experience and improving the user conversion rate.
参照图4,示出了本申请的一种视频数据检测模型的生成方法实施例的步骤流程图,具体可以包括如下步骤:Referring to FIG. 4, a flow chart of the steps of a method for generating a video data detection model of the present application is shown, which may specifically include the following steps:
步骤401,分别提取多个样本视频数据的质量特征信息,所述多个样本视频数据包括多个正向样本视频数据和负向样本视频数据;Step 401: Separately extract the quality feature information of a plurality of sample video data, where the plurality of sample video data include a plurality of positive sample video data and negative sample video data;
步骤402,采用所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行训练,生成视频数据检测模型。Step 402: Perform training by using the quality feature information of the plurality of positive sample video data and negative sample video data to generate a video data detection model.
在本申请实施例中,所述质量特征信息可以包括图像像素特征信息,连续帧图像物体迁移特征信息,连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息,和/或,图像旋转算子特征信息。In the embodiment of the present application, the quality feature information may include image pixel feature information, continuous-frame image object migration feature information, continuous-frame image motion feature information, frequency domain difference feature information of image frames, image frame wavelet transform feature information, and/or image rotation operator feature information.
由于本实施例步骤401-步骤402中所述的视频数据检测模型生成方法与上述视频数据的推荐方法实施例二中步骤201-步骤202类似,可以相互参阅,本实施例对此不再赘述。Since the method for generating the video data detection model described in step 401 to step 402 of this embodiment is similar to step 201 to step 202 in the second embodiment of the video data recommendation method, they may be referred to each other, and details are not described again in this embodiment.
参照图5,示出了本申请的一种视频数据的识别方法实施例的步骤流程图,具体可以包括如下步骤:Referring to FIG. 5, a flow chart of steps of an embodiment of a method for identifying video data according to the present application is shown. Specifically, the method may include the following steps:
步骤501,获取一个或多个待检测的视频数据;Step 501: Acquire one or more video data to be detected.
在本申请实施例中,可以提供一用户界面,例如,在终端的显示屏上展现一交互界面,用户可以通过该交互界面,提交针对一个或多个视频数据的检测请求。所述视频数据可以是从各种途径获取的现成的视频片段,也可以是在视频库中根据某种规则提取多个视频帧实时合成的视频片段,本申请实施例对视频数据的具体来源和类型不作限定。In the embodiment of the present application, a user interface may be provided; for example, an interactive interface is presented on the display screen of a terminal, through which a user may submit a detection request for one or more video data. The video data may be an off-the-shelf video segment obtained through various channels, or may be a video segment synthesized in real time by extracting a plurality of video frames from a video library according to a certain rule; the specific source and type of the video data are not limited in this embodiment of the present application.
步骤502,将所述一个或多个待检测的视频数据发送至服务器,所述服务器用于分别对所述一个或多个待检测的视频数据进行识别,以获得识别结果,所述识别结果包括一个或多个候选视频数据;Step 502: Send the one or more video data to be detected to a server, where the server is configured to separately identify the one or more video data to be detected to obtain a recognition result, and the recognition result includes one or more candidate video data;
当用户在提交针对视频数据的检测请求后,终端可以将一个或多个待检测的视频数据发送至服务器,由所述服务器完成对上述视频数据的识别,以获得相应的识别结果。After the user submits the detection request for the video data, the terminal may send one or more video data to be detected to the server, and the server completes the identification of the video data to obtain a corresponding recognition result.
在本申请实施例中,所述识别结果可以包括一个或多个候选视频数据,每个候选视频数据均包括有相应的质量分值。In this embodiment of the present application, the identification result may include one or more candidate video data, and each candidate video data includes a corresponding quality score.
在具体实现中,服务器对一个或多个待检测的视频数据进行识别的过程,与前述实施例中步骤201-步骤205类似,可以相互参照,本实施例对此不再赘述。In a specific implementation, the process of identifying the one or more video data to be detected by the server is similar to the step 201 to step 205 in the foregoing embodiment, and may be referred to each other.
步骤503,接收所述服务器返回的所述一个或多个候选视频数据;Step 503: Receive the one or more candidate video data returned by the server.
在本申请实施例中,服务器在完成对待检测视频数据的识别,获得识别结果后,可以将所述识别结果中包括的一个或多个候选视频数据返回给终端。In the embodiment of the present application, after the server completes the identification of the video data to be detected, and obtains the recognition result, the server may return one or more candidate video data included in the identification result to the terminal.
步骤504,在所述一个或多个候选视频数据中确定目标视频数据;Step 504: Determine target video data in the one or more candidate video data.
在本申请实施例中,由于候选视频数据具有相应的质量分值,因此,可以根据质量分值的高低,确定出目标视频数据。In the embodiment of the present application, since the candidate video data has a corresponding quality score, the target video data may be determined according to the level of the quality score.
在一种示例中,质量分值越高,可以认为对应的视频数据的质量越好,因此,可以以质量分值最高的视频数据作为目标视频数据;或者,可以从质量分值超过某一阈值的多个候选视频数据中确定出一筛选范围,然后进一步根据业务的实际需求,从该范围内的多个候选视频数据中确定出目标视频数据,本申请实施例对确定目标视频数据的具体方式不作限定。当然,目标视频数据可以不止一个,也可以有多个,本申请对此亦不作限定。In an example, the higher the quality score, the better the quality of the corresponding video data may be considered; therefore, the video data with the highest quality score may be used as the target video data. Alternatively, a screening range may be determined from the plurality of candidate video data whose quality scores exceed a certain threshold, and the target video data may then be further determined from the plurality of candidate video data within that range according to the actual requirements of the service; the specific manner of determining the target video data is not limited in this embodiment of the present application. Of course, there may be one target video data or there may be multiple, which is likewise not limited in this application.
需要说明的是,目标视频数据可以是由终端根据用户输入的信息自行确定的,可以是用户在多个候选视频数据中具体选定的,本申请实施例对此不作限定。It should be noted that the target video data may be determined by the terminal itself according to the information input by the user, or may be specifically selected by the user from the plurality of candidate video data, which is not limited in this embodiment of the present application.
步骤505,展现所述目标视频数据。 Step 505, presenting the target video data.
当确定出目标视频数据后,终端可以在交互界面上展现所述目标视频数据,例如,可以展现目标视频数据的具体信息,或者直接播放该目标视频数据,本申请实施例对此不作限定。After the target video data is determined, the terminal may display the target video data on the interaction interface, for example, the specific information of the target video data may be displayed, or the target video data may be directly played, which is not limited in this embodiment of the present application.
在本申请实施例中,通过在终端上提供一交互界面,从而用户可以通过该交互界面直接提交对视频数据的识别请求,并由服务器对该识别请求所针对的视频数据进行识别,使得用户可以根据实际需要完成对视频数据的检测,提高了用户对视频数据的质量的判断便捷性。In the embodiment of the present application, by providing an interaction interface on the terminal, the user can directly submit the identification request for the video data through the interaction interface, and the server identifies the video data targeted by the identification request, so that the user can The detection of the video data is completed according to actual needs, and the convenience of the user to judge the quality of the video data is improved.
需要说明的是,对于方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请实施例并不受所描述的动作顺序的限制,因为依据本申请实施例,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作并不一定是本申请实施例所必须的。It should be noted that, for the method embodiments, for the sake of simple description, they are all expressed as a series of action combinations; however, those skilled in the art should understand that the embodiments of the present application are not limited by the described action sequence, because according to the embodiments of the present application, certain steps may be performed in other sequences or concurrently. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present application.
参照图6,示出了本申请的一种视频数据的推荐装置实施例的结构框图,具体可以包括如下模块:Referring to FIG. 6, a structural block diagram of an embodiment of a recommendation apparatus for video data of the present application is shown, which may specifically include the following modules:
获取模块601,用于获取一个或多个待检测的视频数据;The obtaining module 601 is configured to acquire one or more video data to be detected;
提取模块602,用于分别提取每个待检测的视频数据的质量特征信息;The extracting module 602 is configured to separately extract quality feature information of each video data to be detected;
识别模块603,用于采用预设的视频数据检测模型对所述质量特征信息进行识别,以获得目标视频数据;The identification module 603 is configured to identify the quality feature information by using a preset video data detection model to obtain target video data.
推荐模块604,用于向用户推荐所述目标视频数据。The recommendation module 604 is configured to recommend the target video data to the user.
在本申请实施例中,所述预设的视频数据检测模型可以通过调用如下模块生成:In this embodiment of the present application, the preset video data detection model may be generated by calling the following module:
质量特征信息提取模块,用于分别提取多个样本视频数据的质量特征信息,所述多个样本视频数据可以包括多个正向样本视频数据和负向样本视频数据;a quality feature information extraction module, configured to separately extract the quality feature information of a plurality of sample video data, where the plurality of sample video data may include a plurality of positive sample video data and negative sample video data;
视频数据检测模型生成模块,用于采用所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行训练,生成视频数据检测模型。a video data detection model generating module, configured to perform training by using the quality feature information of the plurality of positive sample video data and negative sample video data to generate a video data detection model.
在本申请实施例中,所述质量特征信息可以包括图像像素特征信息,连续帧图像物体迁移特征信息,连续帧图像动作特征信息,图像帧不同的频域特征信息,图像帧小波变换特征信息,和/或,图像旋转算子特征信息。In the embodiment of the present application, the quality feature information may include image pixel feature information, continuous-frame image object migration feature information, continuous-frame image motion feature information, frequency domain difference feature information of image frames, image frame wavelet transform feature information, and/or image rotation operator feature information.
在本申请实施例中,所述质量特征信息提取模块具体可以包括如下子模块:In the embodiment of the present application, the quality feature information extraction module may specifically include the following submodules:
像素信息提取子模块,用于提取每个样本视频数据的每一帧图像的像素信息;a pixel information extraction submodule, configured to extract pixel information of each frame image of each sample video data;
像素信息处理子模块,用于分别对所述像素信息进行卷积运算和池化处理,以获得图像像素特征信息。The pixel information processing sub-module is configured to perform convolution operation and pooling processing on the pixel information to obtain image pixel feature information.
在本申请实施例中,所述质量特征信息提取模块还可以包括如下子模块:In the embodiment of the present application, the quality feature information extraction module may further include the following sub-modules:
物体对象识别子模块,用于识别每个样本视频数据的每一帧图像中的物体对象;An object object recognition sub-module for identifying an object object in each frame image of each sample video data;
物体对象处理子模块,用于分别确定相邻两帧图像中的物体对象出现的次数和频率,以获得连续帧图像物体迁移特征信息。The object object processing sub-module is configured to respectively determine the number and frequency of occurrences of the object objects in the adjacent two frames of images to obtain continuous frame image object migration feature information.
在本申请实施例中,所述质量特征信息提取模块还可以包括如下子模块:In the embodiment of the present application, the quality feature information extraction module may further include the following sub-modules:
动作对象识别子模块,用于识别每个样本视频数据的每一帧图像中的动作对象的形状特征;a motion object recognition submodule, configured to identify a shape feature of the motion object in each frame image of each sample video data;
动作对象处理子模块,用于分别确定相邻两帧图像中的动作对象的形状特征的几何参数,以获得连续帧图像动作特征信息。The action object processing sub-module is configured to respectively determine geometric parameters of the shape features of the action objects in the adjacent two frames of images to obtain continuous frame image action feature information.
在本申请实施例中,所述质量特征信息提取模块还可以包括如下子模块:In the embodiment of the present application, the quality feature information extraction module may further include the following sub-modules:
幅值和相位确定子模块,用于确定每个样本视频数据的每一帧图像的幅值和相位;An amplitude and phase determination sub-module for determining a magnitude and a phase of each frame image of each sample video data;
幅值和相位处理子模块,用于分别确定相邻两帧图像的幅值差和相位差,以获得图像帧不同的频域特征信息。The amplitude and phase processing sub-module is configured to respectively determine the amplitude difference and the phase difference between two adjacent frames of images to obtain the frequency domain difference feature information of the image frames.
在本申请实施例中,所述质量特征信息提取模块还可以包括如下子模块:In the embodiment of the present application, the quality feature information extraction module may further include the following sub-modules:
小波系数确定子模块,用于确定每个样本视频数据的每一帧图像的小波系数;a wavelet coefficient determining submodule for determining a wavelet coefficient of each frame image of each sample video data;
小波系数处理子模块,用于分别确定相邻两帧图像的小波系数的变化值,以获得图像帧小波变换特征信息。The wavelet coefficient processing sub-module is configured to respectively determine the variation values of the wavelet coefficients of the adjacent two frames of images to obtain image frame wavelet transform feature information.
在本申请实施例中,所述质量特征信息提取模块还可以包括如下子模块:In the embodiment of the present application, the quality feature information extraction module may further include the following sub-modules:
旋转算子确定子模块,用于确定每个样本视频数据的每一帧图像的旋转算子;a rotation operator determining sub-module for determining a rotation operator of each frame image of each sample video data;
旋转算子处理子模块,用于分别确定相邻两帧图像的旋转算子的变化值,以获得图像旋转算子特征信息。The rotation operator processing sub-module is configured to respectively determine a variation value of a rotation operator of the adjacent two frames of images to obtain image rotation operator feature information.
在本申请实施例中,所述视频数据检测模型生成模块具体可以包括如下子模块:In the embodiment of the present application, the video data detection model generating module may specifically include the following submodules:
归一化处理子模块,用于对所述多个正向样本视频数据和负向样本视频数据的质量特征信息进行归一化处理,以获得归一化的质量特征信息;a normalization processing sub-module, configured to normalize the quality feature information of the plurality of positive sample video data and negative sample video data to obtain normalized quality feature information;
缺失值补全子模块,用于补全所述归一化的质量特征信息的缺失值;a missing value completion sub-module for complementing the missing value of the normalized quality feature information;
目标质量特征信息识别子模块,用于从所述归一化的质量特征信息中识别出目标质量特征信息;a target quality feature information identifying submodule, configured to identify target quality feature information from the normalized quality feature information;
视频数据检测模型生成子模块,用于采用所述目标质量特征信息进行神经网络模型训练,生成视频数据检测模型。The video data detection model generation submodule is configured to perform neural network model training by using the target quality feature information, and generate a video data detection model.
在本申请实施例中,所述目标质量特征信息识别子模块具体可以包括如下单元:In the embodiment of the present application, the target quality feature information identifying submodule may specifically include the following units:
信息熵确定单元,用于确定所述归一化的质量特征信息的信息熵;An information entropy determining unit, configured to determine an information entropy of the normalized quality feature information;
目标质量特征信息识别单元,用于识别所述信息熵超过第一预设阈值的质量特征信息为目标质量特征信息。The target quality feature information identifying unit is configured to identify the quality feature information that the information entropy exceeds the first preset threshold as the target quality feature information.
在本申请实施例中,生成所述预设的视频数据检测模型还可以调用如下模块:In the embodiment of the present application, generating the preset video data detection model may also invoke the following modules:
属性信息获取模块,用于获取多个用户的属性信息;An attribute information obtaining module, configured to acquire attribute information of multiple users;
用户群体聚类模块,用于根据所述属性信息,将所述多个用户聚类为多个用户群体,所述用户群体具有相应的用户标签。The user group clustering module is configured to cluster the plurality of users into a plurality of user groups according to the attribute information, where the user group has a corresponding user label.
在本申请实施例中,所述识别模块603具体可以包括如下子模块:In the embodiment of the present application, the identification module 603 may specifically include the following sub-modules:
质量特征信息识别子模块,用于采用预设的视频数据检测模型分别对所述一个或多个待检测的视频数据的质量特征信息进行识别,以获得所述一个或多个待检测的视频数据的质量分值;a quality feature information identifying sub-module, configured to separately identify, by using a preset video data detection model, the quality feature information of the one or more video data to be detected, to obtain the quality scores of the one or more video data to be detected;
目标视频数据提取子模块,用于提取所述质量分值超过第二预设阈值的视频数据为 目标视频数据。The target video data extraction sub-module is configured to extract video data whose quality score exceeds a second preset threshold as target video data.
In an embodiment of the present application, the recommendation module 604 may specifically include the following submodules:
a target user group determination submodule, configured to determine a target user group among the multiple user groups;
a target video data recommendation submodule, configured to recommend the target video data to the target user group.
In an embodiment of the present application, the target video data may have a corresponding video label, and the target user group determination submodule may specifically include the following unit:
a target user group determination unit, configured to determine, as the target user group, the user group whose user label is the same as the video label of the target video data.
Referring to FIG. 7, a structural block diagram of an embodiment of an apparatus for generating a video data detection model according to the present application is shown, which may specifically include the following modules:
a quality feature information extraction module 701, configured to separately extract quality feature information from multiple sample video data, where the multiple sample video data may include multiple positive sample video data and negative sample video data;
a video data detection model generation module 702, configured to perform training with the quality feature information of the multiple positive and negative sample video data to generate a video data detection model.
In an embodiment of the present application, the quality feature information may include image pixel feature information, object migration feature information across consecutive frames, motion feature information across consecutive frames, inter-frame frequency domain feature information, image frame wavelet transform feature information, and/or image rotation operator feature information.
Referring to FIG. 8, a structural block diagram of an embodiment of an apparatus for identifying video data according to the present application is shown, which may specifically include the following modules:
an acquisition module 801, configured to acquire one or more pieces of video data to be detected;
a sending module 802, configured to send the one or more pieces of video data to be detected to a server, where the server is configured to identify each piece of video data to be detected to obtain a recognition result, and the recognition result may include one or more candidate video data;
a receiving module 803, configured to receive the one or more candidate video data returned by the server;
a determination module 804, configured to determine target video data among the one or more candidate video data;
a presentation module 805, configured to present the target video data.
Since the apparatus embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding parts of the method embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that the embodiments have in common, reference may be made to one another.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Accordingly, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
In a typical configuration, the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include volatile memory, random access memory (RAM), and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium. Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of the methods, terminal devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing terminal device, so that a series of operational steps are performed on the computer or other programmable terminal device to produce computer-implemented processing; the instructions executed on the computer or other programmable terminal device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", or any other variants thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The video data recommendation method, the video data recommendation apparatus, the method for generating a video data detection model, the apparatus for generating a video data detection model, the video data identification method, and the video data identification apparatus provided by the present application have been described in detail above. Specific examples have been used herein to illustrate the principles and implementations of the present application; the description of the above embodiments is only intended to help understand the method of the present application and its core idea. Meanwhile, for a person of ordinary skill in the art, there will be changes in the specific implementations and the scope of application in accordance with the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.
Claims (21)
- A method for recommending video data, comprising: acquiring one or more pieces of video data to be detected; separately extracting quality feature information from each piece of video data to be detected; identifying the quality feature information with a preset video data detection model to obtain target video data; and recommending the target video data to a user.
- The method according to claim 1, wherein the preset video data detection model is generated by: separately extracting quality feature information from multiple sample video data, the multiple sample video data including multiple positive sample video data and negative sample video data; and performing training with the quality feature information of the multiple positive and negative sample video data to generate the video data detection model.
- The method according to claim 2, wherein the quality feature information includes image pixel feature information, object migration feature information across consecutive frames, motion feature information across consecutive frames, inter-frame frequency domain feature information, image frame wavelet transform feature information, and/or image rotation operator feature information.
- The method according to claim 3, wherein the step of separately extracting quality feature information from the multiple sample video data includes: extracting pixel information from each frame image of each sample video data; and performing a convolution operation and pooling on the pixel information to obtain image pixel feature information.
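The convolution-and-pooling step of this claim can be sketched in plain Python. A real system would use a CNN library with learned kernels; the 2x2 kernel and the toy frame below are invented for illustration.

```python
def conv2d(image, kernel):
    """Valid-mode 2D convolution (strictly, cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling over size x size windows."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

frame = [[0, 0, 0, 0, 0],
         [0, 9, 9, 9, 0],
         [0, 9, 9, 9, 0],
         [0, 9, 9, 9, 0],
         [0, 0, 0, 0, 0]]
edge_kernel = [[1, 0], [0, -1]]  # toy diagonal-difference kernel (assumed, not learned)
feature_map = max_pool(conv2d(frame, edge_kernel))
```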
- The method according to claim 3, wherein the step of separately extracting quality feature information from the multiple sample video data includes: identifying objects in each frame image of each sample video data; and determining the number and frequency of occurrences of the objects in each pair of adjacent frames, so as to obtain object migration feature information across consecutive frames.
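A sketch of the adjacent-frame counting, assuming an upstream object detector has already produced per-frame label lists; the labels and the exact rate definition are illustrative, not prescribed by the claim.

```python
from collections import Counter

def object_migration_features(frames):
    """For each pair of adjacent frames, count how many detected objects persist
    and the rate at which they do. `frames` is a list of per-frame object-label
    lists (assumed to come from an upstream object detector)."""
    features = []
    for prev, curr in zip(frames, frames[1:]):
        persisted = Counter(prev) & Counter(curr)  # objects present in both frames
        count = sum(persisted.values())
        rate = count / max(len(prev), 1)
        features.append((count, rate))
    return features

frames = [["cat", "ball"], ["cat", "ball"], ["cat"]]
migration = object_migration_features(frames)
```

A smooth, coherent video tends to show high persistence rates; abrupt or noisy footage shows objects flickering in and out, which is one signal of low quality.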
- The method according to claim 3, wherein the step of separately extracting quality feature information from the multiple sample video data includes: identifying the shape features of moving objects in each frame image of each sample video data; and determining the geometric parameters of the shape features of the moving objects in each pair of adjacent frames, so as to obtain motion feature information across consecutive frames.
- The method according to claim 3, wherein the step of separately extracting quality feature information from the multiple sample video data includes: determining the amplitude and phase of each frame image of each sample video data; and determining the amplitude difference and phase difference between each pair of adjacent frames, so as to obtain inter-frame frequency domain feature information.
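The amplitude and phase computation can be illustrated with a naive 1-D DFT. A real implementation would apply a 2-D FFT to each full frame; the toy "frames" below are single rows, and using the first non-DC coefficient is an assumed simplification.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform (stand-in for a 2-D FFT of a frame)."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]

def amp_phase(frame_row):
    """Amplitude and phase of the first non-DC frequency component."""
    coeff = dft(frame_row)[1]
    return abs(coeff), cmath.phase(coeff)

def freq_domain_features(frames):
    """Amplitude and phase differences between each pair of adjacent frames."""
    aps = [amp_phase(f) for f in frames]
    return [(abs(a2 - a1), abs(p2 - p1))
            for (a1, p1), (a2, p2) in zip(aps, aps[1:])]

frames = [[0, 1, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]  # toy 1-D "frames"
diffs = freq_domain_features(frames)
```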
- The method according to claim 3, wherein the step of separately extracting quality feature information from the multiple sample video data includes: determining the wavelet coefficients of each frame image of each sample video data; and determining the change in the wavelet coefficients between each pair of adjacent frames, so as to obtain image frame wavelet transform feature information.
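A one-level Haar transform is the simplest way to illustrate this step; the patent does not name a wavelet family, so Haar (and the L1 distance between coefficient vectors) is an assumption.

```python
def haar_step(signal):
    """One level of the Haar wavelet transform: pairwise averages then pairwise differences."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx + detail

def wavelet_change(frames):
    """L1 change in Haar coefficients between each pair of adjacent frames."""
    coeffs = [haar_step(f) for f in frames]
    return [sum(abs(a - b) for a, b in zip(c1, c2))
            for c1, c2 in zip(coeffs, coeffs[1:])]

frames = [[4, 2, 6, 6], [4, 2, 8, 8], [4, 2, 8, 8]]  # toy 1-D "frames"
changes = wavelet_change(frames)
```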
- The method according to claim 3, wherein the step of separately extracting quality feature information from the multiple sample video data includes: determining the rotation operator of each frame image of each sample video data; and determining the change in the rotation operators between each pair of adjacent frames, so as to obtain image rotation operator feature information.
- The method according to any one of claims 2 to 9, wherein the step of performing training with the quality feature information of the multiple positive and negative sample video data to generate a video data detection model includes: normalizing the quality feature information of the multiple positive and negative sample video data to obtain normalized quality feature information; completing missing values of the normalized quality feature information; identifying target quality feature information from the normalized quality feature information; and training a neural network model with the target quality feature information to generate the video data detection model.
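The first two steps of this claim can be sketched as follows. Min-max scaling and mean imputation are common choices, but the claim fixes neither, so both are assumptions here.

```python
def normalize(values):
    """Min-max normalize a feature column to [0, 1], passing missing values through."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = (hi - lo) or 1.0
    return [None if v is None else (v - lo) / span for v in values]

def fill_missing(values):
    """Complete missing values with the column mean (one common completion strategy)."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

raw = [10.0, None, 30.0, 20.0]  # one quality feature across samples; None marks a gap
clean = fill_missing(normalize(raw))
```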
- The method according to claim 10, wherein the step of identifying target quality feature information from the normalized quality feature information includes: determining the information entropy of the normalized quality feature information; and identifying quality feature information whose information entropy exceeds a first preset threshold as the target quality feature information.
- The method according to claim 2, further comprising: acquiring attribute information of multiple users; and clustering the multiple users into multiple user groups according to the attribute information, each user group having a corresponding user label.
- The method according to claim 12, wherein the step of identifying the quality feature information with a preset video data detection model to obtain target video data includes: using the preset video data detection model to identify the quality feature information of each of the one or more pieces of video data to be detected, so as to obtain a quality score for each piece of video data to be detected; and extracting video data whose quality score exceeds a second preset threshold as the target video data.
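A sketch of the scoring-and-thresholding step, with a dictionary lookup standing in for the trained model's forward pass; the video names, scores, and threshold value are invented for illustration.

```python
def select_target_videos(videos, score_fn, threshold):
    """Score each candidate with the detection model and keep those whose
    quality score exceeds the second preset threshold."""
    scored = {vid: score_fn(vid) for vid in videos}
    return [vid for vid, score in scored.items() if score > threshold]

# stand-in for the trained model's forward pass (assumed scores)
fake_scores = {"v1": 0.92, "v2": 0.40, "v3": 0.75}
targets = select_target_videos(["v1", "v2", "v3"], fake_scores.get, threshold=0.7)
```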
- The method according to claim 13, wherein the step of recommending the target video data to a user includes: determining a target user group among the multiple user groups; and recommending the target video data to the target user group.
- The method according to claim 14, wherein the target video data has a corresponding video label, and the step of determining a target user group among the multiple user groups includes: determining, as the target user group, the user group whose user label is the same as the video label of the target video data.
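This label-matching rule reduces to a simple equality filter; the group names and labels below are invented for illustration.

```python
def target_user_groups(video_label, user_groups):
    """Pick the user groups whose user label equals the video's label."""
    return [group for group, label in user_groups.items() if label == video_label]

user_groups = {"g1": "sports", "g2": "news", "g3": "sports"}
matches = target_user_groups("sports", user_groups)
```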
- A method for generating a video data detection model, comprising: separately extracting quality feature information from multiple sample video data, the multiple sample video data including multiple positive sample video data and negative sample video data; and performing training with the quality feature information of the multiple positive and negative sample video data to generate a video data detection model.
- The method according to claim 16, wherein the quality feature information includes image pixel feature information, object migration feature information across consecutive frames, motion feature information across consecutive frames, inter-frame frequency domain feature information, image frame wavelet transform feature information, and/or image rotation operator feature information.
- A method for identifying video data, comprising: acquiring one or more pieces of video data to be detected; sending the one or more pieces of video data to be detected to a server, where the server is configured to identify each of them to obtain a recognition result, the recognition result including one or more candidate video data; receiving the one or more candidate video data returned by the server; determining target video data among the one or more candidate video data; and presenting the target video data.
- An apparatus for recommending video data, comprising: an acquisition module, configured to acquire one or more pieces of video data to be detected; an extraction module, configured to separately extract quality feature information from each piece of video data to be detected; an identification module, configured to identify the quality feature information with a preset video data detection model to obtain target video data; and a recommendation module, configured to recommend the target video data to a user.
- An apparatus for generating a video data detection model, comprising: a quality feature information extraction module, configured to separately extract quality feature information from multiple sample video data, the multiple sample video data including multiple positive sample video data and negative sample video data; and a video data detection model generation module, configured to perform training with the quality feature information of the multiple positive and negative sample video data to generate a video data detection model.
- An apparatus for identifying video data, comprising: an acquisition module, configured to acquire one or more pieces of video data to be detected; a sending module, configured to send the one or more pieces of video data to be detected to a server, where the server is configured to identify each of them to obtain a recognition result, the recognition result including one or more candidate video data; a receiving module, configured to receive the one or more candidate video data returned by the server; a determination module, configured to determine target video data among the one or more candidate video data; and a presentation module, configured to present the target video data.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710113741.4A CN108509457A (en) | 2017-02-28 | 2017-02-28 | A kind of recommendation method and apparatus of video data |
CN201710113741.4 | 2017-02-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018157746A1 true WO2018157746A1 (en) | 2018-09-07 |
Family
ID=63369778
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/076784 WO2018157746A1 (en) | 2017-02-28 | 2018-02-14 | Recommendation method and apparatus for video data |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN108509457A (en) |
TW (1) | TWI753044B (en) |
WO (1) | WO2018157746A1 (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109242030A (en) * | 2018-09-21 | 2019-01-18 | 京东方科技集团股份有限公司 | Draw single generation method and device, electronic equipment, computer readable storage medium |
CN109068180B (en) * | 2018-09-28 | 2021-02-02 | 武汉斗鱼网络科技有限公司 | Method for determining video fine selection set and related equipment |
CN109614537A (en) * | 2018-12-06 | 2019-04-12 | 北京百度网讯科技有限公司 | For generating the method, apparatus, equipment and storage medium of video |
CN109729395B (en) * | 2018-12-14 | 2022-02-08 | 广州市百果园信息技术有限公司 | Video quality evaluation method and device, storage medium and computer equipment |
CN111353597B (en) * | 2018-12-24 | 2023-12-05 | 杭州海康威视数字技术股份有限公司 | Target detection neural network training method and device |
CN111401100B (en) * | 2018-12-28 | 2021-02-09 | 广州市百果园信息技术有限公司 | Video quality evaluation method, device, equipment and storage medium |
CN109685631B (en) * | 2019-01-10 | 2021-06-01 | 博拉网络股份有限公司 | Personalized recommendation method based on big data user behavior analysis |
CN112464027A (en) * | 2019-09-06 | 2021-03-09 | 腾讯科技(深圳)有限公司 | Video detection method, device and storage medium |
CN111209897B (en) * | 2020-03-09 | 2023-06-20 | 深圳市雅阅科技有限公司 | Video processing method, device and storage medium |
CN111491187B (en) * | 2020-04-15 | 2023-10-31 | 腾讯科技(深圳)有限公司 | Video recommendation method, device, equipment and storage medium |
CN111683273A (en) * | 2020-06-02 | 2020-09-18 | 中国联合网络通信集团有限公司 | Method and device for determining video freeze information |
CN113837820B (en) * | 2020-06-23 | 2024-11-05 | 阿里巴巴集团控股有限公司 | Data processing method, device and equipment |
CN112069951B (en) * | 2020-08-25 | 2025-04-18 | 北京小米松果电子有限公司 | Video segment extraction method, video segment extraction device and storage medium |
CN112199582B (en) * | 2020-09-21 | 2023-07-18 | 聚好看科技股份有限公司 | A content recommendation method, device, equipment and medium |
CN114613000A (en) * | 2020-12-08 | 2022-06-10 | 阿里巴巴集团控股有限公司 | Behavior identification method based on video, computing equipment and user equipment |
CN116708725B (en) * | 2023-08-07 | 2023-10-31 | 清华大学 | Low-bandwidth crowd scene security monitoring method and system based on semantic encoding and decoding |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101282481A (en) * | 2008-05-09 | 2008-10-08 | 中国传媒大学 | A Method of Video Quality Evaluation Based on Artificial Neural Network |
US20110131595A1 (en) * | 2009-12-02 | 2011-06-02 | General Electric Company | Methods and systems for online recommendation |
CN104219575A (en) * | 2013-05-29 | 2014-12-17 | 酷盛(天津)科技有限公司 | Related video recommending method and system |
CN104915861A (en) * | 2015-06-15 | 2015-09-16 | 浙江经贸职业技术学院 | An electronic commerce recommendation method for a user group model constructed based on scores and labels |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI510064B (en) * | 2012-03-30 | 2015-11-21 | Inst Information Industry | Video recommendation system and method thereof |
CN104216960A (en) * | 2014-08-21 | 2014-12-17 | 北京奇艺世纪科技有限公司 | Method and device for recommending video |
- 2017-02-28: CN application CN201710113741.4A filed; published as CN108509457A (active, pending)
- 2017-11-07: TW application TW106138405A filed; published as TWI753044B (active)
- 2018-02-14: PCT application PCT/CN2018/076784 filed; published as WO2018157746A1 (active, application filing)
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020093914A1 (en) * | 2018-11-08 | 2020-05-14 | Alibaba Group Holding Limited | Content-weighted deep residual learning for video in-loop filtering |
CN110879851A (en) * | 2019-10-15 | 2020-03-13 | 北京三快在线科技有限公司 | Video dynamic cover generation method and device, electronic equipment and readable storage medium |
CN111753136A (en) * | 2019-11-14 | 2020-10-09 | 北京沃东天骏信息技术有限公司 | Article information processing method, article information processing device, medium, and electronic device |
CN111191054A (en) * | 2019-12-18 | 2020-05-22 | 腾讯科技(深圳)有限公司 | Recommendation method and device for media data |
CN111191054B (en) * | 2019-12-18 | 2024-02-13 | 腾讯科技(深圳)有限公司 | Media data recommendation method and device |
CN111126262A (en) * | 2019-12-24 | 2020-05-08 | 中国科学院自动化研究所 | Video highlight detection method and device based on graph neural network |
CN111126262B (en) * | 2019-12-24 | 2023-04-28 | 中国科学院自动化研究所 | Video highlights detection method and device based on graph neural network |
CN112749297A (en) * | 2020-03-03 | 2021-05-04 | 腾讯科技(深圳)有限公司 | Video recommendation method and device, computer equipment and computer-readable storage medium |
CN112749297B (en) * | 2020-03-03 | 2023-07-21 | 腾讯科技(深圳)有限公司 | Video recommendation method, device, computer equipment and computer readable storage medium |
CN111950360B (en) * | 2020-07-06 | 2023-08-18 | 北京奇艺世纪科技有限公司 | Method and device for identifying infringement user |
CN111950360A (en) * | 2020-07-06 | 2020-11-17 | 北京奇艺世纪科技有限公司 | Method and device for identifying infringing user |
CN112100441A (en) * | 2020-09-17 | 2020-12-18 | 咪咕文化科技有限公司 | Video recommendation method, electronic device and computer-readable storage medium |
CN112100441B (en) * | 2020-09-17 | 2024-04-09 | 咪咕文化科技有限公司 | Video recommendation method, electronic device and computer-readable storage medium |
CN112464083A (en) * | 2020-11-16 | 2021-03-09 | 北京达佳互联信息技术有限公司 | Model training method, work pushing method, device, electronic equipment and storage medium |
CN113761347A (en) * | 2021-02-25 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Commodity recommendation method, commodity recommendation device, storage medium and commodity recommendation system |
CN114037941A (en) * | 2021-11-22 | 2022-02-11 | 南京启数智能系统有限公司 | Method and device for algorithmic multi-data cross-validation completion for video target attributes |
CN114519840A (en) * | 2022-02-25 | 2022-05-20 | 携程旅游信息技术(上海)有限公司 | Photo album video identification method and training method and device of photo album video identification model |
CN114780795A (en) * | 2022-05-07 | 2022-07-22 | 济南博观智能科技有限公司 | Video material screening method, device, equipment and medium |
WO2024057124A1 (en) * | 2022-09-14 | 2024-03-21 | Digit7 India Private Limited | System and method for automatically labelling media |
Also Published As
Publication number | Publication date |
---|---|
CN108509457A (en) | 2018-09-07 |
TWI753044B (en) | 2022-01-21 |
TW201834463A (en) | 2018-09-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018157746A1 (en) | Recommendation method and apparatus for video data |
CN108509465B (en) | Video data recommendation method and device and server | |
WO2022033199A1 (en) | Method for obtaining user portrait and related device | |
US11019017B2 (en) | Social media influence of geographic locations | |
US20140172643A1 (en) | System and method for categorizing an image | |
CN110019943B (en) | Video recommendation method and device, electronic equipment and storage medium | |
CN118628214B (en) | Personalized clothing recommendation method and system for electronic commerce platform based on artificial intelligence | |
KR20230087622A (en) | Methods and apparatus for detecting, filtering, and identifying objects in streaming video | |
CN111859149A (en) | Information recommendation method, device, electronic device and storage medium | |
CN104715023A (en) | Commodity recommendation method and system based on video content | |
US20190303499A1 (en) | Systems and methods for determining video content relevance | |
Jing et al. | A new method of printed fabric image retrieval based on color moments and gist feature description | |
JP5261493B2 (en) | Extended image identification | |
CN110363206B (en) | Clustering of data objects, data processing and data identification method | |
Sebyakin et al. | Spatio-temporal deepfake detection with deep neural networks | |
CN103793717A (en) | Methods for determining image-subject significance and training image-subject significance determining classifier and systems for same | |
CN112084954A (en) | Video target detection method and device, electronic equipment and storage medium | |
Angadi et al. | Multimodal sentiment analysis using reliefF feature selection and random forest classifier | |
Chang et al. | Human vision attention mechanism-inspired temporal-spatial feature pyramid for video saliency detection | |
Hu et al. | Improved YOLOv5-based image detection of cotton impurities | |
Ou et al. | An Intelligent Recommendation System for Real Estate Commodity. | |
Priadana et al. | An efficient face gender detector on a cpu with multi-perspective convolution | |
Tao et al. | A large-scale television advertising dataset for detailed impression analysis | |
CN105913427B (en) | A Noise Image Saliency Detection Method Based on Machine Learning | |
CN112580674B (en) | Image recognition method, computer device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18760885; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 18760885; Country of ref document: EP; Kind code of ref document: A1 |