CN117812381B - Video content making method based on artificial intelligence - Google Patents
- Publication number: CN117812381B (application CN202311654595.8A)
- Authority
- CN
- China
- Prior art keywords
- semantic
- feature vector
- key element
- search key
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/735—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/7867—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44016—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An artificial intelligence based video content production method is disclosed. First, a text description of a first candidate material is acquired. Semantic understanding is then performed on that text description to obtain a sequence of material text semantic feature vectors. Next, the keywords and limiting conditions are semantically encoded and analyzed to obtain a search key element global semantic feature vector. Semantic feature interaction and fusion are then performed between the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector. Finally, whether to return the first candidate material is determined based on that optimized feature vector. In this way, the video content production process can be optimized.
Description
Technical Field
The present application relates to the field of video content production, and more particularly, to an artificial intelligence-based video content production method.
Background
In the current digital age, demand for video content keeps growing, yet conventional video content production typically consumes a significant amount of time.
For example, when selecting materials, conventional video content production methods rely on manual screening and matching. In this process, people often spend a great deal of time searching, browsing, and evaluating large volumes of material to find items suitable for the video they want to author. Such tedious and repetitive manual work consumes both time and resources.
Therefore, an optimized video content production method is desired.
Disclosure of Invention
In view of the above, the present application proposes an artificial intelligence-based video content production method, which can optimize the production process of video content.
According to an aspect of the present application, there is provided an artificial intelligence-based video content production method, including: matching candidate materials from a given material library according to given keywords and limiting conditions; analyzing, screening, cutting and splicing the candidate materials to generate video content; and generating a broadcast explanation voice adapted to the video content based on digital human technology, wherein matching candidate materials from the given material library according to the given keywords and limiting conditions includes:
acquiring a text description of a first candidate material;
performing semantic understanding on the text description of the first candidate material to obtain a sequence of material text semantic feature vectors;
performing semantic coding and analysis on the keywords and the limiting conditions to obtain a search key element global semantic feature vector;
performing semantic feature interaction and fusion on the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector; and
determining whether to return the first candidate material based on the optimized search key element-material text semantic related feature vector.
According to the embodiments of the application, a text description of a first candidate material is first obtained. Semantic understanding is performed on that text description to obtain a sequence of material text semantic feature vectors. The keywords and limiting conditions are then semantically encoded and analyzed to obtain a search key element global semantic feature vector. Semantic feature interaction and fusion are performed between the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector. Finally, whether to return the first candidate material is determined based on that optimized feature vector. In this way, the video content production process can be optimized.
Other features and aspects of the present application will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the application and together with the description, serve to explain the principles of the application.
FIG. 1 illustrates a flow chart of an artificial intelligence based video content production method according to an embodiment of the application.
Fig. 2 shows a flow chart of sub-step S110 of the artificial intelligence based video content production method according to an embodiment of the application.
Fig. 3 shows an architectural diagram of sub-step S110 of the artificial intelligence based video content production method according to an embodiment of the present application.
Fig. 4 shows a flowchart of sub-step S113 of the artificial intelligence based video content production method according to an embodiment of the application.
Fig. 5 shows a flowchart of sub-step S114 of the artificial intelligence based video content production method according to an embodiment of the application.
Fig. 6 shows a flowchart of sub-step S115 of the artificial intelligence based video content production method according to an embodiment of the application.
FIG. 7 illustrates a block diagram of an artificial intelligence based video content production system in accordance with an embodiment of the application.
Fig. 8 illustrates an application scenario diagram of an artificial intelligence based video content production method according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application is made clearly and fully with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort also fall within the scope of the application.
As used in the specification and in the claims, the terms "a," "an," and "the" do not denote the singular and may include the plural unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
Various exemplary embodiments, features and aspects of the application will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
In addition, numerous specific details are set forth in the following description in order to provide a better illustration of the application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, well known methods, procedures, components, and circuits have not been described in detail so as not to obscure the present application.
The application provides an artificial intelligence-based video content production method, and fig. 1 shows a flow chart of the method according to an embodiment of the application. As shown in fig. 1, the artificial intelligence-based video content production method according to the embodiment of the application comprises the following steps: S110, matching candidate materials from a given material library according to given keywords and limiting conditions; S120, analyzing, screening, cutting and splicing the candidate materials to generate video content; and S130, generating a broadcast explanation voice adapted to the video content based on digital human technology.
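Steps S110 through S130 can be sketched as a minimal pipeline. The toy matcher, the string-based "splicing," and the narration placeholder below are illustrative stand-ins for the method's actual matching, editing, and digital-human components, not an implementation of the patent:

```python
def matches(m, keywords, constraints):
    # Toy S110 matcher: keyword overlap with the material's text description.
    # A real system would use the semantic matching described below.
    text = m["description"].lower()
    return any(k.lower() in text for k in keywords)

def produce_video(keywords, constraints, library):
    # S110: match candidate materials by keywords and limiting conditions
    candidates = [m for m in library if matches(m, keywords, constraints)]
    # S120: analyze/screen/cut/splice the candidates into video content
    video = " + ".join(m["clip"] for m in candidates)
    # S130: placeholder for digital-human narration synthesis
    narration = f"Narration for: {video}"
    return video, narration

library = [
    {"description": "aerial city footage at night", "clip": "clip_a"},
    {"description": "forest stream in the morning", "clip": "clip_b"},
]
video, narration = produce_video(["city"], {"time": "night"}, library)
print(video)  # clip_a
```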
In particular, it is considered that conventional video content production methods screen and match materials manually during material selection. In this process, people often spend a great deal of time searching, browsing, and evaluating large volumes of material to find items suitable for the video they want to author. Such tedious and repetitive manual work consumes both time and resources. To solve this technical problem, the technical concept of the application is to combine natural language processing technology to perform semantic understanding and analysis on the text descriptions of the candidate materials and on the given keywords and limiting conditions, and to fuse and match their semantic feature information so as to return candidate materials whose matching degree meets a predetermined requirement, i.e., to select from a given material library the candidate materials whose matching degree meets the predetermined requirement for video content production.
Based on this, fig. 2 shows a flow chart of sub-step S110 of the artificial intelligence based video content production method according to an embodiment of the application. Fig. 3 shows an architectural diagram of sub-step S110 of the method according to an embodiment of the present application. As shown in fig. 2 and fig. 3, matching candidate materials from a given material library according to given keywords and limiting conditions includes: S111, acquiring a text description of a first candidate material; S112, performing semantic understanding on the text description of the first candidate material to obtain a sequence of material text semantic feature vectors; S113, performing semantic coding and analysis on the keywords and the limiting conditions to obtain a search key element global semantic feature vector; S114, performing semantic feature interaction and fusion on the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector; and S115, determining whether to return the first candidate material based on the optimized search key element-material text semantic related feature vector.
Specifically, in the technical scheme of the application, matching candidate materials from a given material library according to the given keywords and limiting conditions proceeds as follows. First, a text description of the first candidate material is acquired. The text description is then word-segmented and passed through a material context semantic encoder comprising a word embedding layer to obtain a sequence of material text semantic feature vectors. That is, the text description of the first candidate material is converted into structured data carrying semantic feature information. In particular, the sequence of material text semantic feature vectors characterizes the subject matter and contextual semantics of the first candidate material, enabling a model to understand its textual meaning.
Accordingly, in step S112, performing semantic understanding on the text description of the first candidate material to obtain a sequence of material text semantic feature vectors includes: performing word segmentation on the text description of the first candidate material and then passing it through a material context semantic encoder comprising a word embedding layer to obtain the sequence of material text semantic feature vectors.
Specifically, in one example, passing the word-segmented text description of the first candidate material through the material context semantic encoder comprising a word embedding layer to obtain the sequence of material text semantic feature vectors includes: performing word segmentation on the text description of the first candidate material to convert it into a word sequence composed of a plurality of words; mapping each word in the word sequence to a word vector using the word embedding layer of the material context semantic encoder to obtain a sequence of word vectors; and performing global context semantic encoding on the sequence of word vectors using the material context semantic encoder to obtain the sequence of material text semantic feature vectors.
It is worth mentioning that a word embedding layer (Word Embedding Layer) is a natural language processing technique for mapping the words in a text to real-valued vectors so that a computer can better understand and process text data. It represents discrete words as continuous real-valued vectors, mapping semantically close words to nearby locations in the vector space. The word embedding layer thus converts words into vector representations suitable for processing and analysis by a computer. By mapping words into a vector space, the word embedding layer captures semantic relationships between words, enabling a computer to understand their similarity and variability. Using a material context semantic encoder that includes a word embedding layer, each word in the text description can be converted into a corresponding word vector. These word vectors represent the semantic information of the words, including their contextual information and semantic similarity. By then performing global context semantic encoding on the word vector sequence, the sequence of material text semantic feature vectors is obtained, where each feature vector represents part of the semantic information of the text description. Through the word embedding layer, the words in the text description are converted into continuous real-valued vector representations, providing better input representations for subsequent text processing tasks and improving their performance. The word embedding layer can also reduce the feature dimension and improve the computational efficiency of the model.
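The segmentation, word embedding, and global context encoding path described above can be sketched as follows. The vocabulary, random embedding table, and the simple mean-mixing rule are toy stand-ins for a trained embedding layer and context encoder:

```python
import numpy as np

def embed_tokens(tokens, vocab, embedding_matrix):
    # Word embedding layer: map each token to its word vector;
    # out-of-vocabulary tokens fall back to index 0 ("<unk>").
    ids = [vocab.get(t, 0) for t in tokens]
    return embedding_matrix[ids]  # shape: (seq_len, dim)

def context_encode(word_vectors):
    # Toy stand-in for global context semantic encoding: each output
    # vector mixes its own embedding with the sequence mean.
    ctx = word_vectors.mean(axis=0, keepdims=True)
    return 0.5 * word_vectors + 0.5 * ctx

rng = np.random.default_rng(0)
vocab = {"<unk>": 0, "city": 1, "night": 2, "aerial": 3, "footage": 4}
E = rng.standard_normal((len(vocab), 8))      # word embedding table
tokens = "aerial footage city night".split()  # word-segmented description
seq = context_encode(embed_tokens(tokens, vocab, E))
print(seq.shape)  # (4, 8): one semantic feature vector per word
```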
Meanwhile, word embedding encoding is performed on the keywords and the limiting conditions to obtain a sequence of search key element semantic embedded vectors, and this sequence is passed through a semantic encoder to obtain the search key element global semantic feature vector. Keywords are usually important words related to the video content or materials; they may describe aspects of the video such as topic, content, and emotion. Limiting conditions restrict the search range or act as screening conditions, such as time, place, and person. The semantic information of the keywords and limiting conditions helps determine which materials in the material library match a particular requirement, and the word embedding encoding and semantic encoding steps capture this important semantic information.
Accordingly, in step S113, as shown in fig. 4, performing semantic coding and analysis on the keywords and the limiting conditions to obtain the search key element global semantic feature vector includes: S1131, performing word embedding encoding on the keywords and the limiting conditions to obtain a sequence of search key element semantic embedded vectors; and S1132, passing the sequence of search key element semantic embedded vectors through a semantic encoder to obtain the search key element global semantic feature vector.
It will be appreciated that in step S1131, the keywords and limiting conditions are translated into a sequence of word vectors; that is, each word is mapped to a corresponding real-valued vector. This converts the discrete text data into a continuous vector representation that a computer can better understand and process. Word embedding encodings capture semantic relationships between words, enabling a computer to understand their similarity and variability. In step S1132, the sequence of search key element semantic embedded vectors is processed by a semantic encoder to obtain the global semantic feature vector. The semantic encoder performs global context semantic encoding on the input vector sequence, capturing the semantic associations between terms in the sequence and the overall semantic information, yielding the search key element global semantic feature vector. Through these two steps, the keywords and limiting conditions are converted into a global semantic feature vector that better represents the semantics of the search key elements, improving the accuracy and effect of the search. Meanwhile, applying word embedding encoding and a semantic encoder captures the semantic associations among the search key elements and helps the computer better understand the search requirements and their context.
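Steps S1131 and S1132 can be sketched as follows. Mean pooling stands in for the semantic encoder, and the vocabulary and embedding table are illustrative, not the patent's trained components:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = {"<unk>": 0, "city": 1, "night": 2, "skyline": 3}
E = rng.standard_normal((len(vocab), 8))  # word embedding table

def encode_search_key(keywords, constraints, vocab, E):
    # S1131: word-embed the keywords and limiting-condition terms.
    tokens = list(keywords) + [v for v in constraints.values()]
    ids = [vocab.get(t, 0) for t in tokens]
    # S1132: pool the embedded sequence into one global semantic
    # feature vector (mean pooling as a toy semantic encoder).
    return E[ids].mean(axis=0)

v_q = encode_search_key(["city", "skyline"], {"time": "night"}, vocab, E)
print(v_q.shape)  # (8,): a single global semantic feature vector
```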
Then, the correlation between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors is calculated to obtain a search key element-material text semantic correlation feature vector composed of a plurality of correlations. That is, the semantic similarity and matching degree between the keywords and limiting conditions on the one hand, and the first candidate material on the other, are measured by calculating the correlation between the search key element global semantic feature vector and each material text semantic feature vector.
Then, residual information fusion optimization is performed on the search key element-material text semantic related feature vector based on the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector. Here, considering that during semantic matching and similarity analysis there may be some fuzzy matching between the search key element information, i.e., the semantic feature information expressed by the keywords and limiting conditions, and the semantic information of the first candidate material text, residual information fusion can strengthen the semantic matching degree information expressed by the optimized search key element-material text semantic related feature vector. In addition, residually fusing the semantic information of the first candidate material text, as expressed by the sequence of material text semantic feature vectors, can supplement any missing or incomplete semantic information in the search key element-material text semantic related feature vector.
Accordingly, in step S114, as shown in fig. 5, performing semantic feature interaction and fusion on the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector includes: S1141, calculating the correlation between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors to obtain a search key element-material text semantic correlation feature vector composed of a plurality of correlations; and S1142, performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector.
It should be understood that in step S1141, semantic feature interaction is performed between the search key element global semantic feature vector and the sequence of material text semantic feature vectors. By calculating the correlation between them, the semantic relevance between the search key elements and the material text can be measured, and these correlations serve as the components of the search key element-material text semantic correlation feature vector, representing the degree of semantic relevance between the two. In step S1142, the search key element-material text semantic related feature vector is optimized using the sequence of material text semantic feature vectors. Residual information fusion reduces redundant information in the vector so that more accurate and effective semantically related features can be extracted, yielding the optimized search key element-material text semantic related feature vector, which contains more discriminative and important semantic information. Through these two steps, the search key element global semantic feature vector and the material text semantic feature vectors interact and fuse, producing the optimized search key element-material text semantic related feature vector. This vector comprehensively considers the semantic association between the search key elements and the material text, provides more accurate and richer semantically related features, and helps optimize the search process and improve the quality of search results.
In step S1141, calculating the correlation between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors to obtain a search key element-material text semantic correlation feature vector composed of a plurality of correlations includes: calculating the correlations according to the following correlation formula:

$r_i = \left(T_1 V_q\right)^\top \left(T_2 V_i\right)$

where $V_q$ is the search key element global semantic feature vector, $V_i$ is the $i$-th material text semantic feature vector in the sequence of material text semantic feature vectors, $T_1$ and $T_2$ are two different linear transformations, $\top$ denotes the transpose operation, and $r_i$ is the correlation between the search key element global semantic feature vector and the $i$-th material text semantic feature vector; and arranging the plurality of correlations to obtain the search key element-material text semantic correlation feature vector.
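The correlation formula above can be sketched directly; the random matrices stand in for the two learned linear transformations:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8
T1 = rng.standard_normal((d, d))  # first linear transformation
T2 = rng.standard_normal((d, d))  # second linear transformation
v_q = rng.standard_normal(d)            # search key element global vector
materials = rng.standard_normal((5, d)) # sequence of material text vectors

def correlation_vector(v_q, materials, T1, T2):
    # r_i = (T1 v_q)^T (T2 v_i): project both sides, then take a dot product.
    q = T1 @ v_q
    return np.array([q @ (T2 @ v_i) for v_i in materials])

r = correlation_vector(v_q, materials, T1, T2)
print(r.shape)  # (5,): one correlation per material text feature vector
```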
In step S1142, performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector includes: first calculating the weight value of the sequence of material text semantic feature vectors for the search key element-material text semantic related feature vector according to the following weight formula:

$w = \sigma\!\left(\frac{W_1 V_c}{\sqrt{d_1}} + \frac{W_2 \bar{V}}{\sqrt{d_2}}\right)$

where $W_1$ is a learnable projection matrix, $d_1$ is the dimension of the search key element-material text semantic related feature vector $V_c$, $W_2$ is a second learnable projection matrix, $d_2$ is the dimension of each material text semantic feature vector, $\sigma$ is the Sigmoid function, $V_i$ is the $i$-th material text semantic feature vector in the sequence, $\bar{V} = \frac{1}{N}\sum_{i=1}^{N} V_i$ is the material text global average vector over all material text semantic feature vectors in the sequence, and $w$ is the weight value; and then performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the weight value using the following residual information fusion optimization formula:

$V_c' = \mathrm{Conv}_1\!\left(V_c\right) + w \odot \mathrm{Conv}_2\!\left(\bar{V}\right)$

where $V_c'$ is the optimized search key element-material text semantic related feature vector, $\mathrm{Conv}_1$ and $\mathrm{Conv}_2$ represent convolution operations with a $1\times1$ convolution kernel, $V_c$ is the search key element-material text semantic related feature vector, $w$ is the weight value, and $\bar{V}$ is the global average of the material text semantic feature vectors in the sequence.
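The weight and residual-fusion step can be sketched as follows. The $1\times1$ convolutions act on single vectors here, so they reduce to plain linear maps, and all matrices are random illustrative stand-ins for learned parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_fusion(v_c, materials, W1, W2, C1, C2):
    # Weight value from the related vector and the material global average.
    d1, d2 = v_c.shape[0], materials.shape[1]
    v_bar = materials.mean(axis=0)  # global average of material vectors
    w = sigmoid(W1 @ v_c / np.sqrt(d1) + W2 @ v_bar / np.sqrt(d2))
    # Residual fusion: 1x1 convolutions become linear maps on one vector.
    return C1 @ v_c + w * (C2 @ v_bar)

rng = np.random.default_rng(3)
d1 = 5   # dimension of the semantically related feature vector
d2 = 8   # dimension of each material text semantic feature vector
v_c = rng.standard_normal(d1)
materials = rng.standard_normal((5, d2))
W1 = rng.standard_normal((d1, d1)); W2 = rng.standard_normal((d1, d2))
C1 = rng.standard_normal((d1, d1)); C2 = rng.standard_normal((d1, d2))
v_opt = residual_fusion(v_c, materials, W1, W2, C1, C2)
print(v_opt.shape)  # (5,): optimized semantically related feature vector
```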
Further, the optimized search key element-material text semantic related feature vector is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the matching degree of the first alternative material meets a predetermined requirement; and in response to the classification result indicating that the matching degree of the first alternative material and the keyword meets the predetermined requirement, the first alternative material is returned.
Accordingly, in step S115, as shown in fig. 6, based on the optimized search key element-material text semantic related feature vector, determining whether to return the first candidate material includes: s1151, carrying out feature distribution correction on the optimized search key element-material text semantic related feature vector to obtain a corrected search key element-material text semantic related feature vector; s1152, passing the corrected search key element-material text semantic related feature vector through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the matching degree of the first alternative material meets a preset requirement; and S1153, returning the first alternative material in response to the classification result that the matching degree of the first alternative material and the keyword meets a preset requirement.
It should be understood that in step S1151, performing feature distribution correction on the optimized search key element-material text semantic related feature vector means further processing the feature vector so that it conforms to the expected feature distribution. By correcting the feature distribution, the quality and expressive capability of the search key element-material text semantic related feature vector can be further improved, making it more suitable for the subsequent classification and matching tasks. In step S1152, the corrected search key element-material text semantic related feature vector is classified by a classifier. The classifier performs classification and judgment on the feature vector according to its feature representation and preset label information; through the classifier, the search key element-material text semantic related feature vector is mapped to a preset category or score indicating whether the matching degree of the first alternative material meets the predetermined requirement. In step S1153, whether the matching degree between the first alternative material and the keyword meets the predetermined requirement is determined according to the classification result. If the classification result indicates that the matching degree meets the requirement, that is, the matching degree is sufficiently high, the first alternative material is returned as the search result. This ensures a high semantic relevance between the returned material and the search key elements, improving the quality and accuracy of the search result.
Through the three steps, correction, classification and matching judgment can be carried out on the optimized search key element-material text semantic related feature vector, and whether the first alternative material is returned as a search result is finally determined. The steps are helpful for screening and selecting the materials which are most matched with the key elements of the search, and improving the search effect and the user satisfaction.
In the above technical solution, the search key element global semantic feature vector expresses the encoded text semantic features of the keyword and the defined condition, and each material text semantic feature vector in the sequence expresses word-source context-based text semantic features of the text description of the first alternative material; the search key element-material text semantic related feature vector therefore expresses semantic cross-domain relevance features between different text semantic feature domains. When this feature vector is optimized by residual information fusion based on the sequence of material text semantic feature vectors, the semantic residual information caused by the inconsistent semantic expression between the sequence of material text semantic feature vectors and the search key element-material text semantic related feature vector may lead to fusion sparsity in the residual information fusion optimization, thereby weakening the expression effect of the optimized search key element-material text semantic related feature vector. It is therefore desirable to perform feature fusion correction based on the respective feature expression significance and criticality of the sequence of material text semantic feature vectors and of the search key element-material text semantic related feature vector, so as to improve the expression effect of the optimized search key element-material text semantic related feature vector. Based on this, the applicant of the present application corrects the sequence of material text semantic feature vectors together with the optimized search key element-material text semantic related feature vector.
Correspondingly, performing feature distribution correction on the optimized search key element-material text semantic related feature vector to obtain the corrected search key element-material text semantic related feature vector includes: correcting the feature distribution of the optimized search key element-material text semantic related feature vector using the following correction formula; wherein, the correction formula is: V″ = w₁·(√(α·V_c) ⊙ √(β·V′)) ⊖ w₂·(α·V_c ⊖ β·V′); wherein V_c is the cascading feature vector obtained by cascading the sequence of material text semantic feature vectors, V′ is the optimized search key element-material text semantic related feature vector, √ denotes the position-wise square root of a feature vector, α and β are respectively the reciprocals of the maximum feature values of V_c and V′, w₁ and w₂ are weight hyperparameters, ⊙ denotes position-wise multiplication, ⊖ denotes vector subtraction, and V″ is the corrected search key element-material text semantic related feature vector.
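The correction formula image is lost, so the sketch below implements one plausible reading of the operations listed above (position-wise square root, position-wise multiplication, vector subtraction, with α and β taken as reciprocals of each vector's maximum feature value). The truncation to a common length and the use of absolute values to keep the square roots real are illustration-only assumptions.

```python
import numpy as np

def feature_correction(Vc, Vp, w1=0.5, w2=0.5):
    """Hedged sketch of the feature-distribution correction.

    Vc     : cascading feature vector (concatenated material text semantic feature vectors)
    Vp     : optimized search key element-material text semantic related feature vector
    w1, w2 : weight hyperparameters
    """
    n = min(len(Vc), len(Vp))
    Vc = np.abs(Vc[:n])                 # truncate to a common length; abs keeps sqrt real
    Vp = np.abs(Vp[:n])
    alpha = 1.0 / Vc.max()              # reciprocal of the maximum feature value of Vc
    beta = 1.0 / Vp.max()               # reciprocal of the maximum feature value of Vp
    # position-wise square root and multiplication, then vector subtraction
    return w1 * np.sqrt(alpha * Vc) * np.sqrt(beta * Vp) - w2 * (alpha * Vc - beta * Vp)
```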
Here, a pre-segmented local group of feature value sets is obtained from the on-off values of each feature value of the sequence of material text semantic feature vectors and of the optimized search key element-material text semantic related feature vector, and the key maximum-value features of both are regressed from this pre-segmented local group. In this way, the per-position significance distribution of the feature values can be improved based on the concept of furthest point sampling, so that sparse fusion control among the feature vectors is performed through the key features with significant distribution, and the corrected feature vector V″ restores the original feature manifold geometric representation of the sequence of material text semantic feature vectors and the optimized search key element-material text semantic related feature vector. Fusing the corrected feature vector V″ with the optimized search key element-material text semantic related feature vector thus improves the expression effect of the optimized feature vector and, in turn, the accuracy of the classification result obtained by the classifier.
Further, in step S1152, passing the corrected search key element-material text semantic related feature vector through a classifier to obtain a classification result, where the classification result is used to indicate whether the matching degree of the first candidate material meets a predetermined requirement, and includes: performing full-connection coding on the corrected search key element-material text semantic related feature vector by using a full-connection layer of the classifier to obtain a coding classification feature vector; and inputting the coding classification feature vector into a Softmax classification function of the classifier to obtain the classification result.
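A minimal sketch of the full-connection coding followed by Softmax classification described for step S1152. The two-class weight matrix W, bias b, and decision threshold are hypothetical parameters introduced for illustration; in practice they would be learned.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(feature, W, b, threshold=0.5):
    """Full-connection coding plus Softmax classification.

    W (2 x d) and b (2,) are hypothetical fully connected layer parameters
    for two classes: index 0 = 'not matched', index 1 = 'matched'.
    Returns (meets_requirement, class probabilities).
    """
    logits = W @ feature + b         # full-connection coding
    probs = softmax(logits)          # Softmax classification function
    return bool(probs[1] >= threshold), probs
```

The boolean output corresponds to the classification result indicating whether the matching degree of the first alternative material meets the predetermined requirement.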
It should be appreciated that the role of the classifier is to learn classification rules from training data with known labels, and then classify (or predict) unknown data. Logistic regression, SVM, and similar methods are commonly used to solve binary classification problems. For multi-class classification, logistic regression or SVM can also be used, but multiple binary classifiers must be combined, which is error-prone and inefficient; the commonly used multi-class method is the Softmax classification function.
In summary, according to the artificial intelligence-based video content production method provided by the embodiment of the application, the production process of the video content can be optimized.
Fig. 7 shows a block diagram of an artificial intelligence based video content production system 100 according to an embodiment of the application. As shown in fig. 7, the artificial intelligence based video content production system 100 according to an embodiment of the present application includes: an alternative material matching module 110, configured to match alternative materials from a given material library according to given keywords and limiting conditions; a video content generating module 120, configured to analyze, screen, clip, and splice the alternative materials to generate video content; and an explanation voice generation module 130, configured to generate a broadcast explanation voice adapted to the video content based on digital human technology.
In one possible implementation, the candidate material matching module 110 includes: the candidate material text description acquisition unit is used for acquiring the text description of the first candidate material; the semantic understanding unit of the alternative materials is used for carrying out semantic understanding on the text description of the first alternative materials so as to obtain a sequence of semantic feature vectors of the materials text; the limiting condition semantic coding analysis unit is used for carrying out semantic coding and analysis on the keywords and the limiting conditions so as to obtain global semantic feature vectors of the search key elements; the semantic feature interaction unit is used for carrying out semantic feature interaction and fusion on the sequence of the search key element global semantic feature vector and the material text semantic feature vector so as to obtain an optimized search key element-material text semantic related feature vector; and an analysis unit, configured to determine whether to return the first candidate material based on the optimized search key element-material text semantic related feature vector.
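The data flow between the units of the alternative material matching module 110 can be sketched as follows. The callables passed in are hypothetical stand-ins for the units named above; only the wiring between them is taken from the description.

```python
def match_candidate(description, keywords, conditions,
                    understand, encode_query, interact_fuse, decide):
    """Illustrative composition of the matching module's units."""
    H = understand(description)              # sequence of material text semantic feature vectors
    q = encode_query(keywords, conditions)   # search key element global semantic feature vector
    v = interact_fuse(q, H)                  # optimized semantically related feature vector
    return decide(v)                         # whether to return the first alternative material
```

A trivial invocation with stub units shows the intended call order; real units would be the semantic understanding, encoding, fusion, and analysis components described above.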
Here, it will be understood by those skilled in the art that the specific functions and operations of the respective units and modules in the above-described artificial intelligence-based video content production system 100 have been described in detail in the above description of the artificial intelligence-based video content production method with reference to fig. 1 to 6, and thus, repetitive descriptions thereof will be omitted.
As described above, the artificial intelligence-based video content production system 100 according to the embodiment of the present application may be implemented in various wireless terminals, for example, a server or the like having an artificial intelligence-based video content production algorithm. In one possible implementation, the artificial intelligence based video content production system 100 according to embodiments of the present application may be integrated into a wireless terminal as a software module and/or hardware module. For example, the artificial intelligence based video content production system 100 may be a software module in the operating system of the wireless terminal or may be an application developed for the wireless terminal; of course, the artificial intelligence based video content production system 100 could equally be one of many hardware modules of the wireless terminal.
Alternatively, in another example, the artificial intelligence based video content production system 100 and the wireless terminal may be separate devices, and the artificial intelligence based video content production system 100 may be connected to the wireless terminal through a wired and/or wireless network and transmit the interactive information in an agreed data format.
Fig. 8 illustrates an application scenario diagram of an artificial intelligence based video content production method according to an embodiment of the present application. As shown in fig. 8, in this application scenario, first, a text description of a first candidate material (e.g., D1 illustrated in fig. 8) and a keyword and a definition condition (e.g., D2 illustrated in fig. 8) are acquired, and then the text description of the first candidate material and the keyword and the definition condition are input into a server (e.g., S illustrated in fig. 8) where an artificial intelligence-based video content production algorithm is deployed, wherein the server is capable of processing the text description of the first candidate material and the keyword and the definition condition using the artificial intelligence-based video content production algorithm to obtain a classification result for indicating whether or not the matching degree of the first candidate material meets a predetermined requirement.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of embodiments of the application has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (6)
1. A method of video content production based on artificial intelligence, comprising: matching selected materials from a given material library according to given keywords and limiting conditions; analyzing, screening, cutting and splicing the alternative materials to generate video content; and generating a broadcast explanation voice adapted to the video content based on digital human technology, wherein matching the selected material from the given material library according to the given keyword and the limiting condition comprises:
Acquiring a text description of a first alternative material;
carrying out semantic understanding on the text description of the first alternative material to obtain a sequence of semantic feature vectors of the material text;
carrying out semantic coding and analysis on the keywords and the limiting conditions to obtain a global semantic feature vector of the search key element;
Carrying out semantic feature interaction and fusion on the sequence of the global semantic feature vector of the search key element and the semantic feature vector of the material text to obtain an optimized search key element-material text semantic related feature vector; and
Determining whether to return the first alternative material based on the optimized search key element-material text semantic related feature vector;
The semantic feature interaction and fusion are carried out on the sequence of the global semantic feature vector of the search key element and the semantic feature vector of the material text to obtain an optimized search key element-material text semantic related feature vector, which comprises the following steps:
Calculating the correlation degree between the global semantic feature vector of the search key element and each semantic feature vector of the material text in the sequence of the semantic feature vectors of the material text to obtain a search key element-material text semantic correlation feature vector consisting of a plurality of correlation degrees; and
Performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the sequence of the material text semantic feature vector to obtain the optimized search key element-material text semantic related feature vector;
The optimizing the search key element-material text semantic related feature vector based on the sequence of the material text semantic feature vector to obtain the optimized search key element-material text semantic related feature vector comprises the following steps:
Calculating the weight value of the sequence of the semantic feature vectors of the material text for the semantic related feature vectors of the search key element-material text according to the following weight formula;
wherein, the weight formula is: S = σ(A·V + B·h̄);
Wherein A is a matrix of 1×N_w, N_w is the dimension of the search key element-material text semantic related feature vector, V is the search key element-material text semantic related feature vector, B is a matrix of 1×N_h, N_h is the dimension of each material text semantic feature vector, σ is a Sigmoid function, h_i is the i-th material text semantic feature vector in the sequence of material text semantic feature vectors, h̄ represents the material text global average value vector formed by the global average values of all material text semantic feature vectors in the sequence of material text semantic feature vectors, and S is the weight value; and
Carrying out residual information fusion optimization on the search key element-material text semantic related feature vector by using the following residual information fusion optimization formula based on the weight value to obtain the optimized search key element-material text semantic related feature vector;
the residual information fusion optimization formula is as follows: V′ = M_w(V) + S·M_h(h_i);
Wherein V′ is the optimized search key element-material text semantic related feature vector, M_w and M_h represent convolution operations with a 1×1 convolution kernel, V is the search key element-material text semantic related feature vector, S is the weight value, and h_i is the i-th material text semantic feature vector in the sequence of material text semantic feature vectors.
2. The artificial intelligence based video content production method of claim 1, wherein semantically understanding the text description of the first alternative material to obtain a sequence of material text semantic feature vectors, comprises:
And carrying out word segmentation processing on the text description of the first alternative material, and then obtaining a sequence of semantic feature vectors of the text of the material through a material context semantic encoder comprising a word embedding layer.
3. The method for producing video content based on artificial intelligence according to claim 2, wherein the step of obtaining the sequence of semantic feature vectors of the text of the material by a material context semantic encoder including a word embedding layer after the text description of the first candidate material is subjected to word segmentation processing comprises the steps of:
Word segmentation processing is carried out on the text description of the first alternative material so as to convert the text description of the first alternative material into a word sequence composed of a plurality of words;
Mapping each word in the word sequence to a word vector by using a word embedding layer of the material context semantic encoder comprising the word embedding layer to obtain a sequence of word vectors; and
And performing global-based context semantic coding on the sequence of word vectors by using the material context semantic coder containing the word embedding layer to obtain a sequence of semantic feature vectors of the material text.
4. The artificial intelligence based video content production method of claim 3, wherein semantically encoding and analyzing the keywords and the defined conditions to obtain a search key element global semantic feature vector comprises:
performing word embedding coding on the keywords and the limiting conditions to obtain a sequence of search key element semantic embedded vectors; and
And passing the sequence of the search key element semantic embedded vector through a semantic encoder to obtain the search key element global semantic feature vector.
5. The artificial intelligence based video content production method according to claim 4, wherein calculating the correlation between the search key element global semantic feature vector and each of the material text semantic feature vectors in the sequence of material text semantic feature vectors to obtain a search key element-material text semantic correlation feature vector composed of a plurality of the correlations comprises:
Calculating the correlation between the global semantic feature vector of the search key element and each semantic feature vector of the material text in the sequence of the semantic feature vectors of the material text according to the following correlation formula to obtain a plurality of correlations;
Wherein, the correlation formula is: r = (W₁·V₁)ᵀ·(W₂·V₂);
Wherein V₁ is the search key element global semantic feature vector, V₂ is each material text semantic feature vector in the sequence of material text semantic feature vectors, W₁ and W₂ are two different linear transforms, (·)ᵀ represents a transpose operation, and r is the correlation degree between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors; and
And arranging a plurality of relevancy to obtain the search key element-material text semantic correlation feature vector.
6. The artificial intelligence based video content production method of claim 5, wherein determining whether to return the first alternative material based on the optimized search key element-material text semantic related feature vector comprises:
Carrying out feature distribution correction on the optimized search key element-material text semantic related feature vector to obtain a corrected search key element-material text semantic related feature vector;
The corrected search key element-material text semantic related feature vector is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the matching degree of the first alternative material meets the preset requirement; and
And responding to the classification result that the matching degree of the first alternative materials and the keywords meets the preset requirement, and returning the first alternative materials.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311654595.8A CN117812381B (en) | 2023-12-05 | 2023-12-05 | Video content making method based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117812381A (en) | 2024-04-02
CN117812381B (en) | 2024-06-04
Family
ID=90428849
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311654595.8A Active CN117812381B (en) | 2023-12-05 | 2023-12-05 | Video content making method based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117812381B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118332140B (en) * | 2024-06-14 | 2024-08-23 | 北京睿智荟聚科技发展有限公司 | Audio and video content retrieval system and method based on artificial intelligence |
CN118521111B (en) * | 2024-06-19 | 2025-04-18 | 上海源庐加佳信息科技有限公司 | Intelligent collaboration system for resource parties based on data analysis |
CN118646940A (en) * | 2024-08-13 | 2024-09-13 | 深圳市客一客信息科技有限公司 | Video generation method, device and system based on multimodal input |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002230021A (en) * | 2001-01-30 | 2002-08-16 | Canon Inc | Information retrieval device and method, and storage medium |
JP2008217701A (en) * | 2007-03-07 | 2008-09-18 | Sharp Corp | Metadata providing device, metadata providing method, metadata providing program, and recording medium recording metadata providing program |
KR20120050660A (en) * | 2010-11-11 | 2012-05-21 | 고려대학교 산학협력단 | Face searching system and method based on face recognition |
CN110750627A (en) * | 2018-07-19 | 2020-02-04 | 上海谦问万答吧云计算科技有限公司 | Material retrieval method and device, electronic equipment and storage medium |
CN112015949A (en) * | 2020-08-26 | 2020-12-01 | 腾讯科技(上海)有限公司 | Video generation method and device, storage medium and electronic equipment |
CN113094552A (en) * | 2021-03-19 | 2021-07-09 | 北京达佳互联信息技术有限公司 | Video template searching method and device, server and readable storage medium |
CN115455152A (en) * | 2022-09-29 | 2022-12-09 | 北京世纪好未来教育科技有限公司 | Recommended methods, devices, electronic equipment, and storage media for writing materials |
CN116010713A (en) * | 2023-03-27 | 2023-04-25 | 日照职业技术学院 | Innovative entrepreneur platform service data processing method and system based on cloud computing |
CN116932723A (en) * | 2023-07-28 | 2023-10-24 | 世优(北京)科技有限公司 | Man-machine interaction system and method based on natural language processing |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106202413B (en) * | 2016-07-11 | 2018-11-20 | 北京大学深圳研究生院 | A kind of cross-media retrieval method |
Non-Patent Citations (1)
Title |
---|
"Research and Implementation of a Question Answering System Based on Knowledge Graphs"; Li Fei; China Master's Theses Full-text Database (Information Science and Technology); 2023-02; pp. I138-4309 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address |
Address after: Building 60, 1st Floor, No.7 Jiuxianqiao North Road, Chaoyang District, Beijing 021 Patentee after: Shiyou (Beijing) Technology Co.,Ltd. Country or region after: China Address before: 4017, 4th Floor, Building 2, No.17 Ritan North Road, Chaoyang District, Beijing Patentee before: 4U (BEIJING) TECHNOLOGY CO.,LTD. Country or region before: China |
|