CN117812381B - Video content making method based on artificial intelligence - Google Patents
- Publication number: CN117812381B (application CN202311654595.8A)
- Authority
- CN
- China
- Prior art keywords
- semantic
- feature vector
- key element
- search key
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/735—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/7867—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44016—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An artificial intelligence based video content production method is disclosed. First, a text description of a first candidate material is acquired. Semantic understanding is then performed on that text description to obtain a sequence of material text semantic feature vectors. Next, the keywords and limiting conditions are semantically encoded and analyzed to obtain a search key element global semantic feature vector. Semantic feature interaction and fusion are then performed between the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector. Finally, whether to return the first candidate material is determined based on that optimized feature vector. In this way, the video content production process can be optimized.
Description
Technical Field
The present application relates to the field of video content production, and more particularly, to an artificial intelligence-based video content production method.
Background
In the current digital age, demand for video content keeps growing, yet conventional video content production typically consumes a significant amount of time.
For example, when selecting materials, conventional video content production methods rely on manual screening and matching. In this process, people often spend a great deal of time searching, browsing, and evaluating large volumes of material to find items suitable for the video they want to author. Such tedious and repetitive manual work consumes both time and resources.
Therefore, an optimized video content production method is desired.
Disclosure of Invention
In view of the above, the present application proposes an artificial intelligence-based video content production method, which can optimize the production process of video content.
According to an aspect of the present application, there is provided an artificial intelligence-based video content production method, including: matching candidate materials from a given material library according to given keywords and limiting conditions; analyzing, screening, cutting and splicing the candidate materials to generate video content; and generating a broadcast explanation voice adapted to the video content based on digital human technology, wherein matching candidate materials from the given material library according to the given keywords and limiting conditions includes:
acquiring a text description of a first candidate material;
performing semantic understanding on the text description of the first candidate material to obtain a sequence of material text semantic feature vectors;
performing semantic coding and analysis on the keywords and the limiting conditions to obtain a search key element global semantic feature vector;
performing semantic feature interaction and fusion on the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector; and
determining whether to return the first candidate material based on the optimized search key element-material text semantic related feature vector.
According to the embodiments of the application, a text description of a first candidate material is first obtained. Semantic understanding is performed on that text description to obtain a sequence of material text semantic feature vectors. The keywords and limiting conditions are then semantically encoded and analyzed to obtain a search key element global semantic feature vector. Semantic feature interaction and fusion are performed between the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector. Finally, whether to return the first candidate material is determined based on that optimized feature vector. In this way, the video content production process can be optimized.
Other features and aspects of the present application will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the application and together with the description, serve to explain the principles of the application.
FIG. 1 illustrates a flow chart of an artificial intelligence based video content production method according to an embodiment of the application.
Fig. 2 shows a flow chart of sub-step S110 of the artificial intelligence based video content production method according to an embodiment of the application.
Fig. 3 shows an architectural diagram of sub-step S110 of the artificial intelligence based video content production method according to an embodiment of the present application.
Fig. 4 shows a flowchart of sub-step S113 of the artificial intelligence based video content production method according to an embodiment of the application.
Fig. 5 shows a flowchart of sub-step S114 of the artificial intelligence based video content production method according to an embodiment of the application.
Fig. 6 shows a flowchart of sub-step S115 of the artificial intelligence based video content production method according to an embodiment of the application.
FIG. 7 illustrates a block diagram of an artificial intelligence based video content production system in accordance with an embodiment of the application.
Fig. 8 illustrates an application scenario diagram of an artificial intelligence based video content production method according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application is made clearly and fully with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort also fall within the scope of the application.
As used in the specification and in the claims, the terms "a," "an," and "the" do not denote the singular and may include the plural unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
Various exemplary embodiments, features and aspects of the application will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
In addition, numerous specific details are set forth in the following description in order to provide a better illustration of the application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, well known methods, procedures, components, and circuits have not been described in detail so as not to obscure the present application.
The application provides an artificial intelligence-based video content production method, and fig. 1 shows a flow chart of the method according to an embodiment of the application. As shown in fig. 1, the artificial intelligence-based video content production method according to the embodiment of the application comprises the following steps: S110, matching candidate materials from a given material library according to given keywords and limiting conditions; S120, analyzing, screening, cutting and splicing the candidate materials to generate video content; and S130, generating a broadcast explanation voice adapted to the video content based on digital human technology.
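Steps S110 through S130 can be sketched as a minimal pipeline. The toy matcher, the string-based "splicing," and the narration placeholder below are illustrative stand-ins for the method's actual matching, editing, and digital-human components, not an implementation of the patent:

```python
def matches(m, keywords, constraints):
    # Toy S110 matcher: keyword overlap with the material's text description.
    # A real system would use the semantic matching described below.
    text = m["description"].lower()
    return any(k.lower() in text for k in keywords)

def produce_video(keywords, constraints, library):
    # S110: match candidate materials by keywords and limiting conditions
    candidates = [m for m in library if matches(m, keywords, constraints)]
    # S120: analyze/screen/cut/splice the candidates into video content
    video = " + ".join(m["clip"] for m in candidates)
    # S130: placeholder for digital-human narration synthesis
    narration = f"Narration for: {video}"
    return video, narration

library = [
    {"description": "aerial city footage at night", "clip": "clip_a"},
    {"description": "forest stream in the morning", "clip": "clip_b"},
]
video, narration = produce_video(["city"], {"time": "night"}, library)
print(video)  # clip_a
```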
In particular, it is considered that conventional video content production methods screen and match materials manually during material selection. In this process, people often spend a great deal of time searching, browsing, and evaluating large volumes of material to find items suitable for the video they want to author. Such tedious and repetitive manual work consumes both time and resources. To solve this technical problem, the technical concept of the application is to combine natural language processing technology to perform semantic understanding and analysis on the text descriptions of the candidate materials and on the given keywords and limiting conditions, and to fuse and match their semantic feature information so as to return candidate materials whose matching degree meets a predetermined requirement, i.e., to select from a given material library the candidate materials whose matching degree meets the predetermined requirement for video content production.
Based on this, fig. 2 shows a flow chart of sub-step S110 of the artificial intelligence based video content production method according to an embodiment of the application. Fig. 3 shows an architectural diagram of sub-step S110 of the method according to an embodiment of the present application. As shown in fig. 2 and fig. 3, matching candidate materials from a given material library according to given keywords and limiting conditions includes: S111, acquiring a text description of a first candidate material; S112, performing semantic understanding on the text description of the first candidate material to obtain a sequence of material text semantic feature vectors; S113, performing semantic coding and analysis on the keywords and the limiting conditions to obtain a search key element global semantic feature vector; S114, performing semantic feature interaction and fusion on the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain an optimized search key element-material text semantic related feature vector; and S115, determining whether to return the first candidate material based on the optimized search key element-material text semantic related feature vector.
Specifically, in the technical scheme of the application, matching candidate materials from a given material library according to the given keywords and limiting conditions proceeds as follows. First, a text description of the first candidate material is acquired. The text description is then word-segmented and passed through a material context semantic encoder comprising a word embedding layer to obtain a sequence of material text semantic feature vectors. That is, the text description of the first candidate material is converted into structured data carrying semantic feature information. In particular, the sequence of material text semantic feature vectors characterizes the subject matter and contextual semantics of the first candidate material, enabling a model to understand its textual meaning.
Accordingly, in step S112, performing semantic understanding on the text description of the first candidate material to obtain a sequence of material text semantic feature vectors includes: performing word segmentation on the text description of the first candidate material and then passing it through a material context semantic encoder comprising a word embedding layer to obtain the sequence of material text semantic feature vectors.
Specifically, in one example, passing the word-segmented text description of the first candidate material through the material context semantic encoder comprising a word embedding layer to obtain the sequence of material text semantic feature vectors includes: performing word segmentation on the text description of the first candidate material to convert it into a word sequence composed of a plurality of words; mapping each word in the word sequence to a word vector using the word embedding layer of the material context semantic encoder to obtain a sequence of word vectors; and performing global context semantic encoding on the sequence of word vectors using the material context semantic encoder to obtain the sequence of material text semantic feature vectors.
It is worth mentioning that a word embedding layer (Word Embedding Layer) is a natural language processing technique for mapping the words in a text to real-valued vectors so that a computer can better understand and process text data. It represents discrete words as continuous real-valued vectors, mapping semantically close words to nearby locations in the vector space. The word embedding layer thus converts words into vector representations suitable for processing and analysis by a computer. By mapping words into a vector space, the word embedding layer captures semantic relationships between words, enabling a computer to understand their similarity and variability. Using a material context semantic encoder that includes a word embedding layer, each word in the text description can be converted into a corresponding word vector. These word vectors represent the semantic information of the words, including their contextual information and semantic similarity. By then performing global context semantic encoding on the word vector sequence, the sequence of material text semantic feature vectors is obtained, where each feature vector represents part of the semantic information of the text description. Through the word embedding layer, the words in the text description are converted into continuous real-valued vector representations, providing better input representations for subsequent text processing tasks and improving their performance. The word embedding layer can also reduce the feature dimension and improve the computational efficiency of the model.
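The segmentation, word embedding, and global context encoding path described above can be sketched as follows. The vocabulary, random embedding table, and the simple mean-mixing rule are toy stand-ins for a trained embedding layer and context encoder:

```python
import numpy as np

def embed_tokens(tokens, vocab, embedding_matrix):
    # Word embedding layer: map each token to its word vector;
    # out-of-vocabulary tokens fall back to index 0 ("<unk>").
    ids = [vocab.get(t, 0) for t in tokens]
    return embedding_matrix[ids]  # shape: (seq_len, dim)

def context_encode(word_vectors):
    # Toy stand-in for global context semantic encoding: each output
    # vector mixes its own embedding with the sequence mean.
    ctx = word_vectors.mean(axis=0, keepdims=True)
    return 0.5 * word_vectors + 0.5 * ctx

rng = np.random.default_rng(0)
vocab = {"<unk>": 0, "city": 1, "night": 2, "aerial": 3, "footage": 4}
E = rng.standard_normal((len(vocab), 8))      # word embedding table
tokens = "aerial footage city night".split()  # word-segmented description
seq = context_encode(embed_tokens(tokens, vocab, E))
print(seq.shape)  # (4, 8): one semantic feature vector per word
```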
Meanwhile, word embedding encoding is performed on the keywords and the limiting conditions to obtain a sequence of search key element semantic embedded vectors, and this sequence is passed through a semantic encoder to obtain the search key element global semantic feature vector. Keywords are usually important words related to the video content or materials; they may describe aspects of the video such as topic, content, and emotion. Limiting conditions restrict the search range or act as screening conditions, such as time, place, and person. The semantic information of the keywords and limiting conditions helps determine which materials in the material library match a particular requirement, and the word embedding encoding and semantic encoding steps capture this important semantic information.
Accordingly, in step S113, as shown in fig. 4, performing semantic coding and analysis on the keywords and the limiting conditions to obtain the search key element global semantic feature vector includes: S1131, performing word embedding encoding on the keywords and the limiting conditions to obtain a sequence of search key element semantic embedded vectors; and S1132, passing the sequence of search key element semantic embedded vectors through a semantic encoder to obtain the search key element global semantic feature vector.
It will be appreciated that in step S1131, the keywords and limiting conditions are translated into a sequence of word vectors; that is, each word is mapped to a corresponding real-valued vector. This converts the discrete text data into a continuous vector representation that a computer can better understand and process. Word embedding encodings capture semantic relationships between words, enabling a computer to understand their similarity and variability. In step S1132, the sequence of search key element semantic embedded vectors is processed by a semantic encoder to obtain the global semantic feature vector. The semantic encoder performs global context semantic encoding on the input vector sequence, capturing the semantic associations between terms in the sequence and the overall semantic information, yielding the search key element global semantic feature vector. Through these two steps, the keywords and limiting conditions are converted into a global semantic feature vector that better represents the semantics of the search key elements, improving the accuracy and effect of the search. Meanwhile, applying word embedding encoding and a semantic encoder captures the semantic associations among the search key elements and helps the computer better understand the search requirements and their context.
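Steps S1131 and S1132 can be sketched as follows. Mean pooling stands in for the semantic encoder, and the vocabulary and embedding table are illustrative, not the patent's trained components:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = {"<unk>": 0, "city": 1, "night": 2, "skyline": 3}
E = rng.standard_normal((len(vocab), 8))  # word embedding table

def encode_search_key(keywords, constraints, vocab, E):
    # S1131: word-embed the keywords and limiting-condition terms.
    tokens = list(keywords) + [v for v in constraints.values()]
    ids = [vocab.get(t, 0) for t in tokens]
    # S1132: pool the embedded sequence into one global semantic
    # feature vector (mean pooling as a toy semantic encoder).
    return E[ids].mean(axis=0)

v_q = encode_search_key(["city", "skyline"], {"time": "night"}, vocab, E)
print(v_q.shape)  # (8,): a single global semantic feature vector
```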
Then, the correlation between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors is calculated to obtain a search key element-material text semantic correlation feature vector composed of a plurality of correlations. That is, the semantic similarity and matching degree between the keywords and limiting conditions on the one hand, and the first candidate material on the other, are measured by calculating the correlation between the search key element global semantic feature vector and each material text semantic feature vector.
Then, residual information fusion optimization is performed on the search key element-material text semantic related feature vector based on the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector. Here, considering that during semantic matching and similarity analysis there may be some fuzzy matching between the search key element information, i.e., the semantic feature information expressed by the keywords and limiting conditions, and the semantic information of the first candidate material text, residual information fusion can strengthen the semantic matching degree information expressed by the optimized search key element-material text semantic related feature vector. In addition, residually fusing the semantic information of the first candidate material text, as expressed by the sequence of material text semantic feature vectors, can supplement any missing or incomplete semantic information in the search key element-material text semantic related feature vector.
Accordingly, in step S114, as shown in fig. 5, performing semantic feature interaction and fusion on the search key element global semantic feature vector and the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector includes: S1141, calculating the correlation between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors to obtain a search key element-material text semantic correlation feature vector composed of a plurality of correlations; and S1142, performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector.
It should be understood that in step S1141, semantic feature interaction is performed between the search key element global semantic feature vector and the sequence of material text semantic feature vectors. By calculating the correlation between them, the semantic relevance between the search key elements and the material text can be measured, and these correlations serve as the components of the search key element-material text semantic correlation feature vector, representing the degree of semantic relevance between the two. In step S1142, the search key element-material text semantic related feature vector is optimized using the sequence of material text semantic feature vectors. Residual information fusion reduces redundant information in the vector so that more accurate and effective semantically related features can be extracted, yielding the optimized search key element-material text semantic related feature vector, which contains more discriminative and important semantic information. Through these two steps, the search key element global semantic feature vector and the material text semantic feature vectors interact and fuse, producing the optimized search key element-material text semantic related feature vector. This vector comprehensively considers the semantic association between the search key elements and the material text, provides more accurate and richer semantically related features, and helps optimize the search process and improve the quality of search results.
In step S1141, calculating the correlation between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors to obtain a search key element-material text semantic correlation feature vector composed of a plurality of correlations includes: calculating the correlations according to the following correlation formula:

$r_i = \left(T_1 V_q\right)^\top \left(T_2 V_i\right)$

where $V_q$ is the search key element global semantic feature vector, $V_i$ is the $i$-th material text semantic feature vector in the sequence of material text semantic feature vectors, $T_1$ and $T_2$ are two different linear transformations, $\top$ denotes the transpose operation, and $r_i$ is the correlation between the search key element global semantic feature vector and the $i$-th material text semantic feature vector; and arranging the plurality of correlations to obtain the search key element-material text semantic correlation feature vector.
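The correlation formula above can be sketched directly; the random matrices stand in for the two learned linear transformations:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8
T1 = rng.standard_normal((d, d))  # first linear transformation
T2 = rng.standard_normal((d, d))  # second linear transformation
v_q = rng.standard_normal(d)            # search key element global vector
materials = rng.standard_normal((5, d)) # sequence of material text vectors

def correlation_vector(v_q, materials, T1, T2):
    # r_i = (T1 v_q)^T (T2 v_i): project both sides, then take a dot product.
    q = T1 @ v_q
    return np.array([q @ (T2 @ v_i) for v_i in materials])

r = correlation_vector(v_q, materials, T1, T2)
print(r.shape)  # (5,): one correlation per material text feature vector
```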
In step S1142, performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the sequence of material text semantic feature vectors to obtain the optimized search key element-material text semantic related feature vector includes: first calculating the weight value of the sequence of material text semantic feature vectors for the search key element-material text semantic related feature vector according to the following weight formula:

$w = \sigma\!\left(\frac{W_1 V_c}{\sqrt{d_1}} + \frac{W_2 \bar{V}}{\sqrt{d_2}}\right)$

where $W_1$ is a learnable projection matrix, $d_1$ is the dimension of the search key element-material text semantic related feature vector $V_c$, $W_2$ is a second learnable projection matrix, $d_2$ is the dimension of each material text semantic feature vector, $\sigma$ is the Sigmoid function, $V_i$ is the $i$-th material text semantic feature vector in the sequence, $\bar{V} = \frac{1}{N}\sum_{i=1}^{N} V_i$ is the material text global average vector over all material text semantic feature vectors in the sequence, and $w$ is the weight value; and then performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the weight value using the following residual information fusion optimization formula:

$V_c' = \mathrm{Conv}_1\!\left(V_c\right) + w \odot \mathrm{Conv}_2\!\left(\bar{V}\right)$

where $V_c'$ is the optimized search key element-material text semantic related feature vector, $\mathrm{Conv}_1$ and $\mathrm{Conv}_2$ represent convolution operations with a $1\times1$ convolution kernel, $V_c$ is the search key element-material text semantic related feature vector, $w$ is the weight value, and $\bar{V}$ is the global average of the material text semantic feature vectors in the sequence.
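The weight and residual-fusion step can be sketched as follows. The $1\times1$ convolutions act on single vectors here, so they reduce to plain linear maps, and all matrices are random illustrative stand-ins for learned parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_fusion(v_c, materials, W1, W2, C1, C2):
    # Weight value from the related vector and the material global average.
    d1, d2 = v_c.shape[0], materials.shape[1]
    v_bar = materials.mean(axis=0)  # global average of material vectors
    w = sigmoid(W1 @ v_c / np.sqrt(d1) + W2 @ v_bar / np.sqrt(d2))
    # Residual fusion: 1x1 convolutions become linear maps on one vector.
    return C1 @ v_c + w * (C2 @ v_bar)

rng = np.random.default_rng(3)
d1 = 5   # dimension of the semantically related feature vector
d2 = 8   # dimension of each material text semantic feature vector
v_c = rng.standard_normal(d1)
materials = rng.standard_normal((5, d2))
W1 = rng.standard_normal((d1, d1)); W2 = rng.standard_normal((d1, d2))
C1 = rng.standard_normal((d1, d1)); C2 = rng.standard_normal((d1, d2))
v_opt = residual_fusion(v_c, materials, W1, W2, C1, C2)
print(v_opt.shape)  # (5,): optimized semantically related feature vector
```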
Further, the optimized search key element-material text semantic related feature vector is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the matching degree of the first alternative material meets a predetermined requirement; and in response to the classification result indicating that the matching degree of the first alternative material and the keyword meets the predetermined requirement, the first alternative material is returned.
Accordingly, in step S115, as shown in fig. 6, based on the optimized search key element-material text semantic related feature vector, determining whether to return the first candidate material includes: s1151, carrying out feature distribution correction on the optimized search key element-material text semantic related feature vector to obtain a corrected search key element-material text semantic related feature vector; s1152, passing the corrected search key element-material text semantic related feature vector through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the matching degree of the first alternative material meets a preset requirement; and S1153, returning the first alternative material in response to the classification result that the matching degree of the first alternative material and the keyword meets a preset requirement.
It should be understood that in step S1151, performing feature distribution correction on the optimized search key element-material text semantic related feature vector means further processing the feature vector so that it conforms to the expected feature distribution. By correcting the feature distribution, the quality and expressive capability of the search key element-material text semantic related feature vector can be further improved, making it more suitable for the subsequent classification and matching tasks. In step S1152, the corrected search key element-material text semantic related feature vector is classified by a classifier. The classifier performs classification and judgment on the feature vector according to its feature representation and preset label information; through the classifier, the search key element-material text semantic related feature vector is mapped to a preset category or score indicating whether the matching degree of the first alternative material meets the predetermined requirement. In step S1153, whether the matching degree between the first alternative material and the keyword meets the predetermined requirement is determined according to the classification result. If the classification result indicates that the matching degree meets the requirement, that is, the matching degree is sufficiently high, the first alternative material is returned as the search result. This ensures a high semantic relevance between the returned material and the search key elements, improving the quality and accuracy of the search result.
Through the three steps, correction, classification and matching judgment can be carried out on the optimized search key element-material text semantic related feature vector, and whether the first alternative material is returned as a search result is finally determined. The steps are helpful for screening and selecting the materials which are most matched with the key elements of the search, and improving the search effect and the user satisfaction.
In the above technical solution, the search key element global semantic feature vector expresses the encoded text semantic features of the keyword and the defined condition, and each material text semantic feature vector in the sequence expresses word-source context-based text semantic features of the text description of the first alternative material; the search key element-material text semantic related feature vector therefore expresses semantic cross-domain relevance features between different text semantic feature domains. When this feature vector is optimized by residual information fusion based on the sequence of material text semantic feature vectors, the semantic residual information caused by the inconsistent semantic expression between the sequence of material text semantic feature vectors and the search key element-material text semantic related feature vector may lead to fusion sparsity in the residual information fusion optimization, thereby weakening the expression effect of the optimized search key element-material text semantic related feature vector. It is therefore desirable to perform feature fusion correction based on the respective feature expression significance and criticality of the sequence of material text semantic feature vectors and of the search key element-material text semantic related feature vector, so as to improve the expression effect of the optimized search key element-material text semantic related feature vector. Based on this, the applicant of the present application corrects the sequence of material text semantic feature vectors together with the optimized search key element-material text semantic related feature vector.
Correspondingly, performing feature distribution correction on the optimized search key element-material text semantic related feature vector to obtain the corrected search key element-material text semantic related feature vector includes: correcting the feature distribution of the optimized search key element-material text semantic related feature vector using the following correction formula; wherein, the correction formula is: V″ = w₁·(√(α·V_c) ⊙ √(β·V′)) ⊖ w₂·(α·V_c ⊖ β·V′); wherein V_c is the cascading feature vector obtained by cascading the sequence of material text semantic feature vectors, V′ is the optimized search key element-material text semantic related feature vector, √ denotes the position-wise square root of a feature vector, α and β are respectively the reciprocals of the maximum feature values of V_c and V′, w₁ and w₂ are weight hyperparameters, ⊙ denotes position-wise multiplication, ⊖ denotes vector subtraction, and V″ is the corrected search key element-material text semantic related feature vector.
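The correction formula image is lost, so the sketch below implements one plausible reading of the operations listed above (position-wise square root, position-wise multiplication, vector subtraction, with α and β taken as reciprocals of each vector's maximum feature value). The truncation to a common length and the use of absolute values to keep the square roots real are illustration-only assumptions.

```python
import numpy as np

def feature_correction(Vc, Vp, w1=0.5, w2=0.5):
    """Hedged sketch of the feature-distribution correction.

    Vc     : cascading feature vector (concatenated material text semantic feature vectors)
    Vp     : optimized search key element-material text semantic related feature vector
    w1, w2 : weight hyperparameters
    """
    n = min(len(Vc), len(Vp))
    Vc = np.abs(Vc[:n])                 # truncate to a common length; abs keeps sqrt real
    Vp = np.abs(Vp[:n])
    alpha = 1.0 / Vc.max()              # reciprocal of the maximum feature value of Vc
    beta = 1.0 / Vp.max()               # reciprocal of the maximum feature value of Vp
    # position-wise square root and multiplication, then vector subtraction
    return w1 * np.sqrt(alpha * Vc) * np.sqrt(beta * Vp) - w2 * (alpha * Vc - beta * Vp)
```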
Here, a pre-segmented local group of feature value sets is obtained from the on-off values of each feature value of the sequence of material text semantic feature vectors and of the optimized search key element-material text semantic related feature vector, and the key maximum-value features of both are regressed from this pre-segmented local group. In this way, the per-position significance distribution of the feature values can be improved based on the concept of furthest point sampling, so that sparse fusion control among the feature vectors is performed through the key features with significant distribution, and the corrected feature vector V″ restores the original feature manifold geometric representation of the sequence of material text semantic feature vectors and the optimized search key element-material text semantic related feature vector. Fusing the corrected feature vector V″ with the optimized search key element-material text semantic related feature vector thus improves the expression effect of the optimized feature vector and, in turn, the accuracy of the classification result obtained by the classifier.
Further, in step S1152, passing the corrected search key element-material text semantic related feature vector through a classifier to obtain a classification result, where the classification result is used to indicate whether the matching degree of the first candidate material meets a predetermined requirement, and includes: performing full-connection coding on the corrected search key element-material text semantic related feature vector by using a full-connection layer of the classifier to obtain a coding classification feature vector; and inputting the coding classification feature vector into a Softmax classification function of the classifier to obtain the classification result.
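A minimal sketch of the full-connection coding followed by Softmax classification described for step S1152. The two-class weight matrix W, bias b, and decision threshold are hypothetical parameters introduced for illustration; in practice they would be learned.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(feature, W, b, threshold=0.5):
    """Full-connection coding plus Softmax classification.

    W (2 x d) and b (2,) are hypothetical fully connected layer parameters
    for two classes: index 0 = 'not matched', index 1 = 'matched'.
    Returns (meets_requirement, class probabilities).
    """
    logits = W @ feature + b         # full-connection coding
    probs = softmax(logits)          # Softmax classification function
    return bool(probs[1] >= threshold), probs
```

The boolean output corresponds to the classification result indicating whether the matching degree of the first alternative material meets the predetermined requirement.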
It should be appreciated that the role of the classifier is to learn classification rules from training data with known labels, and then classify (or predict) unknown data. Logistic regression, SVM, and similar methods are commonly used to solve binary classification problems. For multi-class classification, logistic regression or SVM can also be used, but multiple binary classifiers must be combined, which is error-prone and inefficient; the commonly used multi-class method is the Softmax classification function.
In summary, according to the artificial intelligence-based video content production method provided by the embodiment of the application, the production process of the video content can be optimized.
Fig. 7 shows a block diagram of an artificial intelligence based video content production system 100 according to an embodiment of the application. As shown in fig. 7, the artificial intelligence based video content production system 100 according to an embodiment of the present application includes: an alternative material matching module 110, configured to match alternative materials from a given material library according to given keywords and limiting conditions; a video content generating module 120, configured to analyze, screen, clip, and splice the alternative materials to generate video content; and an explanation voice generation module 130, configured to generate a broadcast explanation voice adapted to the video content based on digital human technology.
In one possible implementation, the candidate material matching module 110 includes: the candidate material text description acquisition unit is used for acquiring the text description of the first candidate material; the semantic understanding unit of the alternative materials is used for carrying out semantic understanding on the text description of the first alternative materials so as to obtain a sequence of semantic feature vectors of the materials text; the limiting condition semantic coding analysis unit is used for carrying out semantic coding and analysis on the keywords and the limiting conditions so as to obtain global semantic feature vectors of the search key elements; the semantic feature interaction unit is used for carrying out semantic feature interaction and fusion on the sequence of the search key element global semantic feature vector and the material text semantic feature vector so as to obtain an optimized search key element-material text semantic related feature vector; and an analysis unit, configured to determine whether to return the first candidate material based on the optimized search key element-material text semantic related feature vector.
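The data flow between the units of the alternative material matching module 110 can be sketched as follows. The callables passed in are hypothetical stand-ins for the units named above; only the wiring between them is taken from the description.

```python
def match_candidate(description, keywords, conditions,
                    understand, encode_query, interact_fuse, decide):
    """Illustrative composition of the matching module's units."""
    H = understand(description)              # sequence of material text semantic feature vectors
    q = encode_query(keywords, conditions)   # search key element global semantic feature vector
    v = interact_fuse(q, H)                  # optimized semantically related feature vector
    return decide(v)                         # whether to return the first alternative material
```

A trivial invocation with stub units shows the intended call order; real units would be the semantic understanding, encoding, fusion, and analysis components described above.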
Here, it will be understood by those skilled in the art that the specific functions and operations of the respective units and modules in the above-described artificial intelligence-based video content production system 100 have been described in detail in the above description of the artificial intelligence-based video content production method with reference to fig. 1 to 6, and thus, repetitive descriptions thereof will be omitted.
As described above, the artificial intelligence-based video content production system 100 according to the embodiment of the present application may be implemented in various wireless terminals, for example, a server or the like having an artificial intelligence-based video content production algorithm. In one possible implementation, the artificial intelligence based video content production system 100 according to embodiments of the present application may be integrated into a wireless terminal as a software module and/or hardware module. For example, the artificial intelligence based video content production system 100 may be a software module in the operating system of the wireless terminal or may be an application developed for the wireless terminal; of course, the artificial intelligence based video content production system 100 could equally be one of many hardware modules of the wireless terminal.
Alternatively, in another example, the artificial intelligence based video content production system 100 and the wireless terminal may be separate devices, and the artificial intelligence based video content production system 100 may be connected to the wireless terminal through a wired and/or wireless network and transmit the interactive information in an agreed data format.
Fig. 8 illustrates an application scenario diagram of an artificial intelligence based video content production method according to an embodiment of the present application. As shown in fig. 8, in this application scenario, first, a text description of a first candidate material (e.g., D1 illustrated in fig. 8) and a keyword and a definition condition (e.g., D2 illustrated in fig. 8) are acquired, and then the text description of the first candidate material and the keyword and the definition condition are input into a server (e.g., S illustrated in fig. 8) where an artificial intelligence-based video content production algorithm is deployed, wherein the server is capable of processing the text description of the first candidate material and the keyword and the definition condition using the artificial intelligence-based video content production algorithm to obtain a classification result for indicating whether or not the matching degree of the first candidate material meets a predetermined requirement.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of embodiments of the application has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (6)
1. A method of video content production based on artificial intelligence, comprising: matching selected materials from a given material library according to given keywords and limiting conditions; analyzing, screening, cutting and splicing the alternative materials to generate video content; and generating a broadcast explanation voice adapted to the video content based on digital human technology, wherein matching the selected material from the given material library according to the given keyword and the limiting condition comprises:
Acquiring a text description of a first alternative material;
carrying out semantic understanding on the text description of the first alternative material to obtain a sequence of semantic feature vectors of the material text;
carrying out semantic coding and analysis on the keywords and the limiting conditions to obtain a global semantic feature vector of the search key element;
Carrying out semantic feature interaction and fusion on the sequence of the global semantic feature vector of the search key element and the semantic feature vector of the material text to obtain an optimized search key element-material text semantic related feature vector; and
Determining whether to return the first alternative material based on the optimized search key element-material text semantic related feature vector;
The semantic feature interaction and fusion are carried out on the sequence of the global semantic feature vector of the search key element and the semantic feature vector of the material text to obtain an optimized search key element-material text semantic related feature vector, which comprises the following steps:
Calculating the correlation degree between the global semantic feature vector of the search key element and each semantic feature vector of the material text in the sequence of the semantic feature vectors of the material text to obtain a search key element-material text semantic correlation feature vector consisting of a plurality of correlation degrees; and
Performing residual information fusion optimization on the search key element-material text semantic related feature vector based on the sequence of the material text semantic feature vector to obtain the optimized search key element-material text semantic related feature vector;
The optimizing the search key element-material text semantic related feature vector based on the sequence of the material text semantic feature vector to obtain the optimized search key element-material text semantic related feature vector comprises the following steps:
Calculating the weight value of the sequence of the semantic feature vectors of the material text for the semantic related feature vectors of the search key element-material text according to the following weight formula;
wherein, the weight formula is: S = σ(A·V + B·h̄);
Wherein A is a matrix of 1×N_w, N_w is the dimension of the search key element-material text semantic related feature vector, V is the search key element-material text semantic related feature vector, B is a matrix of 1×N_h, N_h is the dimension of each material text semantic feature vector, σ is a Sigmoid function, h_i is the i-th material text semantic feature vector in the sequence of material text semantic feature vectors, h̄ represents the material text global average value vector formed by the global average values of all material text semantic feature vectors in the sequence of material text semantic feature vectors, and S is the weight value; and
Carrying out residual information fusion optimization on the search key element-material text semantic related feature vector by using the following residual information fusion optimization formula based on the weight value to obtain the optimized search key element-material text semantic related feature vector;
the residual information fusion optimization formula is as follows: V′ = M_w(V) + S·M_h(h_i);
Wherein V′ is the optimized search key element-material text semantic related feature vector, M_w and M_h represent convolution operations with a 1×1 convolution kernel, V is the search key element-material text semantic related feature vector, S is the weight value, and h_i is the i-th material text semantic feature vector in the sequence of material text semantic feature vectors.
2. The artificial intelligence based video content production method of claim 1, wherein semantically understanding the text description of the first alternative material to obtain a sequence of material text semantic feature vectors, comprises:
And carrying out word segmentation processing on the text description of the first alternative material, and then obtaining a sequence of semantic feature vectors of the text of the material through a material context semantic encoder comprising a word embedding layer.
3. The method for producing video content based on artificial intelligence according to claim 2, wherein the step of obtaining the sequence of semantic feature vectors of the text of the material by a material context semantic encoder including a word embedding layer after the text description of the first candidate material is subjected to word segmentation processing comprises the steps of:
Word segmentation processing is carried out on the text description of the first alternative material so as to convert the text description of the first alternative material into a word sequence composed of a plurality of words;
Mapping each word in the word sequence to a word vector by using a word embedding layer of the material context semantic encoder comprising the word embedding layer to obtain a sequence of word vectors; and
And performing global-based context semantic coding on the sequence of word vectors by using the material context semantic coder containing the word embedding layer to obtain a sequence of semantic feature vectors of the material text.
4. The artificial intelligence based video content production method of claim 3, wherein semantically encoding and analyzing the keywords and the defined conditions to obtain a search key element global semantic feature vector comprises:
performing word embedding coding on the keywords and the limiting conditions to obtain a sequence of search key element semantic embedded vectors; and
And passing the sequence of the search key element semantic embedded vector through a semantic encoder to obtain the search key element global semantic feature vector.
5. The artificial intelligence based video content production method according to claim 4, wherein calculating the correlation between the search key element global semantic feature vector and each of the material text semantic feature vectors in the sequence of material text semantic feature vectors to obtain a search key element-material text semantic correlation feature vector composed of a plurality of the correlations comprises:
Calculating the correlation between the global semantic feature vector of the search key element and each semantic feature vector of the material text in the sequence of the semantic feature vectors of the material text according to the following correlation formula to obtain a plurality of correlations;
Wherein, the correlation formula is: r = (W₁·V₁)ᵀ·(W₂·V₂);
Wherein V₁ is the search key element global semantic feature vector, V₂ is each material text semantic feature vector in the sequence of material text semantic feature vectors, W₁ and W₂ are two different linear transforms, (·)ᵀ represents a transpose operation, and r is the correlation degree between the search key element global semantic feature vector and each material text semantic feature vector in the sequence of material text semantic feature vectors; and
And arranging a plurality of relevancy to obtain the search key element-material text semantic correlation feature vector.
6. The artificial intelligence based video content production method of claim 5, wherein determining whether to return the first alternative material based on the optimized search key element-material text semantic related feature vector comprises:
Carrying out feature distribution correction on the optimized search key element-material text semantic related feature vector to obtain a corrected search key element-material text semantic related feature vector;
The corrected search key element-material text semantic related feature vector is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the matching degree of the first alternative material meets the preset requirement; and
And responding to the classification result that the matching degree of the first alternative materials and the keywords meets the preset requirement, and returning the first alternative materials.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311654595.8A CN117812381B (en) | 2023-12-05 | 2023-12-05 | Video content making method based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117812381A (en) | 2024-04-02
CN117812381B (en) | 2024-06-04
Family
ID=90428849
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311654595.8A Active CN117812381B (en) | 2023-12-05 | 2023-12-05 | Video content making method based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117812381B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118332140B (en) * | 2024-06-14 | 2024-08-23 | 北京睿智荟聚科技发展有限公司 | Audio and video content retrieval system and method based on artificial intelligence |
CN118521111B (en) * | 2024-06-19 | 2025-04-18 | 上海源庐加佳信息科技有限公司 | Intelligent collaboration system for resource parties based on data analysis |
CN118646940A (en) * | 2024-08-13 | 2024-09-13 | 深圳市客一客信息科技有限公司 | Video generation method, device and system based on multimodal input |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002230021A (en) * | 2001-01-30 | 2002-08-16 | Canon Inc | Information retrieval device and method, and storage medium |
JP2008217701A (en) * | 2007-03-07 | 2008-09-18 | Sharp Corp | Metadata providing device, metadata providing method, metadata providing program, and recording medium recording metadata providing program |
KR20120050660A (en) * | 2010-11-11 | 2012-05-21 | 고려대학교 산학협력단 | Face searching system and method based on face recognition |
CN110750627A (en) * | 2018-07-19 | 2020-02-04 | 上海谦问万答吧云计算科技有限公司 | Material retrieval method and device, electronic equipment and storage medium |
CN112015949A (en) * | 2020-08-26 | 2020-12-01 | 腾讯科技(上海)有限公司 | Video generation method and device, storage medium and electronic equipment |
CN113094552A (en) * | 2021-03-19 | 2021-07-09 | 北京达佳互联信息技术有限公司 | Video template searching method and device, server and readable storage medium |
CN115455152A (en) * | 2022-09-29 | 2022-12-09 | 北京世纪好未来教育科技有限公司 | Recommended methods, devices, electronic equipment, and storage media for writing materials |
CN116010713A (en) * | 2023-03-27 | 2023-04-25 | 日照职业技术学院 | Innovative entrepreneur platform service data processing method and system based on cloud computing |
CN116932723A (en) * | 2023-07-28 | 2023-10-24 | 世优(北京)科技有限公司 | Man-machine interaction system and method based on natural language processing |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106202413B (en) * | 2016-07-11 | 2018-11-20 | 北京大学深圳研究生院 | A kind of cross-media retrieval method |
Non-Patent Citations (1)
Title |
---|
"Research and Implementation of a Question Answering System Based on Knowledge Graphs"; Li Fei; China Master's Theses Full-text Database (Information Science and Technology); 2023-02; pp. I138-4309 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address |
Address after: Building 60, 1st Floor, No.7 Jiuxianqiao North Road, Chaoyang District, Beijing 021 Patentee after: Shiyou (Beijing) Technology Co.,Ltd. Country or region after: China Address before: 4017, 4th Floor, Building 2, No.17 Ritan North Road, Chaoyang District, Beijing Patentee before: 4U (BEIJING) TECHNOLOGY CO.,LTD. Country or region before: China |
|