CN116634192A

CN116634192A - A method for automatic video editing

Info

Publication number: CN116634192A
Application number: CN202210130528.5A
Authority: CN
Inventors: 毛意辉; 陈天雪; 许洛彬; 袁嘉晨
Original assignee: Xiaohongshu Technology Co ltd
Current assignee: Xiaohongshu Technology Co ltd
Priority date: 2022-02-11
Filing date: 2022-02-11
Publication date: 2023-08-22

Abstract

The invention provides an automatic video editing method, characterized by comprising: a step of receiving an original video; a step of performing intensity recognition on the original video; a step of performing highlight content recognition on the original video; a step of selecting music according to the intensity recognition result; and a step of fusing the highlight content with the music. The method can automatically complete routine video editing and greatly improves the working efficiency of video publishers.

Description

A method for automatic video editing

Technical field

The invention relates to the field of video editing, and in particular to a method for automatically editing videos.

Background

With the development of grid technology, applications based on video content are becoming increasingly widespread. Most raw footage cannot be used directly after shooting. In short-video applications in particular, the duration of each video is strictly limited, and what the publisher needs to showcase, or what the viewer actually cares about, is only a small part of the video (called the highlight moment). Video publishers therefore need to perform editing work such as cutting, adding a soundtrack, and adding filters before publishing. At present, this editing work can only be done manually, which is inefficient.

Summary of the invention

In view of the shortcomings of the prior art described above, the present invention provides an automatic video editing method, characterized in that it comprises: a step of receiving an original video; a step of performing intensity recognition on the original video; a step of performing highlight content recognition on the original video; a step of selecting music according to the intensity recognition result; and a step of fusing the highlight content with the music.

Preferably, in the above automatic video editing method, the step of receiving the original video further includes a step of detecting whether the original video meets constraint conditions.

Preferably, in the above automatic video editing method, the step of performing highlight content recognition on the original video further includes: identifying a list of all segments with highlight characteristics according to the vertical category to which the original video belongs; and screening the segment list to obtain the highlight content list.

Preferably, in the above automatic video editing method, in the step of screening the segment list to obtain the highlight content list, only segments whose duration is greater than or equal to a preset threshold T₁ are retained, and the total duration of the highlight content list is not greater than a preset second threshold T₂.

Preferably, in the above automatic video editing method, if the length of the segment list is greater than 1 and the length of the highlight content list is less than 1, the segments in the first segment list are spliced together to serve as the highlight content list.

Preferably, in the above automatic video editing method, the step of selecting music according to the intensity recognition result includes: marking candidate music; and selecting music according to the video intensity and the marks.

Preferably, in the above automatic video editing method, the step of fusing the highlight content with the music includes: defining slots for the music; and filling the highlight content into the slots.

Preferably, in the above automatic video editing method, the step of marking the candidate music includes marking the music along two dimensions: tempo and mood.

Preferably, in the above automatic video editing method, the step of selecting music according to the video intensity and the marks includes returning a list of recommended music to the user according to the video intensity, and determining the music according to the user's selection.

Preferably, in the above automatic video editing method, in the step of defining slots for the music, if the music has no lyrics, the interval between every 3 strong beats is defined as one slot; if the music has lyrics, the musical interval corresponding to each line of lyrics is defined as one slot.

Preferably, in the above automatic video editing method, if the music has lyrics, the intro, interludes, and outro of the music are processed as follows: when the duration of a passage is less than or equal to a preset threshold t, the passage is merged directly into an adjacent slot; when the duration of the passage is greater than the threshold t and less than the step size p, the passage is set as a separate slot; if the duration of the passage is greater than or equal to t and less than 2p, one more slot is added, with a strong beat point near the equal-division point of the passage taken as the boundary, and so on; wherein a passage refers to an intro, interlude, or outro of the music; the threshold t is 3 seconds; and the step size p is 9 seconds.

Preferably, in the above automatic video editing method, when there are multiple original videos, the step of filling the highlight content into the slots includes: when "current highlight segment duration - 1 s - current slot duration" < 1 s, trimming the current highlight segment by first cutting off its final 1 s, then trimming it to the nearby beat point, and moving the start point of the next slot to that point; when "1 s ≤ current highlight segment duration - 1 s - current slot duration < 7 s", leaving the current highlight segment unprocessed, and reading and filling the next highlight segment; when "7 s ≤ current highlight segment duration - 1 s - current slot duration", continuing to fill the next slot with the current highlight segment, and so on, where the fill start point of the next highlight segment = the end point of the current highlight segment + 3 s; if the current highlight segment cannot fill the next slot, trimming the current highlight segment by first cutting off its final 1 s, then trimming it to the nearby beat point, and moving the start point of the next slot to that point; wherein the condition for filling the next slot is: "current material duration - 1 s - current slot duration - 3 s - next slot duration ≥ 1 s".

Preferably, the above automatic video editing method further includes a step of adding transition effects to the video at the slot joints.

Preferably, the above automatic video editing method further includes a step of adding a filter to the highlight content.

Brief description of the drawings

FIG. 1 is a flowchart of the automatic video editing method of the present invention.

Detailed description

Embodiments of the present invention are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other specific embodiments, and the details in this specification can be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention.

本发明的视频自动编辑方法流程如图1所示。步骤S1表示接收原始视频，本步骤通常还包括检测视频是否符合约束条件的步骤，约束条件包括视频的时长、帧率、码率、分辨率。如果不符合约束条件，可以对用户作出提示，或者进行预处理。例如，如果时间过长，可以提示用户；分辨率过高，可以进行压缩。The flow chart of the automatic video editing method of the present invention is shown in FIG. 1 . Step S1 represents receiving the original video, and this step usually includes the step of checking whether the video meets the constraints, which include the duration, frame rate, bit rate, and resolution of the video. If the constraints are not met, a prompt can be given to the user, or preprocessing can be performed. For example, if the time is too long, the user can be prompted; if the resolution is too high, compression can be performed.

Step S2 is performing intensity recognition on the video. Video intensity is an indicator of how sharply the video signal changes: the larger the range of motion of the people in the video, the faster the main subject moves, and the more drastic the scene or shot changes, the more abrupt components the video signal contains and the higher the intensity; conversely, the lower the intensity. The recognition of people and subjects in the video, and of their range of motion and speed of displacement, can directly use machine learning techniques such as neural networks; it is not the inventive point of the present invention and is therefore not described in detail. In this example, video intensity is illustratively divided into three levels: high, medium, and low. The intensity of a video largely reflects its mood and style, so grading video intensity can also be regarded as a classification of video style.
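The text leaves the recognition model unspecified, so the sketch below substitutes a much simpler stand-in: mean absolute difference between consecutive grayscale frames as a rough proxy for how sharply the signal changes, mapped to the three levels. This is not the neural-network approach the text alludes to, and the two thresholds are hypothetical.

```python
from typing import Sequence

# Hypothetical thresholds on the mean per-pixel frame difference (0-255 scale).
LOW_MAX = 5.0
HIGH_MIN = 20.0


def mean_frame_difference(frames: Sequence[Sequence[int]]) -> float:
    """Average absolute per-pixel change between consecutive frames.

    Each frame is a flat sequence of grayscale pixel values of equal length.
    """
    if len(frames) < 2:
        return 0.0
    total = 0.0
    for prev, cur in zip(frames, frames[1:]):
        total += sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
    return total / (len(frames) - 1)


def classify_intensity(frames: Sequence[Sequence[int]]) -> str:
    """Map the frame-difference score to the three levels used in the text."""
    score = mean_frame_difference(frames)
    if score >= HIGH_MIN:
        return "high"
    if score > LOW_MAX:
        return "medium"
    return "low"
```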

Step S3 is recognizing the highlight content in the video. The highlight content is usually what the publisher wants to showcase to viewers, and also what viewers care about most. The highlight recognition strategy of the present invention includes:

S31: According to the vertical category to which the video belongs, identify list A, the list of all segments with highlight characteristics. Highlight characteristics can be defined from prior knowledge. For example, for videos in the basketball vertical, highlight characteristics include a complete shooting action and the complete trajectory of the ball; for videos in the beauty vertical, highlight characteristics include the change in appearance before and after makeup, as well as close-ups of and interaction with cosmetics.

S32: Perform an initial screening of the segments obtained in S31 to obtain list B, the list of highlight segments (that is, the highlight content segments). One screening criterion can be duration. In this example, a minimum duration threshold T₁ = 5 seconds is defined, so only segments whose duration is greater than or equal to T₁ are retained. Preferably, the maximum duration of list B does not exceed the threshold T₂ = 30 seconds. More preferably, if the length of list A is greater than or equal to 1 while the length of list B is less than 1, the highlight content segments are spliced and padded from the segments in list A.
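The screening in S32 can be sketched as below, using T₁ = 5 s and T₂ = 30 s from the text. Two points are assumptions, because the text does not say how they are handled: the T₂ budget is enforced greedily in list order (a segment that would overflow the budget is skipped), and the fallback simply takes segments from list A under the same budget with no minimum length.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_s, end_s) within the original video

T1 = 5.0   # minimum segment duration from the text, in seconds
T2 = 30.0  # maximum total duration of list B from the text, in seconds


def _take_within_budget(segments: List[Segment], min_len: float, budget: float) -> List[Segment]:
    """Keep segments of at least min_len, in order, while the total stays within budget."""
    kept, total = [], 0.0
    for start, end in segments:
        length = end - start
        if length < min_len or total + length > budget:
            continue  # assumption: skip rather than truncate
        kept.append((start, end))
        total += length
    return kept


def screen_highlights(list_a: List[Segment], t1: float = T1, t2: float = T2) -> List[Segment]:
    """Derive list B from list A, falling back to list A when nothing passes T1."""
    list_b = _take_within_budget(list_a, t1, t2)
    if not list_b and list_a:
        # Fallback from the text: build the highlight content from list A itself.
        list_b = _take_within_budget(list_a, 0.0, t2)
    return list_b
```

For example, with segments lasting 6 s, 2 s, 20 s, and 10 s, the 2 s segment fails T₁ and the 10 s segment would push the total past T₂, so only the 6 s and 20 s segments remain.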

Step S4 is selecting the soundtrack, that is, selecting matching music for the video material. Music can be selected based on the tempo and style of the music and the intensity of the video. The method in this example is:

· S41: Mark the candidate music. Marks can cover multiple dimensions: for example, the tempo of the music can be marked as fast, medium, or slow, and the mood of the music can be marked as positive, neutral, or negative, and so on.

· S42: Select music according to the video intensity and the music marks. For example, a video with high intensity is matched with music whose tempo is fast and whose mood is positive. If several pieces of music match, one is chosen at random. To improve user satisfaction with the final cut, however, this example provides the user with an interactive interface: the video intensity and the music marks are matched in multiple cross-combinations to obtain a list of recommended music, which is returned to the user, and the user makes the final selection. It is easy to see that the more dimensions the music is marked along, the greater the computational cost (time) of matching and the higher the matching precision, so a balance must be found in practice.
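Steps S41 and S42 can be illustrated with a small tag-matching sketch. Only the mapping from high intensity to fast tempo and positive mood comes from the text; the mappings for medium and low intensity, the `Track` type, and the ranking (exact matches first, then tracks matching on exactly one dimension) are assumptions standing in for the unspecified "cross-combination" matching.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Track:
    title: str
    tempo: str  # "fast", "medium", or "slow"
    mood: str   # "positive", "neutral", or "negative"


INTENSITY_TO_TAGS = {
    "high": ("fast", "positive"),     # example given in the text
    "medium": ("medium", "neutral"),  # hypothetical
    "low": ("slow", "neutral"),       # hypothetical
}


def recommend(intensity: str, library: List[Track]) -> List[Track]:
    """Return a recommendation list for the user to choose from."""
    tempo, mood = INTENSITY_TO_TAGS[intensity]
    exact = [t for t in library if t.tempo == tempo and t.mood == mood]
    # Tracks that match on exactly one of the two dimensions.
    partial = [t for t in library if (t.tempo == tempo) != (t.mood == mood)]
    return exact + partial
```

The final choice is left to the user, as in the interactive variant described in S42; picking `random.choice` from the exact matches would correspond to the non-interactive variant.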

Step S5 is fusing the selected music with the video material. In this example, the specific fusion steps include:

· S51: Define slots for the candidate music. Music can be divided into two types: music with lyrics and music without lyrics. For music without lyrics, the interval between every 3 strong beats is defined as one slot. For music with lyrics, the musical interval corresponding to each line of lyrics serves as one slot. The intro (the interval between the start of the music and the start of the lyrics), interludes (the intervals between one lyric and the next), and outro (the interval between the end of the lyrics and the end of the music) share a common characteristic: they are purely instrumental, without lyrics. They can therefore either be processed directly as music without lyrics or be processed as follows. For convenience, the intro, interludes, and outro are collectively referred to as passages below. When the duration of a passage is less than or equal to a preset threshold t (3 seconds in this example), the passage is merged directly into an adjacent slot; when the duration is greater than the threshold t and less than the step size p (set to 9 seconds in this example), the passage is set as a separate slot; if the duration is greater than or equal to t+p and less than t+2p, one more slot is added, with a strong beat point near the equal-division point of the passage taken as the boundary; and so on.

· S52: Fill the highlight content segments into the music slots. When the highlight content segment list contains more than one element, filling can proceed as follows:

When "current highlight segment duration - 1 s - current slot duration" < 1 s, trim the current highlight segment: first cut off its final 1 s, then trim it to the nearby beat point, and move the start point of the next slot to that point;

When "1 s ≤ current highlight segment duration - 1 s - current slot duration < 7 s", leave the current highlight segment unprocessed, and read and fill the next highlight segment;

When "7 s ≤ current highlight segment duration - 1 s - current slot duration", continue filling the next slot with the current highlight segment, and so on, where the fill start point of the next highlight segment = the end point of the current highlight segment + 3 s; if the current highlight segment cannot fill the next slot, trim the current highlight segment: first cut off its final 1 s, then trim it to the nearby beat point, and move the start point of the next slot to that point.

Here, the condition for filling the next slot is: "current material duration - 1 s - current slot duration - 3 s - next slot duration ≥ 1 s".
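The passage rule from S51 and the three-way filling rule from S52 can be sketched together as follows. Several points are assumptions: the slot count for passage durations between p and t+p (which the text leaves undefined) is taken to be one, "near the equal-division point" is read as "within half a division width of it", and the function names and returned labels are hypothetical. Trimming to an actual beat point is not implemented here, only the decision.

```python
from typing import List, Sequence, Tuple

Slot = Tuple[float, float]  # (start_s, end_s) on the music timeline

T = 3.0  # threshold t from the text, in seconds
P = 9.0  # step size p from the text, in seconds


def passage_slot_count(length_s: float, t: float = T, p: float = P) -> int:
    """0 means 'merge into an adjacent slot'; otherwise the number of slots.

    (t, t+p) gives 1 slot, [t+p, t+2p) gives 2, and so on. Treating
    [p, t+p) as 1 slot is an assumption filling a gap in the text.
    """
    if length_s <= t:
        return 0
    return 1 + int((length_s - t) // p)


def split_passage(start: float, end: float, strong_beats: Sequence[float],
                  t: float = T, p: float = P) -> List[Slot]:
    """Split an intro, interlude, or outro into slots, snapping cuts to strong beats."""
    count = passage_slot_count(end - start, t, p)
    if count == 0:
        return []
    width = (end - start) / count
    cuts = [start]
    for i in range(1, count):
        ideal = start + i * width
        near = [b for b in strong_beats if abs(b - ideal) < width / 2]
        cuts.append(min(near, key=lambda b: abs(b - ideal)) if near else ideal)
    cuts.append(end)
    return list(zip(cuts, cuts[1:]))


def decide_fill(segment_len: float, slot_len: float) -> str:
    """Choose among the three S52 cases from 'segment - 1 s - slot'."""
    surplus = segment_len - 1.0 - slot_len
    if surplus < 1.0:
        return "trim_to_beat"         # cut last 1 s, trim to nearby beat, move next slot start
    if surplus < 7.0:
        return "load_next_segment"    # leave this segment as is, fill from the next one
    return "reuse_for_next_slot"      # resume this segment 3 s after its end point


def fills_next_slot(segment_len: float, slot_len: float, next_slot_len: float) -> bool:
    """Condition from the text for the same segment to fill the following slot."""
    return segment_len - 1.0 - slot_len - 3.0 - next_slot_len >= 1.0
```

With t = 3 and p = 9, a 14 s interlude yields two slots, and a strong beat at 6.5 s is preferred over the exact midpoint at 7 s as the boundary.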

Preferably, to improve the user experience, the present invention may further include step S6: adding transition effects to the video. Transition effects are added at the slot joints. A transition is the animation effect shown when the video switches shots, such as a dissolve, a dip to black, or a push.
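A minimal sketch of step S6 is shown below: it lists the joints between consecutive slots and assigns a transition to each. The text does not say how a transition is chosen for a given joint, so cycling through the three named examples is purely an assumption, as are the function and label names.

```python
from itertools import cycle
from typing import List, Sequence, Tuple

Slot = Tuple[float, float]  # (start_s, end_s) on the music timeline

# Labels follow the examples in the text; the identifiers are hypothetical.
TRANSITIONS = ("dissolve", "dip_to_black", "push")


def plan_transitions(slots: Sequence[Slot],
                     transitions: Sequence[str] = TRANSITIONS) -> List[Tuple[float, str]]:
    """Pair each joint (the end of every slot except the last) with a transition."""
    joints = [end for _, end in slots[:-1]]
    return list(zip(joints, cycle(transitions)))
```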

More preferably, the present invention further includes step S7: adding filter effects to the video. A single filter can be selected for all highlight content according to the video intensity, or different filters can be added to different highlight content segments according to the vertical category each segment belongs to.

The foregoing embodiments use the processing of a single video as an example, but the method of the present invention also applies to the multi-video case, in which multiple original videos are received at the same time and output as one complete video. In the multi-video case, intensity recognition and highlight recognition are simply performed on each video separately; the method is similar to that for a single video and is not described again.

Since a moving video is obtained by playing a series of still images with slight differences in succession, the method of the present invention is equally applicable to splicing multiple still pictures into video material. In this scenario, the music-matching steps can be optimized as follows: there is no need to distinguish whether the music has lyrics; slots are divided purely by the musical beat, with the interval between every two strong beats forming one slot; and there is no need to distinguish the intro, interludes, and outro of the music.
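Beat-based slot division for still pictures can be sketched as below. The sketch assumes that "every n strong beats" means a slot spans n consecutive strong beats including both endpoints, so n = 2 gives the picture case here and n = 3 would give the no-lyrics case from S51; the text does not spell this out. Function names and file names are hypothetical.

```python
from typing import List, Sequence, Tuple

Slot = Tuple[float, float]  # (start_s, end_s) on the music timeline


def slots_from_strong_beats(beats: Sequence[float], beats_per_slot: int = 2) -> List[Slot]:
    """Cut the timeline at every (beats_per_slot - 1)-th strong beat.

    Trailing beats that do not complete a slot are left unused.
    """
    if beats_per_slot < 2:
        raise ValueError("a slot needs at least two strong beats")
    boundaries = list(beats)[:: beats_per_slot - 1]
    return list(zip(boundaries, boundaries[1:]))


def assign_pictures(pictures: Sequence[str], beats: Sequence[float]) -> List[Tuple[str, float, float]]:
    """Give each picture one slot between two consecutive strong beats, in order."""
    slots = slots_from_strong_beats(beats, beats_per_slot=2)
    return [(pic, start, end) for pic, (start, end) in zip(pictures, slots)]
```

Pictures beyond the number of available slots are simply dropped in this sketch, and unused slots stay empty.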

In summary, the automatic video editing method of the present invention can automatically complete routine video editing and greatly improves the working efficiency of video publishers.

The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone familiar with this technology can modify or change the above embodiments, but as long as such changes remain within the scope of the claims, they make use of the idea of the present invention.

Claims

1. A method for automatically editing video, comprising:

a step of receiving an original video;

a step of intensity recognition of the original video;

a step of performing highlight content recognition on the original video;

selecting music according to the intensity recognition result;

and fusing the highlight content and the music.

2. The method according to claim 1, wherein the step of receiving the original video further comprises the step of detecting whether the original video meets a constraint condition.

3. The method for automatically editing video according to claim 1, wherein the step of recognizing the highlight content of the original video further comprises:

identifying a list of all segments with highlight characteristics according to the vertical category to which the original video belongs;

and screening the segment list to obtain the highlight content list.

4. The method for automatically editing video according to claim 3, wherein in the step of screening the segment list to obtain the highlight content list, only segments in the segment list whose duration is greater than or equal to a preset threshold T₁ are retained, and the total duration of the highlight content list is not more than a preset second threshold T₂.

5. The method according to claim 4, wherein if the length of the segment list is greater than 1 and the length of the highlight content list is less than 1, segments in the first segment list are spliced as the highlight content list.

6. The automatic video editing method according to claim 1, wherein the selecting music according to the intensity recognition result comprises:

marking candidate music;

and selecting music according to the video intensity and the marks.

7. The automatic video editing method according to claim 1, wherein the step of fusing the highlight content and the music comprises:

defining slots for the music;

and filling the highlight content into the slots.

8. The method of automatic video editing according to claim 6, wherein the step of marking the candidate music includes marking both a tempo and an emotion dimension of the music.

9. The method of automatic video editing according to claim 6, wherein the step of selecting music according to the video intensity and the mark comprises the step of returning a list of recommended music to the user according to the video intensity and determining the music according to the user selection.

10. The method of automatic video editing according to claim 7, wherein in the step of defining slots for the music, if the music has no lyrics, the interval between every 3 strong beats is defined as one slot; if the music has lyrics, the musical interval corresponding to each line of lyrics is defined as one slot.

11. The video automatic editing method according to claim 10, wherein if the music has lyrics, the intro, interludes, and outro of the music are processed as follows: when the duration of a passage is less than or equal to a preset threshold t, the passage is merged directly into an adjacent slot; when the duration of the passage is greater than the threshold t and less than a step size p, the passage is set as a separate slot; if the duration of the passage is greater than or equal to t and less than 2p, one more slot is added, with a strong beat point near the equal-division point of the passage taken as the boundary, and so on;

wherein the passages refer to the intro, interludes, and outro of the music; the threshold t is 3 seconds; and the step size p is 9 seconds.

12. The method of automatic video editing according to claim 7, wherein when there are multiple original videos, the step of filling the highlight content into the slots comprises:

when "current highlight segment duration - 1 s - current slot duration" < 1 s, trimming the current highlight segment by first cutting off its final 1 s, then trimming it to the nearby beat point, and moving the start point of the next slot to that point;

when "1 s ≤ current highlight segment duration - 1 s - current slot duration < 7 s", leaving the current highlight segment unprocessed, and reading and filling the next highlight segment;

when "7 s ≤ current highlight segment duration - 1 s - current slot duration", continuing to fill the next slot with the current highlight segment, and so on, where the fill start point of the next highlight segment = the end point of the current highlight segment + 3 s; if the current highlight segment cannot fill the next slot, trimming the current highlight segment by first cutting off its final 1 s, then trimming it to the nearby beat point, and moving the start point of the next slot to that point;

wherein the condition for filling the next slot is: "current material duration - 1 s - current slot duration - 3 s - next slot duration ≥ 1 s".

13. The method of automatic video editing according to claim 7, further comprising the step of adding a transition effect to the video at the slot joints.

14. The method of automatic video editing according to claim 1, further comprising the step of adding a filter to said highlight content.