CN114630057B

CN114630057B - Method and device for determining special effect video, electronic equipment and storage medium

Info

Publication number: CN114630057B
Application number: CN202210238163.8A
Authority: CN
Inventors: 陈嘉俊; 全浩; 阮翔鸿; 周栩彬
Original assignee: Beijing Zitiao Network Technology Co Ltd
Current assignee: Beijing Zitiao Network Technology Co Ltd
Priority date: 2022-03-11
Filing date: 2022-03-11
Publication date: 2024-01-30
Anticipated expiration: 2042-03-11
Also published as: CN114630057A

Abstract

Embodiments of the present disclosure provide a method and device for determining a special effect video, an electronic device, and a storage medium. The method includes: in response to a special effect triggering operation, determining a blurred video frame corresponding to a current to-be-processed video frame, and determining an audio special effect consistent with the audio information of the current to-be-processed video frame; using the blurred video frame as the background image of a current special effect video frame, and using the audio special effect as the foreground image of the current special effect video frame; and obtaining a target special effect video by splicing the special effect video frames of the to-be-processed video frames. The technical solution of the embodiments of the present disclosure not only presents the user's voice in a visual form, which makes the special effect video more interesting, but also blurs the to-be-processed video frames shot by the user, which meets the user's personalized needs.

Description

Method and device for determining special effect video, electronic device and storage medium

Technical field

Embodiments of the present disclosure relate to the field of video processing technology, and in particular to a method and device for determining a special effect video, an electronic device, and a storage medium.

Background

With the development of network technology, more and more applications have entered users' lives. Software for shooting short videos is especially popular. For example, a user can shoot a video with the application software and then publish it to a specific platform or share it with other users.

However, in the existing technology, the video special effects that applications offer are not rich enough, so the video content shot by users lacks interest. In addition, users' personalized needs are not considered during video shooting, which degrades the user experience.

Summary of the invention

The present disclosure provides a method and device for determining a special effect video, an electronic device, and a storage medium, which not only present the user's voice in a visual form, making the special effect video more interesting, but also blur the to-be-processed video frames shot by the user, meeting the user's personalized needs.

In a first aspect, embodiments of the present disclosure provide a method for determining a special effect video, including:

in response to a special effect triggering operation, determining a blurred video frame corresponding to a current to-be-processed video frame, and determining an audio special effect consistent with the audio information of the current to-be-processed video frame;

using the blurred video frame as the background image of a current special effect video frame, and using the audio special effect as the foreground image of the current special effect video frame; and

obtaining a target special effect video by splicing the special effect video frames of the to-be-processed video frames.

In a second aspect, embodiments of the present disclosure further provide a device for determining a special effect video, including:

a blurred video frame determination module, configured to, in response to a special effect triggering operation, determine a blurred video frame corresponding to a current to-be-processed video frame, and determine an audio special effect consistent with the audio information of the current to-be-processed video frame;

a special effect video frame generation module, configured to use the blurred video frame as the background image of a current special effect video frame, and to use the audio special effect as the foreground image of the current special effect video frame; and

a target special effect video generation module, configured to obtain a target special effect video by splicing the special effect video frames of the to-be-processed video frames.

In a third aspect, embodiments of the present disclosure further provide an electronic device, the electronic device including:

one or more processors; and

a storage device configured to store one or more programs,

where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for determining a special effect video according to any one of the embodiments of the present disclosure.

In a fourth aspect, embodiments of the present disclosure further provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the method for determining a special effect video according to any one of the embodiments of the present disclosure.

In the technical solution of the embodiments of the present disclosure, in response to a special effect triggering operation, a blurred video frame corresponding to the current to-be-processed video frame and an audio special effect consistent with the audio information of the current to-be-processed video frame are determined. The blurred video frame is used as the background image and the audio special effect is used as the foreground image, thereby constructing the current special effect video frame. Further, the special effect video frames are spliced to obtain the target special effect video. This not only presents the user's voice in a visual form, making the special effect video more interesting, but also, by blurring the to-be-processed video frames shot by the user, meets the user's personalized needs and improves the user's experience of making special effect videos.

Description of the drawings

The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.

Figure 1 is a schematic flowchart of a method for determining a special effect video provided in Embodiment 1 of the present disclosure;

Figure 2 is a schematic structural diagram of a device for determining a special effect video provided in Embodiment 2 of the present disclosure;

Figure 3 is a schematic structural diagram of an electronic device provided in Embodiment 3 of the present disclosure.

Detailed description

Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.

It should be understood that the steps described in the method implementations of the present disclosure may be executed in a different order and/or in parallel. Furthermore, method implementations may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this regard.

As used herein, the term "include" and its variants are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; and the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.

It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different devices, modules, or units, and are not intended to limit the order of, or the interdependence between, the functions performed by these devices, modules, or units. It should also be noted that the modifiers "one" and "a plurality of" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be read as "one or more".

The names of messages or information exchanged between multiple devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.

Before the technical solution is introduced, an application scenario of the embodiments of the present disclosure is described by way of example. When a user shoots a video through application software or makes a video call with another user, the user may want the captured video content to be more interesting. At the same time, some users may have personalized requirements for the captured picture (for example, some users with social anxiety do not want their appearance to be shown in the video). In other words, these users want to hide all or part of the content of the captured picture (for example, the user's own facial image). In this case, according to the technical solution of this embodiment, the video picture can be blurred while the user's audio special effect is superimposed on each video frame, so as to obtain a special effect video that has richer picture content and effectively meets the user's personalized needs.

Embodiment 1

Figure 1 is a schematic flowchart of a method for determining a special effect video provided in Embodiment 1 of the present disclosure. This embodiment is applicable to generating a more interesting special effect video while meeting the user's personalized needs. The method can be executed by a device for determining a special effect video. The device can be implemented in software and/or hardware and, optionally, by an electronic device, which may be a mobile terminal, a PC, a server, or the like.

As shown in Figure 1, the method includes:

S110. In response to a special effect triggering operation, determine a blurred video frame corresponding to the current to-be-processed video frame, and determine an audio special effect consistent with the audio information of the current to-be-processed video frame.

The device that executes the special effect video processing method provided by the embodiments of the present disclosure can be integrated into application software that supports special effect video processing, and the software can be installed on an electronic device. Optionally, the electronic device may be a mobile terminal, a PC, or the like. The application software may be any software for image/video processing; specific applications are not enumerated here, as long as image/video processing can be realized. The device may also be integrated into a specially developed application that adds and displays special effects, or into a corresponding page, so that the user can process the special effect video through a page integrated on the PC.

In this embodiment, a control for triggering the special effect can be developed in advance in the application software or application program that supports special effect video processing. When it is detected that the user triggers the control, the special effect triggering operation is responded to.

In this embodiment, the current to-be-processed video frame may be a frame captured by the electronic device on which the application software is installed when it responds to the special effect triggering operation, or it may be a frame of the video currently being played. Correspondingly, the blurred video frame is the frame obtained by blurring the current to-be-processed video frame.

It can be understood that a blurred video frame does not display the content of the picture as clearly as the to-be-processed video frame. Instead, the clarity of the picture is reduced, that is, the picture is given a blurred visual effect, so that all or part of the information in each frame is hidden. For example, after multiple to-be-processed video frames containing the user's facial information are processed into the corresponding blurred video frames, the user's appearance cannot be accurately identified from the blurred video frames alone.

In practical applications, there are several ways to determine the blurred video frame. Optionally, a blur filter is added to the current to-be-processed video frame to obtain the blurred video frame; or Gaussian blur processing is performed on the current to-be-processed video frame to obtain the blurred video frame; or the current to-be-processed video frame is input into a blur processing model to obtain the blurred video frame; or, if the current to-be-processed video frame includes a target object, the target object is blurred to obtain the blurred video frame. These approaches are described in turn below.

In the first way of determining the blurred video frame, the blur filter may be a filter developed in advance in the application software. Specifically, after the blur filter is added to the to-be-processed video frames, a blur effect can be produced over the entire area of each frame, or only in the areas of each frame with higher definition or stronger contrast, so as to obtain the corresponding blurred video frame. Those skilled in the art should understand that blurring only part of an image can be achieved by balancing the relevant parameters of the pixels in different areas of the image, which is not described further here.

In the second way of determining the blurred video frame, Gaussian blur, also called Gaussian smoothing, reduces image noise and the level of detail. The visual effect of an image produced by this blurring technique resembles viewing the image through frosted glass. It can therefore be understood that performing Gaussian blur processing on the current to-be-processed video frame also yields the corresponding blurred video frame.
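As an illustration of the Gaussian blur described above, the following is a minimal sketch in plain Python, not the implementation used by the disclosure. It assumes a grayscale frame stored as a list of rows of floats; the function names, the kernel radius, and the sigma value are all hypothetical choices made for this example. The blur is applied as two one-dimensional passes (rows, then columns), which is a standard way to compute a two-dimensional Gaussian blur.

```python
import math

def gaussian_kernel(radius, sigma):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    """Swap rows and columns so that the row pass can be reused for columns."""
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, radius=2, sigma=1.0):
    """Separable Gaussian blur: horizontal pass followed by vertical pass."""
    kernel = gaussian_kernel(radius, sigma)
    horizontal = blur_rows(image, kernel)
    vertical = blur_rows(transpose(horizontal), kernel)
    return transpose(vertical)

# Hypothetical 5x5 frame with a single bright pixel in the center.
frame = [[0.0] * 5 for _ in range(5)]
frame[2][2] = 255.0
blurred = gaussian_blur(frame)
```

After the call, the bright pixel is spread over its neighbors, which is the "frosted glass" effect the text refers to. A production application would more likely call an optimized image library than loop over pixels in Python.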

In the third way of determining the blurred video frame, the blur processing model may be a pre-trained neural network model that can be integrated into the relevant application software and is used at least to generate blurred video frames. The input of the model is the acquired current to-be-processed video frame, and the output of the model is the corresponding blurred video frame. Those skilled in the art should understand that the blur processing model can be trained on a corresponding training set and validation set; when the loss function of the blur processing model converges, training is complete and the model can be integrated into the application. The specific training process is not described in this embodiment.

In the fourth way of determining the blurred video frame, a target object can be set in the application in advance. For example, image data containing the facial information of a specific user is input into the application as the target object. Then, when the application responds to the special effect triggering operation and recognizes the facial information of that specific user in the display interface, it automatically blurs the current to-be-processed video frame to obtain the corresponding blurred video frame. It can be understood that, in this mode, the frames are not blurred when the target object is not recognized in the display interface.
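The conditional behavior of this fourth approach can be sketched as follows. This is only an illustration under stated assumptions: the detector is not shown, and `target_box` stands in for its hypothetical output, either `None` when the target object is not recognized or a `(top, left, bottom, right)` box when it is. The sketch blurs only the boxed region with a simple mean filter, one of the two readings the text allows; blurring the whole frame when the target is present would follow the same pattern.

```python
def box_blur_region(frame, box, radius=1):
    """Return a copy of frame with a mean filter applied inside box.

    box is (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(top, bottom):
        for x in range(left, right):
            ys = range(max(y - radius, 0), min(y + radius + 1, height))
            xs = range(max(x - radius, 0), min(x + radius + 1, width))
            values = [frame[j][i] for j in ys for i in xs]
            out[y][x] = sum(values) / len(values)
    return out

def blur_if_target_present(frame, target_box):
    """Blur only when the (hypothetical) detector reported a target box."""
    if target_box is None:
        return frame  # target not recognized: the frame is left untouched
    return box_blur_region(frame, target_box)

# Hypothetical 4x4 grayscale frame with one bright pixel at row 1, column 1.
frame = [
    [0, 0, 0, 0],
    [0, 100, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
untouched = blur_if_target_present(frame, None)
blurred = blur_if_target_present(frame, (0, 0, 2, 2))
```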

In this embodiment, in order to present the user's voice in a visual form in the special effect video, the user's audio information needs to be acquired at the same time as the current to-be-processed video frame. For example, the voice information uttered by the user is collected through the microphone of the electronic device while the video is being shot. It can be understood that the audio information contains at least audio features that characterize the audio content or the characteristics of the audio. In practical applications, the audio features may also include voiceprint spectrum features; correspondingly, the audio special effect may be a dynamic spectrum (such as dynamic ripples) of the kind displayed by electroacoustic equipment. When its precision reaches a certain level, the sound wave spectrum can at least characterize a specific user and the voice information uttered by that user. In addition, when the volume of the audio changes, the dynamic ripples in the voiceprint speech fluctuate accordingly: the higher the volume, the larger the fluctuation of the dynamic ripples, and the lower the volume, the smaller the fluctuation. It should be noted that, to keep the visual and auditory effects of the final special effect video consistent, the voiceprint speech also needs to be consistent with the audio information of the current to-be-processed video frame.
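The relationship between volume and ripple fluctuation can be illustrated with a short sketch. The disclosure does not specify how the ripple is computed; the example below assumes, purely for illustration, that the audio samples belonging to one video frame are normalized to the range [-1, 1], that the ripple is drawn as a fixed number of bars, and that each bar's height is proportional to the root-mean-square (RMS) level of its chunk of samples. All names are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a chunk of normalized audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def ripple_heights(samples, bars, max_height):
    """Map the audio of one video frame to `bars` ripple heights.

    The samples are split into equal chunks; a louder chunk gives a taller bar.
    """
    chunk = max(len(samples) // bars, 1)
    heights = []
    for b in range(bars):
        level = rms(samples[b * chunk:(b + 1) * chunk])
        heights.append(min(level, 1.0) * max_height)
    return heights

# Two hypothetical test tones: same pitch, different volume.
quiet = [0.1 * math.sin(2 * math.pi * i / 16) for i in range(64)]
loud = [0.8 * math.sin(2 * math.pi * i / 16) for i in range(64)]
quiet_bars = ripple_heights(quiet, bars=4, max_height=100.0)
loud_bars = ripple_heights(loud, bars=4, max_height=100.0)
```

With these inputs every bar of `loud_bars` is taller than the corresponding bar of `quiet_bars`, matching the behavior described above.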

Further, after the application parses the audio information and obtains the audio features, it can construct an audio special effect corresponding to the audio features of the audio information. The audio special effect may be a cartoon sticker displayed in the display interface to represent the audio features, such as a pre-built cartoon sticker of a small animal, or it may be ripples corresponding to the voiceprint spectrum. Those skilled in the art should understand that the audio special effect can be created in advance according to actual needs, for example as a characteristic pattern associated with a country or as a more figurative musical note pattern; the embodiments of the present disclosure impose no specific limitation here.

In this embodiment, the display form of the audio special effect includes dynamic display and/or static display. Specifically, dynamic display shows the audio features through animation, and static display shows the audio features statically on the display interface. For example, when the application dynamically displays the pre-built cartoon sticker of a small animal, the sticker keeps jumping up and down in the display interface as the audio features change. When the application statically displays the ripples corresponding to the voiceprint spectrum, it splices multiple ripple segments into one whole and displays it on the display interface based on timestamps. Because the size of the display interface is usually limited, only one segment of the ripples corresponding to the voiceprint spectrum is shown at any moment. A given segment is displayed only when the application detects that the user has selected a specific moment or has performed a drag operation on the ripple display area, in which case the segment corresponding to that moment or drag operation is shown.
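The static display described above amounts to showing a fixed-width window of a longer, spliced ripple. A minimal sketch follows, assuming the spliced ripple is a flat list of values produced at a constant rate; `values_per_second`, `window`, and the function name are hypothetical and not taken from the disclosure.

```python
def visible_segment(ripple, values_per_second, selected_time, window):
    """Return the `window` ripple values that start at `selected_time` seconds.

    The start index is clamped so the window never runs past either end.
    """
    start = int(selected_time * values_per_second)
    last_start = max(len(ripple) - window, 0)
    start = max(0, min(start, last_start))
    return ripple[start:start + window]

# Hypothetical spliced ripple: 100 values at 10 values per second.
ripple = list(range(100))
at_two_seconds = visible_segment(ripple, 10, 2.0, 5)
past_the_end = visible_segment(ripple, 10, 99.0, 5)
```

A drag operation can be handled the same way by converting the drag offset into a time or index before calling the function.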

Optionally, the audio information corresponding to the current to-be-processed video frame is processed by a voiceprint feature extraction model to obtain the voiceprint speech. The voiceprint feature extraction model may be a pre-trained model and can likewise be integrated into the application. After the user's audio information is collected, it can be input into the voiceprint feature extraction model to obtain the corresponding voiceprint speech. In practical applications, the voiceprint speech output by the voiceprint feature extraction model may be a sound wave spectrum with various animation effects; for example, the output sound wave spectrum can be presented in a variety of shapes or colors. Those skilled in the art should understand that the specific visual effect of the voiceprint speech can be selected according to the actual situation; the embodiments of the present disclosure impose no specific limitation here.

S120. Use the blurred video frame as the background image of the current special effect video frame, and use the audio special effect as the foreground image of the current special effect video frame.

In this embodiment, after the blurred video frame and the audio special effect are determined, the special effect video frame can be constructed from them. Specifically, the special effect video frame includes a background image and a foreground image. The foreground image is superimposed on the background image and can occlude all or part of it, which gives the constructed special effect video frame a greater sense of depth. The process of determining the background image is described below.

Optionally, in the process of determining the background image, an overlay background image corresponding to the current to-be-processed video frame and a target transparency corresponding to the overlay background image can be determined; the overlay background image is then superimposed on the blurred video frame according to the target transparency, and the result is used as the background image.

The overlay background image may be an image preset by the user through the application, or an image automatically selected by the application according to information such as the brightness and color of the to-be-processed video frame. Specifically, it may be a solid-color image, such as a pure black or pure gray image; choosing such a solid-color image as the overlay background image gives the final special effect video frame a softer visual effect. It can be understood that an automatically selected image fits the picture presented by the to-be-processed video frame better.

In practical applications, a pixel mean can be determined from the pixel values of the pixels in the current to-be-processed video frame, and the overlay background image corresponding to the current to-be-processed video frame can be determined based on the pixel mean. For example, the current to-be-processed video frame is parsed to determine the pixel value of each pixel in the picture; the pixel values of all channels (R, G, B) are then averaged to obtain a pixel mean that reflects the average brightness of the current to-be-processed video frame. Based on the pixel mean, an image of the corresponding color can be determined as the overlay background image. In other words, a solid-color overlay background image is determined from the pixel mean.
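The pixel-mean step can be sketched as follows. The text says only that an image of "the corresponding color" is chosen from the pixel mean; mapping the mean to a gray level of the same value is an assumption made for this example, as are the function names and the representation of a frame as rows of `(R, G, B)` tuples.

```python
def mean_pixel_value(frame):
    """Average over every pixel and every channel (R, G, B) of the frame."""
    total = 0
    count = 0
    for row in frame:
        for r, g, b in row:
            total += r + g + b
            count += 3
    return total / count

def solid_overlay(frame):
    """Build a solid gray image, the same size as frame, at the mean level.

    Choosing gray at the mean level is an illustrative assumption.
    """
    level = round(mean_pixel_value(frame))
    return [[(level, level, level) for _ in row] for row in frame]

# Hypothetical 2x2 RGB frame.
frame = [
    [(0, 0, 0), (255, 255, 255)],
    [(30, 60, 90), (90, 60, 30)],
]
overlay = solid_overlay(frame)
```

For this frame the mean is 1125 / 12 = 93.75, so every overlay pixel is `(94, 94, 94)`.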

To give the final special effect video a better visual effect, the target transparency of the overlay background image also needs to be determined along with the image itself. Likewise, the user can preset the target transparency through the application, or the application can dynamically select the target transparency according to information such as the brightness and color of the to-be-processed video frame, which is not described further here. Finally, after the overlay background image is superimposed on the blurred video frame according to the target transparency, the background image of the special effect video frame is obtained. It should also be noted that the overlay background image is set to have a certain degree of transparency.

For example, after the blurred video frame is determined, the application can automatically select a gray image as the overlay background image according to the blurred video frame, and determine from parameters preset by the user that the target transparency of this gray image is 50%. On this basis, the gray image is adjusted to 50% transparency and superimposed on the blurred video frame, and the superimposed image is used as the background image.
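The 50% example above corresponds to an ordinary alpha blend. The sketch below assumes that a transparency of t means the overlay contributes a weight of 1 - t and the blurred frame contributes t; this interpretation, the function name, and the pixel values are assumptions for illustration and are not stated in the disclosure.

```python
def blend_overlay(blurred, overlay, transparency):
    """Superimpose overlay on blurred, with overlay weight 1 - transparency."""
    opacity = 1.0 - transparency
    out = []
    for blurred_row, overlay_row in zip(blurred, overlay):
        row = []
        for b_px, o_px in zip(blurred_row, overlay_row):
            row.append(tuple(
                round(o * opacity + b * (1.0 - opacity))
                for b, o in zip(b_px, o_px)
            ))
        out.append(row)
    return out

# Hypothetical one-pixel blurred frame and gray overlay at 50% transparency.
blurred = [[(200, 100, 0)]]
gray = [[(128, 128, 128)]]
background = blend_overlay(blurred, gray, transparency=0.5)
```

Here the blended pixel is `(164, 114, 64)`, halfway between the blurred pixel and the gray overlay, which is why a gray overlay softens the picture.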

In this embodiment, after the background image is determined, the audio special effect whose time matches the time corresponding to the background image is used as the foreground image and superimposed on the background image, thereby constructing the special effect video frame. Those skilled in the art should understand that, when there are multiple to-be-processed video frames, the application can determine a corresponding special effect video frame for each of them based on the solution of this embodiment, which is not described further here.
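Superimposing the foreground on the background can be sketched as a simple paste with transparent pixels. The example assumes the audio special effect has already been rendered to a small image in which `None` marks a transparent pixel, and that `top` and `left` give its position in the frame; these conventions and names are hypothetical.

```python
def composite(background, foreground, top, left):
    """Paste foreground onto a copy of background at (top, left).

    Foreground pixels equal to None are transparent and leave the
    background visible; pixels falling outside the frame are skipped.
    """
    out = [row[:] for row in background]
    height, width = len(out), len(out[0])
    for y, row in enumerate(foreground):
        for x, pixel in enumerate(row):
            if pixel is None:
                continue
            yy, xx = top + y, left + x
            if 0 <= yy < height and 0 <= xx < width:
                out[yy][xx] = pixel
    return out

# Hypothetical 3x3 background (value 10) and 2x2 effect image (value 99).
background = [[10, 10, 10] for _ in range(3)]
effect = [
    [None, 99],
    [99, None],
]
effect_frame = composite(background, effect, top=1, left=1)
```

Only the two opaque effect pixels replace the background, so the blurred picture stays visible everywhere else.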

S130. Obtain the target special effect video by splicing the special effect video frames of the to-be-processed video frames.

In this embodiment, after the corresponding special effect video frames are determined for the multiple to-be-processed video frames, the order of the special effect video frames can be determined from the timestamps carried by the to-be-processed video frames, and the special effect video frames are spliced in that order to obtain the target special effect video. It can be understood that, in the target special effect video, the content of each to-be-processed video frame is shown with a blurred visual effect, while the voiceprint speech of the corresponding moment is displayed at a specific position on the upper layer of the picture; that is, the user's audio at that moment is presented in a visual form.
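The ordering step can be sketched in a few lines. The `EffectFrame` type and its field names are hypothetical; the sketch only shows that special effect frames, which may be produced out of order, are arranged by the timestamp of the to-be-processed frame they came from before being written out as a video.

```python
from dataclasses import dataclass

@dataclass
class EffectFrame:
    timestamp_ms: int  # timestamp carried by the source to-be-processed frame
    image: object      # composited special effect frame (any representation)

def splice(frames):
    """Return the special effect frames in playback order."""
    return sorted(frames, key=lambda frame: frame.timestamp_ms)

# Hypothetical frames that finished processing out of order.
produced = [
    EffectFrame(66, "frame C"),
    EffectFrame(0, "frame A"),
    EffectFrame(33, "frame B"),
]
target_video = splice(produced)
```

Encoding the ordered frames into a video file is left out, since the disclosure does not specify a container or codec.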

In this embodiment, blurring the video frames to be processed and displaying only the voiceprint meets the user's personalized needs but may degrade the viewing experience of other users. To make the special effect video more appealing to other users, a target character model can therefore also be added to it. This process is described below.

Specifically: superimpose a target character model on the foreground image; obtain the body movement information and/or facial expression of the target object in the current video frame to be processed; and adjust the target character model to match the body movement and/or facial expression.

The target character model may be a preset static or dynamic 3D model that presents at least an anthropomorphic image, for example a virtual cartoon character. The target character model can be superimposed in the foreground image, for example presented above or below the voiceprint. Those skilled in the art will understand that the target character model likewise occludes all or part of the background image, which gives the constructed special effect video frame a greater sense of depth.

In this embodiment, when the target character model is a dynamic model, the movements of the model in the special effect video can also be made to match the user's real movements, which further enhances the interest of the video. For example, when key point recognition determines that the user's arm is raised in the current video frame to be processed, the arm of the target character model is adjusted accordingly, that is, the virtual character's arm is raised. If the next video frame to be processed shows that the user's arm has been lowered, the virtual character's arm is lowered as well. As a result, in the final special effect video the target character model performs actions that are essentially consistent with the user's actual actions. It should be noted that, in practice, the application can also capture the user's facial expression in the video frame to be processed and adjust the expression of the target character model accordingly; the details are not repeated here.

In other words, in every frame of the special effect video, the body movements and/or facial expression of the target character model match those of the target object in the corresponding video frame to be processed.
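The arm example above can be sketched as follows. This is only an illustration: the key point names, the rule "wrist above shoulder means raised", and the dictionary-based model pose are all assumptions, since the disclosure does not specify how key points are mapped to model poses.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates; y grows downward


def arm_is_raised(keypoints: Dict[str, Point]) -> bool:
    """Assumed rule: the arm counts as raised when the wrist is above the shoulder."""
    return keypoints["wrist"][1] < keypoints["shoulder"][1]


def update_model_pose(model_pose: Dict[str, str], keypoints: Dict[str, Point]) -> Dict[str, str]:
    """Return a copy of the model pose with the arm state matched to the user."""
    new_pose = dict(model_pose)
    new_pose["arm"] = "raised" if arm_is_raised(keypoints) else "lowered"
    return new_pose


# Key points detected in two consecutive frames to be processed.
frame_a = {"shoulder": (100.0, 200.0), "wrist": (110.0, 80.0)}
frame_b = {"shoulder": (100.0, 200.0), "wrist": (110.0, 320.0)}

pose = {"arm": "lowered", "face": "neutral"}
pose = update_model_pose(pose, frame_a)
print(pose["arm"])  # raised
pose = update_model_pose(pose, frame_b)
print(pose["arm"])  # lowered
```

Running the update once per frame to be processed is what keeps the model's pose aligned with the user in every special effect frame.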

In practice, the special effect video can be generated in real time according to the solution of this embodiment, for example during a video call among multiple users. Alternatively, an existing video can be post-processed to generate the special effect video. The special effect video obtained through post-processing is described below.

Specifically, if the target special effect video frames are generated in video recording mode, text information corresponding to the audio information is determined; when the target special effect video is played, the text information is displayed in a target area of the playback interface.

Unlike the video call mode among multiple users, the video recording mode is the mode in which a user shoots a video independently. In this mode, the user can shoot a special effect video with the functions provided by the application and then further process, store, or share the generated special effect video.

When the target special effect video is generated in video recording mode, text information corresponding to the audio information can be determined using a pre-trained speech recognition model and/or semantic recognition model and displayed in a target area of the playback interface, which makes the special effect video easier to share and the application more intelligent. To keep the text from being occluded, which would affect the viewing experience, the target area is located in the foreground image of each special effect video frame to be processed. In addition, so that the text does not cover the voiceprint in each special effect frame and viewers do not visually confuse the two elements, the text information corresponding to the audio information and the voiceprint are displayed distinguishably in the playback interface. For example, the text is displayed in white in a specific font at the bottom of the playback interface, while the voiceprint is superimposed as a horizontal, multi-colored shape at the center of the background image and thus appears in the center of the playback interface.

Those skilled in the art will understand that the audio information corresponding to each frame of the special effect video may differ, so the text information displayed in the playback interface changes continuously as the video plays. Displaying the text information corresponding to the audio information in the playback interface has the technical effect of automatically adding "subtitles" to the special effect video, and avoids the problem of unclear recorded audio degrading other users' viewing experience.

It should be noted that, after the corresponding text information has been generated for the target special effect video, the audio information, text information, and voiceprint can also be adjusted so that they are displayed in sync, which prevents the three elements from drifting apart. For example, controls for adjusting the timestamps of these three elements are built into the application in advance. After text information has been determined for the audio information of each frame in the special effect video, if the text in one or more frames of the generated video turns out to be out of sync with the actual audio, an adjustment operation can be applied to the control that governs the text timestamps, thereby adjusting the text information for those special effect frames. When the voiceprint in one or more frames is out of sync with the actual audio, its timestamps can be adjusted in the same way; the details are not repeated here.
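One way to picture the timestamp adjustment is a helper that shifts subtitle cues by a user-chosen offset. The sketch below is an assumption-laden illustration: the `Cue` type, millisecond units, and clamping at zero are choices made for the example and are not described in the disclosure.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Cue:
    start_ms: int
    end_ms: int
    text: str


def shift_cues(cues: List[Cue], offset_ms: int) -> List[Cue]:
    """Shift every cue by `offset_ms` (negative = earlier), never below 0 ms."""
    shifted = []
    for cue in cues:
        start = max(0, cue.start_ms + offset_ms)
        end = max(start, cue.end_ms + offset_ms)
        shifted.append(replace(cue, start_ms=start, end_ms=end))
    return shifted


# The recognized text lags the audio by 300 ms, so move it 300 ms earlier.
subtitles = [Cue(1000, 2000, "hello"), Cue(2500, 3200, "world")]
resynced = shift_cues(subtitles, -300)
print(resynced[0])  # Cue(start_ms=700, end_ms=1700, text='hello')
```

The same helper could be applied to voiceprint timestamps, since the text above treats both elements the same way.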

In this embodiment, the application can also show target keywords associated with the audio information on the display interface. Optionally, at least one target keyword on the display interface is updated according to the historical audio information of the historical video frames to be processed that precede the current video frame to be processed.

Specifically, after the user's audio information has been determined, the application can parse it with a pre-trained speech processing model and extract the keywords it contains. A keyword may be one or more of the words that occur most frequently in the audio information, or a word in the audio information that matches an entry in a pre-built keyword lexicon. In practice, a keyword may also be a currently popular word that appears in the audio information, or a technical term related to a particular field. Those skilled in the art will understand that the keyword extraction rules can be set according to actual needs; the embodiments of the present disclosure impose no specific limitation here.
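Two of the extraction rules mentioned above (highest frequency and lexicon match) can be sketched as follows. The sketch assumes the audio has already been transcribed and tokenized into a list of terms; the function name `extract_keywords` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def extract_keywords(terms: Iterable[str], lexicon: Set[str], top_n: int = 1) -> List[str]:
    """Combine the `top_n` most frequent terms with any terms found in `lexicon`."""
    terms = list(terms)
    frequent = [term for term, _ in Counter(terms).most_common(top_n)]
    # dict.fromkeys keeps first-seen order while removing duplicates.
    from_lexicon = [term for term in dict.fromkeys(terms) if term in lexicon]
    return list(dict.fromkeys(frequent + from_lexicon))


# Terms recognized from the user's speech (already tokenized, by assumption).
terms = ["blur", "frame", "blur", "technical solution"]
keywords = extract_keywords(terms, lexicon={"technical solution"}, top_n=1)
print(keywords)  # ['blur', 'technical solution']
```

A production system would likely filter stop words before counting; that step is omitted here to keep the example short.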

In this embodiment, once the application has determined the keywords in the audio information, it can show them on the display interface, where the displayed keywords serve at least to locate historical video frames. Specifically, when a trigger operation on a target keyword is detected, playback jumps to the historical target special effect video frame corresponding to that target keyword and continues from there. A historical target special effect video frame is a video frame obtained by applying special effect processing to a historical video frame to be processed.

In other words, when the application detects a user's trigger operation on any one of the keywords, that keyword is the target keyword, and the application can automatically jump from the currently playing special effect video frame to the historical special effect video frame that contains the keyword. In this way, when the user later wants to revisit what was said in connection with a certain keyword, the corresponding special effect video frames can be located quickly. For example, suppose the term "technical solution" appears in the user's audio information, and the application has identified it as a keyword and shown it on the display interface. When the user later replays the special effect video generated by the application and wants to watch only the part that mentions "technical solution", the user can click the term "technical solution" on the display interface. After detecting the trigger operation, the application automatically jumps to the historical special effect video frame in which "technical solution" first appears and continues playing from that frame.
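The jump behavior in this example amounts to a lookup from keyword to the first frame that mentions it. The following sketch assumes a per-frame transcript is available as a list of strings; `build_keyword_index` and `jump_target` are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Optional


def build_keyword_index(frame_transcripts: List[str], keywords: List[str]) -> Dict[str, int]:
    """Map each keyword to the index of the first frame whose transcript contains it."""
    index: Dict[str, int] = {}
    for frame_no, text in enumerate(frame_transcripts):
        for keyword in keywords:
            if keyword in text and keyword not in index:
                index[keyword] = frame_no
    return index


def jump_target(index: Dict[str, int], keyword: str) -> Optional[int]:
    """Frame to jump to when `keyword` is triggered, or None if it never occurred."""
    return index.get(keyword)


transcripts = ["hello everyone", "our technical solution is", "the technical solution again"]
index = build_keyword_index(transcripts, ["technical solution"])
print(jump_target(index, "technical solution"))  # 1
```

Storing only the first occurrence mirrors the example above, in which playback resumes from the frame where the term first appears.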

Embodiment 2

FIG. 2 is a schematic structural diagram of a device for determining a special effect video provided in Embodiment 2 of the present disclosure. As shown in FIG. 2, the device includes a blurred video frame determination module 210, a special effect video frame generation module 220, and a target special effect video generation module 230.

The blurred video frame determination module 210 is configured to determine, in response to a special effect triggering operation, the blurred video frame corresponding to the current video frame to be processed, and to determine the audio special effect consistent with the audio information of the current video frame to be processed.

The special effect video frame generation module 220 is configured to use the blurred video frame as the background image of the current special effect video frame and the audio special effect as the foreground image of the current special effect video frame.

The target special effect video generation module 230 is configured to obtain the target special effect video by splicing the special effect video frames of the video frames to be processed.

On the basis of the above technical solutions, the blurred video frame determination module 210 includes a blurred video frame determination unit and an audio special effect determination unit.

The blurred video frame determination unit is configured to add a blur filter to the current video frame to be processed to obtain the blurred video frame; or to apply Gaussian blur to the current video frame to be processed to obtain the blurred video frame; or to input the current video frame to be processed into a blur processing model to obtain the blurred video frame; or, if the current video frame to be processed includes a target object, to blur the target object to obtain the blurred video frame.

The audio special effect determination unit is configured to process the audio information corresponding to the current video frame to be processed with a voiceprint feature extraction model to obtain the audio special effect.
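Of the blurring options listed for the blurred video frame determination unit, the Gaussian blur variant is the easiest to illustrate. The sketch below is a simplified stand-in: it uses a fixed 3-tap binomial kernel (a common small approximation of a Gaussian), works on a grayscale image stored as nested lists, and repeats edge pixels at the borders. None of these choices come from the disclosure.

```python
from typing import List

Gray = List[List[float]]
KERNEL = (0.25, 0.5, 0.25)  # 3-tap binomial approximation of a Gaussian


def _blur_line(line: List[float]) -> List[float]:
    """Convolve one row or column with KERNEL, repeating the edge values."""
    n = len(line)
    out = []
    for i in range(n):
        left = line[max(i - 1, 0)]
        right = line[min(i + 1, n - 1)]
        out.append(KERNEL[0] * left + KERNEL[1] * line[i] + KERNEL[2] * right)
    return out


def gaussian_blur(image: Gray) -> Gray:
    """Separable blur: a horizontal pass followed by a vertical pass."""
    rows = [_blur_line(row) for row in image]
    cols = [_blur_line(list(col)) for col in zip(*rows)]  # blur the transposed image
    return [list(row) for row in zip(*cols)]              # transpose back


# A single bright pixel spreads into its neighbours.
frame = [[0.0, 0.0, 0.0], [0.0, 16.0, 0.0], [0.0, 0.0, 0.0]]
print(gaussian_blur(frame))
# [[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]]
```

Applying the function repeatedly widens the blur; a real implementation would more likely call an image library and choose the kernel size from a blur-strength parameter.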

On the basis of the above technical solutions, the audio special effect corresponds to the audio features of the audio information, and the audio features include voiceprint spectrum features. The display form of the audio special effect includes dynamic display and/or static display: in dynamic display the audio features are shown as an animation, and in static display the audio features are shown statically on the display interface.

On the basis of the above technical solutions, the special effect video frame generation module 220 includes an overlay background image determination unit and a background image determination unit.

The overlay background image determination unit is configured to determine the overlay background image corresponding to the current video frame to be processed and the target transparency corresponding to the overlay background image.

The background image determination unit is configured to superimpose the overlay background image on the blurred video frame according to the target transparency and to use the result as the background image.

Optionally, the overlay background image determination unit is further configured to determine a pixel mean from the pixel values of the pixels in the current video frame to be processed and, based on the pixel mean, to determine the overlay background image corresponding to the current video frame to be processed.
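The optional pixel-mean step can be sketched as below. Computing the mean follows directly from the text; the selection rule (a darker gray overlay for bright frames, a lighter one for dark frames) and the threshold of 128 are purely hypothetical, because this passage does not state how the mean maps to an overlay image.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def pixel_mean(frame: Image) -> float:
    """Mean of all channel values over all pixels of the frame."""
    values = [channel for row in frame for pixel in row for channel in pixel]
    return sum(values) / len(values)


def choose_overlay_color(frame: Image, threshold: float = 128.0) -> Pixel:
    """Hypothetical rule: dark gray overlay for bright frames, light gray otherwise."""
    return (64, 64, 64) if pixel_mean(frame) >= threshold else (192, 192, 192)


bright_frame = [[(255, 255, 255), (200, 200, 200)]]
dark_frame = [[(10, 20, 30)]]
print(pixel_mean(bright_frame))          # 227.5
print(choose_overlay_color(bright_frame))  # (64, 64, 64)
print(choose_overlay_color(dark_frame))    # (192, 192, 192)
```

The returned color would then be expanded to a full-size overlay image and blended onto the blurred frame at the target transparency.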

On the basis of the above technical solutions, the device for determining a special effect video further includes a target character model determination module.

The target character model determination module is configured to superimpose a target character model on the foreground image; to obtain the body movement information and/or facial expression of the target object in the current video frame to be processed; and to adjust the target character model to match the body movement and/or the facial expression.

On the basis of the above technical solutions, the device for determining a special effect video further includes a target keyword update module.

The target keyword update module is configured to update at least one target keyword on the display interface according to the historical audio information of the historical video frames to be processed that precede the current video frame to be processed.

On the basis of the above technical solutions, the device for determining a special effect video further includes a jump module.

The jump module is configured to, when a trigger operation on a target keyword is detected, jump to the historical target special effect video frame corresponding to the target keyword and play it, where the historical target special effect video frame is a video frame obtained by applying special effect processing to a historical video frame to be processed.

On the basis of the above technical solutions, the device for determining a special effect video further includes a text information determination module.

The text information determination module is configured to determine the text information corresponding to the audio information if the target special effect video frames are generated in video recording mode, and to display the text information in a target area of the playback interface when the target special effect video is played, where the target area is located in the foreground image of each special effect video frame to be processed.

On the basis of the above technical solutions, the device for determining a special effect video further includes an adjustment module.

The adjustment module is configured to adjust the audio information, the text information, and the audio special effect so that they are displayed in sync.

On the basis of the above technical solutions, the device for determining a special effect video further includes a distinguishing display module.

The distinguishing display module is configured to display the text information corresponding to the audio information and the audio special effect distinguishably in the playback interface.

In the technical solution provided by this embodiment, in response to a special effect triggering operation, the blurred video frame corresponding to the current video frame to be processed and the audio special effect consistent with the audio information of that frame are determined. The blurred video frame is used as the background image and the audio special effect as the foreground image to construct the current special effect video frame, and the special effect video frames are then spliced to obtain the target special effect video. This presents the user's voice in visual form, which enhances the interest of the special effect video; at the same time, blurring the video frames shot by the user meets the user's personalized needs and improves the user's experience of making special effect videos.

The device for determining a special effect video provided by the embodiments of the present disclosure can execute the method for determining a special effect video provided by any embodiment of the present disclosure, and has the functional modules and beneficial effects corresponding to that method.

It is worth noting that the units and modules included in the above device are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and do not limit the protection scope of the embodiments of the present disclosure.

Embodiment 3

FIG. 3 is a schematic structural diagram of an electronic device provided by Embodiment 3 of the present disclosure. Referring to FIG. 3, it shows a schematic structural diagram of an electronic device 300 (for example, the terminal device or server in FIG. 3) suitable for implementing embodiments of the present disclosure. Terminal devices in embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 3 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.

As shown in FIG. 3, the electronic device 300 may include a processing device 301 (for example, a central processing unit or a graphics processor), which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 308 into a random access memory (RAM) 303. The RAM 303 also stores the various programs and data required for the operation of the electronic device 300. The processing device 301, the ROM 302, and the RAM 303 are connected to one another via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.

Generally, the following devices can be connected to the I/O interface 305: an input device 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output device 307 including, for example, a liquid crystal display (LCD), speaker, or vibrator; a storage device 308 including, for example, a magnetic tape or hard disk; and a communication device 309. The communication device 309 allows the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 3 shows the electronic device 300 with various devices, it should be understood that not all of the illustrated devices need to be implemented or present; more or fewer devices may alternatively be implemented or present.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network via the communication device 309, installed from the storage device 308, or installed from the ROM 302. When the computer program is executed by the processing device 301, the above functions defined in the methods of the embodiments of the present disclosure are performed.

The electronic device provided by this embodiment and the method for determining a special effect video provided by the above embodiments belong to the same inventive concept. Technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.

Embodiment 4

Embodiments of the present disclosure provide a computer storage medium storing a computer program which, when executed by a processor, implements the method for determining a special effect video provided in the above embodiments.

It should be noted that the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to electrical wire, optical cable, RF (radio frequency), or any suitable combination of the above.

In some implementations, the client and the server can communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (for example, the Internet), and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.

The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.

The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:

Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

附图中的流程图和框图，图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上，流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分，该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意，在有些作为替换的实现中，方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如，两个接连地表示的方框实际上可以基本并行地执行，它们有时也可以按相反的顺序执行，这依所涉及的功能而定。也要注意的是，框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合，可以用执行规定的功能或操作的专用的基于硬件的系统来实现，或者可以用专用硬件与计算机指令的组合来实现。The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more logic functions that implement the specified executable instructions. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or operations. , or can be implemented using a combination of specialized hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not limit the unit itself; for example, the first acquisition unit may also be described as "a unit that acquires at least two Internet Protocol addresses".

The functions described above may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

According to one or more embodiments of the present disclosure, [Example 1] provides a method for determining a special effect video, the method comprising:

According to one or more embodiments of the present disclosure, [Example 2] provides a method for determining a special effect video, the method further comprising:

Optionally, adding a blur filter to the current video frame to be processed to obtain the blurred video frame; or,

performing Gaussian blur processing on the current video frame to be processed to obtain the blurred video frame; or,

inputting the current video frame to be processed into a blur processing model to obtain the blurred video frame; or,

if the current video frame to be processed includes a target object, blurring the target object to obtain the blurred video frame.
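The Gaussian blur option above can be illustrated with a minimal sketch. It assumes a grayscale frame stored as a list of rows of numbers; the function names, the default `radius` and `sigma`, and the border clamping are illustrative choices that the disclosure does not specify.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve one row or column with the kernel, clamping at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - radius, 0), n - 1)  # clamp to a valid index
            acc += w * values[idx]
        out.append(acc)
    return out


def gaussian_blur(frame, radius=2, sigma=1.0):
    """Blur a grayscale frame with a separable Gaussian (rows, then columns)."""
    kernel = gaussian_kernel(radius, sigma)
    rows = [blur_1d(row, kernel) for row in frame]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major
```

A production implementation would more likely call an image library's built-in blur; the pure-Python version is only meant to make the operation concrete.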

According to one or more embodiments of the present disclosure, [Example 3] provides a method for determining a special effect video, the method further comprising:

Optionally, processing the audio information corresponding to the current video frame to be processed based on a voiceprint feature extraction model to obtain the audio special effect.

According to one or more embodiments of the present disclosure, [Example 4] provides a method for determining a special effect video, the method further comprising:

Optionally, the audio special effect corresponds to an audio feature of the audio information; the audio feature includes a voiceprint spectrum feature; the display form of the audio special effect includes dynamic display and/or static display, where dynamic display presents the audio feature as an animation and static display presents the audio feature statically on the display interface.
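As a rough illustration of turning a spectrum feature into something displayable, the sketch below computes a plain discrete Fourier transform of the audio samples belonging to one frame and reduces it to a few bar heights. The naive DFT is a stand-in assumed for illustration only; it is not the voiceprint feature extraction model referred to in the disclosure, and `spectrum_bars`, `num_bars`, and `max_height` are hypothetical names.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT: magnitudes of the first len(samples) // 2 frequency bins."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        s = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        mags.append(abs(s) / n)
    return mags


def spectrum_bars(samples, num_bars=8, max_height=100):
    """Reduce the spectrum to num_bars integer heights in [0, max_height]."""
    mags = magnitude_spectrum(samples)
    if len(mags) < num_bars:
        raise ValueError("need at least 2 * num_bars samples")
    band = len(mags) // num_bars
    # Each bar takes the strongest bin inside its frequency band.
    bands = [max(mags[i * band:(i + 1) * band]) for i in range(num_bars)]
    peak = max(bands) or 1.0  # avoid dividing by zero on silence
    return [round(max_height * b / peak) for b in bands]
```

Recomputing the bars for every frame would give a dynamic display in the sense described above, while drawing a single set of bars would give a static one.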

According to one or more embodiments of the present disclosure, [Example 5] provides a method for determining a special effect video, the method further comprising:

Optionally, determining a superimposed background image corresponding to the current video frame to be processed, and a target transparency corresponding to the superimposed background image;

superimposing the superimposed background image on the blurred video frame according to the target transparency, and using the result as the background image.
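The superimposition step can be sketched as ordinary alpha blending. The sketch assumes that frames are lists of rows of RGB tuples and that the target transparency is a fraction in [0, 1], where 1 means the superimposed image is fully transparent; the disclosure does not fix either convention, and the function names are hypothetical.

```python
def blend_pixel(overlay, base, transparency):
    """Blend one RGB pixel; transparency 1.0 leaves the base pixel unchanged."""
    opacity = 1.0 - transparency
    return tuple(round(opacity * o + transparency * b)
                 for o, b in zip(overlay, base))


def composite_background(blurred_frame, overlay_frame, transparency):
    """Superimpose overlay_frame on blurred_frame at the given transparency."""
    if not 0.0 <= transparency <= 1.0:
        raise ValueError("transparency must be within [0, 1]")
    return [
        [blend_pixel(o, b, transparency) for o, b in zip(o_row, b_row)]
        for o_row, b_row in zip(overlay_frame, blurred_frame)
    ]
```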

According to one or more embodiments of the present disclosure, [Example 6] provides a method for determining a special effect video, the method further comprising:

Optionally, determining a pixel mean value according to the pixel value of each pixel in the current video frame to be processed;

determining, based on the pixel mean value, the superimposed background image corresponding to the current video frame to be processed.
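One possible reading of this step is sketched below: the per-channel mean of the frame is computed and used as the fill color of a solid superimposed background image of the same size. The solid-fill mapping is an assumption made for illustration; the text states only that the superimposed background image is determined from the pixel mean value.

```python
def pixel_mean(frame):
    """Per-channel mean over all RGB pixels of a frame (list of rows)."""
    totals = [0, 0, 0]
    count = 0
    for row in frame:
        for pixel in row:
            for channel, value in enumerate(pixel):
                totals[channel] += value
            count += 1
    if count == 0:
        raise ValueError("frame contains no pixels")
    return tuple(total / count for total in totals)


def solid_overlay(frame):
    """Build a same-sized image filled with the rounded mean color (assumed mapping)."""
    mean_color = tuple(round(c) for c in pixel_mean(frame))
    return [[mean_color for _ in row] for row in frame]
```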

According to one or more embodiments of the present disclosure, [Example 7] provides a method for determining a special effect video, the method further comprising:

Optionally, superimposing a target character model in the foreground image;

obtaining body movement information and/or a facial expression of a target object in the current video frame to be processed;

adjusting the target character model to match the body movement information and/or the facial expression.

According to one or more embodiments of the present disclosure, [Example 8] provides a method for determining a special effect video, the method further comprising:

Optionally, updating at least one target keyword on the display interface according to the historical audio information of each historical video frame to be processed that precedes the current video frame to be processed.

According to one or more embodiments of the present disclosure, [Example 9] provides a method for determining a special effect video, the method further comprising:

Optionally, when a triggering operation on a target keyword is detected, jumping to the historical target special effect video frame corresponding to the target keyword and playing it;

wherein the historical target special effect video frame is a video frame obtained by applying special effect processing to a historical video frame to be processed.
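The keyword display and jump behavior of Examples 8 and 9 can be sketched with a small index that records, for each keyword, the first historical frame in which it was heard. The class and method names are hypothetical, and the keywords are assumed to arrive already extracted (the disclosure leaves keyword extraction to a separate model).

```python
class KeywordIndex:
    """Maps each displayed keyword to the first frame index where it occurred."""

    def __init__(self):
        self._first_frame = {}

    def update(self, frame_index, keywords):
        """Record keywords extracted from the audio of one historical frame."""
        for keyword in keywords:
            # Keep the earliest frame so a jump lands where the keyword first appears.
            self._first_frame.setdefault(keyword, frame_index)

    def displayed_keywords(self):
        """Keywords to show on the display interface, in order of first occurrence."""
        return list(self._first_frame)

    def jump_target(self, keyword):
        """Frame index to jump to when the keyword is triggered, or None if unknown."""
        return self._first_frame.get(keyword)
```

For instance, after `update(12, ["weather"])` and `update(40, ["weather", "travel"])`, triggering "weather" would resolve to frame 12 and "travel" to frame 40.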

According to one or more embodiments of the present disclosure, [Example 10] provides a method for determining a special effect video, the method further comprising:

Optionally, if the target special effect video frame is generated in a video recording mode, determining text information corresponding to the audio information;

when the target special effect video is played, displaying the text information in a target area of the playback interface;

wherein the target area is located in the foreground image of each special effect video frame to be processed.

According to one or more embodiments of the present disclosure, [Example 11] provides a method for determining a special effect video, the method further comprising:

Optionally, adjusting the audio information, the text information, and the audio special effect so that they are displayed at the same frequency, that is, in sync with one another.
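One way to keep the text in step with the audio is to drive both from the same frame timestamps. The sketch below assumes the text information arrives as time-stamped segments `(start, end, text)` sorted by start time and non-overlapping; this data layout, the half-open interval convention, and the function names are assumptions for illustration.

```python
import bisect


def active_segment(segments, t):
    """Return the text of the segment covering time t (start <= t < end), or None."""
    starts = [segment[0] for segment in segments]
    i = bisect.bisect_right(starts, t) - 1  # last segment starting at or before t
    if i >= 0 and t < segments[i][1]:
        return segments[i][2]
    return None


def frame_plan(segments, num_frames, fps):
    """Pair each frame's timestamp with the text to display on that frame."""
    return [(i / fps, active_segment(segments, i / fps))
            for i in range(num_frames)]
```

The audio special effect for a frame could be looked up with the same timestamp, so that audio, text, and effect all change together.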

According to one or more embodiments of the present disclosure, [Example 12] provides a method for determining a special effect video, the method further comprising:

Optionally, displaying the text information corresponding to the audio information and the audio special effect distinctly from each other in the middle and lower parts of the playback interface.

According to one or more embodiments of the present disclosure, [Example 13] provides an apparatus for determining a special effect video, the apparatus comprising:

The above description is merely an account of preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved herein is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present disclosure.

Furthermore, although operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims

1. A method of determining a special effect video, comprising:

in response to a special effect triggering operation, determining a blurred video frame corresponding to a current video frame to be processed, and determining an audio special effect consistent with audio information of the current video frame to be processed;

using the blurred video frame as a background image of a current special effect video frame, and using the audio special effect as a foreground image of the current special effect video frame;

obtaining a target special effect video by splicing the special effect video frames of the respective video frames to be processed;

wherein using the blurred video frame as the background image of the current special effect video frame comprises:

determining a superimposed background image corresponding to the current video frame to be processed and a target transparency corresponding to the superimposed background image;

and superimposing the superimposed background image on the blurred video frame according to the target transparency to serve as the background image.

2. The method of claim 1, wherein the determining a blurred video frame corresponding to a current video frame to be processed comprises:

adding a blur filter to the current video frame to be processed to obtain the blurred video frame; or,

performing Gaussian blur processing on the current video frame to be processed to obtain the blurred video frame; or,

inputting the current video frame to be processed into a blur processing model to obtain the blurred video frame; or,

if the current video frame to be processed comprises a target object, blurring the target object to obtain the blurred video frame.

3. The method of claim 1, wherein said determining an audio special effect consistent with audio information of the current video frame to be processed comprises:

processing the audio information corresponding to the current video frame to be processed based on a voiceprint feature extraction model to obtain the audio special effect.

4. The method according to any one of claims 1-3, wherein the audio special effect corresponds to an audio feature of the audio information; the audio feature comprises a voiceprint spectrum feature; and the display form of the audio special effect comprises dynamic display and/or static display, wherein the dynamic display presents the audio feature as an animation, and the static display presents the audio feature statically on a display interface.

5. The method of claim 1, wherein the determining the superimposed background image corresponding to the current video frame to be processed comprises:

determining a pixel mean value according to the pixel value of each pixel point in the current video frame to be processed;

and determining the superimposed background image corresponding to the current video frame to be processed based on the pixel mean value.

6. The method as recited in claim 1, further comprising:

superimposing a target character model in the foreground image;

acquiring body movement information and/or a facial expression of a target object in the current video frame to be processed;

adjusting the target character model to match the body movement information and/or the facial expression.

7. The method as recited in claim 1, further comprising:

updating at least one target keyword on a display interface according to historical audio information of each historical video frame to be processed before the current video frame to be processed;

wherein the updating of the at least one target keyword on the display interface according to the historical audio information of each historical video frame to be processed before the current video frame to be processed comprises:

after the audio information is determined, analyzing the audio information based on a pre-trained voice processing model, extracting keywords from the audio information, and displaying the keywords on the display interface, wherein the displayed keywords are used at least to locate historical video frames.

8. The method as recited in claim 7, further comprising:

when a triggering operation on a target keyword is detected, jumping to a historical target special effect video frame corresponding to the target keyword and playing it;

wherein the historical target special effect video frame is a video frame obtained by applying special effect processing to a historical video frame to be processed.

9. The method as recited in claim 1, further comprising:

if the target special effect video frame is generated in a video recording mode, determining text information corresponding to the audio information;

displaying the text information in a target area of a playing interface when the target special effect video is played;

wherein the target area is located in the foreground image of each special effect video frame to be processed.

10. The method as recited in claim 9, further comprising:

adjusting the audio information, the text information, and the audio special effect to be displayed at the same frequency.

11. The method as recited in claim 10, further comprising:

displaying the text information corresponding to the audio information and the audio special effect distinctly from each other in the middle and lower parts of the playing interface.

12. An apparatus for determining a special effect video, comprising:

a blurred video frame determining module, configured to determine, in response to a special effect triggering operation, a blurred video frame corresponding to a current video frame to be processed, and to determine an audio special effect consistent with audio information of the current video frame to be processed;

a special effect video frame generation module, configured to use the blurred video frame as a background image of a current special effect video frame and to use the audio special effect as a foreground image of the current special effect video frame;

a target special effect video generation module, configured to obtain a target special effect video by splicing the special effect video frames of the respective video frames to be processed;

wherein the special effect video frame generation module comprises a superimposed background image determining unit and a background image determining unit;

the superimposed background image determining unit is configured to determine a superimposed background image corresponding to the current video frame to be processed and a target transparency corresponding to the superimposed background image;

the background image determining unit is configured to superimpose the superimposed background image on the blurred video frame according to the target transparency to serve as the background image.

13. An electronic device, the electronic device comprising:

one or more processors;

a storage device configured to store one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of determining a special effect video of any one of claims 1-11.

14. A storage medium containing computer-executable instructions which, when executed by a computer processor, perform the method of determining a special effect video of any one of claims 1-11.