
CN104954673A - Camera rotating control method and user terminal - Google Patents

Camera rotating control method and user terminal

Info

Publication number
CN104954673A
Authority
CN
China
Prior art keywords
target
described target
sound
face
manikin
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510320929.7A
Other languages
Chinese (zh)
Other versions
CN104954673B (en)
Inventor
张强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201510320929.7A priority Critical patent/CN104954673B/en
Publication of CN104954673A publication Critical patent/CN104954673A/en
Application granted granted Critical
Publication of CN104954673B publication Critical patent/CN104954673B/en
Legal status: Expired - Fee Related
Anticipated expiration


Landscapes

  • Studio Devices (AREA)

Abstract

An embodiment of the invention discloses a camera rotation control method and a user terminal. The method includes: acquiring a target voice of a target person to be photographed; determining, according to the propagation direction of the target voice, a target area where the target person is located, and rotating a camera to the target area; screening out, according to a preset screening strategy, the target person matching the target voice from among multiple people in the target area; and determining a target position where the target person is located and adjusting the camera to the target position. Implementing the embodiments of the invention can improve the accuracy of the camera's rotation position.

Description

A camera rotation control method and user terminal

Technical Field

The present invention relates to the technical field of intelligent terminals, and in particular to a camera rotation control method and a user terminal.

Background Art

With the continuous development of electronic technology, more and more user terminals (such as mobile phones and tablet computers) are equipped with rotatable cameras. When a user takes a photo with a rotatable camera, the rotation angle of the camera usually has to be adjusted manually to achieve a better result, and the photo is taken only after the rotation angle has been adjusted. However, manual adjustment rarely brings the camera to the ideal position in one attempt; the camera often has to be adjusted several times before it reaches the ideal position. It can thus be seen that, in the prior art, manually adjusting the rotation angle of a user terminal's rotatable camera is cumbersome and of low accuracy.

Summary of the Invention

Embodiments of the present invention provide a camera rotation control method and a user terminal, which can improve the accuracy of the camera's rotation position.

A first aspect of the embodiments of the present invention discloses a camera rotation control method, including:

acquiring a target voice of a target person to be photographed;

determining, according to the propagation direction of the target voice, a target area where the target person is located, and rotating a camera to the target area;

screening out, according to a preset screening strategy, a target person matching the target voice from among multiple people in the target area; and

determining a target position where the target person is located, and adjusting the camera to the target position.

With reference to the first aspect, in a first possible implementation of the first aspect, the acquiring a target voice of a target person to be photographed includes:

receiving ambient sound that includes the voice of the target person to be photographed; and

filtering out, according to pre-stored target voice features of the target person, sounds that do not match the target voice features from the received ambient sound, so as to obtain the voice of the target person.

With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the preset screening strategy includes a face screening strategy, and the screening out, according to a preset screening strategy, a target person matching the target voice from among multiple people in the target area includes:

screening out, according to the face screening strategy, a target face matching the target voice from among multiple faces in the target area; and

determining the target person according to the target face.

With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the screening out, according to the face screening strategy, a target face matching the target voice from among multiple faces in the target area includes:

acquiring target voice features of the target voice;

querying, according to the target voice features, the face corresponding to the target voice features from a pre-stored correspondence between voice features and faces; and

comparing the queried face corresponding to the target voice features with the multiple faces in the target area, and, when the queried face corresponding to the target voice features matches one of the multiple faces in the target area, determining the matching face as the target face.

With reference to the first aspect or the first possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the preset screening strategy includes a human body model screening strategy, and the screening out, according to a preset screening strategy, a target person matching the target voice from among multiple people in the target area includes:

extracting human body models of the multiple people in the target area from a preview image of the camera;

screening out, according to the human body model screening strategy, a target human body model matching the target voice from among the human body models of the multiple people in the target area; and

determining the target person according to the target human body model.

With reference to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the screening out, according to the human body model screening strategy, a target human body model matching the target voice from among the human body models of the multiple people in the target area includes:

acquiring target voice features of the target voice;

querying, according to the target voice features, the human body model corresponding to the target voice features from a pre-stored correspondence between voice features and human body models; and

comparing the queried human body model corresponding to the target voice features with the human body models of the multiple people in the target area, and, when the queried human body model corresponding to the target voice features matches one of the human body models of the multiple people in the target area, determining the matching human body model as the target human body model.

A second aspect of the embodiments of the present invention discloses a user terminal, including:

an acquisition module, configured to acquire a target voice of a target person to be photographed;

a rotation module, configured to determine, according to the propagation direction of the target voice, a target area where the target person is located, and to rotate a camera to the target area;

a screening module, configured to screen out, according to a preset screening strategy, a target person matching the target voice from among multiple people in the target area; and

an adjustment module, configured to determine a target position where the target person is located, and to adjust the camera to the target position.

With reference to the second aspect, in a first possible implementation of the second aspect, the acquisition module includes:

a receiving unit, configured to receive ambient sound that includes the voice of the target person to be photographed; and

a filtering unit, configured to filter out, according to pre-stored target voice features of the target person, sounds that do not match the target voice features from the received ambient sound, so as to obtain the voice of the target person.

With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the preset screening strategy includes a face screening strategy, and the screening module includes:

a first screening unit, configured to screen out, according to the face screening strategy, a target face matching the target voice from among multiple faces in the target area; and

a first determining unit, configured to determine the target person according to the target face.

With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the first screening unit includes:

a first acquiring subunit, configured to acquire target voice features of the target voice;

a first query subunit, configured to query, according to the target voice features, the face corresponding to the target voice features from a pre-stored correspondence between voice features and faces; and

a first comparison subunit, configured to compare the queried face corresponding to the target voice features with the multiple faces in the target area, and, when the queried face corresponding to the target voice features matches one of the multiple faces in the target area, to determine the matching face as the target face.

With reference to the second aspect or the first possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the preset screening strategy includes a human body model screening strategy, and the screening module includes:

an extraction unit, configured to extract human body models of the multiple people in the target area from a preview image of the camera;

a second screening unit, configured to screen out, according to the human body model screening strategy, a target human body model matching the target voice from among the human body models of the multiple people in the target area; and

a second determining unit, configured to determine the target person according to the target human body model.

With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the second screening unit includes:

a second acquiring subunit, configured to acquire target voice features of the target voice;

a second query subunit, configured to query, according to the target voice features, the human body model corresponding to the target voice features from a pre-stored correspondence between voice features and human body models; and

a second comparison subunit, configured to compare the queried human body model corresponding to the target voice features with the human body models of the multiple people in the target area, and, when the queried human body model corresponding to the target voice features matches one of the human body models of the multiple people in the target area, to determine the matching human body model as the target human body model.

In the embodiments of the present invention, the user terminal acquires the target voice of the target person to be photographed, determines the target area where the target person is located according to the propagation direction of the target voice, and rotates the camera to the target area; it then screens out, according to a preset screening strategy, the target person matching the target voice from among multiple people in the target area, determines the target position where the target person is located, and adjusts the camera to that position. According to the embodiments of the present invention, the user terminal first identifies the target area from which the target voice of the target person comes and rotates the camera to the approximate position of that area, then determines the specific position of the target person within that area and adjusts the camera to that specific position, so that the accuracy of the camera's rotation position can be improved.

Brief Description of the Drawings

In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a camera rotation control method disclosed in an embodiment of the present invention;

FIG. 2 is a schematic flowchart of another camera rotation control method disclosed in an embodiment of the present invention;

FIG. 3 is a schematic flowchart of another camera rotation control method disclosed in an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a user terminal disclosed in an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The embodiments of the present invention disclose a camera rotation control method and a user terminal, which can improve the accuracy of the camera's rotation position. They are described in detail below.

In the embodiments of the present invention, the user terminal may include, but is not limited to, a smart phone, a notebook computer, a personal computer (PC), a personal digital assistant (PDA), a mobile Internet device (MID), a smart wearable device (such as a smart watch or a smart wristband), or another type of user terminal. The operating system of the user terminal may include, but is not limited to, the Android operating system, the iOS operating system, the Symbian operating system, the BlackBerry operating system, the Windows Phone 8 operating system, and so on; the embodiments of the present invention are not limited in this respect.

Please refer to FIG. 1, which is a schematic flowchart of a camera rotation control method disclosed in an embodiment of the present invention. As shown in FIG. 1, the method may include, but is not limited to, the following steps.

S101. Acquire a target voice of a target person to be photographed.

In this embodiment of the present invention, the user terminal may acquire the target voice of the target person to be photographed through a built-in sound acquisition device (such as a microphone).

As an optional implementation, the user terminal may acquire the target voice of the target person to be photographed through the following steps.

11) Receive ambient sound that includes the voice of the target person to be photographed.

12) According to pre-stored target voice features of the target person, filter out, from the received ambient sound, sounds that do not match the target voice features, so as to obtain the voice of the target person.

In this embodiment, because the environment around the target person may contain other sounds, such as other people's voices or noise, the sound received by the user terminal is ambient sound that includes the voice of the target person to be photographed. The user terminal may obtain the pre-stored target voice features of the target person, extract the voice features of each sound in the received ambient sound, compare the target voice features with the extracted features of each sound one by one, select from the extracted features those consistent with the target voice features, and determine the matching sound. In this way, sounds that do not match the target voice features can be filtered out, and the voice of the target person is obtained. The voice features of each sound may include three attributes, such as pitch, loudness, and timbre. The stored target voice features of the target person may include any one or more of pitch, loudness, and timbre.
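
The following is a minimal sketch of this filtering step. It assumes the per-sound features have already been extracted upstream and reduces each of pitch, loudness, and timbre to a single number compared within a tolerance; the class and function names, the tolerance value, and the tuple-based input format are illustrative assumptions, not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class VoiceFeatures:
    pitch: float      # e.g. fundamental frequency in Hz (assumed representation)
    loudness: float   # e.g. RMS level in dB (assumed representation)
    timbre: float     # e.g. a single spectral-shape coefficient (assumed representation)

def features_match(candidate: VoiceFeatures, target: VoiceFeatures,
                   tolerance: float = 0.1) -> bool:
    """Return True when every attribute lies within a relative tolerance of the target."""
    pairs = [(candidate.pitch, target.pitch),
             (candidate.loudness, target.loudness),
             (candidate.timbre, target.timbre)]
    return all(abs(c - t) <= tolerance * abs(t) for c, t in pairs)

def filter_target_voice(ambient_sounds, target_features: VoiceFeatures):
    """Keep only the sound sources whose extracted features match the stored
    target voice features; everything else is filtered out as non-target sound."""
    return [sound for sound, feats in ambient_sounds
            if features_match(feats, target_features)]

# Usage: ambient_sounds is a list of (sound_id, VoiceFeatures) pairs produced
# by an upstream feature extractor (not shown here).
target = VoiceFeatures(pitch=210.0, loudness=62.0, timbre=0.45)
matches = filter_target_voice([("clip_a", VoiceFeatures(208.0, 61.0, 0.44)),
                               ("clip_b", VoiceFeatures(120.0, 70.0, 0.90))], target)
# matches -> ["clip_a"]
```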

As another optional implementation, before step S101, the method may further include the following steps.

21) Receive a camera-on instruction.

22) Output prompt information, where the prompt information is used to prompt input of information to be verified.

23) Receive the information to be verified that is input in response to the prompt information.

24) Verify whether the information to be verified is consistent with preset verification information.

25) When the information to be verified is verified to be consistent with the preset verification information, turn on the camera in response to the camera-on instruction.

In this embodiment, the camera-on instruction may be an instruction input by the user, for example, the user tapping a camera application icon on the user terminal.

In this implementation, the information to be verified may include, but is not limited to, any one or a combination of a password to be verified, fingerprint information to be verified, face shape information to be verified, iris information to be verified, retina information to be verified, and voiceprint information to be verified.

In this implementation, the preset verification information may include, but is not limited to, any one or a combination of a preset verification password, preset verification fingerprint information, preset verification face shape information, preset verification iris information, preset verification retina information, and preset verification voiceprint information.

In this implementation, the preset verification information may include fingerprint string information and an input time corresponding to each fingerprint; accordingly, verifying whether the information to be verified is consistent with the preset verification information may include the following steps.

Verify whether the fingerprint string is the same as the fingerprint string included in the preset verification information, and whether the differences between the input times of identical fingerprints are all smaller than a preset value. When the fingerprint string is the same as the fingerprint string included in the preset verification information and the differences between the input times of identical fingerprints are all smaller than the preset value, it is verified that the information to be verified is consistent with the preset verification information; otherwise, it is verified that the information to be verified is inconsistent with the preset verification information. Implementing this approach can prevent an unauthorized user from turning on the camera on the user terminal, thereby effectively preventing the user terminal from being operated at will by the unauthorized user.
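
A minimal sketch of this fingerprint-string check follows. It assumes each entry is a (fingerprint identifier, input time) pair and interprets "the differences between the input times of identical fingerprints" as the gap between each entered fingerprint's time and the stored time of the same fingerprint at the same position; the function name and the 2-second threshold are illustrative assumptions.

```python
def verify_fingerprints(entered, preset, max_time_diff=2.0):
    """entered / preset: lists of (fingerprint_id, input_time_seconds) pairs.
    The attempt passes only when the fingerprint sequences are identical and the
    input-time difference for every corresponding fingerprint is below max_time_diff."""
    if [fp for fp, _ in entered] != [fp for fp, _ in preset]:
        return False
    return all(abs(t_in - t_ref) < max_time_diff
               for (_, t_in), (_, t_ref) in zip(entered, preset))

# Example: the same three fingerprints, each entered close to the stored timing.
preset = [("thumb", 0.0), ("index", 1.0), ("thumb", 2.5)]
attempt = [("thumb", 0.3), ("index", 1.2), ("thumb", 2.9)]
assert verify_fingerprints(attempt, preset)   # verification passes, camera may be turned on
```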

S102. Determine, according to the propagation direction of the target voice, the target area where the target person is located, and rotate the camera to the target area.

In this embodiment of the present invention, the user terminal determines the target area where the target person is located according to the propagation direction of the target voice, and rotates the camera to the target area.

Specifically, after acquiring the target voice of the target person to be photographed, the user terminal can identify the direction from which the target voice propagated, and can determine, according to that propagation direction, the target area where the target person is located, that is, an approximate region. Once the target area has been determined, the camera can be rotated to the target area.
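
As a rough sketch of this coarse step, the snippet below assumes the arrival direction has already been estimated as an azimuth angle (the estimation itself, for example with a microphone array, is not shown) and maps it to a fixed-width sector that plays the role of the "target area". The `camera.rotate_to` call, the sector width, and the function names are hypothetical.

```python
def region_from_direction(azimuth_deg: float, sector_width_deg: float = 30.0) -> int:
    """Map an estimated arrival direction to a coarse sector index (the 'target area')."""
    return int((azimuth_deg % 360.0) // sector_width_deg)

def rotate_camera_to_region(camera, azimuth_deg: float, sector_width_deg: float = 30.0) -> int:
    """Point the camera at the centre of the sector the voice came from."""
    sector = region_from_direction(azimuth_deg, sector_width_deg)
    camera.rotate_to(sector * sector_width_deg + sector_width_deg / 2.0)  # hypothetical API
    return sector
```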

S103. Screen out, according to a preset screening strategy, the target person matching the target voice from among multiple people in the target area.

In this embodiment of the present invention, the user terminal screens out the target person matching the target voice from among multiple people in the target area according to a preset screening strategy. The preset screening strategy may include, but is not limited to, a face screening strategy or a human body model screening strategy; that is, the user terminal may screen for and identify the target person either by face or by human body model.

In one embodiment, the user terminal may screen out, according to the face screening strategy, a target face matching the target voice from among multiple faces in the target area, and then determine the target person according to the target face.

In another embodiment, the user terminal may extract human body models of the multiple people in the target area from the preview image of the camera, screen out, according to the human body model screening strategy, a target human body model matching the target voice from among those human body models, and then determine the target person according to the target human body model.

S104. Determine the target position where the target person is located, and adjust the camera to the target position.

In this embodiment of the present invention, after determining the target person, the user terminal may further determine the target position where the target person is located, for example by means of a built-in GPS (Global Positioning System), so that the user terminal can further adjust the camera and rotate it to the position where the target person is. After the camera angle has been set, the user terminal can photograph the target person with the camera.

In the method flow described in FIG. 1, the user terminal acquires the target voice of the target person to be photographed, determines the target area where the target person is located according to the propagation direction of the target voice, and rotates the camera to the target area; it then screens out, according to a preset screening strategy, the target person matching the target voice from among multiple people in the target area, determines the target position where the target person is located, and adjusts the camera to that position. According to this embodiment of the present invention, the user terminal identifies the target area from which the target voice of the target person comes, rotates the camera to the approximate position of that area, then determines the specific position of the target person within that area and adjusts the camera to that specific position, so that the accuracy of the camera's rotation position can be improved.
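
To summarise how the four steps of FIG. 1 fit together, the following control-flow sketch strings them into one routine. Every method on `terminal` and `terminal.camera` is a placeholder for the corresponding capability described above, not an actual API.

```python
def control_camera_rotation(terminal):
    """End-to-end flow of FIG. 1: coarse rotation by sound direction,
    then fine adjustment onto the matched person's position (all calls hypothetical)."""
    # S101: acquire the target person's voice from the ambient sound.
    target_voice = terminal.acquire_target_voice()

    # S102: coarse step - rotate towards the area the voice came from.
    target_area = terminal.locate_area(target_voice.direction)
    terminal.camera.rotate_to_area(target_area)

    # S103: fine step - pick the matching person (by face or body model).
    target_person = terminal.screen_people(target_area, target_voice)

    # S104: adjust the camera onto that person's exact position and shoot.
    target_position = terminal.locate_person(target_person)
    terminal.camera.adjust_to(target_position)
    return target_position
```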

Please refer to FIG. 2, which is a schematic flowchart of another camera rotation control method disclosed in an embodiment of the present invention. As shown in FIG. 2, the method may include the following steps:

S201. The user terminal acquires a target voice of a target person to be photographed.

S202. The user terminal determines, according to the propagation direction of the target voice, the target area where the target person is located, and rotates the camera to the target area.

S203. The user terminal screens out, according to a face screening strategy, a target face matching the target voice from among multiple faces in the target area.

In this embodiment of the present invention, because each person's face is different and each person is associated with a different voice, the user terminal can screen out the target face matching the target voice from among the multiple faces in the target area.

As an optional implementation, the user terminal may screen out, according to the face screening strategy, the target face matching the target voice from among the multiple faces in the target area through the following steps.

11) Acquire target voice features of the target voice.

12) Query, according to the target voice features, the face corresponding to the target voice features from a pre-stored correspondence between voice features and faces.

13) Compare the queried face corresponding to the target voice features with the multiple faces in the target area, and, when the queried face corresponding to the target voice features matches one of the multiple faces in the target area, determine the matching face as the target face.

In this embodiment, the user terminal may obtain pre-stored target voice features, or extract the target voice features from the acquired target voice, where the target voice features may include any one or more of pitch, loudness, and timbre. Based on the target voice features, the user terminal may query the face corresponding to the target voice features from the pre-stored correspondence between voice features and faces. It then compares the queried face corresponding to the target voice features with the multiple faces in the target area, and, when the queried face matches one of the multiple faces in the target area, determines the matching face as the target face.

Specifically, the user terminal may collect multiple face images of the target area through the camera, determine, using a feature vector method, attributes such as the size, position, and distance of facial features (for example, the irises, the wings of the nose, and the corners of the mouth) in each face image, and then calculate their geometric feature quantities to construct a feature vector that can describe each face. Querying the face corresponding to the target voice features therefore means querying the face feature vector corresponding to the target voice features. The user terminal compares the queried face feature vector corresponding to the target voice features with the feature vector of each collected face one by one, and, when the degree of match of a comparison reaches a preset matching degree (for example, 90%), the collected face that reaches the preset matching degree can be determined as the target face.
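
A minimal sketch of this comparison step is given below. The patent does not specify how the "degree of match" between two face feature vectors is computed, so cosine similarity is used here purely as one plausible stand-in; the function names and example vectors are illustrative.

```python
import math

def match_score(vec_a, vec_b):
    """Cosine similarity between two face feature vectors, scaled to 0..1 for aligned vectors."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm = math.sqrt(sum(a * a for a in vec_a)) * math.sqrt(sum(b * b for b in vec_b))
    return dot / norm if norm else 0.0

def find_target_face(queried_vector, candidate_faces, preset_match=0.90):
    """Compare the face vector looked up for the target voice against every face
    captured in the target area; return the first face that reaches the preset
    matching degree (e.g. 90%), or None if no face matches."""
    for face_id, vector in candidate_faces:
        if match_score(queried_vector, vector) >= preset_match:
            return face_id
    return None

# candidate_faces: list of (face_id, feature_vector) pairs built from the preview frames.
target = find_target_face([0.8, 0.1, 0.3], [("face_1", [0.79, 0.12, 0.31]),
                                            ("face_2", [0.10, 0.90, 0.20])])
# target -> "face_1"
```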

S204. The user terminal determines the target person according to the target face.

In this embodiment of the present invention, because each person's face feature vector is different, once the user terminal has determined the target face, it can determine the target person to whom the target face belongs.

S205. The user terminal determines the target position where the target person is located, and adjusts the camera to the target position.

In the method flow described in FIG. 2, the user terminal acquires the target voice of the target person to be photographed, determines the target area where the target person is located according to the propagation direction of the target voice, and rotates the camera to the target area; it then screens out, according to the face screening strategy, the target face matching the target voice from among the multiple faces in the target area. After determining the target person according to the target face, the user terminal can determine the target position where the target person is located and adjust the camera to that position. According to this embodiment of the present invention, the user terminal can combine the target voice with the face screening strategy to screen out, from among multiple faces, the face matching the target voice, determine the target person and the position where the target person is located, and adjust the camera to that position, so that the accuracy of the camera's rotation position can be improved.

Please refer to FIG. 3, which is a schematic flowchart of another camera rotation control method disclosed in an embodiment of the present invention. As shown in FIG. 3, the method may include the following steps:

S301. The user terminal acquires a target voice of a target person to be photographed.

S302. The user terminal determines, according to the propagation direction of the target voice, the target area where the target person is located, and rotates the camera to the target area.

S303. The user terminal extracts human body models of multiple people in the target area from the preview image of the camera.

In this embodiment of the present invention, the preview image of the camera includes multiple people, and the user terminal can extract a human body model for each person, where the human body model is a three-dimensional anatomical structure that includes the sizes, shapes, positions, and spatial relationships of tissues and organs. The dimensions of the tissues and organs in the extracted human body model may be the same as those of the real person, or may be in a fixed proportion to them; for example, the dimensions in the extracted human body model may be 20% of those of the real person.

S304. The user terminal screens out, according to a human body model screening strategy, a target human body model matching the target voice from among the human body models of the multiple people in the target area.

In this embodiment of the present invention, because people differ in build and figure, the sizes of their tissues and organs also differ to some degree; therefore, the user terminal can screen out the target human body model matching the target voice from among the human body models of the multiple people in the target area.

As an optional implementation, the user terminal may screen out, according to the human body model screening strategy, the target human body model matching the target voice from among the human body models of the multiple people in the target area through the following steps.

11) Acquire target voice features of the target voice.

12) Query, according to the target voice features, the human body model corresponding to the target voice features from a pre-stored correspondence between voice features and human body models.

13) Compare the queried human body model corresponding to the target voice features with the human body models of the multiple people in the target area, and, when the queried human body model corresponding to the target voice features matches one of the human body models of the multiple people in the target area, determine the matching human body model as the target human body model.

In this embodiment, the user terminal may obtain pre-stored target voice features, or extract the target voice features from the acquired target voice, where the target voice features may include any one or more of pitch, loudness, and timbre. Based on the target voice features, the user terminal may query the human body model corresponding to the target voice features from the pre-stored correspondence between voice features and human body models. It then compares the queried human body model corresponding to the target voice features with the human body models of the multiple people in the target area, and, when the queried human body model matches one of those human body models, determines the matching human body model as the target human body model.

Specifically, the user terminal may capture the multiple people in the target area through the camera and extract the human body model of each person through a built-in human body model extraction device, where the extracted human body model includes the size, shape, and position of each tissue and organ and their spatial relationships. When querying the human body model corresponding to the target voice features, the user terminal may compare, one by one, the size, shape, position, and spatial relationships of each tissue and organ in that human body model with those in the extracted human body model of each person, and, when the degree of match of a comparison reaches a preset matching degree (for example, 90%), the extracted human body model that reaches the preset matching degree can be determined as the target human body model.
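
The sketch below approximates this body-model comparison. The patent describes matching the size, shape, position, and spatial relationships of body parts; here each part is reduced to a few numeric attributes and the "degree of match" is an averaged per-part similarity against the 90% threshold mentioned above. The data layout, similarity formulas, and names are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class BodyPart:
    size: float            # e.g. length in cm (assumed representation)
    shape: float           # a single shape descriptor in 0..1 (assumed representation)
    position: tuple        # (x, y, z) location within the model

def part_similarity(a: BodyPart, b: BodyPart) -> float:
    """Similarity of one body part, 1.0 meaning identical size, shape and position."""
    size_sim = max(0.0, 1.0 - abs(a.size - b.size) / max(a.size, b.size))
    shape_sim = max(0.0, 1.0 - abs(a.shape - b.shape))
    dist = sum((p - q) ** 2 for p, q in zip(a.position, b.position)) ** 0.5
    pos_sim = 1.0 / (1.0 + dist)
    return (size_sim + shape_sim + pos_sim) / 3.0

def find_target_model(queried_model, candidate_models, preset_match=0.90):
    """Compare the body model looked up for the target voice against each model
    extracted from the preview image (dicts mapping part name -> BodyPart);
    return the first person whose average per-part similarity reaches the threshold."""
    for person_id, model in candidate_models:
        shared = set(queried_model) & set(model)
        if not shared:
            continue
        score = sum(part_similarity(queried_model[k], model[k]) for k in shared) / len(shared)
        if score >= preset_match:
            return person_id
    return None
```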

S305. The user terminal determines the target person according to the target human body model.

In this embodiment of the present invention, after determining the target human body model from among the multiple extracted human body models, the user terminal can determine the target person to whom the target human body model belongs.

S306. The user terminal determines the target position where the target person is located, and adjusts the camera to the target position.

In the method flow described in FIG. 3, the user terminal acquires the target voice of the target person to be photographed, determines the target area where the target person is located according to the propagation direction of the target voice, and rotates the camera to the target area; it then extracts the human body models of the multiple people in the target area from the preview image of the camera and screens out, according to the human body model screening strategy, the target human body model matching the target voice. After determining the target person according to the target human body model, the user terminal can determine the target position where the target person is located and adjust the camera to that position. According to this embodiment of the present invention, the user terminal can combine the target voice with the human body model screening strategy to screen out, from among multiple human body models, the target human body model matching the target voice, determine the target person and the position where the target person is located, and adjust the camera to that position, so that the accuracy of the camera's rotation position can be improved.

The following are apparatus embodiments of the present invention, which are used to perform the methods in the method embodiments of the present invention. For ease of description, only the parts related to the embodiments of the present invention are shown; for specific technical details that are not disclosed, please refer to the method embodiments of the present invention described above.

Please refer to FIG. 4, which is a schematic structural diagram of a user terminal disclosed in an embodiment of the present invention. As shown in FIG. 4, the user terminal may include an acquisition module 401, a rotation module 402, a screening module 403, and an adjustment module 404, where:

the acquisition module 401 is configured to acquire a target voice of a target person to be photographed.

In this embodiment of the present invention, the acquisition module 401 may acquire the target voice of the target person to be photographed through a built-in sound acquisition device (such as a microphone).

The rotation module 402 is configured to determine, according to the propagation direction of the target voice, the target area where the target person is located, and to rotate the camera to the target area.

In this embodiment of the present invention, the rotation module 402 determines the target area where the target person is located according to the propagation direction of the target voice, and rotates the camera to the target area.

Specifically, after the acquisition module 401 has acquired the target voice of the target person to be photographed, the direction from which the target voice propagated can be identified, and the rotation module 402 can determine, according to that propagation direction, the target area where the target person is located, that is, an approximate region. Once the target area has been determined, the camera can be rotated to the target area.

The screening module 403 is configured to screen out, according to a preset screening strategy, the target person matching the target voice from among multiple people in the target area.

In this embodiment of the present invention, the screening module 403 screens out the target person matching the target voice from among multiple people in the target area according to a preset screening strategy. The preset screening strategy may include, but is not limited to, a face screening strategy or a human body model screening strategy; that is, the screening module 403 may screen for and identify the target person either by face or by human body model.

The adjustment module 404 is configured to determine the target position where the target person is located, and to adjust the camera to the target position.

In this embodiment of the present invention, after the screening module 403 has determined the target person, the adjustment module 404 may further determine the target position where the target person is located, for example by means of a built-in GPS (Global Positioning System), and then further adjust the camera so that it rotates to the position where the target person is. After the camera angle has been set, the user terminal can photograph the target person with the camera.

Please refer to FIG. 5, which is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention. The user terminal shown in FIG. 5 is obtained by further optimizing the user terminal shown in FIG. 4. Compared with the user terminal shown in FIG. 4, the user terminal shown in FIG. 5 includes all the modules of the user terminal shown in FIG. 4 and may further include:

a receiving module 405, configured to receive a camera-on instruction;

an output module 406, configured to output prompt information, where the prompt information is used to prompt input of information to be verified;

the receiving module 405 is further configured to receive the information to be verified that is input in response to the prompt information;

a verification module 407, configured to verify whether the information to be verified is consistent with preset verification information; and

an enabling module 408, configured to turn on the camera in response to the camera-on instruction when the verification module 407 verifies that the information to be verified is consistent with the preset verification information.

In this embodiment, the camera-on instruction may be an instruction input by the user, for example, the user tapping a camera application icon on the user terminal.

In this implementation, the information to be verified may include, but is not limited to, any one or a combination of a password to be verified, fingerprint information to be verified, face shape information to be verified, iris information to be verified, retina information to be verified, and voiceprint information to be verified.

In this implementation, the preset verification information may include, but is not limited to, any one or a combination of a preset verification password, preset verification fingerprint information, preset verification face shape information, preset verification iris information, preset verification retina information, and preset verification voiceprint information.

In this implementation, the preset verification information may include fingerprint string information and an input time corresponding to each fingerprint; accordingly, the verification module 407 verifying whether the information to be verified is consistent with the preset verification information may include the following steps:

verifying whether the fingerprint string is the same as the fingerprint string included in the preset verification information and whether the differences between the input times of identical fingerprints are all smaller than a preset value; if the fingerprint string is the same as the fingerprint string included in the preset verification information and the differences between the input times of identical fingerprints are all smaller than the preset value, it is verified that the information to be verified is consistent with the preset verification information; otherwise, it is verified that the information to be verified is inconsistent with the preset verification information. Implementing this approach can prevent an unauthorized user from turning on the camera on the user terminal, thereby effectively preventing the user terminal from being operated at will by the unauthorized user.

Please refer to FIG. 6, which is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention. The user terminal shown in FIG. 6 is obtained by further optimizing the user terminal shown in FIG. 4. Compared with the user terminal shown in FIG. 4, the user terminal shown in FIG. 6 includes all the modules of the user terminal shown in FIG. 4, and the acquisition module 401 may include:

a receiving unit 4011, configured to receive ambient sound that includes the voice of the target person to be photographed; and

a filtering unit 4012, configured to filter out, according to pre-stored target voice features of the target person, sounds that do not match the target voice features from the received ambient sound, so as to obtain the voice of the target person.

In this embodiment, because the environment around the target person may contain other sounds, such as other people's voices or noise, the sound received by the receiving unit 4011 is ambient sound that includes the voice of the target person to be photographed. The user terminal may obtain the pre-stored target voice features of the target person, extract the voice features of each sound in the received ambient sound, compare the target voice features with the extracted features of each sound one by one, select from the extracted features those consistent with the target voice features, and determine the matching sound; in this way, the filtering unit 4012 can filter out sounds that do not match the target voice features and obtain the voice of the target person. Each sound may include three main features, such as pitch, loudness, and timbre. The stored target voice features of the target person may include any one or more of pitch, loudness, and timbre.

请参见图7,图7是本发明实施例公开的另一种用户终端的结构示意图,其中,其中,图7所示的用户终端是在图4所示用户终端的基础上进一步优化得到的,与图4所示的用户终端相比,图7所示的用户终端除了包括图4所示用户终端的所有模块外,筛选模块403可以包括:Please refer to FIG. 7. FIG. 7 is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention, wherein, the user terminal shown in FIG. 7 is further optimized on the basis of the user terminal shown in FIG. 4, Compared with the user terminal shown in FIG. 4 , except that the user terminal shown in FIG. 7 includes all modules of the user terminal shown in FIG. 4 , the screening module 403 may include:

第一筛选单元4031,用于根据人脸筛选策略,从目标区域的多个人脸中筛选出与目标声音匹配的目标人脸;The first screening unit 4031 is configured to select a target face matching the target voice from multiple faces in the target area according to the face screening strategy;

第一确定单元4032,用于根据目标人脸确定目标人物。The first determining unit 4032 is configured to determine the target person according to the target person's face.

本发明实施例中,由于每个人的人脸不一样,每个人匹配的声音也不一样,因此,第一筛选单元4031可以从目标区域的多个人脸中筛选出与目标声音匹配的目标人脸,第一确定单元4032就可以确定该目标人脸所属的目标人物。In the embodiment of the present invention, because each person's face is different, each person's matching voice is also different, therefore, the first screening unit 4031 can filter out the target face that matches the target voice from multiple faces in the target area , the first determining unit 4032 can determine the target person to which the target face belongs.

As an optional implementation manner, the above first screening unit 4031 may include:

a first acquiring subunit 40311, configured to acquire the target voice features of the target voice;

a first query subunit 40312, configured to query, according to the target voice features, the face corresponding to the target voice features from a pre-stored correspondence between voice features and faces;

a first comparison subunit 40313, configured to compare the queried face corresponding to the target voice features with the multiple faces in the target area, and when the queried face matches one of the multiple faces in the target area, to determine the matching face as the target face.

In this embodiment, the first acquiring subunit 40311 may acquire pre-stored target voice features or extract the target voice features from the acquired target voice, where the target voice features may include any one or more of pitch, loudness and timbre. Based on the target voice features, the first query subunit 40312 may query the face corresponding to the target voice features from the pre-stored correspondence between voice features and faces. The first comparison subunit 40313 then compares the queried face with the multiple faces in the target area and, when the queried face matches one of them, determines the matching face as the target face.

Specifically, the user terminal may capture multiple face images of the target area through the camera and, using a feature vector method, determine the size, position, distance and other attributes of facial features such as the irises, nose wings and mouth corners in each face image, then compute their geometric feature quantities to construct a feature vector describing each face. The first query subunit 40312 queries the face corresponding to the target voice features, that is, the face feature vector corresponding to the target voice features. The first comparison subunit 40313 compares this feature vector with the feature vector of each captured face one by one, and when the degree of match reaches a preset matching degree (for example 90%), the captured face that reaches the preset matching degree is determined as the target face.
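As an illustration of the comparison performed by the first comparison subunit 40313, the following is a minimal sketch, assuming each face is described by a fixed-length feature vector and that the preset matching degree is interpreted as cosine similarity; the helper names and the similarity measure are assumptions made for this example, not the patent's prescribed method.

```python
# Hypothetical sketch of comparison subunit 40313: compare the face feature
# vector looked up from the voice-to-face correspondence against each captured face.

import math

def match_degree(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two face feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_target_face(queried_vector: list[float],
                     captured_faces: dict[str, list[float]],
                     preset_matching_degree: float = 0.90):
    """Return the id of the captured face whose match degree with the queried
    vector reaches the preset matching degree (e.g. 90%), or None if none does."""
    best_id, best_score = None, 0.0
    for face_id, vector in captured_faces.items():
        score = match_degree(queried_vector, vector)
        if score >= preset_matching_degree and score > best_score:
            best_id, best_score = face_id, score
    return best_id
```

Any comparable similarity measure over the geometric attributes (for example a normalized Euclidean distance) would fit the same interface; cosine similarity is used here only to keep the sketch short.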

Please refer to FIG. 8, which is a schematic structural diagram of another user terminal disclosed in an embodiment of the present invention. The user terminal shown in FIG. 8 is obtained by further optimizing the user terminal shown in FIG. 4. Compared with the user terminal shown in FIG. 4, the user terminal shown in FIG. 8 includes all the modules of the user terminal shown in FIG. 4; in addition, the screening module 403 may include:

an extracting unit 4033, configured to extract human body models of the multiple persons in the target area from the preview image of the camera.

In this embodiment of the present invention, the preview image of the camera includes multiple persons, and the extracting unit 4033 may extract a human body model for each person. The human body model is a three-dimensional anatomical structure that includes the size, shape, position and spatial relationship of the tissues and organs. The dimensions of the tissues and organs in the extracted human body model may be the same as those of the real person, or may be at a fixed ratio to them, for example 20% of the dimensions of the real person's tissues and organs.
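As a rough illustration of the extracted human body model described above, the following is a minimal sketch, assuming the model is reduced to a set of named body parts, each with a three-dimensional size and position, plus a uniform scale factor relative to the real person; the part names and field layout are assumptions made purely for illustration.

```python
# Hypothetical sketch of the human body model produced by extracting unit 4033:
# named parts with 3D size and position, and the model-to-real scale factor.

from dataclasses import dataclass

@dataclass
class BodyPart:
    size: tuple[float, float, float]      # width, height, depth of the part
    position: tuple[float, float, float]  # position in the model's coordinate frame

@dataclass
class HumanBodyModel:
    parts: dict[str, BodyPart]  # e.g. "head", "torso", "left_arm", ...
    scale: float                # model-to-real ratio, e.g. 0.2 for 20%

    def real_size(self, part_name: str) -> tuple[float, float, float]:
        """Recover the real person's dimensions of a part from the scaled model."""
        w, h, d = self.parts[part_name].size
        return (w / self.scale, h / self.scale, d / self.scale)
```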

a second screening unit 4034, configured to screen out, according to a human body model screening strategy, the target human body model that matches the target voice from the human body models of the multiple persons in the target area.

In this embodiment of the present invention, since people differ in build, the dimensions of their tissues and organs differ to some degree, so the user terminal can screen out the target human body model that matches the target voice from the human body models of the multiple persons in the target area.

a second determining unit 4035, configured to determine the target person according to the target human body model.

In this embodiment of the present invention, after the second screening unit 4034 determines the target human body model from the multiple extracted human body models, the second determining unit 4035 can determine the target person to whom the target human body model belongs.

As an optional implementation manner, the above second screening unit 4034 may include:

a second acquiring subunit 40341, configured to acquire the target voice features of the target voice;

a second query subunit 40342, configured to query, according to the target voice features, the human body model corresponding to the target voice features from a pre-stored correspondence between voice features and human body models;

a second comparison subunit 40343, configured to compare the queried human body model corresponding to the target voice features with the human body models of the multiple persons in the target area, and when the queried human body model matches one of the human body models of the multiple persons in the target area, to determine the matching human body model as the target human body model.

In this embodiment, the second acquiring subunit 40341 may acquire pre-stored target voice features or extract the target voice features from the acquired target voice, where the target voice features may include any one or more of pitch, loudness and timbre. Based on the target voice features, the second query subunit 40342 may query the human body model corresponding to the target voice features from the pre-stored correspondence between voice features and human body models. The second comparison subunit 40343 then compares the queried human body model with the human body models of the multiple persons in the target area and, when the queried human body model matches one of them, determines the matching human body model as the target human body model.

Specifically, the user terminal may capture the multiple persons in the target area through the camera, and the extracting unit 4033 extracts the human body model of each person through a built-in human body model extraction device, where the extracted human body model includes the size, shape, position and spatial relationship of each tissue and organ. The second query subunit 40342 queries the human body model corresponding to the target voice features, and the second comparison subunit 40343 compares the size, shape, position and spatial relationship of each tissue and organ in that model with those of each extracted human body model one by one. When the degree of match reaches a preset matching degree (for example 90%), the extracted human body model that reaches the preset matching degree is determined as the target human body model.
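By analogy with the face comparison above, the following is a minimal sketch of the comparison in the second comparison subunit 40343, reusing the hypothetical `HumanBodyModel` structure sketched earlier and assuming the match degree is simply the fraction of body parts whose size and position agree within a tolerance; these simplifications are assumptions for illustration, not the patent's prescribed method.

```python
# Hypothetical sketch of comparison subunit 40343: part-by-part comparison of
# the queried body model against each extracted model, against a preset degree.

def attributes_close(a: tuple[float, float, float],
                     b: tuple[float, float, float],
                     tolerance: float = 0.1) -> bool:
    """True when every component of b is within a tolerance of the matching
    component of a, relative to the larger magnitude (with a small floor)."""
    return all(abs(x - y) <= tolerance * max(abs(x), abs(y), 1e-6)
               for x, y in zip(a, b))

def model_match_degree(queried: HumanBodyModel, candidate: HumanBodyModel) -> float:
    """Fraction of shared body parts whose size and position both agree."""
    common = set(queried.parts) & set(candidate.parts)
    if not common:
        return 0.0
    hits = sum(1 for name in common
               if attributes_close(queried.parts[name].size, candidate.parts[name].size)
               and attributes_close(queried.parts[name].position, candidate.parts[name].position))
    return hits / len(common)

def find_target_model(queried: HumanBodyModel,
                      extracted_models: dict[str, HumanBodyModel],
                      preset_matching_degree: float = 0.90):
    """Return the id of the extracted model reaching the preset matching degree."""
    best_id, best_score = None, 0.0
    for person_id, model in extracted_models.items():
        score = model_match_degree(queried, model)
        if score >= preset_matching_degree and score > best_score:
            best_id, best_score = person_id, score
    return best_id
```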

In the user terminals described in FIG. 4 to FIG. 8, the acquisition module 401 acquires the target voice of the target person to be photographed; the rotation module 402 determines, according to the propagation direction of the target voice, the target area where the target person is located and rotates the camera to the target area; the screening module 403 then screens out, according to a preset screening strategy, the target person matching the target voice from the multiple persons in the target area; and after the adjusting module 404 determines the target position where the target person is located, the camera is adjusted to that target position. With this embodiment of the present invention, the rotation module 402 rotates the camera to the approximate position of the target area by identifying the area from which the target voice comes, and the adjusting module 404 then determines the specific position of the target person within that area and adjusts the camera to that position, thereby improving the accuracy of the camera's rotation position.
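To summarize the coarse-to-fine control flow of modules 401 to 404, the following is a minimal end-to-end sketch; the sensor and actuator interfaces (`filter_voice`, `estimate_direction`, `rotate_camera` and so on) are placeholders passed in as parameters, assumed only for illustration and not defined by the patent.

```python
# Hypothetical sketch of the overall control flow: coarse rotation toward the
# sound's direction, then fine adjustment to the matched target person.

def control_camera_rotation(ambient_audio, preview,
                            filter_voice, estimate_direction, rotate_camera,
                            screen_target, locate_person, adjust_camera):
    """Coarse-to-fine camera control: sound direction first, matched person second."""
    # Acquisition module 401: keep only the target person's voice.
    target_voice = filter_voice(ambient_audio)

    # Rotation module 402: rotate the camera toward the area the voice comes from.
    target_area = estimate_direction(target_voice)
    rotate_camera(target_area)

    # Screening module 403: pick the person matching the voice within that area.
    target_person = screen_target(preview, target_voice)

    # Adjusting module 404: adjust the camera to the person's exact position.
    target_position = locate_person(target_person)
    adjust_camera(target_position)
    return target_position
```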

It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of action combinations. However, those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions and units involved are not necessarily required by the present application.

In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the relevant descriptions of other embodiments.

A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot limit the scope of the claims; therefore, equivalent changes made according to the claims of the present invention still fall within the scope of the present invention.

Claims (12)

1. A camera rotation control method, characterized by comprising:
acquiring a target voice of a target person to be photographed;
determining, according to the propagation direction of the target voice, a target area where the target person is located, and rotating a camera to the target area;
screening out, according to a preset screening strategy, the target person matching the target voice from multiple persons in the target area;
determining a target position where the target person is located, and adjusting the camera to the target position.
2. The method according to claim 1, characterized in that acquiring the target voice of the target person to be photographed comprises:
receiving ambient sound including the voice of the target person to be photographed;
filtering out, from the received ambient sound and according to pre-stored target voice features of the target person, sounds that do not match the target voice features, so as to obtain the voice of the target person.
3. The method according to claim 1 or 2, characterized in that the preset screening strategy comprises a face screening strategy, and screening out, according to the preset screening strategy, the target person matching the target voice from the multiple persons in the target area comprises:
screening out, according to the face screening strategy, a target face matching the target voice from multiple faces in the target area;
determining the target person according to the target face.
4. The method according to claim 3, characterized in that screening out, according to the face screening strategy, the target face matching the target voice from the multiple faces in the target area comprises:
acquiring target voice features of the target voice;
querying, according to the target voice features, the face corresponding to the target voice features from a pre-stored correspondence between voice features and faces;
comparing the queried face corresponding to the target voice features with the multiple faces in the target area, and when the queried face corresponding to the target voice features matches one of the multiple faces in the target area, determining the matching face as the target face.
5. The method according to claim 1 or 2, characterized in that the preset screening strategy comprises a human body model screening strategy, and screening out, according to the preset screening strategy, the target person matching the target voice from the multiple persons in the target area comprises:
extracting human body models of the multiple persons in the target area from a preview image of the camera;
screening out, according to the human body model screening strategy, a target human body model matching the target voice from the human body models of the multiple persons in the target area;
determining the target person according to the target human body model.
6. The method according to claim 5, characterized in that screening out, according to the human body model screening strategy, the target human body model matching the target voice from the human body models of the multiple persons in the target area comprises:
acquiring target voice features of the target voice;
querying, according to the target voice features, the human body model corresponding to the target voice features from a pre-stored correspondence between voice features and human body models;
comparing the queried human body model corresponding to the target voice features with the human body models of the multiple persons in the target area, and when the queried human body model corresponding to the target voice features matches one of the human body models of the multiple persons in the target area, determining the matching human body model as the target human body model.
7. A user terminal, characterized by comprising:
an acquisition module, configured to acquire a target voice of a target person to be photographed;
a rotation module, configured to determine, according to the propagation direction of the target voice, a target area where the target person is located, and to rotate a camera to the target area;
a screening module, configured to screen out, according to a preset screening strategy, the target person matching the target voice from multiple persons in the target area;
an adjusting module, configured to determine a target position where the target person is located, and to adjust the camera to the target position.
8. The user terminal according to claim 7, characterized in that the acquisition module comprises:
a receiving unit, configured to receive ambient sound including the voice of the target person to be photographed;
a filtering unit, configured to filter out, from the received ambient sound and according to pre-stored target voice features of the target person, sounds that do not match the target voice features, so as to obtain the voice of the target person.
9. The user terminal according to claim 7 or 8, characterized in that the preset screening strategy comprises a face screening strategy, and the screening module comprises:
a first screening unit, configured to screen out, according to the face screening strategy, a target face matching the target voice from multiple faces in the target area;
a first determining unit, configured to determine the target person according to the target face.
10. The user terminal according to claim 9, characterized in that the first screening unit comprises:
a first acquiring subunit, configured to acquire target voice features of the target voice;
a first query subunit, configured to query, according to the target voice features, the face corresponding to the target voice features from a pre-stored correspondence between voice features and faces;
a first comparison subunit, configured to compare the queried face corresponding to the target voice features with the multiple faces in the target area, and when the queried face corresponding to the target voice features matches one of the multiple faces in the target area, to determine the matching face as the target face.
11. The user terminal according to claim 7 or 8, characterized in that the preset screening strategy comprises a human body model screening strategy, and the screening module comprises:
an extracting unit, configured to extract human body models of the multiple persons in the target area from a preview image of the camera;
a second screening unit, configured to screen out, according to the human body model screening strategy, a target human body model matching the target voice from the human body models of the multiple persons in the target area;
a second determining unit, configured to determine the target person according to the target human body model.
12. The user terminal according to claim 11, characterized in that the second screening unit comprises:
a second acquiring subunit, configured to acquire target voice features of the target voice;
a second query subunit, configured to query, according to the target voice features, the human body model corresponding to the target voice features from a pre-stored correspondence between voice features and human body models;
a second comparison subunit, configured to compare the queried human body model corresponding to the target voice features with the human body models of the multiple persons in the target area, and when the queried human body model corresponding to the target voice features matches one of the human body models of the multiple persons in the target area, to determine the matching human body model as the target human body model.