CN106251334B - Camera parameter adjustment method, director camera, and system - Google Patents
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
Abstract
The embodiment of the invention discloses a camera parameter adjustment method, a director camera, and a system. The method comprises: determining a target video object to be shot; screening out, according to a preset director strategy, a target camera for shooting the target video object from the cameras of the director camera system in which the director camera is located; acquiring a first three-dimensional coordinate of the target video object, the first three-dimensional coordinate being the three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera; and adjusting the camera parameters of the target camera to camera parameters corresponding to the first three-dimensional coordinate, and outputting the video image with the adjusted camera parameters. This scheme can improve the efficiency of camera parameter adjustment and improve the shooting effect of the camera.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a camera parameter adjustment method, a director camera, and a system.
Background
With the continuous development of image processing technology and the internet, video conferencing is used in more and more scenarios and brings great convenience to remote communication between users. At present, when a video conference is held, multiple cameras often need to be deployed so that front images of the participants can be captured. For example, referring to fig. 1, fig. 1 is a schematic view of a video conference scene. In this scenario, as shown in fig. 1, the conference room employs an elongated oval conference table with the seats of the participants surrounding the table. The participants include A and B seated opposite each other, with cameras C0 and C1 disposed on either side of the projection screen in front of A and B. For A, a front image can be captured only by camera C0, not by camera C1; for B, a front image can be captured only by camera C1, not by camera C0. Therefore, multiple cameras are needed to realize the video conference.
When multiple cameras are deployed for conference shooting, their parameters are generally adjusted manually, for example through a remote controller, to obtain a better shooting effect. However, manual adjustment requires the operator to have some professional knowledge of cameras, and the operation process is complicated, so the adjustment efficiency is low and a better shooting effect cannot be ensured in time. Alternatively, the shooting camera can be selected and its shooting effect adjusted by means of sound source localization: a speaking participant (the "speaker") is located and tracked, one camera captures a close-up of the speaker, the speaker's face position is tracked during the close-up, and a PTZ (Pan-Tilt-Zoom) adjustment of the shot is performed so that the speaker's face lies in the middle area of the image. However, sound source localization only brings the front of the speaker to the center of the image; it does not consider the effect of the captured image and cannot ensure a better shooting effect.
Disclosure of Invention
The embodiment of the invention provides a camera parameter adjustment method, a director camera, and a system, which can improve the efficiency of camera parameter adjustment and improve the shooting effect of the camera.
In a first aspect, an embodiment of the present invention provides a method for adjusting parameters of a camera, where the method is applied to a director camera, and includes:
determining a target video object to be shot;
screening out, according to a preset director strategy, a target camera for shooting the target video object from the cameras of the director camera system in which the director camera is located;
acquiring a first three-dimensional coordinate of the target video object, wherein the first three-dimensional coordinate is a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera;
and adjusting the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinates, and outputting a video image with the shooting parameters adjusted.
The first three-dimensional coordinate may be a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera. The target camera may be a director camera or a common PTZ camera, and the first coordinate system corresponding to the target camera may refer to a three-dimensional coordinate system established with the optical center of the target camera as an origin, or a three-dimensional coordinate system established with another arbitrary reference object as an origin, which is not limited in the embodiment of the present invention.
The target video object may be any one or more video objects in a shooting scene corresponding to a director camera system in which the director camera is located.
In some embodiments, the obtaining the first three-dimensional coordinates of the target video object comprises:
acquiring a second three-dimensional coordinate transmitted by a binocular camera connected to the director camera, wherein the second three-dimensional coordinate is the three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera;
and converting the second three-dimensional coordinate into a first three-dimensional coordinate according to the pre-calibrated position relationship between the binocular camera and the target camera.
In some embodiments, the second three-dimensional coordinates may be calculated by the binocular camera through two-dimensional coordinates of the video object in the left view and the right view of the binocular camera, respectively, and the acquired internal and external parameters of the binocular camera.
The second three-dimensional coordinate system corresponding to the binocular camera may be a three-dimensional coordinate system established with the optical center of the binocular camera as an origin or a three-dimensional coordinate system established with any other reference object as an origin. The two-dimensional coordinates may specifically be pixel coordinates of the target video object in the left view and the right view of the binocular camera.
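As an illustration of this binocular computation, the following is a minimal sketch of rectified-stereo triangulation; the function and parameter names are illustrative assumptions, not from the patent, and a real implementation would work from the full calibrated internal and external parameters:

```python
import numpy as np

def triangulate_rectified(u_left, u_right, v, f, baseline, cx, cy):
    """Recover a 3D point in the binocular camera's coordinate system from
    matched pixel coordinates in a rectified left/right view pair.

    f        -- focal length in pixels (from the internal parameters)
    baseline -- distance between the two optical centers (external parameter)
    cx, cy   -- principal point of the left camera
    """
    disparity = u_left - u_right          # horizontal pixel offset between views
    if disparity <= 0:
        raise ValueError("point at infinity or mismatched views")
    z = f * baseline / disparity          # depth along the optical axis
    x = (u_left - cx) * z / f             # back-project pixel offsets to metric
    y = (v - cy) * z / f
    return np.array([x, y, z])

# A point 2 m away seen by a camera with f = 800 px and a 10 cm baseline
# produces a disparity of 800 * 0.1 / 2.0 = 40 px:
p = triangulate_rectified(u_left=680, u_right=640, v=360,
                          f=800.0, baseline=0.1, cx=640, cy=360)
```

The design choice to index everything from the left view matches the convention that the second coordinate system has its origin at the (left) optical center.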
In some embodiments, the determining a target video object to be shot comprises:
acquiring a shot image transmitted by a binocular camera, wherein the shot image comprises at least one video object;
establishing a video object model comprising the at least one video object, and determining a target video object from the at least one video object;
the method for screening out the target cameras for shooting the target video objects from all cameras of the broadcast guiding camera system where the broadcast guiding cameras are located according to the preset broadcast guiding strategy comprises the following steps:
determining the target video object from shot images obtained by all cameras in the broadcast guiding camera system respectively, and obtaining shooting effect parameters of the target video object in all the cameras;
and determining the camera with the shooting effect parameter meeting the preset director strategy as a target camera for shooting the target video object.
In some embodiments, determining the target video object from captured images acquired by a camera comprises:
converting the second three-dimensional coordinate into a third three-dimensional coordinate according to a pre-calibrated position relation between the binocular camera and the current camera;
judging whether the overlap between the region of the target video object at the third three-dimensional coordinate and the region of a video object at the three-dimensional coordinate detected by the current camera exceeds a preset area threshold;
and if so, determining that video object to be the target video object.
The current camera is any camera in the director camera system, other than the binocular camera, whose positional relationship with the director camera has been calibrated, and the third three-dimensional coordinate is the three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera.
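A minimal sketch of the overlap test above, using 2D bounding boxes on the current camera's imaging plane as a stand-in for the projected object regions (the function names and the threshold value are illustrative assumptions):

```python
def overlap_area(box_a, box_b):
    """Overlap area of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    return max(w, 0) * max(h, 0)

def match_target(projected_box, detected_boxes, area_threshold):
    """Return the index of the detected region whose overlap with the target's
    re-projected region exceeds the threshold, or None if no region qualifies."""
    best_idx, best_area = None, area_threshold
    for i, box in enumerate(detected_boxes):
        a = overlap_area(projected_box, box)
        if a > best_area:
            best_idx, best_area = i, a
    return best_idx

# The target re-projected into the current camera overlaps the second detection
# by 90 x 90 = 8100 px^2, which exceeds the threshold:
idx = match_target((100, 100, 200, 200),
                   [(300, 300, 400, 400), (110, 90, 210, 190)],
                   area_threshold=5000)
```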
In some embodiments, the shooting effect parameter includes any one or more of an eye-to-eye effect parameter, an occlusion relation parameter, and a scene object parameter of the shooting area of the target video object in the coordinate system corresponding to the current camera, where the current camera is any camera in the director camera system other than the binocular camera.
The eye-to-eye effect parameter may include a rotation angle of the target video object relative to a coordinate system corresponding to the current camera, where the rotation angle is determined according to the rotation angle of the target video object in the second coordinate system and a pre-calibrated position relationship between the binocular camera and the current camera. The smaller the rotation angle, the better the eye-to-eye effect.
The occlusion relation parameter and the scene object parameter may be determined by re-projecting the regions of the objects detected by the current camera onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera. The smaller the occlusion relation parameter (i.e., the less the target is occluded), the better the output image effect. Likewise, the smaller the area and the number of the scene objects, the better the output image effect; conversely, the worse the output image effect.
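As an illustration of how a director strategy might combine these shooting effect parameters, the following sketch scores candidate cameras and picks the best one. The weights and the normalization are assumptions made for illustration, not specified by the patent:

```python
def shooting_effect_score(rotation_angle_deg, occlusion_ratio, scene_object_ratio,
                          w_angle=1.0, w_occ=2.0, w_scene=0.5):
    """Lower is better: a small rotation angle means a more frontal
    ("eye-to-eye") view, and small occlusion / scene-object ratios mean a
    cleaner image.  Weights are illustrative only."""
    return (w_angle * abs(rotation_angle_deg) / 90.0
            + w_occ * occlusion_ratio
            + w_scene * scene_object_ratio)

def pick_target_camera(candidates):
    """candidates: list of (camera_id, angle_deg, occlusion, scene_ratio).
    Returns the id of the camera with the best (lowest) score."""
    return min(candidates, key=lambda c: shooting_effect_score(*c[1:]))[0]

# Camera "C0" sees the target almost frontally and unoccluded, so it wins:
best = pick_target_camera([("C0", 5.0, 0.0, 0.1),
                           ("C1", 70.0, 0.3, 0.2)])
```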
In a second aspect, an embodiment of the present invention further provides a director camera, including: a memory and a processor, the processor coupled to the memory; wherein,
the memory is used for storing driving software;
the processor reads the driving software from the memory and executes part or all of the steps of the camera parameter adjustment method of the first aspect under the action of the driving software.
In a third aspect, an embodiment of the present invention further provides a parameter adjusting apparatus, which includes an object determining unit, a selecting unit, an obtaining unit, and a parameter adjusting unit, and the parameter adjusting apparatus implements, through the above units, part or all of the steps of the camera parameter adjusting method in the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores a program, and the program includes, when executed, some or all of the steps of the camera parameter adjustment method according to the first aspect.
In a fifth aspect, an embodiment of the present invention further provides a director camera system, including a first camera and at least one second camera, where the first camera includes a director camera and a binocular camera; the director camera and the binocular camera, as well as the first camera and the second camera, are connected through a wired interface or a wireless interface; wherein,
the director camera is used for determining a target video object to be shot and screening out, according to a preset director strategy, a target camera for shooting the target video object from the cameras of the director camera system;
the binocular camera is used for acquiring a second three-dimensional coordinate of the target video object and transmitting the second three-dimensional coordinate to the director camera, the second three-dimensional coordinate being the three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera;
the director camera is further used for receiving the second three-dimensional coordinate transmitted by the binocular camera; converting the second three-dimensional coordinate into a first three-dimensional coordinate according to the pre-calibrated positional relationship between the binocular camera and the target camera; adjusting the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinate; and outputting the video image with the adjusted shooting parameters. The first three-dimensional coordinate is the three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera.
In some embodiments, the second camera may include a director camera and a binocular camera, and the target camera may be any director camera in the director's camera system; alternatively, the second camera may also be a normal PTZ camera, and the target camera may be the director camera or a normal PTZ camera.
In some embodiments, the binocular camera may be disposed on a preset director's stand and connected to the director's camera through the director's stand.
The embodiment of the invention has the following beneficial effects:
in the embodiment of the invention, after the target video object to be shot is determined, the target camera with the best shooting effect for the target video object is screened out from the cameras of the director camera system according to the preset director strategy, the three-dimensional coordinate of the target video object in the coordinate system corresponding to the target camera is obtained, the target camera is controlled to adjust its camera parameters according to that three-dimensional coordinate, and the video image with the adjusted shooting parameters is output. In this way the director camera system performs video object detection and applies the preset director strategy on the basis of three-dimensional coordinates, which improves the precision of video object detection and tracking, improves the efficiency of camera parameter adjustment, and effectively improves the shooting effect of the camera.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic view of a video conference scenario;
fig. 2 is a schematic flowchart of a method for adjusting parameters of a camera according to an embodiment of the present invention;
FIG. 3a is a schematic view of an imaging model of a camera according to an embodiment of the present invention;
FIG. 3b is a schematic diagram of a calibration scenario of multiple cameras according to an embodiment of the present invention;
fig. 3c is a three-dimensional positioning schematic diagram of a binocular camera according to an embodiment of the present invention;
FIG. 3d is a schematic view of a rotational model of a PTZ camera provided by an embodiment of the present invention;
fig. 4a is a schematic diagram of a video object matching scene according to an embodiment of the present invention;
FIG. 4b is an imaging view of a set of video objects of FIG. 4a;
fig. 5 is a schematic structural diagram of a parameter adjusting apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a director camera system according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a first camera according to an embodiment of the present invention;
fig. 8 is a schematic networking diagram of a director camera system according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a director camera according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be understood that the references to "first", "second", and "third", etc. in the embodiments of the present invention are for distinguishing different objects, and are not intended to describe a particular order. Furthermore, the terms "comprises" and any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
It should be understood that the director camera in the embodiments of the present invention may specifically be a PTZ camera implementing the technical solutions of the embodiments, and the PTZ camera may be connected to a binocular camera. The director camera may be applied to scenarios such as conferences and training, and its position and number may be chosen according to the scenario.
In some embodiments, the binocular camera may be mounted on a director mount, i.e., the director camera may be coupled to the binocular camera through the director mount ("mount" for short). The director camera is used for directing the shooting and for tracking. In addition, a microphone can be mounted on the mount and used to implement functions such as sound source localization and sound source recognition. The director camera and the mount may be separate or integrated, and may communicate with each other through a control interface, such as a serial interface.
In some embodiments, the binocular camera may be used for video acquisition, video pre-processing, motion detection, face detection, human shape detection, scene object detection, feature detection/matching, binocular camera calibration, multi-camera calibration, and so on; the microphone may be used for audio acquisition, audio pre-processing, sound source localization, sound source behavior recognition, and so on; and the director camera may be used for audio-video (AV) object 3D localization, AV object modeling, AV object tracking, motion/posture recognition, director control, video switching/synthesis, and so on. Here, video acquisition comprises synchronously acquiring the video streams of the binocular camera and the director camera; video pre-processing comprises pre-processing the input binocular images, for example denoising or changing resolution and frame rate; motion detection comprises detecting a moving object in the scene and separating it from the static background to obtain the region of the moving object; face detection comprises detecting face objects in the scene and outputting face detection information such as face position, area, and direction; human shape detection comprises detecting the head-and-shoulder region of a human shape in the scene and outputting detection information; scene object detection comprises detecting objects in the scene other than people, such as lamps, windows, and conference tables; feature detection/matching comprises performing feature detection and matching on a detected moving object region, detecting a feature (such as a feature point) in one image, matching it in the other image, and outputting the matched feature information; binocular camera calibration comprises calibrating the binocular camera to obtain its internal and external parameters, which are used to calculate the three-dimensional coordinates of a video object in a video image; and multi-camera calibration comprises calibrating the relative positional relationships of multiple director cameras, obtaining their relative reference information, and locating a video object in multiple camera coordinate systems. Further, audio acquisition comprises synchronously acquiring the multi-channel audio data of the microphone; audio pre-processing comprises performing 3A processing on the input multi-channel audio data, where for audio the 3A processing comprises acoustic echo cancellation (AEC), automatic gain control (AGC), and automatic noise suppression (ANS); sound source localization comprises analyzing the input multi-channel audio data to find the two-dimensional position information of the sounding object; and sound source behavior recognition comprises detecting and counting the speech behaviors of the video objects in the scene.
Further, AV object 3D localization comprises obtaining depth information of object features in the image according to the internal and external parameters of the binocular camera and the disparity information obtained by feature detection/matching, obtaining the three-dimensional position information of the object features in a single director camera's coordinate system in combination with the sound source localization result, and deriving the position of the features in the other director cameras' coordinate systems from their position in the single director camera coordinate system and the relative positional relationships of the director cameras. AV object modeling comprises constructing a model of the AV object by combining information such as sound source localization, face information, feature objects, and scene objects; AV object tracking comprises tracking multiple AV objects in the scene and updating their state information; motion/posture recognition comprises recognizing the motions and postures of an AV object, such as a standing posture or a gesturing motion; and director control comprises determining a director strategy by combining the motion/posture recognition result and the sound source behavior recognition result, and outputting the camera control instructions, video object and scene feature information, video output strategy, and the like corresponding to that director strategy.
The camera control instructions can be used for controlling a PTZ camera to perform PTZ operations, i.e., pan, tilt, and zoom; the video object and scene feature information can be used for information sharing among multiple director cameras; and the video output strategy can be used for controlling the output of the video streams of a single director camera or of multiple director cameras.
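As a hedged sketch of how such PTZ control values might be derived from the target's three-dimensional coordinate in the PTZ camera's own coordinate system, consider the trigonometry below. The zoom model is a deliberate simplification and all names are illustrative, not the patent's control protocol:

```python
import math

def ptz_from_coordinate(x, y, z, target_width, desired_fraction=0.5, hfov_deg=60.0):
    """Pan/tilt angles (degrees) that center a point at (x, y, z) in the
    camera's coordinate system (z forward, x right, y down), plus a zoom
    factor that makes a target of the given metric width occupy the desired
    fraction of a frame with the given horizontal field of view."""
    pan = math.degrees(math.atan2(x, z))                      # rotate left/right
    tilt = math.degrees(math.atan2(-y, math.hypot(x, z)))     # rotate up/down
    # Angle the target currently subtends vs. the angle we want it to fill:
    subtended_deg = 2 * math.degrees(math.atan2(target_width / 2, z))
    zoom = (hfov_deg * desired_fraction) / subtended_deg
    return pan, tilt, zoom

# A target 1 m to the right and 1 m ahead requires a 45-degree pan:
pan, tilt, zoom = ptz_from_coordinate(x=1.0, y=0.0, z=1.0, target_width=0.5)
```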
The embodiment of the invention provides a camera parameter adjustment method, a director camera, and a system, which can improve the efficiency of camera parameter adjustment and the shooting effect of the camera. The details are described below.
Further, please refer to fig. 2, fig. 2 is a schematic flow chart of a camera parameter adjustment method according to an embodiment of the present invention. In particular, the method of the embodiment of the present invention may be specifically applied to the director camera described above. As shown in fig. 2, the method for adjusting parameters of a camera according to an embodiment of the present invention may include the following steps:
101. and determining a target video object to be shot.
102. And screening out, according to a preset director strategy, a target camera for shooting the target video object from the cameras of the director camera system in which the director camera is located.
Optionally, the determining of the target video object to be shot may specifically be: acquiring a shot image transmitted by a binocular camera, wherein the shot image comprises at least one video object; and establishing a video object model comprising the at least one video object, and determining a target video object from the at least one video object. Further, the screening out of the target camera for shooting the target video object from the cameras of the director camera system according to the preset director strategy may specifically be: determining the target video object in the shot images respectively obtained by the cameras in the director camera system, and obtaining shooting effect parameters of the target video object for each camera; and determining a camera whose shooting effect parameter meets the preset director strategy as the target camera for shooting the target video object. The director camera system may deploy one or more director cameras; that is, it may be composed of director camera + director camera, or of director camera + ordinary camera (e.g., an ordinary PTZ camera). Specifically, the video object model may include all video objects in the shooting scene corresponding to the director camera system in which the director camera is located. If the other cameras in the director camera system also include director cameras, the shot images sent by the binocular cameras connected to those other director cameras can be received, and the video object model can be updated to cover all the video objects in the shooting scene. The target video object can be any one or more video objects in the shooting scene.
Optionally, the shooting effect parameter may include any one or more of an eye-to-eye effect parameter, an occlusion relation parameter, and a scene object parameter of the shooting area of the target video object in the coordinate system corresponding to the current camera. The current camera is any camera in the director camera system other than the binocular camera; that is, the current camera may be any director camera in the director camera system or an ordinary PTZ camera.
The eye-to-eye effect parameter may include the rotation angle of the target video object relative to the coordinate system corresponding to the current camera, and the rotation angle may be determined according to the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera. Specifically, the rotation angle of the target video object relative to the coordinate system corresponding to the current camera may refer to the angle between the face or human shape corresponding to the target video object and the optical axis of the current camera (the director camera or an ordinary PTZ camera). The smaller the angle, the more frontally the face is presented, that is, the better the eye-to-eye effect and the better the output image effect.
The occlusion relation parameter and the scene object parameter may be determined by re-projecting the regions of the objects detected by the current camera onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera. Specifically, if the regions of two video objects overlap, the occlusion relationship between them can be determined using depth information: the video object closer to the binocular camera occludes the one farther away. The smaller the occlusion relation parameter (i.e., the less the target is occluded), the better the output image effect. The scene objects indicated by the scene object parameter can include lamps, windows, tables, and the like; the smaller the area and the number of the scene objects, the better the output image effect, and conversely, the worse.
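The depth-based occlusion test can be sketched as follows, again using 2D boxes on the imaging plane to stand for the re-projected object regions (illustrative names, not the patent's implementation):

```python
def occlusion_pairs(objects):
    """objects: list of (name, box, depth) with box = (x1, y1, x2, y2) on the
    imaging plane and depth = distance from the binocular camera.  Returns a
    (occluder, occluded) pair for every overlapping pair of regions: the
    object closer to the camera occludes the farther one."""
    def overlaps(a, b):
        return (min(a[2], b[2]) > max(a[0], b[0])
                and min(a[3], b[3]) > max(a[1], b[1]))

    pairs = []
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            (name_a, box_a, d_a), (name_b, box_b, d_b) = objects[i], objects[j]
            if overlaps(box_a, box_b):
                pairs.append((name_a, name_b) if d_a < d_b else (name_b, name_a))
    return pairs

# B sits behind A and their image regions overlap, so A occludes B:
pairs = occlusion_pairs([("A", (0, 0, 100, 100), 1.5),
                         ("B", (50, 50, 150, 150), 3.0)])
```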
103. And acquiring a first three-dimensional coordinate of the target video object.
The first three-dimensional coordinate may be a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera. The target camera may be configured as the above-mentioned director camera or a general PTZ camera, and the first coordinate system corresponding to the target camera may refer to a three-dimensional coordinate system established with the optical center of the target camera as an origin, or a three-dimensional coordinate system established with another arbitrary reference object as an origin, which is not limited in the embodiment of the present invention.
Optionally, the director camera may be connected to a preset binocular camera. The obtaining of the first three-dimensional coordinate of the target video object may specifically be: acquiring a second three-dimensional coordinate transmitted by the binocular camera connected to the director camera; and converting the second three-dimensional coordinate into the first three-dimensional coordinate according to the pre-calibrated positional relationship between the binocular camera and the target camera. Further optionally, the second three-dimensional coordinate may be calculated by the binocular camera from the two-dimensional coordinates of the target video object in the left view and the right view of the binocular camera, together with the internal and external parameters of the binocular camera. The second coordinate system corresponding to the binocular camera may be a three-dimensional coordinate system established with the optical center of the binocular camera as the origin, or with any other reference object as the origin. The two-dimensional coordinates may specifically be the pixel coordinates of the target video object in the left view and the right view of the binocular camera.
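A minimal sketch of this coordinate conversion, assuming the pre-calibrated positional relationship is expressed as a rotation matrix R and a translation vector t giving the binocular camera's frame in the target camera's frame (an assumption made for illustration):

```python
import numpy as np

def convert_coordinate(p_second, R, t):
    """Map a point from the binocular camera's coordinate system (the second
    coordinate system) into the target camera's (the first coordinate system)
    using the calibrated pose: p_first = R @ p_second + t."""
    return R @ np.asarray(p_second, dtype=float) + np.asarray(t, dtype=float)

# If the target camera sits 0.2 m to the right of the binocular camera with
# the same orientation, a point shifts 0.2 m left in its coordinate system:
p_first = convert_coordinate([0.1, 0.0, 2.0], np.eye(3), [-0.2, 0.0, 0.0])
```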
In a specific embodiment, three positional relationships may be calibrated in advance: the relationship between the two cameras inside a binocular camera, the relationship between a director camera and its binocular camera, and the relationships among the cameras of the multiple machine positions in the director camera system. The parameters obtained by calibrating the binocular camera are used to calculate the three-dimensional coordinates of a video object in the coordinate system corresponding to that binocular camera; the calibrated relationship between the director camera and the binocular camera is used to calculate the three-dimensional coordinates of the video object in the director camera coordinate system; and the calibrated relationships among the cameras of the multiple machine positions are used, in a multi-position deployment scene, to calculate the three-dimensional coordinates of the video object in the camera coordinate system of each machine position, which facilitates coordinate conversion. The multi-position deployment may combine several director cameras, or combine a director camera with common PTZ cameras. Each director camera may be called a machine position; when several director cameras cooperate in shooting, one may be designated the master position and the rest the slave positions, and each director camera serving as a slave position may register information such as its IP (Internet Protocol) address with the master position, so that the master position can manage the slave positions. The calibration process is briefly described below.
The binocular camera includes a left camera and a right camera, an image acquired by the left camera may be referred to as a left view, and an image acquired by the right camera may be referred to as a right view. The imaging (projection) model of a single camera can be described by the following formula:
x=PX=K[R|t]X
As shown in fig. 3a, x is the pixel coordinate of a certain point in the scene (i.e. a video object, or more precisely a feature point of the video object) in the image coordinate system, and is a two-dimensional coordinate; X is the position of the point in the world coordinate system; P is the 3 × 4 projection matrix, and PX denotes the matrix product of P and X. K is the 3 × 3 camera intrinsic matrix, which can be expressed as:

K = [ fx  s   cx ]
    [ 0   fy  cy ]
    [ 0   0   1  ]

where fx and fy are the equivalent focal lengths in the x and y directions, cx and cy are the coordinates of the principal point, and s is the skew deformation coefficient (sensor plane not perpendicular to the optical axis; it is usually small and can be ignored in the calibration process).
Further, R and t are camera external parameters, respectively expressed as a rotation matrix of 3 × 3 and a translation vector of 3 × 1, as follows:
R=[r1 r2 r3]
t=[t1 t2 t3]T
where r1, r2 and r3 are the 3 × 1 column vectors of the rotation matrix.
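As a concrete illustration, the projection x = K[R|t]X above can be sketched as follows; the intrinsic values (fx = fy = 800, cx = 320, cy = 240, s = 0) and the identity pose are hypothetical, not calibration data from the embodiment:

```python
def project(K, R, t, X):
    """Pinhole projection x = K [R|t] X: world point -> pixel coordinates."""
    # Transform the world point into the camera frame: Xc = R X + t.
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Apply the intrinsic matrix and dehomogenize.
    u = K[0][0] * Xc[0] + K[0][1] * Xc[1] + K[0][2] * Xc[2]
    v = K[1][1] * Xc[1] + K[1][2] * Xc[2]
    w = Xc[2]
    return (u / w, v / w)

K = [[800, 0, 320], [0, 800, 240], [0, 0, 1]]  # assumed intrinsics, s = 0
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0.0, 0.0, 0.0]
print(project(K, R, t, [0.0, 0.0, 2.0]))  # point on the optical axis -> (320.0, 240.0)
```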
Due to factors such as the optical characteristics of the camera lens and the manufacture and installation of the image sensor, the image actually captured by a camera is not ideal but distorted, so the distortion can be modeled in order to recover an ideal image. Specifically, the camera image distortion can be described by the following formulas:

xp = xd·(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2·p1·xd·yd + p2·(r² + 2·xd²)
yp = yd·(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1·(r² + 2·yd²) + 2·p2·xd·yd

where r² = xd² + yd², (xp, yp) is the corrected pixel position, (xd, yd) is the pixel position before correction, k1, k2, k3 are the radial distortion coefficients, and p1, p2 are the tangential distortion coefficients.
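The polynomial distortion correction can be sketched as a direct evaluation of the formulas above; the coefficient values in the example are invented:

```python
def correct_distortion(xd, yd, k1, k2, k3, p1, p2):
    """Evaluate the radial + tangential polynomial to map a pre-correction
    point (xd, yd) to its corrected position (xp, yp)."""
    r2 = xd * xd + yd * yd
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    xp = xd * radial + 2 * p1 * xd * yd + p2 * (r2 + 2 * xd * xd)
    yp = yd * radial + p1 * (r2 + 2 * yd * yd) + 2 * p2 * xd * yd
    return xp, yp

# Hypothetical coefficients: mild barrel distortion (k1 < 0), no tangential term.
xp, yp = correct_distortion(0.1, 0.0, -0.2, 0.0, 0.0, 0.0, 0.0)
```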
Based on the imaging model of a single camera, once the rotation matrices R1 and R2 and the translation vectors t1 and t2 that transform the world coordinate system into the left and right camera coordinate systems are known, the relative extrinsic parameters between the two cameras of the binocular camera can be obtained, namely a rotation matrix R and a translation vector T:

R = R2·R1ᵀ
T = t2 − R·t1
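The relative extrinsics above can be sketched in pure Python; the identity rotations and the 0.1 m offset of the right camera are invented example values:

```python
def matmul3(A, B):
    """3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose3(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def relative_extrinsics(R1, t1, R2, t2):
    """Given world->left (R1, t1) and world->right (R2, t2), return the
    left->right relative pose: R = R2 R1^T, T = t2 - R t1."""
    R = matmul3(R2, transpose3(R1))
    T = [t2[i] - sum(R[i][j] * t1[j] for j in range(3)) for i in range(3)]
    return R, T

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
R, T = relative_extrinsics(I3, [0, 0, 0], I3, [0.1, 0, 0])
```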
it should be understood that, in the embodiment of the present invention, the positional relationship between the binocular cameras and the positional relationship between the director camera, such as the PTZ camera, and the binocular cameras are fixed, and the two calibrations may be performed before shipment, that is, the data obtained by the two calibrations, such as the internal and external parameter data, is fixed. Optionally, in the embodiment of the present invention, the calibration of the camera may adopt various schemes, such as a planar calibration method (also referred to as a Zhang calibration method) of Zhang, and a Brown method is adopted for calculating the distortion parameter, which is not described herein again.
Further, as can be seen from the binocular calibration principle above, calibrating the positional relationship of cameras at multiple machine positions, such as multiple director cameras, is essentially finding the relative extrinsic parameters between adjacent director cameras, and then computing the extrinsic parameters between any two director cameras from these, thereby obtaining the positional relationship between any two director cameras. When multiple director cameras are deployed, a large shooting overlap area is needed between every two adjacent director cameras, so that the machine positions form a system similar to a surround multi-camera rig. The rotation matrix and translation vector of the i-th camera relative to the j-th camera are:
R(i,j) = R(i,i-1)·R(i-1,i-2)· ... ·R(j+1,j)
T(i,j) = R(i,i-1)· ... ·R(j+2,j+1)·T(j+1,j) + R(i,i-1)· ... ·R(j+3,j+2)·T(j+2,j+1) + ... + R(i,i-1)·T(i-1,i-2) + T(i,i-1)
where R(i,i-1)·R(i-1,i-2)· ... ·R(j+1,j) denotes the matrix product R(i,i-1) × R(i-1,i-2) × ... × R(j+1,j). When the director cameras are deployed, the positions of the cameras on the different director supports change with the actual deployment scene, so the positional relationships among multiple director cameras cannot be pre-calibrated before the equipment leaves the factory; instead, on-site calibration can be performed at deployment time.
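Chaining adjacent relative poses as in the formulas above can be sketched by folding one adjacent pair at a time; the identity rotations and unit translations in the example are invented:

```python
def matmul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def compose(R_ab, T_ab, R_bc, T_bc):
    """Compose X_a = R_ab X_b + T_ab with X_b = R_bc X_c + T_bc into
    the c->a transform: R_ac = R_ab R_bc, T_ac = R_ab T_bc + T_ab."""
    R = matmul3(R_ab, R_bc)
    T = [sum(R_ab[i][j] * T_bc[j] for j in range(3)) + T_ab[i] for i in range(3)]
    return R, T

# Two adjacent steps, each translating 1 m along x, compose to 2 m along x.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
R, T = compose(I3, [1.0, 0, 0], I3, [1.0, 0, 0])
```

Applying `compose` repeatedly along the chain of adjacent machine positions yields R(i,j) and T(i,j) for any pair.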
Further, it is assumed that the cameras in the director camera system are all director cameras, each connected with a binocular camera. Referring to fig. 3b, fig. 3b is a schematic view of a calibration scene of multiple director cameras according to an embodiment of the present invention. As shown in fig. 3b, during calibration, communication between two adjacent director cameras, or between a binocular camera and a director camera, may take place over a Local Area Network (LAN) or a wireless fidelity (Wi-Fi) network, and includes transmitting calibration template images, calibration parameters, and the like; the transmission may use various network protocols, such as the HyperText Transfer Protocol (HTTP). Specifically, a global ID number may be assigned in advance to each director camera and to each camera of the binocular camera connected to it. For example, a camera may be selected as the starting position, such as the leftmost or rightmost one, and the ID numbers of the other cameras incremented in a counterclockwise or clockwise direction. A group of cameras is selected from all cameras to participate in the calibration, the selection principle being to maximize the overlap area between adjacent cameras. As shown in fig. 3b, assume that 3 machine positions D1, D2 and D3 are deployed in the current shooting scene, each comprising a director camera (denoted PTZ0, PTZ1 and PTZ2 respectively) and a binocular camera (whose left and right cameras are denoted C0, C1, C2, C3, C4 and C5). Assume that the cameras with ID numbers C0, C2 and C4 are selected for calibration, and that one of the director cameras, such as the director camera of the master position described above, is selected as the calibration computing device.
Furthermore, pairwise extrinsic calibration can be carried out from left to right or from right to left, obtaining the relative extrinsic parameters between each pair of cameras. Optionally, a table of relative camera positions may be maintained in each director camera, as shown in Table 1 below. Each calibration adds or updates one entry, and each entry is uniquely determined by the ID numbers of the two cameras.
Table 1

Entry | Camera ID1 | Camera ID2 | Relative extrinsics
1 | C0 | C2 | R02, T02
2 | C2 | C4 | R24, T24
... | ... | ... | ...
After calibration is completed, the director camera serving as the calibration computing device can send the positional relationship table to all other director cameras over the network for storage. Further, from the positional relationship table and the extrinsic parameters of each binocular camera (which can be calibrated before leaving the factory), the positional relationship between any two cameras in the calibrated scene can be calculated, whether between binocular cameras, between a binocular camera and a PTZ camera, or between PTZ cameras.
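An in-memory sketch of the positional relationship table (Table 1) and its pair lookup follows; the R/T entries are string placeholders rather than real calibration matrices:

```python
# Each entry is keyed by the two camera IDs, as in Table 1.
relation_table = {
    ("C0", "C2"): ("R02", "T02"),  # placeholders for the actual R, T data
    ("C2", "C4"): ("R24", "T24"),
}

def lookup(table, id1, id2):
    """Return the relative extrinsics entry for a camera pair, in either order.

    When the pair is stored in the reversed order, the caller must invert
    R and T to get the transform in the requested direction."""
    if (id1, id2) in table:
        return table[(id1, id2)]
    if (id2, id1) in table:
        return table[(id2, id1)]
    return None  # pair not calibrated (directly); compose via the chain instead
```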
For example, assuming that the director camera D3 in fig. 3b is the calibration computing device and the director cameras D1 and D2 are the cameras to be calibrated, D3 may be set as the master position, the other cameras as slave positions, and calibration initiated through D3. Before calibration, it is necessary to ensure that the cameras are interconnected through the network, that the cameras to be calibrated can shoot overlapping regions, and that a calibration template (such as a checkerboard) is placed in the overlapping regions. When calibration is needed, D3 starts the calibration process and sends an image acquisition command to D1; the command includes the ID number of the director camera (D1) and the ID number of the binocular camera whose image is needed (C4 or C5). D1 receives the command, captures an image of the calibration template, and transmits the image data to D3. Similarly, D3 obtains a calibration template image captured by the binocular camera on D2. If the binocular camera to be calibrated is located on D3 itself, D3 may acquire the calibration template image directly. After obtaining the calibration template images of the cameras to be calibrated, D3 may perform checkerboard corner detection on the two images; if all checkerboard corners are detected in both images, the acquisition is successful; otherwise, the two images are discarded and re-acquired.
Furthermore, by changing the position of the calibration template, multiple calibration template images of the two cameras to be calibrated can be acquired in a loop and stored in the director camera D3. When the required number of calibration template images is reached, D3 can perform the camera calibration; since the internal parameters of each camera are calibrated before leaving the factory, they can be used as the initial input values of the calibration. After calibration, the relative extrinsic parameters R and T between the two cameras are obtained, and the reprojection error is computed: if it exceeds a preset threshold, the calibration fails; otherwise, the calibration succeeds. After D3 completes the calibration, it may update the positional relationship table with the computed relative positions and send it to the other director cameras.
Furthermore, after the positional relationships inside each binocular camera, between each director camera and its binocular camera, and among the multiple director cameras have been calibrated, a video object within the shooting range of the director cameras can be located: its three-dimensional position information is obtained, a suitable machine position is determined from that information, the parameters of the director camera are adjusted according to the director strategy corresponding to the three-dimensional position information, and the director camera is controlled to point at the proper position to shoot the video object. The locating of a video object comprises three-dimensional positioning by a binocular camera, positioning by a single director camera such as a PTZ camera, and three-dimensional positioning across the cameras of multiple machine positions.
Specifically, in binocular three-dimensional positioning, the depth of an observation point in the scene in the camera coordinate system can be calculated from the stereo image pair captured by the binocular camera, thereby determining the three-dimensional position of the point. The principle is the same as that by which human eyes perceive depth, and is called binocular ranging. Fig. 3c provides a schematic diagram of binocular three-dimensional positioning; the ranging principle of a binocular camera system is briefly introduced below. P is an observation point in the world coordinate system, imaged by both the left and the right camera. The position of P in the physical coordinate system of the left camera is (XL, YL, ZL), and the pixel coordinate of its imaging point in the left view is (xl, yl); its position in the physical coordinate system of the right camera is (XR, YR, ZR), and the pixel coordinate of its imaging point in the right view is (xr, yr). Assume that the relative extrinsic parameters of the left and right cameras are R and T, and that the focal lengths of the left and right cameras are fl and fr respectively. According to the binocular camera model, the imaging models of the two cameras and the relation between their physical coordinates are:

xl = fl·XL/ZL,  yl = fl·YL/ZL
xr = fr·XR/ZR,  yr = fr·YR/ZR
[XR, YR, ZR]ᵀ = R·[XL, YL, ZL]ᵀ + T
the following can be derived from the above formula:
wherein x isl,yl,xr,yrThe value of (a) can be obtained by image matching, fl,frR, T can be obtained by calibrating a binocular camera, so that X can be calculatedL,YL,ZLAnd XR,YR,ZRSo as to determine the three-dimensional coordinates of the observation point in the scene in the coordinate system corresponding to the binocular camera.
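As a simplified numerical illustration of the derivation above, in the special rectified case (R = I, T along the x-axis with baseline b, fl = fr = f, pixel coordinates measured from the principal point) the solution reduces to depth from disparity; all numbers below are invented:

```python
def triangulate_rectified(xl, yl, xr, f, baseline):
    """Triangulate a point for a rectified stereo pair.

    Assumes R = I, T = [-baseline, 0, 0], equal focal lengths f, and
    image coordinates already centered on the principal point."""
    disparity = xl - xr
    Z = f * baseline / disparity      # depth from disparity
    X = xl * Z / f                    # back-project through the left camera
    Y = yl * Z / f
    return X, Y, Z

# Hypothetical measurement: f = 800 px, baseline = 0.1 m, disparity = 20 px.
X, Y, Z = triangulate_rectified(40.0, 0.0, 20.0, 800.0, 0.1)
```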
Further, in the three-dimensional positioning of a director camera such as a PTZ camera, the basic problem is: knowing the physical coordinates of a target in the PTZ camera coordinate system, how to rotate the PTZ camera so that a given point of the target lands on a specific pixel position in the image. The physical coordinates of the target in the PTZ camera coordinate system can be computed from its three-dimensional position in the binocular camera coordinate system and the calibrated positional relationship between the binocular camera and the PTZ camera. Fig. 3d is a schematic view of the rotation model of a PTZ camera according to an embodiment of the present invention. As shown in fig. 3d, assume that the pixel position where the target point P is expected to appear is (x0, y0), the physical coordinates of P are (X, Y, Z), and its current pixel position on the imaging plane is (xc, yc). Then, under the pinhole model, the rotation angles Δt around the X-axis (tilt) and Δp around the Y-axis (pan) that bring the pixel position of P to the desired position can be modeled as:

Δp = arctan((xc − cx)/fx) − arctan((x0 − cx)/fx)
Δt = arctan((yc − cy)/fy) − arctan((y0 − cy)/fy)
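The pan/tilt increments can be sketched directly from the pinhole approximation described above; the intrinsic values (fx = fy = 800, cx = 320, cy = 240) are hypothetical:

```python
import math

def pan_tilt_delta(xc, yc, x0, y0, fx, fy, cx, cy):
    """Pan/tilt increments (radians) that move the pixel at (xc, yc)
    to the desired image position (x0, y0), under the pinhole model."""
    dpan = math.atan((xc - cx) / fx) - math.atan((x0 - cx) / fx)
    dtilt = math.atan((yc - cy) / fy) - math.atan((y0 - cy) / fy)
    return dpan, dtilt

# Center a target currently 800 px right of the principal point.
dp, dt = pan_tilt_delta(1120.0, 240.0, 320.0, 240.0, 800.0, 800.0, 320.0, 240.0)
```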
since the PTZ camera is a zoom camera, it is necessary to acquire a functional relationship between the zoom factor Z and internal parameters such as a focal length and a distortion coefficient. For example, a polynomial fit may be used to the zoom factor Z and the focal length fx,fyThe following relationship is obtained:
fx = a0 + a1·Z + a2·Z² + ... + an·Zⁿ
fy = b0 + b1·Z + b2·Z² + ... + bn·Zⁿ
Specifically, the camera internal parameters are calibrated at different values of Z, the corresponding fx, fy and distortion coefficients are obtained, and the polynomial coefficients are fitted using the least-squares method. Other parameters, such as the distortion coefficients, can be handled in a similar manner. Once the internal parameters at different Z values are available, the values of Δp and Δt can be calculated from the pan/tilt model formula.
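The least-squares fit of focal length against zoom factor can be sketched in pure Python via the normal equations; the sample (Z, fx) pairs are invented and lie exactly on a line so the result is easy to check:

```python
def polyfit(zs, fs, degree):
    """Least-squares fit f(Z) = a0 + a1 Z + ... + an Z^n via normal equations."""
    n = degree + 1
    # Normal equations: (A^T A) a = A^T f for the Vandermonde matrix A.
    M = [[sum(z ** (i + j) for z in zs) for j in range(n)] for i in range(n)]
    b = [sum(f * z ** i for z, f in zip(zs, fs)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        coeffs[r] = (b[r] - sum(M[r][c] * coeffs[c] for c in range(r + 1, n))) / M[r][r]
    return coeffs

# Hypothetical calibration samples: fx = 3 + 2 Z, fitted with a degree-1 polynomial.
coeffs = polyfit([1.0, 2.0, 3.0, 4.0], [5.0, 7.0, 9.0, 11.0], 1)
```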
Further optionally, in a multi-director-camera scene, the captured images sent by the binocular cameras connected to the other director cameras can also be acquired, and the video object model updated after video object matching. Determining the target video object from the captured image acquired by a camera may then specifically be: converting the second three-dimensional coordinate into a third three-dimensional coordinate according to the pre-calibrated positional relationship between the binocular camera and the current camera; judging whether the overlap between the region of the target video object at the third three-dimensional coordinate and the region of a video object at the three-dimensional coordinate detected by the current camera exceeds a preset area threshold; and if so, determining that video object to be the target video object, i.e. the video object matching succeeds. The third three-dimensional coordinate is the coordinate of the target video object in the third coordinate system corresponding to the current camera, and the current camera is any camera in the director camera system other than the binocular camera mentioned above; for example, in a multi-director-camera scene, the current camera may be a binocular camera other than the binocular camera of the master position.
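The overlap test against a preset area threshold can be sketched with axis-aligned rectangles standing in for the projected object regions; the 0.5 threshold and all coordinates are assumptions for illustration:

```python
def overlap_ratio(a, b):
    """Intersection area of two axis-aligned rectangles (x1, y1, x2, y2),
    normalized by the area of rectangle a."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    return ix * iy / area_a

def is_same_object(converted_region, detected_region, threshold=0.5):
    """Match succeeds when the converted region overlaps the detection enough."""
    return overlap_ratio(converted_region, detected_region) >= threshold
```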
In a specific embodiment, the purpose of multi-position three-dimensional positioning of a video object is to calculate its three-dimensional coordinates in the coordinate systems of the binocular cameras or PTZ cameras of the other director cameras, given its three-dimensional coordinates in the binocular coordinate system of one director camera. Knowing the coordinate vector X1 of an observation point (i.e. a video object, specifically a feature point of the video object) in camera D1, and the extrinsic parameters R21, T21 of camera D2 relative to camera D1 (obtained by binocular camera calibration), the coordinate vector X2 of the observation point in camera D2 can be calculated as:

X2 = R21·X1 + T21
Specifically, please refer to fig. 4a, which is a schematic diagram of an object matching scene according to an embodiment of the present invention. Multi-position three-dimensional positioning of video objects may be used to determine the correspondence between multiple video objects. As shown in fig. 4a, three machine positions D1, D2 and D3 are deployed in a scene with three participants O1, O2 and O3. Further, fig. 4b shows a group of video object images of the video object of fig. 4a, namely the imaging of participant O1 in director cameras D1, D2 and D3 from different perspectives. For participant O1, the binocular camera at position D1 detects a video object VO11 through algorithms such as face detection, and the binocular three-dimensional positioning algorithm then yields the three-dimensional position of the object in the D1 binocular coordinate system. Similarly, D2 and D3 detect video objects VO12 and VO13 and calculate their three-dimensional coordinates in the D2 and D3 binocular coordinate systems. In multi-position three-dimensional positioning, the calibrated positional relationships among D1, D2 and D3 can be used to convert the three-dimensional coordinates of VO11 from the D1 coordinate system into the D2 and D3 coordinate systems, and the overlap regions are then detected. If the three-dimensional position of the converted VO11 overlaps the positions of VO12 and VO13 by more than a certain area threshold, then VO11, VO12 and VO13 can be considered successfully matched video objects. However, if several video objects lie close together in the image, determining the correspondence only by position overlap may cause matching errors.
Therefore, the accuracy of the correspondence can be improved by further combining the image information of the video objects with a matching algorithm, such as a template matching algorithm. For example, using template matching, the two-dimensional image of a video object detected by the binocular camera of one machine position, such as the master position, is taken as the known template; the video objects detected by the binocular cameras of the other machine positions are matched against this template one by one, and the best-matching object is found through algorithms such as squared-difference matching and correlation matching, thereby establishing the correspondence between objects.
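Squared-difference matching, as mentioned above, can be sketched on tiny gray-scale patches; the pixel values are invented:

```python
def ssd(patch, template):
    """Sum of squared differences between two equal-sized gray-scale patches."""
    return sum((p - t) ** 2
               for row_p, row_t in zip(patch, template)
               for p, t in zip(row_p, row_t))

def best_match(template, candidates):
    """Return the index of the candidate patch most similar to the template."""
    scores = [ssd(c, template) for c in candidates]
    return scores.index(min(scores))

# Hypothetical 2x2 object images from two other machine positions.
template = [[1, 2], [3, 4]]
candidates = [[[9, 9], [9, 9]], [[1, 2], [3, 5]]]
match_index = best_match(template, candidates)
```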
Further, after binocular three-dimensional positioning and PTZ camera three-dimensional positioning have been established, video object detection/tracking and scene modeling can be performed. The purpose of video object detection/tracking is to construct and describe the video objects present in the scene, and to track and identify them. Video objects include participant objects as well as scene objects such as light tubes, windows and conference tables. The system cyclically processes the input image data of the binocular cameras, performing face detection and matching, human-shape detection and matching, moving-object detection and matching, scene-object detection and matching, and so on; it builds a model for each video object and updates the model parameters, so that the whole shooting scene is modeled from the detected object models. The resulting scene model can be used for subsequent object identification and director strategy processing. Face detection can be used for video objects at close range, for example nearby participants; for distant regions, where the face area is too small to be detected reliably, human-shape or moving-object detection can be used instead. Face detection yields various parameters of a face video object, including the two-dimensional coordinates of the face's circumscribed rectangle, the center point coordinates, the rectangle area, the rotation angles of the face around the coordinate axes (representing how far the face is turned left or right, pitched, and rotated), and the positions of organs such as the eyes, nose and mouth within the face.
Further, after the video objects are detected in each frame, they need to be tracked across the sequence of video frames, establishing the temporal correspondence of each video object. Widely used video object tracking algorithms include gray-scale template matching, MeanShift, CamShift, Kalman filtering, and the like. Video object matching can also be applied within a binocular camera: the object region detected in one camera image is used to find the corresponding region in the other camera image, so that feature matching and three-dimensional coordinate computation can be carried out within the matched region. The matching algorithms are similar to the tracking algorithms; gray-scale template matching, MeanShift and the like can be adopted.
In the embodiment of the present invention, a video object may be represented by its features; commonly used features include feature points, image texture, and histogram information. Feature detection and matching may be performed within the detected video object region, so that the three-dimensional position of the video object can be computed from the feature point information, and the object can be tracked using the texture and histogram information. Feature points are the main feature type; feature point detection algorithms include Harris corner detection, SIFT feature detection, and the like. Further, feature matching establishes the correspondence between features of the same video object in the two views of the binocular camera: feature points can be matched with algorithms such as FLANN or the KLT optical flow method, image texture with gray-scale template matching, and histograms with histogram matching. In summary, from the matched feature information and the binocular three-dimensional positioning algorithm, the three-dimensional coordinates of the video object features in the coordinate system of a single director camera can be calculated, so that a given video object can be located and tracked in three-dimensional space.
Furthermore, from the data produced by the video object detection/matching and feature detection/matching algorithms, together with the computed three-dimensional positions, multiple video object models can be established in the coordinate system of a single director camera, and the model data can be updated by the face, human-shape and motion detection and tracking algorithms. Specifically, each video object model may be assigned a unique ID number, and the data in the model represents the attributes of the video object. For example, for a moving object model, the data may include attributes such as the object ID, the two-dimensional coordinates of the circumscribed rectangle, the three-dimensional coordinates of the object's feature points, the texture data of the motion region, and histogram data. When the position of the moving object changes, its attributes are refreshed from the output of the detection and matching algorithms, but the object ID remains unchanged. Face and human-shape object models are established similarly to the moving object model and are not described here.
It should be understood that in a multi-position application scene, video object model data may be exchanged among the director cameras through network communication. After a single director camera obtains the video object model data of the other director cameras, it may establish the correspondence between the models using the multi-position three-dimensional positioning and video object matching algorithms described above, and thus derive a director strategy for the entire scene. The network communication may use a standard protocol such as HTTP or a custom protocol, with the video object model data formatted, packaged and transmitted in a format such as the eXtensible Markup Language (XML). By matching and merging the video object models of multiple director cameras, a single director camera can model the entire shooting scene. The scene model comprises the models of multiple video objects and reflects their characteristics and distribution in three-dimensional space. The director camera needs to maintain the scene model, including adding and deleting object models and object model attributes. For example, when a new participant appears in the scene and a binocular camera detects a new face or human-shape object, an object model is established and added to the object model set; when a participant leaves the scene, the corresponding model is deleted; and when a participant's position changes, the parameters of the corresponding object model are updated. A director strategy is then formulated from the latest video object models, and the best-positioned camera is selected for shooting.
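Packaging a video object model in XML for exchange between machine positions can be sketched with Python's standard library; the element names (VideoObject, Rect, Point3D) are hypothetical, not a format defined by the embodiment:

```python
import xml.etree.ElementTree as ET

def object_model_to_xml(obj_id, rect, point3d):
    """Serialize one video-object model entry as an XML string."""
    root = ET.Element("VideoObject", id=str(obj_id))
    ET.SubElement(root, "Rect").text = ",".join(str(v) for v in rect)
    ET.SubElement(root, "Point3D").text = ",".join(str(v) for v in point3d)
    return ET.tostring(root, encoding="unicode")

# Hypothetical model: object 7, circumscribed rectangle, one 3-D feature point.
xml_str = object_model_to_xml(7, (10, 20, 110, 220), (0.4, 0.1, 2.5))
```

The receiving director camera would parse the string back with `ET.fromstring` and merge the entry into its own scene model.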
104. And adjusting the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinates, and outputting a video image with the shooting parameters adjusted.
In a specific embodiment, after the video object model covering all video objects has been established (or updated) and the target video object determined, the one or more director camera positions with the best shooting effect can be selected according to a preset director strategy; for example, the better camera is determined from eye-to-eye effect parameters, occlusion relationship parameters, scene object parameters of the shooting area, and the like. Specifically, the eye-to-eye effect is determined by the angle of the face/human shape relative to the optical axis of the PTZ camera: the smaller the angle, the more frontally the face is presented and the better the eye-to-eye effect. The rotation angles of the face/human shape around its three-dimensional coordinate axes relative to the binocular camera coordinate system (yaw, pitch and roll) can be obtained by the face/human-shape detection and calculation methods, and converted into rotation angles relative to each PTZ camera using the inter-camera coordinate conversion formula; the conversion uses the calibrated extrinsic parameters between the binocular camera and the PTZ camera and between the binocular cameras of different machine positions, thereby determining the eye-to-eye effect parameters. A PTZ camera priority queue for eye-to-eye effect can further be established for each video object, where a camera with a better eye-to-eye effect has a higher priority.
Further, to obtain the occlusion relationships of the video objects, the region of a video object detected by a director camera (such as its circumscribed rectangle) can be re-projected, via the camera projection equation and the calibrated extrinsic parameters between the binocular camera of that director camera and the PTZ cameras and between the binocular cameras of different machine positions, onto the imaging plane of the PTZ camera of each machine position. If the regions of two video objects overlap, the occlusion relationship between them can be determined from depth information: the video object closer to the binocular camera occludes the one farther away. Thus, a PTZ camera priority queue for occlusion can be established for each video object, where a camera without occlusion has a higher priority.
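The depth-based occlusion rule and the resulting priority ordering can be sketched as follows; the dictionaries and field names are invented for illustration:

```python
def occludes(obj_a, obj_b, regions_overlap):
    """If the projected regions overlap, the object nearer the camera
    (smaller depth) occludes the other; otherwise there is no occlusion."""
    if not regions_overlap:
        return None
    return obj_a if obj_a["depth"] < obj_b["depth"] else obj_b

def camera_priority(cameras):
    """Order the PTZ camera queue: cameras that see the target without
    occlusion come first (False sorts before True)."""
    return sorted(cameras, key=lambda c: c["occluded"])

# Hypothetical scene: VO11 is nearer than VO12, and PTZ0 has a clear view.
queue = camera_priority([{"id": "PTZ1", "occluded": True},
                         {"id": "PTZ0", "occluded": False}])
```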
Further, in addition to person-based video object detection, the system also detects other video objects (scene objects) of interest in the scene, such as lamps, windows, conference tables, and the like. The detection of these objects may employ algorithms based on image color and edge characteristics. For example, for lamp tube detection, the Canny operator can be used to extract the edge of a lamp tube and obtain its long straight-line feature, and then whether an overexposed pixel region (light-emitting region) exists in the adjacent area is detected; the lamp tube object can be detected from these two features to obtain the coordinates of its circumscribed rectangle. Window detection is similar: quadrilateral features are obtained through edge detection, and whether the quadrilateral is a window is judged according to whether it contains an overexposed pixel region of a certain area. A conference table can likewise be detected using edge features in the image. When the parameters of a scene object are obtained, the region of the scene object detected by a certain director camera can be obtained according to the camera projection equation, and the region is re-projected onto the imaging plane of the PTZ camera at each position by using the calibrated external parameters between the binocular camera and the PTZ camera of a single director camera and the external parameters between the binocular cameras at different positions. Objects such as lamp tubes and windows usually produce large overexposed areas, which degrade the camera's automatic exposure and darken the scene, while scene objects such as conference tables often present large areas of red or yellow, which cause a color cast in the camera's automatic white balance; such scene objects should be kept out of the image as much as possible.
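Of the two cues used for lamp/window detection, the overexposure check is easy to illustrate. The sketch below is a hypothetical fragment (the real pipeline would first run Canny edge extraction to propose candidate rectangles; the saturation threshold and area fraction are assumed values):

```python
import numpy as np

def has_overexposed_region(gray, rect, sat_thresh=250, min_fraction=0.3):
    # Check whether the candidate region rect = (x0, y0, x1, y1) of a
    # grayscale image contains a large enough fraction of near-saturated
    # (overexposed) pixels -- the second cue for lamp/window detection.
    x0, y0, x1, y1 = rect
    patch = gray[y0:y1, x0:x1]
    if patch.size == 0:
        return False
    return bool((patch >= sat_thresh).mean() >= min_fraction)
```

A candidate rectangle that both exhibits the long straight-line (or quadrilateral) edge feature and passes this check would be accepted as a lamp tube (or window) object.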
Therefore, a PTZ camera priority queue can be established according to whether scene objects detrimental to the image effect are likely to be captured, where a camera with a lower probability of capturing such scene objects has a higher priority.
Further, a director camera, such as the director camera at the host position, establishes a priority queue for the PTZ camera at each position according to the acquired image effect parameters in combination with a preset director strategy, so as to determine the camera to be selected. Specifically, one or more video objects to be shot, i.e., target video objects, can be determined in advance, such as a talking video object determined according to the sound source localization result, in order to shoot a close-up of that video object; or an AutoFrame strategy is adopted, in which, when all video objects are to be taken as the target video object, Pan/Tilt is adjusted to bring all video objects in the scene into the shooting range and Zoom is adjusted so that the objects have a proper size, and the like. For a target video object, a comprehensive PTZ camera priority queue can be determined according to a certain director strategy by combining the PTZ camera priority queues for the eye-to-eye effect parameter, the occlusion relation parameter, and the scene object parameter. The director strategy may be automatically calculated by the system or preset by the user, which is not limited in the embodiment of the present invention.
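One simple way to merge the three per-criterion priority queues into a comprehensive queue is a weighted rank sum. The patent does not prescribe the combination rule, so the following is only a sketch of one plausible strategy (criterion names and weights are illustrative):

```python
def composite_priority(cameras, queues, weights):
    # queues: criterion name -> list of camera ids, best first, one queue
    # per effect parameter (e.g. eye-to-eye, occlusion, scene object).
    # weights: criterion name -> weight encoding the director strategy.
    # A camera's composite score is the weighted sum of its ranks in the
    # individual queues; a lower score means a higher priority.
    def score(cam):
        return sum(w * queues[name].index(cam) for name, w in weights.items())
    return sorted(cameras, key=score)
```

The first element of the returned list would be the candidate target camera; changing the weights changes the director strategy (e.g. favoring eye contact over scene-object avoidance).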
For example, if a plurality of cameras satisfy the condition, the PTZ camera with the best scene object parameter, that is, the best image effect, is selected as the target camera for shooting. After the PTZ camera is selected, the host position can adjust the PTZ parameters of the selected PTZ camera according to the three-dimensional coordinates of the target video object so as to obtain the best possible image effect. For example, during voice tracking, when a participant is shot in close-up, objects that affect the brightness of the image, such as lamp tubes and windows, are avoided; during AutoFrame, when the Zoom size is adjusted, shooting a large-area table object that would affect the white balance of the image is avoided, and the like.
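Aiming the selected PTZ camera at the target's three-dimensional coordinate reduces to computing pan and tilt angles from the point expressed in that camera's own frame. A minimal sketch, assuming the common convention of x right, y down, z along the optical axis (the patent does not specify the axis convention):

```python
import math

def pan_tilt_to_point(x, y, z):
    # Pan/tilt angles (degrees) that aim the PTZ optical axis (+z) at a
    # target point given in the camera's own frame (x right, y down, z forward).
    pan = math.degrees(math.atan2(x, z))
    tilt = math.degrees(math.atan2(-y, math.hypot(x, z)))
    return pan, tilt
```

Zoom would then be chosen from the target's distance and desired framing size, e.g. so that a close-up fills a fixed fraction of the image height.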
Further, the host position (the director camera at the host position) may output the video image or the ID of the selected PTZ camera. Optionally, for a multi-director-camera system supporting video cascade, the host position may directly output the image of the selected camera; for a multi-director-camera system whose output passes through a video matrix, the host position may output the ID of the selected PTZ camera to the video matrix through a communication interface (such as a serial port or a network port), and the video matrix completes the switching of the camera images.
In the embodiment of the invention, after the target video object to be shot is determined, the target camera with the best shooting effect for the target video object is screened out from the cameras of the director camera system according to the preset director strategy; the three-dimensional coordinate of the target video object in the coordinate system corresponding to the target camera is obtained; the target camera is controlled to adjust its camera parameters according to this three-dimensional coordinate; and the video image with the adjusted shooting parameters is output. The director camera system can thus perform video object detection and apply the preset director strategy on the basis of three-dimensional coordinates, which improves the precision of video object detection and tracking, improves the efficiency of camera parameter adjustment, and effectively improves the shooting effect of the camera.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a parameter adjusting apparatus according to an embodiment of the present invention. Specifically, the apparatus according to the embodiment of the present invention may be disposed in the director camera. As shown in fig. 5, the parameter adjusting apparatus according to the embodiment of the present invention may include an object determining unit 10, a selecting unit 20, an obtaining unit 30, and a parameter adjusting unit 40. Wherein,
the object determination unit 10 is configured to determine a target video object to be shot.
The selecting unit 20 is configured to screen out a target camera for shooting the target video object from each camera of the director camera system where the director camera is located according to a preset director policy.
Optionally, the shooting effect parameter may include any one or more of an eye-to-eye effect parameter, an occlusion relation parameter, and a scene object parameter of a shooting area of the target video object in a coordinate system corresponding to the current camera. The current camera is any one of the cameras in the director shooting system except the binocular camera.
The eye-to-eye effect parameter may include a rotation angle of the target video object relative to a coordinate system corresponding to the current camera, and the rotation angle may be determined according to the rotation angle of the target video object in the second coordinate system and a pre-calibrated position relationship between the binocular camera and the current camera.
The occlusion relation parameter and the scene object parameter may be determined by re-projecting the region of the scene object detected by the binocular camera onto the imaging plane of the current camera according to a pre-calibrated position relationship between the binocular camera and the current camera.
The obtaining unit 30 is configured to obtain a first three-dimensional coordinate of the target video object.
The first three-dimensional coordinate may be a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera. The target camera may be the above-mentioned director camera or a general PTZ camera, and the first coordinate system corresponding to the target camera may refer to a three-dimensional coordinate system established with the optical center of the target camera as an origin, or a three-dimensional coordinate system established with another arbitrary reference object as an origin, which is not limited in the embodiment of the present invention.
The parameter adjusting unit 40 is configured to adjust the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinates, and output a video image with the shooting parameters adjusted.
Optionally, the obtaining unit 30 may be specifically configured to:
acquiring a second three-dimensional coordinate transmitted by a binocular camera connected with the broadcast guiding camera, wherein the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera;
and converting the second three-dimensional coordinate into a first three-dimensional coordinate according to the pre-calibrated position relationship between the binocular camera and the target camera.
Further optionally, the second three-dimensional coordinate may be calculated by the binocular camera from the two-dimensional coordinates of the target video object, respectively acquired in the left view and the right view of the binocular camera, together with the acquired internal and external parameters of the binocular camera. The second coordinate system corresponding to the binocular camera may be a three-dimensional coordinate system established with the optical center of the binocular camera as the origin, or a three-dimensional coordinate system established with any other reference object as the origin. The two-dimensional coordinates may specifically be the pixel coordinates of the target video object in the left view and the right view of the binocular camera.
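For a calibrated and rectified binocular pair, the calculation from left/right pixel coordinates to the second three-dimensional coordinate reduces to depth-from-disparity followed by back-projection. A minimal sketch under the assumptions of a rectified pair and square pixels (the patent itself only states that intrinsic and extrinsic parameters are used):

```python
def triangulate(u_left, u_right, v, fx, cx, cy, baseline):
    # Depth from disparity for a rectified binocular pair, then back-projection
    # to a 3-D point in the left (binocular) camera coordinate system.
    disparity = u_left - u_right          # pixels; positive for a rectified pair
    z = fx * baseline / disparity         # depth along the optical axis
    x = (u_left - cx) * z / fx
    y = (v - cy) * z / fx                 # assumes square pixels (fy == fx)
    return x, y, z
```

The resulting point in the binocular frame is then mapped into the target camera's first coordinate system with the calibrated extrinsics, i.e. p1 = R @ p2 + t.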
Optionally, the object determination unit 10 may be specifically configured to:
acquiring a shot image transmitted by a binocular camera, wherein the shot image comprises at least one video object;
establishing a video object model comprising the at least one video object, and determining a target video object from the at least one video object;
the selection unit 20 may be specifically configured to:
determining the target video object from the shot images respectively obtained by the cameras in the director camera system, and obtaining shooting effect parameters of the target video object in each camera;
and determining the camera with the shooting effect parameter meeting the preset director strategy as a target camera for shooting the target video object.
Further optionally, a specific way for the selection unit 20 to determine the target video object from the captured image acquired by the camera may be:
converting the second three-dimensional coordinate into a third three-dimensional coordinate according to a pre-calibrated position relationship between the binocular camera and the current camera, wherein the current camera is any one of the cameras in the director shooting system except the binocular camera, and the third three-dimensional coordinate is a three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera;
judging whether the overlap between the region of the target video object at the third three-dimensional coordinate and the region of a video object at the three-dimensional coordinate detected by the current camera exceeds a preset area threshold;
and if so, determining the video object as the target video object.
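The cross-camera matching step above (determining whether a locally detected object is the target by region overlap in the third coordinate system) can be sketched as follows; representing each region as an axis-aligned 3-D box is an illustrative simplification:

```python
def box_overlap_volume(a, b):
    # Intersection volume of two axis-aligned 3-D boxes (x0, y0, z0, x1, y1, z1).
    dims = [max(min(a[i + 3], b[i + 3]) - max(a[i], b[i]), 0.0) for i in range(3)]
    return dims[0] * dims[1] * dims[2]

def is_same_object(converted_box, detected_box, threshold):
    # The detected video object is taken to be the target video object when
    # the overlap between its region and the target's region converted into
    # the current camera's coordinate system exceeds the preset threshold.
    return box_overlap_volume(converted_box, detected_box) > threshold
```

In practice the threshold could also be expressed as a fraction of the smaller region, making the test scale-invariant.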
In the embodiment of the invention, after the target video object to be shot is determined, the target camera with the best shooting effect for the target video object is screened out from the cameras of the director camera system according to the preset director strategy; the three-dimensional coordinate of the target video object in the coordinate system corresponding to the target camera is obtained; the target camera is controlled to adjust its camera parameters according to this three-dimensional coordinate; and the video image with the adjusted shooting parameters is output. The director camera system can thus perform video object detection and apply the preset director strategy on the basis of three-dimensional coordinates, which improves the precision of video object detection and tracking, improves the efficiency of camera parameter adjustment, and effectively improves the shooting effect of the camera.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a director camera system according to an embodiment of the present invention. Specifically, the director camera system according to the embodiment of the present invention may include a first camera 1 and at least one second camera 2, where the first camera 1 includes a director camera 11 and a binocular camera 12; the director camera 11 and the binocular camera 12, as well as the first camera 1 and the second camera 2, may be connected through a wired interface or a wireless interface; wherein,
the director camera 11 is configured to determine a target video object to be shot, and screen out a target camera for shooting the target video object from cameras of the director camera system according to a preset director strategy;
the binocular camera 12 is configured to acquire a second three-dimensional coordinate of the target video object, and transmit the second three-dimensional coordinate to the director camera 11; the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera 12;
the director camera 11 is configured to receive the second three-dimensional coordinates transmitted by the binocular camera 12; converting the second three-dimensional coordinate into a first three-dimensional coordinate according to a pre-calibrated position relationship between the binocular camera 12 and the target camera; adjusting the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinates, and outputting a video image with the shooting parameters adjusted; and the first three-dimensional coordinate is a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera.
Optionally, the second camera 2 may also include a director camera and a binocular camera, and the target camera may be any director camera in the director camera system; alternatively, the second camera 2 is a normal PTZ camera, and the target camera may be the director camera or a normal PTZ camera. Further optionally, the binocular camera 12 may be disposed on a preset director bracket and connected to the director camera 11 through the director bracket.
Specifically, as shown in fig. 7, which is a schematic structural diagram of a first camera provided in the embodiment of the present invention, the first camera includes a binocular camera and one or more director cameras. It is assumed that the first camera in the embodiment of the present invention is equipped with 2 director cameras for director shooting and tracking, which can be connected with the binocular camera through a director support (simply referred to as "bracket") in a wired or wireless manner. The binocular camera is installed on the bracket; in addition, a microphone can be installed on the bracket. The installed microphone may be in array form, and a microphone array can be used to realize functions such as sound source localization and sound source identification; specifically, the array may comprise a horizontal array microphone and a vertical array microphone. Further, the director camera and the bracket may be separate or integrated, and may communicate using a control interface, such as a serial interface. In some embodiments, the director camera and the director support (including the binocular camera, the microphone, etc.) may also be integrated into one director device; the connection form of each device in the director camera system is not limited in the embodiments of the present invention.
Further, please refer to fig. 8, which is a schematic networking diagram of a director camera system according to an embodiment of the present invention. As shown in fig. 8, multiple positions may be networked. The multi-position networking modes include: multiple positions each installing a director camera; a position installing a director camera and a director support plus multiple common PTZ cameras; a position installing a director camera and a director support plus a position without a PTZ camera (i.e., with only a director support); and a position without a PTZ camera plus multiple common PTZ cameras (i.e., without a director support). The cameras at each position can be interconnected through a LAN or Wi-Fi to transmit control messages (including camera switching messages) and audio and video data such as video object model data. Further optionally, the control messages may be transmitted via the Internet Protocol (IP), for example, using an IP Camera protocol stack. The binocular cameras at two positions are required to have an overlapping shooting area. When a certain director camera needs to output multiple video channels, it can be connected to the video matrix of the networking system where it is located, and the video matrix performs the switching and output of the videos. Optionally, the switching policy of the video matrix may be controlled by any designated director camera in the scene, such as the director camera serving as the host, or by a third-party device, which is not limited in the embodiment of the present invention. The video image output by the video matrix can be encoded by a codec device and transmitted to a far end so as to realize a video conference.
Specifically, if the number of cameras in the networking is small, video data can be processed in a cascade mode (the director support supports video cascade); if the number of video sources is large, the videos of the multiple cameras are all output to a video matrix for processing, and the video matrix switches or synthesizes one or more camera video sources. Further, the bracket may provide external video input/output interfaces, LAN/Wi-Fi network ports, serial interfaces, and the like. The video input interface is used for externally connecting the input videos of other cameras; the video output interface is used for connecting a terminal, a video matrix, or other equipment to output a video image; the serial interface provides a control and debugging interface for the bracket; the LAN/Wi-Fi network port is used for cascading multiple camera positions and can transmit audio and video data, control data, and the like.
Furthermore, in a networking scene of multiple director cameras, each director camera has video object detection capability and PTZ camera functions; one of them can serve as the master position, responsible for output position selection and PTZ camera control, while the other cameras serve as slave positions. In the scene of a director camera and director support position plus multiple common PTZ cameras, only one director camera has video object detection capability and is responsible for output position selection and PTZ camera control, and the common cameras serve only as PTZ cameras; since only the director camera has video object detection capability, in this scene there is no need to obtain slave-position video object model data over the network or to perform multi-position video object model matching.
Specifically, the director camera and the binocular camera in the embodiment of the present invention may refer to the related descriptions of the corresponding embodiments in fig. 1 to 6, and are not described herein again.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a director camera according to an embodiment of the present invention, configured to implement the above-mentioned camera parameter adjustment method. Specifically, as shown in fig. 9, the director camera according to the embodiment of the present invention includes: a communication interface 300, a memory 200, and a processor 100, wherein the processor 100 is connected to the communication interface 300 and the memory 200, respectively. The memory 200 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. The communication interface 300, the memory 200, and the processor 100 may be connected by a bus or in other ways; in this embodiment, a bus connection is described. The configuration shown in fig. 9 does not constitute a limitation on the embodiments of the present invention; it may include more or fewer components than those shown, combine some components, or arrange the components differently. Wherein:
the processor 100 is a control center of the device, connects various parts of the entire device using various interfaces and lines, and performs various functions of the device and processes data by operating or executing programs and/or units stored in the memory 200 and calling up driver software stored in the memory 200. The processor 100 may be composed of an Integrated Circuit (IC), for example, a single packaged IC, or a plurality of packaged ICs connected with the same or different functions. For example, the processor 100 may include only a Central Processing Unit (CPU), or may be a combination of a CPU, a Digital Signal Processor (DSP), a Graphics Processing Unit (GPU), and various control chips. In the embodiment of the present invention, the CPU may be a single operation core, or may include multiple operation cores.
The communication interface 300 may include a wired interface, a wireless interface, and the like.
The memory 200 may be used to store driver software (or software program) and units, and the processor 100 and the communication interface 300 execute various functional applications of the device and implement data processing by calling the driver software and the units stored in the memory 200. The memory 200 mainly includes a program storage area and a data storage area, wherein the program storage area can store driver software and the like required for at least one function; the data storage area may store data in accordance with the parameter adjustment process, such as the three-dimensional coordinate information described above.
Specifically, the processor 100 reads the driver software from the memory 200 and executes under the action of the driver software:
determining a target video object to be shot;
screening out a target camera for shooting the target video object from the cameras of the director camera system where the director camera is located according to a preset director strategy;
acquiring a first three-dimensional coordinate of the target video object, wherein the first three-dimensional coordinate is a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera;
and adjusting the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinates, and outputting a video image with the shooting parameters adjusted.
Optionally, the processor 100 reads the driver software from the memory 200 and, under the action of the driver software, executes the acquiring of the first three-dimensional coordinate of the target video object by specifically executing the following steps:
acquiring a second three-dimensional coordinate transmitted by a binocular camera connected with the director camera through the communication interface 300, wherein the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera;
and converting the second three-dimensional coordinate into a first three-dimensional coordinate according to the pre-calibrated position relationship between the binocular camera and the target camera.
Optionally, the processor 100 reads the driver software from the memory 200 and, under the action of the driver software, executes the determining of the target video object to be shot by specifically executing the following steps:
acquiring a shot image transmitted by the binocular camera, wherein the shot image comprises at least one video object;
establishing a video object model comprising the at least one video object, and determining a target video object from the at least one video object;
the processor 100 reads the driver software from the memory 200 and executes the following steps to screen out a target camera for shooting the target video object from the cameras of the director camera system where the director camera is located according to a preset director strategy under the action of the driver software:
determining the target video object from the shot images respectively obtained by the cameras in the director camera system, and obtaining shooting effect parameters of the target video object in each camera;
and determining the camera with the shooting effect parameter meeting the preset director strategy as a target camera for shooting the target video object.
Optionally, the processor 100 reads the driver software from the memory 200 and, under the action of the driver software, determines the target video object from a shot image acquired by a camera by specifically executing the following steps:
converting the second three-dimensional coordinate into a third three-dimensional coordinate according to a pre-calibrated position relationship between the binocular camera and the current camera, wherein the current camera is any one of the cameras in the director shooting system except the binocular camera, and the third three-dimensional coordinate is a three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera;
judging whether the overlap between the region of the target video object at the third three-dimensional coordinate and the region of a video object at the three-dimensional coordinate detected by the current camera exceeds a preset area threshold;
and if so, determining the video object as the target video object.
Optionally, the shooting effect parameters may include any one or more of an eye-to-eye effect parameter, an occlusion relation parameter, and a scene object parameter of a shooting area of the target video object in a coordinate system corresponding to the current camera, where the current camera is any one of cameras in the director shooting system except for the binocular camera.
The eye-to-eye effect parameter may include a rotation angle of the target video object relative to a coordinate system corresponding to the current camera, where the rotation angle is determined according to the rotation angle of the target video object in the second coordinate system and a pre-calibrated position relationship between the binocular camera and the current camera.
The occlusion relation parameter and the scene object parameter may be determined by re-projecting the region of the scene object detected by the binocular camera onto the imaging plane of the current camera according to a pre-calibrated position relationship between the binocular camera and the current camera.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional units is merely used as an example, and in practical applications, the above function distribution may be performed by different functional units according to needs, that is, the internal structure of the device is divided into different functional units to perform all or part of the above described functions. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (13)
1. A camera parameter adjusting method is applied to a director camera, and is characterized by comprising the following steps:
determining a target video object to be shot;
screening out a target camera for shooting the target video object from the cameras of the director camera system where the director camera is located according to a preset director strategy;
acquiring a second three-dimensional coordinate transmitted by a binocular camera connected with the director camera, wherein the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera, and the second three-dimensional coordinate is calculated by the binocular camera from the two-dimensional coordinates of the target video object, respectively acquired in the left view and the right view of the binocular camera, together with the acquired internal and external parameter data of the binocular camera;
converting the second three-dimensional coordinate into a first three-dimensional coordinate according to a pre-calibrated position relationship between the binocular camera and the target camera, wherein the first three-dimensional coordinate is a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera;
and adjusting the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinates, and outputting a video image with the shooting parameters adjusted.
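Outside the claim language, the coordinate pipeline of claim 1 can be sketched in two steps: the binocular camera triangulates the second three-dimensional coordinate from the left/right views, and the director camera maps it into the target camera's coordinate system with a rigid transform. This is a minimal illustration only; the focal length, principal point, baseline, rotation, and translation values below are assumed for the example and are not parameters from the patent.

```python
def triangulate(u_left, u_right, v, fx, cx, cy, baseline):
    """Depth from disparity for a rectified stereo pair (pinhole model)."""
    disparity = u_left - u_right            # horizontal shift between the views
    z = fx * baseline / disparity           # depth along the optical axis
    x = (u_left - cx) * z / fx              # lateral offset in the camera frame
    y = (v - cy) * z / fx                   # vertical offset (square pixels assumed)
    return (x, y, z)

def transform(point, R, t):
    """Map a point between camera coordinate systems: p' = R * p + t."""
    return tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i]
                 for i in range(3))

# Second three-dimensional coordinate, from the binocular camera's two views.
p2 = triangulate(u_left=400, u_right=300, v=300,
                 fx=1000.0, cx=320.0, cy=240.0, baseline=0.5)

# First three-dimensional coordinate, in the target camera's frame; an identity
# rotation and a 1 m lateral offset stand in for the pre-calibrated
# positional relationship between the binocular camera and the target camera.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]
p1 = transform(p2, R, t)
print(p2)  # (0.4, 0.3, 5.0)
print(p1)  # (1.4, 0.3, 5.0)
```

The first coordinate is what the director camera then converts into shooting parameters (pan, tilt, zoom, focus) for the target camera.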
2. The method according to claim 1, wherein the determining a target video object to be shot comprises:
acquiring a shot image transmitted by the binocular camera, wherein the shot image comprises at least one video object;
establishing a video object model comprising the at least one video object, and determining a target video object from the at least one video object;
wherein the screening out, according to the preset director strategy, a target camera for shooting the target video object from the cameras of the director camera system in which the director camera is located comprises:
determining the target video object from the shot images respectively obtained by the cameras in the director camera system, and obtaining shooting effect parameters of the target video object for each camera; and
determining a camera whose shooting effect parameters satisfy the preset director strategy as the target camera for shooting the target video object.
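The selection step of claim 2 can be sketched as follows: each candidate camera reports shooting effect parameters for the target, and the camera satisfying the director strategy is chosen. The concrete strategy here (prefer an unoccluded camera with the smallest rotation angle to the target) is an illustrative stand-in for the patent's preset strategy, and the parameter names are assumptions.

```python
def pick_target_camera(effect_params):
    """effect_params: {camera_id: {'rotation_deg': float, 'occluded': bool}}.
    Return the unoccluded camera that faces the target most directly."""
    # Discard cameras whose view of the target is blocked.
    candidates = {cam: p for cam, p in effect_params.items()
                  if not p['occluded']}
    # Among the rest, prefer the smallest rotation angle (closest to eye-to-eye).
    return min(candidates, key=lambda cam: abs(candidates[cam]['rotation_deg']))

params = {
    'cam1': {'rotation_deg': 40.0, 'occluded': False},
    'cam2': {'rotation_deg': 5.0,  'occluded': False},
    'cam3': {'rotation_deg': 2.0,  'occluded': True},   # blocked, so excluded
}
print(pick_target_camera(params))  # cam2
```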
3. The method according to claim 2, wherein the determining the target video object from a shot image obtained by a camera comprises:
converting the second three-dimensional coordinate into a third three-dimensional coordinate according to a pre-calibrated positional relationship between the binocular camera and a current camera, wherein the current camera is any camera in the director camera system except the binocular camera, and the third three-dimensional coordinate is a three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera;
judging whether an overlap area between the region of the target video object under the third three-dimensional coordinate and the region of a video object under a three-dimensional coordinate detected by the current camera exceeds a preset area threshold; and
if so, determining the detected video object as the target video object.
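The overlap test of claim 3 amounts to comparing the target's region, projected into the current camera, against each detected video object's region and keeping detections whose overlap area exceeds the preset threshold. A minimal sketch with axis-aligned rectangles; the box coordinates and threshold value are illustrative assumptions.

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned rectangles (x0, y0, x1, y1);
    returns 0 if the rectangles are disjoint."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def match_target(projected_box, detections, area_threshold):
    """Indices of detected regions whose overlap with the projected target
    region exceeds the preset area threshold."""
    return [i for i, det in enumerate(detections)
            if overlap_area(projected_box, det) > area_threshold]

projected = (100, 100, 200, 200)   # target region after projection (pixels)
detections = [(90, 110, 190, 210),   # overlaps heavily -> the target
              (400, 400, 450, 450)]  # elsewhere in the frame
print(match_target(projected, detections, area_threshold=5000))  # [0]
```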
4. The method according to claim 2, wherein the shooting effect parameters comprise any one or more of an eye-to-eye effect parameter, an occlusion relationship parameter, and a scene object parameter of the shooting region of the target video object in the coordinate system corresponding to a current camera, the current camera being any camera of the director camera system other than the binocular camera.
5. The method according to claim 4, wherein the eye-to-eye effect parameter comprises a rotation angle of the target video object relative to the coordinate system corresponding to the current camera, and the rotation angle is determined according to the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera.
6. The method according to claim 4, wherein the occlusion relationship parameter and the scene object parameter are determined by re-projecting the region of a scene object detected by the current camera onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera.
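The re-projection in claim 6 can be sketched with a pinhole projection: a scene object's 3D position in the current camera's frame maps to a pixel, and an object that projects to (roughly) the same pixel as the target while lying closer to the camera occludes it. The intrinsics, pixel tolerance, and test points below are illustrative assumptions, not values from the patent.

```python
def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (camera frame) to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

def occludes(front, back, fx=1000.0, fy=1000.0, cx=320.0, cy=240.0,
             pixel_tol=5.0):
    """True if `front` projects onto roughly the same pixel as `back`
    while lying closer to the camera, i.e. it hides `back` in the image."""
    u1, v1 = project(front, fx, fy, cx, cy)
    u2, v2 = project(back, fx, fy, cx, cy)
    return (front[2] < back[2]
            and abs(u1 - u2) <= pixel_tol
            and abs(v1 - v2) <= pixel_tol)

# A scene object 2 m away on the optical axis, in front of a target 4 m away:
print(project((0.0, 0.0, 2.0), 1000.0, 1000.0, 320.0, 240.0))  # (320.0, 240.0)
print(occludes((0.0, 0.0, 2.0), (0.0, 0.0, 4.0)))              # True
```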
7. A director camera, comprising a memory, a processor, and a communication interface, wherein the processor is connected with the memory and the communication interface; wherein,
the memory is used for storing driving software;
the processor reads the driving software from the memory and, under the action of the driving software, executes:
determining a target video object to be shot;
screening out, according to a preset director strategy, a target camera for shooting the target video object from the cameras of a director camera system in which the director camera is located;
acquiring, through the communication interface, a second three-dimensional coordinate transmitted by a binocular camera connected with the director camera, wherein the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera, and is calculated by the binocular camera from the two-dimensional coordinates of the target video object in the left view and the right view of the binocular camera and from the acquired intrinsic and extrinsic parameter data of the binocular camera;
converting the second three-dimensional coordinate into a first three-dimensional coordinate according to a pre-calibrated positional relationship between the binocular camera and the target camera, wherein the first three-dimensional coordinate is a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera; and
adjusting the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinate, and outputting a video image with the adjusted shooting parameters.
8. The director camera according to claim 7, wherein the processor reads the driving software from the memory and, under the action of the driving software, performs the determining of the target video object to be shot by specifically executing the following steps:
acquiring a shot image transmitted by the binocular camera, wherein the shot image comprises at least one video object;
establishing a video object model comprising the at least one video object, and determining a target video object from the at least one video object;
the processor reads the driving software from the memory and, under the action of the driving software, performs the screening out, according to the preset director strategy, of a target camera for shooting the target video object from the cameras of the director camera system in which the director camera is located, by specifically executing the following steps:
determining the target video object from the shot images respectively obtained by the cameras in the director camera system, and obtaining shooting effect parameters of the target video object for each camera; and
determining a camera whose shooting effect parameters satisfy the preset director strategy as the target camera for shooting the target video object.
9. The director camera according to claim 8, wherein the processor reads the driving software from the memory and, under the action of the driving software, performs the determining of the target video object from a shot image obtained by a camera by specifically executing the following steps:
converting the second three-dimensional coordinate into a third three-dimensional coordinate according to a pre-calibrated positional relationship between the binocular camera and a current camera, wherein the current camera is any camera in the director camera system except the binocular camera, and the third three-dimensional coordinate is a three-dimensional coordinate of the target video object in a third coordinate system corresponding to the current camera;
judging whether an overlap area between the region of the target video object under the third three-dimensional coordinate and the region of a video object under a three-dimensional coordinate detected by the current camera exceeds a preset area threshold; and
if so, determining the detected video object as the target video object.
10. The director camera according to claim 8, wherein the shooting effect parameters comprise any one or more of an eye-to-eye effect parameter, an occlusion relationship parameter, and a scene object parameter of the shooting region of the target video object in the coordinate system corresponding to a current camera, the current camera being any camera of the director camera system other than the binocular camera.
11. The director camera according to claim 10, wherein the eye-to-eye effect parameter comprises a rotation angle of the target video object relative to the coordinate system corresponding to the current camera, and the rotation angle is determined according to the rotation angle of the target video object in the second coordinate system and the pre-calibrated positional relationship between the binocular camera and the current camera.
12. The director camera according to claim 10, wherein the occlusion relationship parameter and the scene object parameter are determined by re-projecting the region of a scene object detected by the current camera onto the imaging plane of the current camera according to the pre-calibrated positional relationship between the binocular camera and the current camera.
13. A director camera system, comprising a first camera and at least one second camera, wherein the first camera comprises a director camera and a binocular camera, the director camera is connected with the binocular camera, and the first camera is connected with the second camera through a wired interface or a wireless interface; wherein,
the director camera is configured to determine a target video object to be shot, and to screen out, according to a preset director strategy, a target camera for shooting the target video object from the cameras of the director camera system;
the binocular camera is configured to acquire a second three-dimensional coordinate of the target video object and transmit the second three-dimensional coordinate to the director camera, wherein the second three-dimensional coordinate is a three-dimensional coordinate of the target video object in a second coordinate system corresponding to the binocular camera, and is calculated by the binocular camera from the two-dimensional coordinates of the target video object in the left view and the right view of the binocular camera and from the acquired intrinsic and extrinsic parameter data of the binocular camera; and
the director camera is further configured to receive the second three-dimensional coordinate transmitted by the binocular camera; convert the second three-dimensional coordinate into a first three-dimensional coordinate according to a pre-calibrated positional relationship between the binocular camera and the target camera; adjust the shooting parameters of the target camera to shooting parameters corresponding to the first three-dimensional coordinate; and output a video image with the adjusted shooting parameters; wherein the first three-dimensional coordinate is a three-dimensional coordinate of the target video object in a first coordinate system corresponding to the target camera.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610562671.6A CN106251334B (en) | 2016-07-18 | 2016-07-18 | A kind of camera parameters method of adjustment, instructor in broadcasting's video camera and system |
PCT/CN2017/091863 WO2018014730A1 (en) | 2016-07-18 | 2017-07-05 | Method for adjusting parameters of camera, broadcast-directing camera, and broadcast-directing filming system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610562671.6A CN106251334B (en) | 2016-07-18 | 2016-07-18 | A kind of camera parameters method of adjustment, instructor in broadcasting's video camera and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106251334A CN106251334A (en) | 2016-12-21 |
CN106251334B true CN106251334B (en) | 2019-03-01 |
Family
ID=57613157
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610562671.6A Active CN106251334B (en) | 2016-07-18 | 2016-07-18 | A kind of camera parameters method of adjustment, instructor in broadcasting's video camera and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106251334B (en) |
WO (1) | WO2018014730A1 (en) |
Families Citing this family (40)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106251334B (en) * | 2016-07-18 | 2019-03-01 | 华为技术有限公司 | A kind of camera parameters method of adjustment, instructor in broadcasting's video camera and system |
US10091412B1 (en) * | 2017-06-30 | 2018-10-02 | Polycom, Inc. | Optimal view selection method in a video conference |
CN109413359B (en) | 2017-08-16 | 2020-07-28 | 华为技术有限公司 | Camera tracking method, device and equipment |
CN109922251B (en) * | 2017-12-12 | 2021-10-22 | 华为技术有限公司 | Method, device and system for quick snapshot |
WO2019206247A1 (en) * | 2018-04-27 | 2019-10-31 | Shanghai Truthvision Information Technology Co., Ltd | System and method for camera calibration |
CN109031201A (en) * | 2018-06-01 | 2018-12-18 | 深圳市鹰硕技术有限公司 | The voice localization method and device of Behavior-based control identification |
CN108900860A (en) * | 2018-08-23 | 2018-11-27 | 佛山龙眼传媒科技有限公司 | A kind of instructor in broadcasting's control method and device |
CN110969662B (en) * | 2018-09-28 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | Method and device for calibrating internal parameters of fish-eye camera, calibration device controller and system |
CN111243029B (en) * | 2018-11-28 | 2023-06-23 | 驭势(上海)汽车科技有限公司 | Calibration method and device of vision sensor |
CN109360250A (en) * | 2018-12-27 | 2019-02-19 | 爱笔(北京)智能科技有限公司 | Scaling method, equipment and the system of a kind of pair of photographic device |
CN109712188A (en) * | 2018-12-28 | 2019-05-03 | 科大讯飞股份有限公司 | A kind of method for tracking target and device |
CN111325790B (en) * | 2019-07-09 | 2024-02-20 | 杭州海康威视系统技术有限公司 | Target tracking method, device and system |
CN111787243B (en) * | 2019-07-31 | 2021-09-03 | 北京沃东天骏信息技术有限公司 | Broadcasting guide method, device and computer readable storage medium |
CN110456829B (en) * | 2019-08-07 | 2022-12-13 | 深圳市维海德技术股份有限公司 | Positioning tracking method, device and computer readable storage medium |
CN111353368A (en) * | 2019-08-19 | 2020-06-30 | 深圳市鸿合创新信息技术有限责任公司 | Pan-tilt camera, face feature processing method and device and electronic equipment |
CN112468680A (en) * | 2019-09-09 | 2021-03-09 | 上海御正文化传播有限公司 | Processing method of advertisement shooting site synthesis processing system |
CN110737798B (en) * | 2019-09-26 | 2022-10-14 | 万翼科技有限公司 | Indoor inspection method and related product |
CN111080698B (en) * | 2019-11-27 | 2023-06-06 | 上海新时达机器人有限公司 | Method, system and storage device for calibrating position of long plate |
CN111131697B (en) * | 2019-12-23 | 2022-01-04 | 北京中广上洋科技股份有限公司 | Multi-camera intelligent tracking shooting method, system, equipment and storage medium |
CN111080679B (en) * | 2020-01-02 | 2023-04-18 | 东南大学 | Method for dynamically tracking and positioning indoor personnel in large-scale place |
CN113516717A (en) * | 2020-04-10 | 2021-10-19 | 富华科精密工业(深圳)有限公司 | Camera device external parameter calibration method, electronic equipment and storage medium |
CN111698467B (en) * | 2020-05-08 | 2022-05-06 | 北京中广上洋科技股份有限公司 | Intelligent tracking method and system based on multiple cameras |
CN113808199B (en) * | 2020-06-17 | 2023-09-08 | 华为云计算技术有限公司 | Positioning method, electronic equipment and positioning system |
CN111800590B (en) * | 2020-07-06 | 2022-11-25 | 深圳博为教育科技有限公司 | Broadcasting-directing control method, device and system and control host |
CN114549650A (en) * | 2020-11-26 | 2022-05-27 | 阿里巴巴集团控股有限公司 | Camera calibration method and device, electronic equipment and readable storage medium |
CN112802058A (en) * | 2021-01-21 | 2021-05-14 | 北京首都机场航空安保有限公司 | Method and device for tracking illegal moving target |
CN112887653B (en) * | 2021-01-25 | 2022-10-21 | 联想(北京)有限公司 | Information processing method and information processing device |
CN112819770B (en) * | 2021-01-26 | 2022-11-22 | 中国人民解放军陆军军医大学第一附属医院 | Iodine contrast agent allergy monitoring method and system |
CN113453021B (en) * | 2021-03-24 | 2022-04-29 | 北京国际云转播科技有限公司 | Artificial intelligence broadcasting guide method, system, server and computer readable storage medium |
CN113129376A (en) * | 2021-04-22 | 2021-07-16 | 青岛联合创智科技有限公司 | Checkerboard-based camera real-time positioning method |
CN113271482A (en) * | 2021-05-17 | 2021-08-17 | 广东彼雍德云教育科技有限公司 | Portable full-width image scratching blackboard |
CN113587895B (en) * | 2021-07-30 | 2023-06-30 | 杭州三坛医疗科技有限公司 | Binocular distance measuring method and device |
CN113610932B (en) * | 2021-08-20 | 2024-06-04 | 苏州智加科技有限公司 | Binocular camera external parameter calibration method and device |
CN113838146B (en) * | 2021-09-26 | 2024-10-18 | 昆山丘钛光电科技有限公司 | Method and device for verifying calibration precision of camera module and testing camera module |
CN114025107B (en) * | 2021-12-01 | 2023-12-01 | 北京七维视觉科技有限公司 | Image ghost shooting method, device, storage medium and fusion processor |
CN116389660B (en) * | 2021-12-22 | 2024-04-12 | 广州开得联智能科技有限公司 | Recorded broadcast guiding method, recorded broadcast guiding device, recorded broadcast guiding equipment and storage medium |
CN114666457A (en) * | 2022-03-23 | 2022-06-24 | 华创高科(北京)技术有限公司 | Video and audio program broadcasting guide method, device, equipment, system and medium |
CN117523431B (en) * | 2023-11-17 | 2024-09-27 | 中国科学技术大学 | Firework detection method and device, electronic equipment and storage medium |
CN118368524B (en) * | 2024-06-17 | 2024-08-16 | 深圳市联合光学技术有限公司 | Multi-camera view field switching system and method thereof |
CN118967810A (en) * | 2024-07-30 | 2024-11-15 | 江苏濠汉信息技术有限公司 | A binocular vision hazard source ranging method and system in extreme environments |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101630406A (en) * | 2008-07-14 | 2010-01-20 | 深圳华为通信技术有限公司 | Camera calibration method and camera calibration device |
CN102843540B (en) * | 2011-06-20 | 2015-07-29 | 宝利通公司 | Automatic camera for video conference is selected |
CN104869365A (en) * | 2015-06-02 | 2015-08-26 | 阔地教育科技有限公司 | Direct recording and broadcasting system based mouse tracking method and device |
CN105049764A (en) * | 2015-06-17 | 2015-11-11 | 武汉智亿方科技有限公司 | Image tracking method and system for teaching based on multiple positioning cameras |
CN105718862A (en) * | 2016-01-15 | 2016-06-29 | 北京市博汇科技股份有限公司 | Method, device and recording-broadcasting system for automatically tracking teacher via single camera |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5733775B2 (en) * | 2008-06-06 | 2015-06-10 | 日本電気株式会社 | Object image display system |
US8537195B2 (en) * | 2011-02-09 | 2013-09-17 | Polycom, Inc. | Automatic video layouts for multi-stream multi-site telepresence conferencing system |
CN106251334B (en) * | 2016-07-18 | 2019-03-01 | 华为技术有限公司 | A kind of camera parameters method of adjustment, instructor in broadcasting's video camera and system |
- 2016-07-18: CN CN201610562671.6A patent/CN106251334B/en (active, Active)
- 2017-07-05: WO PCT/CN2017/091863 patent/WO2018014730A1/en (active, Application Filing)
Also Published As
Publication number | Publication date |
---|---|
WO2018014730A1 (en) | 2018-01-25 |
CN106251334A (en) | 2016-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106251334B (en) | A kind of camera parameters method of adjustment, instructor in broadcasting's video camera and system | |
WO2017215295A1 (en) | Camera parameter adjusting method, robotic camera, and system | |
US12033355B2 (en) | Client/server distributed camera calibration | |
KR20150050172A (en) | Apparatus and Method for Selecting Multi-Camera Dynamically to Track Interested Object | |
CN107111864A (en) | The computational polyphaser adjustment for switching for smooth view and scaling | |
CN112311965A (en) | Virtual shooting method, device, system and storage medium | |
CN106357991A (en) | Image processing method, image processing apparatus, and display system | |
JP2020532914A (en) | Virtual audio sweet spot adaptation method | |
JP2014521262A (en) | Method and apparatus for calibrating an imaging device | |
JP2019083402A (en) | Image processing apparatus, image processing system, image processing method, and program | |
US11785179B1 (en) | Image and audio data processing to create mutual presence in a video conference | |
JP2022543158A (en) | CALIBRATION PARAMETER ACQUISITION METHOD, APPARATUS, PROCESSOR AND ELECTRONIC DEVICE | |
US12244940B2 (en) | Determining a camera control point for virtual production | |
WO2020048461A1 (en) | Three-dimensional stereoscopic display method, terminal device and storage medium | |
CN106919246A (en) | The display methods and device of a kind of application interface | |
CN115278193A (en) | Panoramic video distribution method, device, device and computer storage medium | |
CN115457176A (en) | Image generation method and device, electronic equipment and storage medium | |
CN112446251A (en) | Image processing method and related device | |
WO2022091811A1 (en) | Image processing device, image processing method, and image processing system | |
CN112887653B (en) | Information processing method and information processing device | |
US11166005B2 (en) | Three-dimensional information acquisition system using pitching practice, and method for calculating camera parameters | |
CN112001224B (en) | Video acquisition method and video acquisition system based on convolutional neural network | |
CN111325790B (en) | Target tracking method, device and system | |
CN115315939A (en) | Information processing apparatus, information processing method, and program | |
JP7040511B2 (en) | Information processing equipment and methods |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |