Embodiment
To make the purpose, technical scheme, and advantages of the present invention clearer, the present invention is further elaborated below in conjunction with the drawings and embodiments. It should be appreciated that the specific embodiments described herein are intended only to explain the present invention and are not used to limit the present invention.
The embodiment of the invention decomposes the multi-channel video acquired by multiple cameras into multi-channel background video and multi-channel moving-foreground video. The multi-channel background video is used to automatically calculate the projective transformation matrices between the fields of view of the channels. According to the projective transformation matrices, the multi-channel background video is projectively transformed, embedded into the panoramic field of view, and subjected to seamless fusion processing. The moving-foreground video is likewise projectively transformed according to the projective transformation matrices; parallax correction is performed on the moving-foreground video in the detected field-of-view overlap region, and the moving-foreground panoramic video is fused within the panoramic field of view. Finally, the background panoramic video and the foreground panoramic video are merged, realizing the automatic generation of a panoramic dynamic video.
Fig. 1 shows the flow chart of the panoramic video generation method provided by the embodiment of the invention; the details are as follows.
In step S101, multi-channel video of different viewpoints is acquired by multiple cameras.
In step S102, the background and the moving foreground in each channel of video are separated, obtaining multi-channel background video and multi-channel moving-foreground video.
Background estimation and motion detection are performed respectively on each channel of the video acquired by the multiple cameras, obtaining the multi-channel background video and the multi-channel moving-foreground video. In the prior art there are many video background estimation methods and moving-foreground detection methods; background estimation and motion detection based on a mixture-of-Gaussians model is a preferable one, and the methods are not enumerated one by one herein.
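As a concrete illustration of step S102, the following sketch separates background and foreground with a simple per-pixel running-average model; this is a simplified stand-in for the mixture-of-Gaussians estimation mentioned above, and the learning rate and threshold are illustrative assumptions.

```python
import numpy as np

def separate_background(frames, alpha=0.05, thresh=30):
    """Split a video into a background estimate and per-frame foreground masks.

    A running-average background model is used here as a simple stand-in for
    the mixture-of-Gaussians estimator mentioned in the text (a hypothetical
    simplification, not the method of the embodiment itself).
    """
    bg = frames[0].astype(np.float64)
    masks = []
    for f in frames:
        diff = np.abs(f.astype(np.float64) - bg)
        mask = diff > thresh                  # pixels far from the model -> foreground
        bg = (1 - alpha) * bg + alpha * f     # slowly adapt the background
        masks.append(mask)
    return bg.astype(np.uint8), masks
```

In practice each channel of video would be processed this way independently, yielding one background video and one foreground-mask video per camera.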
In step S103, the projective transformation matrix is obtained according to the multi-channel background video.
Obtaining the projective transformation matrix according to the multi-channel background video specifically comprises: obtaining the feature points of each channel of background video; obtaining a candidate matching-point set according to the feature points of each channel of background video and a nearest-neighbor/second-nearest-neighbor distance decision function; purifying the candidate matching-point set; and obtaining the projective transformation matrix according to the purified matching-point set.
To better explain the present invention, see also Fig. 2. Taking two channels of video as an example, the process of obtaining the projective transformation matrix according to the two channels of background video is detailed as follows.
In step S21, the feature point sets D1 and D2 of the two channels of background video are calculated respectively, taking a 128-dimensional feature vector for each feature point in D1 and D2 as an example. The calculation process is shown in Fig. 3 and detailed as follows.
In step S211, for each channel of background video, the Gaussian scale pyramid image and the Gaussian scale-difference pyramid image are calculated and constructed. Specifically:

Suppose the function corresponding to the background video is f_B(x, y), and the Gaussian kernel function G(x, y, σ) is as in formula (1):

G(x, y, σ) = (1/(2πσ^2)) · exp(−(x^2 + y^2)/(2σ^2)) (1)

In formula (1), σ is the variance, for which the empirical value σ = 1.5 is generally taken, and exp(·) denotes the exponential function. The Gaussian scale pyramid image f_G(x, y, k) is then as in formula (2):

f_G(x, y, k) = f_B(x, y) * G(x, y, 2^k·σ), k = 0, 1, 2, ... (2)

In formula (2), * is the convolution operation. The Gaussian scale-difference pyramid image f_D(x, y, k) is as in formula (3):

f_D(x, y, k) = f_G(x, y, k) − f_G(x, y, k−1), k = 1, 2, ... (3)
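The pyramid construction of formulas (1)–(3) can be sketched as follows; the kernel size, number of levels, and zero-padding convolution are illustrative choices, not fixed by the embodiment.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    # Discrete 2-D Gaussian kernel of formula (1), normalized to sum to 1.
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def convolve2d(img, ker):
    # Naive same-size convolution with zero padding (stands in for f_B * G;
    # valid because the Gaussian kernel is symmetric).
    kh, kw = ker.shape
    pad = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros_like(img, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (pad[i:i + kh, j:j + kw] * ker).sum()
    return out

def dog_pyramid(img, sigma=1.5, levels=4):
    # f_G(x, y, k) = f_B * G(x, y, 2^k * sigma)   -- formula (2)
    gauss = [convolve2d(img, gaussian_kernel(9, (2**k) * sigma))
             for k in range(levels)]
    # f_D(x, y, k) = f_G(k) - f_G(k - 1)          -- formula (3)
    return [gauss[k] - gauss[k - 1] for k in range(1, levels)]
```

A constant image has zero difference-of-Gaussians response away from the borders, which is a quick sanity check on the construction.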
In step S212, the set of local extreme points in the difference-of-Gaussians pyramid image is calculated. Suppose the difference pyramid has s layers, s ≥ 3. The local extreme points are determined as follows:

Let (x, y) be a pixel position of the difference-of-Gaussians pyramid image, and let k ∈ {1, 2, ..., s} be the pyramid-layer position. Let F_max(x, y, k) = 1 when f_D(x, y, k) is a local maximum over its neighborhood in space and scale, and 0 otherwise; let F_min(x, y, k) = 1 when it is a local minimum, and 0 otherwise. Then the local extreme point set D1 is as in formula (4):

D1 = {P = (x, y, k) | F_min(x, y, k) + F_max(x, y, k) ≠ 0, (x, y) ∈ Z^2, k = 2, 3, ..., s−1} (4)
In step S213, a 128-dimensional feature vector is calculated for each point P = (x, y, k) in the local extreme point set D1. Specifically: for the extreme point P = (x, y, k), in the original background video function f_B(x, y), a 16 × 16 window W_16 centered on (x, y) is taken, and the gradient amplitude and direction of f_B(x, y) are calculated at each pixel in W_16. W_16 is cut into subwindows of size 4 × 4, giving 4 × 4 = 16 such subwindows, as shown in Fig. 4. In each subwindow, the accumulated gradient value in each direction is calculated, and statistics over 8 directions form an 8-dimensional subvector; since there are 4 × 4 = 16 such subwindows, a feature vector of 16 × 8 = 128 dimensions is produced at the feature point P.
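A sketch of the 128-dimensional descriptor of step S213 follows. The rotation and scale normalization used by a full SIFT-style descriptor is omitted, and the bin layout is an assumption for illustration.

```python
import numpy as np

def descriptor_128(img, x, y):
    """128-d descriptor at (x, y): a 4x4 grid of 8-bin gradient-orientation
    histograms over a 16x16 window, as described in step S213 (a simplified
    sketch without the rotation/scale normalization of a full descriptor)."""
    w = img[y - 8:y + 8, x - 8:x + 8].astype(np.float64)
    gy, gx = np.gradient(w)
    mag = np.hypot(gx, gy)                           # gradient amplitude
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)      # gradient direction
    bins = (ang / (2 * np.pi / 8)).astype(int) % 8   # 8 direction bins
    vec = np.zeros(128)
    for i in range(4):
        for j in range(4):
            sub = slice(4 * i, 4 * i + 4), slice(4 * j, 4 * j + 4)
            for b in range(8):
                # Accumulated gradient value in direction b for this subwindow.
                vec[(i * 4 + j) * 8 + b] = mag[sub][bins[sub] == b].sum()
    return vec
```

On a purely horizontal intensity ramp, every gradient falls into direction bin 0, so all descriptor energy concentrates there.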
In step S22, for each point in the feature point set D1, its matching point in the feature point set D2 is calculated, obtaining the candidate matching-point set D of D1 and D2. Specifically: take a feature point P_1i in D1, and in D2 find the point P_2n1 with the nearest feature distance and the point P_2n2 with the second-nearest feature distance; the distances between their feature vectors are respectively as in formula (5):

d_1 = d(P_1i, P_2n1), d_2 = d(P_1i, P_2n2) (5)

If d_1 / d_2 < δ, then (P_1i, P_2n1) is taken as a pair of candidate matching points; a value of the threshold δ between 0.5 and 0.7 is generally preferable.

The above discrimination is carried out for each point of D1 and D2, obtaining the candidate matching-point set D.
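The nearest/second-nearest ratio test of formula (5) can be sketched as follows; the Euclidean feature distance and the value δ = 0.6 are illustrative choices within the range the text names.

```python
import numpy as np

def ratio_test_matches(D1, D2, delta=0.6):
    """Candidate matches by the nearest/second-nearest distance ratio of
    formula (5): keep (i, n1) when d1 / d2 < delta. D1 and D2 are arrays
    of feature vectors, one row per feature point."""
    matches = []
    for i, p in enumerate(D1):
        dists = np.linalg.norm(D2 - p, axis=1)   # distance to every point in D2
        n1, n2 = np.argsort(dists)[:2]           # nearest and second nearest
        if dists[n1] / dists[n2] < delta:
            matches.append((i, n1))
    return matches
```

Ambiguous points, whose two best matches are nearly equidistant, are rejected by the ratio test rather than matched incorrectly.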
In step S23, the candidate matching-point set D is purified with the RANSAC algorithm, obtaining the purified matching-point set Dc.

The steps of the RANSAC purification algorithm are as follows:

Step 1: Randomly draw 4 pairs of matching points from D such that no 3 of the points are collinear; otherwise, draw the sample again.

Step 2: Calculate the projective transformation matrix M from the 4 pairs of matching points drawn.

Step 3: Using the projective transformation matrix M, calculate for each matching pair in D its distance under the projective transformation; if the distance is less than a given threshold, the matching pair is called an inlier under M. The set formed by all the inliers in D is Di, with inlier number Ni.

Step 4: Carry out the random sampling test of Steps 1–3 m times, and choose the sampling test with the most inliers, as in formula (6): let Dc be the inlier set Di whose inlier number Ni is maximum. Then Dc is the matching-point pair set purified by the RANSAC algorithm.
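The RANSAC purification of Steps 1–4 can be sketched as follows. The direct-linear-transform fit, the iteration count m, and the inlier threshold are illustrative, and the collinearity re-sampling check of Step 1 is omitted for brevity.

```python
import numpy as np

def homography_from_pairs(src, dst):
    # Direct linear transform: solve A h = 0 for the 3x3 matrix M (4+ pairs).
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=np.float64))
    return Vt[-1].reshape(3, 3)

def project(M, pts):
    p = np.c_[pts, np.ones(len(pts))] @ M.T
    return p[:, :2] / p[:, 2:3]

def ransac_purify(src, dst, m=200, thresh=2.0, seed=0):
    """Steps 1-4 of the RANSAC purification: sample 4 pairs, fit M, keep the
    trial with the most inliers, and return the inlier index set Dc."""
    rng = np.random.default_rng(seed)
    best = []
    for _ in range(m):
        idx = rng.choice(len(src), 4, replace=False)
        M = homography_from_pairs(src[idx], dst[idx])
        err = np.linalg.norm(project(M, src) - dst, axis=1)  # transfer distance
        inliers = np.flatnonzero(err < thresh)
        if len(inliers) > len(best):
            best = inliers
    return best
```

Gross mismatches surviving the ratio test are rejected here, since no single homography fits both the true matches and the outliers.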
Calculate the method for projective transformation matrix M among the superincumbent Step2 by 4 pairs of match points, in looking geometry, mature technique is arranged more, repeat no more herein.
After the step of obtaining the projective transformation matrix according to the purified matching-point set, the panoramic video generation method further comprises obtaining an optimal projective transformation matrix according to a preset error function, i.e. step S24.
In step S24, using the purified matching-point pair set Dc, the projective transformation matrix M is calculated with the symmetric mutual projected-position error optimization of the matching points. The concrete method is:

Suppose the projective transformation matrix between fields of view A and B is M, and the matching-point pair set Dc has n matching pairs in total. Take any matching pair (P_A(k), P_B(k)) ∈ Dc, with P_A(k) ∈ A, P_B(k) ∈ B, k = 1...n. Suppose that under the action of the matrix M, the projection point of P_A(k) in field of view B is Q_B(k), and the projection point of P_B(k) in field of view A is Q_A(k), as in formula (7):

Q_B(k) = M·P_A(k), Q_A(k) = M^−1·P_B(k) (7)

Then under the action of the projective transformation matrix M, the symmetric mutual projected-position error function of the matching-point pair set Dc is defined as in formula (8):

E(M, Dc) = Σ_k ( ||P_B(k) − Q_B(k)||^2 + ||P_A(k) − Q_A(k)||^2 ), k = 1...n (8)

The optimal projective transformation matrix M* can then be obtained by optimizing E(M, Dc), namely as in formula (9):

M* = argmin_M E(M, Dc) (9)
In specific implementation there are many optimization methods based on an objective function; the least-squares iterative method, genetic algorithms, and the like are all feasible and are not enumerated one by one herein.
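For reference, the error function of formula (8) can be evaluated as follows; only the objective is shown, and the optimizer of formula (9) is left to any of the methods named above.

```python
import numpy as np

def sym_transfer_error(M, PA, PB):
    """Symmetric mutual projected-position error of formula (8): project PA
    into field of view B with M and PB back into A with M^-1, and sum the
    squared position errors (a sketch of the objective only)."""
    def proj(H, pts):
        q = np.c_[pts, np.ones(len(pts))] @ H.T
        return q[:, :2] / q[:, 2:3]
    Minv = np.linalg.inv(M)
    eB = np.sum((PB - proj(M, PA)) ** 2, axis=1)      # ||P_B(k) - Q_B(k)||^2
    eA = np.sum((PA - proj(Minv, PB)) ** 2, axis=1)   # ||P_A(k) - Q_A(k)||^2
    return float(np.sum(eA + eB))
```

Because the error is summed in both directions, the objective penalizes a matrix that fits well in one direction but poorly in the other.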
It should be appreciated that when the multi-channel video comprises at least three channels, the realization principle is the same as for two channels, and various algorithms can be called flexibly; since the process is relatively complex, it is not described in detail here.
In step S104, the background panoramic video is generated according to the projective transformation matrix and the multi-channel background video, and the foreground panoramic video is generated according to the projective transformation matrix and the multi-channel moving-foreground video.
The step of generating the background panoramic video according to the projective transformation matrix and the multi-channel background video is specifically: projecting the multi-channel background video into a unified field of view according to the projective transformation matrix; obtaining the field-of-view overlap region according to the multi-channel background video in the same field of view; and performing seamless fusion processing on the multi-channel background video in the unified field of view according to the field-of-view overlap region.
In the embodiment of the invention, the field-of-view overlap region between every two channels of video is a convex quadrilateral region. The step of performing seamless fusion processing on the multi-channel background video in the unified field of view according to the field-of-view overlap region is specifically: dividing the overlap region into four triangles according to any point in the overlap region; determining fusion weights according to the areas of the triangles; and fusing the multi-channel background video of the overlap region in the unified field of view according to the position of the arbitrary point and the fusion weights.
In specific implementation, likewise taking two channels of video as an example, suppose the projective transformation matrix between the two channels of video acquired by two cameras is M, the unified field of view after projection is C, and the background video functions of the two cameras are f_GA(x, y, t) and f_GB(x, y, t) respectively. The background panorama generation steps for the two cameras are specifically:

Step 1: Use the projective transformation matrix M to projectively transform f_GA(x, y, t) and f_GB(x, y, t) into the unified field of view C; the transformed image functions are f_MA(x, y, t) and f_MB(x, y, t) respectively, and the corresponding fields of view of the two cameras after transformation are A and B respectively. Calculate the field-of-view overlap region abcd = A ∩ B of A and B, as shown in Fig. 5.

Step 2: Calculate the pixel fusion coefficients w1, w2 in the field-of-view overlap region abcd. Specifically, as shown in Fig. 5, let P = (x, y) ∈ A ∩ B be a point in the overlap region abcd of A and B. P and the four borders of the overlap region form four triangles abP, acP, cdP, bdP respectively, whose areas are S1, S2, S3, S4. Let S_m12 = min(S1, S2) be the minimum of S1 and S2, and let S_m34 = min(S3, S4) be the minimum of S3 and S4; the fusion coefficients are then as in formula (10):

w1 = S_m12 / (S_m12 + S_m34), w2 = S_m34 / (S_m12 + S_m34) (10)

Step 3: Let P = (x, y, t) ∈ A ∪ B be a point in the panoramic field of view; the fusion of the panorama background f_C(x, y, t) is then as shown in formula (11):

f_C(x, y, t) = f_MA(x, y, t) for (x, y) ∈ A − B; f_MB(x, y, t) for (x, y) ∈ B − A; w1·f_MA(x, y, t) + w2·f_MB(x, y, t) for (x, y) ∈ A ∩ B (11)

In formula (11), A − B denotes the difference set of set A and set B, and B − A denotes the difference set of set B and set A. Once the projective transformation matrix is determined, the panoramic overlap region is determined, and the fusion coefficients w1, w2 are also determined. Therefore w1, w2 need be calculated only once after the projective transformation matrix is calculated, stored in the form of a look-up table, and obtained by table look-up during subsequent fusion.
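The fusion coefficients of formula (10) can be sketched for a point inside a convex quadrilateral overlap region. The assignment of the triangles abP, acP, cdP, bdP to the corner points follows one plausible reading of Fig. 5 and is an assumption.

```python
def tri_area(p, q, r):
    # Triangle area via the cross product of two edge vectors.
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) / 2.0

def fusion_weights(P, a, b, c, d):
    """Fusion coefficients w1, w2 of formula (10) for a point P inside the
    convex quadrilateral overlap region abcd (a sketch; the side-to-triangle
    assignment is assumed from Fig. 5)."""
    S1, S2 = tri_area(a, b, P), tri_area(a, c, P)
    S3, S4 = tri_area(c, d, P), tri_area(b, d, P)
    sm12, sm34 = min(S1, S2), min(S3, S4)
    w1 = sm12 / (sm12 + sm34)
    return w1, 1.0 - w1
```

At the center of a symmetric overlap region the weights are equal, and they shift smoothly toward one camera as P approaches that camera's side, which is what removes the visible splicing seam.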
Generating the foreground panoramic video according to the projective transformation matrix and the multi-channel moving-foreground video is specifically: projecting the multi-channel moving-foreground video into the unified field of view according to the projective transformation matrix; and fusing the multi-channel moving-foreground video in the unified field of view.
In the same way as the projective transformation of the background data, the moving-foreground data is first projectively transformed according to the projective transformation matrix M, projecting the multi-channel moving-foreground data into the unified field of view.
Because the two cameras are at different viewpoints, the depth of field of the same target generally differs between cameras, and the same target in the field-of-view overlap region (obtained in step S104) generally exhibits a certain parallax after transformation into the different fields of view. As shown in Fig. 6, Object_A and Object_B are the positions in the overlap region of the panoramic field of view of the moving foreground corresponding to the same target in the two fields of view after transformation into the unified field of view, and Δd is the displacement difference between them.
In the embodiment of the invention, before the foreground panoramic video is fused and generated, it is necessary to detect whether the moving foregrounds of fields of view A and B in the field-of-view overlap region are the same target. When they are the same target, parallax correction must first be carried out on fields of view A and B in the overlap region; otherwise, parallax correction is unnecessary.
In the embodiment of the invention, the judgment method for the same target is specifically: judging whether the centroids of the simply connected regions of the isolated moving-foreground videos in the two fields of view are in the field-of-view overlap region, and performing matching association on the moving-foreground videos that are in the overlap region. Specifically:
Suppose the moving targets in the overlapped field-of-view region of field of view A are O_A(i), i = 1, 2, ..., m, and the moving targets in the overlapped field-of-view region of field of view B are O_B(j), j = 1, 2, ..., n. In the embodiment of the invention, a moving target is generally a simply connected region, not a point.

Calculate the area S_A(i) (the sum of pixel counts) of O_A(i), i = 1, 2, ..., m; calculate the area S_B(j) of O_B(j), j = 1, 2, ..., n.

Calculate the length-width ratio L_A(i) of the bounding rectangle of O_A(i), i = 1, 2, ..., m; calculate the length-width ratio L_B(j) of the bounding rectangle of O_B(j), j = 1, 2, ..., n.

Calculate the RGB color histogram vector H_A(i) of O_A(i), i = 1, 2, ..., m; calculate the RGB color histogram vector H_B(j) of O_B(j), j = 1, 2, ..., n. Here H_A(i) and H_B(j) are 3 × 256 = 768-dimensional vectors.
Set the weight values w_S, w_L, w_H and calculate the matching distance of O_A(i) and O_B(j), as shown in formula (12):

d(O_A(i), O_B(j)) = w_S·||S_A(i) − S_B(j)|| + w_L·||L_A(i) − L_B(j)|| + w_H·||H_A(i) − H_B(j)|| (12)
Calculate the correlation distance between the moving targets O_A(i), i = 1, 2, ..., m in the overlapped region of A and the moving targets O_B(j), j = 1, 2, ..., n in the overlapped region of B, as shown in formula (13):

T_ik = d(O_A(i), O_B(k)) = min{ d(O_A(i), O_B(j)), j = 1, 2, ..., n } (13)

where k is as in formula (14); that is, O_B(k) is the moving target in B with the minimum matching distance to O_A(i).

The matching-associated moving target pairs are then calculated with the rule: if T_ik < δ_0, i = 1, 2, ..., m, then O_A(i) and O_B(k) match and are the same target in the two fields of view A and B; otherwise, there is no target in field of view B that matches O_A(i).
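The association rule of formulas (12)–(14) can be sketched as follows. The feature triplets, the weight values, and the threshold δ_0 = 10 are illustrative assumptions, not values fixed by the embodiment.

```python
import numpy as np

def match_targets(feats_A, feats_B, wS=1.0, wL=1.0, wH=1.0, delta0=10.0):
    """Associate moving targets across fields of view A and B using the
    weighted matching distance of formula (12) and the minimum-distance
    rule of formulas (13)-(14). Each feature is a tuple
    (area, aspect_ratio, histogram)."""
    pairs = []
    for i, (Sa, La, Ha) in enumerate(feats_A):
        dists = [wS * abs(Sa - Sb) + wL * abs(La - Lb)
                 + wH * np.linalg.norm(Ha - Hb)
                 for (Sb, Lb, Hb) in feats_B]
        k = int(np.argmin(dists))      # formula (14): nearest target in B
        if dists[k] < delta0:          # formula (13) rule: T_ik < delta_0
            pairs.append((i, k))
    return pairs
```

A target whose best match in B is still farther than δ_0 is left unpaired, matching the "no target in B matches O_A(i)" branch of the rule.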
After the matched same target has been determined, the parallax correction in the overlapped field-of-view region is specifically as follows.
Fields of view A and B are projected into a common coordinate system C, and the coordinate system of one of the fields of view can be selected to coincide with C. Suppose the B field of view is taken as the reference field of view of the projection; the target position projected from the B field of view into C is then the reference position in the fused field of view, and the problem of determining the position of a target in the panoramic field of view transfers to correcting the position of the target in the A field of view with respect to the matching target in B. When the target enters the A, B overlap region from the non-overlapped B field-of-view region, Δd is subtracted from the position to which field of view A projects the target in the common coordinate system C, i.e. y_CA = y_A − Δd; conversely, when the target enters the A, B overlap region from the non-overlapped A field-of-view region, Δd is added, i.e. y_CA = y_A + Δd. Here y_A is the horizontal coordinate of the centroid of the target in field of view A relative to the geometric center of field of view A, and Δd is the horizontal centroid parallax of the matched same target in fields of view A and B. Similarly, y_B is the horizontal coordinate of the centroid of the target in field of view B relative to the geometric center of field of view B, and is not described in detail again. See Fig. 6. The correction method for vertical parallax is similar to that for horizontal parallax.
It can be learned from the foregoing that the condition for the different moving-foreground videos corresponding to the field-of-view overlap region to be the same target is that the moving-target centroids of the different moving-foreground videos are all in the overlap region. When the motion centroid of the moving-foreground video of the A field of view appears in the overlap region of the unified field of view but the moving-foreground target centroid of the B field of view does not, the same target cannot undergo the matching-association operation, and a same-target coverage phenomenon appears if fusion is performed directly. At this time the moving-foreground video that is in the overlap region needs to be matched against the moving-foreground video of the other field of view at the overlap region; if the matching succeeds, they are regarded as the same target, otherwise as different targets.
Even when the same target is successfully matched and its position in the fused field of view has been determined, the matched target contours often differ in size, so that a contour ghost appears on the fused target. Likewise, when the same target is complete in one of the two fields of view and incomplete in the other, the shape difference makes the fusion effect unsatisfactory when the multi-channel moving-foreground videos are fused. At this time the foreground panoramic video needs to be obtained according to a preset foreground fusion template, which can be obtained in the following way:

In the non-overlapping regions of the A and B fields of view, the foreground fusion template region is simply the foreground region, and the template position remains unchanged in each field of view;

In the overlap region of the A and B fields of view, for the moving-foreground videos of the matched same target of A and B, the two regions are moved so that their centroids coincide according to the parallax-correction result, and the union of the two regions is taken as the foreground fusion template M_AB.
After the foreground fusion template is obtained, the fusion of the foreground panoramic video can be carried out in the following way:

In the non-overlapping regions of the A and B fields of view, the panorama regions obtained from the A and B fields of view respectively are used directly as the foreground panoramic video;

In the overlap region of the A and B fields of view, the centroid of the obtained foreground fusion template M_AB is placed at the centroid positions of the uncorrected moving targets of the A and B fields of view respectively, and the intersections of the template with the A and B fields of view are calculated respectively: M_A = M_AB ∩ A, M_B = M_AB ∩ B. If the area of M_A is greater than the area of M_B, the template M_AB is placed at the centroid position of the corresponding target in the original video of field of view A (the video not yet separated into background and foreground), and the original video region covered by the template is taken out as the panoramic foreground target; otherwise, if the area of M_A is less than the area of M_B, the above operation is carried out in the original video of field of view B. The foreground of the video overlap region obtained by the above operation is placed at the appropriate position of the panoramic field of view by parallax correction. Through the above foreground fusion, the target region that is larger and whose contour is more complete is embedded in the foreground panorama, which not only solves the ghost problem but also fuses well when the body of the same target differs somewhat between the views.
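The template construction described above can be sketched with boolean masks. Centroid alignment by integer shift and direct area comparison of the two masks are simplifying assumptions standing in for the parallax-correction result and the M_AB ∩ A / M_AB ∩ B comparison.

```python
import numpy as np

def fuse_foreground(mask_A, mask_B):
    """Foreground fusion template in the overlap region: align the two
    matched target masks by their centroids, take the union as M_AB, and
    choose the field of view with the larger covered area (a boolean-mask
    sketch of the template construction, not the full method)."""
    def centroid(m):
        ys, xs = np.nonzero(m)
        return ys.mean(), xs.mean()
    cyA, cxA = centroid(mask_A)
    cyB, cxB = centroid(mask_B)
    dy, dx = int(round(cyA - cyB)), int(round(cxA - cxB))
    shifted_B = np.roll(np.roll(mask_B, dy, axis=0), dx, axis=1)  # centroid alignment
    M_AB = mask_A | shifted_B                                     # union template
    # Take the source whose target covers more of the template.
    source = 'A' if mask_A.sum() >= shifted_B.sum() else 'B'
    return M_AB, source
```

The union template guarantees the cut-out region covers the larger, more complete contour, which is what suppresses the contour ghost.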
In the prior art, because the depth of field and the viewing angle differ between a moving target and the background, a unified projective transformation cannot be applied to the target and the background simultaneously during image registration, so that the fused panoramic image is prone to ghosting and double-image problems; the embodiment of the invention effectively overcomes these problems.
In step S105, the background panoramic video and the foreground panoramic video are merged to generate the panoramic video.
The background panoramic video and the foreground panoramic video are merged to obtain the complete panoramic video. Specifically, let f_B(x, y, t) be the calculated background panorama and f_F(x, y, t) the calculated foreground panorama; the complete panoramic video f_T(x, y, t) is obtained by formula (15), taking the foreground panorama value at the pixels covered by the foreground and the background panorama value at the remaining pixels.
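A minimal sketch of the final merge of step S105, assuming formula (15) overlays the foreground panorama onto the background panorama at the foreground pixels:

```python
import numpy as np

def compose_panorama(bg, fg, fg_mask):
    """Final merge of formula (15): the foreground panorama overwrites the
    background panorama wherever the foreground mask is set (the mask-based
    overlay is an assumed reading of the formula)."""
    out = bg.copy()
    out[fg_mask] = fg[fg_mask]
    return out
```

Because the background and foreground panoramas were each fused in the same unified field of view, this final step is a simple per-pixel selection.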
The structure of the panoramic video generation system provided by the embodiment of the invention is shown in Fig. 7; for convenience of explanation, only the parts relevant to the embodiment of the invention are shown. The system may be a software unit, a hardware unit, or a unit combining software and hardware built into a portable terminal or other terminal equipment.
In the embodiment of the invention, the system comprises a multi-channel video acquisition unit 71, a separation unit 72, a projective transformation matrix calculation unit 73, a background panoramic video generation unit 74, a foreground panoramic video generation unit 75, and a panoramic video generation unit 76.
The multi-channel video acquisition unit 71 acquires multi-channel video of different viewpoints through multiple cameras. The separation unit 72 separates the background and the moving foreground in each channel of video acquired by the multi-channel video acquisition unit 71, obtaining multi-channel background video and multi-channel moving-foreground video. The projective transformation matrix calculation unit 73 obtains the projective transformation matrix according to the multi-channel background video obtained by the separation unit 72. The background panoramic video generation unit 74 generates the background panoramic video according to the projective transformation matrix calculated by the projective transformation matrix calculation unit 73 and the multi-channel background video obtained by the separation unit 72. The foreground panoramic video generation unit 75 generates the foreground panoramic video according to the projective transformation matrix calculated by the projective transformation matrix calculation unit 73 and the multi-channel moving-foreground video obtained by the separation unit 72. The panoramic video generation unit 76 merges the background panoramic video generated by the background panoramic video generation unit 74 and the foreground panoramic video generated by the foreground panoramic video generation unit 75, generating the panoramic video.
Wherein, the projective transformation matrix calculation unit 73 comprises:

a feature point acquisition module, used to obtain the feature points of each channel of background video obtained by the separation unit 72;

a candidate matching-point set acquisition module, used to obtain the candidate matching-point set according to the feature points of each channel of background video obtained by the feature point acquisition module and the preset nearest-neighbor/second-nearest-neighbor distance decision function;

a purification module, used to purify the candidate matching-point set obtained by the candidate matching-point set acquisition module;

a projective transformation matrix acquisition module, used to obtain the projective transformation matrix according to the matching-point set purified by the purification module.
Details are as in the embodiment described above and are not repeated here.
The present invention automatically extracts and matches feature points in the multi-channel background video, and then calculates the projective transformation matrices between the multiple fields of view from the matched feature points. Compared with methods that extract feature points and calculate the projective transformation matrix directly on the original video, this method has higher computational accuracy and better stability.
To obtain the optimal projective transformation matrix, the projective transformation matrix calculation unit 73 further comprises:

an optimal projective transformation matrix acquisition module, used to obtain the optimal projective transformation matrix according to at least one projective transformation matrix obtained by the projective transformation matrix acquisition module and the preset error function.
The background panoramic video generation unit 74 comprises:

a background video projection module, used to project the multi-channel background video obtained by the separation unit 72 into the unified field of view according to the projective transformation matrix calculated by the projective transformation matrix calculation unit 73;

a field-of-view overlap region acquisition module, used to obtain the field-of-view overlap region according to the multi-channel background video in the unified field of view obtained by the projection of the background video projection module;

a background video fusion module, used to perform seamless fusion processing on the multi-channel background video in the unified field of view obtained by the projection of the background video projection module, according to the field-of-view overlap region obtained by the field-of-view overlap region acquisition module.
Wherein, the triangle-area-ratio method is adopted to determine the fusion weights during background panoramic video fusion, effectively eliminating the splicing traces of the common-field-of-view transition region and realizing fast seamless fusion.
Details are as in the embodiment described above and are not repeated here.
Meanwhile, the foreground panoramic video generation unit 75 comprises:

a moving-foreground video projection module, used to project the multi-channel moving-foreground video obtained by the separation unit 72 into the unified field of view according to the projective transformation matrix calculated by the projective transformation matrix calculation unit 73;

a moving-foreground video fusion module, used to fuse the multi-channel moving-foreground video in the unified field of view obtained by the projection of the moving-foreground video projection module.

Wherein, the moving-foreground video fusion module further comprises:

a same-target judgment module, used to perform the association operation on the field-of-view overlap region of the multi-channel moving-foreground video in the unified field of view when the centroids of the simply connected regions of the multi-channel moving-foreground video obtained by the projection of the moving-foreground video projection module are all in the field-of-view overlap region obtained by the field-of-view overlap region acquisition module, and to judge according to the association operation result whether the moving foregrounds of the multi-channel moving-foreground video in the overlap region are the same target;

a parallax correction module, used to carry out parallax correction on the field-of-view overlap region of the multi-channel moving-foreground video in the unified field of view when the same-target judgment module judges that the moving foregrounds in the overlap region are the same target, and to fuse the parallax-corrected multi-channel moving-foreground video in the unified field of view according to the preset foreground fusion template, generating the foreground panoramic video.
Details are as in the embodiment described above and are not repeated here. The embodiment of the invention acquires multi-channel video of different viewpoints with multiple cameras; decomposes the multi-channel video into multi-channel background video and multi-channel moving-foreground video; uses the multi-channel background video to calculate the projective transformation matrix automatically and to obtain the field-of-view overlap region; projectively transforms the background video according to the projective transformation matrix, embeds the transformed background video into the panoramic field of view, and performs seamless fusion processing on the overlap region; projectively transforms the moving-foreground video according to the projective transformation matrix, performs parallax correction on the moving-foreground video data corresponding to the detected overlap region, and fuses the foreground panoramic video; and finally merges the background panoramic video and the foreground panoramic video to obtain the panoramic video. A multi-channel dynamic video with overlapping fields of view is thus automatically generated as a panoramic dynamic video; the ghost and double-image problems of moving targets in the panoramic overlap region are well solved; and because the projective transformation matrix is obtained with the interference of the moving foreground excluded, the accuracy and stability of the automatic calculation and application of the inter-camera projective transformation matrix are greatly improved.

The above is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.